Lesson: Learning HTML: How Web Pages Work

A web page is really a collection of files on a computer somewhere. What makes it a web page is that other computers can connect to that computer and get access to those files. Because web pages are written in a standard format, Hypertext Markup Language (HTML), web browsers (e.g. Netscape, Microsoft Internet Explorer) are able to interpret and display these files in a standard way.

Where a Web Page Lives

At its simplest level, a web page is a single HTML file residing in a directory on a computer. We will be thinking in terms of Unix servers, but it works pretty much the same way for PC servers too (except that they crash all the time).

For example, the web page for this class is saved in a folder on a Unix-based machine called "willrogers", which is supported by SENS. This machine has an IP address that is associated with the name

willrogers.nsm.buffalo.edu

A directory on the computer is dedicated to this course:

/usr/local/info/geology/courses/gly560

The URL for this home page, however, is:

http://www.glyfac.buffalo.edu/courses/gly560/

You will notice that the end of the URL is the same as the end of the actual directory path. This is because when the web server receives a URL beginning with http://www.glyfac.buffalo.edu, it immediately looks for files in the directory /usr/local/info/geology. There are other web pages in this same directory, such as

http://www.glyfac.buffalo.edu/ITD/

which is just another directory in the /geology directory:

/usr/local/info/geology/ITD

When a browser is pointed to a directory, without a specific file name, the web server immediately looks for a file called "index.html". This is the index or starting page for the web page contained in that directory. In the directory /usr/local/info/geology/courses/gly560 there is a file called index.html, which is the home page for the course. In the directory /usr/local/info/geology/ there is an index.html page that corresponds to the UB Geology home page,

http://www.glyfac.buffalo.edu

which is equivalent to:

http://www.glyfac.buffalo.edu/index.html
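The mapping described above can be sketched in a few lines of Python. This is only an illustration, not the web server's actual code. It assumes the document root is /usr/local/info/geology and the default page is index.html, as described in this lesson; the function name url_to_file is made up for the example.

```python
import posixpath
from urllib.parse import urlsplit

# Assumption: the directory this lesson names as the starting point
# for every URL on www.glyfac.buffalo.edu.
DOCUMENT_ROOT = "/usr/local/info/geology"
DEFAULT_PAGE = "index.html"


def url_to_file(url):
    """Return the file a server set up this way would read for a URL."""
    # Keep only the part after the host name, e.g. "/courses/gly560/".
    path = urlsplit(url).path or "/"
    # A path ending in "/" names a directory, so add the default page.
    if path.endswith("/"):
        path += DEFAULT_PAGE
    return posixpath.join(DOCUMENT_ROOT, path.lstrip("/"))


print(url_to_file("http://www.glyfac.buffalo.edu/courses/gly560/"))
print(url_to_file("http://www.glyfac.buffalo.edu"))
```

A real server does more than this (for instance, it checks whether a path without a trailing slash is a directory, and it refuses paths that try to climb out of the document root), so treat the sketch as a picture of the idea only.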

 

So in summary, the Web Server receives a request for a Web page such as

http://www.glyfac.buffalo.edu/index.html
and maps that Uniform Resource Locator (URL) to a local file on the host server.

In this case, the file

index.html
is somewhere on the host file system. The server then loads this file from disk and serves it out across the network to the user's Web browser. This entire exchange is mediated by the browser and server talking to each other using Hypertext Transfer Protocol (HTTP). This workflow is shown in the figure below.
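To watch this exchange happen on your own machine, the sketch below uses the small web server built into Python's standard library as a stand-in for a real one. It is not the server software running on willrogers. It serves a temporary folder containing a made-up index.html, then requests the folder's URL without a file name, the way a browser would. The function name fetch_directory_url and the page contents are assumptions for illustration.

```python
import functools
import http.server
import pathlib
import tempfile
import threading
import urllib.request


def fetch_directory_url():
    """Serve a temporary folder over HTTP and request its directory URL."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical page standing in for a real home page.
        page = pathlib.Path(root, "index.html")
        page.write_text("<html><body>Hello</body></html>")

        # Tell the built-in handler which folder acts as the document root.
        handler = functools.partial(
            http.server.SimpleHTTPRequestHandler, directory=root
        )
        # Port 0 asks the operating system for any free port.
        server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
        threading.Thread(target=server.serve_forever, daemon=True).start()
        try:
            port = server.server_address[1]
            # Skip any proxy settings so the request goes straight to localhost.
            opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
            # No file name in the URL: the server falls back to index.html.
            with opener.open(f"http://127.0.0.1:{port}/") as response:
                return response.status, response.read().decode()
        finally:
            server.shutdown()
            server.server_close()


status, body = fetch_directory_url()
print(status)
print(body)
```

The status code 200 is the server's way of saying the file was found and sent; the body is the content of index.html, exactly as it was saved on disk.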

 

Setting Up a Web Page

The reason that the web server knows how to map files to URLs is that the web administrator set it up that way. What that means to you and me is that we need to contact the almighty superuser to create a web address. If the web address is already created, and we have permission to write in the folder, we can create sub-webs by ourselves. For example, in this class you will create a web page in your home directory. Once that main web address is set up, you can create as many sub-webs in that folder as you like.

For example, everyone who has a SENS account has a home directory:

/nsm/home/mwbecker/

Using an automated procedure (next topic) you can tell the web server that you have a home page, and it will set up a folder in your home directory called "public_html" and give it the proper permissions for web access. This means that everyone in the world can get into this directory, so be careful what you put in it:

/nsm/home/mwbecker/public_html

For this course, you will need to create a sub-web called gly560:

/nsm/home/mwbecker/public_html/gly560

To create this sub-web, all you need to do is make a new directory and add HTML files, with permissions set to the equivalent of chmod 755.
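If the number 755 looks mysterious, it is three octal digits: the first applies to the owner, the second to the owner's group, and the third to everyone else. A 7 means read, write and execute; a 5 means read and execute only. The short Python sketch below spells a mode out in the letter form that a directory listing shows. The helper name describe_mode is made up for this example.

```python
import stat


def describe_mode(mode, is_directory=False):
    """Spell out an octal permission mode the way a long listing shows it."""
    # The file-type bits decide whether the string starts with "d" or "-".
    kind = stat.S_IFDIR if is_directory else stat.S_IFREG
    return stat.filemode(kind | mode)


print(describe_mode(0o755, is_directory=True))  # a directory with mode 755
print(describe_mode(0o755))                     # a regular file with mode 755
```

For a directory, the "execute" bit means permission to enter it, which is why a web directory needs it set for everyone while write access stays with the owner alone.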

 

Site Structure

Although you will be asked to create only a few HTML pages for this course, you should think about the structure while you are building your site. Typically, a site is designed such that the root folder holds only the HTML for the "front page" or introduction to your site. For each topic, it is best to create a subfolder in which all HTML dedicated to that topic lives. The standard is to put all non-HTML files for a topic in a folder within that topic's folder, called "assets". The assets folder is usually filled primarily with images.

If you are using a PC or a machine other than the server to create your web, make sure that the root folder of the authoring machine corresponds to the root folder of the server machine. In this way, all links can be relative within the site. If the site must be moved (migrated) to another server, then all the links will still work.
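To see why relative links survive a move, the sketch below resolves the same links from a page at two different addresses, using the rule browsers follow (available in Python as urljoin). Both page addresses, the second host name, and the file assets/map.gif are hypothetical and chosen only for illustration.

```python
from urllib.parse import urljoin

# Hypothetical addresses: the same page before and after a migration.
OLD_PAGE = "http://www.glyfac.buffalo.edu/courses/gly560/HTMLtut/index.html"
NEW_PAGE = "http://www.example.edu/gly560/HTMLtut/index.html"


def resolve_link(page_url, link):
    """Return the address a browser would request for a link on a page."""
    return urljoin(page_url, link)


for page in (OLD_PAGE, NEW_PAGE):
    # An image kept in the topic's own assets folder.
    print(resolve_link(page, "assets/map.gif"))
    # A link up one level, to the front page of the sub-web.
    print(resolve_link(page, "../index.html"))
```

Each relative link is worked out from wherever the page currently lives, so it follows the site to its new home. A link written out in full, starting with http:// and the old host name, would keep pointing at the old server after the move.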

 

Exercise:

You will set up a web page that will be used for content from this class. This is where you will publish (make available to the internet) your Methods. First you must set up a personal web page, if you don't already have one. To do this follow the instructions extracted from the SENS web site.

Once you have your personal web page, use the mkdir command in a terminal window to create a directory in public_html called gly560. Change the permission to 755. Now create another directory WITHIN gly560 called HTMLtut. Also give this directory a 755 permission.

You should now have a directory:

/nsm/home/username/public_html/gly560/HTMLtut/

This directory will be used for the next lesson on basic HTML.
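If you would like to rehearse these steps before touching your real account, the Python sketch below mirrors the mkdir and chmod commands inside a throwaway temporary folder. The temporary folder stands in for your home directory, and the sketch creates public_html itself, whereas on SENS that folder comes from the automated procedure. The function name make_subweb is made up for this example, and it assumes a Unix-like system.

```python
import pathlib
import stat
import tempfile


def make_subweb(home):
    """Create public_html/gly560/HTMLtut under home with 755 permissions."""
    target = pathlib.Path(home, "public_html", "gly560", "HTMLtut")
    target.mkdir(parents=True, exist_ok=True)
    # Set the mode explicitly, like running chmod 755, because the mode
    # given at creation time is filtered by the user's umask.
    for directory in (target.parent, target):
        directory.chmod(0o755)
    return target


with tempfile.TemporaryDirectory() as fake_home:
    created = make_subweb(fake_home)
    print(created.relative_to(fake_home))
    print(oct(stat.S_IMODE(created.stat().st_mode)))
```

For the exercise itself, use the terminal commands as described above; the sketch is only a safe way to see what they do.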

NEXT: Introduction to HTML