Introduction
For many of us, browsing the Web has become a daily
activity. Whether it is for checking stock prices, buying food, doing work, ordering books
and music, or just visiting a favorite site, web browsing has become an institution
in our lives, much the way television is. Have you ever wondered how this whole
web thing works, though? This tutorial explains the history
and concepts of the Web and how it works technically. By the end, you will
understand what actually happens when you browse to a site and how your computer
retrieves the information. Our first stop is the history of the Web.
History of the Web
The Web finds its roots at CERN, the European Organization for
Nuclear Research. In 1980, Tim Berners-Lee wrote a program there called
ENQUIRE that stored pieces of information along with the links between them.
In March 1989, he wrote "Information Management: A Proposal", which described
a system that would allow documents to link to other pieces of data, whether
they were files on the local computer or stored on a remote computer. The main
motivation is said to have been the ability to access library information that
was spread across multiple servers at CERN.
On November 12th, 1990, Tim Berners-Lee and Robert Cailliau published a more
formal proposal called "WorldWideWeb: Proposal for a HyperText Project" that
outlined the World Wide Web as we know it today. It used hypertext, a way of
presenting information whose roots go back to the "memex" described in 1945
by Vannevar Bush, to link documents into a large-scale information
pool. The following day, November 13th, 1990, Tim Berners-Lee created the
first web page, and that December he finished the first web browser and
web server. The browser was called WorldWideWeb, which gave us the name
we use today.
As development of the World Wide Web continued, more people from
around the world got involved. In 1992, Pei-Yuan Wei introduced Viola, one of
the first web browsers that supported graphics. It was followed in 1993 by
Mosaic, a program for UNIX released by Marc Andreessen and his colleagues at
NCSA. Mosaic was the spark that marked the rise in popularity
of the World Wide Web and took it beyond academic circles.
Marc Andreessen went on to co-found Mosaic Communications, which was later renamed
Netscape Communications. Its browser, Netscape Navigator, became the first
mainstream commercial graphical web browser.
As time went on, more features were added to browsers,
more companies got on the Internet, personal homepages started springing
up everywhere, and the Web as we know it was created.
The Technology behind the Web
The web works on three standards. These standards are generally
adhered to by all companies that make products that work with the World Wide
Web.
These standards are:
URL (Uniform Resource Locator): These are the
addresses that you enter into your web browser to connect to a web site. A
URL is broken up into four parts: the protocol, the hostname, the port
number, and the path that you are requesting.
- Protocol:
- The protocol part of a URL is the string of characters that you
see before the hostname. Examples are http, ftp, and telnet. The protocol is
separated from the hostname by a colon and two forward slashes ( ://
). The protocol tells your browser what type of service to use when
it connects to the hostname. If you leave the protocol
off your address, the web browser will assume by default that you are using
the HTTP protocol, which is for connecting to web sites, so there is no
need to type http:// every time you go to a web site. If you specify
another protocol such as ftp, the browser will act as an FTP client
that lets you connect to an FTP server to download files.
- Hostname:
- The hostname is the address you are going to. For example, if
you are going to the address http://www.bleepingcomputer.com, then www.bleepingcomputer.com
is the hostname.
- Port Number:
- The port number is a number that you can append to the hostname with
a colon ( : ) between them. For example http://www.bleepingcomputer.com:80.
If you leave the port number off, which almost everyone does, then the
browser will automatically use port 80 as that is the default port for
the http protocol.
- Path:
- This is the path on the server, ending with the name of the file you are
trying to reach. For example, in the URL http://www.bleepingcomputer.com/examples/example1.html,
the path is /examples/example1.html. This path often corresponds
to an actual directory structure on the web server. In that case, the web
server has a root directory, an examples directory underneath that
root directory, and a file called example1.html inside the examples directory.
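To make the four parts concrete, here is a minimal sketch in Python that splits a URL into the pieces described above. It uses the standard library's urlsplit function. The split_url helper and the DEFAULT_PORTS table are names made up for this illustration, and the table only covers two protocols.

```python
from urllib.parse import urlsplit

# Default ports for two well-known protocols (illustrative, not exhaustive).
DEFAULT_PORTS = {"http": 80, "ftp": 21}


def split_url(url):
    """Return the protocol, hostname, port, and path of a URL as a dict."""
    parts = urlsplit(url)
    # Fall back to the protocol's default port when the URL does not name one.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "protocol": parts.scheme,
        "hostname": parts.hostname,
        "port": port,
        # An empty path means the root of the site.
        "path": parts.path or "/",
    }


if __name__ == "__main__":
    print(split_url("http://www.bleepingcomputer.com/examples/example1.html"))
```

For the example URL, the sketch reports http as the protocol, www.bleepingcomputer.com as the hostname, 80 as the port (filled in from the default, since the URL does not name one), and /examples/example1.html as the path.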
HTTP (Hypertext Transfer Protocol): This is the defined set of rules for transferring information between a web browser and a web server. All web browsers and web servers follow these rules.
HTML (HyperText Markup Language): This is the language used in web pages to format text, images, and page layout. It is written as plain text and saved in a file whose name ends in .html. It is possible to put HTML in documents that do not end in .html, but for the purpose of this tutorial, we are focusing only on pure HTML documents. The text in these documents contains special codes, called tags, that tell the web browser how to format the text when it reads the file. Let's try an example below.
If you were to create a file called helloworld.html and save it on your hard drive, you could then open this file with your browser and have it displayed. The file will contain the following text:
<b>Hello World!!!!</b>
If you were to open up this document in your browser you would see the following:
Hello World!!!!
As you can see, the text Hello World!!!! is shown in bold print. This is because we enclosed the words in <b> tags: the opening <b> means any text after it will be bold, and the closing </b> marks the end of the bold formatting. Most tags in HTML come in pairs like this, with a beginning tag that starts the formatting and an ending tag that stops it. There are many more tags available in HTML, the bold ( <b> ) tag being just one of them.
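If you are curious how a program can tell tagged text from ordinary text, the following Python sketch walks through a piece of HTML and collects whatever sits between a <b> and a </b>. It relies on the standard library's HTMLParser class. The BoldFinder class and the find_bold function are names invented for this example, and a real browser does far more than this.

```python
from html.parser import HTMLParser


class BoldFinder(HTMLParser):
    """Collect the text that sits between <b> and </b> tags."""

    def __init__(self):
        super().__init__()
        self.bold_depth = 0   # how many <b> tags are currently open
        self.bold_text = []   # pieces of text seen while inside <b>...</b>

    def handle_starttag(self, tag, attrs):
        # Called for every beginning tag, such as <b>.
        if tag == "b":
            self.bold_depth += 1

    def handle_endtag(self, tag):
        # Called for every ending tag, such as </b>.
        if tag == "b" and self.bold_depth > 0:
            self.bold_depth -= 1

    def handle_data(self, data):
        # Called for the ordinary text between tags.
        if self.bold_depth > 0:
            self.bold_text.append(data)


def find_bold(html):
    """Return all of the text in the HTML that would be shown in bold."""
    parser = BoldFinder()
    parser.feed(html)
    parser.close()
    return "".join(parser.bold_text)


if __name__ == "__main__":
    print(find_bold("<b>Hello World!!!!</b>"))
```

The parser sees the beginning tag first, then the text, then the ending tag, which is the same order in which a browser reads the file.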
Web Browser and Web Servers
In order for the Web to work, you need web browsers and web servers, which work hand in hand. The web browser is a piece of software that interprets the information found in an HTML document and displays the content of that document based upon the HTML tags found within it. A web server is a computer that stores HTML documents, otherwise known as web pages, and waits for connections from web browsers. When a web browser connects to a web server, the web server sends the requested document, if it exists, back to the web browser for display.
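The relationship between the two can be tried out on a single computer. The Python sketch below starts a toy web server that stores exactly one page, then plays the part of the browser by connecting to it and asking for a document. It uses the standard library's http.server and http.client modules. OnePageHandler, fetch_from_local_server, and the stored page are all made up for this illustration, and the sketch only retrieves the page rather than displaying it.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<b>Hello World!!!!</b>"  # the only document this toy server stores


class OnePageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/helloworld.html":
            # The requested document exists, so send it back.
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(PAGE)))
            self.end_headers()
            self.wfile.write(PAGE)
        else:
            # The requested document does not exist.
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_from_local_server(path):
    """Start the toy server, request one path from it, and return (status, body)."""
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), OnePageHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # This part plays the role of the web browser.
        connection = HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", path)
        response = connection.getresponse()
        result = (response.status, response.read())
        connection.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_from_local_server("/helloworld.html"))
```

Asking for /helloworld.html returns status 200 along with the page, while asking for any other path returns status 404, the server's way of saying the document does not exist.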
Actually Browsing a Web Site
Now that you understand the basics behind how the Web works, let's walk through the actual process of how your computer goes to a web site and displays it in your browser.
The first step, of course, is to open your web browser, whether that be Netscape, Internet Explorer, or Mozilla. When your browser opens, you have the option of connecting to another web site. In the address field, type the location where you would like to go. For this example, let's go to www.bleepingcomputer.com.
You type http://www.bleepingcomputer.com (or just www.bleepingcomputer.com, as the http:// is optional) in the address field and press Enter or Go. The diagram below explains what happens:

As you can see, when you try to connect to a site, your web browser opens an Internet connection and tries to connect to the web server specified in the host portion of the URL. If it connects, the web browser sends the web server the path portion of the URL. If that path exists on the web server, the web server sends the content of the HTML file back to your browser. Your browser reads through the HTML of the document, following the instructions found there as it displays the information on your screen.
That is all there is to retrieving a web page from a remote computer.
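The message the browser sends in the walkthrough above can be sketched in a few lines of Python. The build_get_request function below is a hypothetical helper written for this tutorial. It works out which host and port to connect to and builds the text of a simple HTTP request for the path. It does not open a connection, and real browsers send many more header lines than the two shown here.

```python
from urllib.parse import urlsplit


def build_get_request(address):
    """Return (hostname, port, request_text) for a plain HTTP request."""
    # The protocol is optional in the address field, so assume http if missing.
    if "://" not in address:
        address = "http://" + address
    parts = urlsplit(address)
    port = parts.port or 80      # port 80 is the default for http
    path = parts.path or "/"     # no path means the site's front page
    request_text = (
        f"GET {path} HTTP/1.1\r\n"      # ask for the path portion of the URL
        f"Host: {parts.hostname}\r\n"   # name the site we want
        "Connection: close\r\n"         # ask the server to hang up when done
        "\r\n"                          # a blank line ends the request
    )
    return parts.hostname, port, request_text


if __name__ == "__main__":
    hostname, port, request_text = build_get_request("www.bleepingcomputer.com")
    print(f"Connect to {hostname} on port {port} and send:")
    print(request_text)
```

The browser would connect to the hostname on the given port, send this text, and then read the web server's reply, which contains the HTML of the page.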
Conclusion
I hope you have enjoyed this tutorial. As always, if you have any questions, do not hesitate to ask us in the forums.
--
Lawrence Abrams
Bleeping Computer Basic Internet Concepts Series
BleepingComputer.com: Computer Support & Tutorials for the beginning computer user.