Paul McFedries' Web Home


Navigating the Internet, Third Edition

Chapter 11—Global Hypertext: The World Wide Web


Contents:
The Wide World of the Web
The Textual Web, Part I: The Line Mode Browser
The Textual Web, Part II: Lynx
More Good Web Starting Points
Dealing with Files
Winnowing the Web: Search Tools
Other Web Search Tools
A Web Publishing Primer
What Do I Need to Get Started?
The Basic Structure of an HTML Document

All growth is a leap in the dark, a spontaneous unpremeditated act without benefit of experience.—Henry Miller

Although Telnet, FTP, and Gopher are powerful ways to surf the oceans of the Internet, they're by no means the only seaworthy craft available to intrepid Netfarers. In recent years, a new type of ship has emerged and it's quickly becoming the standard method for Internet navigation: the World Wide Web.

The World Wide Web (or, if you prefer less of a mouthful: the Web, W3, or WWW) is no mere dinghy drifting directionlessly on the currents of cyberspace. No, this baby's a veritable luxury liner with all sorts of fancy-schmancy new bells and whistles that make navigating the Net easy and even—gasp!—fun. This chapter introduces you to the World Wide Web, takes you through some example Web sessions from the UNIX prompt, and even shows you how to publish your own Web material. The next chapter (Chapter 12, "Windows on the Web: A Field Guide to Graphical Browsers") gives you the scoop on some programs that'll help you get the most out of the Web.

**NOTE**
If you'd like to see a version of this chapter in World Wide Web format, send your
Web browser to the following URL:
http://www.mcfedries.com/Books/Navigating/sample.asp.

The Wide World of the Web

The World Wide Web was invented in the early 1990s by Tim Berners-Lee while working at the European Laboratory for Particle Physics (CERN) in Geneva, Switzerland. The idea was to develop a service that would make it easier to access and share Net resources, and to move from one island of cyberspace to another.

Hypertext: The Secret of the Web's Success

As you saw in Chapter 10, "Navigating By Menus: Gopher," selecting an item from a Gopher menu could grab you a file from some other Net locale or even telnet you to another Net computer. The World Wide Web takes this idea of leaping from place to place in cyberspace to its highest form: hypertext. Hypertext is information that contains certain keywords (or keyphrases) that are links to other Net resources. When you select the link, the underlying resource is automatically displayed on your terminal.

For example, a hypertext version of this chapter might designate, say, "European Laboratory for Particle Physics (CERN)" as a keyphrase. Selecting this link might then display a document that tells you more about CERN. That document may contain its own hypertext links that you can follow, and so on.

One of the key points about hypertext documents is that they don't have to present information in a hierarchical fashion (like, say, a Gopher menu). Any word or phrase can be designated a hypertext link. Heck, there's no reason the link even has to be a word or phrase; a picture or button would do just as well. And there's also no reason why the link should point to only text documents. Why not start a telnet session, FTP a file, or even access a Usenet newsgroup? As you'll see, the Web can do all this and more.
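To make the idea a little more concrete, here is a minimal Python sketch of a hypertext "web" as a set of documents whose keyphrases point at other documents. The document names and text are invented for illustration; real Web documents live on servers and are addressed by URLs.

```python
# A toy hypertext "web": each document has some text and a set of
# keyphrase links. All names and contents here are made up.
documents = {
    "chapter11": {
        "text": "The Web was invented at CERN.",
        "links": {"CERN": "about-cern"},
    },
    "about-cern": {
        "text": "CERN is a particle physics laboratory near Geneva.",
        "links": {"Geneva": "about-geneva", "the Web": "chapter11"},
    },
    "about-geneva": {
        "text": "Geneva is a city in Switzerland.",
        "links": {},
    },
}


def follow(current, keyphrase):
    """Return the name of the document that a keyphrase links to."""
    return documents[current]["links"][keyphrase]


here = "chapter11"
here = follow(here, "CERN")       # jump to the CERN document
print(documents[here]["text"])
here = follow(here, "the Web")    # links can point anywhere, even back
print(here)
```

Notice that nothing forces the links into a tree: the CERN document points straight back to the chapter it was reached from, which is exactly what a hierarchical menu can't do.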

The Advantages of the Web

Hypertext is a powerful concept that's revolutionizing the way people navigate the Net. In particular, the Web's hypertext nature brings four advantages to the table:
  • The Web is non-linear. Services such as Gopher require you to trudge through menu after menu to get to the good stuff. Similarly, FTP sites store their files hierarchically in directories. Getting what you want out of these services isn't terribly hard, but it can be mind-numbingly tedious. With Web documents, you just select the link you want and off you go. (It should be said, however, that traipsing after links hither and thither around the globe presents its own navigational problems. Non-linearity can, sometimes, lead to chaos.)

  • The Web is graphical. When the Web's architects were designing the protocols that would transport hypertext documents, they were smart enough to anticipate the coming multimedia revolution. In particular, they didn't restrict Web pages to mere text. Instead, they made it possible for hypertext files to contain pictures, buttons, fancy fonts, radio buttons, check boxes, and more. Depending on the client software you use to access the Web, hypertext pages can be a real feast for the eyes and ears.

  • The Web is interactive. By "interactive" we don't just mean you can select links until your fingers are numb. Rather, some of the more sophisticated Web documents are truly interactive: you can enter text, fill in forms, select options, run programs, play sounds, even paint pictures. Clearly, this is a major leap forward from the Net's usual find-it-get-it-read-it modus operandi.

  • The Web is (almost) all-encompassing. As we've said, there's nothing that restricts hypertext links to pointing only at text files. The Web is configured in such a way that a particular link might connect with a Web server to open a different hypertext document, start an FTP session, tunnel to a Gopher menu, crank up an Archie or Veronica search, telnet to a remote computer, or download the articles from a Usenet newsgroup. In other words, the Web has a shot at becoming the first "Swiss Army Knife" of Internet services.
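As a rough illustration of that last point, the Python sketch below looks at the first part of a link's address (its URL, which is defined in the glossary later in this chapter) and reports which kind of service the link would start. The `SERVICES` table and `describe_link` function are our own inventions, the Gopher host is a placeholder, and the newsgroup name is only an example; a real browser does far more than look up a word in a table.

```python
from urllib.parse import urlsplit

# Map the protocol part of a URL to the service a link would start.
# This table is illustrative, not a complete list of what browsers support.
SERVICES = {
    "http": "open a hypertext document on a Web server",
    "ftp": "start an FTP session",
    "gopher": "tunnel to a Gopher menu",
    "telnet": "telnet to a remote computer",
    "news": "read a Usenet newsgroup",
}


def describe_link(url):
    """Say what kind of service a link's URL points at."""
    scheme = urlsplit(url).scheme
    return SERVICES.get(scheme, "unknown service")


links = [
    "http://www.yahoo.com/",
    "ftp://ftp2.cc.ukans.edu/pub/WWW/lynx",
    "gopher://gopher.example.org/",      # placeholder host
    "telnet://www0.cern.ch",
    "news:comp.infosystems.www.misc",    # example newsgroup name
]

for url in links:
    print(f"{url} -> {describe_link(url)}")
```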

Some Web Words to Live By

Like all Net services, the Web has its own vernacular and acronyms. To help you out as we work through this chapter and the next, here's a rundown of some common Web jargon:
browser The client software you use to display and interact with a Web hypertext document. There are two kinds of browsers: those that can display only text and those that support graphics and other non-text elements. If you're connecting to the Net through a shell account, you'll only be able to use one of the text-only browsers (either on your service provider's system or by telneting to a remote computer). To get the splendor of a graphical browser, you need either a SLIP/PPP connection or a direct connection. We'll cover two text-only browsers in this chapter (the basic line mode browser and Lynx), and we'll tackle some graphical browsers in Chapter 12, "Windows on the Web: A Field Guide to Graphical Browsers."

form A Web document used for gathering information from the reader. Most forms have at least one text field where you can enter text data (such as your name or the keywords for a search). More sophisticated forms also include check boxes (for toggling a value on or off), radio buttons (for selecting one out of several options), and push buttons (for performing an action such as submitting the form).
**NOTE**
If you're a shell account user, but you have Microsoft Windows installed on your
own computer, you may not be stuck in the text-only Web after all. A new browser called
SlipKnot is a Windows application that can display Web graphics without a TCP/IP
connection. See Chapter 12 for details.

home page The first hypertext document displayed when you follow a link to a Web server.

HTML (Hypertext Markup Language) The encoding scheme used to format a Web document. The various HTML symbols define hypertext links, reference graphics files, and designate non-text items such as buttons and check boxes. For the basics of HTML, see "A Web Publishing Primer," later in this chapter.

HTTP (Hypertext Transfer Protocol) The protocol used by the Web to transfer hypertext documents and other Net resources.

hyperlink Another name for a hypertext link.

URL (Uniform Resource Locator) A Web addressing scheme that spells out the exact location of a Net resource. Most URLs take the following form:

     <protocol>://<host.domain>/<directory>/<file.name> 

     <protocol>       The network protocol to use for retrieving 
                      the resource (such as http or ftp).
     <host.domain>    The domain name of the host computer where 
                      the resource resides.
     <directory>      The host directory that contains the resource.
     <file.name>      The filename of the resource.

Web server A program that responds to requests from Web browsers to retrieve resources. This term is also used to describe the computer that runs the server program.
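If you'd like to see the URL form from the glossary in action, the following Python sketch splits a URL into the four pieces described there, using the standard library's `urllib.parse` module. The `split_url` helper and its dictionary keys are our own naming; the sample URL is the one mentioned at the start of this chapter.

```python
import posixpath
from urllib.parse import urlsplit


def split_url(url):
    """Break a URL into protocol, host, directory, and filename."""
    parts = urlsplit(url)
    # The path looks like /<directory>/<file.name>; split off the last piece.
    directory, filename = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "directory": directory,
        "file": filename,
    }


pieces = split_url("http://www.mcfedries.com/Books/Navigating/sample.asp")
for name, value in pieces.items():
    print(f"{name:10} {value}")
```

Not every URL has all four pieces. A URL such as http://www.yahoo.com/ has no filename, in which case this sketch simply returns an empty string for it.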

The Phenomenal Growth of the World Wide Web

The explosion of interest in the Internet over the past couple of years has produced some spectacular growth rates: Gopher traffic is up 197%, NSFnet traffic is up 110%. These are big numbers, but they pale in comparison to perhaps the most startling statistic ever generated by the Net:

In 1993, traffic on the World Wide Web increased by 443,931%.

Yes, you read that right: 443,931 percent! That's a truly mind-boggling number and it tells us, if nothing else, that something's going on here that we need to check out. (That kind of growth is, as you can well imagine, unsustainable. To wit: Web traffic grew by "only" 1,713% in 1994!) Clearly the advantages we looked at in the last section had more than a little to do with it, but I think we can also identify a few other reasons:

  • The online world was ready for a graphical approach. For the first 20 years of its existence, the Net was staunchly pro-text. But the advent of the Macintosh interface, the success of Windows, and the grudging acceptance of X Window in the UNIX community brought graphical user interfaces into the mainstream. This is especially true of the neophytes who've flocked to the Net in recent years. Most of them came from GUI backgrounds and so were eager to embrace a system that was at least slightly familiar.

  • Many of these same net.newusers were put off by the multitude of tools required to surf cyberspace. It was confusing to have to fire up separate programs for each service—programs that almost invariably had completely different interfaces. For them, the Web's "Jack-of-all-trades" approach is certainly appealing.

  • One of the biggest contributors to the Web's surge in popularity was undoubtedly the NCSA Mosaic browser. This was the first browser software that really took advantage of the Web's most innovative features. Its attractive interface showed the Web in its best light, and not just to UNIX mavens: Mosaic's early cross-platform strategy brought the wonders of the Web to Windows and Mac users, as well.
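A quick aside on the growth figures quoted at the start of this section: a percentage increase is easier to grasp as a multiplier. The small Python sketch below does the conversion for the numbers given above (the `growth_factor` helper is just our own shorthand for the arithmetic).

```python
def growth_factor(percent_increase):
    """Convert a percentage increase into a multiplier.

    A 100% increase means traffic doubled, so the factor is 1 + pct/100.
    """
    return 1 + percent_increase / 100


figures = [
    ("NSFnet", 110),
    ("Gopher", 197),
    ("Web, 1993", 443_931),
    ("Web, 1994", 1_713),
]

for label, pct in figures:
    factor = growth_factor(pct)
    print(f"{label}: up {pct:,}% is roughly {factor:,.1f} times the earlier traffic")
```

In other words, the 1993 figure means Web traffic ended the year at roughly 4,440 times its starting level, while Gopher traffic merely tripled.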

The Textual Web, Part I: The Line Mode Browser

Okay, enough theory. Let's actually get on the Web and start browsing around. For our initial foray, we'll use the basic line mode browser that's available on many systems. Later we'll show you how to use one of the most popular text-only browsers: Lynx.

Getting Your Web Feet Wet

Without further ado, let's crank up the line mode browser. You have two choices:
  • If the line mode browser is installed on your service provider's system, type www and press Return.
  • Use the line mode browser provided by CERN by telneting to www0.cern.ch (as explained in Chapter 7, "Remote Control: Telnet"). You don't need to enter a login name or a password.

As an example for this section, we'll telnet into CERN and borrow their browser for a while. Here's the opening hypertext document you'll see:

$ telnet www0.cern.ch
Trying 128.141.201.214...
Connected to www0.cern.ch.
Escape character is '^]'.


UNIX(r) System V Release 4.0 (www0)


Last login: Thu Mar 30 17:04:07 from 144.92.23.31
WWW Alert:  Can't save data to file -- please run WWW locally
                                              Welcome to the World-Wide Web
                              THE WORLD-WIDE WEB

This is just one of many access points to the web, the universe of
information available over networks. To follow references, just type the
number then hit the return (enter) key.

The features you have by connecting to this telnet server are very primitive
compared to the features you have when you run a W3 "client" program on your
own computer.  If you possibly can, please pick up a client for your
platform to reduce the load on this service and experience the web in its
full splendor.

For more information, select by number:

   A list of available W3 client programs[1]

   Everything about the W3 project[2]

   Places to start exploring[3]

Have fun!

1-3, Up, <RETURN> for more, Quit, or Help: 

With the line mode browser, the hypertext links are denoted by numbers in square brackets (e.g., Places to start exploring[3]). The bottom line tells us that this document has three links (1-3). Table 11.1 lists some of the commands you can run at the line mode browser screen. In each case, type the command's key or keys and press Enter.

Table 11.1. Some line mode browser commands.


     Command          Key        What it does
     Next page        Enter      Displays the next page (if any) of the current document.
     Top              t          Moves to the top of the current document.
     Bottom           bo         Moves to the bottom of the current document.
     Up               u          Moves up one page in the current document.
     Down             d          Moves down one page in the current document.
     Select link #    <n>        Selects the link given by the number <n>.
     Back             b          Returns to the previous document.
     Next             n          Selects the next link in the document that led to the current document.
     Previous         p          Selects the previous link in the document that led to the current document.
     Go               g <URL>    Displays the resource given by <URL>.
     Home             ho         Returns to the first document.
     Recall           r          Displays a numbered list of the documents you've visited.
     Recall           r <n>      Displays document number <n> from the Recall list.
     Quit             quit       Exits the Line Mode browser.
**NOTE**
The Next and Previous commands may require a bit more explanation. In a nutshell,
they enable you to navigate the links in a document without having to return to it. For
example, suppose a document has three links: [1], [2], and [3] and you select link [1]. If you
enter, say, the Next command while viewing the new document, the browser will select link
[2] from the original document. If you then run the Previous command, the browser
will select link [1] again.
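
If the Next and Previous commands still seem slippery, this small Python model may help. It is only a sketch of the behavior described in the note, not the line mode browser's actual code: the `LineModeSession` class and the document names are invented, and raising an error when you run off either end of the link list is our assumption.

```python
class LineModeSession:
    """Toy model of the Next and Previous commands."""

    def __init__(self, links):
        self.links = links      # link targets of the original document, in order
        self.selected = None    # index of the link that was followed last

    def select(self, n):
        """Follow link [n], numbered from 1 as on screen."""
        self.selected = n - 1
        return self.links[self.selected]

    def next(self):
        """Follow the link after the one that led to the current document."""
        if self.selected is None or self.selected + 1 >= len(self.links):
            raise IndexError("no next link")
        self.selected += 1
        return self.links[self.selected]

    def previous(self):
        """Follow the link before the one that led to the current document."""
        if self.selected is None or self.selected == 0:
            raise IndexError("no previous link")
        self.selected -= 1
        return self.links[self.selected]


session = LineModeSession(["doc-one", "doc-two", "doc-three"])
print(session.select(1))    # follow link [1]
print(session.next())       # jumps to link [2] of the original document
print(session.previous())   # back to link [1]
```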

Trying Out Some Links

Now that you know your way around the line mode browser, let's surf some links to get comfy with the Web. For starters, select the third link (Places to start exploring[3]). Here's the new page you'll see:

                                               Overview of the Web
[1]
                              GENERAL OVERVIEW OF THE WEB

There is no "top" to the World-Wide Web. You can look at it from many points 
of view. Here are some places to start.

by Subject[2]           The Virtual Library organises information by subject
                        matter.

List of servers[3]      All registered HTTP servers by country

by Service Type[4]      The Web includes data accessible by many other
                        protocols. The lists by access protocol may help if
                        you know what kind of service you are looking for.

If you find a useful starting point for you personally, you can configure
your WWW browser to start there by default.

___________________________________ 


1-5, Back, Up, <RETURN> for more, Quit, or Help:

From this page, we can surf the Web by subject (link [2]), by server (link [3]), or by service type (link [4]). Let's try the by Service Type[4] link:

                                     Data sources classified by access protocol 

                    RESOURCES CLASSIFIED BY TYPE OF SERVICE
See also categorization exist by subject[1] .  If you know what sort of a
service you are looking for, look here:

World-Wide Web servers[2] List of W3 native "HTTP" servers. These are generally
                          the most friendly. See also: about the WWW
                          initiative[3] .

WAIS servers[4]           Find WAIS index servers using the directory of
                          servers[5] , or lists by name[6] or domain[7] . See
                          also: about WAIS[8] .

Network News[9]           Available directly in all www browsers. See also this
                          list of FAQs[10] .

Gopher[11]                Campus-wide information systems, etc, listed
                          geographically. See also: about Gopher[12].

Telnet access[13]         Hypertext  catalogues by Peter Scott. See also: list
                          by Scott Yanoff[14] . Also, Art St George's index[15]
1-27, Back, Up, <RETURN> for more, Quit, or Help: 

Here we see a list of Internet services that you can access through the World Wide Web. Pressing Enter displays even more links to Net services:

                                     Data sources classified by access protocol (45/49)
                          (yet to be hyperized) etc.

VAX/VMS HELP              Warning: this is no longer working with http 1.0 .
                          This is a known bug . Try it[16] ; Available using the
                          help gateway[17] to WWW.

Anonymous FTP[18]         Tom Czarnik's list of (almost) all sites. Search them
                          all with full hypertext archie gateways[19] (or telnet
                          to ARCHIE[20] )-- An index of almost everything
                          available by anonymous FTP.

TechInfo[21]              A CWIS system from MIT. Gateway access thanks to
                          Linda Murphy/Upenn.

X.500[22]                 Directory system originally for eletronic mail
                          addresses. (Access: Slightly uneven view though gopher
                          gateway in Michigan[23], or telnet to UC London
                          service[24]).

WHOIS[25]                 A simple internet phonebook system.

Other protocols           Other forms of online data[26] .
1-27, Back, Up, <RETURN> for more, Quit, or Help:

We'll try out one of the services in a sec, but the third link, about the WWW initiative[3], looks interesting, so let's check it out (note that you can still select the link, in this case by pressing 3 and Enter, even if you've moved to the next page):

                           The World Wide Web Initiative:

  The Project (23/88)
                              THE WORLD WIDE WEB

   Announcement: 4th WWW conference[1]
   Clarification on security protocols[2]
   ___________________________________
   The WorldWideWeb (W3) is the universe of network-accessible information, an
   embodiment of human knowledge. It is an initiative started at  CERN[3], now
   with many participants.

   It has a body of software, and a set of protocols and conventions. W3 uses
   hypertext and multimedia techniques to make the web easy for anyone to roam,
   browse, and contribute to.

   The W3 Consortium[4]  now ensures the continued interopability which is W3
   though its rapid evolution. This is run by MIT[5] with INRIA[6] acting as
   European host, in collaboration with  CERN[7] where the web originated. (See
   hosts[8]).

   Everything there is to know about W3 is linked directly or indirectly to
   this document.
1-45, Back, Up, <RETURN> for more, Quit, or Help: 

This document tells you about the World Wide Web and features tons of links (a whopping 45 in all!). For example, you could select the third link—CERN[3]—to get info about CERN:

                                                         
                                                                   CERN Welcome
   The European Laboratory for Particle Physics, located near Geneva[1] in
   Switzerland[2] and  France[3].  Also the birthplace of the World-Wide
   Web[4].

   This is the CERN laboratory main server. The support team provides a set of
   Services[5] to the physics experiments and the lab. For questions and
   suggestions, see WWW Support Contacts[6] at CERN
   ___________________________________
   About the Laboratory[7] - Hot News[8] -  Activities[9] - About Physics[10] -
    Other Subjects[11] - Search[12]
   ___________________________________

About the Laboratory

      Help[13] and  General information[14], divisions, groups and
      activities[15] (structure),  Scientific committees[16]

      Directories[17] (phone & email, services & people), Scientific
      Information Service[18] (library, archives or Alice), Preprint[19] Server

      News from the Users' Office[20], current seminars[21], CERN schools[22],
      internal newsletters[23], internal news groups[24] and other news[25].
1-43, Back, Up, <RETURN> for more, Quit, or Help: 

Again, you get a boatload of links, all of them dealing with CERN and related topics. Suppose now you decide you'd like to access Gopher from the Web. You could get back to the Data sources classified by access protocol document by running the Back command twice, but let's try out a different method. Pressing r (the Recall command) and Enter displays the following:

Documents you have visited:-
R  1)   in Welcome to the World-Wide Web
R  2)   in Overview of the Web
R  3)   in Data sources classified by access protocol
R  4)   in The World Wide Web Initiative:
R  5)   CERN Welcome

This is a list of the Web documents you've linked to on this trip. As you can see, the document we want is number 3, so we can get there directly by typing r 3 and pressing Enter. Now that we're back in the Data sources classified by access protocol document, we can try out, say, some Web Gopher tunneling. To do this, you'd select the Gopher[11] link. This takes you to the Gopher server at the University of Minnesota:

Select one of:
         All the Gopher Servers in the World[1]
         Search All the Gopher Servers in the World[2]
         Search titles in Gopherspace using veronica[3]
         Africa[4]
         Asia[5]
         Europe[6]
         International Organizations[7]
         Middle East[8]
         North America[9]
         Pacific[10]
         Russia[11]
         South America[12]
         Terminal Based Information[13]
         WAIS Based Information[14]
         Gopher Server Registration[15]
[End]
1-15, Back, Up, Quit, or Help: 

From here, choosing menu items or displaying files is a simple matter of selecting the appropriate links. As you can see, lots of information is available, and the linkages make it a very rich system to use. It gives you the ability to follow a chain of interrelated ideas. However, you'll find that getting sidetracked is a real danger!
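
The Recall list we used in this session is really just a numbered history of where you've been. Here's a minimal Python sketch of the idea, with the document titles taken from the session above. The `RecallList` class is invented for illustration, and we've assumed that recalling a document doesn't itself change the list, which the real browser may or may not do.

```python
class RecallList:
    """Toy model of the line mode browser's Recall command."""

    def __init__(self):
        self.visited = []

    def visit(self, title):
        """Record a document as it is displayed."""
        self.visited.append(title)

    def listing(self):
        """Return the numbered list shown by the r command."""
        return [f"{i}) {title}" for i, title in enumerate(self.visited, start=1)]

    def recall(self, n):
        """Return document number n, as the r <n> command would."""
        return self.visited[n - 1]


history = RecallList()
for title in [
    "Welcome to the World-Wide Web",
    "Overview of the Web",
    "Data sources classified by access protocol",
    "The World Wide Web Initiative",
    "CERN Welcome",
]:
    history.visit(title)

print("\n".join(history.listing()))
print(history.recall(3))
```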

The Textual Web, Part II: Lynx

The line mode browser is fine as far as it goes, but it does suffer from some glaring limitations:

  • If you telnet into CERN's browser, you can't save files to your local computer. Also, some Internet services aren't available via the Web to non-CERN users.
  • The line mode browser doesn't support forms.
  • There's no way to save your favorite URLs for easy access later on.

To bypass these limitations and get an interface that's a step up from the line mode browser's hard-to-read jumble of hypertext link numbers, why not try the Lynx browser on for size? Lynx is becoming (if it's not already) the preferred Web surfing tool for shell account users and those without access to a direct TCP/IP Net connection. This section introduces you to Lynx, and then we'll use Lynx throughout the rest of this chapter to explore some truly useful Web nooks and crannies.

Launching Lynx

If you have Lynx on your local computer, you can start the program in one of two ways:

  • Type lynx and press Enter. In this case, the first document you'll see will either be your service provider's home page (if they have one), or a Web site that the provider has selected as a starting point.
  • Type lynx <URL> and press Enter. This method starts Lynx and loads the Web document specified by <URL>.

If you don't have Lynx on your system, you can telnet to one of the sites listed in Table 11.2 and use it from there.

Table 11.2. Telnet addresses for Lynx servers.


     Address                   Location          Login
     ukanaix.cc.ukans.edu      Kansas            www
     www.njit.edu              New Jersey        www
     fatty.law.cornell.edu     New York          www
     sunsite.unc.edu           North Carolina    lynx
     www.twi.tudelft.nl        Netherlands       lynx
**NOTE**
If you'd prefer to use Lynx on your local computer, you can get it via anonymous
FTP from ftp2.cc.ukans.edu in the directory /pub/WWW/lynx. Once you're in, change
to the directory with the latest version of Lynx. For example, the current version is
2.3.7, so you'd run the command cd lynx2-3-7. In the new directory, get the
README file for instructions on how to proceed.

In our case, we'll start Lynx at our favorite World Wide Web launch pad: the Yahoo Web server. Here's how to get there:

  • From the UNIX prompt, type lynx http://www.yahoo.com/ and press Enter.
  • If Lynx is already loaded, type g to select the Go command, type http://www.yahoo.com/, and then press Enter.

In either case, you'll see a screen similar to the one shown in Figure 11.1.

Figure 11.1. A Lynx screen showing the home page of the Yahoo Web server.

The Lay of the Lynx Land

There are four main elements to the Lynx screen:

  • Hypertext links appear either in a different color (they're red on our version of Lynx) or as boldface.
  • The current link is highlighted.
  • The third line from the bottom gives you instructions.
  • The bottom two rows give you a list of commonly used Lynx commands.
**NOTE**
The command reminders at the bottom of the screen are handy when you're a Lynx
neophyte, but you'll probably find you don't need them after you've become a true
Webmeister. To hide them and get more screen real estate for Web documents, type o
to select the Options command. In the Options Menu that appears, press u to select the
U)ser mode field and then press the Spacebar (or any other key except Enter) to cycle
through the choices. If you select Intermediate, Lynx removes the command mnemonics
from the bottom of the screen. If you select Advanced, Lynx uses the bottom line of
the screen to display the URL of the current document. When you have made your
choice, press Enter and then press r to return to the regular Lynx screen.

To move through the links, press either the down arrow key (to move right and down) or the up arrow key (to move left and up). Once the link you want is highlighted, press the right arrow key to select it (you can also press Enter). To go back to the previous document, press the left arrow key. Table 11.3 summarizes the other Lynx commands you can run.

Table 11.3. A summary of Lynx commands.


     Command              Key              Description
     Next link            Down arrow       Move to the next link.
     Previous link        Up arrow         Move to the previous link.
     Select link          Right arrow      Select the current link.
     Select link          Enter            Select the current link.
     Previous document    Left arrow       Return to the previous document.
     History list         Backspace        View your past links in the current session.
     History list         Delete           Same as Backspace.
     Next page            + or Spacebar    Display the next page in the current document.
     Previous page        - or b           Display the previous page in the current document.
     Search document      /                Search for a string in the current document.
     Toggle source        \                Toggle the current document between the source view and the rendered view.
     Link info            =                Display the address information for the current file or link.
     Help                 ? or h           Display the Lynx HELP! screen.
     Add bookmark         a                Add the current document to your bookmark file.
     Comment              c                Send a comment to the creator of the document.
     Download             d                Download the current document.
     Go                   g                Go to a specific resource.
     Main screen          m                Return to the main screen (the home page).
     Next search          n                Find the next instance of the search string specified with /.
     Options              o                Set some Lynx options (such as your e-mail address).
     Print                p                Print, save, or download a document.
     Quit                 q                Quit Lynx with confirmation.
     Quick quit           Q                Quit Lynx without confirmation.
     Reload               Ctrl+R           Reload the current document.
     View bookmarks       v                View your bookmark file.
     Cancel transfer      z                Cancel the document or image transfer in progress.

A Sample Session

Okay, let's give Lynx a test drive by trying out a few links. We'll begin by pressing the down arrow key to highlight Yahoo's What's Cool? link, and then pressing the right arrow key to display the screen shown in Figure 11.2.

Figure 11.2. Yahoo's COOL LINKS page.

From here, we'll press the down arrow key until the Britannica's Birthday Calendar link is highlighted. Pressing the right arrow key displays the BRITANNICA'S BIRTHDAY CALENDAR page, as shown in Figure 11.3. This is your first look at a Web form. In this example, the idea is to enter a month and a day and the document will provide you with biographies of all the famous people born on that day.

Figure 11.3. The BRITANNICA'S BIRTHDAY CALENDAR page.

To remind you, a form is a special type of Web document that's used to gather information from you. Although forms can have traditional hypertext links to other documents, most form links are actually special elements that are similar to the controls you see in dialog boxes (check boxes, radio buttons, etc.). To help you out when filling in a form, Lynx's instructions line (the third line from the bottom) tells you what kind of control is currently highlighted.
For the BRITANNICA'S BIRTHDAY CALENDAR page, the first control is an option list (also called a selection list). Here's how option lists work:

  1. With the option list highlighted, press Enter. This displays a list of options. For example, the Pick a month option list looks like this when you open it:

         *************
         * January   *
         * February  *
         * March     *
         * April     *
         * May       *
         * June      *
         * July      *
         * August    *
         * September *
         *************
    

  2. Use the up arrow and down arrow keys to highlight the option you want.
  3. Press Enter. Lynx fills in the control with the new value.

You'd follow the same procedure to fill in the second option list, which is labeled "and a day." When you're done, you'd then highlight the third control on the form: Show biographies. This control is a form submit button, and selecting it sends the form data to the Web server for processing. A few seconds later, a new page appears showing you the list of people born on the date you chose. Figure 11.4 shows the results for August 23rd, my (Paul McFedries) birthday (nudge, nudge, wink, wink).

Figure 11.4. The results that appear after the form has been submitted.
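
What does "sending the form data" actually involve? In the common case, the browser packs each control's name and value into a single encoded string and hands it to the server. The Python sketch below shows that encoding with the standard `urllib.parse` module. Note that the field names `month` and `day` are our guesses for illustration; the names the Britannica form really uses aren't shown on screen.

```python
from urllib.parse import parse_qs, urlencode

# Values picked in the form's two option lists.
# The field names are hypothetical.
form_data = {"month": "August", "day": "23"}

# What the browser would send: name=value pairs joined with "&".
encoded = urlencode(form_data)
print(encoded)

# What the server-side program sees after decoding the string.
decoded = parse_qs(encoded)
print(decoded)
```

The server-side program then uses the decoded values to look up the matching biographies and builds the results page on the fly.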

A Closer Look at the Yahoo Server

As we've said, Yahoo is one of the best places on the Net to begin your World Wide Web expeditions. Let's head back to Yahoo so we can take a closer look at how it's set up. You can use any of the following three methods to return to Yahoo's home page:
  • Press the left arrow key until you're back to the home page.
  • Type m and press Enter to display the following prompt: Do you really want to go to the Main screen? (y/n) [n]. For prompts like these, the letter in the square brackets is the default choice and you select it by pressing Enter. In this case, press y (for yes), instead.
  • Press Backspace or Delete to see a list of the links you've visited in the current Lynx session, as shown in Figure 11.5. Highlight the document you want (Yahoo, in our case) and press the right arrow key.

Figure 11.5. Press Backspace or Delete to see a list of the places you've been in the current Lynx session.

While most Web servers are dedicated to a specific topic or a specific category, the Yahoo server is designed to be a general directory of Internet resources. At the time of writing, Yahoo's database of Net resources contained over 35,000 entries and was growing fast.

**NOTE**
35,000 sure sounds like a truckload, but, these days, Web documents number in the millions.
The Yahoo authors, however, have chosen quality over quantity, so you know most of
the links will be worthwhile.

As you might expect, then, the Yahoo home page is jam-packed with links of all kinds. The top line, for example, has links for What's New? (Web documents added to the Yahoo database in recent days), What's Cool? (resources that the Yahoo authors think are "cool"), What's Popular? (Yahoo's top 50 categories and documents), Search (finding stuff in the Yahoo database; discussed in detail below), and Help (instructions for using Yahoo).

The nitty-gritty of Yahoo, however, is the list of subject categories. These categories run the gamut from Art to Society and Culture. In each case, the entry shows the number of links associated with the category and whether or not anything new has been added to the category in the last three days.

To get a feel for how things are organized, let's do some surfing. Suppose, for example, that we want to know who directed the movie One Flew Over the Cuckoo's Nest. At first you might think we should use the Yahoo Search facility to accomplish this. Unfortunately, Search only covers the titles, URLs, and descriptions of the documents in the Yahoo database, so it isn't likely to be of use in this instance.

Instead, we'll begin our quest by selecting the Entertainment category in Yahoo's home page. Figure 11.6 shows the resulting document.

Figure 11.6. The links in Yahoo's Entertainment category.

There are over 40 different links for the various Entertainment subcategories. The Movies and Films subcategory is the one we want, so we select it. Figure 11.7 shows the result.

Figure 11.7. Selecting the Movies and Films link displays this screen.

Wow! That's a lot of potential movie info (there are over three hundred links in this document). This illustrates one of the big problems you'll find as you navigate the Net: sometimes there's just too much data.

You'll notice that the first three links (Actors and Actresses, Animation, and Awards) look like most of the other links we've seen, but the next two are different. Instead of numbers beside them, we see either nothing (in the case of CinemaSpace title page and xcohen reference) or a description (in the case of the CineMedia Site link). This tells us that these are links to specific files (text most likely).

Our first instinct was to try the Directors link, but that proved to be just a bunch of links pointing to documents about famous directors. No help at all, unfortunately. A better choice might be The Internet Movie Database. So we give it a shot and get the screen shown in Figure 11.8.

Figure 11.8. Yahoo's links to Cardiff's Movie Database Browser.

There are actually seven links displayed for the Movie Database Browser: one with an asterisk (*), and several others from around the world. We'd normally choose the link closest to us (USA), but we know from reading Yahoo's Help page that sites with asterisks are particularly good. So we select The Internet Movie Database - UK [*] link. Now, for the first time, we're out of Yahoo and doing some serious surfing. The link took us to a Web server called The Internet Movie Database, located in Cardiff, England. Most of the screen is introductory rambling, so we press the Spacebar to head for the next page (shown in Figure 11.9).

Figure 11.9. The second page of the Internet Movie Database.

Ah, now we're getting somewhere. This page is actually a form with a text entry field that allows you to enter a movie title. (The field is the blank space to the left of the Search for THIS movie title link.) So we type cuckoo and then select the Search for THIS movie title link.

**NOTE**
When filling in a text entry field, you can press the Backspace key if you make
a typing boner. If the entire field is a mess, you can start over by pressing
Ctrl+U.

Why just cuckoo? Well, for three reasons:

  • As a general rule, the more words you give an Internet search engine, the longer it takes to process the query.
  • The word "cuckoo" is unique enough that it should find the movie we want without matching a million other movies.
  • We're lazy and prefer to type as little as possible.

The server chugs away for a few seconds and then displays a list of the movie titles that matched our input string, as shown in Figure 11.10.

Figure 11.10. The results of the search for movie titles containing the word "cuckoo."

Well, hallelujah! There's a link titled One Flew Over the Cuckoo's Nest (1975)! Now, index finger trembling in anticipation, we select the link. Hmmm. The first page shows only the production company, the running time, and a few other useless (to us) tidbits. So we press the Spacebar to move to the next page as shown in Figure 11.11 and—success!—there's the info we needed: The director's name is Milos Forman.

Figure 11.11. The fruits of our labors: this page shows the director's name.

Bookmarks: Navigational Shortcuts

Whew! That was quite a journey just to pick up a little scrap of information. The good news, though, is that we'll know where to go in future if we have a similar query. Actually, Lynx's bookmarks feature can make the journey even shorter. In the same way that we use a real bookmark to remind us of where we left off in a book, so too can a Lynx bookmark "remind" us of places we've visited in cyberspace.

For example, suppose we think we'll be using the Movie Database regularly. We can set a bookmark for it by following these steps:

  1. Press Backspace or Delete to display the list of Web documents we just perused.
  2. Highlight the link for the main page of the Movie Database (Main Page: The Internet Movie Database at Cardiff UK) and then open it.
  3. Press a. Lynx displays the following prompt:
    Save D)ocument or L)ink to bookmark file or C)ancel? (d,l,c):
  4. Press d to add the document to the list of bookmarks.

Now, you can head directly to a bookmarked document by pressing v. This displays a list of your bookmarks, as shown in Figure 11.12.

Figure 11.12. Use bookmarks for easy access to your favorite Web sites.

**NOTE**
If you'd like to change the order of your bookmarks or edit the name of a bookmark,
you can edit the bookmark file using any UNIX text editor. Look for the file
lynx_bookmarks.html in your home directory. Keep in mind that this is a hypertext
document, so you'll see all kinds of strange HTML hieroglyphics. We'll explain what
they mean later in this chapter in the section "A Web Publishing Primer."

More Good Web Starting Points

Yahoo is certainly a great place to begin your World Wide Web globe-trotting, but there are plenty of other servers that can get those with Web wanderlust off to a rousing start. Here's a list of a few of our other fave-rave Web stomping grounds:

  • EInet Galaxy An extensive list of Web links organized by subject. Top-notch search page with tons of links to other Web search engines and links to reference materials such as a dictionary, thesaurus, weather map, and the CIA World Factbook.
    URL: http://www.einet.net/
  • NCSA's Starting Points for Internet Exploration Unlike the other URLs listed here, this one isn't a subject-oriented catalog. Instead, it's just a list of handy resources that can help Web newcomers get organized and Web veterans get started. There are lots of home pages listed, as well as links to Net resources such as WAIS, Gopher, FTP, USENET, and more.
    URL: http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/StartingPoints/NetworkStartingPoints.asp

**NOTE**
The NCSA, by the way, is the National Center for Supercomputing
Applications. It's the home of Mosaic, the graphical browser that we'll be covering
in the next chapter.

Dealing with Files

The Web's forte is handling hypertext documents, of course, but most other types of files pose no special problems. In particular, you can use your browser to surf anonymous FTP sites and grab files without resorting to any of those cryptic FTP commands we looked at back in Chapter 8. Even better, your browser will automatically log you in as "anonymous" and send your e-mail address as the password. Why, it's almost enough to make FTP fun (almost).

**NOTE**
Before rushing in to some anonymous FTP Web sessions, you should tell Lynx your e-
mail address. First, type o to select the Options command. In the Options Menu that
appears, press p to select the P)ersonal mail address field and type in your e-mail
address. Press Enter when you're done and then press r to return to the regular Lynx
screen.

As an example, we'll use Lynx to anonymous FTP to rtfm.mit.edu—the repository of USENET FAQ lists. In particular, we'll head for the directory /pub/usenet-by-group/news.answers. In Lynx, we press g to get to the Go command's URL to open: prompt. (If the prompt displays a URL from a previous Go command, you can press Ctrl+U to start with a fresh prompt.) We then enter the following URL:

ftp://rtfm.mit.edu/pub/usenet-by-group/news.answers

Lynx connects to the rtfm.mit.edu server, logs us in, and then displays the directory as shown in Figure 11.13.

Figure 11.13. Web FTP: the /pub/usenet-by-group/news.answers directory at rtfm.mit.edu.
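
That one-line URL carries everything the browser needs for the FTP session. As a rough sketch of the idea (this is not Lynx's actual code; the variable names and the placeholder e-mail address are our own), here's how the URL breaks down using Python's standard urllib.parse module:

```python
from urllib.parse import urlsplit

# The URL we typed at Lynx's "URL to open:" prompt.
url = "ftp://rtfm.mit.edu/pub/usenet-by-group/news.answers"

parts = urlsplit(url)

# Anonymous FTP convention described in the text: the user name is
# "anonymous" and the password is your e-mail address.
# (The address below is a placeholder, not a real account.)
session = {
    "protocol": parts.scheme,     # which Net service to use
    "host": parts.hostname,       # the server to connect to
    "directory": parts.path,      # where to go once logged in
    "user": "anonymous",
    "password": "you@example.com",
}

for key, value in session.items():
    print(f"{key:10} {value}")
```

The point is simply that the protocol, the host, and the directory are all spelled out in the URL, so the only extra thing the browser has to supply is your e-mail address.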

A Lynx FTP screen sports the name of the current subdirectory at the top of the screen and the first link (Up to usenet-by-group in Figure 11.13) takes you to the parent of the current directory. The rest of the screen displays the contents of the current directory:

  • The first two columns tell you the date and time the file was last modified.
  • The third column tells you whether the entry is a Directory or, if it's a file, what type of file it is (e.g., text/plain).
  • The fourth column is a series of links to each file or directory. File sizes are also shown, where applicable.

To open one of the Directory entries, just highlight the link and press the right arrow key (or Enter). When you see the file you want, you have two choices: you can view the file, if it's text (by highlighting it and pressing the right arrow key), or you can download the file to your local computer. For the latter, you need to follow these steps:

  1. Highlight the file you want to download.
  2. Press d to select the Download command. Lynx gets the file and then displays the DOWNLOAD OPTIONS screen with (probably) a single link: Save to disk.
  3. Select the Save to disk link. Lynx displays the Enter a filename: prompt with the file's remote name as the default.
  4. If the remote name is fine, press Enter. Otherwise, enter a name yourself and press Enter.

**NOTE**
As you can see in Figure 11.13, the list of links in the
/pub/usenet-by-group/news.answers directory is a whopping 17 pages long! This may
seem like a big mountain to cross if you're looking for, say, the www subdirectory.
But you can chop that mountain down to a mere molehill by taking advantage of Lynx's
search feature. Press / to display the Enter a search string prompt, type in your
string (say, www), and then press Enter. Lynx will immediately jump to the first link
that contains the text you entered.

Winnowing the Web: Search Tools

The Web's startling rate of growth applies not just to the number of people traversing its pages, but also to the sheer volume of available content. These days, it seems everyone is publishing Web pages. (You can get in on the act, too. Just check out the section "A Web Publishing Primer," later in this chapter.) The upshot of all this publishing promiscuity is that the Web is now home to hundreds of thousands of documents. How the heck does a poor Net navigator keep track of it all?

The easy answer is: you don't have to! There are lots of Web resources that enable you to search for specific words or phrases in URLs, document titles, and even, to a limited extent, document text.

The Yahoo Search Form

To get some practical experience with a Web search engine, let's check out Yahoo's Search link. As an example, we'll try to track down an acronym we saw in an astronomy newsgroup: BEM.

To display the Yahoo Search document, either select the Search link in Yahoo's home page, or else use the Go command to head for the following URL: http://www.yahoo.com/search.asp. Figure 11.14 shows the Yahoo Search document. As you can see, it's a form with all kinds of bells and whistles. Here's a rundown of the various controls that are available:

Figure 11.14. Use the Yahoo Search form to hunt down Web documents by URL, title, or comment.

  • Find all matches containing the keys The first control is a text entry field that you use to type in your search text. You can enter part of a word, a whole word, or multiple words separated by spaces. In our example, we aren't likely to be successful trying to find the acronym "BEM" directly. Instead, we'll try to find a document that contains a list of acronyms and hopefully BEM will be among them. So, we'll type acronym in the text entry field.
  • Search This is the form submit button. You select this link when you've filled out the other options.
  • Clear This is called a form reset button. It returns the form to its default values.
  • Find matches in These three controls are check boxes. You toggle them on and off by highlighting them and pressing Enter. (A check box is activated when an asterisk appears inside the parentheses.) Activate Title to search through the titles of the documents in Yahoo's database; activate URL to search the document URLs; activate Comments to search the short descriptions that appear beside Yahoo's document links. We'll leave all three activated for our search.
  • Case sensitive matching This is another check box. When activated, it tells Yahoo to only find documents that match the exact combination of uppercase and lowercase letters you entered. Unless you're sure about the case of the string, leave this option deactivated.
  • Find matches that contain Although Lynx calls these three fields check boxes, they're actually radio buttons: they represent a series of mutually exclusive choices, so only one of them can be active at a time. (As with check boxes, you activate a radio button by highlighting it and pressing Enter.) This group sets up a Boolean search on multiple words:
    • If you select the At least one of the keys (boolean or) option, Yahoo will match documents that contain any one of the search words you entered.
    • If you select All keys (boolean and), Yahoo only matches a document if it contains, anywhere, all of the words you entered.
    • If you select All keys as a single string, Yahoo only matches documents where the words you entered appear together. For example, if we entered acronyms usenet as our search string, Yahoo wouldn't match a document that contained, say, acronyms in usenet.
  • Consider keys to be These two fields are also radio buttons. Select Substrings if you want Yahoo to match documents with words that contain your search string. Select Complete words if you want Yahoo to only match documents that contain the exact word you entered.
  • Limit the number of matches to Use this option list to select the maximum number of documents you want Yahoo to return.
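
If the options above seem like a lot to juggle, it may help to see them as a handful of simple string tests. The following Python sketch is our own illustration of the rules just described, not Yahoo's actual code; the function name, the parameter names, and the sample entries are all invented for the example.

```python
import re

def matches(text, keys, mode="and", whole_words=False, case_sensitive=False):
    """Return True if text satisfies the search options (illustrative only)."""
    if not case_sensitive:
        # Case sensitive matching is off: compare everything in lowercase.
        text, keys = text.lower(), keys.lower()

    def found(key):
        if whole_words:
            # "Complete words": the key can't be part of a longer word.
            return re.search(r"\b" + re.escape(key) + r"\b", text) is not None
        # "Substrings": the key may appear inside a longer word.
        return key in text

    if mode == "string":                  # All keys as a single string
        return found(keys)
    words = keys.split()
    if mode == "or":                      # At least one of the keys
        return any(found(word) for word in words)
    return all(found(word) for word in words)   # All keys (boolean and)

# Made-up database entries, purely for demonstration.
entries = ["Acronyms in Usenet", "Usenet acronyms list", "Acronym server"]

print([e for e in entries if matches(e, "acronyms usenet", mode="and")])
print([e for e in entries if matches(e, "acronyms usenet", mode="string")])
print([e for e in entries if matches(e, "acronym", whole_words=True)])
```

Notice that the single-string search finds nothing here, because none of the sample entries contains the two words side by side in that order. That's the same behavior described for the acronyms usenet example.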

When we ran our search, Yahoo returned eight matches, as shown in Figure 11.15.

Figure 11.15. Yahoo found eight documents that matched our search string.

The first document isn't much use to us, but the Acronym list sounds promising. Selecting it displays the WORLDWIDEWEB ACRONYM SERVER, which has, among others, the following link: Search for an acronym. We selected that and ended up at the document shown in Figure 11.16.

Figure 11.16. An example of a "searchable index" document.

Notice the prompt at the bottom of the screen:

This is a searchable index. Use 's' to search

You'll come across these so-called searchable indexes from time to time in your Web hunts. To use them, press s and then, at the Enter a database query: prompt that appears, type your search text and press Enter. We entered bem and, after a few seconds, found the info we were looking for (see Figure 11.17): a BEM is a Bug-Eyed Monster.

Figure 11.17. The results of our search.

Other Web Search Tools

As we have said, there's no shortage of search tools for Web hunting and pecking. Yahoo is one of the better ones, and most of the Web catalog sites that we listed earlier have good search facilities (especially the one at EInet Galaxy). The following Web locales are search-only sites and are all highly recommended:

CUI W3 Catalog This service combines several large resource databases (including Yanoff's List, CERN's Virtual Library, and the NCSA's Starting Points for Internet Exploration) and provides a simple search form for tracking down words or phrases. If you're familiar with Perl, you can use Perl regular expressions as search criteria.
URL: http://cuiwww.unige.ch/cgi-bin/w3catalog

CUSI-R (Customizable Unified Search Index via Radio Buttons) This is the "one-stop shopping" site for Web searching. CUSI-R lets you select from several different Net search engines (including EInet Galaxy, the CUI W3 catalog, Lycos, and WebCrawler). You can also search Gopherspace (Veronica), Usenet FAQs, WAIS, Archie, and more. The form uses radio buttons to select the indices to use during the search.
URL: http://www.scs.unr.edu/~cbmr/net/search/cusi-r.asp

JumpStation II This engine enables you to search for words in document titles, headers (the top-level (H1) headings; see "A Web Publishing Primer" for details), and subjects (a set of keywords found in the document). You can also use JumpStation II to search for URLs (the URL Scanner) and Web servers (the Server Scanner).
URL: http://js.stir.ac.uk/jsbin/jsii

Lycos This is one of the most extensive databases of Web documents anywhere on the Net. At the time of this writing, the "big" Lycos catalog boasted 2.7 million unique URLs. (This is about 10 times the number that appeared in the catalog a mere 6 or 7 months ago.) It's also one of the most popular search engines, so getting on during peak hours can be problematic. One of the nicest features of the Lycos database is that it indexes not only the URL and title of a document, but also the first 20 lines and the 100 most significant words.
URL: http://lycos.cs.cmu.edu/

WebCrawler This is a large (one million URLs) database of Web documents that lets you search not just the usual titles and URLs, but also the content.
URL: http://www.biotech.washington.edu/WebCrawler/

World Wide Web Worm This is a sophisticated search engine that lets you enter simple search strings or complex grep-like expressions. It searches document titles, home pages, and even links in documents (which the WWWW calls "citations").
URL: http://www.cs.colorado.edu/mcbryan/WWWW.asp

A Web Publishing Primer

These days, merely surfing the Web isn't good enough to earn your Webmaster merit badge. To become a true wizard of the Web, you need to publish your own home page and thus establish your own personal port on the high seas of the Net. This section shows you how to wield the basic building blocks of Web documents—the codes and symbols of the Hypertext Markup Language (HTML).

**NOTE**
A complete discussion of Web publishing (Weblishing?) is well beyond the range of
this book. The basics we cover will be more than enough to get you started, but if you
plan any serious HTML hot-rodding, you'll need some beefed-up sources. Here are
some suggestions:
  • For starters, keep an eye on what's happening in the Usenet newsgroups comp.infosystems.www.authoring.html and comp.infosystems.www.authoring.misc. These groups feature dozens of posts daily from intrepid Web authors proffering tips and troubleshooting advice.
  • Read the two-part article "World Wide Web Frequently Asked Questions (FAQ)" that's available in any of the comp.infosystems.www.* groups. In particular, read Part 2 for info on providing Web material. (If you don't see the FAQ, try anonymous FTPing to rtfm.mit.edu, head for the /pub/usenet-by-group/news.answers/www/faq directory, and grab the files part1 and part2.) Alternatively, point your Web browser to the following URL:
    http://sunsite.unc.edu/boutell/faq/www_faq.asp
  • On the Web, check out The Web Developer's Virtual Library at http://www.charm.net/~web/. This site is brim full to bursting with documents related to developing Web documents.
  • Get yourself a copy of the Sams book Teach Yourself Web Publishing with HTML in a Week, by Laura Lemay. This excellent book takes you from the basics of HTML all the way to advanced forms-based pages. A must.
  • Examine the underlying hypertext documents for the pages you read. Most Web browsers have a command that lets you see the source document of the current page. This means you see the page with all its HTML symbols in place. In Lynx, for example, press \ to toggle the current document between the normal view and the source view.

What Do I Need to Get Started?

Publishing your own Web pages is not as difficult as you might think. In fact, everything you need to get started is included in the following list:

A text editor or word processor Surprisingly, a Web document is nothing but text: the usual collection of alphanumeric characters with a few special HTML symbols thrown in to the mix. Your Web browser does most of the work. All those links, form controls, and graphics are just the browser's interpretation of the HTML symbols.

**NOTE**
If you use a word processor to create your Web documents, be sure to
save them as pure text without any extra formatting or symbols.

A Web server This is the toughest part of Web publishing. Although most Web browsers can figure out and display any HTML document, to get access to the full HTTP protocol your document needs to be farmed out by a Web server. Ask your service provider if they have set up a Web server. If they haven't, you can also try leasing space on a Web server. To get a list of such servers, point your browser to the following site:
http://union.ncsa.uiuc.edu/HyperNews/get/www/leasing.asp

A little imagination and creativity With thousands, nay millions, of Web documents already out there, you can't expect people to visit your home page regularly if all you do is slap up some text. Hey, you're publishing in the big leagues now! Crank up that imagination and show us all something really unique.

That's it! With just these three items, you can create home pages that rival anything published by big-time corporations and universities with massive budgets.

The Basic Structure of an HTML Document

Okay, let's get down to brass tacks and start creating some Web documents. As an example, we're going to convert this chapter into a hypertext Web page complete with links to all the Web URLs we've mentioned. (To see the finished product, send your Web browser to the following URL: http://www.mcp.com/sams/books/navigate/sample.asp.)

As we have said, hypertext documents are just plain text files with a few HTML codes thrown in for good measure. These codes—or markup tags, as they're called—define the structure of the document and tell the Web browser how to display the page. These tags are normally used as follows:

<TAGTYPE> The text affected by the tag </TAGTYPE>

There are three parts to this sequence:

<TAGTYPE> This is the type of tag you're using. This first tag in the sequence tells the browser what type of structure you're creating. For example, the tag <TITLE> is used for the title of the document.

The text affected by the tag This is the text the browser must format according to the type of tag given by <TAGTYPE>. For example, <TITLE>My Home Page defines the phrase "My Home Page" as the title of the document.

</TAGTYPE> This tag marks the end of the tag initiated by <TAGTYPE>. For example, to end the <TITLE> tag, you would use </TITLE>. As you'll see, not all tags require the use of an end tag.

The next few sections take you through the basic tags as we build our hypertext version of this chapter.
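
To watch the three-part tag sequence in action, here's a small sketch that uses Python's standard html.parser module to report each part as it comes along. The TagReporter class is our own invention for this illustration, and a real Web browser does far more than this.

```python
from html.parser import HTMLParser

class TagReporter(HTMLParser):
    """Record each start tag, run of text, and end tag, in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # Called for <TAGTYPE>; the parser reports tag names in lowercase.
        self.events.append(("start", tag))

    def handle_data(self, data):
        # Called for the text affected by the tag.
        self.events.append(("text", data))

    def handle_endtag(self, tag):
        # Called for </TAGTYPE>.
        self.events.append(("end", tag))

reporter = TagReporter()
reporter.feed("<TITLE>My Home Page</TITLE>")
reporter.close()

for kind, value in reporter.events:
    print(kind, value)
```

The parser sees a start tag, then the text, then the end tag, which is the same order in which you type them.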

The <HTML> Tag

The <HTML> tag appears at the top of the file and indicates that the file is an HTML document. The corresponding end tag (</HTML>) appears at the end of the document. So our hypertext document always begins with the following simple structure:


<HTML>
</HTML>

The <HEAD> Tag

The next tag we need in our document is <HEAD>. This tag defines the header of the document, which we can then use to add information used by the Web server (such as the <TITLE> tag discussed in the next section). Again, we need to add the corresponding end tag, as well (</HEAD>). Our structure now looks like this:


<HTML>
<HEAD>
</HEAD>
</HTML>

The <TITLE> Tag

The <TITLE> tag defines the title of the document, and it always goes inside the header (that is, between the <HEAD> and </HEAD> tags). Here's the general form of the <TITLE> tag:

<TITLE>The title goes here</TITLE>

For our document, the title is "Navigating the Internet - Sample Chapter," so we add the appropriate <TITLE> tag:


<HTML>
<HEAD>
<TITLE>Navigating the Internet - Sample Chapter</TITLE>
</HEAD>
</HTML>

The <BODY> Tag

After the heading comes the body of the document. This section—denoted by the <BODY> tag and its corresponding end tag, </BODY>—is where all the document's text, links, and images will appear. Here's what our document looks like with these tags added:


<HTML>
<HEAD>
<TITLE>Navigating the Internet - Sample Chapter</TITLE>
</HEAD>
<BODY>
</BODY>
</HTML>

That takes care of the preliminaries. The preceding seven lines constitute the basic structure for all HTML documents. You could even, if you like, display this document with a Web browser. Of course, all you would see would be a blank page with a title, so it's not particularly interesting. To add some substance to it, let's insert this chapter's first paragraph:


<HTML>
<HEAD>
<TITLE>Navigating the Internet - Sample Chapter</TITLE>
</HEAD>
<BODY>
Although telnet, FTP, and Gopher are powerful ways to surf the oceans of the Internet, they are by no means the only seaworthy craft available to intrepid Netfarers. In recent years, a new type of ship has emerged, and it is quickly becoming the standard method for Internet navigation: the World Wide Web.
</BODY>
</HTML>
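
If you'd rather have a program type the skeleton for you, something like the following Python sketch would do the job. The build_page function is purely hypothetical (it isn't part of any Web tool), and the body text is shortened here to keep the example small.

```python
from html import escape

def build_page(title, body_text):
    """Wrap a title and some body text in the basic HTML structure."""
    return "\n".join([
        "<HTML>",
        "<HEAD>",
        # escape() protects characters such as & and < that HTML
        # would otherwise try to interpret as markup.
        f"<TITLE>{escape(title)}</TITLE>",
        "</HEAD>",
        "<BODY>",
        escape(body_text),
        "</BODY>",
        "</HTML>",
    ])

page = build_page(
    "Navigating the Internet - Sample Chapter",
    "Although telnet, FTP, and Gopher are powerful ways to surf the oceans "
    "of the Internet, they are by no means the only seaworthy craft.",
)
print(page)
```

Save the printed text in a file and you have a document that any Web browser can display.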

The <P> Tag (Paragraphs)

When a Web browser interprets a hypertext document, it ignores blank lines and carriage returns. So how are you supposed to structure your document into separate paragraphs? That's what the <P> tag is for. You just insert a <P> before each new paragraph, like so:


<HTML>
<HEAD>
<TITLE>Navigating the Internet - Sample Chapter</TITLE>
</HEAD>
<BODY>
Although telnet, FTP, and Gopher are powerful ways to surf the oceans of the Internet, they are by no means the only seaworthy craft available to intrepid Netfarers. In recent years, a new type of ship has emerged, and it is quickly becoming the standard method for Internet navigation: the World Wide Web.
<P>
The World Wide Web (or, if you prefer less of a mouthful: the Web, W3, or WWW) is no mere dinghy drifting aimlessly on the currents of cyberspace. No, this baby is a veritable luxury liner with all sorts of fancy-schmancy new bells and whistles that make navigating the Net easy and even--gasp!--fun. This chapter introduces you to the World Wide Web, takes you through some example Web sessions from the Unix prompt, and even shows you how to publish your own Web material. The next chapter (Chapter 12, "Windows on the Web: A Field Guide to Graphical Browsers") gives you the scoop on some programs that will help you get the most out of the Web.
</BODY>
</HTML>

**NOTE**
Notice how the <P> tag doesn't need a corresponding </P> end tag. That's because the
browser assumes that the previous paragraph ends where the new one begins. However,
the </P> tag is part of the new HTML specification (it's called HTML 3.0), so you
might see it crop up from time to time.
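
To make the "ignores blank lines and carriage returns" rule concrete, here's a rough Python imitation of what a browser does with the body text. This is only a sketch: the split_paragraphs function is our own, and it handles nothing but the <P> tag.

```python
import re

def split_paragraphs(body):
    """Break only at <P> and squeeze all other whitespace to single spaces."""
    chunks = re.split(r"<P>", body, flags=re.IGNORECASE)
    # str.split() with no arguments discards spaces, blank lines, and
    # carriage returns alike, which is how the browser treats them.
    paragraphs = [" ".join(chunk.split()) for chunk in chunks]
    return [paragraph for paragraph in paragraphs if paragraph]

body = """Although telnet, FTP, and Gopher are powerful,


they are not the only craft.
<P>
The World Wide Web is no mere dinghy."""

for number, paragraph in enumerate(split_paragraphs(body), start=1):
    print(number, paragraph)
```

The blank lines in the middle of the first sentence vanish, and the text splits into two paragraphs only where the <P> tag appears.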

The <Hn> Tags (Headings)

The last of the basic tags are the <Hn> tags, where n can be any integer from 1 to 6. These tags represent different levels of headings in the document, where <H1> is the highest level and <H6> is the lowest level. How these different headings appear depends on the browser you use. In Lynx, for example, <H1> headings are centered and appear entirely in uppercase. In the graphical browser Mosaic, <H1> headings appear in a large, bold font, whereas <H2> headings appear in a smaller, regular font. Figure 11.18 shows the Mosaic view of the following hypertext:


<HTML>
<HEAD>
<TITLE>Navigating the Internet - Sample Chapter</TITLE>
</HEAD>
<BODY>
<H1>Chapter 11 - Global Hypertext: The World Wide Web</H1>
<P>
Although telnet, FTP, and Gopher are powerful ways to surf the oceans of the Internet, they are by no means the only seaworthy craft available to intrepid Netfarers. In recent years, a new type of ship has emerged, and it is quickly becoming the standard method for Internet navigation: the World Wide Web.
<P>
The World Wide Web (or, if you prefer less of a mouthful: the Web, W3, or WWW) is no mere dinghy drifting aimlessly on the currents of cyberspace. No, this baby is a veritable luxury liner with all sorts of fancy-schmancy new bells and whistles that make navigating the Net easy and even--gasp!--fun. This chapter introduces you to the World Wide Web, takes you through some example Web sessions from the Unix prompt, and even shows you how to publish your own Web material. The next chapter (Chapter 12, "Windows on the Web: A Field Guide to Graphical Browsers") gives you the scoop on some programs that will help you get the most out of the Web.
<H2>The Wide World of the Web</H2>
The World Wide Web was invented in the early 1990s by Tim Berners-Lee while working at the European Laboratory for Particle Physics (CERN) in Geneva, Switzerland. The idea was to develop a service that would make it easier to access and share Net resources and move from one island of cyberspace to another.
<H3>Hypertext: The Secret of the Web's Success</H3>
As you saw in Chapter 10, "Navigating by Menus: Gopher," selecting an item from a Gopher menu could grab you a file from some other Net locale or even telnet you to another Net computer. The World Wide Web takes this idea of leaping from place to place in cyberspace to its highest form: hypertext. Hypertext is information that contains certain keywords (or keyphrases) that are links to other Net resources. When you select the link, the underlying resource is automatically displayed on your terminal.
<P>
For example, a hypertext version of this chapter might designate "European Laboratory for Particle Physics (CERN)" as a keyphrase. Selecting this link might then display a document that tells you more about CERN. That document may contain its own hypertext links that you can follow, and so on.
</BODY>
</HTML>

Figure 11.18. How Mosaic formats <H1> and <H2> headings.
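
Because the heading tags carry their level right in the tag name, a program can pull an outline out of a document with very little effort. Here's a sketch using Python's standard html.parser module; the OutlineParser class is our own illustration and isn't how Lynx or Mosaic actually work.

```python
from html.parser import HTMLParser

class OutlineParser(HTMLParser):
    """Collect a (level, text) pair for every <H1> through <H6> heading."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None    # level of the heading we're inside, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        # The parser reports tag names in lowercase: h1, h2, and so on.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None

html_text = """<H1>Chapter 11 - Global Hypertext: The World Wide Web</H1>
<H2>The Wide World of the Web</H2>
<H3>Hypertext: The Secret of the Web's Success</H3>"""

parser = OutlineParser()
parser.feed(html_text)
parser.close()

# Indent each heading according to its level.
for level, text in parser.outline:
    print("  " * (level - 1) + text)
```

Each browser decides for itself how to dress up the different levels, but the structure is the same no matter which one you use.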

Taking the Document for a Test Surf

Once you have a basic structure for your document, as well as some text and a few headings, you should try it out to see if it works. There are three ways to go about this:

  • If you have a shell account and are using Lynx on your service provider's computer, first use FTP to upload the file to your home directory. Then load Lynx and specify the name of the HTML file as part of the Lynx command: lynx <filename>. For example, if the hypertext document is named homepage.html, you would start Lynx with the following command: lynx homepage.html.

  • If you're using a Web client on your own computer, most browsers let you open a "local" file (that is, one on your computer). (In Netscape Navigator, for example, select the File menu's Open File command. We'll talk more about Netscape in Chapter 12.) This enables you to try out a hypertext document without even connecting to the Internet.

  • If you would like to try an official Web connection, upload the HTML file to the directory designated by the provider of your Web server and then use any client to load the URL. (The Web service provider should tell you the proper format of the URL.)

The Jump to Hyperspace: Adding Links

So far, our document is decidedly undynamic: it doesn't do much other than display some text in a structured format. To make this a truly useful page, we need to add some links that will take you to all the URLs we have mentioned in this chapter.

To add links to a document, you use the <A> tag:

<A HREF="URL">Link text</A>

Here, URL is the URL you want the link to point to, and Link text is the document text that the user chooses to select the link. For example, consider the following sentence from our document:

The World Wide Web was invented in the early 1990s by Tim Berners-Lee while working at the European Laboratory for Particle Physics (CERN) in Geneva, Switzerland.

For our first link, we would like to point to the Web site for CERN (http://www.cern.ch/). In the following sentence, we have added the <A> tag and made the phrase European Laboratory for Particle Physics (CERN) the link text:

The World Wide Web was invented in the early 1990s by Tim Berners-Lee while working at the <A HREF="http://www.cern.ch/">European Laboratory for Particle Physics (CERN)</A> in Geneva, Switzerland.
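To see the link in context, here is a minimal sketch of a complete page built around that sentence. The <HEAD> and <TITLE> lines, the title text, and the comment line are assumptions added only so the file stands on its own; the <A> tag is exactly the one shown above.

```html
<HTML>
<HEAD>
<TITLE>Link Test</TITLE>
</HEAD>
<BODY>
<H1>The Wide World of the Web</H1>
<!-- The <A> tag wraps the link text; HREF holds the URL the link points to. -->
The World Wide Web was invented in the early 1990s by Tim Berners-Lee while
working at the
<A HREF="http://www.cern.ch/">European Laboratory for Particle Physics (CERN)</A>
in Geneva, Switzerland.
</BODY>
</HTML>
```

When you load this file, the phrase between <A> and </A> should stand out from the surrounding text, and selecting it should take you to the CERN site. Exactly how the link is displayed varies from browser to browser.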

A Few Other HTML Goodies

When you check out our finished hypertext document, you'll notice lots of other elements that we haven't covered: italics, boldface, numbered lists, and so on. Here's a quick summary that tells you some of the HTML tags we used to create these elements:

Italics: To italicize a word or phrase (in browsers that support italics; this is ignored in Lynx), surround the text with the <I> and </I> tags: <I>Text</I>. For example, to italicize the phrase "World Wide Web," you would use the following code: <I>World Wide Web</I>.

Boldface: To display a word or phrase in boldface (again, only in browsers that support it; Lynx underlines text tagged as boldface), enclose the text with the <B> and </B> tags: <B>Text</B>. For example, to boldface the word "hypertext," you would use the following code: <B>hypertext</B>.

Monospaced: To display a word or phrase in a monospaced, typewriter-like font (such as Courier in a graphical browser; Lynx ignores monospace), surround the text with the <TT> and </TT> tags: <TT>Text</TT>. For example, the following code displays "http://www.cern.ch" in a monospaced font: <TT>http://www.cern.ch</TT>.

Numbered list: To create a numbered list of items, you use the <OL> and </OL> tags (OL stands for "ordered list"). Each item in the list is preceded by the <LI> tag. Here's an example:

For the latter, you need to follow these steps:

<OL>
 <LI>Highlight the file you want to download.
 <LI>Press d to select the Download command. Lynx gets the file and then displays the DOWNLOAD OPTIONS screen with (probably) a single link: Save to disk.
 <LI>Select the Save to disk link. Lynx displays the Enter a filename: prompt with the file's remote name as the default.
 <LI>If the remote name is fine, press Enter. Otherwise, enter a name yourself and press Enter.
</OL>

Here's how it appears in Lynx:

For the latter, you need to follow these steps:

 1. Highlight the file you want to download.
 2. Press d to select the Download command. Lynx gets the file and then displays the 
    DOWNLOAD OPTIONS screen with (probably) a single link: Save to disk.
 3. Select the Save to disk link. Lynx displays the Enter a filename: prompt with the 
    file's remote name as the default.
 4. If the remote name is fine, press Enter. Otherwise, enter a name yourself and 
    press Enter.

Bulleted list: To create a bulleted list of items, you use the <UL> and </UL> tags (UL stands for "unordered list"). Again, each item in the list is preceded by the <LI> tag, like so:

<P>
The line mode browser is fine as far as it goes, but it does suffer from some glaring limitations:

<UL>
<LI>If you telnet into CERN's browser, you can't save files to your local computer. Also, some Internet services aren't available via the Web to non-CERN users.
<LI>The line mode browser doesn't support forms.
<LI>There's no way to save your favorite URLs for easy access later.
</UL>

Here's the Lynx version:

The line mode browser is fine as far as it goes, but it does suffer from some glaring limitations:
  * If you telnet into CERN's browser, you can't save files to your local computer. Also, some Internet services aren't available via the Web to non-CERN users.
  * The line mode browser doesn't support forms.
  * There's no way to save your favorite URLs for easy access later.
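
As a rough sketch of how these tags combine, the following page mixes the inline styles with a bulleted list of links. The title, heading, and sample sentences are invented for illustration, as are the <HEAD> and <TITLE> lines; the two URLs are the ones that appear in this chapter.

```html
<HTML>
<HEAD>
<TITLE>HTML Goodies Test</TITLE>
</HEAD>
<BODY>
<H2>A Few Web Starting Points</H2>
<!-- Inline styles: italics, boldface, and monospace in a single paragraph. -->
The <I>World Wide Web</I> is built on <B>hypertext</B>. To visit a site, give
your browser a URL such as <TT>http://www.cern.ch/</TT>.
<!-- A bulleted list in which each item is a link. -->
<UL>
<LI><A HREF="http://www.cern.ch/">European Laboratory for Particle Physics (CERN)</A>
<LI><A HREF="http://www.mcfedries.com/Books/Navigating/sample.asp">A Web version of this chapter</A>
</UL>
</BODY>
</HTML>
```

Changing <UL> and </UL> to <OL> and </OL> should turn the bullets into numbers without any other changes, since both kinds of list mark their items with <LI>.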

Using the Web

The Web is a true net.wunderkind: despite its relative youth, it's already an outstanding tool, and one that's well worth playing with and following as it develops. Most interesting of all is how the Web makes other Internet tools much easier to use. We think its creators should be encouraged and applauded for building something that's well on its way to becoming the all-time great Internet resource (if it isn't there already). For a look at how it's becoming even greater, see the next chapter, "Windows on the Web: A Field Guide to Graphical Browsers."



Copyright © 1995-2008 Paul McFedries and Logophilia Limited