The grand design for information exchange on the Web has several novel features.
This article surveys how to use the Web, with a focus on accessing signal processing information resources. Printed references on using the Web are out-of-date when they reach the bookstores. The Web, its protocols, and HTML are evolving much too rapidly for the standard book publishing process to keep up. Instead, use the Web itself to learn of the latest supported protocols and what information resources have emerged. Sprinkled throughout this article are network locations for basic information about the Web and its resources. Another resource is the ongoing column Traveling the information highway on the Web and the Internet written by Bob Alden that appears in The Institute.
General Web Resources | |
---|---|
http://www.w3.org/ | Source for information about the World Wide Web, its protocols, and its future. |
http://www.cern.ch/ | The URL for where it all began: The CERN high-energy physics laboratory in Switzerland (ch is the Internet abbreviation for Switzerland). |
ftp.netscape.com ftp.ncsa.uiuc.edu | Internet addresses for obtaining (via ftp) copies of the public domain versions of Netscape and Mosaic, respectively. As described in a previous article[1], use anonymous ftp to acquire these software systems. |
http://www.rice.edu/ | Regarded as one of the best sites for starting general information searches |
Digital Signal Processing Resources (Good initial starting points) | |
http://www.ieee.org/sp/ | The home page for the Signal Processing Society. |
http://spib.rice.edu/spib.html | The Signal Processing Society's online database (supported by the National Science Foundation). |
http://www.ieee.org/sp/SPS.html
A URL begins with a protocol specification followed by a colon and two slashes. Protocols include http, ftp, file (which means that local files can be viewed), mailto (the URL specifies an e-mail address), and gopher. Internet addresses for URLs frequently begin with www; thus, to explore whether a site provides information for the Web, try combining this prefix with a network address. For example, movie previews of a major Hollywood studio can be found at http://www.mca.com/ and MathWorks at http://www.mathworks.com/. If no pathname is given, a default filename, usually index.html, is used. This default name is determined by the site being accessed, not the Web. When given, the path is rooted somewhere in the site's file system, the exact location again determined by the information site's operating system. UNIX conventions are used for pathnames. If no // is given, the information is assumed to be located on the computer from which the last retrieval was made, and a path constitutes the remainder of the URL.
protocol://site/path
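The protocol://site/path form can be explored with a few lines of Python. This is only a minimal sketch using the standard urllib.parse module; the helper name describe is an illustrative assumption, and the URLs are ones quoted in this article.

```python
from urllib.parse import urlsplit, urljoin

def describe(url):
    """Split a URL into the protocol, site, and path parts named in the text."""
    parts = urlsplit(url)
    # A path of "/" (or an empty path) means the site supplies its own
    # default file, often index.html.
    return parts.scheme, parts.netloc, parts.path

print(describe("http://spib.rice.edu/spib.html"))
print(describe("http://www.w3.org/"))

# A URL written without "//" is interpreted relative to the page
# most recently retrieved (the "base").
base = "http://spib.rice.edu/spib.html"
print(urljoin(base, "papers.html"))
```

The last call shows how a bare path such as papers.html is completed using the site of the previous retrieval.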
"Information" should be interpreted in a very broad sense:
Text, tables, graphics, images, video, and audio can all be "displayed" using current browsers.
Not only does HTML specify which information to display and how the display should appear, it expresses where on the Web the information resides.
The browser uses the URL to access the information sites with the specified protocol.
Information files can be transferred using the classic ftp method, or using the Web's new information transfer protocol http.
gopher searches and retrievals can also be made within the context of the Web.
(See the article[1] on the Signal Processing Information Base for a description of these classic transfer protocols.)
Browsers allow the user to print the displayed information and to print the HTML source for any page.
Information can also be sent from the user to a Web site using CGI (Common Gateway Interface).
Here, fields can be selected and text entries filled, then sent to a URL for processing or storage.
Thus, the Web can be used (and is) for completing application forms and for controlling simulations.
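As a rough illustration of what a browser sends when a form is submitted, the sketch below encodes a set of field values into a query string appended to a URL. The field names, their values, and the www.example.org address are hypothetical, chosen only for illustration; they do not describe any actual site's interface.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields and values (illustrative only).
fields = {"name": "Ada Lovelace", "segment": "0-1024"}

# Encode the fields the way a browser encodes a submitted form.
query = urlencode(fields)

# Hypothetical CGI program location on an example site.
request_url = "http://www.example.org/cgi-bin/preview?" + query
print(request_url)

# The receiving program can recover the fields from the query string.
print(parse_qs(query))
```

Spaces and other special characters are escaped during encoding, so the receiving program sees exactly the values the user entered.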
Using Browsers
Browsers are available free of charge for all the common computational platforms, be they PC, Macintosh, or UNIX based.
UNIX browsers typically use an X-windows based interface.
Commercial, fully supported browsers are appearing on the market and can, of course, be purchased over the Internet.
The browsers most commonly used are Mosaic, developed at the National Center for Supercomputing Applications (NCSA), and Netscape, which has both public domain and commercial versions.
These browsers display text and graphics by translating the information format expressed by an HTML file.
The user controls font size and window size;
thus, HTML files can only express formatting information broadly.
An example home page is shown in the accompanying figure, along with the HTML source file.
Text or graphics that can be clicked to obtain more information are highlighted in some fashion (for the moment, text is underlined and displayed in a special color, and graphics are surrounded by a special border).
Typically, as one moves the cursor over a highlighted section, the cursor's shape changes, indicating that it has been positioned correctly and that that information is just a click away.
While information is being loaded, the browser indicates how much is left to transfer, estimates the time remaining, and shows that it is busy with a dynamic graphic in one of the window's upper corners.
Note how the browser, Netscape in our example, uses the purple color to indicate which information resources (links, in Web parlance) have already been selected and viewed.
The ones in blue have not yet been viewed.
(All browsers allow these colors to be altered by the user.)
The time frame used by the browser to define a previous search is not limited to the current session;
the duration of previous search history can be defined by the user.
Browsers are equipped to display text in a variety of fonts and styles, and to display images represented in the Graphics Interchange Format (GIF).
Sound, movies, PostScript files, and alternately formatted images (JPEG,
TIFF, etc.) are displayed using helper applications.
Which helper applications are used can be set from within the browser, and the choices are heavily system dependent.
HTML
Markup languages, such as HTML, specify where and how text and graphics are to be positioned.
Because the browser, the user (he or she selects which browser to use, window size, and overall font size), and the HTML file writer conspire to control the display, only general formatting can be specified in HTML.
HTML files consist solely of text-based commands, which means any editor can create an HTML file.
An example page and its corresponding source are shown in the example displaying SPIB's home page.
TeX users will be familiar with this way of formatting text;
WYSIWYG users might find this approach cumbersome, but a text-based specification means that the file is portable across all platforms and operating systems.
A section of text is formatted according to paired instructions that surround the text.
For example, the HTML phrase
<A HREF="http://www.rice.edu/">Rice University</A>
specifies that the text Rice University can be clicked for information corresponding to the URL http://www.rice.edu/. In HTMLese, special locations in a file are anchors, and they are each sandwiched by an <A>text</A> pair. HTML formatting commands are always enclosed in angle brackets, with the formatting instruction consisting of one or more letters (A in this case) and the terminal member of the pair consisting of the same instruction enclosed in angle brackets and preceded by a slash (/). This example also illustrates that formatting commands can have options. Here, HREF is an option to the anchor command, and indicates that the anchor text can be clicked to load the file at the specified URL. An option consists of the option's name, an equal sign (=), and its value. The file's URL is the value of the HREF option in this case, and corresponds to Rice University's home page. Commands are case-insensitive, even to the extent that upper and lower cases can be mixed within a command: <a HrEf="http://www.rice.edu/"> works just as well in the example.
HTML commands can specify headings, bulleted and numbered lists, tables, and limited equation formatting. (Well, almost. At the time of this writing, HTML 3.0 was being defined. This protocol will eventually be extended to express equations. Equation formatting instructions do not correspond to TeX's definition. This said, recall the previous caution about the timeliness of printed descriptions of Web software and protocols.) Various text styles (boldface, italics, etc.) can be displayed, and ways of receiving user information specified. The various commands can be nested to achieve more extensive effects. For example, to let an image specify a URL, you would use the construct
<A HREF="http://www.somewhere/file.html"><IMG SRC="arrow.gif"></A>
Example HTML commands are shown in the table.
Selected HTML Formatting Commands | |
---|---|
<HEAD> <TITLE> page title </TITLE> </HEAD> <BODY> Rest of page </BODY> | Outline of what is minimally needed to construct a Web page. The <TITLE> command, placed inside <HEAD>, puts a title in the window's title bar. <BODY> frames the information displayed on the page. |
<H1>Text</H1> | Produces a level-1 headline that appears in a large, boldface font. Levels 1-6 are supported, with level 1 corresponding to the largest font. |
<B>text</B> | Format text in boldface. The <I> command produces italics, <U> underlined text, and <TT> typewriter characters. |
<P> | Begin a new paragraph. To end a line, but not start a new paragraph, use the <BR> command. Neither of these requires pairing: no </P> is needed, for example. |
<UL> <LI> item 1 <LI> item 2 </UL> | Unordered list of the indicated items. Each item is preceded by a bullet in an unordered list. Note that the <LI> needs no pairing. An ordered list, in which items are assigned numbers, is produced by the <OL> command. |
<A>text</A> | Create an anchor. It can have the optional name label when NAME="label" is added to the anchor command. If this command were located in the file named file.html, a URL ending in file.html#label indicates not only to load the file, but to start the display at the anchor having the name label. The option HREF="url" makes the anchor into a link, enabling loading of information located at the specified URL when text is clicked. |
<IMG SRC="url"> | Display the image specified by the URL. The image representation format is gleaned from the URL's postfix: .gif specifies the GIF format, .tiff the TIFF format, etc. |
A detailed description of HTML can be found at http://www.w3.org/hypertext/WWW/MarkUp/MarkUp.html. However, perhaps the best way to learn is to view others' pages; the browsers allow you to easily view the HTML file that corresponds to a displayed page.
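One way to study another page's source is to pull out its anchors programmatically. The sketch below uses Python's standard html.parser module; the class name LinkCollector and the function collect_links are illustrative assumptions, not part of any browser or of HTML itself. Because the parser reports command and option names in lowercase, it also shows that mixed-case commands are treated alike.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect (URL, link text) pairs from anchors carrying an HREF option."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # URL of the anchor currently open, if any
        self._text = []     # text seen since that anchor opened

    def handle_starttag(self, tag, attrs):
        # Tag and option names arrive lowercased, whatever case the file used.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

def collect_links(html):
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links

print(collect_links('<a HrEf="http://www.rice.edu/">Rice University</A>'))
```

An anchor that wraps only an image, as in the nested construct described earlier, is reported with empty link text.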
<HEAD> <TITLE>Signal Processing Information Base (SPIB)</TITLE> <H1>Signal Processing Information Base (SPIB)</H1> <HR SIZE=4> </HEAD> <BODY> The Signal Processing Information Base (SPIB) is a project sponsored by the Signal Processing Society and the National Science Foundation. SPIB contains information repositories of data, papers, software, newsgroups, bibliographies, links to other repositories, and addresses, all of which are relevant to signal processing research and development. <P> For general information, send e-mail to <A HREF="mailto:spib@spib.rice.edu">spib@spib.rice.edu</A> containing the message: <PRE> send help </PRE> <UL> <LI> <A HREF="gopher://spib.rice.edu:70/11/SPIB/addresses"> addresses</A> <LI> <A HREF="gopher://spib.rice.edu:70/11/SPIB/bibliography"> Signal processing bibliography</A> <LI> <A HREF="http://spib.rice.edu/directory.html"> data</A> <LI> <A HREF="gopher://spib.rice.edu:70/11/SPIB/help"> help</A> <LI> newsgroup and e-letter archives <UL> <LI> <A HREF="gopher://spib.rice.edu:70/11/SPIB/news/e-letter"> E-Letter on digital signal processing</A> <LI> <A HREF="gopher://spib.rice.edu:70/11/SPIB/news/imdsp-e-letter"> E-Letter on image and multidimensional signal processing</A> <LI> <A HREF="gopher://spib.rice.edu:70/11/SPIB/news/comp.dsp"> USENET digital signal processing newsgroup</A> </UL> <LI> <A HREF="http://spib.rice.edu/papers.html"> papers</A> </UL> <HR> <ADDRESS> -- <A HREF="mailto:spib-admin@spib.rice.edu">spib-admin@spib.rice.edu</A> 1/3/95 </ADDRESS>
Of special interest to the signal processing community is the Signal Processing Information Base (http://spib.rice.edu/spib.html). It serves as the repository of data and reference materials, such as preprints of articles (or links to them) that provide early dissemination of results in a primitive electronic form. Many of the data files are quite large. Consequently, we have designed an HTML interface to Matlab programs that allows users to preview data and to extract data segments (or download the entire file). At the moment, waveform and spectral displays are provided; an example of how this interface can be used is shown in the accompanying figure. Depicted there is a spectrogram of a segment taken from an acoustic recording of a machine gun. We intend to add more previewing schemes in the future. To access the previewer, go to the SPIB home page and select data.