A problem with most search tools is that they don't index the content in relation to the tagging. So even though you can search for "John" and "Smith," you can't insist that they show up in the same paragraph. You can't search for "Save" only where it's mentioned in a FORM element, or "Picture" only within an A (link) element. You especially can't search for "Save" where it's in the documentation of menu items for a software package (since HTML has no tag for that).
SGML-aware servers can usually do all these types of searches, which lets you be much more precise in your queries. Even if your client is getting only HTML, it should still be possible for the server to search the original SGML for you. If it can't, you might want to look for a better server rather than changing your data.
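As a rough illustration of what searching "in relation to the tagging" means, here is a minimal sketch in Python. It is not how any particular SGML server works: the sample document, its tag names, and the helper `elements_containing` are all made up for this example, and the sample is written as well-formed XML so that the standard library parser can read it.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document (well-formed XML standing in for SGML).
SAMPLE = """<doc>
  <p>John Adams called.</p>
  <p>Mary Smith answered.</p>
  <p>John Smith wrote the report.</p>
  <form><p>Press Save to keep your changes.</p></form>
  <p>Save your receipts.</p>
</doc>"""


def text_of(element):
    """Return all character data inside an element, including its children."""
    return "".join(element.itertext())


def elements_containing(root, tag, words):
    """Return the text of every <tag> element that contains all the words."""
    wanted = {w.lower() for w in words}
    hits = []
    for element in root.iter(tag):
        text = text_of(element)
        found = set(re.findall(r"\w+", text.lower()))
        if wanted <= found:  # every wanted word occurs inside this element
            hits.append(text.strip())
    return hits


root = ET.fromstring(SAMPLE)

# "John" and "Smith" must occur in the same paragraph.
same_paragraph = elements_containing(root, "p", ["John", "Smith"])

# "Save" only where it appears inside a form element.
in_form = elements_containing(root, "form", ["Save"])

# For contrast: "Save" in any paragraph, whether or not it is in a form.
any_paragraph = elements_containing(root, "p", ["Save"])
```

A plain word search over the whole sample would find "John" and "Smith" in three different paragraphs; restricting the match to a single element is what narrows the result to the one paragraph that contains both.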
Note:
"Precise" searching is really a technical term for information retrieval nerds; it means that you don't get a lot of irrelevant information in addition to what you want. Watch out for the opposite problem too, even though it doesn't happen as much with most Web search tools: if you don't get enough of the information you do want, that's called a "recall" problem.
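The two ideas in this note are usually measured as precision (how much of what you got back is relevant) and recall (how much of the relevant material you got back). The sketch below uses those standard definitions; the document names and the function `precision_and_recall` are invented for the example.

```python
def precision_and_recall(retrieved, relevant):
    """Compute precision and recall for one query.

    retrieved: documents the search tool returned
    relevant:  documents that actually answer the query
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents that were returned
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: four documents came back, five were truly relevant,
# and only two documents are in both groups.
retrieved = ["doc1", "doc2", "doc3", "doc4"]
relevant = ["doc2", "doc4", "doc7", "doc8", "doc9"]

precision, recall = precision_and_recall(retrieved, relevant)
```

Here half of what came back was irrelevant (a precision problem), and three of the five relevant documents were never found (a recall problem).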
URLs are notorious for failing when people get a new computer, a new disk, or even a new job. This is because the data moves. SGML's support for Formal Public Identifiers (FPIs) helps you get away from this. FPIs are expected to be generic, permanent identifiers for data, instead of today's location of the data.
Public Identifiers can sit around until needed, and only then get converted into a physical location. That way, if the data's location changes, you can fix the problem by updating one table instead of every link everywhere in the world. The Internet standards groups are working on URNs, which are a lot like FPIs and help in the same way.
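The "one table" idea can be sketched as a simple lookup. This is only an illustration under assumptions: the table is a plain Python dictionary rather than a real catalog format, the second identifier and every URL are made up, and `resolve` is a hypothetical helper.

```python
# Hypothetical lookup table from public identifiers to current locations.
# The URLs are placeholders, not real addresses.
CATALOG = {
    "-//IETF//DTD HTML 2.0//EN": "http://example.org/dtds/html2.dtd",
    "-//Example Corp//DOCUMENT User Guide//EN": "http://old.example.org/guide.sgml",
}


def resolve(public_id, catalog):
    """Turn a permanent identifier into wherever the data lives today."""
    try:
        return catalog[public_id]
    except KeyError:
        raise LookupError(f"no catalog entry for {public_id!r}") from None


# Many documents can point at the same identifier...
GUIDE = "-//Example Corp//DOCUMENT User Guide//EN"
links = [GUIDE, GUIDE, GUIDE]

before = [resolve(link, CATALOG) for link in links]

# ...and when the data moves, only the table entry changes.
CATALOG[GUIDE] = "http://new.example.org/docs/guide.sgml"

after = [resolve(link, CATALOG) for link in links]
```

None of the three links was edited, yet all of them now lead to the new location.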
Note:
FPIs and URNs work a lot like a cellular phone. Behind the scenes a lot has to happen: as you move around and get out of range for one cell, you get switched to the next one. When that happens, you don't have to give everybody a new phone number to reach you; you don't even notice that anything happened at all. That's because your cell phone isn't identified by location (like a URL) but by a special, unique serial number that always stays the same for your individual phone, wherever you carry it (this is a lot like an FPI or URN). When someone calls you, the phone company looks up the serial number, and then uses it to look up your current location in a table somewhere and send the call to where you really are. When your phone switches to a new cell, because the old one is too far away, your phone sends a little message to the new cell saying "I'm here." The new cell tells the phone company to update the table. Files on the Web will be able to move around just as freely once they're identified by names instead of locations.
In the meantime, you can make your links a little safer in two ways. First, use HTML's new BASE feature so that all your URLs are relative to a place you specify up front; then you only have to change that one place when something happens. Second, use SGML linking capabilities and leave it to your server to translate more generic pointers, such as FPIs, into URLs when it sends out your data. The next section goes into a lot more detail about some of the linking capabilities that have been built on top of SGML using the HyTime standard.
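To see why a single base address makes links safer, the sketch below resolves the same relative links against two different bases, the way a browser would after reading a BASE element. It uses `urljoin` from Python's standard library; the host names, paths, and the helper `resolve_links` are made up for the example.

```python
from urllib.parse import urljoin


def resolve_links(base_href, relative_links):
    """Resolve each relative link against the document's base address."""
    return [urljoin(base_href, link) for link in relative_links]


# The links as written in the document; they never change.
links = ["intro.html", "figures/fig1.gif", "../index.html"]

# Hypothetical base before and after the files move to a new server.
old = resolve_links("http://old.example.org/manual/ch1/", links)
new = resolve_links("http://new.example.org/docs/manual/ch1/", links)
```

Only the base address differs between the two calls; every link in the document follows along without being edited.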
SGML provides a lot of tools for representing different kinds of documents, the most important being the ability to make up new tag-sets whenever you need to. But it provides only limited capabilities for creating links; SGML doesn't provide a standard way to mark up links between separate documents, for example.
ISO noticed this, and recently put out a standard that specifically extends SGML to deal with hypertext and multimedia. It's called HyTime.
Note:
HyTime is built on top of SGML. After reading this book, you'll be ready to find out about HyTime. You can learn about it in Steve DeRose's and David Durand's book, Making Hypermedia Work. Your bookstore may have it already, or they can order it from the publisher.
HyTime specifies ways of using SGML to represent the things needed for hypermedia. These include references to documents, graphics, video, sound, and other media, as well as particular places in them; links to connect pairs or groups of such references; and ways to schedule presentations out of referenced pieces. Obviously, this is a whole lot more than <A>. HyTime, therefore, is a pretty big standard. The good news is that, as with SGML, you can do a great deal even if you learn only a little bit of HyTime; you can always learn more as you need it.
HyTime support can be added on top of any system that supports SGML. Like SGML, HyTime has some very complex features, but you can accomplish a lot by using only a few of the most basic ones. Several SGML products have already added support for those basic features, and more are on the way.
HyTime links have three important parts:
The A element in HTML is like a link with the location address included (on the HREF attribute). Whatever the HREF points to is one anchor; the other anchor is the contents of the A element.
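The two anchors of an A element can be pulled apart with a few lines of code. The following is a minimal sketch using Python's standard `html.parser` module; the class `AnchorCollector` and the sample markup are made up for the example, and nested or unclosed A elements are not handled.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (destination anchor, source anchor) pairs from A elements."""

    def __init__(self):
        super().__init__()
        self.links = []    # finished (href, content) pairs
        self._href = None  # HREF of the A element currently open, if any
        self._text = []    # character data seen inside that element

    def handle_starttag(self, tag, attrs):
        # The parser reports tag and attribute names in lowercase.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            content = "".join(self._text).strip()
            self.links.append((self._href, content))
            self._href = None


# Hypothetical markup with two links.
SAMPLE = (
    '<P>See the <A HREF="hytime.html">HyTime overview</A> '
    'and the <A HREF="sgml.html">SGML notes</A>.</P>'
)

collector = AnchorCollector()
collector.feed(SAMPLE)
collector.close()
links = collector.links
```

Each pair holds one anchor taken from the HREF attribute (where the link points) and one taken from the element's contents (where the link starts).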