Where would you like to go tomorrow?™
(Note that if your browser displays the word &trade; above instead of a trademark symbol, then you should contact your browser maker and ask them to support the wider range of ISO character entities.)
Copyleft notice
The HTML composite DTD is Copyleft 1996--97 by Silmaril Consultants and is protected by the terms of the GNU General Public License, a copy of which is included in the distribution of this software as file gnugpl.html.
You may freely distribute and use this material, provided that nothing is done to prevent its further distribution or use, and subject to the condition that it may not be distributed without a similar condition, including this condition, being imposed on the subsequent user. Modifications should be reported to the author so that they can be incorporated in a controlled manner.
HTML Pro is an up-to-date, reliable, and robust Document Type Description (DTD) for the HyperText Markup Language, which is used to create hypertext pages for the World-Wide Web.
HTML Pro is designed for use by professional document authors and designers to replace the earlier versions of HTML which were subject to constraints for standards work which made them inflexible. HTML Pro provides extended support for editors or authors which is not available in most other versions.
The present DTD is built as a composite of all known HTML DTDs to date. It has been arranged in such a way as to allow authors and designers to support their target browsers but still ensure that the files they create are conformant. This means they are valid, parsable instances of SGML (ISO 8879 --- Standard Generalized Markup Language, the language in which HTML is written).
Conformance to a standard is important for businesses and other organizations, and for authors, designers or other individuals because it provides flexibility. This means that portable, reusable, long-term or large-scale Web information can be edited, controlled, and reused on multiple platforms using a wider range of software than is possible when tied to a specific manufacturer.
Web pages can now easily be based in the same SGML
syntax used for a wide range of other corporate and institutional
document projects (archives, servers, databases, marketing and
publication systems,
The HTML Pro DTD is available free of charge from the sources shown below, and is updated following the professional advice of the members of the www-html mailing list.
HTML Pro is an SGML system conforming to International Standard ISO 8879 --- Standard Generalized Markup Language.
The first HTML Document Type Description was devised to support the original versions of World-Wide Web software in the early 1990s. A revised and more widely-discussed version was codified as HTML 2.0 by the Internet Engineering Task Force Working Group on HTML, and adopted by the IETF as a draft standard, RFC1866, in November 1995.
A more advanced but experimental version, HTML+ (now obsolete), had been under discussion for several years, and many of its features were republished as HTML3 in an Internet Draft (March 1995). On expiry, this draft was taken back into discussion, and passed to the World-Wide Web Consortium (W3C), when it took over HTML development from the IETF in 1996.
The W3C has attempted to reconcile the sometimes conflicting aspirations of some of its members by publishing two experimental versions of HTML in May and July 1996: respectively Wilbur, the codename for HTML 3.2, which is largely HTML 2.0 disguised with the addition of stylesheet and scripting support; and Cougar, which is a less structured version of HTML3 with some changes to the tables model, and some of the new content markup removed. While both have their adherents and proponents, both are highly selective, and neither can be said to be in any way stable or comprehensive. Both also suffer from a lack of discipline, because they try to accommodate historical, even obsolete, practice in a fully backwards-compatible manner. The result is that both use a flat content model for the body of the document, and have reintroduced --- without warning or explanation --- several elements which the original designers had clearly marked for removal.
Companies
implementing Web software (particularly browsers), and individuals
implementing HTML pages, have throughout this period
continually sought additional markup facilities for their own and their
customers' purposes. In some cases element names were simply invented on
an
The edition of HTML presented here, codenamed Aardvark, is a composite of all known versions to date. It contains all the elements published (as far as they can be found or identified) in the various forms of HTML since 1990, in a manner which can be used by editors, browsers, parsers, databases, search engines, formatters, indeed any conforming application of ISO 8879 --- Standard Generalized Markup Language (SGML, the language in which HTML is written).
HTML Pro is intended for use by the information professional who has neither the time nor the inclination to participate in the interesting but destructive competition between rival browser makers, and who is unwilling to mislead her clients by creating anything other than valid, conformant instances of HTML.
There are now 140 elements in the DTD sharing 245 different attribute names. Of these attributes, seven are common to all elements (see below). The element with the greatest number of attributes is INPUT (52).
The HTML Pro DTD is available in several forms: this file with active links is maintained at http://www.arbornet.org/~silmaril/dtds/html/htmlpro.html and the distribution files are held at ftp://ftp.ucc.ie/pub/html/
The full distribution as a single compressed archive for UNIX (htmlpro.tar.gz), PCs (htmlpro.exe), or Macs (htmlpro.zip.hqx), and a VAX/VMS save-set version is in preparation (htmlpro.sav);
A minimal installation (panofkit.exe) consisting of the DTD, stylesheet, navigator, and support files to demonstrate the use of a free SGML browser/viewer for PCs (Panorama: installation instructions below).
The htmlpro.dtd file on its own, for existing users who are upgrading, or who don't want or need the background and supporting files;
This document (htmlpro.html), also available for use with Panorama (htmlpro.sgml, see below), and as a PostScript file (htmlpro.ps) for printing or viewing;
The full or the minimal distribution should be saved to a suitable temporary directory before installing (unless you are running WinZip, which automates this). There is a DOS htmlpro.zip version available for non-Windows users.
The installation needs to be able to create (or if they exist, to write to) the /usr/local/lib and /html directories (on a PC, this is C:\usr\local\lib and C:\html; on a Mac this means creating folders at the top level in your hard disk).
If you use a shared computer system, you may not have permission to create directories or folders at this level, in which case you need to obtain that permission from your systems administrators, or have them do the installation for you.
It is quite possible to install HTML Pro elsewhere on your hard disk, but you will need to consult the manuals for your applications software to see what directories the resulting files need to be moved into before it will work. The list of files in the distribution (see below) shows in [square brackets] the directories or folders into which files go by default.
The UNIX version unpacks with the commands:
gunzip htmlpro.tar.gz
tar -xvf htmlpro.tar
UNIX users who intend to use Emacs with psgml-mode to parse SGML Public Identifiers in their files will need to create downcased soft links to the subdirectories DTD and ENTITIES within any of the Public Owner directories in /usr/local/lib/sgml, for example with the commands (in each directory needed):
ln -s DTD dtd
ln -s ENTITIES entities
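For sites with many Public Owner directories, the linking step can be scripted. The following is only a minimal sketch in Python, not part of the HTML Pro distribution: the function names link_lowercase and link_all_owners are hypothetical, and it assumes a case-sensitive file system where symbolic links are permitted.

```python
import os


def link_lowercase(owner_dir, names=("DTD", "ENTITIES")):
    """Create lowercase soft links (dtd -> DTD, entities -> ENTITIES) in one owner directory."""
    created = []
    for name in names:
        target = os.path.join(owner_dir, name)
        link = os.path.join(owner_dir, name.lower())
        # Link only when the uppercase directory exists and the lowercase name is still free.
        if os.path.isdir(target) and not os.path.lexists(link):
            os.symlink(name, link)  # relative link, equivalent to `ln -s DTD dtd`
            created.append(link)
    return created


def link_all_owners(sgml_root):
    """Apply link_lowercase to every directory directly under sgml_root."""
    created = []
    for entry in sorted(os.listdir(sgml_root)):
        owner_dir = os.path.join(sgml_root, entry)
        if os.path.isdir(owner_dir):
            created.extend(link_lowercase(owner_dir))
    return created
```

Calling link_all_owners("/usr/local/lib/sgml") would then do the same work as running the two ln commands in each owner directory; running it a second time creates nothing new.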
The PC/Windows version is a WinZip self-installer which unpacks automatically when you double-click it. If you prefer not to execute such programs, you should download the DOS version instead and unzip it manually.
Emacs users under DOS and Windows do not need to change any directory names.
The Mac version is a BinHex'd zip file which can be unpacked by UnStuffIt or similar dearchiving program. The BinHexing should be unnecessary, in which case you can download the DOS zip file referred to above, and drop it onto UnStuffIt. Macs are poorly catered for by the SGML software community, so little information is available about where files like DTDs and character entity collections should be installed: if you have this information, please mail us.
The versions of HTML included in this edition were taken from public copies of DTDs and fragments on the Web: this list appears in slightly altered form later on, showing the codes used in the DTD to identify sources.
1.0(CERN, 1992)
*Asterisked items were not produced by the companies themselves, but are compatibility releases generated from the HTML3 DTD or other sources in order to provide some foundation for the experiments and claims of the companies. While the browser makers are occupied in developing the interface, it is unlikely that these DTDs will be revised.
My thanks to all the many authors and contributors to these DTDs, whose notes and comments have made it easier to work out what to do, and whose research has made it possible to identify to some extent which browser can support which elements.
The current version of the HTML Pro DTD is 0.11, and it is released for comment by the relevant interested parties on the www-html mailing list and elsewhere. Please forward all comments and suggestions to the mailing list for discussion.
Very few changes to structure have been made, although those familiar with the internals of previous versions will notice the large amount of additional material from the other sources, including the unilaterally-introduced elements invented by some browser makers (which in many cases has involved a guess at their structure, as the companies have often been reluctant to provide details).
The header is as defined in HTML3, with the addition of NOSCRIPT and BGSOUND. The content model has been rearranged to allow intermingled META and LINK elements, and still try to help editing systems keep header elements in some semblance of logical order (my thanks to Joe English for this code). The use of META is now expanding to include the Dublin Core metadata qualifiers (Title, Subject, Author, Publisher, OtherAgent, Date, ObjectType, Form, Identifier, Relation, Source, Language, and Coverage).
The most significant structural change in the text body is to the way in which content models are used. The original mechanism was for %body.content; to contain both structural and descriptive elements as peers, so that there was no distinction between, say, a list and some running text in italics. The implication was originally that inline markup, which was identified as %flow; in earlier versions, should be contained in a paragraph-level element, and this is in fact what SoftQuad's HoTMetaL Pro does when it imports existing invalid HTML.
The DTD has not been modularised in any way: on the contrary, it is currently an entirely flat file with only the barest minimum of entities. This is deliberate at this stage, as some non-SGML tools were used in the conversion of the code from the various sources, and following the text formatting established by Near&Far for the layout proved useful for this purpose, where each declaration and each attribute definition is followed by a line containing a comment. In a future version, use will be made of the work of Altheim, Connolly, Maler, and Allen in defining the modularised version of HTML.
There are now four classes of elements, defined by the entities following:
Parameter entity | Elements represented |
Elements which are generally used to contain the continuous text of the document. | DIV, CENTER, H1 to H6, P, UL, OL, DL, DIR, MENU, PRE, XMP, LISTING, BLOCKQUOTE, BQ, MULTICOL, NOBR, FORM, TABLE, ADDRESS, FIG, BDO, NOTE, and FN; plus WBR, LI, and LH |
Elements which usually contain special-purpose material, or no text material at all. |
|
Elements within the
| Descriptive or analytic markup: Visual
markup: Hypertext
and graphics:
Mathematical:
Documentary: |
Mathematical content. |
|
The distinction is that the insertions class can be peer with text as well as with structure, whereas text can nest only within structure.
Some changes have been made to the content models of the text elements to accommodate this, but the only noticeable ones are SUP and SUB, which may now contain math elements even when used in non-math mode. The exclusion exceptions for MATH have been significantly reduced because it would appear from discussions with mathematicians that it is in fact unobjectionable for math to contain many of the elements previously proscribed. The effect of these changes is an increase in the amount of mixed content (see note below), which some analysts regard as pernicious.
The exclusion exceptions for PRE now include TT and BR, as neither has any relevance in preformatted fixed-width material, but SUB and SUP are now permitted, as it seems perfectly reasonable that an author might want to represent typewritten material containing subscripts and superscripts. PRE can now include IMG to enable character-cell-sized image glyphs, and APPLET, EMBED, and OBJECT to allow the dynamic generation of fixed-width content.
The ICADD fixed attributes which were added by the late Yuri Rubinsky have been reinserted for those elements to which they were attached in RFC1866. Work is proceeding with the developers of the ICADD DTD to identify what equivalents to add for the remaining elements, and this is being coordinated with a similar exercise being undertaken on the Text Encoding Initiative (TEI) DTD. A note to RFC2070 says:
HTML contains SGML Document Access (SDA) fixed attributes in support of easy transformation to the International Committee for Accessible Document Design (ICADD) DTD "-//EC-USA-CDA/ICADD//DTD ICADD22//EN". ICADD applications are designed to support usable access to structured information by print-impaired individuals through Braille, large print and voice synthesis. For more information on SDA & ICADD:
Following the informal discussions at SGML'96, some recent suggestions from Harvey Bingham have been incorporated in respect of Tables (additional attributes).
I am grateful to Foteos Macrides for keeping me updated on the implementations used for Lynx and other browsers. His request for a TITLE attribute for FORM and MAP triggered a rework of the common attributes which are now applied to all elements:
The LANG and DIR (and some changes to the SGML Declaration) are reintroduced directly from the Internet Draft on Internationalization of HTML (now RFC2070). Authors should note the following (from RFC2070) and ensure that they use only valid numeric character references:
The
SGML declaration, like that of HTML
2.0, specifies the 32 character numbers 128 to 159 (decimal) as
UNUSED. This means that numeric character references within
that range (
The relevant RFCs referred to here and in the DTD source code are included in the distribution in /usr/local/lib/sgml/IETF/HTML for want of a better location.
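The warning about character numbers 128 to 159 can be checked mechanically before a file is parsed. The following is a minimal sketch in Python, not part of the distribution: the function name unused_char_refs is hypothetical, and only decimal references of the form &#NNN; are examined.

```python
import re

# Decimal numeric character references such as &#150; (other forms are not handled here).
NUMERIC_REF = re.compile(r"&#([0-9]+);?")


def unused_char_refs(text, low=128, high=159):
    """Return (line_number, reference) pairs whose number lies in the UNUSED range."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in NUMERIC_REF.finditer(line):
            if low <= int(match.group(1)) <= high:
                hits.append((lineno, match.group(0)))
    return hits
```

Each reported pair points at a reference that should be replaced by a named character entity or by a character number outside the 128 to 159 range.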
Scott Preece suggested the NOINDEX attribute to provide a means of allowing an author to specify to indexing engines that the document is not considered meaningful for index-gathering (valuable for pages originally generated from scripts but stored as temporarily static files).
Foteos also brought to my attention the need to replace more of the material originally in HTML3 which had fallen away, as well as the attributes supported by Lynx, including the DISABLED attribute for INPUT, OPTION, SELECT, and TEXTAREA; the PLAIN attribute for UL; CONTINUE for OL; ALT for BGSOUND, APPLET, AREA, IMG, and EMBED; and the HREF attribute (and ACTION as a synonym, from Mosaic) for ISINDEX.
It is worth warning that some browser and editor manufacturers are accepting or generating very careless code of the form
<isindex http:foo.bar.com/blort>
which is not only unparsable, and thus unusable anywhere else, but unnecessary given the availability of HREF.
Drazen Kacar suggested the reinclusion of the CLEAR attribute on all structural elements, so this has been done. The assumption is that it is not needed on inline or descriptive markup.
The controversy over EMBED versus OBJECT was no less acrimonious than the earlier one over APP or APPLET versus FIG, and has been treated by including all five elements as implemented. The now-expired
Internet
Draft on Compound Documents by EMBED
,
rather than the attribute soup
approach piloted by some
browser-makers, and the W3C now has
their own
document on the subject.
Little comment has been received to date about math content. The view was expressed at SGML'96 that the whole question of SGML math was still up for grabs, so it is unlikely that any significant change to the HTML3--based content will be made unless mathematicians present some kind of conclusion.
I am grateful to Dave Carter, however, for the suggestion to include FONT in the model, so that manual/visual adjustment can be made to large symbols where browsers fail to render them automatically. There is as yet no default inclusion of the ISOams* character entities (although that can easily be done with a simple edit).
There is one significant new attribute, ROLE on MATH, which can take the values INLINE or DISPLAY. This corresponds with standard mathematical usage as reflected in TeX (this attribute has been made #REQUIRED). Following hints from Tom Magliery, it is possibly true that most math authors are writing in TeX and making GIFs.
There has been one special group of added elements: those implemented by WebTV without a DTD. The elements added are AUDIOSCOPE and SIDEBAR in the insertions class, and BLACKFACE, LIMITTEXT, NOSMARTQUOTES, and SHADOW in the text class. A substantial number of specialist attributes have been added to handle the WebTV output, and these are identified by (T) in the DTD source.
The
only requested addition has come from FirstFloor Software, who provide
the Smart Bookmarks module shipped in Netscape's PowerPack.
This allows for specification on any A
(anchor) element of
the date and time (HTTP format,
Manufacturers who are experimenting with additional markup are invited to send details (with a DTD fragment) so that the elements can be included in a subsequent release of HTML Pro. The address for submissions is silmaril@m-net.arbornet.org
The astute reader will have noticed the new elements ELEMENT and ATTRIB, which have been added to this version to test their use in documentation, in order to accompany the already proposed ENTITY and COMMENT. It is not the intention that they should remain past v1r0 unless a formal approach is made to the then controllers of the HTML standard.
The COMMENT element is in any case wholly redundant, and it is assumed that the element was invented in an uncontrolled burst of misdirected enthusiasm, coupled with a significant failure to read the HTML specification, as COMMENT does not correctly implement comments at all (embedded markup gets interpreted instead of ignored, in implementations to date).
Where it has been possible to find out who invented an element, this is noted in the comment which follows every element and attribute in the DTD source code. This comment is in the form used by Microstar's Near&Far, which they call a title:
<!element foo - - %bar;
  --<Title>(X)comment goes here-->
...
<!attlist foo
  %commonatts;
  bar CDATA #IMPLIED
  --<Title>(Y)comment goes here-->
The attribution is given by quoting one or more single-letter codes in parentheses immediately before the comment (see Figure 2 for the codes). I am grateful to the authors of the WebTV proposals for their work on identifying some of these origins.
Code | Source |
1 | HTML, CERN original version (1) |
2 | HTML 2.0 (RFC1866), the formal standard |
3 | HTML3, the expired Internet Draft |
C | Cougar, a W3C experiment |
W | Wilbur, HTML 3.2, a W3C stopgap |
M | Microsoft proposals |
N | Netscape proposals |
S | Sun proposals |
L | Lynx proposals |
T | WebTV proposals |
I | Internationalization (I18N, RFC2070) |
F | Form-based file uploads (RFC1867) |
O | Compound Documents in HTML (Burchard & Raggett) |
P | HTML Pro experimental or enabling mechanism |
D | ICADD/HTML Interest Group (from SGML'96) |
B | FirstFloor Software's Smart Bookmarks Netscape plugin |
? | Unknown, information welcomed |
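Because the attribution codes follow a fixed pattern inside each title comment, they can be pulled out of the DTD source with a short script. The following is a minimal sketch in Python under stated assumptions: the function name attributions, the regular expression, and the abbreviated source labels are illustrative only, and the comment layout is assumed to match the example shown above exactly.

```python
import re

# Abbreviated labels for the single-letter source codes listed in the table above.
SOURCES = {
    "1": "CERN original version",
    "2": "HTML 2.0 (RFC1866)",
    "3": "HTML3",
    "C": "Cougar",
    "W": "Wilbur",
    "M": "Microsoft proposals",
    "N": "Netscape proposals",
    "S": "Sun proposals",
    "L": "Lynx proposals",
    "T": "WebTV proposals",
    "I": "Internationalization (RFC2070)",
    "F": "Form-based file uploads (RFC1867)",
    "O": "Compound Documents in HTML",
    "P": "HTML Pro",
    "D": "ICADD/HTML Interest Group",
    "B": "Smart Bookmarks",
    "?": "Unknown",
}

# Matches --<Title>(codes)comment-- and captures the codes and the comment text.
TITLE = re.compile(r"--<Title>\(([^)]*)\)(.*?)--", re.S)


def attributions(dtd_text):
    """Return (comment, [source names]) for every title comment found in dtd_text."""
    results = []
    for codes, comment in TITLE.findall(dtd_text):
        names = [SOURCES.get(code, "unrecognised code") for code in codes]
        results.append((comment.strip(), names))
    return results
```

A comment carrying more than one code, such as (SN), yields one source name per letter, in the order written.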
The HTML3 concept of HTML-specific character entity files has been ditched, and this version includes the whole of ISOlat1, ISOlat2, ISOnum, ISOpub and ISOtech, which should have been done years ago.
The infamous problem of mixed content in list items (and elsewhere) has been tackled head on by simply permitting it: text data is allowed as well as paragraphs and other structural markup like lists, tables, and forms. This may cause some less well-endowed editors a little grief, but the advantages of being able to tweak the performance characteristics of various browsers are too great to pass up.
For those unfamiliar with mixed content, this is a position which obtains when an element can contain both running text (PCDATA in SGML terms), and the inline markup that usually accompanies it, as well as structural markup. Inline markup is the term given to markup which normally occurs at paragraph level, such as emphasis, hypertext links, or font style changes like bold or italics. Structural markup is that which defines the bones of the document: paragraphs, lists, headings, sections, tables.
White space (linebreaks, tabs, or spaces) between structural elements is usually not significant to an SGML parser, as it cannot have any meaning in terms of the document structure: meaning is borne by the markup at this level. For this reason, HTML browsers (and some SGML display systems) disregard or remove such white space, so it can freely be inserted by editors and authors for their own visual comfort and ease of editing, as it will not affect the display or use of the document. White space within running text (at the paragraph level, for example) is of course very significant, as it separates words.
Where the content of an element is mixed, therefore, there is an ambiguity about whether any space is there for redundant (non-significant) purposes, or if it is a part of the textual data content. A machine simply cannot tell --- only a human can detect this from the contextual meaning. To enable parsing to work, linebreaks are not permitted by the rules of SGML in certain circumstances immediately prior to, between, and immediately following, structural elements in mixed content.
This has been regarded as unavoidable, given that different usages are needed, and the same has been permitted within the cells of tables, as established practice dictates. Rather than produce several DTD versions, one permitting text where another permits only further markup (the lax and strict versions of many HTML DTDs to date), it has been assumed that professional users would be equipped with a suitable editor capable of handling the behavioral and formatting requirements of mixed content.
It is assumed that the user is aware of the
meaning and usage of the elements of HTML, and of
the principles of SGML. Those who are unfamiliar
with these will find help in the only SGML-compliant
book on the use of HTML,
The
World-Wide Web Handbook by
A more extensive list including the proposed additional elements, with a short comment on their intended usage, is included in the HTML Pro distribution as file htmlpro.tag (as used in SoftQuad's RulesBuilder), and is reproduced here in Figure 3.
Periodic Table of the Elements for HTML
(1) Hypertext anchor | |
(3) Abbreviation | |
(3) Numerator of fraction | |
(3) Acronym | |
(1) Address block | |
(S) Obsolete name for | |
(N) Java or other applet | |
(N) Area in clientside imageMAP | |
(3) Math array | |
(3) Top half of unlined fraction | |
(P) SGML attribute name | |
(3) Author name | |
(T) Visual display of a sound played | |
(L) Author | |
(2) Unidentified bold type | |
(3) Unscrollable banner | |
(3) Math overbar | |
(1) Home URL of this file | |
(M) Base size for subsequent fonts | |
(I) BiDirectional override | |
(3) Denominator of fraction | |
(M) Background sound | |
(3) Larger type | |
(T) Extra bold | |
(N) Blinking | |
(1) Block (indented) quotation | |
(1) Body of the text | |
(3) Identifies body after BANNER | |
(3) Math fraction | |
(3) Alternate BLOCKQUOTE | |
(2) Forced linebreak | |
(3) Math bold typewriter type | |
(3) Caption of TABLE or FIGure | |
(N) Arbitrary centering | |
(3) Math binomial choice | |
(3) Citation (book, product, etc) | |
(2) Program code | |
(3) Column spec in TABLE | |
(3) Group of COLs | |
| |
(3) Credits for FIGure (illustration) | |
(1) Definition list discussion | |
(3) Math double-dot accent | |
(3) Text marked for deletion | |
(3) Definition of new (index) term | |
| |
(3) Structural division | |
(1) Definition list | |
(3) Math dot accent | |
(1) Definition list term | |
(P) SGML element name | |
(1) Emphasis | |
(M) Embedded object | |
(?M) SGML or other entity name | |
(?N) Group of FORM fields | |
(3) Figure or illustration | |
| |
(3) Footnote | |
(N) Specific font change | |
(2) Fill-in form | |
(N) Subdivision of user's window | |
(N) Group of FRAMEs | |
(1) Top-level heading | |
(1) Second-level heading | |
(1) Third-level heading | |
(1) Fourth-level heading | |
(1) Fifth-level heading | |
(1) Sixth-level heading | |
(3) Math hat (circumflex) accent | |
(1) Documentation header | |
(1) Horizontal rule | |
(1) Entire HTML document | |
(2) Unidentified italics | |
| |
(2) Inline image (picture, icon) | |
(2) Inline input in FORM | |
(3) Text to be inserted | |
(1) Input to non-FORM script | |
(3) Item in MATH ARRAY | |
(1) Keyboard key or screen button | |
(N) Generates PublicKey | |
(?N) FORM field name label | |
(3) Foreign language | |
(3) MATH fraction left delimiter | |
(3) List heading | |
(1) List item | |
(T) Width-limited text | |
(2) Link to other resources | |
| |
(N) Clientside imagemap | |
(M) Marching display | |
(3) Mathematics | |
| |
(2) Metainformation | |
(N) Encloses multicolumn text | |
| |
(N) Prohibit linebreak | |
(M) Alternative for browsers with no EMBED | |
(N) Alternative for browsers with no FRAMEs | |
(N) Alternative for browsers with no SCRIPT | |
(T) Turns off | |
(3) Note of any kind | |
(W) Inline embedded object | |
(3) MATH ROOT divider | |
(1) Ordered (numbered) list | |
(2) Choice in a FORM SELECTion menu | |
(3) MATH fraction (BOX) divider | |
(3) FIGure image overlay | |
(1) Paragraph | |
(S) Parameter to APPLET or OBJECT | |
(3) Personal name | |
| |
(3) Preformatted text | |
(3) | |
(3) Text between two SPOTs | |
(3) MATH fraction right delimiter | |
(3) MATH root | |
(3) Row in MATH ARRAY | |
(2) Strikeout text | |
(1) Inline computer input/output | |
(SN) Executable inline script | |
(2) FORM selection menu | |
(M) Invokes server-side
| |
(T) Shadowed text | |
(T) Unscrollable sidebar | |
(3) Smaller type | |
(N) Inserts concrete spacing | |
(3) Identifies arbitrary portion of text | |
(3) Start or end of RANGE | |
(3) MATH square root | |
(3) Strikeout text | |
(1) Strong emphasis | |
(3) Embedded stylesheet formatting | |
(3) Subscript | |
(3) Superscript | |
(3) MATH typewriter type | |
(3) Relocatable tab across screen/page | |
(3) Tabular alignment | |
(3) Body within a TABLE | |
(3) Cell of TABLE Data | |
(2) Free-form text input in FORM | |
(3) Dummy | |
(3) Foot of a TABLE | |
(3) Row/col header cell in TABLE | |
(3) Headings of a TABLE | |
(3) MATH tilde accent | |
(1) Document title | |
(3) TABLE row | |
(1) Unidentified typewriter type | |
(1) Underlined type | |
(1) Unordered (bulleted) list | |
(1) Computer variable or filename | |
(3) MATH vector | |
(N) Wordbreak | |
| |
(P) Container for corpora of HTML |
The machine-generated status of the DTD file
meant that there was a substantial amount of legacy comment from the
assorted versions used in compositing the DTD. The
majority of this has now been edited out so that obsolete comments and
parameter entities are not left to confuse the unwary reader, but users
who locate undetected
Now that the DTD is stabilizing, there is a need to re-parameterize some of the material, and use some of the architectural material already tested in the modular version of the standard.
The elements for which no ICADD fixed attribute exists need analyzing and the relevant values adding from the International Committee for Accessible Document Design DTD ("-//EC-USA-CDA/ICADD//DTD ICADD22//EN"; work is under way on this).
A longer-term goal is to devise and maintain publicly a table of which elements and which attributes are actively supported in which browsers.
There are undoubtedly other errors, both of omission and of commission, and I would be very grateful for details so that they can be fixed: silmaril@m-net.arbornet.org.
The current distribution includes the following files. The default installation directory is given in [square brackets].
On a UNIX or DOS-based system, this usually means the mapping given in Figure 4. The somewhat strange names of some existing files are a result of a misunderstanding of the nature of the role of the ISO 9070 Public Owner Identifier by the authors of some of the DTDs.
Software manufacturers who would like compiled versions in their own format included are invited to contact the authors.
| Registered (+) or unregistered (-) under ISO/IEC 9070 | Ignored while most owners are unregistered |
Silmaril// | ISO/IEC 9070 Public Owner Identifier | Treated as a directory name under /usr/local/lib/sgml/, with spaces replaced by underscore characters. |
| Class | Class of document treated as further subdirectory (note Emacs psgml-mode expects this to be lowercase; other systems expect uppercase; DOS/Windows doesn't care). |
HTML Pro v0r11 19970101// | Name | Document name: treated as filename, with spaces replaced by underscore characters. |
| Language | Ignored by most software (default English). |
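The mapping in the table can be expressed as a small function. The following is a minimal sketch in Python under stated assumptions: the function name fpi_to_path is hypothetical, the identifier is assumed to have exactly four parts separated by double slashes, and the sample identifier in the usage note is assembled from the table rows (with DTD as the class and EN as the language) rather than quoted from the DTD itself.

```python
import posixpath


def fpi_to_path(fpi, root="/usr/local/lib/sgml", lowercase_class=False):
    """Map a public identifier to a file path following the table above."""
    parts = fpi.split("//")
    if len(parts) != 4:
        raise ValueError("expected registration//owner//class name//language")
    _registration, owner, text, _language = parts  # registration and language are ignored
    doc_class, _, name = text.partition(" ")       # first word is the class, the rest is the name
    if lowercase_class:
        doc_class = doc_class.lower()              # for Emacs psgml-mode
    return posixpath.join(root, owner.replace(" ", "_"), doc_class, name.replace(" ", "_"))
```

Under these assumptions, fpi_to_path("-//Silmaril//DTD HTML Pro v0r11 19970101//EN") gives /usr/local/lib/sgml/Silmaril/DTD/HTML_Pro_v0r11_19970101, and passing lowercase_class=True gives the dtd form that psgml-mode looks for.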
No SGML software is included, as this can all be obtained online. A good list of software and associated information on projects and activities is kept by Robin Cover at http://www.sil.org/sgml/sgml.html.
The minimum to get started editing HTML in a fully-conformant manner is an SGML editor. Some of the most popular cross-platform systems seem to be:
Author/Editor from SoftQuad. This is a straightforward graphical editor for any DTD, and is available for MS-Windows (3.x and 95), Apple Mac, and UNIX/X. It is only an editor, not a DTP system, so its paper-publishing facilities are limited (SoftQuad do a separate system called Sculptor which takes A/E into the professional publishing field).
GNU Emacs with psgml-mode. This free editor is well-known in the UNIX field (it also reads mail and Usenet news, browses the Web, does FTP, and a thousand other things as well as just edit), but it also works well under MS-DOS and Windows, and there are user-supported versions available separately for Macs and VMS.
SoftQuad also do a Web browser add-on for Windows PCs called Panorama. There's a free version online, but if you're planning on using SGML more than trivially, the commercial version, Panorama Pro, is strongly recommended.
This file can be used with either version, but with PanoFree you need to install the minimal subset of HTML Pro, whereas with PanoPro it will automatically download the right bits and pieces for you. If you've got PanoPro already installed and working, you can go right ahead and click here.
If you want to get started with the free version, do the following:
Make sure your regular browser (Netscape, Mosaic, Explorer) is already running first: presumably it is, or is at least possible, or you wouldn't be reading this (unless someone else printed it off for you);
Download PanoFree and double-click to install it, making sure you use the default directory (C:\Softquad\Panorama). If it doesn't mention your browser as one it works with, don't worry --- after installation is finished, go to your browser's Options menu and reconfigure your Helper applications for the MIME type text/x-sgml to make file types .sgm and .sgml start the c:\softquad\panorama\panorama.exe program;
Close down and restart your browser so that it picks up the link to Panorama correctly;
Don't be tempted to try and run Panorama yet before the installation of HTML Pro is complete;
Download the PanoFree HTML Pro starter kit (about 45Kb) and double-click to install it;
Now you should be able to click here and view this file using all the facilities of a real SGML application (well, nearly all: some of them are reserved for customers of the commercial version, Panorama Pro).
Disclaimer
Silmaril has no connection with or financial interest in SoftQuad, Inc. or with any other company mentioned in this document.
Don't forget, it's spelled HTML Pro but it's pronounced A-a-r-d-v-a-r-k