History of the Kinder, Gentler HTML Validator

This page lists the history of the Kinder, Gentler HTML Validator.

December 18, 1997:

W3C HTML validator announced, based on the KGV's code. (I started working for W3C in Sep 1997.)

October 22, 1997:

Updated SP to the most recent version, 1.2.1. This version fixes a bug with URL-based system identifiers for DTDs served from HTTP/1.1-capable HTTP servers (like Apache, which is used by W3C.)

Updated the DTDs for HTML 4.0 to the most recent version, dated "1997/09/17 11:45:33".

Note: I couldn't get the HTML declaration included with this package to work with my version of SP, so I am still using the decl from the 970708 distribution.

July 9, 1997:

Added the DTD for HTML 4.0.

May 14, 1997:

Updated the Cougar DTD to the most recent version, dated "1997/05/14 09:54:36".

May 12, 1997:

Added the DTD for HTML Level "Cougar."

Updated my version of SP to 1.1.1, with support for multibyte character sets (such as Unicode, the document character set used in Cougar.)

April 27, 1997:

Added the DTD for HTML 3.2 + Style.

February 21, 1997:

Updated the DTD for HTML Pro, to version 0 revision 11. I had to remove the entities "zwnj", "zwj", "lrm", and "rlm", because they were causing errors since I'm not using HTML Pro's SGML declaration. (I can't use that one because my nsgmls doesn't yet have multibyte support.)

January 29, 1997:

Updated the HTML 3.2 DTD to the most recent version (the Final version, as it appears in the HTML 3.2 Reference Specification.)

November 15, 1996:

Added the DTD for Microsoft Internet Explorer 3.0 (and miscellaneous supplementary files) to the SGML catalog.

October 17, 1996:

Added the DTD for Spyglass HTML 2.0 Extended.

October 6, 1996:

Added the DTD for HTML Pro (version 0 revision 8).

September 9, 1996:

Updated the HTML 3.2 DTD to the most recent version (dated "Tuesday August 21st 1996" [sic] in the file, but "Sun, 08 Sep 1996 16:09:34 GMT" on the HTTP server.)

This version of the HTML 3.2 DTD fixes the problems people were having with empty APPLETs.

August 15, 1996:

Updated the HTML 3.2 DTD to the most recent version (dated "13-August-96".)

August 5, 1996:

Updated the HTML 3.2 DTD to the most recent version (dated "25-June-96".)

June 3, 1996:

Updated the HTML 3.2 DTD to the most recent version (dated "31-May-96".)

May 27, 1996:

Updated Weblint to version 1.016.

May 18, 1996:

Updated the Netscape DTD maintained by WebTechs.

Added the Softquad DTD to the SGML catalog.

Changed the blurb given when no DOCTYPE is present to recommend using HTML 3.2 and WebTechs public identifiers rather than HTML 3.0 and the old DOCTYPE for Netscape (since the DTD isn't published by Netscape.)

Added an icon for HTML 3.2, thanks to Lars Balker Rasmussen.

Added an icon for HTML 1996-01, thanks to Cyril Slobin.

Updated Weblint to version 1.015.

May 7, 1996:

Added the HTML 3.2 DTD (a.k.a. Wilbur) to the SGML catalog.

March 15, 1996:

Added the O'Reilly HTML Extended DTD 1.0 (and miscellaneous supplementary files) to the SGML catalog.

March 13, 1996:

Added the DTD for Microsoft Internet Explorer 2.0 (and miscellaneous supplementary files) to the SGML catalog.

February 11, 1996:

Added Dan Connolly's "HTML 1996-01" modular DTD (in a directory of its own) to the SGML catalog.

Added OVERRIDE YES to the SGML catalog, to prevent errors with system identifiers. For more information, see the man page for nsgmls or TR 9401:1995 - Entity Management (SGML Open Technical Resolution 9401:1995.)

February 4, 1996:

(Actually, this stuff was gradually added over the last while; today's just the day I finished these changes and made this new version the default one.)

Started displaying icons for successful validation runs.

Added a Weblint option, along with a "use Weblint in pedantic mode" option.

Added the "show outline" option, to teach people how to use headings properly.

Added the "don't show attributes" option to the "show parser output" option.

Added some signal-trapping stuff to clean up temporary files if the process gets killed prematurely because of impatient people.

Started intercepting a couple special validator errors ("cannot generate system identifier for entity ..." and "cannot open ..."), calling them "Fatal errors" and printing out an appropriate error message (and not displaying the cascade of errors from the confused parser.)

Started doing the Right Thing when a URL isn't supplied (printing the welcome page.)

Started nagging people to update their links if they're still pointing to the "validate.cgi" URL.

Fixed the appearance of line numbers in the "show source" listing (added leading whitespace).

Fixed the output format slightly, to clearly separate different sections of the validation results page.

Fixed the behavior of line numbers in the source listing and validator output when a DOCTYPE is inserted because it was missing: now the inserted DOCTYPE appears as line "0", so the other line numbers are the "true" line numbers in the actual file.
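
The numbering rule can be sketched as follows. This is an illustrative Python version under stated assumptions, not the validator's code: the function name, the `width` parameter, and the `: ` separator are all hypothetical.

```python
def number_lines(lines, doctype_inserted=False, width=5):
    """Return source lines prefixed with right-aligned line numbers.

    If a DOCTYPE was inserted as the first element of `lines`, counting
    starts at 0 so the remaining numbers match the author's original file.
    """
    start = 0 if doctype_inserted else 1
    return [f"{n:>{width}}: {text}" for n, text in enumerate(lines, start=start)]


if __name__ == "__main__":
    # The DOCTYPE used here is only an example of what might be inserted.
    listing = number_lines(
        ['<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">', "<HTML>", "<BODY>"],
        doctype_inserted=True,
    )
    print("\n".join(listing))
```

Padding the numbers to a fixed width keeps the source text aligned in a monospace listing, which is the effect of the leading whitespace mentioned above.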

Changed the User-Agent: used for URL fetches to "KGValidator/1.3" to reflect a new "version" of the validator.

January 11, 1996:

Changed the output format back to a <PRE>-formatted version, after realizing that it was possible to make a valid <PRE>-with-<IMG> version by putting an extra tag around each <IMG> to change the content model in effect. (For more info, see the discussion on www-html.)

Started splitting error messages in half if they seem to be too long for the line with the arrow on it.

The "explanation..." links next to error messages only appear if an explanation actually exists in the unofficial KGV FAQ.

Changed the User-Agent: used for URL fetches to "KGValidator/1.2" to reflect a new "version" of the validator.

January 5, 1996:

Made links from each error message to the appropriate explanation in the unofficial KGV FAQ.

Fixed up the intro page somewhat.

January 4, 1996:

Added the DTD used by AdvaSoft's HTML editors, and the appropriate entry in the SGML catalog.

Started logging the results of validation runs (in a very simple format, so far.)

December 26, 1995:

Endorsed on Usenet by Dan Connolly (HTML activity lead at W3C, co-author of the original HalSoft HTML validator).

December 21, 1995:

Fixed up the intro page somewhat.

Created this list of changes.

Upgraded nsgmls from version 0.4 to version 1.0.1 (and updated my post-processing code to compensate for a small change in nsgmls' output format).

Added the DTD for <EMBED>, and the appropriate entry in the SGML catalog.

Changed the User-Agent: used for URL fetches to "KGValidator/1.1" to reflect a new "version" of the validator.

December 19, 1995:

Fixed up the input form slightly, and added the ``show parser output'' option.

Figured out a new, not-quite-as-friendly way to display the error messages (but this one doesn't use <PRE>-formatted text, which doesn't allow images inside of it):

-----------------------------------------------------------------
Error at line 8:
BORDER=0[*] ALT="WELCOME TO NETSCAPE"></A>
[*]there is no attribute `BORDER'. (explanation...)
-----------------------------------------------------------------
Error at line 16:
B<FONT SIZE=-1[*]>UGS</FONT>
[*]there is no attribute `SIZE'. (explanation...)
-----------------------------------------------------------------
Error at line 68:
[...]chase_banner.gif" WIDTH=468[*] HEIGHT=60 BO[...]
[*]there is no attribute `WIDTH'. (explanation...)

Unfortunately, the HTML used to accomplish this is pretty ugly, but it's the best I could come up with (without resorting to using tables or using some intermediate hack to figure out the size of the user's monospace font or something). I'm going to ask the HTML Working Group why <IMG> can't appear inside <PRE>, and try to get this changed for a future version of HTML: I can't think of any good reason for this rule, and I've never seen a browser that doesn't support it...

December 16, 1995:

Was informed that my validator output itself does not validate, due to the use of <IMG>s within a <PRE> section.

November 18, 1995:

Started truncating `longer' HTML source lines (i.e., those that are too wide to fit within a typical browser window). Why can't people just format their HTML source nicely?
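
One way to truncate a long line while keeping the error position visible is to cut a window around the error column. The Python sketch below is an assumption about the approach, not the validator's actual logic; the function name, the 70-character default, and the `[...]` marker are illustrative choices.

```python
def truncate_line(line, column, max_width=70, marker="[...]"):
    """Return (excerpt, new_column) for a line too wide to display.

    The excerpt is a window of up to `max_width` characters of `line` that
    contains offset `column`, with `marker` added on any side that was cut.
    `new_column` is the offset of the same character within the excerpt.
    """
    if len(line) <= max_width:
        return line, column
    # Centre the window on the error column, then clamp it to the line.
    start = max(0, column - max_width // 2)
    end = min(len(line), start + max_width)
    start = max(0, end - max_width)
    excerpt = line[start:end]
    new_column = column - start
    if start > 0:
        excerpt = marker + excerpt
        new_column += len(marker)
    if end < len(line):
        excerpt = excerpt + marker
    return excerpt, new_column
```

Returning the adjusted column alongside the excerpt lets the caller still place its error pointer under the right character.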

The validator output now looks like this:
(there's a screen shot here)

(If you hadn't noticed, this is an inline image; I can't include the HTML source here, because it's --gasp-- invalid!)

October 18, 1995:

Updated all the SGML stuff to the most recent versions (by stealing them from HAL's directory of SGML stuff.) I think I might have also changed these a bit based on the HTML specifications; I can't remember.

September 27, 1995:

I replaced the carets (`^') used to point out errors with inline images showing red arrows, blissfully ignorant that you can't have <IMG>s inside <PRE>s. After making this change, I tried validating the output of a validation run by prepending the URL for the CGI script to the URL for a validation run with errors (i.e., calling the validator on itself), but I didn't realize the double query would confuse the HTTP server, so it ended up validating the ``welcome page'' instead. I didn't find out that my validator output was invalid until December 16, when Harold Driscoll tried to use the output of my validator in an article he was writing...

August 24, 1995:

After noticing that `nsgmls' has slightly more useful output than its predecessor, `sgmls', I decided to make a groovy new validation service with the problematic markup displayed along with the error messages.

The first version of this program had output like this:

    Error at line 380:

    </TD>
        ^ end tag for element `TD' which is not open

    Error at line 382:

    </TR>
        ^ end tag for element `TR' which is not open

    Error at line 384:

    </TABLE>
           ^ end tag for element `TABLE' which is not open
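
Output in this style can be produced from a line number, a column, and a message. The Python sketch below is illustrative only: the function name is hypothetical, and treating the column as a 0-based offset is an assumption rather than a statement about what nsgmls reports.

```python
def format_error(source_lines, line_no, column, message):
    """Render one parser error as the offending line with a caret beneath it.

    `line_no` is 1-based; `column` is assumed to be a 0-based offset
    into that line.
    """
    text = source_lines[line_no - 1].rstrip("\n")
    return (
        f"Error at line {line_no}:\n"
        "\n"
        f"{text}\n"
        f"{' ' * column}^ {message}"
    )


if __name__ == "__main__":
    source = ["<P>", "</TD>"]
    print(format_error(source, 2, 4, "end tag for element `TD' which is not open"))
```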

Gerald Oskoboiny
$Date: 2003/03/22 02:34:58 $