This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
Many of my webpages were originally created with Netscape 4.x's Composer and, as a result, have incorrect doctypes (wrong case in certain words):

<!doctype html public "-//w3c//dtd html 4.0 transitional//en">

I have been trying to fix the errors in those pages to bring them up to standard, and was attempting to use W3C's validator to help me do so. Unfortunately, when presented with a page with such a doctype, the validator balks and then sends it through the SGML parser anyway, causing it to essentially "invent" errors where there aren't any. What it should do is simply report that an incorrect doctype was found and that no further validation was done, and then offer a drop-down box of doctypes to choose from to continue the validation (as is done when no doctype is present).

I created a valid test page and then made a copy with one change: I changed the doctype to the old NS4.x doctype (plus a 1):

<!doctype html public "-//w3c//dtd html 4.01 transitional//en">

The two pages are identical except for that difference. The first passes; the second fails with 80 errors, flagging things that are actually valid HTML.

http://members.rogers.com/dpjames/valid.html
http://members.rogers.com/dpjames/invalid.html
==> http://validator.w3.org/check?uri=http%3A%2F%2Fmembers.rogers.com%2Fdpjames%2Finvalid.html

The validation results from the latter are less than helpful, and arguably misleading and inaccurate, since they indicate that there are 80 errors when there are in fact fewer than half a dozen. There is also no information on where one might find out about valid doctypes (an "explain..." link would be helpful here).
showing interest
David: Please attach the problematic page to this bug report so we have it on hand, unless you are absolutely certain that the URL will remain static. The problem is that we do not detect the invalid DOCTYPE until we feed the document to the SGML Parser; but, yes, we should detect this particular error and emit a fatal error rather than a gazillion meaningless "undefined element" messages. This does have some side-effects that want thinking about. The first five messages reported are separate messages from the SGML Parser for a reason. There actually are multiple separate errors here, each of which may appear independently of the others, and they do not necessarily indicate a fatal error. That is, it's possible to have a combination of several of those messages and still get meaningful results for the rest of the document. Hopefully, though, this is sufficiently rare in practice that we can allow ourselves to overstate the importance of these messages (by emitting a fatal error). Setting target to 0.7.0 (aka. "When I Get a Round Tuit") since this wants classification as a feature enhancement and 0.6.x is nominally "frozen".
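The pre-parse check discussed above could look roughly like the following. This is a minimal sketch in Python, not the validator's actual code: the function name `check_doctype`, the status strings, and the short `KNOWN_FPIS` list are all hypothetical, and the list is only an illustrative subset of public identifiers. It assumes the public identifier is compared case-sensitively, which is the mismatch described in this report.

```python
import re

# Assumption: an illustrative subset of public identifiers, not the
# validator's real catalogue.
KNOWN_FPIS = [
    "-//W3C//DTD HTML 4.01//EN",
    "-//W3C//DTD HTML 4.01 Transitional//EN",
    "-//W3C//DTD HTML 4.01 Frameset//EN",
    "-//W3C//DTD HTML 4.0 Transitional//EN",
]

# Match the declaration keywords in any case and capture the
# double-quoted public identifier.
DOCTYPE_RE = re.compile(r'<!doctype\s+html\s+public\s+"([^"]*)"', re.IGNORECASE)


def check_doctype(document):
    """Classify the document's DOCTYPE before any full parse.

    Returns a (status, detail) tuple (hypothetical statuses):
      ("missing", None)      no DOCTYPE with a public identifier found
      ("ok", fpi)            identifier matches a known one exactly
      ("bad-case", known)    identifier differs from a known one only in case
      ("unknown", fpi)       identifier is not recognised at all
    """
    match = DOCTYPE_RE.search(document)
    if match is None:
        return ("missing", None)
    fpi = match.group(1)
    if fpi in KNOWN_FPIS:
        return ("ok", fpi)
    for known in KNOWN_FPIS:
        if known.lower() == fpi.lower():
            return ("bad-case", known)
    return ("unknown", fpi)
```

With a check like this, a "bad-case" or "unknown" result could be reported as a single fatal error, along with the suggested identifier, instead of passing the document on to the parser.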
Nominating for 0.7.0. Our behaviour has changed in this area during the 0.6.x series, so this may now be a question of presentation.
http://qa-dev.w3.org/wmvs/HEAD/check?uri=http%3A%2F%2Fmembers.rogers.com%2Fdpjames%2Finvalid.html shows that this seems to be fixed.
Indeed it does. And it doesn't depend on Bug #739 either. Closing as FIXED.