One problem with the tests for this is that they currently don't agree on whether they're testing the eventual encoding (which may include the tokenizer changing the encoding during parsing, after the pre-parse) or just the pre-parse.
Reference: http://www.w3.org/html/wg/drafts/html/master/infrastructure.html#algorithm-for-extracting-a-character-encoding-from-a-meta-element
Because the ContentAttrParser looks only for a space character to terminate an unquoted charset value, the element
<meta http-equiv="Content-Type" content="charset=iso8859-2;text/html">
is incorrectly inferred to have the charset 'iso8859-2;text/html'. The fix is to add a semicolon to the spaceCharacters scanned in SkipUntil (line 860).
EDIT: as per specification. Also, I don't know the status of the parser tests, but they appear to be out of date, incorrect in places, and (evidently) not run. Most of the tests are still valid, though, so it would not take much to bring them back into the full test regime.
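For illustration, here is a minimal standalone sketch of the termination rule discussed in this report: an unquoted charset value ends at the first whitespace character or semicolon. It is not the library's actual ContentAttrParser; the function name `extract_charset` and the constant `SPACE_CHARACTERS` are hypothetical, and the logic is a simplified reading of the specification's algorithm for extracting a character encoding from a meta element.

```python
import re

# Hypothetical names for illustration only; this is not the parser's real code.
SPACE_CHARACTERS = "\t\n\x0c\r "
CHARSET_WORD = re.compile("charset", re.IGNORECASE | re.ASCII)


def extract_charset(content):
    """Return the charset named in a meta content attribute value, or None."""
    position = 0
    while True:
        match = CHARSET_WORD.search(content, position)
        if match is None:
            return None
        position = match.end()
        # Skip whitespace between "charset" and "=".
        while position < len(content) and content[position] in SPACE_CHARACTERS:
            position += 1
        if position < len(content) and content[position] == "=":
            position += 1
            break
        # No "=" followed this occurrence; keep searching for a later one.

    # Skip whitespace between "=" and the value.
    while position < len(content) and content[position] in SPACE_CHARACTERS:
        position += 1
    if position >= len(content):
        return None

    first = content[position]
    if first in "\"'":
        # Quoted value: runs to the matching quote, if there is one.
        end = content.find(first, position + 1)
        return content[position + 1:end] if end != -1 else None

    # Unquoted value: stop at whitespace OR a semicolon.
    end = position
    while end < len(content) and content[end] not in SPACE_CHARACTERS + ";":
        end += 1
    return content[position:end] or None
```

With this rule, the content value from the report, `charset=iso8859-2;text/html`, yields `iso8859-2` rather than the whole remainder of the string.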