Handling Content Model Changes


1. Context

The distinction between Transitional and Strict document types is somewhat
of an anomaly in the lineage of XHTML document types (following 1.0,
doctypes no longer have flavors: instead, modularization is used to let
document authors vary their elements). The transition from Transitional to
Strict is usually quite straightforward, as W3C usually deprecates
attributes or elements, which are quite easily handled using tag and
attribute transforms.

However, for three elements, <blockquote>, <body> and <address>, W3C
elected to also change the content model. <blockquote> and <body>
originally accepted both inline and block elements, but in the strict
doctype they only allow block elements. With <address>, the situation is
inverted: <p> tags, which were previously allowed inside it, are now
forbidden.
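
The content model change described above can be pictured as a lookup
table. The following is a minimal sketch, not HTML Purifier's own data:
the element sets are a deliberately small subset, and the names INLINE,
BLOCK, CONTENT_MODELS and is_allowed are hypothetical.

```python
# Illustrative only: a simplified subset of elements, not the real DTDs.
INLINE = {"#text", "em", "strong", "a", "span", "br"}
BLOCK = {"p", "div", "ul", "blockquote", "address"}

# Allowed direct children per doctype flavor (hypothetical table).
CONTENT_MODELS = {
    "transitional": {
        "blockquote": INLINE | BLOCK,  # inline and block both accepted
        "address": INLINE | {"p"},     # inline content plus <p>
    },
    "strict": {
        "blockquote": BLOCK,           # block elements only
        "address": INLINE,             # <p> is no longer accepted
    },
}


def is_allowed(doctype, parent, child):
    """Return True if `child` may appear directly inside `parent`."""
    return child in CONTENT_MODELS[doctype][parent]
```

Under these assumptions, is_allowed("strict", "blockquote", "#text") is
False while the same query against "transitional" is True.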


2. Current situation

Currently, HTML Purifier treats <blockquote> specially during Tidy mode
using a custom ChildDef class, StrictBlockquote. StrictBlockquote
operates similarly to Required, except that when it encounters an inline
element, it wraps it in a block tag (as specified by %HTML.BlockWrapper;
the default is <p>). The naming suggests it can only be used for
<blockquote>s, although it may be possible to generalize it to work on
other cases of this nature (this would be of little practical
application, as no other element in XHTML 1.1 or earlier has a block-only
content model).
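
The wrapping behavior can be sketched roughly as follows. This is not
the StrictBlockquote code: it is a simplified Python illustration that
assumes consecutive inline nodes share one wrapper, ignores whitespace
handling, and models text as plain strings and elements as hypothetical
(tag, children) tuples.

```python
# Hypothetical, simplified set of inline element names.
INLINE_TAGS = {"em", "strong", "a", "span", "br"}


def is_inline(node):
    # Plain strings stand for text (PCDATA); tuples are (tag, children).
    return isinstance(node, str) or node[0] in INLINE_TAGS


def wrap_inline_children(children, wrapper="p"):
    """Group each run of consecutive inline nodes into one wrapper element."""
    result, run = [], []
    for child in children:
        if is_inline(child):
            run.append(child)
            continue
        if run:
            # A block element ends the current inline run: emit the wrapper.
            result.append((wrapper, run))
            run = []
        result.append(child)
    if run:
        result.append((wrapper, run))
    return result
```

The wrapper argument plays the role of %HTML.BlockWrapper here; block
children pass through untouched.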

Tidy currently contains no custom, lenient implementation for <address>.
If one were written, it would likely operate on the principle that, when
a <p> tag is encountered, its start and end tags are each replaced with a
<br /> tag, one leading and one trailing (the contents of <p>, being
inline, are not an issue). There is no prior work with this sort of
operation.
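
Since no such implementation exists, the following is only a guess at
how the principle might look, written in Python for illustration. Text
is modelled as plain strings and elements as hypothetical
(tag, children) tuples; flatten_paragraphs is an invented name.

```python
def flatten_paragraphs(children):
    """Replace each <p> child with <br />, its contents, then <br />."""
    result = []
    for child in children:
        if isinstance(child, tuple) and child[0] == "p":
            result.append(("br", []))   # leading <br />
            result.extend(child[1])     # inline contents are kept as-is
            result.append(("br", []))   # trailing <br />
        else:
            result.append(child)
    return result
```

A real implementation would probably also want to avoid emitting a
<br /> at the very start or end of <address>; the sketch does not.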


3. Outside applicability

There are a number of other elements that have restrictive content
models, such as <ul> or <span> (the latter is restrictive in that it
does not allow block elements). In the former case, an errant node
is eliminated completely; in the latter case, the text of the node
is preserved (as the parent node does allow PCDATA). Custom
content model implementations are probably not the best way of handling
these cases; node bubbling should be implemented instead.
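
The two existing behaviors, dropping an errant node versus keeping only
its text, can be sketched as below. This is a simplified Python
illustration rather than HTML Purifier's logic; ALLOWED_CHILDREN and the
helper names are hypothetical, and the node model is strings for text
plus (tag, children) tuples for elements.

```python
# Hypothetical, simplified content models for two restrictive parents.
ALLOWED_CHILDREN = {
    "ul": {"li"},
    "span": {"#text", "em", "strong", "a", "span", "br"},
}


def name_of(node):
    return "#text" if isinstance(node, str) else node[0]


def text_of(node):
    """Concatenate all text found inside a node."""
    if isinstance(node, str):
        return node
    return "".join(text_of(child) for child in node[1])


def filter_children(parent, children):
    allowed = ALLOWED_CHILDREN[parent]
    result = []
    for child in children:
        if name_of(child) in allowed:
            result.append(child)
        elif "#text" in allowed:
            # Parent accepts PCDATA: keep the errant node's text only.
            result.append(text_of(child))
        # Otherwise the errant node is dropped entirely.
    return result
```

Both outcomes lose information: the <ul> case discards content and the
<span> case discards structure.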

    vim: et sw=4 sts=4