<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta name="description" content="Explains how to speed up HTML Purifier through caching or inbound filtering." />
<link rel="stylesheet" type="text/css" href="./style.css" />

<title>Speeding up HTML Purifier - HTML Purifier</title>

</head><body>

<h1 class="subtitled">Speeding up HTML Purifier</h1>
<div class="subtitle">...also known as the HELP ME LIBRARY IS TOO SLOW MY PAGE TAKE TOO LONG page</div>

<div id="filing">Filed under End-User</div>
<div id="index">Return to the <a href="index.html">index</a>.</div>
<div id="home"><a href="http://htmlpurifier.org/">HTML Purifier</a> End-User Documentation</div>

<p>HTML Purifier is a very powerful library. But with power comes great
responsibility, in the form of longer execution times. Remember, this
library isn't lightly grazing over submitted HTML: it's deconstructing
the whole thing, rigorously checking the parts, and then putting it back
together. </p>

<p>So, if it so turns out that HTML Purifier is kinda too slow for outbound
filtering, you've got a few options: </p>

<h2>Inbound filtering</h2>

<p>Perform filtering of HTML when it's submitted by the user. Since the
user is already submitting something, an extra half a second tacked on
to the load time probably isn't going to be much of a problem.
Then, displaying the content is simply a matter of outputting it
directly from your database/filesystem. The trouble with this method is
that your user loses the original text, and when doing edits, will be
handling the filtered text. While this may be a good thing, especially
if you're using a WYSIWYG editor, it can also result in data loss if a
user makes a typo, since the malformed markup is cleaned up or dropped
and the original input is gone. </p>

<p>Example (non-functional):</p>

<pre>&lt;?php
    /**
     * FORM SUBMISSION PAGE
     * display_error($message) : displays nice error page with message
     * display_success()       : displays a nice success page
     * display_form()          : displays the HTML submission form
     * database_insert($html)  : inserts data into database as new row
     */
    if (!empty($_POST)) {
        require_once '/path/to/library/HTMLPurifier.auto.php';
        require_once 'HTMLPurifier.func.php';
        $dirty_html = isset($_POST['html']) ? $_POST['html'] : false;
        if (!$dirty_html) {
            display_error('You must write some HTML!');
            exit; // stop here so an empty submission is never stored
        }
        // HTMLPurifier() is the convenience function from HTMLPurifier.func.php
        $html = HTMLPurifier($dirty_html);
        database_insert($html);
        display_success();
        // notice that $dirty_html is *not* saved
    } else {
        display_form();
    }
?&gt;</pre>
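
<p>When the stored, already-filtered HTML is later loaded back into an edit
form, it still has to be escaped so the browser shows it as editable text.
A minimal sketch of that step, assuming a plain <tt>textarea</tt> rather than
a WYSIWYG editor; <tt>render_edit_field()</tt> is a hypothetical helper and
not part of HTML Purifier:</p>

```php
<?php
// Hypothetical helper for the edit page: wraps stored HTML in a textarea.
// The field name "html" matches $_POST['html'] in the submission example.
function render_edit_field($stored_html)
{
    // Escape so the browser displays the markup as text instead of
    // interpreting it (a stray </textarea> would otherwise end the field).
    $escaped = htmlspecialchars($stored_html, ENT_QUOTES, 'UTF-8');
    return '<textarea name="html" rows="10" cols="60">' . $escaped . '</textarea>';
}

echo render_edit_field('<p>Hello <b>world</b></p>');
// prints: <textarea name="html" rows="10" cols="60">&lt;p&gt;Hello &lt;b&gt;world&lt;/b&gt;&lt;/p&gt;</textarea>
```

<p>The browser decodes the entities again on submission, so the form handler
receives the markup unchanged and can pass it through the filter as before.</p>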

<h2>Caching the filtered output</h2>

<p>Accept the submitted text and put it unaltered into the database, but
then also generate a filtered version and stash that in the database.
Serve the filtered version to readers, and the unaltered version to
editors. If need be, you can invalidate the cache and have the
filtered version regenerated on the first page view. Pros? Full data
retention. Cons? It's more complicated, and opens other editors up to
XSS if they are using a WYSIWYG editor, because the editor renders the
unfiltered HTML (to fix that, they'd have to be
able to get their hands on the *really* original text served in
plaintext mode). </p>

<p>Example (non-functional):</p>

<pre>&lt;?php
    /**
     * VIEW PAGE
     * display_error($message) : displays nice error page with message
     * cache_get($id) : retrieves HTML from fast cache (db or file)
     * cache_insert($id, $html) : inserts good HTML into cache system
     * database_get($id) : retrieves raw HTML from database
     */
    $id = isset($_GET['id']) ? (int) $_GET['id'] : false;
    if (!$id) {
        display_error('Must specify ID.');
        exit;
    }
    $html = cache_get($id); // filesystem or database
    if ($html === false) {
        // cache didn't have the HTML, generate it
        $raw_html = database_get($id);
        require_once '/path/to/library/HTMLPurifier.auto.php';
        require_once 'HTMLPurifier.func.php';
        $html = HTMLPurifier($raw_html);
        cache_insert($id, $html);
    }
    echo $html;
?&gt;</pre>
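
<p>The <tt>cache_get()</tt> and <tt>cache_insert()</tt> helpers above are left
undefined. One possible file-backed sketch, including the invalidation step
mentioned earlier, might look like the following. <tt>CACHE_DIR</tt>,
<tt>cache_path()</tt> and <tt>cache_invalidate()</tt> are hypothetical names
chosen for illustration; a database table would work just as well:</p>

```php
<?php
// Hypothetical file-backed versions of the cache_* helpers used above.
// CACHE_DIR is an assumed location; use any directory the web server can write to.
define('CACHE_DIR', sys_get_temp_dir() . '/purified-cache');

function cache_path($id)
{
    // Casting to int keeps user input out of the file name.
    return CACHE_DIR . '/' . (int) $id . '.html';
}

function cache_get($id)
{
    $path = cache_path($id);
    // false signals a cache miss, matching the check in the view page.
    return is_file($path) ? file_get_contents($path) : false;
}

function cache_insert($id, $html)
{
    if (!is_dir(CACHE_DIR)) {
        mkdir(CACHE_DIR, 0700, true);
    }
    // LOCK_EX avoids interleaved writes from concurrent requests.
    file_put_contents(cache_path($id), $html, LOCK_EX);
}

function cache_invalidate($id)
{
    // Drop the filtered copy; the next page view regenerates it.
    $path = cache_path($id);
    if (is_file($path)) {
        unlink($path);
    }
}
```

<p>In this sketch, the edit handler would call <tt>cache_invalidate($id)</tt>
right after saving the new raw HTML. It probably also makes sense to clear
every entry after changing the filter configuration, since the cached output
reflects the settings in effect when it was generated.</p>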

<h2>Summary</h2>

<p>In short, inbound filtering is the simple option and caching is the
robust option (albeit with bigger storage requirements). </p>

<p>There is a third option, independent of the two we've discussed: profile
and optimize HTML Purifier yourself. Be sure to report back your results
if you decide to do that! Especially if you port HTML Purifier to C++.
<tt>;-)</tt></p>
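
<p>Before optimizing anything, it helps to measure how long a purification
call actually takes on your own content. A rough sketch using
<tt>microtime()</tt>; <tt>time_call()</tt> is a hypothetical helper, and the
<tt>usleep()</tt> call is only a stand-in workload:</p>

```php
<?php
// Hypothetical helper: average wall-clock seconds per call of $fn.
function time_call(callable $fn, $runs = 10)
{
    $runs = max(1, (int) $runs); // guard against division by zero
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    return (microtime(true) - $start) / $runs;
}

// Stand-in workload. In practice, wrap the real call instead, for example:
//   function () use ($purifier, $dirty_html) { $purifier->purify($dirty_html); }
$seconds = time_call(function () {
    usleep(1000);
}, 5);

printf("%.6f seconds per call\n", $seconds);
```

<p>Averaging over several runs smooths out one-off costs such as autoloading,
so timing the first call separately may also be worthwhile.</p>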

</body>
</html>

<!-- vim: et sw=4 sts=4
-->