WP-HTML-Compression

Combining HTML “minification” with caching and HTTP compression (WP Super Cache, or similar) will cut down your bandwidth and speed up content delivery, which may in turn help your Google rankings.

This plugin compresses your HTML by shortening URLs and removing standard comments and white space, including new lines, carriage returns, tabs and excess spaces. Most importantly, because it ignores <pre>, <textarea> and <script> tags and Internet Explorer conditional comments, presentation will not be affected.
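To make the idea concrete, here is a minimal sketch of white space removal that skips protected elements. It is not the plugin's own code: the function name example_collapse_whitespace is hypothetical, and the single regular expression is a simplification that assumes the protected elements are not nested.

```php
<?php
// Hypothetical helper, not part of WP-HTML-Compression.
// Collapses every run of white space to a single space, except inside
// <pre>, <textarea> and <script> elements, which are copied through untouched.
function example_collapse_whitespace($html)
{
	return preg_replace_callback(
		// First alternative: a whole protected element (group 1 = tag name).
		// Second alternative: any run of white space elsewhere.
		'/<(pre|textarea|script)\b.*?<\/\1\s*>|\s+/si',
		function ($m) {
			// Group 1 is only present when a protected element matched
			return !empty($m[1]) ? $m[0] : ' ';
		},
		$html
	);
}

echo example_collapse_whitespace("<p>a  \n b</p>\n<pre> x\n  y</pre>");
```

The <p> content comes out on one line with single spaces, while the contents of <pre> keep their line break and indentation.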

This isn’t just for WordPress either. It will also work fine in any PHP project:

<?php
/*
WP-HTML-Compression 0.5 <http://www.svachon.com/blog/wp-html-compression/>
Reduce file size by shortening URLs and safely removing all standard comments and unnecessary white space from an HTML document.
*/
 
class WP_HTML_Compression
{
	// Settings
	protected $compress_css;
	protected $compress_js;
	protected $info_comment;
	protected $remove_comments;
	protected $shorten_urls;
 
	// Variables
	// Defaults to an empty string so __toString() still returns a string for empty input
	protected $html = '';
 
 
 
	public function __construct($html, $compress_css=true, $compress_js=false, $info_comment=true, $remove_comments=true, $shorten_urls=true)
	{
		if ($html !== '')
		{
			$this->compress_css = $compress_css;
			$this->compress_js = $compress_js;
			$this->info_comment = $info_comment;
			$this->remove_comments = $remove_comments;
			$this->shorten_urls = $shorten_urls;
 
			$this->parseHTML($html);
		}
	}
 
 
 
	public function __toString()
	{
		return $this->html;
	}
 
 
 
	protected function bottomComment($raw, $compressed)
	{
		$raw = strlen($raw);
		$compressed = strlen($compressed);
 
		$savings = ($raw-$compressed) / $raw * 100;
 
		$savings = round($savings, 2);
 
		return '<!--WP-HTML-Compression saved '.$savings.'%. Bytes before:'.$raw.', after:'.$compressed.'-->';
	}
 
 
 
	protected function callback_HTML_URLs($matches)
	{
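		// [2] is an attribute value that is encapsulated with "" and [3] with ''
		// [3] is absent from $matches when the value was double-quoted
		$url = isset($matches[3]) ? $matches[3] : $matches[2];

		return $matches[1].'="'.absolute_to_relative_url($url).'"';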
	}
 
 
 
	protected function minifyHTML($html)
	{
		$pattern = '/<(?<script>script).*?<\/script\s*>|<(?<style>style).*?<\/style\s*>|<!(?<comment>--).*?-->|<(?<tag>[\/\w.:-]*)(?:".*?"|\'.*?\'|[^\'">]+)*>|(?<text>((<[^!\/\w.:-])?[^<]*)+)|/si';
 
		preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);
 
		$overriding = false;
		$raw_tag = false;
 
		// Variable reused for output
		$html = '';
 
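		foreach ($matches as $token)
		{
			// Reset the per-token flags so they are always defined and never carry over from the previous token
			$relate = false;
			$strip = false;
 
			$tag = (isset($token['tag'])) ? strtolower($token['tag']) : null;
 
			$content = $token[0];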
 
			if (is_null($tag))
			{
				if ( !empty($token['script']) )
				{
					$strip = $this->compress_js;
 
					// Will still end up shortening URLs within the script, but should be OK..
					// Gets Shortened:   test.href="http://domain.com/wp"+"-content";
					// Gets Bypassed:    test.href = "http://domain.com/wp"+"-content";
					$relate = true;
				}
				else if ( !empty($token['style']) )
				{
					$strip = $this->compress_css;
					$relate = true;
				}
				else if ($content == '<!--wp-html-compression no compression-->')
				{
					$overriding = !$overriding;
 
					// Don't print the comment
					continue;
				}
				else if ($this->remove_comments)
				{
					if (!$overriding && $raw_tag != 'textarea')
					{
						// Remove any HTML comments, except MSIE conditional comments
						$content = preg_replace('/<!--(?!\s*(?:\[if [^\]]+]|<!|>))(?:(?!-->).)*-->/s', '', $content);
					}
				}
			}
			else	// All tags except script, style and comments
			{
				if ($tag == 'pre' || $tag == 'textarea')
				{
					$raw_tag = $tag;
				}
				else if ($tag == '/pre' || $tag == '/textarea')
				{
					$raw_tag = false;
				}
				else if ($raw_tag || $overriding)
				{
					$strip = false;
				}
				else
				{
					if ($tag != '')
					{
						if (strpos($tag, '/') === false)
						{
							// Remove any empty attributes, except:
							// action, alt, content, src
							$content = preg_replace('/(\s+)(\w++(?<!action|alt|content|src)=(""|\'\'))/i', '$1', $content);
						}
 
						// Remove any space before the end of a tag (including closing tags and self-closing tags)
						$content = preg_replace('/\s+(\/?\>)/', '$1', $content);
 
						$relate = true;
					}
					else	// Content between opening and closing tags
					{
						// Avoid multiple spaces by checking previous character in output HTML
						if (strrpos($html,' ') === strlen($html)-1)
						{
							// Remove white space at the content beginning
							$content = preg_replace('/^[\s\r\n]+/', '', $content);
						}
					}
 
					$strip = true;
				}
			}
 
			// Relate URLs
			if ($relate && $this->shorten_urls)
			{
				$content = preg_replace_callback('/(action|href|src)=(?:"([^"]*)"|\'([^\']*)\')/i', array($this,'callback_HTML_URLs'), $content);
			}
 
			if ($strip)
			{
				$content = $this->removeWhiteSpace($content, $html);
			}
 
			$html .= $content;
		}
 
		return $html;
	}
 
 
 
	protected function parseHTML($html)
	{
		$this->html = $this->minifyHTML($html);
 
		if ($this->info_comment)
		{
			$this->html .= "\n" . $this->bottomComment($html, $this->html);
		}
	}
 
 
 
	protected function removeWhiteSpace($html, $full_html)
	{
		$html = str_replace("\t", ' ', $html);
		$html = str_replace("\r", ' ', $html);
		$html = str_replace("\n", ' ', $html);
 
		// This is over twice the speed of a RegExp
		while (strpos($html, '  ') !== false)
		{
			$html = str_replace('  ', ' ', $html);
		}
 
		return $html;
	}
}
 
 
 
function wp_html_compression_finish($html)
{
	// Plugin may be active, or another plugin may have already included this library
	if (!function_exists('absolute_to_relative_url'))
	{
		require_once dirname(__FILE__) . '/external/absolute-to-relative-urls.php';
	}
 
	// Cast to string so the output buffer callback returns the minified HTML itself
	return (string) new WP_HTML_Compression($html);
}
 
 
 
function wp_html_compression_start()
{
	ob_start('wp_html_compression_finish');
}
 
 
wp_html_compression_start();
?>

You’ll also need to place the main code from Absolute-to-Relative URLs into a sub-directory entitled “external”.
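For readers wondering what URL shortening means here, the sketch below turns an absolute URL on the site's own host into a root-relative one. It is only an illustration: example_site_relative_url and its $site_host parameter are hypothetical and are not the API of Absolute-to-Relative URLs. It also assumes that the page and the link share the same scheme, and it leaves URLs with an explicit port alone.

```php
<?php
// Hypothetical helper for illustration only; it is NOT the API of the
// Absolute-to-Relative URLs library.
function example_site_relative_url($url, $site_host)
{
	$parts = parse_url($url);

	// Leave malformed URLs, already-relative URLs, other hosts and
	// URLs with an explicit port untouched.
	if ($parts === false
		|| !isset($parts['host'])
		|| isset($parts['port'])
		|| strcasecmp($parts['host'], $site_host) !== 0)
	{
		return $url;
	}

	// Rebuild everything after the host: path, query string and fragment
	$relative = isset($parts['path']) ? $parts['path'] : '/';

	if (isset($parts['query']))
	{
		$relative .= '?' . $parts['query'];
	}

	if (isset($parts['fragment']))
	{
		$relative .= '#' . $parts['fragment'];
	}

	return $relative;
}

// Prints: /wp-content/logo.png?v=2
echo example_site_relative_url('http://domain.com/wp-content/logo.png?v=2', 'domain.com'), "\n";
```

Links to other hosts and links that are already relative are returned unchanged, so the helper is safe to run over every action, href and src value.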

17 thoughts on “WP-HTML-Compression”

  5. Sorry for my English, I am writing through a translator, but your product is really great. I put it on all my sites, and it helps save space on the hosting.
    The sites load so quickly that it was hard to imagine before.
    Thank you very much.
    Keep developing it!

  8. Hi Steven,

    Your plugin worked pretty well until your last update; the Google Translate widget disappeared after upgrading to v0.5.
    BTW, is it still useful to keep it in combination with W3 Total Cache when minify is activated?

    Thanks,

    John

    • W3 Total Cache is slower than WP Super Cache. In addition, its HTML minification requires the HTML Tidy extension, which most servers do not support. I do have plans, however, to investigate whether it offers any performance enhancements to its limited audience. At this time, I cannot offer any further advice.

      • Thank you Steven!
        I understand you did not get any other messages about a disappearing Translate widget.
        I will try Super Cache in combo with your plugin and let you know if the Google Translate widget stays with this combo.

        BTW, W3 Total Cache has 1 million downloads compared to 3 million for WP Super Cache, so the audience is not that limited, I would say. For sure worth the effort to look into.

        Thank you so much again.
        Grz, Johnny
