
Making an XML/XSLT-driven site using PHP

abstract 
XML and XSLT provide a standard way to separate presentation from data. This article contains an example of a simple PHP "XSLT engine" for XML-driven web sites. The engine caches its output and relies on Apache to route XML files to it.
compatible 
  • PHP 5 with XML/XSL extension
  • Apache HTTP Server 1.3 or higher

    First, we have to set up our engine to process every *.xml file on the web server. We use the following Apache directives:

    config file: .htaccess, httpd.conf
     
    # add handler 'ae_xslt' associated with xml files
    AddHandler ae_xslt xml
    # bind our script '/ae-xslt.php' to handler 'ae_xslt',
    # so every *.xml file passes through our script '/ae-xslt.php'
    Action ae_xslt /ae-xslt.php
     
    # adding index.xml to the directory index list means that
    # yoursite.com -> yoursite.com/index.xml
    # yoursite.com/folder/ -> yoursite.com/folder/index.xml
    # (Apache does not allow a comment on the same line as a directive)
    DirectoryIndex index.xml index.php index.html
     
    

    In case you're curious, these directives are described in the Apache documentation: AddHandler, Action

    You may put these lines in an .htaccess file in the root folder of your site (your hosting provider must allow overriding 'FileInfo' in .htaccess files) or directly in Apache's httpd.conf.

    Our script receives information about the requested XML file in two environment variables: PATH_INFO (the URL path of the file, like /page1.xml) and PATH_TRANSLATED (the filesystem path of the file, like /var/www/htdocs/page1.xml).
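
    As a rough sketch of the lookup described above, the two variables can be read through a small function. The name ae_request_paths and the idea of passing the $_SERVER array in as a parameter are assumptions made here so the logic can be tried outside a web server; the engine below reads $_SERVER directly.

```php
<?php
// Hypothetical helper (not part of the original engine): returns the
// URL path and the filesystem path of the requested XML file.
// $server is expected to look like $_SERVER, $sapi like php_sapi_name().
function ae_request_paths(array $server, $sapi)
{
    $is_cgi_like = (strpos($sapi, 'cgi') !== false) || ($sapi == 'isapi');

    // CGI-like SAPIs may expose the values under an ORIG_ prefix
    $prefix = ($is_cgi_like && isset($server['ORIG_PATH_TRANSLATED']))
        ? 'ORIG_' : '';

    return array(
        'http_file' => isset($server[$prefix.'PATH_INFO'])
            ? $server[$prefix.'PATH_INFO'] : null,
        'real_file' => isset($server[$prefix.'PATH_TRANSLATED'])
            ? $server[$prefix.'PATH_TRANSLATED'] : null,
    );
}

// Example with the values mentioned in the text
$paths = ae_request_paths(
    array(
        'PATH_INFO'       => '/page1.xml',
        'PATH_TRANSLATED' => '/var/www/htdocs/page1.xml',
    ),
    'apache2handler'
);
echo $paths['http_file'], ' -> ', $paths['real_file'], "\n";
```

    A null value in the result means the web server did not pass the variable at all, which usually points to a problem with the AddHandler / Action setup.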

    As the web server does not check whether the handled file exists, the engine has to do this check itself and output a 404 error message if the requested XML file does not exist.

    Afterwards, our engine checks whether there is a fresh cached version of the requested file. The check compares file modification times. If the cached version is valid, the engine outputs it and exits.
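
    The freshness check can be sketched in isolation as follows. The function name ae_cache_is_fresh is hypothetical, and the temporary files only stand in for a source XML file and its cached copy; this is an illustration of the comparison, not the engine's exact code.

```php
<?php
// Hypothetical helper: true when $cached_file exists and is newer than
// every file listed in $source_files.
function ae_cache_is_fresh($cached_file, array $source_files)
{
    if (!file_exists($cached_file))
        return false;

    $cache_time = filemtime($cached_file);
    foreach ($source_files as $file) {
        // a missing or newer (or equally old) source means "not fresh"
        if (!file_exists($file) || filemtime($file) >= $cache_time)
            return false;
    }
    return true;
}

// Demonstration with temporary files and explicit modification times
$dir    = sys_get_temp_dir();
$xml    = tempnam($dir, 'xml');
$cached = tempnam($dir, 'cch');

touch($xml, time() - 60);   // source modified a minute ago
touch($cached, time());     // cache written just now
clearstatcache();           // PHP caches filemtime() results
var_dump(ae_cache_is_fresh($cached, array($xml)));   // bool(true)

touch($xml, time() + 60);   // source changed after the cache was written
clearstatcache();
var_dump(ae_cache_is_fresh($cached, array($xml)));   // bool(false)

unlink($xml);
unlink($cached);
```

    Passing both the XML file and the XSLT file in $source_files gives the same rule the engine uses: the cache is valid only if it is newer than both.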

    Otherwise, the engine loads the main XSLT file 'ae-site.xslt', loads the requested XML file, performs the transformation and saves a new cached file.

    Here is the source code:

    source code: php
    <?php
    // AnyExample XSLT Site engine 

    // Allow PHP to report everything
    error_reporting(E_ALL);

    if (!isset($_SERVER['DOCUMENT_ROOT']))
        die("Web server didn't set DOCUMENT_ROOT");

    // DOCUMENT_ROOT is the path to the directory
    // that holds your web site's files. 
    $docroot = $_SERVER['DOCUMENT_ROOT'];

    // some web servers pass file information 
    // in PATH_TRANSLATED/PATH_INFO,
    // others in ORIG_PATH_TRANSLATED/ORIG_PATH_INFO
    // let's check: 
    $sapi = php_sapi_name();

    // the extra parentheses matter: && binds tighter than ||
    if (((strpos($sapi, 'cgi') !== false) || ($sapi == 'isapi'))
        && isset($_SERVER['ORIG_PATH_TRANSLATED']))
    {
        $real_file = $_SERVER['ORIG_PATH_TRANSLATED'];
        $http_file = $_SERVER['ORIG_PATH_INFO'];
    }
    else
    {
        $real_file = $_SERVER['PATH_TRANSLATED'];
        $http_file = $_SERVER['PATH_INFO'];
    }


    // checking if source XML file exists 
    if (!file_exists($real_file))
    {
    // File does not exist: output 404 error 
    header("Status: 404 Not Found"); // 404 HTTP response status 
    // 404 page below. You may change its HTML code. 
    // The URL and host name come from the request, so they are escaped. 
    ?>
    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <HTML><HEAD>
    <TITLE>404 Not Found</TITLE>
    </HEAD><BODY>
    <H1>Not Found</H1>
    The requested URL (<?php echo htmlspecialchars($http_file); ?>) was not found on this server.<P>
    <HR>
    <ADDRESS>Ae-XSLT by <a href="http://www.anyexample.com/">AnyExample</a> 
    at <?php echo htmlspecialchars($_SERVER['HTTP_HOST']); ?></ADDRESS>
    <!--
    <?php 
    echo str_repeat('ie padding', 40); // extra output for Internet Explorer
    ?>
    -->
    </BODY></HTML>    
    <?php
    exit();
    }

    $cached_file = $docroot.'/.cache/'.str_replace('/', '-', $http_file);
    // cached_file -- file that stores generated HTML code 

    $xslt_file = $docroot.'/ae-site.xslt';
    // XSLT file -- file that contains XSLT template 

    $xml_time = filemtime($real_file);
    $xslt_time = filemtime($xslt_file);
    $cache_time = @filemtime($cached_file);
    // Modification times of source XML file, 
    // XSLT file and cached file 


    // Compare file modification times 
    // If the cache was created after the last modification of 
    // both xml and xslt 
    if (($cache_time > $xml_time) && ($cache_time > $xslt_time))
    {
        // then we can output cached file and stop
        readfile($cached_file);
        echo '<!--cached-->';
        exit();
    }

    // Loading XML file 
    $source_xml = file_get_contents($real_file);

    // Do not process Google's Sitemap file: send it as is and stop 
    if (strpos($http_file, '/sitemap.xml') !== false)
    {
        header('Content-Type: text/xml');
        echo $source_xml;
        exit();
    }

    // do not process empty files 
    if ($source_xml == "")
        die('Empty XML file');

    // creating & loading DOMDocument 
    $xml = new DOMDocument;
    $xml->substituteEntities = true;
    // loadXML will fail if the document is not well-formed XML 
    // (some tags were not closed, etc.) 
    if ($xml->loadXML($source_xml) == false)
        die('Failed to load source XML: '.$http_file);

    // Loading XSLT site 
    $stylesheet = new DOMDocument;
    $stylesheet->substituteEntities = true;
    if ($stylesheet->load($xslt_file) == false)
        die('Failed to load XSLT file');


    // XSLT transformation
    $xsl = new XSLTProcessor();
    $xsl->importStylesheet($stylesheet);
    $output = $xsl->transformToXML($xml); // transforming 


    // in some versions of PHP the internal
    // XSLTProcessor and DOMDocument 
    // generated broken XHTML code,
    // so let's do our own 'htmlizing'

    // htmlizing XML
    $output = ltrim(substr($output, strpos($output, '?'.'>') + 2)); // removing <?xml declaration
    $output = preg_replace("!<(div|iframe|script|textarea)([^>]*?)/>!s", "<$1$2></$1>", $output);
    // some browsers do not support empty div, iframe, script and textarea tags
    $output = preg_replace("!<(meta)([^>]*?)/>!s", "<$1$2 />", $output);
    // meta tag should have extra space before />
    $output = preg_replace("!&#(9|10|13);!s", '', $output);
    // nobody needs 9, 10, 13 chars 
    $output = str_replace(chr(0xc2).chr(0x97), '&mdash;', $output);
    $output = str_replace(chr(0xc2).chr(0xa0), '&nbsp;', $output);
    // let's substitute HTML entities for some UTF-8 chars 


    echo $output;
    // Finally! Outputting HTML to browser 

    // caching (save processed version and display it next time) 
    @file_put_contents($cached_file, $output);        
    ?>

    As you can see, cached files are stored in the '.cache' subfolder of the web site. Make sure it exists and is writable by your PHP scripts.
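
    If you prefer the script to check this itself, a sketch along the following lines could be used. Both function names are hypothetical, and the temporary directory only stands in for the document root; the engine above does not create the folder for you.

```php
<?php
// Hypothetical helper: makes sure the cache directory exists and is
// writable, creating it when necessary. Returns true on success.
function ae_prepare_cache_dir($cache_dir)
{
    if (!is_dir($cache_dir) && !@mkdir($cache_dir, 0755, true))
        return false;
    return is_writable($cache_dir);
}

// Hypothetical helper: maps a URL path such as /folder/page1.xml to a
// cache file name, using the same '/' -> '-' replacement as the engine.
function ae_cache_file($cache_dir, $http_file)
{
    return $cache_dir.'/'.str_replace('/', '-', $http_file);
}

// Demonstration: a temporary directory plays the role of the document root
$cache_dir = sys_get_temp_dir().'/ae-cache-demo/.cache';

if (ae_prepare_cache_dir($cache_dir))
    echo 'cache file: ', ae_cache_file($cache_dir, '/folder/page1.xml'), "\n";
else
    echo "cache directory is not writable\n";
```

    Note that this naming scheme maps /a/b.xml and /a-b.xml to the same cache file, so avoid such pairs of names on your site.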

    How to use it? Look at the XML file:

    source code: xml
    <?xml version="1.0"?> 
    <page> 
        <title>Page 2</title> 
        <subtitle>Famous pangrams:</subtitle> 
     
        <paragraph> 
        	The quick brown fox jumped over the lazy dog's typewriter.
        </paragraph> 
     
        <paragraph> 
        	Cozy lummox gives smart squid who asks for job pen
        </paragraph> 
     
        <references> 
    		<item url="http://www.anyexample.com">AnyExample</item> 
    		<item url="http://en.wikipedia.org/wiki/Panagram">Wikipedia pangrams</item> 
        </references> 
    </page>

    The web site's main XSLT file 'ae-site.xslt' contains the following template:

    source code: XSLT
    <?xml version="1.0"?> 
     
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> 
        <!-- XML output mode --> 
        <xsl:output method="xml" standalone="yes" indent="no" encoding="utf-8"/> 
     
        <!-- we do not need spaces in output file --> 
        <xsl:strip-space elements="*"/> 
     
        <!-- this template copies unknown XML tags to output file, 
    	 allows use of XHTML -->    
        <xsl:template match="@*|node()"> 
    	<xsl:copy> 
    	    <xsl:apply-templates select="@*|node()" /> 
    	</xsl:copy> 
        </xsl:template> 
     
        <!-- Main page template --> 
        <xsl:template match="page"> 
        	<html> 
        		<head> 
        			<title><xsl:value-of select="//title"/></title> 
        		</head> 
        		<body> 
     
        			<div style="width: 50%; padding: 8px; background-color: #DDD;"> 
        				<b>Menu: </b> 
        				<a href="index.xml">Main page</a>, 
        				<a href="page1.xml">Page 1</a>, 
        				<a href="page2.xml">Page 2</a> 
        			</div> 
     
        			<div style="width: 50%; padding: 4px; background-color: #EEE;"> 
        				<h1><xsl:value-of select="//title"/></h1> 
     
        				<!-- process all child tags of <page> --> 
        				<xsl:apply-templates/> 
     
        			</div> 
     
        		</body> 
        	</html> 
        </xsl:template> 
     
     
        <!-- For references section --> 
     
        <xsl:template match="references"> 
        	<ol> 
    			<xsl:apply-templates select="item"/> 
        	</ol> 
        </xsl:template> 
     
        <xsl:template match="references/item"> 
            <li> 
                <xsl:choose> 
                    <xsl:when test="@url"> <!-- item tag has url attribute --> 
                        <a> <!-- enclose text in <a href="" --> 
                            <xsl:attribute name="href"><xsl:value-of select="@url"/></xsl:attribute> 
                            <xsl:apply-templates/></a> 
                    </xsl:when> 
                    <xsl:otherwise> <!-- Otherwise, make text italic --> 
                        <i>--<xsl:apply-templates/></i> 
                    </xsl:otherwise> 
                </xsl:choose> 
            </li> 
        </xsl:template> 
     
        <!-- paragraph tag --> 
        <xsl:template match="paragraph"> 
            <p> 
                <xsl:apply-templates/> 
            </p> 
        </xsl:template> 
     
        <!-- subtitle tag --> 
        <xsl:template match="subtitle"> 
        	<h2> 
    			<xsl:apply-templates/> 
        	</h2> 
        </xsl:template> 
     
     
        <!-- Empty template: we use values from these tags in 
             other templates  --> 
        <xsl:template match="title" /> 
    </xsl:stylesheet>

    So, when a site visitor requests page2.xml, the XSLT transformation replaces the <page> tag with <html><head>..., sets the page title and the H1 header from the <title> tag, and transforms each <paragraph> tag into <p>... In this way page2.xml is converted from pure XML to XHTML.
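
    To try the transformation without Apache, a cut-down version can be run from the command line. This is only a sketch: the function name ae_transform is hypothetical, and the inline XML and XSLT strings are shortened stand-ins for page2.xml and the site stylesheet, not the full files shown above.

```php
<?php
// Requires the DOM and XSL extensions, like the engine itself.

// Hypothetical helper: applies an XSLT stylesheet (given as a string)
// to an XML document (given as a string); returns false on failure.
function ae_transform($xml_source, $xslt_source)
{
    $xml = new DOMDocument;
    $stylesheet = new DOMDocument;
    if (!$xml->loadXML($xml_source) || !$stylesheet->loadXML($xslt_source))
        return false;

    $xsl = new XSLTProcessor();
    $xsl->importStylesheet($stylesheet);
    return $xsl->transformToXML($xml);
}

// Shortened stand-in for page2.xml
$page = '<page><title>Page 2</title>'
      . '<paragraph>Cozy lummox gives smart squid who asks for job pen</paragraph>'
      . '</page>';

// Shortened stand-in for the site stylesheet
$xslt = '<xsl:stylesheet version="1.0"'
      . ' xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
      . '<xsl:output method="xml" indent="no" encoding="utf-8"/>'
      . '<xsl:template match="page">'
      . '<html><head><title><xsl:value-of select="title"/></title></head>'
      . '<body><h1><xsl:value-of select="title"/></h1>'
      . '<xsl:apply-templates/></body></html>'
      . '</xsl:template>'
      . '<xsl:template match="paragraph"><p><xsl:apply-templates/></p></xsl:template>'
      . '<xsl:template match="title"/>'
      . '</xsl:stylesheet>';

if (class_exists('XSLTProcessor'))
    echo ae_transform($page, $xslt);
else
    echo "XSL extension is not available\n";
```

    The output should contain the page title inside both <title> and <h1>, and the paragraph text wrapped in <p>, which mirrors what the full stylesheet does for the real page.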

    Download whole XSLT engine example site in one zip archive.

warning 
  • Your site's source XML / XSLT pages should be valid XML documents
  • Your version of PHP 5 should be compiled with XSLT extension
tested 
  • FreeBSD 6.2 :: Apache 2.2.4 :: PHP 5.2.1
  • FedoraCore 3 :: Apache 1.3 :: PHP 5.0.5


     
    © AnyExample 2010-2013