AxKit - XML Application Server

Matt Sergeant
This document is a guide to the ins and outs of using AxKit.

AxKit is an XML Application Server, written using the mod_perl framework. At its core, AxKit provides the developer with many ways to set up server-side XML transformations. In practice, this allows you to rapidly develop sites that use XML, to deliver the same content in different formats, and to change the layout of your site very easily, thanks to the forced separation of content from presentation.

This appendix gives an overview of the ways you can put AxKit to use on your mod_perl-enabled server. It is not a complete description of all the capabilities of AxKit. For more detailed information, please take a look at the documentation provided on the AxKit web site at http://axkit.org/. Commercial support and consultancy services for AxKit are available at http://axkit.com/.

The key benefit of using XML application servers is their support for W3C recommendations. There are many templating technologies available in the Perl world, but only XSLT is a W3C recommendation [1]. Because XML is a de facto standard, it breaks down the barriers between people using Java or Perl or Python, because ultimately we are all working with the same data formats. We also have the advantage that many tools are built, or being built, for writing XML and XSLT, allowing designers and developers to use the tools of their choice for authoring their content.

Other benefits include a well-designed, scalable architecture, flexible site layout options, and dynamic web applications that are simple to build.

Installing and Configuring

While we aim to make the AxKit installation as simple as possible, there are many configuration options that allow you to customize your installation. This section therefore aims to get you started as quickly as possible. It assumes you already have mod_perl and Apache installed and working. See Chapter X if this is not the case.
This section does not cover installing AxKit on Win32 systems, for which there is an ActiveState package at <URL>. First download the latest version of AxKit, which you can get either from your local CPAN archive, or from the AxKit download directory at http://axkit.org/. Then type the following:
If perl Makefile.PL warns about missing modules, notably XML::XPath, make a note of the missing modules and install them from CPAN. AxKit will run without the missing modules, but without XML::XPath it will be impossible to run the examples below [2]. Now we need to add some simple options to the very end of our httpd.conf file:
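A minimal sketch of such a configuration follows. It assumes a mod_perl 1.x style setup and uses directives from the AxKit distribution; treat it as a starting point rather than a definitive listing, and check it against the documentation for your AxKit version.

```apache
PerlModule AxKit
# The line above must stay outside any <Location>, <Directory> or <Files> block.

# Hand requests to AxKit (assumption: applied server-wide for this test setup)
SetHandler perl-script
PerlHandler AxKit

# Map stylesheet types to the language modules that process them
AxAddStyleMap application/x-xpathscript Apache::AxKit::Language::XPathScript
AxAddStyleMap application/x-xsp Apache::AxKit::Language::XSP

# Maximum debug output while getting started
AxDebugLevel 10
```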
Note that the first line, PerlModule AxKit, must occur in your httpd.conf outside of any runtime configuration blocks; otherwise Apache cannot see the AxKit configuration directives and you will get errors when you try to start httpd. Now, if you have XML::XPath installed (try perl -MXML::XPath -e0 on the command line to check), (re)start your Apache server. You are now ready to begin publishing transformed XML with AxKit!

Your First AxKit Page

Now we're going to see how AxKit works by transforming an XML file containing data about Camelids (note the dubious Perl reference) into HTML. First you will need a sample XML file. Open the text editor of your choice and type the following:
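A small sample file along these lines would do. The dromedaries root element and its species children follow the XPath discussed later in this appendix; the name attribute, the humps and disposition elements, and the data values are hypothetical and only serve as an illustration.

```xml
<?xml version="1.0"?>
<dromedaries>
  <!-- "name", "humps" and "disposition" are illustrative names, not fixed by AxKit -->
  <species name="Camel">
    <humps>1 or 2</humps>
    <disposition>Cranky</disposition>
  </species>
  <species name="Llama">
    <humps>1 (sort of)</humps>
    <disposition>Aloof</disposition>
  </species>
</dromedaries>
```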
Save this file in your web server root (normally /path/to/apache/htdocs/) as test.xml. Now we need a stylesheet to transform that to HTML. For this first example we are going to introduce XPathScript, an XML transformation language specific to AxKit. Later we will give a brief introduction to XSLT. Create a new file and type in:
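A minimal sketch of an XPathScript stylesheet is given here. It assumes the source document has a dromedaries root with species children; the humps and disposition element names are hypothetical. It uses only the $t hash with its pre and post keys and the apply_templates() function described in this appendix.

```
<%
# Declare what is output before and after each element.
# 'humps' and 'disposition' are assumed element names.
$t->{'species'}{pre}      = "<li>";
$t->{'species'}{post}     = "</li>";
$t->{'humps'}{pre}        = "<br><i>";
$t->{'humps'}{post}       = "</i>";
$t->{'disposition'}{pre}  = "<br><b>";
$t->{'disposition'}{post} = "</b>";
%>
<html>
<head><title>Know Your Dromedaries</title></head>
<body>
<h1>Know Your Dromedaries</h1>
<ul>
<%= apply_templates('/dromedaries/species') %>
</ul>
</body>
</html>
```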
Save this file as test.xps. Now, to get the original file test.xml transformed on the server with test.xps, we need to associate that file with the stylesheet. Under AxKit there are a number of ways to do that, with varying flexibility. The simplest way is to edit your test.xml file and, immediately after the <?xml version="1.0"?> declaration, add the following:
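The association is made with a standard xml-stylesheet processing instruction. In this sketch the type value is assumed to match the MIME type mapped to the XPathScript language module in httpd.conf:

```xml
<?xml-stylesheet href="test.xps" type="application/x-xpathscript"?>
```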
Now, assuming both files are in the same directory under your httpd document root, you should be able to request test.xml and see server-side transformed XML in your browser. Now try changing the source XML file, and watch AxKit detect the change the next time you load the file in the browser.

If things go wrong...

If you don't see HTML in your browser, but instead get the source XML (in Internet Explorer you will see a tree-based representation of the XML, and in Mozilla or Netscape you will see all the text in the document joined together), then you will need to check your error log. AxKit sends out varying amounts of debug information depending on the value of AxDebugLevel (which we set to the maximum value of 10). If you can't decipher the contents of the error log, contact the AxKit users' mailing list at axkit-users@axkit.org with details of your problem.

How does it work?

The stylesheet above specifies how the various tags work. The ASP <% %> syntax delimits Perl code from HTML. You can execute any code within the stylesheet; here, however, we are making use of the special XPathScript $t hash ref. This specifies the names of tags and how they should be output to the browser. There are several options for the second level of the hash, and here we see two of them: pre and post. These specify, quite simply, what appears before the tag and what appears after it. The values in $t only take effect when we call the apply_templates() function, which iterates over the nodes in the XML, executing the matching values in $t.

XPath

One of the key specifications being used in XML technologies is XPath. This is a little language used within other languages for selecting nodes within an XML document. Its initial appearance is similar to that of Unix directory paths.
In the above example we can see the XPath /dromedaries/species, which starts at the root of the document and finds first the dromedaries root element, then the species children of the dromedaries element. Note that unlike Unix directory paths, XPaths can match multiple nodes, so in the case above we select all of the species elements in the document.

Documenting all of XPath here would take up many pages. The grammar for XPath allows many constructs of a full programming language, such as functions, string literals, and boolean expressions. The important thing to know is that the syntax we are using to find nodes in our XML document is not just something invented for AxKit!

Dynamic Content

AxKit has a flexible tool for creating XML from various data sources such as relational databases, cookies, and form parameters, called eXtensible Server Pages, or XSP. This technology was originally invented by the Apache Cocoon team, and we share their syntax [3]. This allows easier migration of projects to and from Cocoon.

XSP is an XML-based syntax that uses namespaces to provide extensibility. In many ways this is like the Cold Fusion model of using tags to provide dynamic functionality. One of the advantages of using XSP is that it is impossible to generate invalid XML, which makes it ideal for use in an XML framework like AxKit.

The XSP framework allows you to add extra tags to your XML to provide custom functionality. These extra tags are called taglibs. By using taglibs, rather than embedding Perl code in your XSP page, you can further build on AxKit's separation of content from presentation by separating out logic too. Several taglibs for AxKit's XSP engine are already available on CPAN.

Handling Form Parameters

The AxKit::XSP::Param taglib allows you to easily read form and querystring parameters within an XSP page. The following example shows how a page can submit back to itself.
To allow this to work, the following needs to be added to your httpd.conf:
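Taglibs are registered with AxKit's AxAddXSPTaglib directive. A minimal sketch, assuming the module is installed from CPAN under the name used in the text:

```apache
# Make the param taglib available to XSP pages
AxAddXSPTaglib AxKit::XSP::Param
```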
Then the XSP page is:
The most significant point about the above is how we freely mix XML tags with our Perl code, and the XSP processor figures out the right thing to do depending on context. There is a consequence of this: the XSP page itself must be well-formed XML, so the following would generate an error:
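As an illustration of the problem, consider this fragment. It is a sketch that assumes the xsp prefix is bound to the XSP namespace elsewhere in the page and that $page has been set earlier. The raw less-than sign starts what the XML parser takes to be a new tag, so the page fails to parse before any Perl is run.

```xml
<xsp:logic>
  # Not well-formed: the "<" below is read as the start of a tag
  if ($page < 3) {
      $page++;
  }
</xsp:logic>
```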
There are a number of ways to get around this restriction and code the expression in XML. The simplest is just to reverse the expression to if (3 > $page), because the greater-than sign is valid within an XML text section. Another way is to encode the less-than sign as &lt;, which will be familiar to HTML authors. The other thing to notice is the <xsp:logic> and <xsp:content> tags. The former defines a section of Perl code, while the latter allows you to go back to processing the contents as XML output. It is also worth noting that the <xsp:content> tag is not always needed. Because the XSP engine inherently understands XML, you can omit the <xsp:content> tag when the immediate child would be an element, rather than text. Two examples would be:
versus the case when there is a surrounding non-XSP tag:
Note that the initial example, when processed only by the XSP engine, will output the following XML:
This needs to be processed with XSLT or XPathScript to be reasonably viewable in a browser. The point, however, is that you can re-use the above page as either HTML or WML just by applying different stylesheets.

Handling Cookies

AxKit::XSP::Cookie is a taglib interface to Apache::Cookie (part of the libapreq package). The following example demonstrates both retrieving and setting a cookie from within XSP. In order for this to run, the following option needs to be added to your httpd.conf:
And the XSP page is:
This page introduces the concept of XSP expressions, using the <xsp:expr> tag. In XSP, everything that returns a value is an expression of some sort. In both of the above examples we have used a taglib tag within a Perl if() statement. These tags have both been expressions, even though they don't use the <xsp:expr> syntax. In XSP, everything understands its context, and tries to do the right thing. This way, the following three examples would work as expected:
We see this as an extension of how Perl works: the idea of "Do What I Mean", or DWIM.

Sending Email

With the AxKit::XSP::Sendmail taglib it is very simple to send email from an XSP page. This taglib combines email address verification using the Email::Valid module with email sending using the Mail::Sendmail module (which will interface either to an SMTP server, or directly to the sendmail executable). Again, to allow usage of this taglib, the following line must be added to httpd.conf:
Then sending email from XSP is as simple as:
The only thing missing here is some sort of error handling. When the sendmail taglib detects an error (either in an email address, or in sending the email), it throws an exception.

Handling Exceptions

The exception taglib, AxKit::XSP::Exception, is used to catch exceptions. The syntax is very simple: rather than distinguishing different types of exceptions, it currently provides a single try/catch block. To use the exceptions taglib, the following has to be added to httpd.conf:
Then we can implement form validation using exceptions:
The exact same try/catch (and message) tags can be used for sendmail, and for ESQL (see below).

Utilities Taglib

The AxKit::XSP::Util taglib includes some utility methods for including XML from the filesystem, from a URI, or as the return value from an expression (normally an expression would be rendered as plain text, and so a "<" character would be encoded as "&lt;"). The AxKit Util taglib is a direct copy of the Cocoon Util taglib, and as such uses the same namespace as the Cocoon Util taglib: http://apache.org/xsp/util/v1.

Executing SQL

Perhaps the most interesting taglib of all is the ESQL taglib, which allows you to execute SQL queries against a DBI-compatible database, and provides access to the column return values as strings, scalars, numbers, dates, or even as XML (the latter uses the Util taglib, which must be installed in order to be able to use the ESQL taglib). Like the sendmail taglib, the ESQL taglib throws exceptions when an error occurs. One point of interest about the ESQL taglib is that it is a direct copy of the Cocoon ESQL taglib [4], again helping you to port projects to or from Cocoon. As with all the other taglibs, ESQL requires the addition of the following to your httpd.conf:
An example ESQL usage which reads data from an address book table is below. This page demonstrates how it is possible to re-use the same code for both our list of addresses, and viewing a single address in detail.
The result of running the above through the XSP processor is:
More XPathScript Details

XPathScript aims to provide the power and flexibility of XSLT as an XML transformation language, without the restriction of XSLT's XML-based syntax. Unlike XSLT, XPathScript only outputs plain text (XSLT has special modes for outputting text, XML and HTML). This makes it a lot easier to learn than XSLT for people coming from a Perl background. However, XPathScript is not a W3C specification (although XPath, which XPathScript uses, is a W3C recommendation). XPathScript follows the basic ASP syntax for introducing code and outputting code to the browser: use <% %> to introduce Perl code, and <%= %> to output a value.

The XPathScript API

Along with the code delimiters, XPathScript provides stylesheet developers with a full API for accessing and transforming the source XML file. This API can be used in conjunction with the delimiters above to provide a stylesheet language that is as powerful as XSLT, and yet provides all the features of a full programming language (in this case, Perl, but I'm certain that other implementations such as Python or Java would be possible).

Extracting Values

A simple example to get us started is to use the API to bring in the title from a DocBook article. A DocBook article title looks like this:
The XPath expression to retrieve the text in the title element is:
Putting this all together to make this text into the HTML title we get the following XPathScript stylesheet:
Again we see the XPath syntax being used to find the nodes in the document, along with the function findvalue(). Similarly a list of nodes can be extracted (and thus looped over) using the findnodes() function:
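A minimal sketch of such a loop is shown here. It assumes a hypothetical document layout with sect1 elements under an article root, each carrying a title child, and it relies only on the findnodes() and findvalue() functions and the <% %> and <%= %> delimiters described in this appendix.

```
<%
# Top-level findnodes() searches from the document root.
# The /article/sect1 path is an assumed layout for illustration.
for my $node (findnodes('/article/sect1')) {
    # Called as a method, findnodes() searches relative to $node
    for my $title ($node->findnodes('title')) {
%>
  <h2><%= $title->findvalue('.') %></h2>
<%
    }
}
%>
```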
Here we see how we can apply the find* functions to individual nodes as methods, which makes the node the context node to search from, so $node->findnodes("title") finds <title> child nodes of $node.

Declarative Templates

We have already seen declarative templates in our "First AxKit Page" above. The $t hash is the key to declarative templates. The apply_templates() function iterates over the nodes of your XML file, applying the templates defined in the $t hash reference as it meets matching tags. This is the most important feature of XPathScript, because it allows you to define the appearance of individual tags without having to write your own iteration logic. We call this declarative templating. The keys of $t are the names of the elements, including namespace prefixes where appropriate. When apply_templates() is called, XPathScript tries to find a member of $t that matches the element name. The following sub-keys define the transformation:
More details about XPathScript can be found on the AxKit web page, at http://axkit.org/.

XSLT

One of the most important technologies to come out of the W3C is XSLT, or Extensible Stylesheet Language Transformations. XSLT provides a way to transform one type of XML document into another using a language written entirely in XML. XSLT works by allowing developers to create one or more template rules that are applied to the various elements in the source document to produce a second, transformed document. While the basic concept behind XSLT is quite simple (apply these rules to the elements that match these conditions), writing good XSLT stylesheets is a huge topic that we could never hope to cover here. We will instead provide a small example that illustrates the basic XSLT syntax. First, though, we need to configure AxKit to transform XML documents using an XSLT processor. For this example, we will assume that you already have the Gnome XSLT library (libxml2 and libxslt, available at http://xmlsoft.org/) and its associated Perl modules installed on your server.
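The mapping from a stylesheet type to an XSLT language module is made with AxAddStyleMap. A minimal sketch, assuming the LibXSLT language module that ships with AxKit:

```apache
# Process stylesheets of type text/xsl with the libxslt-based language module
AxAddStyleMap text/xsl Apache::AxKit::Language::LibXSLT
```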
Adding this line to your httpd.conf file tells AxKit to process all XML documents with a stylesheet processing instruction whose type is "text/xsl" with the LibXSLT language module.

Anatomy Of An XSLT Stylesheet

All XSLT stylesheets contain the following:
Consider the following bare-bones stylesheet:
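A minimal sketch of a bare-bones XSLT 1.0 stylesheet follows. It has a single root template that emits a fixed HTML skeleton; the title and body text are placeholders chosen for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- The root template: matched once for every input document -->
  <xsl:template match="/">
    <html>
      <head><title>Placeholder title</title></head>
      <body>
        <p>Placeholder content</p>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```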
Note that the root template (defined by the match="/" attribute) will be called without regard for the contents of the XML document being processed. As such, this is the best place to put the top-level elements that we want to include in the output of each and every document being transformed with this stylesheet.

Template Rules And Recursion

Let's take our basic stylesheet and extend it to allow us to transform the following DocBook XML document (which we will call camelhistory.xml) into HTML:
First we need to alter the root template of our stylesheet:
Here we have created the top-level structure of our output document and copied the book's title element into the head element of our HTML page. The <xsl:apply-templates/> element tells the XSLT processor to pass the entire contents of the current element (in this case the <book> element, since it is the root-level element in the source document) on for further processing. Now we need to create template rules for the other elements in the document:
Here we see more examples of recursive processing. The <chapter> and <para> elements are transformed into <div> and <p> elements respectively, and the contents of those elements are passed along for further processing. Note also that the XPath expressions used within the template rules are evaluated in the context of the current element. So, when we select the value of the title element to create the id attribute for the div tag, we are really saying "select the value of the title element that is a child of the current chapter element". While this sort of recursive processing is extremely powerful, it can also be quite a performance hit and is only necessary for those cases where the current element contains other elements that need to be processed. If we know that a particular element will not contain any other elements, we need only return that element's text value.
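The rules described above might be written as in the following sketch. It is a fragment meant to sit inside an xsl:stylesheet element, and it assumes a simplified DocBook structure of book, chapter, title and para; the choice of h1 and h2 for the titles is illustrative.

```xml
<!-- Recursive rules: wrap the element, then process its children -->
<xsl:template match="chapter">
  <!-- "title" is evaluated relative to the current chapter element -->
  <div class="chapter" id="{title}">
    <xsl:apply-templates/>
  </div>
</xsl:template>

<xsl:template match="para">
  <p><xsl:apply-templates/></p>
</xsl:template>

<!-- Leaf rules: no child elements expected, so output the text value only -->
<xsl:template match="book/title">
  <h1><xsl:value-of select="."/></h1>
</xsl:template>

<xsl:template match="chapter/title">
  <h2><xsl:value-of select="."/></h2>
</xsl:template>
```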
Look closely at the last two template elements. Both match a <title> element, but one defines the rule for handling titles whose parent is a book element, while the other handles the chapter titles. In fact, any valid XPath expression, XSLT function call, or combination of the two can be used to define the match rule for a template element. Finally, we need only save our stylesheet as docbook-snippet.xsl. Once our source document is associated with this stylesheet (see the section titled Putting Everything Together in this appendix), if we point our browser to camelhistory.xml we will get the following output:
Learning More

We have only scratched the surface of how XSLT can be used to transform XML documents. For more information, see the following resources:
Also note that the complete set of files for the above example XSLT code is available at http://axkit.org/examples/book-xslt.tar.gz

Putting Everything Together

The last key piece of AxKit is how everything is tied together. We have a clean separation of logic, presentation and content, but we've only briefly introduced using processing instructions for setting up the way a file gets processed through the AxKit engine. A generally better and more scalable way to work is to use the AxKit configuration directives to specify how to process files through the system. Before introducing the configuration directives in detail, it is worth looking at how the W3C sees the evolving web of new media types. The HTML 4.0 specification defined 8 media types: screen, tty, tv, projection, handheld, print, braille, and aural.
AxKit allows you to plug in modules that can detect these different media types, allowing you to deliver the same content in different ways. For finer grained control, you can use named stylesheets (where you might have a printable page output to the screen media type, as seen on many magazine sites such as http://take23.org/ for displaying multi-page articles). For example, to map all files with extension .dkb to a DocBook stylesheet, you would use the following directives:
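A minimal sketch of such a mapping, using AxKit's AxAddProcessor directive inside a standard Apache Files block. The stylesheet path and file name are hypothetical.

```apache
<Files *.dkb>
    # Stylesheet type, then the stylesheet to apply (path is illustrative)
    AxAddProcessor text/xsl /stylesheets/docbook_screen.xsl
</Files>
```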
Now if you wanted to display those DocBook files on WebTV as well as ordinary web browsers, but you wanted to use a different stylesheet for WebTV, you would use:
Now let's extend that to chained transformations. Let's say you wanted to build up a table of contents the same way in both views. One way you could do it is to modularize the stylesheet. However, it's also possible to chain transformations in AxKit, simply by defining more than one processor for a particular resource:
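A sketch of such a chain is given here, assuming the AxMediaType block is used to select the TV variant and that a media-type detection module is configured elsewhere. The docbook_screen.xsl name and the /stylesheets/ path are hypothetical.

```apache
<Files *.dkb>
    # Default view: build the table of contents, then render for the screen
    AxAddProcessor text/xsl /stylesheets/docbook_toc.xsl
    AxAddProcessor text/xsl /stylesheets/docbook_screen.xsl

    <AxMediaType tv>
        # TV view: same first step, different final stylesheet
        AxAddProcessor text/xsl /stylesheets/docbook_toc.xsl
        AxAddProcessor text/xsl /stylesheets/docbook_tv.xsl
    </AxMediaType>
</Files>
```

Processors run in the order in which they are listed, each one receiving the output of the one before it.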
Now the TV-based browsers will see DocBook first transformed by docbook_toc.xsl, and then the output of that transformation will be processed by docbook_tv.xsl. This is exactly how we would build up an application using XSP:
This resolves the earlier issue where the XSP page did not output HTML but something entirely different. Now we can see why: this way we can build dynamic web applications that work easily on different devices! There are 4 other config directives similar to AxAddProcessor. Each takes an additional parameter that specifies a particular way to examine the file being processed, and the processor is applied only when that test matches.
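As a sketch, the four directives in question are taken here to be AxAddRootProcessor, AxAddDocTypeProcessor, AxAddDTDProcessor and AxAddURIProcessor from the AxKit distribution. The stylesheet paths, the DTD location and the regular expression are hypothetical, and the argument order should be checked against the AxKit documentation for your version.

```apache
# Match on the name of the document's root element
AxAddRootProcessor text/xsl /stylesheets/book.xsl book

# Match on the PUBLIC identifier of the document's DOCTYPE
AxAddDocTypeProcessor text/xsl /stylesheets/docbook.xsl "-//OASIS//DTD DocBook XML V4.1.2//EN"

# Match on the SYSTEM identifier (DTD location) of the DOCTYPE
AxAddDTDProcessor text/xsl /stylesheets/docbook.xsl /dtds/docbookx.dtd

# Match the request URI against a regular expression
AxAddURIProcessor text/xsl /stylesheets/book.xsl "book.*\.dkb$"
```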
Finally, the <AxStyleName> block allows you to specify named stylesheets. An example that implements printable/default views of a document might be:
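A sketch of a printable/default setup follows. It assumes a style chooser plugin (here Apache::AxKit::StyleChooser::QueryString from the AxKit distribution) is used to pick the named style from the request, and that "#default" names the style used when none is requested. The stylesheet paths and the "printable" style name are hypothetical.

```apache
# Assumption: the requested style name comes from the query string
AxAddPlugin Apache::AxKit::StyleChooser::QueryString

<AxStyleName "#default">
    AxAddProcessor text/xsl /stylesheets/webpage_screen.xsl
</AxStyleName>

<AxStyleName "printable">
    AxAddProcessor text/xsl /stylesheets/webpage_print.xsl
</AxStyleName>
```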
By mixing the various embedded tags, it is possible to build up a very feature-rich sitemap of how your files get processed.

Footnotes

[1] The W3C calls its approved specifications "Recommendations", rather than "standards", because it is an industry consortium rather than a standards body. However, most people consider its recommendations to be standards, because they almost always become de facto standards.

[2] AxKit is very flexible in how it lets you transform the XML on the server, and there are many modules you can plug in to AxKit to do these transformations. For this reason, the AxKit installation does not mandate any particular modules; instead it will simply suggest modules that might help when you install AxKit.

[3] While we share the XSP syntax with Cocoon, Cocoon allows you to embed Java code in your XSP, while AxKit only allows you to embed Perl code.

[4] AxKit and Cocoon's ESQL taglibs are identical apart from a few minor deviations, such as how columns of different types are returned, and how errors are trapped (in Cocoon there are ESQL tags for trapping errors, whereas AxKit uses exceptions).