March 6, 2013
This article was contributed by Martin Michlmayr
There are many ways to create web sites. Possibilities include writing HTML
files manually,
employing a framework to create dynamic web sites, or adopting a
fully-fledged content management system (CMS) that offers a central
interface to create, edit, review, and publish content. There are also a
number of tools called static site generators that can help with the creation
of static web sites. They transform input (typically text in lightweight
markup languages, such as Markdown or
reStructuredText)
to static HTML, employing templates and filters on the way.
While static
site generators have been around for many years, it seems that their
popularity is increasing. FOSDEM migrated from
Drupal to nanoc for their 2013
edition and kernel.org just
rolled out a new site based on
Pelican. Static site generators are likely to appeal to LWN
readers as they allow you to turn your web site into an open source
project, approaching it like any software development project. This article
explains the concept behind static site generators, highlighting their
benefits and functionality. It refers to nanoc as one example of such a tool, but the
functionality of other site generators is quite similar.
Benefits
Static site generators offer a number of advantages over dynamic web sites.
One is high performance: the web server can serve static HTML pages
immediately, with no database requests or other per-request overhead.
Performance is further enhanced because browsers can
easily cache static web pages based on the modification time. Security
is also higher since web sites generated statically are just collections
of HTML files and supplementary files, such as images. There is no
database on the server and no code is being executed when the page is
requested. As a side effect, there is no web framework on the server that
has to be kept up to date with upgrades, which makes
the site much easier to maintain. Finally, as the site is compiled on your
machine instead of the server, the requirements for the hosting site are
quite minimal: no special software needs to be installed, and the
processing and memory requirements are low as it just needs to run a web
server.
There are also a number of benefits in terms of workflow. One advantage is
that static site generators allow you to follow whatever workflow you like.
Input files are simple text files that can be modified with your editor of
choice and there is a wide choice of input formats. Markdown seems
particularly popular, but any input format that can be transformed to HTML
can be used. Furthermore, input files can be stored in the version control
system of your choice and shared with potential collaborators.
Static site generators also promote a smooth review and deployment
process. Since you compile your content locally, you can check it before
uploading. This can include a review of the diff of the generated content
or more thorough checks, such as validating the HTML or checking for broken
links. Once you're ready to deploy, updating the site is just a
matter of running your static site generator to generate the new output and
syncing the files to your hosting provider with a tool such as rsync.
While static web sites are not suited for every use case, they are an
attractive alternative to a CMS or a
dynamic web framework in many situations.
Use software development processes
Using a static site generator turns the creation of your web site into
a process akin to software development. You create input files along
with rules that specify how these files should be transformed. Your static
site generator of choice performs the compilation process to generate your
site and make it ready for deployment. Dependencies between different files
are tracked and pages are only regenerated when their contents or dependencies
have changed.
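The dependency tracking described above can be illustrated with a small Python sketch. It is not how nanoc implements it; the function names, the checksum approach, and the sample items are assumptions made for illustration. An item is considered outdated if its own source changed since the last build or if anything it depends on is outdated.

```python
import hashlib


def checksum(text):
    """Return a stable fingerprint of an item's source text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def outdated_items(sources, dependencies, previous_checksums):
    """Return the sorted names of items that need to be recompiled.

    sources:            name -> current source text
    dependencies:       name -> list of names the item depends on
    previous_checksums: name -> checksum recorded at the last build
    """
    # Step 1: items whose own content differs from the last build.
    outdated = {
        name for name, text in sources.items()
        if previous_checksums.get(name) != checksum(text)
    }
    # Step 2: propagate along dependencies until nothing new is found.
    progress = True
    while progress:
        progress = False
        for name, deps in dependencies.items():
            if name not in outdated and any(d in outdated for d in deps):
                outdated.add(name)
                progress = True
    return sorted(outdated)


# Hypothetical site: the layout changed, the two pages did not.
sources = {
    "layout": "<html>v2</html>",
    "index.md": "# Home",
    "about.md": "# About",
}
previous = {
    "layout": checksum("<html>v1</html>"),
    "index.md": checksum("# Home"),
    "about.md": checksum("# About"),
}
deps = {"index.md": ["layout"], "about.md": []}

print(outdated_items(sources, deps, previous))  # ['index.md', 'layout']
```

In this example only index.md is rebuilt along with the layout, because about.md neither changed nor depends on anything that did.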
As in every good software development project, content that is common to several
pages can be split out. The most common approach is to create a template for the
layout of your pages (consisting of HTML headers, the site layout, sidebars,
and other information). Nanoc supports Ruby's templating system, ERB,
as well as Haml, the HTML abstraction
markup language. You can also split out commonly used snippets of HTML code,
such as a PayPal or Flattr button. These can be included from other files
and it's possible to pass parameters in order to modify their appearance.
A site generator like nanoc will compile individual items and combine them
with a layout to produce the finished HTML document. Nanoc allows the creation of a Rules file which defines
the operations that nanoc should perform on different items. Nanoc differentiates between compile rules, which
specify the transformation steps for an item, and route rules,
which tell nanoc where to put an item. A compile rule could
specify that pages with the .md extension are to be rendered
from Markdown to HTML with the pandoc filter. The rule would also
specify a layout to use for the page. A route rule would
be used to specify that the rendered output of foo.md should
be stored as foo/index.html.
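The split between compile rules and route rules can be sketched in Python as follows. This is not nanoc's Rules syntax (nanoc's rules are written in Ruby); the rule table, the function names, and the layout name "default" are hypothetical and only illustrate the two kinds of decision: how an item is transformed and where its output is written.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical compile rules: (glob pattern, filter names, layout name).
# The first matching pattern wins.
COMPILE_RULES = [
    ("*.md", ["pandoc"], "default"),  # Markdown: render, then apply a layout
    ("*", [], None),                  # everything else: copy unchanged
]


def compile_rule_for(identifier):
    """Return (filters, layout) for the first rule matching the item."""
    for pattern, filters, layout in COMPILE_RULES:
        if fnmatchcase(identifier, pattern):
            return filters, layout
    raise LookupError(f"no compile rule matches {identifier!r}")


def route(identifier):
    """Return the output path for an item, e.g. foo.md -> foo/index.html."""
    path = PurePosixPath(identifier)
    if path.suffix != ".md":
        return str(path)                         # leave other files in place
    if path.name == "index.md":
        return str(path.with_suffix(".html"))    # index.md -> index.html
    return str(path.with_suffix("") / "index.html")


print(compile_rule_for("foo.md"))  # (['pandoc'], 'default')
print(route("foo.md"))             # foo/index.html
print(route("style.css"))          # style.css
```

Routing foo.md to foo/index.html is what lets the page be served under the clean URL /foo/ by an ordinary web server.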
There are many filters that can transform your input. Nanoc offers filters
to transform text from a range of formats to HTML. It also allows you to
embed Ruby code using ERB, which is useful to access information from other
pages and to run code you've written. What I like about static site
generators is that they make it really easy to write content: instead of
writing HTML, you use a lightweight markup language and let the tool
perform the transformation for you. Additionally, you can run filters to
improve the typography of your page, such as converting --- to
— or "foo" to “foo”. You could also write
a filter to include images without manually specifying their height and
width—why not let a filter do the boring work for you? While nanoc
has a number of built-in filters,
it's trivial to write your own—or to extend it in other ways.
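To give an idea of how small such a filter can be, here is a minimal Python sketch of a typography filter. It is not one of nanoc's built-in filters (those are written in Ruby), and the function name is made up. It is deliberately naive and assumes it runs on plain text before any markup is generated, since it would otherwise also rewrite quotes inside HTML attributes.

```python
import re


def typographic(text):
    """Replace --- with an em dash and straight quotes with curly quotes."""
    # Three hyphens become an em dash (U+2014).
    text = text.replace("---", "\u2014")
    # A straight quote at the start of the text, or after whitespace or an
    # opening bracket, is treated as an opening quote (U+201C).
    text = re.sub(r'(^|[\s(\[{])"', "\\1\u201c", text)
    # Any straight quote that is left is treated as a closing quote (U+201D).
    text = text.replace('"', "\u201d")
    return text


# The example from the text: "foo" becomes curly-quoted, --- becomes a dash.
example = typographic('"foo" --- bar')
```

A production-quality filter would need to handle apostrophes, nested quotes, and code samples, which is a good reason to reuse an existing one where available.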
Once you have written some input files and created a layout along with
rules to specify how files should be compiled, the site generator will do
the rest for you. The compilation process will take every changed item and
produce output by running your specified transformations on the input. You
can also configure the tool to deploy the site for you. However, as
mentioned before, you should approach your web site like your software
project—and who wants to ship code before testing it? Nanoc allows
you to run checks on your output. It has built-in checks to validate CSS
and HTML, find stale files, and find broken links (both internal and
external). Further checks can be added with a few lines of code.
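As an illustration of what such a check does, the following Python sketch looks for broken internal links in a set of generated pages. It is not nanoc's implementation; the function and class names and the sample pages are assumptions. It uses only the standard library, treats any link with a scheme or host as external and skips it, and assumes that a link ending in a slash is served from index.html.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href values of all <a> tags in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def broken_internal_links(pages):
    """Return sorted (page, href) pairs whose target is not in the output.

    pages: output path such as '/foo/index.html' -> generated HTML text
    """
    known = set(pages)
    broken = []
    for page, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            # Resolve relative links against the page they appear on.
            parts = urlsplit(urljoin(page, href))
            if parts.scheme or parts.netloc:
                continue  # external link: out of scope for this sketch
            target = parts.path
            if target.endswith("/"):
                target += "index.html"
            if target not in known:
                broken.append((page, href))
    return sorted(broken)


# Hypothetical generated site with one dangling link.
pages = {
    "/index.html": '<a href="foo/">Foo</a> <a href="/missing.html">gone</a> '
                   '<a href="http://example.org/">external</a>',
    "/foo/index.html": '<a href="../">Home</a> <a href="#top">Top</a>',
}

print(broken_internal_links(pages))  # [('/index.html', '/missing.html')]
```

Because the check runs on the compiled output before deployment, a dangling link is caught on your machine rather than by a visitor.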
Some examples
Thinking of my own home page, I can
see a number of ways that using a static site generator would make it easier
to maintain. At the moment, my site relies on a collection of HTML files
and a Perl script to provide basic template functionality ("hack" might be
a more appropriate description). Migrating to a tool like nanoc would
instantly give me dependency tracking and a proper templating system.
There are a number of ways I could further improve my site, though. I
maintain a list of academic
publications, consisting of a main index page
along with a separate page for each paper. When adding a new paper, I have
to duplicate a lot of information on two pages. Using nanoc and some Ruby
libraries, I could simply generate both pages from a BibTeX file (LaTeX's bibliography
system). This would not only reduce the duplication of text
but also automatically format the paper information according to my
preferred citation style. Similarly, I maintain several HTML tables showing
the status of Debian support for embedded devices. While updating these
tables is not too much work, it would be much cleaner to store the
information in a YAML or JSON file and generate the tables automatically.
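Generating a table from a data file takes very little code. The Python sketch below shows the idea with JSON; in nanoc the equivalent would be written in Ruby. The function name, the column names, and the sample devices are all hypothetical and are not taken from the actual status tables.

```python
import json
from html import escape


def render_table(json_text, columns):
    """Render a JSON list of objects as an HTML table with the given columns."""
    rows = json.loads(json_text)
    header = "".join(f"<th>{escape(column)}</th>" for column in columns)
    lines = ["<table>", f"  <tr>{header}</tr>"]
    for row in rows:
        # Missing keys become empty cells; values are escaped for HTML.
        cells = "".join(
            f"<td>{escape(str(row.get(column, '')))}</td>" for column in columns
        )
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


# Hypothetical data file contents.
data = """
[
  {"Device": "Device A", "Status": "supported"},
  {"Device": "Device B", "Status": "in progress"}
]
"""

html_table = render_table(data, ["Device", "Status"])
print(html_table)
```

Updating the table then means editing one entry in the data file and rebuilding the site, with no HTML to touch by hand.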
Another useful nanoc feature is the ability to create different
representations
from one input file. In addition to transforming your CV or résumé from the Markdown
input to HTML, you could also generate a PDF from the same input.
Similarly, you could create an ebook from a series of blog entries in
addition to displaying the blog entries on your web site.
Static doesn't mean boring
One objection people might have to static site generators is that static
sites are boring. However, this isn't necessarily the case, for a number of
reasons. First, a static site can use JavaScript to provide dynamic and
interactive elements on the site. Second, a statically generated web site
doesn't have to be static—it can be updated many times per day.
Nanoc, for example, allows you to use data from any source as input. You
could periodically download a JSON file of your Twitter feed and render
that information on your web site. An open source project could download
the changelog from its version control system and automatically generate a
list of releases for its web site.
A good example is the FOSDEM web site: the FOSDEM organizers internally use
the Pentabarf conference planning
system to schedule talks. Information from
Pentabarf is periodically exported and used as a data source to generate
the schedule on the web site. The organizers only had to write some code once to
transform the Pentabarf data into a nice schedule. Now that this
functionality has been implemented, nanoc will update their web site
whenever the data changes.
Another problem with static sites is the lack of support for comments and
other discussion mechanisms. Fortunately, there are a number of solutions.
One approach is demonstrated by a plug-in for
Jekyll, which contains a PHP script that forwards comments by email. The
web site owner can then add them to the site (either automatically or after
manual moderation) and rebuild it. A more interactive and commonly
used solution is Disqus, an
online discussion and commenting service that can be embedded in web sites
and blogs with the help of JavaScript. Juvia appears to be a viable
open source alternative to Disqus, although I couldn't find many sites
using it.
Conclusion
Static site generators are an attractive solution for many web sites and
there is a wide range of tools to choose from. Since many site
generators are frameworks that allow you to extend the software, a good way
to select a tool is to look at the programming language it is written in. There are
solutions for Haskell (Hakyll),
Perl (Templer),
Python (Hyde, Pelican), Ruby (Jekyll, Middleman, nanoc) and many more. You can also check out
Steve Kemp's recent evaluation of static
site generators.
What's clear to me is that the time of routinely writing HTML by hand is
definitely
over. It's much nicer to write your content in Markdown and let the site
generator do the rest for you. This allows you to spend more time writing
content—assuming you can stop yourself from further and further enhancing
the code supporting your site.