Development
Static site generators for building web sites
There are many ways to create web sites. Possibilities include writing HTML files manually, employing a framework to create dynamic web sites, or adopting a fully-fledged content management system (CMS) that offers a central interface to create, edit, review, and publish content. There are also a number of tools called static site generators that can help with the creation of static web sites. They transform input (typically text in lightweight markup languages, such as Markdown or reStructuredText) to static HTML, employing templates and filters on the way.
While static site generators have been around for many years, it seems that their popularity is increasing. FOSDEM migrated from Drupal to nanoc for their 2013 edition and kernel.org just rolled out a new site based on Pelican. Static site generators are likely to appeal to LWN readers as they allow you to turn your web site into an open source project, approaching it like any software development project. This article explains the concept behind static site generators, highlighting their benefits and functionality. It refers to nanoc as one example of such a tool, but the functionality of other site generators is quite similar.
Benefits
Static site generators offer a number of advantages over dynamic web sites. One is high performance, as static HTML pages can immediately be served by the web server because there are no database requests or other overhead. Performance is further enhanced because browsers can easily cache static web pages based on the modification time. Security is also higher since web sites generated statically are just collections of HTML files and supplementary files, such as images. There is no database on the server and no code is being executed when the page is requested. As a side effect, no software upgrades of specific web frameworks have to be applied in order to keep your site up to date, making the site much easier to maintain. Finally, as the site is compiled on your machine instead of the server, the requirements for the hosting site are quite minimal: no special software needs to be installed, and the processing and memory requirements are low as it just needs to run a web server.
There are also a number of benefits in terms of workflow. One advantage is that static site generators allow you to follow whatever workflow you like. Input files are simple text files that can be modified with your editor of choice and there is a wide choice of input formats. Markdown seems particularly popular, but any input format that can be transformed to HTML can be used. Furthermore, input files can be stored in the version control system of your choice and shared with potential collaborators.
Static site generators also promote a smooth review and deployment process. Since you compile your content locally, you can check it before uploading. This can include a review of the diff of the generated content or more thorough checks, such as validating the HTML or checking for broken links. Once you're ready to deploy the site, updating your site is just a matter of running your static site generator to generate the new output and syncing your files to your hosting provider using rsync.
While static web sites are not suited for every use case, they are an attractive alternative to a CMS or a dynamic web framework in many situations.
Use software development processes
The use of static site generators makes the creation of your web site into a process akin to software development. You create input files along with rules that specify how these files should be transformed. Your static site generator of choice performs the compilation process to generate your site and make it ready for deployment. Dependencies between different files are tracked and pages are only regenerated when their contents or dependencies have changed.
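The dependency tracking described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how nanoc implements it: the function names, the checksum approach, and the dictionary-based data layout are all assumptions made for the example.

```python
import hashlib


def checksum(text: str) -> str:
    """Return a stable digest of an item's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def outdated_items(items, dependencies, previous_checksums):
    """Return the names of items that need to be recompiled.

    items: dict mapping item name -> current source text
    dependencies: dict mapping item name -> set of item names it depends on
    previous_checksums: dict mapping item name -> checksum from the last build
    """
    # An item has changed if its checksum differs from the last build.
    changed = {
        name for name, text in items.items()
        if previous_checksums.get(name) != checksum(text)
    }
    # Anything that depends on an outdated item is outdated as well.
    outdated = set(changed)
    grew = True
    while grew:
        grew = False
        for name, deps in dependencies.items():
            if name not in outdated and deps & outdated:
                outdated.add(name)
                grew = True
    return outdated
```

With a layout that has changed since the last build, a page using that layout is rebuilt while an unrelated page is left alone.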
As in every good software development project, content that is common to several pages can be split out. The most common approach is to create a template for the layout of your pages (consisting of HTML headers, the site layout, sidebars, and other information). Nanoc supports Ruby's templating system, ERB, as well as Haml, the HTML abstraction markup language. You can also split out commonly used snippets of HTML code, such as a PayPal or Flattr button. These can be included from other files and it's possible to pass parameters in order to modify their appearance.
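To make the idea of layouts and parameterized snippets concrete, here is a minimal sketch using Python's standard string.Template class. It stands in for ERB or Haml purely for illustration; the layout, the button markup, and the function names are hypothetical.

```python
from string import Template

# A hypothetical site layout shared by all pages.
LAYOUT = Template(
    "<html><head><title>$title</title></head>\n"
    "<body>\n$content\n</body></html>"
)

# A hypothetical reusable snippet whose appearance depends on its parameters.
BUTTON = Template('<a class="button" href="$url">$label</a>')


def button(url: str, label: str = "Donate") -> str:
    """Render the button snippet with the given parameters."""
    return BUTTON.substitute(url=url, label=label)


def render_page(title: str, content: str) -> str:
    """Wrap already-rendered content in the shared layout."""
    return LAYOUT.substitute(title=title, content=content)


page = render_page("Home", "<p>Welcome</p>\n" + button("/donate/"))
```

Changing the layout or the snippet in one place then affects every page that uses it.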
A site generator like nanoc will compile individual items and combine them with a layout to produce the finished HTML document. Nanoc allows the creation of a Rules file which defines the operations that nanoc should perform on different items. Nanoc differentiates between compile rules, which specify the transformation steps for an item, and route rules, which tell nanoc where to put an item. A compile rule could specify that pages with the .md extension are to be rendered from Markdown to HTML with the pandoc filter. The rule would also specify a layout to use for the page. A route directive would be used to specify that the rendered output of foo.md should be stored as foo/index.html.
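The division of labor between compile and route rules can be mimicked in Python. The sketch below is not nanoc's Rules syntax (which is Ruby); the function names, the "{{ content }}" placeholder, and the handling of index.md are assumptions chosen for the example, and the Markdown filter is passed in as a plain function rather than calling pandoc.

```python
from pathlib import PurePosixPath


def compile_item(source_path, text, markdown_filter, layout):
    """Compile step: run the filter on .md items and apply the layout."""
    if source_path.endswith(".md"):
        body = markdown_filter(text)
        return layout.replace("{{ content }}", body)
    # Other items (images, CSS, ...) are passed through unchanged.
    return text


def route(source_path: str) -> str:
    """Route step: decide where the output goes, e.g. foo.md -> foo/index.html."""
    path = PurePosixPath(source_path)
    if path.suffix != ".md":
        return str(path)
    if path.stem == "index":
        # index.md is already the index of its directory.
        return str(path.with_suffix(".html"))
    return str(path.with_suffix("") / "index.html")


print(route("foo.md"))        # foo/index.html
print(route("img/logo.png"))  # img/logo.png
```

Keeping the two steps separate means that the URL structure of a site can change without touching the way its content is rendered.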
There are many filters that can transform your input. Nanoc offers filters to transform text from a range of formats to HTML. It also allows you to embed Ruby code using ERB, which is useful to access information from other pages and to run code you've written. What I like about static site generators is that they make it really easy to write content: instead of writing HTML, you use a lightweight markup language and let the tool perform the transformation for you. Additionally, you can run filters to improve the typography of your page, such as converting --- to — or "foo" to “foo”. You could also write a filter to include images without manually specifying their height and width. Why not let a filter do the boring work for you? While nanoc has a number of built-in filters, it's trivial to write your own or to extend it in other ways.
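A typography filter of the kind mentioned above is, at its simplest, a function from text to text. The following Python sketch is deliberately naive and is not one of nanoc's filters: the function name is made up, and it assumes it is run on plain text, since applying it to HTML would also rewrite the quotes around attribute values.

```python
import re


def smarten(text: str) -> str:
    """Replace --- with an em dash and straight double quotes with curly ones."""
    text = text.replace("---", "\u2014")
    # A quoted run without inner quotes or newlines gets curly quotes.
    return re.sub(r'"([^"\n]+)"', "\u201c\\1\u201d", text)


print(smarten('He said "foo" --- twice'))
```

A real filter would need to cope with nested quotes, apostrophes, and markup, which is why established typography filters are usually a better choice than a home-grown regular expression.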
Once you have written some input files and created a layout along with rules to specify how files should be compiled, the site generator will do the rest for you. The compilation process will take every changed item and produce output by running your specified transformations on the input. You can also configure the tool to deploy the site for you. However, as mentioned before, you should approach your web site like your software project—and who wants to ship code before testing it? Nanoc allows you to run checks on your output. It has built-in checks to validate CSS and HTML, find stale files, and to find broken links (either internal or external links). Further checks can be added with a few lines of code.
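As an illustration of how little code such a check needs, here is a sketch of an internal link checker in Python. It is not nanoc's implementation; it assumes the generated site is available as a dictionary mapping root-relative output paths to HTML text, and it only examines root-relative links, skipping external and relative ones.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def broken_internal_links(pages):
    """Return sorted (page, href) pairs whose target is not part of the site.

    pages: dict mapping an output path such as "/foo/index.html" to its HTML.
    """
    broken = []
    for page, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            parts = urlsplit(href)
            if parts.scheme or parts.netloc or not parts.path.startswith("/"):
                continue  # external, fragment-only, or relative link: skipped
            target = parts.path
            if target.endswith("/"):
                target += "index.html"  # directory links resolve to the index
            if target not in pages:
                broken.append((page, href))
    return sorted(broken)
```

Running a function like this before deployment turns a broken link into a failed build instead of a complaint from a reader.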
Some examples
Thinking of my own home page, I can see a number of ways that using a static site generator would make it easier to maintain. At the moment, my site relies on a collection of HTML files and a Perl script to provide basic template functionality ("hack" might be a more appropriate description). Migrating to a tool like nanoc would instantly give me dependency tracking and a proper templating system.
There are a number of ways I could further improve my site, though. I maintain a list of academic publications, consisting of a main index page along with a separate page for each paper. When adding a new paper, I have to duplicate a lot of information on two pages. Using nanoc and some Ruby libraries, I could simply generate both pages from a BibTeX file (LaTeX's bibliography system). This would not only reduce duplication but also automatically format the paper information according to my preferred citation style. Similarly, I maintain several HTML tables showing the status of Debian support for embedded devices. While updating these tables is not too much work, it would be much cleaner to store the information in a YAML or JSON file and generate the tables automatically.
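Generating a table from structured data is straightforward. The Python sketch below reads a JSON list of records and emits an HTML table; the column names and the device entries are invented for the example and do not reflect any real support status.

```python
import json
from html import escape


def render_table(json_text: str, columns) -> str:
    """Render a JSON list of objects as an HTML table with the given columns."""
    rows = json.loads(json_text)
    head = "".join(f"<th>{escape(col)}</th>" for col in columns)
    lines = ["<table>", f"  <tr>{head}</tr>"]
    for row in rows:
        # Missing keys become empty cells; all values are HTML-escaped.
        cells = "".join(
            f"<td>{escape(str(row.get(col, '')))}</td>" for col in columns
        )
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


# Hypothetical data file contents.
DATA = """[
  {"device": "Device A", "status": "supported"},
  {"device": "Device B & C", "status": "in progress"}
]"""

print(render_table(DATA, ["device", "status"]))
```

Updating the table then means editing one data file, and the same data can feed several pages.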
Another useful nanoc feature is the ability to create different representations from one input file. In addition to transforming your CV or résumé from the Markdown input to HTML, you could also generate a PDF from the same input. Similarly, you could create an ebook from a series of blog entries in addition to displaying the blog entries on your web site.
Static doesn't mean boring
One objection people might have to static site generators is that static sites are boring. However, this isn't necessarily the case, for a number of reasons. First, a static site can use JavaScript to provide dynamic and interactive elements on the site. Second, a statically generated web site doesn't have to be static—it can be updated many times per day. Nanoc, for example, allows you to use data from any source as input. You could periodically download a JSON file of your Twitter feed and render that information on your web site. An open source project could download the changelog from its version control system and automatically generate a list of releases for its web site.
A good example is the FOSDEM web site: the FOSDEM organizers internally use the Pentabarf conference planning system to schedule talks. Information from Pentabarf is periodically exported and used as a data source to generate the schedule on the web site. The organizers only had to write some code once to transform the Pentabarf data into a nice schedule. Now that this functionality has been implemented, nanoc will update their web site whenever the data changes.
Another problem with static sites is the lack of support for comments and other discussion mechanisms. Fortunately, there are a number of solutions. One approach is demonstrated by a plug-in for Jekyll, which contains a PHP script that forwards comments by email. The web site owner can then add them to the site (either automatically or after manual moderation) and rebuild it. A more interactive and more commonly used solution is Disqus, an online discussion and commenting service that can be embedded in web sites and blogs with the help of JavaScript. Juvia appears to be a viable open source alternative to Disqus, although I couldn't find many sites using it.
Conclusion
Static site generators are an attractive solution for many web sites and there is a wide range of tools to choose from. Since many site generators are frameworks that allow you to extend the software, a good way to select a tool is by looking at its programming language. There are solutions for Haskell (Hakyll), Perl (Templer), Python (Hyde, Pelican), Ruby (Jekyll, Middleman, nanoc) and many more. You can also check out Steve Kemp's recent evaluation of static site generators.
What's clear to me is that the time of routinely writing HTML by hand is definitely over. It's much nicer to write your content in Markdown and let the site generator do the rest for you. This allows you to spend more time writing content—assuming you can stop yourself from further and further enhancing the code supporting your site.
Brief items
Quotes of the week
Python moves to electronic contributor agreements
The Python project has announced that it is trying to ease the process of signing a contributor agreement through the use of Adobe's "EchoSign" service. "Faxes fail, mail gets lost, and sometimes pictures or scans turn out poorly. It was time to find a more user-friendly solution, and the Foundation is happy to finally offer this electronic form."
Buildroot 2013.02 released
Version 2013.02 of the buildroot tool for embedded Linux development is available. Changes include 66 new packages, Eclipse integration support, and the option to set the root password.
10 years of PyPy
The PyPy project, which is working toward the creation of a highly-optimized interpreter for the Python language, is celebrating its tenth anniversary. "To make it more likely to be accepted, the proposal for the EU project contained basically every feature under the sun a language could have. This proved to be annoying, because we had to actually implement all that stuff. Then we had to do a cleanup sprint where we deleted 30% of codebase and 70% of features."
Google releases a better compression algorithm
The Google Open Source Blog has announced the release of the "Zopfli" open source compression algorithm. Though compression cost is high, it could be a win for certain applications: "Due to the amount of CPU time required, 2–3 orders of magnitude more than zlib at maximum quality, Zopfli is best suited for applications where data is compressed once and sent over a network many times — for example, static content for the web."
Upstart 1.7 available
James Hunt has released upstart 1.7, the latest version of the alternative init daemon. This version includes new D-Bus signals, new tests, an event bridge for proxying system-level events, plus the ability "to run with PID >1 to allow Upstart to manage a user session. Running Upstart as a 'Session Init' in this way provides features above and beyond those provided by the original User Jobs such that the User Job facility has been removed entirely: to migrate from a system using User Jobs, simply ensure the user session is started with 'init --user'."
0install 2.0 released
Version 2.0 of Zero Install, the decentralised cross-platform software installation system, is now available. There is a new feed format, which is "100% backwards compatible with the 1.0 format (all software distributed for 1.0 will also work with 2.0), while supporting more expressive dependency requirements (optional, OS-specific, restriction-only dependencies and dependencies for native packages), more flexible version constraints, and executable bindings (dependencies on executable programs, not just on libraries)." Other changes include easier roll-back, improved diagnostics, and better support for headless systems.
[ANNOUNCE] xorg-server 1.14.0
Keith Packard has released xserver 1.14.0, complete with fixes for the touch device and GPU hotplugging, plus software rendering speedups.
Newsletters and articles
Development newsletters from the past week
- Caml Weekly News (March 5)
- What's cooking in git.git (March 3)
- Haskell Weekly News (February 28)
- Openstack Community Weekly Newsletter (March 1)
- Perl Weekly (March 4)
- PostgreSQL Weekly News (March 3)
- Ruby Weekly (February 28)
- Tahoe-LAFS Weekly News (March 3)
Firefox OS, Ubuntu and Jolla's Sailfish at MWC (The H)
The H briefly covers a panel session at the Mobile World Congress. The panel featured representatives of three Linux-based contenders in the mobile space: Mozilla Chair Mitchell Baker (Firefox OS), Canonical founder Mark Shuttleworth (Ubuntu for Phones), and Jolla CEO Marc Dillon (Sailfish OS). "Jolla CEO Dillon remarked at the panel that the time was right to give people alternatives, and like Shuttleworth, suggested that his company is doing its best to do so. The Sailfish SDK is based on QtCreator, the Mer project's build engine and an emulator for the operating system. The SDK is released under a combination of open source licences and the company states its goal with Sailfish 'is to develop an open source operating system in co-operation with the community', but it has not made clear what parts of the code, beyond the Mer underpinnings, it intends to open under which specific licences." There is a video of the panel session available as well.
Michaelsen: One
On his blog, LibreOffice hacker Bjoern Michaelsen celebrates the conversion to make for LibreOffice builds. Michael Meeks congratulated Michaelsen and the others responsible for "killing our horrible, legacy, internal dmake". Michaelsen looks at the speed improvements that came with the new build system, which reduced the "null build" (nothing to do) from 5 minutes (30 minutes on Windows) to 37 seconds. "There are other things improved with the new build system too. For example, in the old build system, if you wanted to add a library, you had to touch a lot of places (at minimum: makefile.mk for building it, prj/d.lst for copying it, solenv/inc/libs.mk for others to be able to link to it, scp2 to add it to the installation and likely some other things I have forgotten), while now you have to only modify two places: one to describe what to build and one to describe where it ends up in the install. So while the old build system was like a game of jenga, we can now move more confidently and quickly."
Page editor: Nathan Willis