BibTeX, XML and XSL for Bibliography Transformation

This is a set of tools I built using BibTeX, BibTeXML, Saxon (I'm using Saxon 8, but other versions will probably work) and Apache FOP. The underlying data is stored in bib format, translated on the fly to (a variant of) BibTeXML, and then transformed with XSL into the desired output format. The .sh files contain examples for creating:

  1. A complete Excel file containing all data (good for checking data integrity)
  2. A couple of selective text files (good for cutting and pasting into web forms - a job we have to do all the time in academia); readily modified for different selections of fields
  3. A web page bibliography in HTML format (fairly easy to modify using CSS)
  4. A bibliography in DokuWiki format (readily modified to other wiki formats)
  5. A bibliography in PDF format
  6. A bibliography in RTF format (for incorporating into Word and OpenOffice files)
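
The pipeline described above (bib data, then XML, then one transformation per output format) can be sketched in a few lines. This is only an illustration using Python's standard library, not the BibTeXML and Saxon tooling itself. The element names (`file`, `entry`), the sample entry and the function names are all assumptions made for the example, and the regular expressions only cope with simple brace-delimited fields.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample data; a real .bib file would be read from disk.
SAMPLE_BIB = """
@article{doe2006,
  author = {Jane Doe},
  title = {An Example Title},
  journal = {Journal of Examples},
  year = {2006}
}
"""

# Matches "@type{key, ...fields... }" where the closing brace starts a line.
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,(.*?)\n\}", re.DOTALL)
# Matches one "name = {value}" field per line (no nested braces).
FIELD_RE = re.compile(r"(\w+)\s*=\s*\{(.*?)\}\s*,?\s*$", re.MULTILINE)


def bib_to_xml(bib_text):
    """Turn bib text into a simplified XML tree (assumed structure, not real BibTeXML)."""
    root = ET.Element("file")
    for kind, key, body in ENTRY_RE.findall(bib_text):
        entry = ET.SubElement(root, "entry", id=key, type=kind.lower())
        for name, value in FIELD_RE.findall(body):
            ET.SubElement(entry, name.lower()).text = value.strip()
    return root


def to_text(root, fields):
    """Selective text output: one tab-separated line per entry, chosen fields only."""
    lines = []
    for entry in root.iter("entry"):
        lines.append("\t".join(entry.findtext(f, default="") for f in fields))
    return "\n".join(lines)


def to_html(root):
    """HTML output: an unordered list that CSS can style via the class name."""
    ul = ET.Element("ul", {"class": "bibliography"})
    for entry in root.iter("entry"):
        li = ET.SubElement(ul, "li", id=entry.get("id"))
        li.text = "{}. {} ({}).".format(
            entry.findtext("author", default=""),
            entry.findtext("title", default=""),
            entry.findtext("year", default=""),
        )
    return ET.tostring(ul, encoding="unicode")


if __name__ == "__main__":
    tree = bib_to_xml(SAMPLE_BIB)
    print(to_text(tree, ["author", "year", "title"]))
    print(to_html(tree))
```

The point of the design is that each output format is a small, separate function over the same XML tree, which plays the role that a separate XSL stylesheet plays in the real tools.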

To use:

  1. Install the required software (BibTeX, BibTeXML, Saxon and Apache FOP)
  2. Make your own .bib file
  3. Modify the paths in all .sh files to point to your own installations and to your own files
  4. Comment out the output formats you don't need in bib.sh
  5. Run bib.sh
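
To give a feel for what a driver for steps 3 to 5 involves, here is a rough Python sketch that only builds and prints the command lines instead of running them. It is not the contents of bib.sh. Every path, file name and stylesheet name below is hypothetical, and it assumes the bib data has already been translated to an XML file and that PDF and RTF are produced by running Apache FOP on an XSL-FO file. The Saxon invocation follows the Saxon 8 style (`-o output source stylesheet`); other versions may expect different arguments.

```python
import os

# Hypothetical locations; point these at your own installation and files.
SAXON_JAR = "/path/to/saxon8.jar"
SOURCE_XML = "refs.xml"  # assumed to already hold the translated bib data

# Output file -> stylesheet (all names hypothetical).
# Comment out the formats you don't need.
STYLESHEETS = {
    "refs.html": "bib2html.xsl",
    "refs.txt": "bib2text.xsl",
    "refs.fo": "bib2fo.xsl",
}

# Files that FOP renders from the intermediate XSL-FO output.
FOP_TARGETS = ["refs.pdf", "refs.rtf"]


def saxon_command(source, stylesheet, output):
    """Build a Saxon 8 style command line for one XSL transformation."""
    return ["java", "-jar", SAXON_JAR, "-o", output, source, stylesheet]


def fop_command(fo_file, output):
    """Build an Apache FOP command line, choosing the flag from the extension."""
    flags = {".pdf": "-pdf", ".rtf": "-rtf"}
    extension = os.path.splitext(output)[1]
    return ["fop", "-fo", fo_file, flags[extension], output]


def build_plan():
    """Return the list of commands, transformations first, FOP renderings last."""
    plan = [saxon_command(SOURCE_XML, xsl, out) for out, xsl in STYLESHEETS.items()]
    if "refs.fo" in STYLESHEETS:
        plan += [fop_command("refs.fo", target) for target in FOP_TARGETS]
    return plan


if __name__ == "__main__":
    for command in build_plan():
        print(" ".join(command))
```

Printing the commands before running anything makes it easy to check the paths from step 3 by eye.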

Why did I build it? I needed something more standards-based (and more tailorable) than the other tools available on the web. I needed to be able to build a new output format in an hour or so, something I have done regularly since. I don't think I could have done this with other tools. Your mileage may vary…

Bib Maintenance Tool

At the moment, we store the primary data in bib format using BibDesk (open source, for Macintosh); any other bib maintenance tool would do as well. If I were building this today I'd probably use a web-based system such as Zotero or CiteULike.

Planned Future Development

In the long run, bib isn't the ideal format for storing the database. The ideal mechanism would store the data in an XML format (where we could include binary blobs for many of the components that academic administrators often want), with a separate XRX-based mechanism for maintaining it. This requires a bit of work I don't have time for right now. In addition, I eventually plan to build an HTML form-filler to automate the manual web-form entry that we often have to undertake (why administrators can't accept an XML input format, instead of requiring manual web-based entry, is beyond me).