Words from Old Books

XML from a gentler age.

Here you may find the XML sources for some of the books on this site. The Web pages on words.fromoldbooks.org are made by taking the XML sources and running them through an XSLT transformation that produces one HTML page for each dictionary entry or book section. In some cases further processing adds links and cross-references.
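As a rough illustration of that one-page-per-entry idea, here is a minimal sketch in Python rather than XSLT. It is not the transformation used on this site: the element names (`dictionary`, `entry`, `title`, `body`), the `id` attribute and the sample text are all made up for the example, and the real files use their own markup.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical sample input; the real data sets use different element names.
SAMPLE = """<dictionary>
  <entry id="first"><title>First</title><body>Text of the first entry.</body></entry>
  <entry id="second"><title>Second</title><body>Text of the second entry.</body></entry>
</dictionary>"""


def pages_from_xml(xml_text):
    """Return a dict mapping a file name to HTML, one page per <entry>."""
    root = ET.fromstring(xml_text)
    pages = {}
    for entry in root.iter("entry"):
        title = entry.findtext("title", default="")
        body = entry.findtext("body", default="")
        # Escape the text so that characters such as & and < stay valid in HTML.
        html = (
            "<html><head><title>{t}</title></head>"
            "<body><h1>{t}</h1><p>{b}</p></body></html>"
        ).format(t=escape(title), b=escape(body))
        pages[entry.get("id") + ".html"] = html
    return pages


if __name__ == "__main__":
    for name, html in pages_from_xml(SAMPLE).items():
        print(name, len(html))
```

An XSLT stylesheet would express the same split declaratively, with a template matching each entry; the Python version is only meant to show the shape of the process.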

You can use these XML files in pretty much any way you like, but:

  1. You are forbidden from working on them while wearing shoes or white socks;

  2. Please don’t just copy them onto the Web. I did a lot of work to make them, and the small income I get from the ads on this site goes towards more work. Google penalises sites that copy from others, so if you do put these files up, you may find Google hates you.

  3. Please tell me (liam at fromoldbooks dot org) what you did!

You will have to edit the URL of this page to get to the files; please do not link to them from other sites, but only to this page! Simply append the data set name, exactly as given in the table below, to the URL of this page.
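If you would rather build the address in a script than by hand, something like the following sketch may help. The `PAGE_URL` value is a placeholder and not the real address of this page, the function name `data_set_url` is invented for the example, and it assumes the data set name is joined to the page URL with a single slash.

```python
def data_set_url(page_url, data_set_name):
    """Append a data set name to the page URL with exactly one slash between them."""
    return page_url.rstrip("/") + "/" + data_set_name


# Placeholder only: substitute the URL of this page as shown in your browser.
PAGE_URL = "https://example.org/xml/"

print(data_set_url(PAGE_URL, "Grose-VulgarTongue"))
# prints https://example.org/xml/Grose-VulgarTongue
```

The data set name must match the table exactly, including capital letters and any trailing slash.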

Some freely available XML data sets on this site.
Data set | Size | Notes
Chalmers-Biography | 43 MBytes | This is the source for the Biographical Dictionary of Alexander Chalmers featured on this site; the printed work is in thirty volumes, each of about 500 pages.
NathanBailey-CantingDictionary | 276 KBytes | I transcribed this by hand from a 1720s printed dictionary.
Grose-VulgarTongue | 800 KBytes | There is a Project Gutenberg version of this text; I have converted it to XML and made a number of fixes to the transcription. The Project Gutenberg licence does not permit me to be more explicit, sorry.
chalmers-biography-extract.xml | A single 250 KByte file | Used in the fifth edition of Beginning XML (Wrox, 2012).
majority/ | Various | Scripts for converting OCR output to XML.
microxml/ | Various, mostly small | Some tools for handling MicroXML.

See also www.fromoldbooks.org, our sister site, with hundreds and hundreds of antique images.