This page is for software developers and linguists who are looking for the code and raw data for this site.

Vocabulary data

Machine-readable versions of the vocabulary are created periodically. The following data is available:

  • XDXF exports for use in language-oriented programs. XDXF is an open standard that supports most of the data included in the vocabulary. The XDXF project provides open source tools for converting these files to other formats if your program does not accept XDXF.
  • SQL dumps for programmers who want to mirror the online dictionary, build an app, or run their own queries. These contain all of the vocabulary data, but are difficult for most people to work with.
  • Plain-text word list for use as a spell-checker or similar. This has very little data but is easier to work with.
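
As a rough illustration of the spell-checker use, here is a minimal Python sketch. It assumes the word list has one word per line in UTF-8 and that case can be ignored; neither is confirmed on this page, and the file name wordlist.txt and both function names are hypothetical.

```python
import io


def load_words(lines):
    """Build a lookup set from an iterable of lines (assumed: one word per line)."""
    return {line.strip().lower() for line in lines if line.strip()}


def unknown_words(text, vocabulary):
    """Return the words in text that are missing from the vocabulary, in order."""
    # Splitting on whitespace and trimming common punctuation is a
    # simplification; a real checker would need language-aware tokenizing.
    tokens = (tok.strip(".,;:!?\"'()").lower() for tok in text.split())
    return [tok for tok in tokens if tok and tok not in vocabulary]


# Stand-in for the downloaded file. With a real download this would be
# something like: open("wordlist.txt", encoding="utf-8")  (hypothetical name)
sample = io.StringIO("hello\nworld\n\nwater\n")
vocabulary = load_words(sample)
print(unknown_words("Hello, wrld! Water.", vocabulary))  # prints ['wrld']
```

A set is used so that each lookup stays fast however long the list is. If capitalization is meaningful in the language, drop the lower() calls.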

The data is freely licensed, but attribution is required if you use it. See the about page for licensing information.

Website code

This website runs custom code that links its pages with the vocabulary and its audio recordings.

The system is open source, and you can access it here. That page also lists known bugs.