Aspell word lists / dictionaries for several languages

“GNU Aspell, usually called just Aspell, is a free software spell checker designed to replace Ispell. It is the standard spell checker for the GNU software system. ... Dictionaries for it are available for about 70 languages.”

And since it is a spell checker, it comes with huge word lists embedded in its dictionaries.

For various purposes, one might need to see such a word list. But the Aspell dictionary files, stored as *.rws files in:
/usr/lib/aspell and /var/lib/aspell
are in a compiled format and cannot simply be opened and read as text.

To list the words in your default dictionary you can use:
aspell dump master
In my case it shows a list of English words, since this is my main dictionary file.
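The full dump is long, so it can help to peek at the first few entries and count them before doing anything else. Here is a minimal sketch; count_words is a helper name I made up for illustration, and the aspell calls only run if aspell is installed:

```shell
# count_words (hypothetical helper): print how many non-empty lines arrive on stdin.
count_words() {
    grep -c .
}

# Only touch aspell if it is actually installed.
if command -v aspell >/dev/null 2>&1; then
    aspell dump master | head -n 10      # first ten entries of the default dictionary
    aspell dump master | count_words     # total number of entries
fi
```

The count depends on which dictionary package and version you have installed, so expect it to differ between machines.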

To list a different dictionary try:
aspell -l de dump master
This will list the German dictionary, as I’ve indicated with the -l parameter (de). If you don’t have this language/dictionary installed, you’ll receive an error message:
“Error: The file "/usr/lib/aspell/de" can not be opened for reading.”

You can install additional language packages with:
sudo apt-get install aspell-de
See a list of dictionaries (and their language codes) here.
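To avoid the error message, you can check which dictionaries are installed before dumping one. The sketch below relies on "aspell dump dicts", which prints the installed dictionary names one per line; has_dict is a hypothetical helper of mine, not part of Aspell:

```shell
# has_dict CODE (hypothetical helper): succeed if CODE appears as a whole line on stdin.
# Intended to be fed the output of `aspell dump dicts`.
has_dict() {
    grep -qxF -- "$1"
}

# Only touch aspell if it is actually installed.
if command -v aspell >/dev/null 2>&1; then
    if aspell dump dicts | has_dict de; then
        aspell -l de dump master | head -n 5   # first five German entries
    else
        echo "German dictionary missing; try: sudo apt-get install aspell-de" >&2
    fi
fi
```

The match is on the exact name, so "de" will not be satisfied by a variant such as "de_DE" alone.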

For some languages (e.g. Dutch), the words are ‘compressed’ using affixes. You’ll get some weird output, with affix flags after a slash:
aspell -l nl dump master
Shows:

bloot/G
blootte
blote/N

To expand the word list and see the words in plain text, use:
aspell -l nl dump master | aspell -l nl expand
You now get:

bloot gebloot blootte

To put the words on separate lines try:
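aspell -l nl dump master | aspell -l nl expand | tr ' ' '\n'
The last command replaces every space with a newline, so be careful with entries made of several words (frequent in some languages): they will be split as well. Note the plain single quotes; the shell does not treat curly quotes as quoting characters.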

To save the list to a text file redirect like:
aspell -l nl dump master | aspell -l nl expand | tr ' ' '\n' > dump-nl.txt
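Since expanded forms can appear more than once, you may want the saved list sorted and without duplicates. This is a sketch of how I would wrap the pipeline; split_words and dump_wordlist are names I made up, and the sort uses the C locale (an assumption on my part) so that ordering and duplicate detection are byte-exact:

```shell
# split_words (hypothetical helper): put each space-separated word on its own line,
# then sort and drop duplicates. tr -s also squeezes any resulting empty lines.
split_words() {
    tr -s ' ' '\n' | LC_ALL=C sort -u
}

# dump_wordlist LANG (hypothetical helper): expanded word list for LANG,
# one word per line. Requires aspell and the dictionary for LANG.
dump_wordlist() {
    aspell -l "$1" dump master | aspell -l "$1" expand | split_words
}

# Usage, once the Dutch dictionary is installed:
#   dump_wordlist nl > dump-nl.txt
```

The same caveat applies as with the plain tr version: anything separated by a space ends up on its own line.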

Be careful how you use these word lists since most of the Aspell dictionaries are copyrighted. See the Aspell page for more details.

Radu Motisan
