Dictionaries, language tools and free software
Decoder for Lingea Dictionary (.trd) binary format
Converts into Stardict, MobiPocket (Nokia Symbian cellphones, devices like Psion, Blackberry, Windows Mobile, Palm) and weDict (Apple iPhone)
Use the script only on legally bought dictionaries!
To use this software you must be the legal owner of a license for the dictionary!
If you are not, buy the dictionary!
Read the license! It probably limits usage to one computer and one user! The produced dictionary has the same copyright as the original. You are not the owner of the data - you have to follow the license!
You are not allowed to distribute the produced dictionaries. The script is for personal use only. No warranty. GNU GPL License.
Created for devices/platforms not supported by Lingea, like PowerPC Linux, Psion 5MX, Apple iPhone, Nokia cellphones with Symbian, ...
The decoding follows Nomad's specification; my work was done only by guessing and by reverse engineering the .trd file, NOT BY CRACKING LINGEA BINARIES. This software was NOT written by Lingea s.r.o. and has no connection to that company.
Copyright © 2007 Klokan Petr Přidal.
Lingea Dictionary Format
The description of the binary format of Lingea dictionaries comes from CobuildConv, written by Nomad in Japanese. There is also an English translation. Tags used for records are described in Czech.
Decoder for Lingea TRD
Script for decoding a Lingea Dictionary (.trd) file. The result is a <header>\t<definition> file, easily convertible by stardict-editor (or by the tabfile command from the stardict-tools package) into a native Stardict dictionary (stardict.sf.net).
To run the decoder you need Python.
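The tab-separated output described above (a headword, one tab, then the definition on each line) can be sanity-checked before it is handed to stardict-editor or tabfile. The following is only a minimal sketch of such a check; the function name parse_tabfile_lines is hypothetical and is not part of the decoder.

```python
def parse_tabfile_lines(lines):
    """Yield (headword, definition) pairs from tab-separated lines.

    Hypothetical helper, not part of lingea-trd-decoder.py.
    Blank lines are skipped; a line without a tab or with an empty
    headword raises ValueError with its 1-based line number.
    """
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue
        # Split only on the first tab: the definition may contain more tabs.
        headword, sep, definition = line.partition("\t")
        if not sep or not headword.strip():
            raise ValueError(f"line {number}: expected <header>\\t<definition>")
        yield headword, definition
```

It takes any iterable of text lines, so an open file object works as well as a list of strings.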
We moved to GIT/SVN hosted on code.google.com. New versions will be published there; check there for the latest changes.
Version 0.6 - Petr Dlouhý, support for Spanish and French dictionaries, several bug fixes.
Version 0.5 - Almost all Lingea dictionaries are supported (series 2000, 2002 and PocketPC)! Patch 0.5 by Petr Dlouhý. Josef Říha added support for Slovak characters.
Version 0.4 - HTML output with colors! Applied new patch 0.4 by Petr Dlouhý.
Version 0.3 - Cleaned code, support for more dictionaries. Contains patch 0.3 by Petr Dlouhý.
Save the file and run:
python lingea-trd-decoder.py lg_XXXX.trd
For now, the only fully supported dictionaries are:
These dictionaries are available in bookstores, or you can order them from www.lingea.cz. The Czech-English dictionary was also included on the DVD of CHIP 10/2007.
Other Lingea dictionaries could follow (the byte streams for records are already read, so the only problem is decoding the structure of more complicated records). Study the source and Nomad's docs if you would like to help.
Patches for the source code are welcome.
I do NOT provide any user support for the script; it was my free-time activity, and I did it to have a nice dictionary on one of the unsupported devices (like Apple iPhone, Nokia with Symbian or Psion 5mx). Let me know if you make any changes.
Stardict to Apple iPhone
The weDict application uses the Stardict format directly.
Stardict to Apple Mac OS X built-in Dictionary client
You can convert any Stardict dictionary into the format supported by the native Apple Dictionary.app. The Stardict dictionary then appears in Mac OS X 10.5+ in the Services menu, in the Dashboard, and in the standard /Applications/Dictionary.app client.
There is a Mac application, DictUnifier, and an sdconv command-line utility that do this conversion.
It expects a .bz2 archive of the dictionary, so you have to create an archive of your Lingea dictionary.
A huge collection of dictionaries is available here, so downloading one and running DictUnifier is usually enough.
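For a Lingea dictionary that you converted to Stardict yourself, the archive mentioned above has to be created by hand. The following is a minimal sketch using Python's standard tarfile module. It assumes that the expected archive is a tar.bz2 holding the Stardict files (typically .ifo, .idx and .dict or .dict.dz) inside one folder; that layout is an assumption, not something confirmed by the tools. The function name pack_stardict is hypothetical.

```python
import tarfile
from pathlib import Path


def pack_stardict(dict_dir, archive_path):
    """Pack every regular file in dict_dir into a bzip2-compressed tar.

    Hypothetical helper; the one-folder layout inside the archive is an
    assumption. archive_path should lie outside dict_dir.
    Returns the names of the files that were added.
    """
    dict_dir = Path(dict_dir)
    members = sorted(p for p in dict_dir.iterdir() if p.is_file())
    with tarfile.open(str(archive_path), "w:bz2") as archive:
        for path in members:
            # Store each file under the folder name, not its absolute path.
            archive.add(str(path), arcname=f"{dict_dir.name}/{path.name}")
    return [p.name for p in members]
```

A call such as pack_stardict("lingea-en-cz", "lingea-en-cz.tar.bz2") would then produce the archive; both names here are placeholders.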
Stardict to MobiPocket OPF (PRC, MOBI)
Script for conversion of a Stardict tabfile (<header>\t<definition> per line) into the OPF file for MobiPocket Dictionary.
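To illustrate the idea of such a conversion, here is a minimal sketch that turns tabfile lines into HTML dictionary entries. It assumes the idx:entry / idx:orth markup of MobiPocket dictionaries and passes the definition through unchanged, on the assumption that it already contains HTML (as the decoder's HTML output does). The function names are hypothetical, the real conversion scripts may emit something different, and the OPF package file itself is not shown.

```python
from html import escape


def entry_to_html(headword, definition):
    """Wrap one dictionary entry in idx:entry markup (assumed convention)."""
    return (
        "<idx:entry>"
        f"<idx:orth>{escape(headword)}</idx:orth>"
        # The definition is inserted as-is: assumed to be HTML already.
        f"<p>{definition}</p>"
        "</idx:entry>"
    )


def tabfile_to_entries(lines):
    """Convert <header>\\t<definition> lines into a list of HTML entries."""
    entries = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if "\t" not in line:
            continue  # skip blank or malformed lines
        headword, definition = line.split("\t", 1)
        entries.append(entry_to_html(headword, definition))
    return entries
```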
Scripts for conversion:
The produced .opf dictionary must be converted to a binary file with the extension .mobi (= .prc) by running:
(wine) mobigen.exe DICTIONARY.opf
mobigen.exe is available at:
The MobiPocket project supports these platforms: Palm OS, Windows Mobile, Symbian (Series 60, Series 80, 90, UIQ), Psion, Blackberry, Franklin, iLiad (by iRex), BenQ-Siemens, Pepper Pad.
Download the Reader
I had an idea to improve existing free software dictionary applications and create a new version (successor) of the DictOSX project, or even merge DictOSX and Stardict and submit some patches to the Stardict tree. But because it is a free-time (unpaid) activity, don't expect miracles soon.
There is a simple proof of concept done in Python/GTK called GNU Lexicon, but I think the only things worth reusing from that project are the idea of a better GUI for Stardict and the use of regular expressions for converting text-based dictd/Stardict dictionaries into HTML form. I planned to include native support for WordNet and for inflection, so I did some research into freely available natural language tools.
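The regular-expression idea can be sketched roughly as follows. This is not code from GNU Lexicon; the two text conventions it handles (pronunciation in square brackets, cross-references in curly braces) and the resulting HTML are assumptions chosen only for illustration, and the names RULES and text_to_html are hypothetical.

```python
import re
from html import escape

# Ordered (pattern, replacement) rules; both conventions are assumptions.
RULES = [
    # [pronunciation] -> styled span, brackets kept
    (re.compile(r"\[([^\]]+)\]"), r'<span class="pron">[\1]</span>'),
    # {word} -> link to the entry for that word
    (re.compile(r"\{([^}]+)\}"), r'<a href="#\1">\1</a>'),
]


def text_to_html(text):
    """Escape plain dictionary text, then apply each rule in order."""
    html = escape(text, quote=False)
    for pattern, replacement in RULES:
        html = pattern.sub(replacement, html)
    return html
```

Escaping first keeps any literal & or < in the source text from being mistaken for markup; a real dictionary would need its own rule set.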
Czech and European free natural language tools
Most used words:
- German, English, French, Dutch: the top 100, 1000 and 10000 words, ordered by frequency
- Oxford 3000 and other wordlists
- Czech National Corpus frequency analysis. There is also a CD from the publication: Čermák, F. - Křen, M. (eds.): Frekvenční slovník češtiny. Nakladatelství Lidové noviny, Praha 2004. ISBN 80-7106-676-1
I was also thinking about embedding some free speech synthesis, like Festival, on platforms where native synthesis is not supported. BTW, there is also the EPOS synthesis and a free Czech synthesis based on Festival. It should work the same way as the commercial voices from Infovox iVox do.
Do you know about some freely available tools? Let me know. ;-)