Demo & Links
This is Japana, a Japanese text analysis tool built on Jim Breen's JMdict and KanjiDic2, and on MeCab.
You provide a text: a novel, short story, song lyrics, article, sentence, etc. The program returns a word list ordered by frequency. Each entry in the list is a dictionary containing at least the word and its number of occurrences (frequency) and, depending on the configuration, its pronunciation and meaning.
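To make the shape of the returned word list concrete, here is a minimal sketch of frequency counting over tokens that have already been segmented. It is not Japana's actual implementation: the function name `build_word_list`, the keys `word` and `frequency`, and the sample tokens are assumptions made for illustration, and a real run would obtain the tokens from MeCab.

```python
from collections import Counter


def build_word_list(tokens, ascending=False):
    """Return a list of dicts, one per unique word, ordered by frequency.

    `tokens` is assumed to be an already-segmented list of words.
    The key names are hypothetical, not Japana's confirmed output format.
    """
    counts = Counter(tokens)
    # sorted() is stable, so words with equal counts keep first-seen order.
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=not ascending)
    return [{"word": word, "frequency": freq} for word, freq in ordered]


# Hypothetical pre-tokenized input.
tokens = ["猫", "犬", "猫", "食べる", "猫", "犬"]
word_list = build_word_list(tokens)
```

Passing `ascending=True` reverses the order, mirroring the ascending option described for the command-line interface.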
Japana started as a library, but it was later made into a web application, a desktop application, and a command-line interface. I am currently working on the mobile version!
To install the library, all you need is to run the command:
pip install japana
Go to the official Jamdict project for further instructions.
python3 japana-cli.py [-h] [-asc] [-dic] [-k] [-i] [-o OUTPUT] FILEPATH
|FILEPATH||file path to text file|
|-h, --help||show this help message and exit|
|-asc, --ascending||sort by frequency in ascending order (default descending)|
|-dic, --dictionary||add dictionary definitions|
|-k, --kana||count kana words|
|-i, --include||include words with no definitions|
|-o OUTPUT, --output OUTPUT||the full file path where the output will be saved, default is output/output.txt|
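The options in the table above could be declared roughly as follows. This is a sketch that assumes the CLI is built with Python's standard `argparse` module, which the source does not confirm. The flag names, help texts, and default output path are taken from the table; the function name `build_parser` is hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (assumed, not Japana's own code)."""
    parser = argparse.ArgumentParser(prog="japana-cli.py")
    parser.add_argument("filepath", metavar="FILEPATH", help="file path to text file")
    parser.add_argument("-asc", "--ascending", action="store_true",
                        help="sort by frequency in ascending order (default descending)")
    parser.add_argument("-dic", "--dictionary", action="store_true",
                        help="add dictionary definitions")
    parser.add_argument("-k", "--kana", action="store_true", help="count kana words")
    parser.add_argument("-i", "--include", action="store_true",
                        help="include words with no definitions")
    parser.add_argument("-o", "--output", default="output/output.txt",
                        help="the full file path where the output will be saved")
    return parser


# Example: request definitions and kana counting for file.txt.
args = build_parser().parse_args(["-dic", "-k", "file.txt"])
```

Each boolean flag defaults to `False` and the output path falls back to `output/output.txt` when `-o` is omitted, matching the defaults described in the table.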
input file path: file.txt
file will be written at: output/output.txt
unique kanji count: 80
total word count: 261
unique word count: 62
|███████████████████████| 100.0% words has been done
writing to file ..
done
Japana web provides an easier interface for users. The output is returned as a table and as a CSV file to help integration with language learning tools such as Anki. Users can log in to save and manipulate their lists.
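As an illustration of the CSV output, the sketch below serializes a word list with Python's standard `csv` module. The column names and their order are assumptions, since the actual layout of Japana's CSV file is not specified here, and `word_list_to_csv` is a hypothetical helper rather than part of Japana.

```python
import csv
import io


def word_list_to_csv(word_list,
                     fieldnames=("word", "frequency", "pronunciation", "meaning")):
    """Serialize a list of word dicts to CSV text (hypothetical column layout)."""
    buffer = io.StringIO()
    # restval="" fills in keys that an entry lacks, e.g. a word without a definition.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="",
                            extrasaction="ignore")
    writer.writeheader()
    writer.writerows(word_list)
    return buffer.getvalue()


# Hypothetical sample entries.
rows = [
    {"word": "猫", "frequency": 3, "pronunciation": "ねこ", "meaning": "cat"},
    {"word": "犬", "frequency": 2},
]
csv_text = word_list_to_csv(rows)
```

Entries that lack a pronunciation or meaning are written with empty cells, so every row keeps the same number of columns.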
Japana web also provides a REST API for developers and, in the future, for the mobile application to use.
I would like to make a mobile version of the app. I also want to add OCR so that users can upload images containing text, such as manga and comic book pages, and extract words from them.