I am Reem Alghamdi, an enthusiastic programmer and a computer science student. My strongest skills are in web and mobile development using Java, Python and JavaScript. In the future, I would like to learn Swift and dive deeper into machine learning, especially recommender systems and classifiers.

Japana

A Japanese text analysis tool. It takes Japanese text from a file, from a URL or as direct input, and returns the list of words in the text ordered by frequency, together with their definitions, pronunciation and JLPT level.

Demo & Links


This is Japana, a Japanese text analysis tool built on Jim Breen's JMdict and KanjiDic2 and on MeCab.

You need to provide a text: a novel, short story, song lyrics, article, sentence, etc. The program returns a word list ordered by frequency. Each dictionary in the list has at least the word and its number of occurrences (frequency) and, depending on the configuration, its pronunciation and meaning.
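The shape of that word list can be sketched in a few lines of Python. This is only an illustration, not Japana's actual code: it assumes the text has already been split into words (the step MeCab performs), and the function name `build_word_list` and the keys `"word"` and `"frequency"` are hypothetical.

```python
from collections import Counter


def build_word_list(tokens, ascending=False):
    """Count tokens and return one dictionary per unique word.

    Hypothetical helper: the name and the dictionary keys are assumptions
    made for this sketch, not Japana's real API.
    """
    counts = Counter(tokens)
    # sorted() is stable, so words with equal counts keep first-seen order.
    ordered = sorted(counts.items(), key=lambda item: item[1],
                     reverse=not ascending)
    return [{"word": word, "frequency": count} for word, count in ordered]


# Made-up, already tokenized input.
tokens = ["猫", "が", "好き", "猫", "は", "かわいい", "猫"]
word_list = build_word_list(tokens)
print(word_list[0])
```

Pronunciation and meaning would be added to each dictionary in a later dictionary-lookup step, which this sketch leaves out.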

Japana was a library at first, but it was later made into a web application, a desktop application and a command-line interface. Currently, I am working on the mobile version!

Japana Library

To install the library, all you need is to run the command

    pip install japana

in your venv, env or virtual_env.

Go to the official Jamdict project for further instructions.

Japana CLI

Usage

    python3 japana-cli.py [-h] [-asc] [-dic] [-k] [-i] [-o OUTPUT] FILEPATH

required arguments:

Option                      Description
FILEPATH                    file path to text file


optional arguments:

Option                      Description
-h, --help                  show this help message and exit
-asc, --ascending           sort by frequency in ascending order (default descending)
-dic, --dictionary          add dictionary definitions
-k, --kana                  count kana words
-i, --include               include words with no definitions
-o OUTPUT, --output OUTPUT  the full file path where the output will be saved, default is output/output.txt

Sample output

    input file path:  file.txt
    file will be written at:  output/output.txt
    unique kanji count:  80
    total word count:  261
    unique word count:  62
     |███████████████████████| 100.0% words has been done
    writing to file ..
    done
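For readers curious how a command line with these options can be declared, here is a minimal `argparse` sketch. It mirrors the flags and help texts listed in this section, but it is a reconstruction under assumptions, not the real `japana-cli.py`: the function name `build_parser` is hypothetical, and the example file names are made up.

```python
import argparse


def build_parser():
    """Declare the options documented above (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(prog="japana-cli.py")
    parser.add_argument("FILEPATH", help="file path to text file")
    parser.add_argument("-asc", "--ascending", action="store_true",
                        help="sort by frequency in ascending order "
                             "(default descending)")
    parser.add_argument("-dic", "--dictionary", action="store_true",
                        help="add dictionary definitions")
    parser.add_argument("-k", "--kana", action="store_true",
                        help="count kana words")
    parser.add_argument("-i", "--include", action="store_true",
                        help="include words with no definitions")
    parser.add_argument("-o", "--output", default="output/output.txt",
                        help="the full file path where the output will be saved")
    return parser


# Parse an example command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-asc", "-dic", "-o", "words.txt", "story.txt"])
print(args.FILEPATH, args.ascending, args.dictionary, args.output)
```

Boolean switches use `store_true`, so each one is off unless its flag is passed, which matches the "default descending" behaviour described for `-asc`.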

Japana Web

It provides an easier interface for users. The output is returned as a table and as a CSV file, which helps integration with language learning tools such as Anki. Users can log in to save and manipulate lists.
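As an illustration of the CSV export, the sketch below turns a word list into CSV text with Python's standard `csv` module. It is not the web application's actual code: the function name `word_list_to_csv`, the column names and the sample rows are assumptions made for the example.

```python
import csv
import io


def word_list_to_csv(word_list,
                     fieldnames=("word", "frequency", "pronunciation", "meaning")):
    """Serialize a list of word dictionaries to CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                            restval="",             # blank cell for missing keys
                            extrasaction="ignore",  # skip keys without a column
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(word_list)
    return buffer.getvalue()


# Made-up rows; the second has no pronunciation or meaning.
rows = [
    {"word": "猫", "frequency": 3, "pronunciation": "ねこ", "meaning": "cat"},
    {"word": "好き", "frequency": 1},
]
csv_text = word_list_to_csv(rows)
print(csv_text)
```

Writing blank cells for missing fields keeps every row at the same number of columns, which is generally what a flashcard import expects.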

Japana Web also provides a REST API for developers and, in the future, for the mobile application to use.

Future work

I would like to make a mobile version of the app. I also want to add OCR so that users can upload images containing written text, such as manga and comic books, and extract words from them.