Scrape a freely available dictionary using Tesseract (Crimean Tatar and Russian) [0]
completed by: Pylypchuk Ljudmyla
mentors: Jonathan Washington, Francis Tyers
Use Tesseract to scrape a freely available dictionary that exists in some image format (PDF, DjVu, etc.). Be sure to scrape grammatical information if available, as well as stems (e.g., some dictionaries might provide entries like АЗНА·Х, where the stem is азна), and all possible translations. Ideally the output should be dumped into something resembling bidix format, but if there is no grammatical information and no way to guess at it, a flat machine-readable format is fine.
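The following Python sketch illustrates one way the post-OCR step might look; it is not the method used for the completed task. It assumes a hypothetical entry layout of "HEADWORD pos. translation, translation" on a single line, a made-up mapping from Russian part-of-speech abbreviations to bidix-style tags (POS_TAGS), and placeholder translations (перевод1, перевод2). The function names ocr_page, parse_line and to_bidix are illustrative only. The OCR call relies on the pytesseract and Pillow packages and on the Russian traineddata being installed; real dictionaries will need a layout-specific pattern, and OCR may render the stem separator as a different dot character.

```python
import re
from xml.sax.saxutils import escape

# Assumed layout (hypothetical): "HEADWORD pos. translation1, translation2".
# The part-of-speech abbreviation is optional.
ENTRY_RE = re.compile(
    r"^(?P<head>\S+)\s+(?:(?P<pos>[^\s.]+)\.\s+)?(?P<trans>.+)$"
)

# Assumed mapping from dictionary abbreviations to bidix-style tags.
POS_TAGS = {"сущ": "n", "гл": "vblex", "прил": "adj"}


def ocr_page(image_path, langs="rus"):
    """Run Tesseract on one page image and return the recognised text."""
    # Imported lazily so the parsing code below works without OCR installed.
    import pytesseract
    from PIL import Image
    return pytesseract.image_to_string(Image.open(image_path), lang=langs)


def parse_line(line):
    """Split one OCR'd line into lemma, stem, POS and translations."""
    m = ENTRY_RE.match(line.strip())
    if m is None:
        return None
    head = m.group("head")
    # A middle dot marks the stem boundary, e.g. АЗНА·Х has the stem азна.
    stem = head.split("·", 1)[0].lower()
    lemma = head.replace("·", "").lower()
    translations = [t.strip() for t in re.split(r"[,;]", m.group("trans"))
                    if t.strip()]
    return {"lemma": lemma, "stem": stem, "pos": m.group("pos"),
            "translations": translations}


def to_bidix(entry):
    """Return bidix-like <e> lines, or tab-separated lines if POS is unknown."""
    tag = POS_TAGS.get(entry["pos"])
    if tag is None:
        # Flat fallback: lemma, stem, translation separated by tabs.
        return ["\t".join([entry["lemma"], entry["stem"], t])
                for t in entry["translations"]]
    sym = '<s n="%s"/>' % tag
    lines = []
    for t in entry["translations"]:
        left = escape(entry["lemma"]).replace(" ", "<b/>")
        right = escape(t).replace(" ", "<b/>")
        lines.append("<e><p><l>%s%s</l><r>%s%s</r></p></e>"
                     % (left, sym, right, sym))
    return lines


if __name__ == "__main__":
    # Placeholder line; in practice the lines would come from ocr_page(...).
    for out in to_bidix(parse_line("АЗНА·Х сущ. перевод1, перевод2")):
        print(out)
```

Falling back to tab-separated output when no part of speech is recognised mirrors the task's allowance for a flat format when grammatical information is missing.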