Create tagged corpus of Armenian from EANC
completed by: conor-f
mentors: Francis Tyers, Jonathan
http://eanc.net/EANC/library/library.php?interface_language=en
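The library page above links to a series of texts. Each word is wrapped in a span whose attribute holds tab-separated fields: the lemma with its part of speech in parentheses, the morphological features, and an English gloss. A word with several analyses lists them one after another in the same attribute. The texts use the following HTML format: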
<span titles="գիշեր (N inanim)	sg,dat,def	night">Գիշերվան</span>
<span titles="մութ (N inanim)	sg,nom,def	dark, obscure, vague">մութը</span>
<span titles="գետ (N inanim)	sg,gen,nmlz,def	river գետին (N inanim)	sg,nom,def	earth, soil">գետինն</span>
<span titles="առնել (V tr)	cvb,pfv	take, buy">առել</span>
<span titles="է (V intr)	past,sg,3	be">էր</span>:
The objective of this task is to convert this markup to the 'lttoolbox' analysis format, with one ^surface/analysis$ unit per word and alternative analyses separated by '/', like this:
^Գիշերվան/գիշեր<n><nn><sg><dat><def>$
^մութը/մութ<n><nn><sg><nom><def>$
^գետինն/գետ<n><nn><sg><gen><nmlz><def>/գետին<n><nn><sg><nom><def>$
^առել/առնել<vblex><tv><cvb><pfv>$
^էր/է<vblex><iv><past><sg><3>$
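The conversion shown above could be sketched in Python roughly as follows. This is a minimal illustration, not the submitted solution. It rests on several assumptions: the fields inside the attribute are separated by whitespace (tabs in the examples), the part-of-speech label in parentheses starts with an uppercase Latin letter, and glosses never contain a parenthesised label of that shape. The names POS_TAGS, SUBCAT_TAGS, analysis_to_lt, span_to_lt and convert are hypothetical, and the two mapping tables cover only the labels that occur in the five examples. Any other label raises a KeyError so that the tables can be extended deliberately.

```python
import html
import re

# Label-to-tag mappings inferred only from the five examples above (assumption:
# the full EANC tag set is larger and these tables would need extending).
POS_TAGS = {"N": "n", "V": "vblex"}
SUBCAT_TAGS = {"inanim": "nn", "tr": "tv", "intr": "iv"}

# Accepts both `titles=` (as shown above) and `title=` as the attribute name.
SPAN_RE = re.compile(r'<span titles?="([^"]*)">([^<]*)</span>')
# One analysis: lemma, "(POS subcategory...)", then comma-separated features.
# The gloss that follows is not captured and is therefore dropped.
ANALYSIS_RE = re.compile(r'(\S+)\s+\(([A-Z][^)]*)\)\s+(\S+)')


def analysis_to_lt(lemma, pos_label, features):
    """Turn one EANC analysis into lemma<tag><tag>... form."""
    pos, *subcats = pos_label.split()
    tags = [POS_TAGS[pos]] + [SUBCAT_TAGS[s] for s in subcats]
    tags += features.split(",")
    return lemma + "".join(f"<{tag}>" for tag in tags)


def span_to_lt(title, surface):
    """Build ^surface/analysis1/analysis2$ from one span's attribute and text."""
    matches = ANALYSIS_RE.findall(html.unescape(title))
    analyses = [analysis_to_lt(*m) for m in matches]
    return "^" + "/".join([surface] + analyses) + "$"


def convert(html_text):
    """Convert every span found in html_text, in document order."""
    return [span_to_lt(title, surface)
            for title, surface in SPAN_RE.findall(html_text)]


if __name__ == "__main__":
    sample = '<span titles="է (V intr)\tpast,sg,3\tbe">էր</span>'
    for unit in convert(sample):
        print(unit)  # prints ^էր/է<vblex><iv><past><sg><3>$
```

A regular expression is used here only because the span markup in the examples is flat. For whole pages, a real HTML parser would likely be more robust.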