





























After a gap of almost two years, I am happy to announce the second official release (version 0.2.0) of Anubadok, a free (as in freedom) machine translation system for English to Bengali. Anubadok is written in Perl and uses the Penn Treebank annotation system for natural language processing. To run Anubadok 0.2.0, you need to have the part-of-speech tagger GPoSTTL installed on your system. Anubadok can also be accessed online through the Anubadok Online interface run by Ankur.
The first official release (ver. 0.1) of Anubadok was an experimental release that mainly served as a proof of concept for an open-source English-to-Bengali machine translation system.
With the release of version 0.2.0, I am glad to upgrade its official tag from “experimental software” to “software under development” with clear and specific implementation targets. However, given the nature of the project, there are no specific time frames for future releases. Further, since machine translation is still considered an open research topic in computational linguistics, you should expect to see some surprises 😉 even in well-implemented situations, especially if you are comparing the results of machine translation with human translations.
In English, there are four types of sentences: Declarative, Imperative, Interrogative and Exclamatory. Each of these can further take one of four basic sentence structures: Simple, Compound, Complex and Compound-Complex.
The table below gives the approximate implementation status for each combination of sentence type and structure in the current release and, conversely, the targets for future implementations.
| | Declarative | Imperative | Interrogative | Exclamatory |
|---|---|---|---|---|
| Simple | W | W | W | M |
| Compound | M | M | M | M |
| Complex | N | N | N | N |
| Compound-Complex | N | N | N | N |
W: Well implemented
M: Moderately implemented
N: Not implemented, or not well implemented
Anubadok does not yet have any code to handle Complex or Compound-Complex sentences, not even at a moderate level. This is where the next development push is needed.
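For readers who want to track these targets programmatically, the status table above can be written down as a small lookup structure. This is only an illustrative Python sketch: Anubadok itself is written in Perl, and the names `STATUS`, `status_of` and `pending_targets` are hypothetical and not part of Anubadok.

```python
# Illustrative only: these names are hypothetical and not part of Anubadok.
# Status codes follow the legend above:
#   W = well implemented, M = moderately implemented, N = not / not well implemented.

SENTENCE_TYPES = ["Declarative", "Imperative", "Interrogative", "Exclamatory"]

# One row per sentence structure, one column per sentence type,
# copied from the status table for version 0.2.0.
STATUS = {
    "Simple": ["W", "W", "W", "M"],
    "Compound": ["M", "M", "M", "M"],
    "Complex": ["N", "N", "N", "N"],
    "Compound-Complex": ["N", "N", "N", "N"],
}


def status_of(structure, sentence_type):
    """Return the status code for one cell of the table."""
    return STATUS[structure][SENTENCE_TYPES.index(sentence_type)]


def pending_targets():
    """Return (structure, type, code) for every cell not yet well implemented."""
    return [
        (structure, sentence_type, code)
        for structure, row in STATUS.items()
        for sentence_type, code in zip(SENTENCE_TYPES, row)
        if code != "W"
    ]


if __name__ == "__main__":
    for structure, sentence_type, code in pending_targets():
        print(f"{structure} / {sentence_type}: {code}")
```

Filtering on `code != "W"` treats both M and N cells as future work, which matches the way the table doubles as a list of targets.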
A few other salient features of this release:
http://anubadok.sourceforge.net
The latest source code of Anubadok can be downloaded from the “trunk” branch of its SVN repository.