One thing that struck me is that even the largest online dictionaries (Oxford, Wiktionary) are totally inadequate; I'm continually adding new words. Another thing I noticed is that spaces don't always fall between words! Chinese doesn't have spaces at all, so I wrote a word spacing tool. When I rewrote the app for English, I thought I could just split on the space character, but I can't.
Many verbs wrap around nouns, e.g. "put [the phone] down". The source dictionary data has a definition for "put something down", and my program supports a special word, "something", which makes it look ahead for the second half of the phrase.
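A rough sketch of how that kind of look-ahead could work. This is not Pingtype's actual code: `match_phrase`, `PLACEHOLDER`, and the `MAX_GAP` limit are hypothetical names and assumptions made up for illustration, and it only handles one "something" per dictionary entry.

```python
PLACEHOLDER = "something"  # the special word in dictionary entries
MAX_GAP = 4                # assumed cap on how many words it may stand for


def match_phrase(tokens, start, pattern):
    """Match a dictionary phrase against tokens beginning at index `start`.

    Returns (end_index, filler_words) on success, or None if no match.
    Supports at most one placeholder per pattern.
    """
    if PLACEHOLDER not in pattern:
        end = start + len(pattern)
        return (end, []) if tokens[start:end] == pattern else None

    split = pattern.index(PLACEHOLDER)
    head, tail = pattern[:split], pattern[split + 1:]
    gap_start = start + len(head)
    if tokens[start:gap_start] != head:
        return None

    # Look ahead: let the placeholder absorb 1..MAX_GAP words, shortest first.
    for gap in range(1, MAX_GAP + 1):
        tail_start = gap_start + gap
        tail_end = tail_start + len(tail)
        if tail_end > len(tokens):
            break
        if tokens[tail_start:tail_end] == tail:
            return tail_end, tokens[gap_start:tail_start]
    return None


tokens = "please put the phone down now".split()
print(match_phrase(tokens, 1, "put something down".split()))
# prints (5, ['the', 'phone'])
```

The filler words ("the phone") come back separately, so they can still be looked up on their own and shown in their original position without reordering the sentence.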
Pingtype doesn't reorder the sentence, because it's intended for education. But it's easy to train, unlike machine learning models.
I don't think it's possible to translate most languages word for word, even at a trivial level, except for the most closely related ones, since their structures are so different.
You can still solve those with a dictionary of translations and, for the missing words, a sub-word embedding system that gleans the meaning from the phonetic form.
https://pingtype.github.io
I also made an English-to-Chinese version, https://pingtype.github.io/english.html