Hacker News
Chinese Text Project (ctext.org)
192 points by peterburkimsher on Jan 15, 2018 | 19 comments



ctext is awesome. Here's a link to their Yi Jing (Book of Changes): http://ctext.org/book-of-changes

I volunteer as a sysadmin for wengu (http://wengu.tartarie.com/wg/wengu.php). We cover some of the Chinese Classics.

In the coming years, I intend to redo wengu in sphinx-doc + reStructuredText and make the texts available on an updated website and as generated PDFs. There may be some copyright issues with Wilhelm's Yi Jing translation, though. I think there's still a few years left.

I am also creating a spiritual successor to Christoph Burgmer's cjklib (https://github.com/cburgmer/cjklib). It's called cihai; you can find it at https://cihai.git-pull.com.


> There may be some copyright issues with Wilhelm's Yi Jing translation, though. I think there's still a few years left.

I think there are more than a few years. Wilhelm's translation is probably out of copyright by now, but Baynes's English translation of it is only about 60 years old (too busy to look up the actual dates). I keep wondering if some kind soul who speaks German and English would translate Wilhelm's work, but so far I haven't seen anything. If anyone wants a non-technical side project...


Pingtype's keyboard uses character decomposition data that I generated myself based on cjk-decomp-0.4.0.txt.

https://cjkdecomp.codeplex.com

Just type a character in the text box (using handwriting recognition), then if the character isn't the right one, click on the parts to break it up and rebuild the one you wanted. Then click Insert to copy it to the text translation area, where it can be translated into pinyin, bopomofo, etc.
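A minimal sketch of the break-up-and-rebuild idea, for readers who want to see it in code. It assumes a tiny hand-written table of decompositions; the table, `parts_of`, and `candidates_with` are hypothetical names for illustration, not Pingtype's code and not the actual contents or format of cjk-decomp-0.4.0.txt.

```python
# Toy decomposition table, hand-written for illustration only.
# It is NOT taken from cjk-decomp-0.4.0.txt; the real data is far larger
# and its file format is not reproduced here.
DECOMP = {
    "好": ("女", "子"),
    "明": ("日", "月"),
    "林": ("木", "木"),
    "休": ("亻", "木"),
    "汉": ("氵", "又"),
}


def parts_of(char):
    """Break a character up: return its immediate parts, or () if unknown."""
    return DECOMP.get(char, ())


def candidates_with(*parts):
    """Rebuild: return the characters whose parts include every requested part."""
    wanted = set(parts)
    return sorted(c for c, comps in DECOMP.items() if wanted <= set(comps))


if __name__ == "__main__":
    # Wrong character recognized? Break it up, keep one part, list alternatives.
    print(parts_of("林"))
    print(candidates_with("木"))
```

A real keyboard would presumably recurse into parts of parts and rank candidates by frequency; this sketch only does one level of lookup in each direction.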


I love the UI of Wengu, especially how nice the column formatting is. The a:hover providing dictionary info is nice, except it doesn't do that in context, which would require a totally different underlying data model (I have a prototype doing that).

Dr Sturgeon's work on ctext.org is amazing. But I find the lack of an open OCR model and scripts a bit sad, as this part of the work cannot be reused for other projects.


Wow. Love the Ctext categorization. Wish I'd known about this during undergrad. Wengu's nice too with the line by line translation.


If you have parallel translations of text, I wrote Pingtype to show the word-by-word literal translation and pinyin.

https://pingtype.github.io

For example, this page shows a random Bible verse every time you refresh. Or see the YouTube link from the heading row.

https://pingtype.github.io/verse.html

I'd like to include CText's data, but there's a strict warning against scraping their site.


this is very cool, thanks for making this!


Thank you for the encouragement!

I'm now working on more data sources for it. On my laptop I've collected 32,000 Christian song lyrics that I can sing in church. Running a search feature without PHP is a little complicated though.

My next project will be to write a chatbot using example dialogues. I need to provide data in CSV format, with one line for each question/answer. I think textbooks and cartoon subtitles have some dialogues, but if you have other ideas of data sources, please let me know.
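A rough sketch of what that CSV could look like, assuming two columns (question, then answer) and no header row. The column layout, the sample dialogue lines, and the helper names `pairs_to_csv` and `csv_to_pairs` are all assumptions made up for illustration, not the format any particular chatbot tool requires.

```python
import csv
import io

# Hypothetical dialogue pairs; real ones would come from textbooks,
# subtitles, or similar sources.
PAIRS = [
    ("你好吗?", "我很好, 谢谢."),
    ("What is your name?", "My name is Ping."),
]


def pairs_to_csv(pairs):
    """Serialize (question, answer) pairs as CSV text, one pair per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for question, answer in pairs:
        # csv.writer quotes fields that contain commas or quote characters.
        writer.writerow([question, answer])
    return buf.getvalue()


def csv_to_pairs(text):
    """Parse CSV text back into a list of (question, answer) tuples."""
    return [tuple(row) for row in csv.reader(io.StringIO(text))]


if __name__ == "__main__":
    print(pairs_to_csv(PAIRS), end="")
```

Using the `csv` module rather than joining strings by hand matters here because dialogue lines often contain commas and quotation marks.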


Awesome project.

As a Chinese person, I always want to know more about our history and literature.

I learned a small portion of it at school from textbooks, but it's very hard to fully understand some of them without knowing the related context. A website like this is surely helpful for doing so.

BTW: Digging through a few pages, I found one article[0] that I had to recite and write out from memory in class when I was a middle schooler. That was a really bad experience back then :D. But the article itself is deep and worth a read (if you know ancient Chinese, of course).

[0] http://ctext.org/wiki.pl?if=en&chapter=428029


Does anyone have ideas on how these features can be made to work on mobile?

For example, at http://ctext.org/dictionary.pl?if=en&id=1102 you can hover over the words to see meanings, or click on them to see in the dictionary. What is the analogue on phones, for “hover” versus “click”? What does the ctext project think about phones? (For context, I'm hoping to do something similar for Sanskrit someday…)


On my Note phone I hover the S Pen above the text.


Incredible!

Are there similar websites for other languages?



A new version of Perseus will be launching in two months. I have been working on it with my team at Eldarion. It will be open sourced as well.


Perseus is priceless! They usually have a translation or two to help you and maybe a commentary thrown in. Their word tools have helped me a lot.


Perseus also has (ancient) Greek of course



I wish they would add a feature: original text on the left half of the page, translation on the right half. I tried to read the pre-Han Chinese medicine texts and wished this were available.

Pretty neat job overall!


Great job! This is what I have been looking for for a long time.



