You can look at examples from DyNet (https://dynet.readthedocs.io/en/latest/cpp_basic_tutorial.ht...) or the C++ PyTorch bindings (https://pytorch.org/cppdocs/). I would strongly recommend Python, though, as most deep learning work is done with it.
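
For a sense of the Python side, here is a minimal training-step sketch in PyTorch; the model, sizes, and random data are made-up placeholders for illustration, not anything from the linked tutorials:

    import torch
    import torch.nn as nn

    # Toy regression model; layer sizes and data are placeholders.
    model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.MSELoss()

    x = torch.randn(64, 10)   # 64 random input vectors
    y = torch.randn(64, 1)    # random targets

    for step in range(100):
        optimizer.zero_grad()            # clear gradients from the previous step
        loss = loss_fn(model(x), y)      # forward pass + loss
        loss.backward()                  # backpropagate
        optimizer.step()                 # update parameters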


Seconding this. Keep in mind you're not in a "normal" C environment when dealing with GPUs. It's a profoundly negative experience dealing with chip manufacturers (Nvidia) that push proprietary hardware and actively thwart your efforts at producing stable and correct code. It made me realize why Linus had such animosity towards them.


He's like that in real life, too. I'm in his department, and every single social event ends with him surrounded by a group of people listening to his hilarious rants. He's also a great teacher: somehow you learn things while listening to 90 minutes of his stand-up comedy.


He was like that at MSR also. Microsoft lost someone great when he moved to Harvard.


I would choose the framework according to your goals: OpenNMT-py is very research-oriented and hackable; it supports Transformer, copy attention, image/speech/text2text, and more. If you are more production-focused, MarianNMT or OpenNMT-tf may be a better fit.


If they want to translate some input into sensible output, they would be better off using a translation model with two LSTMs (an attentional encoder-decoder) and a ton of data. See: https://github.com/harvardnlp/seq2seq-attn
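
For a rough idea of what that model looks like, here is a minimal PyTorch sketch of an attentional encoder-decoder (two LSTMs plus dot-product attention). The linked repo is a full, separate implementation; the class name, hyperparameters, and attention variant below are just illustrative assumptions:

    import torch
    import torch.nn as nn

    class Seq2SeqAttn(nn.Module):
        """Sketch of two LSTMs (encoder, decoder) with dot-product attention.
        Vocab sizes and dimensions are placeholders."""
        def __init__(self, src_vocab=1000, tgt_vocab=1000, emb=128, hid=256):
            super().__init__()
            self.src_emb = nn.Embedding(src_vocab, emb)
            self.tgt_emb = nn.Embedding(tgt_vocab, emb)
            self.encoder = nn.LSTM(emb, hid, batch_first=True)
            self.decoder = nn.LSTM(emb, hid, batch_first=True)
            self.out = nn.Linear(2 * hid, tgt_vocab)  # decoder state + context

        def forward(self, src, tgt):
            enc_out, state = self.encoder(self.src_emb(src))      # (B, S, H)
            dec_out, _ = self.decoder(self.tgt_emb(tgt), state)   # (B, T, H)
            # Score each source position against each target position.
            scores = torch.bmm(dec_out, enc_out.transpose(1, 2))  # (B, T, S)
            weights = torch.softmax(scores, dim=-1)
            context = torch.bmm(weights, enc_out)                 # (B, T, H)
            return self.out(torch.cat([dec_out, context], dim=-1))

    # Usage with random token ids, just to show the shapes.
    model = Seq2SeqAttn()
    src = torch.randint(0, 1000, (4, 12))   # batch of 4 source sentences, length 12
    tgt = torch.randint(0, 1000, (4, 9))    # teacher-forced target prefix, length 9
    logits = model(src, tgt)                # (4, 9, 1000)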

They are also not crediting the source of the graphic they are using, which comes from: http://colah.github.io/posts/2015-08-Understanding-LSTMs/

