Seconding this. Keep in mind you're not in a "normal" C environment when dealing with GPUs. It's a profoundly negative experience dealing with chip manufacturers (Nvidia) that push proprietary hardware and actively thwart your efforts at producing stable and correct code. It made me realize why Linus had such animosity towards them.
He is like that in real life, too... I am in his department, and every single social event ends with him surrounded by a group of people, listening to his hilarious rants. At the same time, he's a great teacher, too. Somehow you learn things while listening to 90 minutes of his stand-up comedy.
I would choose the framework according to your goals - OpenNMT-py is very research-oriented and hackable. It supports Transformer, copy-attention, image/speech/text2text and more.
If you are more production-focused, maybe MarianNMT or OpenNMT-tf are for you.
If they want to translate some input into sensible output, they would be better off using a translation model with two LSTMs (an encoder and a decoder, plus attention) and a ton of data. See: https://github.com/harvardnlp/seq2seq-attn