Seconding this. Keep in mind you're not in a "normal" C environment when dealing with GPUs. It's a profoundly negative experience dealing with chip manufacturers (Nvidia) that push proprietary hardware and actively thwart your efforts at producing stable and correct code. It made me realize why Linus had such animosity towards them.
He is like that in real life, too... I am in his department, and every single social event ends with him surrounded by a group of people, listening to his hilarious rants. At the same time, he's a great teacher, too. Somehow you learn things while listening to 90 minutes of his stand-up comedy.
I would choose the framework according to your goals - OpenNMT-py is very research-oriented and hackable. It supports Transformer, copy-attention, image/speech/text2text and more.
If you are more production-focused, maybe MarianNMT or OpenNMT-tf are for you.
If they want to translate some input into sensible output, they would be better off using a translation model with two LSTMs (an encoder and a decoder, plus attention) and a ton of data. See: https://github.com/harvardnlp/seq2seq-attn