
Ask HN: What's the best way to get tagged training data for NLP? - mrburton
I'm looking to have technical keywords tagged, such as programming languages, frameworks, etc. I would use Mechanical Turk, but since this data is very technical, I'm not sure the Turkers would be able to do it properly.

I was thinking about making a game and having technical recruiters tag words for the system. Thoughts?
======
PaulHoule
You might be able to tag enough samples yourself, particularly if you use
transfer learning and data augmentation and focus your effort on the hard cases.
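
One way to read "focus effort on the hard cases" is uncertainty sampling: train
a rough tagger on what you have, then hand-label the sentences it is least sure
about. A minimal sketch, assuming your model can give a per-token probability
distribution over labels. The function names, the label set and the
`predictions` numbers below are all made up for illustration.

```python
import math

def token_entropy(probs):
    """Shannon entropy (in nats) of one token's label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def sentence_uncertainty(token_probs):
    """Mean per-token entropy; higher means the model is less sure."""
    if not token_probs:
        return 0.0
    return sum(token_entropy(p) for p in token_probs) / len(token_probs)

def pick_hard_cases(predictions, k):
    """Return the ids of the k sentences with the highest uncertainty.

    `predictions` maps a sentence id to a list of per-token label
    distributions (each a list of probabilities summing to 1).
    """
    ranked = sorted(
        predictions,
        key=lambda sid: sentence_uncertainty(predictions[sid]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical model output over the labels [O, LANGUAGE, FRAMEWORK].
predictions = {
    "s1": [[0.98, 0.01, 0.01], [0.97, 0.02, 0.01]],  # confident
    "s2": [[0.40, 0.35, 0.25], [0.90, 0.05, 0.05]],  # one unsure token
    "s3": [[0.34, 0.33, 0.33], [0.50, 0.25, 0.25]],  # unsure throughout
}
hard = pick_hard_cases(predictions, k=2)
```

Label the sentences in `hard`, retrain, and repeat. Mean entropy is only one
possible score; least-confidence or margin scores would slot in the same way.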

~~~
mrburton
I wrote a website that lets me quickly tag words as language, framework,
operating system, or service. I'll see if I can crowdsource help from
various communities, maybe by gamifying it or adding some sort of reward
system, and as a last resort try mturk.com.
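
To cut down on the manual work, one option is to pre-tag sentences from a seed
list so that taggers only correct mistakes rather than start from scratch. This
is a rough sketch and not how the site works today: the four labels come from
the categories above, while `SEED_TERMS`, `TOKEN_RE` and `pretag` are
hypothetical names with a tiny made-up term list. It only matches single
tokens, so multi-word names would need extra handling.

```python
import re

# Hypothetical seed list; a real one would be much larger.
SEED_TERMS = {
    "python": "LANGUAGE",
    "java": "LANGUAGE",
    "django": "FRAMEWORK",
    "linux": "OPERATING_SYSTEM",
    "s3": "SERVICE",
}

# Keep characters that show up in technical names such as C++, C# or Node.js.
TOKEN_RE = re.compile(r"[A-Za-z0-9+#.]+")

def pretag(sentence):
    """Return (token, label) pairs, using "O" for tokens not in the seed list."""
    tokens = [t.rstrip(".") for t in TOKEN_RE.findall(sentence)]
    tokens = [t for t in tokens if t]  # drop tokens that were only periods
    return [(t, SEED_TERMS.get(t.lower(), "O")) for t in tokens]

tagged = pretag("We run Django on Linux and store files in S3.")
```

Here `tagged` holds one pair per token, with `Django`, `Linux` and `S3` given
seed labels and everything else marked `O` for a human to review.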

I'm trying to get around 15k to 30k sentences tagged. Let's see :)

Thanks for your reply!

