Hacker News

Maybe I don't understand correctly, but my point is that BERT and friends let you skip a lot of feature engineering, which becomes pretty useless anyway once you shift the corpus a bit. That, for me, is the biggest contribution. I don't care about a couple of percentage points of better results in the end, but I do care that I don't need to spend all my time devising new features, which almost always requires an expert on that particular topic.



Yeah, I agree. Deep learning mostly eliminates feature engineering, which is a huge win. However, BERT and other transformer models take too damn long to train. I still prefer FastText for transfer learning because it can be trained on large corpora quickly. Combine that with a supervised deep model and you often get good enough accuracy in practice.


Sure, but the feature engineer could escape the asymptote that BERT is converging towards and get better accuracy, which could make the product-market fit work.



