N-Shot Learning: Learning More with Less Data (floydhub.com)
147 points by DarkContinent 78 days ago | 16 comments

How does this differ from active learning? When would you use which if you don't have a sufficiently large training dataset? Would you combine both approaches? If so, how?

To combine few-shot learning with active learning you can

- use active learning to expand the training set, and

- revise your model training procedure to account for the sampling bias this introduces in the training set.
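A minimal sketch of the first step, assuming pool-based uncertainty sampling with scikit-learn on a toy dataset (the data, names, and query strategy are illustrative, not from the thread):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy pool: two Gaussian blobs; only four labels to start (the "few-shot" seed).
X_pool = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(3, 1, (200, 2))])
y_pool = np.array([0] * 200 + [1] * 200)

labeled = [0, 1, 200, 201]                 # tiny seed set, both classes present
unlabeled = [i for i in range(len(X_pool)) if i not in labeled]

for _ in range(10):                        # active-learning rounds
    clf = LogisticRegression().fit(X_pool[labeled], y_pool[labeled])
    # Uncertainty sampling: query the pool point the model is least sure about.
    probs = clf.predict_proba(X_pool[unlabeled])[:, 1]
    query = unlabeled[int(np.abs(probs - 0.5).argmin())]
    labeled.append(query)                  # the "oracle" supplies y_pool[query]
    unlabeled.remove(query)

print(f"labeled {len(labeled)} points, accuracy {clf.score(X_pool, y_pool):.2f}")
```

Note this sketch skips the second step: uncertainty sampling deliberately skews the labeled set toward the decision boundary, so a real training procedure would need to correct for that bias (e.g., via importance weighting of the queried points).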

Maybe paying smart people for their ideas, insights, and algorithms costs less than throwing a bunch of data at AI training.

For some things, sure; but this last decade we discovered that isn't true for others.

Better still, train a learning machine (or a collection thereof) based on their expertise, and use that machine to augment the experts and reduce tedium/repetition, or even reduce error rates below what you'd see in the field otherwise.

“If AI is the new electricity, then data is the new coal.”


I've been pondering this analogy. I like it, but if AI is the new electricity, do we need more Edisons or Teslas? Everyone jumping into AI and learning deep learning seems to be learning how to generate electricity itself, rather than inventing the lightbulb and the other things that run on electricity, i.e. building user apps on top of it.

Does anyone else feel this way?

I don't know, just check out a few Kaggle competitions and how pragmatically the winning teams approach their solutions. It's most often a combination of tried-and-true techniques, used in an ensemble, with some smart feature selection. Anecdotally, there's plenty of ready-to-use ML tech available nowadays that I, as a novice, was able to go from zero to a working Gradient Boosting classifier within a few days. For me that's the definition of applying the techniques without trying to earn a PhD in the field.
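For what it's worth, that zero-to-working-classifier path really is short. A hedged sketch using scikit-learn's GradientBoostingClassifier on one of its bundled datasets (the dataset choice and near-default parameters are mine, not the commenter's):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# A bundled binary-classification dataset stands in for a Kaggle-style task.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Tried-and-true: boosted trees with near-default hyperparameters.
clf = GradientBoostingClassifier(random_state=42).fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.3f}")
```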

I think "data is the new oil" is the more apt analogy: https://duckduckgo.com/?q=oil+spill+bird&iax=images&ia=image...

Oil is the processed product. Data is more like coal, generally useless to ML unless processed.

Oil is both terms (e.g., crude oil).

"Ugh" is not a substantial rejoinder. It's an apt analogy in my opinion. People keep waiting for AI to break out, and the argument here is it's not alone a product, it's a requisite tool for building them. AI, of course, is dependent on copious quality data, such that it fuels AI.

With what would you disagree?

Like many memes, it's vivid for no good reason. The only relationship in common is that A depends on B. Everything else about it is misleading.

Data is not fuel, let alone a fossil fuel. It doesn't provide energy and doesn't get used up.

> It doesn't provide energy

Training a model with data reduces entropy in the model. Isn't a local increase in energy a decrease in entropy?

> ... and doesn't get used up.

With respect to the model, that data was useful before training and not useful after training. Its entropy-lowering potential has been spent.

That's the sense in which fuel works as the metaphor here: both mechanisms consume their input.

Please don't do this here. It only makes things worse.
