DAWN: Tools for AI and Data Product Development

pella · on Dec 20, 2017

Summary (slides): http://dawn.cs.stanford.edu/assets/dawn-overview.pdf

hacker_9 · on Dec 20, 2017

The link to Snorkel [1] is really interesting. Labeling data in a low-quality, programmatic way and then feeding it through a neural network to produce high-quality labels is really smart.

[1] https://github.com/HazyResearch/snorkel
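
A minimal sketch of the idea in this comment, written in plain Python rather than Snorkel's actual API: several cheap, noisy heuristics ("labeling functions") each vote on an example or abstain, and the votes are combined into one training label. The function names, the spam/ham task, and the majority vote are all assumptions for illustration. Snorkel itself estimates how accurate each labeling function is with a generative model instead of taking a simple vote.

```python
from collections import Counter

ABSTAIN = None  # a labeling function returns this when it has no opinion


# Hypothetical labeling functions: each one is a cheap, noisy heuristic.
def lf_contains_free(text):
    return "spam" if "free" in text.lower() else ABSTAIN


def lf_many_exclamations(text):
    return "spam" if text.count("!") >= 3 else ABSTAIN


def lf_mentions_meeting(text):
    return "ham" if "meeting" in text.lower() else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_free, lf_many_exclamations, lf_mentions_meeting]


def weak_label(text, lfs=LABELING_FUNCTIONS):
    """Combine noisy votes by simple majority; abstain on no votes or a tie."""
    votes = [v for v in (lf(text) for lf in lfs) if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN  # conflicting heuristics with equal support
    return counts[0][0]


examples = [
    "FREE prize!!! Click now",
    "Meeting moved to 3pm",
    "Lunch tomorrow?",
]
labels = [weak_label(t) for t in examples]
```

Examples that receive a label could then be used to train a downstream model, while examples where every heuristic abstains are simply left out.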

myth_drannon · on Dec 20, 2017

Yes, Snorkel and DeepDive look extremely useful. At my job we have a lot of data, but it's unlabeled, and it would cost millions of dollars to outsource it to India for labeling and data entry.

cl42 · on Dec 20, 2017

This is great. In some ways it reminds me of the recent "Software 2.0" posts around here -- make the code and architecting so easy that we begin teaching machines by creating data rather than writing code.

cscurmudgeon · on Dec 20, 2017

A lot of research has been done in that direction.

http://www.inductive-programming.org

haZard_OS · on Dec 20, 2017

I always thought the Wolfram language was a good step toward "Software 2.0". It's a shame the language isn't open source.

tuomosipola · on Dec 20, 2017

I like many of these ideas; they address real practical problems in the area, and new research is always welcome. How all of this will be packaged into a working environment is not clear to me, but even the individual parts should be useful.

thisisit · on Dec 20, 2017

So, is anyone using these tools in a live/production environment?

mateiz · on Dec 20, 2017

Matei Zaharia (one of the PIs on DAWN) here. Snorkel, MacroBase and ASAP are already being used in production at several companies, and we intend to continue publishing everything as open source. We only started this lab a year ago, so a lot of the projects listed are still new.

Vkkan2016 · on Dec 21, 2017

I am looking for sample projects to learn from. If possible, please share.

tuomosipola · on Dec 20, 2017

It would be interesting to hear about any experiences. These researchers have a background in Spark etc., so setting up might not be that difficult.

tabtab · on Dec 20, 2017

It seems to have similar goals to the idea behind factor tables: https://github.com/RowColz/AI

Houshalter · on Dec 21, 2017

I'm pretty sure that guy just reinvented nearest neighbor.

tabtab · on Dec 26, 2017

Finding the "nearest matching pattern" is part of just about ANY pattern matching. The devil is in the details: dealing with noise, precision loss for speed, generalization (compression), tuning, etc. This attempts to break those down into staff-digestible chunks.
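
For reference, the plain nearest-neighbor rule mentioned in this exchange can be sketched in a few lines. This is a generic 1-nearest-neighbor classifier using Euclidean distance, not the linked repository's method; the data and names are made up for illustration. It deliberately leaves out the noise handling, speed tradeoffs, and tuning that the comment above calls the hard part.

```python
import math


def nearest_neighbor(query, examples):
    """Return the label of the stored example closest to `query`.

    `examples` is a list of (feature_tuple, label) pairs.
    """
    best_label, best_dist = None, math.inf
    for features, label in examples:
        d = math.dist(query, features)  # Euclidean distance (Python 3.8+)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


# Hypothetical stored patterns: two points with known labels.
training = [((1.0, 1.0), "a"), ((5.0, 5.0), "b")]

near_a = nearest_neighbor((1.2, 0.9), training)
near_b = nearest_neighbor((4.0, 6.0), training)
```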