
Ask HN: What do you use Machine Learning for? - endorphone
What real-world applications are you finding it useful for right now?
======
ashark
I use it as something to worry about not knowing how to use, and how that
might make me unemployable in a few years, while also having no obvious need
for it at all, and therefore no easy avenue towards learning it, especially
since it requires math skills which have completely rusted over because I also
never need those, so I'd have to start like 2 levels down the ladder to work
my way back up to it.

I've found it very effective in the role of "source of general, constant,
low-level anxiety".

~~~
jameslk
This is true of the software engineering profession in general. I think
there's a quote somewhere stating "the half-life of a software engineer is 2-5
years." There's constantly new programming languages, new frameworks, new
platforms, new tools, new paradigms etc. It becomes harder and harder to keep
up with all that as you age and have more responsibilities and require more
stability having a mortgage and offspring (although there's certainly
individuals that can keep up with it).

Those who are still in school or fresh out of it will be up-to-date with the
latest theory and trends. And they will be eager to pour hours of overtime
into their new career. Sure, they will have years of practical experience to
gain, but that's not usually what they will be interviewed on (from my
experience). They will be interviewed on the latest tech and the algorithms
and theory they learned in school. Their whiteboard interviews will be another
day from class.

Given the influx of fresh blood, I view the software engineering profession as
an entry-level type of job, regardless of the varying levels of seniority
titles available. The only way to escape the constant churn is to move to other
roles, such as architect, management, executive or to start a business such as
in niche consulting or a startup, etc. Or you can eschew the typical life of
mortgage + kids and dedicate it to keeping up with every latest trend, hopping
jobs to stay current and hoping you never slow down.

~~~
sjg007
"Those who are still in school or fresh out of it will be up-to-date with the
latest theory and trends. "

The theory doesn't change that fast... also trends? Maybe in terms of a new JS
library.. but I think it's more of new grads will work for cheap and will take
more bullshit. They also are more likely to move to/for the job.

------
jedberg
I'm not using it at this moment, but in a couple of weeks I'll have a newborn,
and I was thinking of taking pictures of him every time he is upset, and then
tagging the photo with what ends up being the resolution (feed, change, or
nap) and seeing if I could build a classifier that could figure out what he
needs just from a picture of his angry face.

Semi-related, but if anyone is using the AWS tools for their AI, please ping
me. I'm looking for a speaker for a community event in SF in June. (contact
info in profile)

~~~
pesfandiar
Suggestion from a new father: you'll have a better chance analyzing your
newborn's voice. They have very little control over their facial expression,
but there might be some useful information in their crying. Good luck with
your new projects!

~~~
sjg007
Yeah I agree, you figure out very quickly by the voice/cry what they want.

------
klancaster
We analyze images of skin lesions and give a probability whether they are
malignant melanoma. The analysis is performed on-board on mobile phones.

~~~
zfran
do you worry about the weights of your net being stolen off the app through
reverse engineering?

~~~
klancaster
It is a bit of concern, but we are not out in the wild yet so we have time to
look at ways to protect the IP.

------
brettz
Detecting what pornstars, actions, production and categories appear in 5mil+
adult videos.

~~~
trapatsas
Is this somewhere online? Github, site or what?

~~~
jedberg
OP works for Pornhub, so I suspect it is on their site.

------
phillc73
Classifying future elite race horses, before they've raced.[1][2]

I didn't develop the original approach but am involved in working to improve
the predictive outcomes.

[1] [https://www.breezeupiq.com](https://www.breezeupiq.com)

[2] [http://www.performancegenetics.com](http://www.performancegenetics.com)

------
fcnfan
I work for a large semiconductor manufacturer. In our R&D operations we
require a large volume of newly designed parts to be purchased from suppliers.
These parts are not off the shelf but are all custom created by us. Once a
design is approved we try to get them into the lab the fastest way possible so
the earlier we can get the designs quoted with a supplier, the better.

I use machine learning to predict if a newly designed part will have to be
purchased soon to go in the lab. That prediction saves me about 3 weeks of
lead time to get it to a supplier. Before machine learning, the first time the
purchasing department knew that it will need to be purchased is when the
designer filled out a form. Now they know as soon as he submits his design and
can get a head start.

~~~
adamsea
That sounds really interesting. What sort of data do you use to do this?

------
sevensor
Burning CPU cycles on an opaque, needlessly complicated model created by a
canny professor who realized that putting ML in grant proposals gets you 5x
the cash compared to regression.

~~~
froindt
Unfortunately I bet this will be the case for the next few years until funding
agencies figure out not everything has to be done with machine learning.

------
omarforgotpwd
I run a website called Hey Am I Fat. We've trained a classifier based off a
simple softmax regression in TensorFlow that tries to detect whether people
are fat or not. If you submit a picture of yourself to HeyAmIFat.com we can
tell you if you're fat or not within minutes.

~~~
Normal_gaussian
A nice email / phone capture portal you have there.

Does anybody know what the value of such lists is? Having no other
information with them I can't imagine it is much.

~~~
omarforgotpwd
I don't store this information after we send the response, but that would be
one way to make money if you don't care about being an asshole to your users.
This is mostly a joke and I don't make any money from it.

~~~
Normal_gaussian
Likely fair enough - I'm just suspicious of anything that takes contact
details when it could just hold the page open and let me know when it's done.

------
antognini
We use it for automated seizure detection in EEG data. Most seizures have no
clinical manifestation (e.g., shaking or trembling), and if a seizure goes on
too long (~30 minutes) the patient can suffer permanent brain damage or even
die. This is particularly problematic for patients in the ICU since they tend
to seize silently more often and it isn't easy to tell that they are seizing
without having a neurologist examine the EEG data.

We also use machine learning for detecting other features of EEG data and
removing artifacts. (Eye blinks, for example, cause big artifacts.)

------
CodeSheikh
Replacing project managers and introducing reward/score based AI bots that
keep track of progress of individuals as a function of stories they are working
on, and these bots send out reminders (slack chat bots perhaps). This can
help improve productivity at large firms with multiple (and sometimes
unnecessary) levels of manager hierarchies such as IBM, Dell, HP, oil
companies etc. Let's be honest a certain programmer has a pattern with which
he/she works. Time spent on a problem, number of initial bugs, number of
commits, time spent vs difficulty of a certain problem and so forth. We have
all this data to quantify and train a bot. I mean why not :)

~~~
rdrey
That sounds a lot like the workflow in Dave Eggers' "The Circle". I think if
you read the book you'd be less likely to implement anything like this. :P

~~~
CodeSheikh
I have been meaning to read that book recommended by a friend working in tech.
Apparently there's a movie out now too :)

------
bitL
(In progress) Automate e-commerce business completely, including
customer/supplier e-mail responses, voice communication, identifying data
structure of supplier feeds and performing automated conversion between
formats for integration, sales estimation, hidden platform variables
identification (how to win a buy box?), price competition etc. Tech based on
Python, DL (Keras), RNN, GAN, SVM, decision trees etc.

------
nowarninglabel
Checking for issues with loan photos as they are posted. It's been interesting
to find that many ML platforms and libraries don't do a great job of
recognizing humans with dark skin. Hopefully, by adding Kiva's 1 mil+ images
to the mix we can give them a better basis point to learn from. Hoping to use
it for a lot of other things as well, but we're still getting to know how to
leverage it.

~~~
jedberg
That's a great insight to use Kiva as training data, because I agree that not
recognizing dark skinned people is a huge problem with current facial
recognition datasets.

This will get worse as more retail establishments use image recognition in
their every day operation.

------
jonbaer
In search, Learning to Rank is a common application for ML ...

[https://en.wikipedia.org/wiki/Learning_to_rank](https://en.wikipedia.org/wiki/Learning_to_rank)

[https://cwiki.apache.org/confluence/display/solr/Learning+To...](https://cwiki.apache.org/confluence/display/solr/Learning+To+Rank)

------
sulexk
Building a smart home security system, so you can be notified when your
friends have arrived, or if someone is trying to break into your home. Machine
learning is very useful here, as you can train the system to recognise what
your friends look like, or what a break-in might look like.

~~~
tobltobs
How do you get the data for training what a break-in looks like?

~~~
marsRoverDev
Couldn't you just train it on genuine visitors and then use the probability of
it being a genuine visitor to determine whether it is a break in?
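
One way to read this suggestion is as novelty detection: model genuine visitors only, then flag anything that scores as unlikely under that model. A minimal sketch, assuming each visit has already been reduced to a single numeric feature (the values below are hypothetical) and that a z-score cutoff of 3 is acceptable. Neither detail comes from the thread.

```python
import statistics

def fit_genuine(values):
    """Fit a one-dimensional Gaussian to feature values from genuine visitors."""
    return statistics.mean(values), statistics.stdev(values)

def is_anomalous(value, mean, stdev, z_threshold=3.0):
    """Flag a visit whose feature lies more than z_threshold deviations from the mean."""
    return abs(value - mean) / stdev > z_threshold

# Hypothetical feature values recorded for known, genuine visitors.
genuine = [0.21, 0.25, 0.19, 0.23, 0.22, 0.24, 0.20, 0.26]
mean, stdev = fit_genuine(genuine)

print(is_anomalous(0.22, mean, stdev))  # a typical visit
print(is_anomalous(0.90, mean, stdev))  # a visit far outside the genuine range
```

A real system would use many features and a proper density or one-class model, but the decision rule has the same shape: score against "genuine" and threshold.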

~~~
kmmlng
I imagine only training on genuine visitors would be tricky with any
traditional classification approach. Even having a 90/10% split of
positive/negative training data is difficult since a lot of classifiers will
just degrade to a majority vote.

Maybe a Restricted Boltzmann Machine or something similar?
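
The "degrade to a majority vote" problem can be shown with a little arithmetic. The sketch below assumes a hypothetical 90/10 split of visitor and break-in labels and a classifier that always predicts the majority class. Accuracy looks fine while recall on the rare class is zero.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive):
    """Fraction of true `positive` examples that were predicted as `positive`."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in relevant) / len(relevant)

# Hypothetical imbalanced data set: 90 genuine visits, 10 break-ins.
labels = ["visitor"] * 90 + ["break-in"] * 10
# A degenerate classifier that always answers with the majority class.
predictions = ["visitor"] * len(labels)

print(accuracy(labels, predictions))            # 0.9
print(recall(labels, predictions, "break-in"))  # 0.0
```

This is why accuracy alone is a poor metric here; per-class recall or precision exposes the failure.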

------
leanthonyrn
I work for a large multi-hospital corporation.

We use machine learning to predict the risk of a patient, after an inpatient
admission, being readmitted in 30 days post discharge. If the risk is high, we
proactively put process in place to reduce the readmission risk.

~~~
mikeroher
Can you elaborate a little bit more on the process?

------
eggie5
Forecast loan payment defaults for 90 days or longer, 3 months in advance
using a tree based model.

~~~
retube
What's your training set?

~~~
eggie5
past loan performance data

~~~
lawrenceyan
Lending Club?

~~~
eggie5
hedge fund

~~~
lawrenceyan
How large of a dataset are you looking at? I'm curious as to whether Lending
Club's publicly available data will be lengthy enough to get meaningful
accurate results for general 5-10 year loan predictions. Though it is nice
that an economic crash happened to occur in the middle.

~~~
eggie5
of course more data is always better, but there are plenty of flexible models
that can learn from small data, so just try the lending club data and if your
model isn't learning anything choose a more flexible one...

------
cmsimike
I am just getting into it, but I am primarily looking at it because I have
about 8 months worth of home automation sensor data stored in a MySQL database
thanks to my home automation software (thanks home-assistant!) and I want to
see if ML can look at patterns from that sensor data and infer what to do in
my apartment.

For instance when I get home (gps sensor) and when I open my front door (door
binary sensor), can a system understand that I typically turn my TV on when
that happens?

------
Just1689
Automatically answering HR queries worded in different ways

~~~
dotancohen
Different ways of wording automated HR answers.

------
jmstfv
I use Stanford NER to extract book titles and authors from HN comments.
Previous thread:
[https://news.ycombinator.com/item?id=14202557](https://news.ycombinator.com/item?id=14202557)

~~~
wyldfire
How do you disambiguate similar/identical titles?

Did you evaluate other NER against CoreNLP? e.g. how does spacy compare?

EDIT: answering my own question -- research here [1] ranks CoreNLP as the
winner among CoreNLP, NLTK, spaCy, Lingpipe but AFAICT spaCy is competitive.

[1]
[https://aclweb.org/anthology/W/W16/W16-2703.pdf](https://aclweb.org/anthology/W/W16/W16-2703.pdf)

~~~
jmstfv
I found spacy's documentation lacking in details on how to train NER from
scratch (at least I couldn't make sense of it). That's why I decided to stick
with Stanford's NER.

Regarding disambiguation: People usually mention books with their respective
authors, so it takes one query against the local database to check whether
entity is indeed a particular book by a particular author. When no author is
mentioned, I check parent comment (if any) to find if the given book is
mentioned there. Well, and if comment turns out to be standalone then I query
Goodreads / Google Books API. Given books with identical titles, these APIs
return the most popular option.
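
The fallback order described above could be sketched roughly as follows. Everything here is illustrative: `local_db` is a hypothetical dict of known titles and their authors, `external_lookup` is a stand-in for a Goodreads / Google Books query, and the parent-comment step is interpreted as searching the parent text for a known author of the title.

```python
def resolve_book(title, author, parent_text, local_db, external_lookup):
    """Return (title, author) for a mentioned book, or None if unresolved.

    local_db: dict mapping lowercase title -> set of author names (hypothetical).
    external_lookup: callable standing in for an external book API query.
    """
    authors = local_db.get(title.lower(), set())
    # 1. Author mentioned in the same comment: one local lookup confirms the pair.
    if author is not None:
        return (title, author) if author in authors else None
    # 2. No author given: look for a known author of this title in the parent comment.
    if parent_text:
        for candidate in sorted(authors):
            if candidate.lower() in parent_text.lower():
                return (title, candidate)
    # 3. Nothing found locally: defer to the external API's most popular match.
    return external_lookup(title)

# Hypothetical data for illustration only.
db = {"dune": {"Frank Herbert"}}
fake_api = lambda t: (t, "Most Popular Author")
print(resolve_book("Dune", None, "I loved Frank Herbert's work", db, fake_api))
```

Passing the external lookup in as a callable keeps the sketch independent of any particular API client.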

------
ollin
I'm using it to isolate acapellas from music for making mashups/remixes. Not
quite "real world" yet, since even my most recent models aren't perfectly
reliable, but it's getting there.

~~~
highd
Could you share what sort of model you're using? Something WaveNet inspired?

------
braindead_in
Predicting mistakes made by humans during audio/video transcription, giving
them auto-complete suggestions, classifying severity of changes, predicting
the difficulty level based on audio characteristics.

------
writeslowly
I've been working with clustering algorithms to create a personal news
aggregator from sitemaps and RSS feeds, using an NLP library (Stanford
CoreNLP) to do feature extraction. [1] I'd like to extend it to classify
articles into different categories so I could filter things like transcripts,
opinion columns, web scrapes gone wrong, but I'm not sure how to set that up
yet.

[1] Very rough prototype:
[https://confabulator.io/newsclustering/](https://confabulator.io/newsclustering/)

~~~
imwhoopsie
Hi writeslowly,

a friend and I are working on a tech investment news aggregator (manually
curated right now -- [http://circulaat.com](http://circulaat.com)), and I
think your clustering algorithm might be really useful for it. I can't figure
out how to DM you via the HackerNews system but would love to get in contact

~~~
writeslowly
I've added my contact information to my profile. Feel free to email me if
you'd like to talk

------
arbesfeld
Detecting moments of user frustration in web apps (ie. rage clicks, shaking
mouse, refreshing the page) [https://logrocket.com](https://logrocket.com)

------
agibsonccc
Fraud detection (credit cards, online banking)

Detecting illegal mining activity via satellite

network intrusion (packet analysis)

Predicting machine failure

Stock market trend prediction (up/down)

Simbox fraud

Finding similar parts for large vehicles to help engineers

I can keep going, but this is a sample I've seen over the years. We mainly do
work in time series data for enterprise, basically "things google
doesn't do". A bulk of what we do doesn't appear in research papers because
it's not "GAN art" eg: "the current hype".

------
dglass
I use some basic tf-idf machine learning for clustering similar articles for
[http://tracket.com](http://tracket.com)
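
For readers unfamiliar with the idea, here is a minimal sketch of tf-idf similarity between articles, the building block such clustering usually rests on. It assumes whitespace tokenization and the plain `tf * log(N / df)` weighting, with made-up headlines; it is not the implementation used on the site.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * math.log(n / df[t]) for t, c in tf.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical headlines for illustration only.
docs = [
    "apple unveils new iphone model",
    "new iphone model announced by apple",
    "central bank raises interest rates",
]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

Articles whose pairwise similarity exceeds some threshold can then be grouped, or the vectors can be handed to a standard clustering algorithm.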

------
kvz
At Transloadit we're going to use it for predicting how many machines we'll
need to encode incoming video and audio files in near realtime, and then
scaling that capacity up in parallel, as files are still being uploaded or
imported. Already half a year in the making, but in our early tests it seems
we can outperform the custom algorithms we had in place for this, by a lot.

------
muricula
Detecting malware.
[https://www.barkly.com/product](https://www.barkly.com/product)

------
KhalilK
Not ML per se but ANN have proven to be quite useful in tumor growth modeling.
I am using them to model cancerous cells' genotype.

------
wizzerking
Classification of contents and segmentation of medical images. Neural networks
can also be used in denoising, dehazing, and super-resolution.

------
afpx
* financial forecasting

* resource demand forecasting

* balancing distributed energy resource loads

* automating metadata generation

* categorizing 'events'

* finding anomalies in satellite data

Most important:

* running bots on my xbox

------
cl0wnshoes
Use it in my software engineer screening service to determine if a user should
answer more questions on a particular topic and dive deeper into a particular
subject. Using Azure ML, been working out well so far. Went from messing
around for days with other ML libraries to up and running within 20 minutes on
Azure.

------
mi100hael
Hotdog/Not Hotdog

------
dennisgorelik
We use machine learning for parsing jobs and resumes and then to match them
with each other.

To see the results - post your resume here:
[https://www.postjobfree.com/post-resume](https://www.postjobfree.com/post-resume)

------
__erik
We use ml to identify users who might try to commit fraud when purchasing
things from our site, as well as prevent identity fraud when users verify
their ID with us. We are also looking into taking advantage of adaptive design
to improve ux and conversion.

~~~
grooling
Interned at a KYC/KYB company last summer and it's all manual there. They've
got people checking the submitted forms/scans manually. How good are you
getting because I truly believe they can all be replaced very quickly! Talk to
me :)

------
highd
-Customer analysis / clustering / behavior visualization

-Financials analysis / predictions

-Image classification

~~~
danvoell
bump on this one.

-Sales Promotion/Offer Prediction

-Customer Analysis / Clustering / Site Content Display

~~~
highd
I've got a client that's wrangling their data into a place where I can do some
targeting for upselling, end-to-end - I'm really looking forward to that. Lots
of value on the table, IMO, and really interesting and unusual datasets.

------
alexcnwy
Automatic text summarization.

~~~
gerenuk
Mind sharing which algorithm you are using?

------
fredley
Predicting day-to-day harvest yields of perishable fruit, like Strawberries.

------
arwhatever
I find it useful for learning from experience E with respect to some class of
tasks T and performance measure P if its performance at tasks in T, as
measured by P, improves with experience E.

------
quadrature
At work we're using it to battle fraud, we're also using it to recommend apps
and themes to our users and we're using it internally for various forecasting
tasks.

------
orasis
Evolving mobile games to discover variants that earn 5 star reviews. Using
[https://improve.ai](https://improve.ai)

------
zeptomu
At my work we use it to do large-scale analysis of remote sensing data, in
particular segmentation of satellite imagery to provide land monitoring
services.

------
toisanji
image processing and effects using deep learning:
[http://somatic.io](http://somatic.io)

------
deevin9
improving relevancy of search results -
[http://www.coveo.com/en/platform/machine-learning](http://www.coveo.com/en/platform/machine-learning)

------
waterside81
Multilingual named entity recognition and disambiguation

------
blackbear_
predicting the evolution of running cases, and the arrival of new cases, for
capacity planning in a call center

~~~
GFischer
Do you believe ML is adding value there (instead of a standard
prediction/forecasting model)?

------
eggie5
Recommender system using visual features

------
greatNespresso
Trying to predict horse racing

~~~
bbcbasic
Me too :-)

~~~
greatNespresso
What do you use ? I found out that svc trained with proper data leads to
satisfying perf for predicting the four first horses (in disorder though)

------
adictator
Not me, but a friend of mine heads a company that uses drones to identify
defects / damage on containers. He uses ML to process the images captured by
the drones, which are fed through an ML processor to detect such damage & flag
them for insurance claims purposes.

~~~
cr0sh
By containers, do you mean shipping containers? Sounds interesting!

