Extracting Structured Data from Recipes Using Conditional Random Fields (2015) (nytimes.com)
167 points by yoloswagins on May 4, 2018 | 20 comments



This is the kind of problem for which LSTM RNNs -- and more recently, fully-attention-based deep neural nets -- produce state-of-the-art results.

I wonder if the author ever tried using, say, an AWD LSTM RNN[a] or a Transformer-like model[b] for this task.

Using an RNN or an attention model for this would eliminate the need for feature engineering such as:

  feature_1 = 1 if x_t is capitalized and y_t equals "NAME";
              0 otherwise.
This is one of seven carefully engineered feature functions listed in the article, and the author states that the seven are only a partial list.
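For anyone curious what that feature engineering looks like in code, here's a minimal sketch using sklearn-crfsuite (an assumption on my part -- the NYT code actually wraps CRF++, and the QTY/UNIT/NAME tags are just illustrative). Each token becomes a dict of hand-written features:

  import sklearn_crfsuite

  def token_features(tokens, t):
      """Hand-engineered features for the token at position t."""
      word = tokens[t]
      return {
          'word.lower': word.lower(),
          'word.istitle': word.istitle(),  # capitalization, as in feature_1
          'word.isdigit': word.isdigit(),
          'position': t,                   # index within the phrase
      }

  # X: one feature-dict sequence per ingredient phrase; y: the tag sequences
  phrases = [['2', 'cups', 'flour'], ['1', 'pound', 'carrots']]
  X = [[token_features(p, t) for t in range(len(p))] for p in phrases]
  y = [['QTY', 'UNIT', 'NAME'], ['QTY', 'UNIT', 'NAME']]

  crf = sklearn_crfsuite.CRF(algorithm='lbfgs', max_iterations=100)
  crf.fit(X, y)
  test = ['3', 'tbsp', 'butter']
  print(crf.predict([[token_features(test, t) for t in range(len(test))]]))

The library learns a weight for each (feature, tag) combination, so the capitalized-token/NAME conjunction in feature_1 is learned rather than enumerated by hand.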

Moreover, using a modern RNN or attention model likely would produce better predictions, with much better generalization.

[a] https://arxiv.org/abs/1708.02182 / https://github.com/salesforce/awd-lstm-lm

[b] https://arxiv.org/abs/1706.03762 / https://github.com/tensorflow/tensor2tensor


This article is dated 2015. Can’t blame the author too much for not trying things that would be invented 2 years later.

But yeah, it would be great follow-up work.


Ah, I didn't notice the article's date until now. Thanks for pointing that out! Makes more sense now.

Yes, it would be great follow-up work.


Can you provide articles comparing CRFs directly with LSTMs? Most articles on LSTMs don't actually compare against CRFs, and an LSTM isn't a drop-in replacement for a CRF. I haven't personally seen that neural networks have uniformly beaten CRFs on all tasks. E.g. [1] directly compares CRFs with an LSTM, and the CRF achieves an F1 of 97.533 while the LSTM gets 97.848.

In fact, because of the competitiveness of CRFs, there are many works that combine them with neural networks (e.g. [2]); a rough sketch of that combination follows the references.

[1] https://arxiv.org/pdf/1606.03475.pdf

[2] https://arxiv.org/abs/1508.01991
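
For readers unfamiliar with the combination in [2]: the recurrent net produces per-tag emission scores, and a CRF layer on top models tag-to-tag transitions, trained against the sequence-level log-likelihood. A minimal sketch, assuming the third-party pytorch-crf package rather than the implementation from [2]:

  import torch.nn as nn
  from torchcrf import CRF  # pip install pytorch-crf

  class BiLSTMCRF(nn.Module):
      def __init__(self, vocab_size, num_tags, emb_dim=64, hidden=128):
          super().__init__()
          self.emb = nn.Embedding(vocab_size, emb_dim)
          self.lstm = nn.LSTM(emb_dim, hidden // 2, bidirectional=True,
                              batch_first=True)
          self.to_tags = nn.Linear(hidden, num_tags)  # emission scores
          self.crf = CRF(num_tags, batch_first=True)  # transition scores

      def loss(self, tokens, tags):
          emissions = self.to_tags(self.lstm(self.emb(tokens))[0])
          return -self.crf(emissions, tags)  # negative log-likelihood

      def predict(self, tokens):
          emissions = self.to_tags(self.lstm(self.emb(tokens))[0])
          return self.crf.decode(emissions)  # Viterbi decoding

The CRF layer replaces an independent per-token softmax, so prediction is a joint Viterbi decode over the whole tag sequence.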


tensor: my main point was and is that features learned by a suitable deep model (whether recurrent or attention-based) routinely outperform human-designed features. This has been shown on a large and growing number of sequence tasks (WMT language translation datasets, Stanford Question Answering Dataset, WikiText language modeling datasets, Penn Treebank dataset, IMDB and Stanford Sentiment Treebank movie review datasets, to name a few).

Now, in some cases, and depending on the task, it might make sense to have the last layer of a deep model be a CRF layer. In the OP's case, for example, one could try replacing all those one-off feature functions with a proven deep architecture -- in other words, instead of having ψ at each time step be equal to exp(sum(weighted feature functions)), have it be a function of the output of the deep model.
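
To make that ψ concrete (standard linear-chain CRF notation; the article's exact parameterization may differ):

  \psi_t(y_{t-1}, y_t, x) = \exp(\sum_k w_k f_k(y_{t-1}, y_t, x, t))   (hand-engineered features)
  \psi_t(y_{t-1}, y_t, x) = \exp(A_{y_{t-1}, y_t} + e_t[y_t])          (deep-model variant)

where e_t is the deep model's emission-score vector at step t and A is a learned tag-transition matrix.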

That said, for something like the OP's task, the first thing I would try would be one of the readily available LSTM architectures[a], with a standard softmax layer predicting a distribution over the vocabulary of tags at each time step, and feeding that into a standard beam search[b] (a rough sketch follows the links).

[a] Example: https://github.com/salesforce/awd-lstm-lm/blob/master/model....

[b] Intro to beam search algorithm: https://www.youtube.com/watch?v=UXW6Cs82UKo
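
A rough sketch of that decoding step. One caveat: with a pure per-step softmax and no transition or autoregressive term, the greedy argmax is already optimal, so the beam only pays off once tag choices interact (e.g. via a CRF layer or a label-conditioned decoder):

  def beam_search(log_probs, beam_width=3):
      """log_probs: [T][num_tags] per-step log-probabilities from the tagger.
      Returns (best tag sequence, its total log-probability)."""
      beams = [([], 0.0)]  # (tag sequence so far, cumulative log-prob)
      for step in log_probs:
          candidates = [(seq + [tag], score + lp)
                        for seq, score in beams
                        for tag, lp in enumerate(step)]
          candidates.sort(key=lambda c: c[1], reverse=True)
          beams = candidates[:beam_width]  # keep only the top-k hypotheses
      return beams[0]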


So now you can autoconvert all cups to deciliters for usage in other countries? :-)

I've had a recipe collection since I bought my first Atari 30+ years ago. I always tried to stick to text files written with a certain style in mind for later processing into a database, but I haven't gotten around to actually doing it yet. I made another half-hearted start on it last week while I was sick, but since the text files already work as they are, I stopped again.

Usually the ingredients are written "1 dl water", nothing fancy like the problems they describe, so it would be very easy to parse with just some PHP.
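
To illustrate how little is needed for that style, a rough sketch in Python rather than PHP (assuming every line really is "quantity unit name"):

  import re

  LINE = re.compile(r'^(\d+(?:[.,]\d+)?)\s+(\w+)\s+(.+)$')

  def parse_ingredient(line):
      """Parse lines like '1 dl water' into (quantity, unit, name)."""
      m = LINE.match(line.strip())
      if not m:
          return None  # line doesn't follow the 'quantity unit name' style
      qty, unit, name = m.groups()
      return float(qty.replace(',', '.')), unit, name

  print(parse_ingredient('1 dl water'))  # (1.0, 'dl', 'water')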


> So now you can autoconvert all cups to deciliters for usage in other countries? :-)

In my experience, converting between cups/tablespoons/teaspoons and metric units only complicates recipes. It's easier just to get a set of measuring spoons and cups in addition to a set of scales and metric measuring cups. Besides, even though cups may not be used as much in the metric world, tablespoons and teaspoons and their fractions are, so most home cooks already own at least a set of measuring spoons.

Of course, units like fl oz and oz do need converting to normal volumetric and weight units.
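
For reference, the conversions in question (US customary; the constants are standard, the code is just a sketch):

  ML_PER_FLOZ = 29.5735  # 1 US fluid ounce in millilitres
  G_PER_OZ = 28.3495     # 1 avoirdupois ounce in grams

  def floz_to_ml(floz): return floz * ML_PER_FLOZ
  def oz_to_g(oz): return oz * G_PER_OZ

  print(floz_to_ml(8))   # 1 US cup (8 fl oz) ~ 236.6 ml
  print(oz_to_g(16))     # 1 pound (16 oz) ~ 453.6 g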


Where did you buy your measuring cups for oven gas mark 5?


Whether or not it has degree settings, you still have to calibrate your oven. Gas mark 5 is about as exact as "medium heat": you have to rely on experience, or just guess.


A three-year-old article, but worth reading: the NYT solves a pain point using NLP, and they were nice enough to open source their work.

BTW, if you sign up properly, the NYT has an API for collecting articles by topic - also useful for NLP research. I used this API a few years ago.
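
If memory serves, it was the Article Search API; something along these lines (endpoint and response shape are from when I used it, so they may have changed -- check developer.nytimes.com):

  import requests

  resp = requests.get(
      'https://api.nytimes.com/svc/search/v2/articlesearch.json',
      params={'q': 'recipes', 'api-key': 'YOUR_KEY'})
  for doc in resp.json()['response']['docs']:
      print(doc['headline']['main'])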


The code discussed in the post is here:

https://github.com/NYTimes/ingredient-phrase-tagger


The value of converting media content into structured data is underrated.

I imagine the future market is vast for media companies to offer their large stores of content this way.

Kudos to NYT. I’m often impressed with their technical contributions.


I agree wholeheartedly! While NYT subscription revenue has picked up in recent years, I wonder whether more newspapers and publishers might find significant value in building more of these knowledge-based software products. Structuring the data appears to be the key step in making these products effective. Does anyone know of any other examples similar to this effort?


> recipes that users can search, save, rate and (coming soon!) comment on

That would be a fun NLP follow-up: mark or remove recipe comments that follow the general pattern "I substituted a for b, c for d, didn't bother with x because I never buy it, broiled the whole thing instead of pan frying and it was TERRIBLE!"
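
A crude first pass, flagging comments that pair heavy substitution language with a negative verdict (the keyword lists are invented for illustration; a real version would want a trained classifier):

  import re

  SUBSTITUTION = re.compile(
      r"\b(substituted|swapped|instead of|didn't bother|left out|replaced)\b",
      re.IGNORECASE)
  VERDICT = re.compile(r"\b(terrible|awful|inedible|ruined|bland)\b",
                       re.IGNORECASE)

  def flag_comment(text):
      """Flag reviews that changed the recipe and then panned the result."""
      return len(SUBSTITUTION.findall(text)) >= 2 and bool(VERDICT.search(text))

  print(flag_comment("I substituted margarine for butter, left out the "
                     "garlic because I never buy it, and it was TERRIBLE!"))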


Please add (2015) to the title.


Updated. Thanks!


Cool project and a good write-up. This CRF reminds me a bit of definite clause grammars (DCGs) from Prolog. It would be interesting to mix the two using some sort of probabilistic predicate logic.


Would anyone in the know care to share pointers to other current state-of-the-art structured ingredient extractors, either open source or proprietary?


This is the first time I've seen MathJax used on a mainstream site like nytimes.com. Good for them.


Should have a (2015)



