Ask HN: What is the legality of scraping recipe ingredient lists?
16 points by bobblywobbles on July 24, 2022 | 16 comments
Companies such as AllRecipes, BigOven and Yummly have millions of recipes, combined, saved within their systems. BigOven, for example, has a clipper tool (https://www.bigoven.com/clipper) that allows users to copy recipes from other sites and save them in BigOven. Under copyright law, lists of ingredients are not copyrightable, but creative works (i.e. ingredients + instructions + images) are. Under this understanding, recipes that are copied beyond the ingredient list into BigOven (or another recipe database) would be breaking copyright law.

Are there reputable sources (attorneys) in copyright law that I can contact? If it's within legal means, I'd like to compile a list of recipes (name of food & list of ingredients only) and sell this collection to users who are interested.




There was a ruling that said that recipes are not copyrightable. However, anything beyond the recipe ingredients and method is copyrightable.

Think images, videos, or that hunk of text detailing their experience as a 5-year-old before a recipe.

Way back in the day I ran a public recipe scraping API. It had an option to scrape the images along with the recipe, but you had to set an explicit copyright flag to get them.

There was an HN post about it if you’re curious: https://news.ycombinator.com/item?id=14794949
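An opt-in flag like the one described above might look roughly like this. This is only a minimal sketch, not that API's actual code: the `Recipe` class, the `export_recipe` function, and the `include_image` parameter are all hypothetical names, and the sketch exports only the name and ingredient list by default.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Recipe:
    name: str
    ingredients: list[str]
    story: str = ""                  # narrative text, likely copyrightable
    image_url: Optional[str] = None  # photo, likely copyrightable


def export_recipe(recipe: Recipe, include_image: bool = False) -> dict:
    """Return the name and ingredient list; add the image only on opt-in."""
    out = {"name": recipe.name, "ingredients": list(recipe.ingredients)}
    # Hypothetical flag: the image is included only when explicitly requested.
    if include_image and recipe.image_url is not None:
        out["image_url"] = recipe.image_url
    # The story text is never exported in this sketch.
    return out
```

Defaulting the flag to off means a caller has to make a deliberate choice before any likely-copyrightable material leaves the system.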


The purely mechanical portion of the recipe (list of ingredients and instructions) is excluded from copyright. Any other non-mechanical content (pictures, but also descriptions of the recipe or all that flavor/story text that pollutes modern online recipe sites) will be protected by copyright.

Additionally, it is likely that collation or curation of the recipes into categories or collections or sites is protected by copyright.

Finally, if the website posts terms and conditions which limit or restrict your access, you might end up with some liability should you violate them. This is still an area of law that is shifting, but according to the American Bar Association, courts now somewhat routinely enforce terms and conditions when clear notice is given to the user and the terms are not unduly long or confusing. In general, the terms are most likely to be enforceable when the user sees a clear prompt to agree to the Ts and Cs and clicks an "I Agree" button. The further away from such a clear agreement one strays, the less likely a court is to treat access to or use of the site as binding agreement to the Ts and Cs.

I am not a lawyer, I am not representing you, and this is not legal advice.


If it is publicly accessible, they cannot, from a legal perspective, limit the means by which you access the information. This was seen in the case of Oracle and another company scraping its technical documentation, IIRC (https://www.eff.org/deeplinks/2018/01/ninth-circuit-doubles-...). The site may take action to prevent this access, but they cannot pursue legal action for this alone (unless this has changed in the last couple of years). There was a similar precedent with LinkedIn maybe 5-7 years ago.

That being said, depending on the target, they can make things difficult if they know you're doing it. In general, scraping data and then transforming it for use is reasonably safe, but using the scraped content in unprocessed form can be problematic. Selling the collected data to users without processing it sounds like it could cause problems both with the target companies and with customers' perception of how you acquired the data. Processing the data to show something like variations in recipes per region, or categorizing recipes into styles based on ingredients, cook time, complexity, etc., adds value that makes your data more useful than the raw dataset and makes a stronger argument for the sale of your dataset.

Of course IANAL, and I welcome anyone else to add to or contradict this info.
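As a rough illustration of the kind of processing mentioned above, here is a minimal sketch that attaches derived labels to a recipe record. Everything in it is an assumption made for illustration: the `categorize` function, the `cook_minutes` field, and the thresholds are hypothetical and arbitrary.

```python
def categorize(recipe: dict) -> dict:
    """Return a copy of the recipe with derived 'complexity' and 'speed' labels."""
    n = len(recipe["ingredients"])
    minutes = recipe["cook_minutes"]  # hypothetical field name
    # Arbitrary thresholds, chosen only to make the example concrete.
    if n <= 5:
        complexity = "simple"
    elif n <= 10:
        complexity = "moderate"
    else:
        complexity = "complex"
    speed = "quick" if minutes <= 30 else "slow"
    return {**recipe, "complexity": complexity, "speed": speed}
```

The function returns a new dict rather than mutating its input, so the raw record and the derived labels stay easy to tell apart.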


Supremes have said that cookbooks and other fact collections don't necessarily have copyright protection. Recipes definitely don't.


As a baseline, that's helpful, but I'm wondering when it gets into CFAA territory. What if you violate the terms of service to scrape it? What if you have to use tricks like IP rotation or user-agent spoofing to get the data? What if you accidentally scrape some copyrightable creative text embedded in the recipes and sell it?


If it matters, ask your lawyer.

If it doesn't matter, it doesn't matter.

Good luck.


I mean, in terms of riskiness, this is about as low-risk as you can get in the field of crawling/scraping. You're not scraping product prices, social media profiles, real estate data, or anything that is majorly commercial - just recipes.

Disclaimer: not legal advice.


I scraped https://allrecipes.com a few years ago and posted a link to it on reddit... Maybe the file is still up but I'm too lazy to find it right now.


My opinion is simple.

Legality isn’t the best argument.

Morality is.

Scraping and selling is morally not ok.

Scraping and selling may legally be possible.

I’d rather sleep comfortably at night and stick with the moral police.


> Scraping and selling is morally not ok

What about hand copying? Is it ok to read a recipe on one site, and write it for your own? Or should the original site own that list of ingredients and instructions?

If it's ok to hand copy but not to scrape, what is the difference? Your personal time cost?

Like most things in life, one simple binary rule does not work for most situations.

Perhaps your primary objection is to the "selling" part. What if merely presenting the information comes at a cost, and you make up for that cost by "selling" access?

Or what if you are providing additional value to the original content?

Or what if you are making the information accessible to a larger audience, particularly one with special accessibility needs? (This may seem like grasping at straws, but so many websites are so badly made that people with limited data and old phones cannot get the content from them. So re-presenting that data in a better way may make it accessible to a lot more people. Those are people who _could_ not get it from the "original" source in the first place.)


Then I guess Google is not a moral business.


Well, I'm fed up with being belted & whipped by the moral police, who are so easily misled & duped into excessively strict moral punishment. Especially when 20 years later the moral police look like conservative uncles who call themselves fathers, but we all know they should be called uncles... For hitting us with the 'moral stick' & traditional soap in mouth... Lol uncles...


The moral police hasn't been conservative for 20+ years. Now we're in the age of liberal political correctness.

(It has been an interesting experience to side more with conservatives than liberals for the first time in my life, but we live in the society in which we live)


>The moral police hasn't been conservative for 20+ years. Now we're in the age of liberal political correctness.

go touch grass


You wouldn't bake a car, would you? /s

I could get not ripping off the description and images of a recipe, but refusing to copy ingredients lists is a level of moral piety I can't reach.


I'm speaking personally, since someone asked. I wouldn't lash out at the general public with such opinions. I can only be aware of and control my own actions.



