
Turn recipe websites into plain text - vincent_s
https://plainoldrecipe.com/
======
benawad
It looks like you're using the recipe-scrapers library [1] to scrape recipes,
which only supports a set number of websites.

If you want to expand that, I recommend parsing JSON-LD and Microformats.
Given your parsers folder [2], it looks like you've tried it, but only for
specific websites. I would make that generic and check whether the metadata is
available on any website. I wrote a blog post on this if you're interested
[3].
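A minimal sketch of what "generic" could look like, assuming the page embeds schema.org Recipe metadata in a `<script type="application/ld+json">` block. The names `JsonLdCollector` and `find_recipe` are made up for illustration; this is not the code from the blog post or from plainoldrecipe, and it uses only the Python standard library.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def _is_recipe(node):
    # "@type" may be a single string or a list of strings.
    node_type = node.get("@type")
    types = node_type if isinstance(node_type, list) else [node_type]
    return "Recipe" in types


def find_recipe(html):
    """Return the first schema.org Recipe object found in the page, or None."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed metadata: skip this block
        # A block may hold one object, a list of objects, or an "@graph" wrapper.
        candidates = data if isinstance(data, list) else [data]
        for item in candidates:
            if not isinstance(item, dict):
                continue
            for node in [item] + list(item.get("@graph", [])):
                if isinstance(node, dict) and _is_recipe(node):
                    return node
    return None
```

If `find_recipe` returns a dict, fields such as `name`, `recipeIngredient` and `recipeInstructions` can be read from it when the site provides them; a site-specific parser is then only needed as a fallback.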

source: I've built a very similar tool for my cooking app:
[https://www.mysaffronapp.com/](https://www.mysaffronapp.com/)

[1] [https://github.com/hhursev/recipe-scrapers](https://github.com/hhursev/recipe-scrapers)

[2]
[https://github.com/poundifdef/plainoldrecipe/blob/master/par...](https://github.com/poundifdef/plainoldrecipe/blob/master/parsers/bowlofdelicious.py)

[3] [https://www.benawad.com/scraping-recipe-websites/](https://www.benawad.com/scraping-recipe-websites/)

~~~
xtracto
Holy crap... I just created an account on your website and added one random
recipe (cashew nut yoghurt) that did not work on the original post site, and
it worked like a charm!

You've got a new paying customer :)

I'd been looking for something like your app for a long time.

Ouch, your PayPal flow is not working :( Fix that and you'll have a paying
customer haha

~~~
benawad
Oops, what part of the flow isn't working?

~~~
xtracto
[https://ibb.co/kQrkbJV](https://ibb.co/kQrkbJV)

(will autodelete in 30 mins) there's nothing very private but...

~~~
benawad
I'm using Paddle for payments and it looks like something is wrong on their end.
I'll contact them and keep you posted. Thanks for the heads up!

------
memset
Hi! I built this! (Surprised to see it on the front page, as I didn’t get much
traction when I first submitted it :)

Anyway, all of you have a lot of neat suggestions! Please do take a look at
the “contributing” section of the repo and let me know if you’d like to pitch
in!

~~~
chillee
Hacker News is quite random - stuff that gets no upvotes will sometimes get
hundreds of upvotes the next time it's posted.

That's why they explicitly allow some amount of reposting.

~~~
danielbarla
From what I've observed on other sites (mostly StackOverflow and Reddit) I
think there are a few major components to this, namely: timing of post, luck /
randomness regarding early upvotes, and other posts during that day.

The last one is a bit like when a film releases on the same weekend as some
blockbuster, it is more likely to go under the radar. The middle condition is
mostly luck, unless someone is prepared to manipulate the early upvotes
somehow. But the first one is quite easy to time correctly - US-heavy sites
tend to have a few time slots where a disproportionately large number of
people check it. My guess would be morning, lunch and afternoon slots, and
especially slots where different time zones overlap. E.g. an afternoon slot
for Europe which overlaps with the Eastern seaboard doing a morning check of
HN might work quite well, etc. This can help a fresh post break through the
decaying, but highly upvoted older posts that are keeping it from more
visibility.

~~~
julianlam
A couple of years back, there was a Hacker News post that analyzed hundreds of HN
submissions and determined which times were best to post.

Some days I wish I had bookmarked that link

~~~
tleb_
This [0]? Or the linked article in it?

[0]: [https://chanind.github.io/2019/05/07/best-time-to-submit-to-...](https://chanind.github.io/2019/05/07/best-time-to-submit-to-hacker-news.html)

------
tgb
I've always wondered why recipes have the ingredient list and quantity
separate from the instructions. I often have to scroll up and down (even on a
mostly-decent recipe site like allrecipes.com) first to see what gets added
next and then to see how much of it to add. Why not tell me it's one teaspoon
of salt in the same place as you tell me to add the salt? The only advantage to
separating them is to make the shopping list, but there's no reason one can't
duplicate the quantities.

~~~
doersino
Same. As a result, I've been using this LaTeX package for my personal recipe
collection (see example on page 5):
[http://ftp.gwdg.de/pub/ctan/macros/latex/contrib/cuisine/cui...](http://ftp.gwdg.de/pub/ctan/macros/latex/contrib/cuisine/cuisine.pdf)
[PDF]

(I'm in the process of abandoning LaTeX in favor of a custom Markdown → Pandoc
→ HTML flow with basically the same layout, though.)

~~~
k2enemy
Funny, after trying out all sorts of digital recipe organization methods, I've
settled on plain old 4x6 index cards as the most useful.

I have all of my recipes plaintext in an org file with a very simple format...

    
    
      * Recipe title
      ** Ingredients
         - 1 tsp x
         - 2 tbsp y
      ** Directions
         1. Mix together x and y
         2. Bake at 350 for 15 minutes
    

I set up a simple Python script using the same package as the OP's website for
scraping recipe sites to org format, then I export subsections to LaTeX with a
custom class and print to index cards.
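The org-formatting half of such a script could be sketched roughly like this. The function name `recipe_to_org` is hypothetical, and the title, ingredients and directions are passed in directly rather than coming from a scraper, so this only illustrates the output format shown above.

```python
def recipe_to_org(title, ingredients, directions):
    """Render a recipe as an org-mode subtree in the layout shown above."""
    lines = [f"* {title}", "** Ingredients"]
    # One unordered list item per ingredient.
    lines += [f"   - {item}" for item in ingredients]
    lines.append("** Directions")
    # Directions become a numbered list starting at 1.
    lines += [f"   {n}. {step}" for n, step in enumerate(directions, start=1)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(recipe_to_org(
        "Recipe title",
        ["1 tsp x", "2 tbsp y"],
        ["Mix together x and y", "Bake at 350 for 15 minutes"],
    ))
```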

~~~
memset
My site (the OP) uses a separate print-specific CSS to try and format recipes
to a 4x6 index card. It doesn't work as well as I'd like, as in, it's hard to
get what shows up on the screen to print on an odd-sized piece of paper, but
if you, or someone you know, has CSS expertise then I'd love for someone to
help make this work more reliably!

------
dlivingston
This is great and works, for the supported sites, remarkably well.

One major flaw: it seems like the calories and macros aren’t captured. For
bodybuilding and powerlifting types, and other athletes, these are the most
important part of a meal.

What would make this a “killer app”, in my view, is if I could request a
recipe in JSON instead of just formatted plain text as you do. Then I could
use the recipe (and recipe search) in my own home-brewed meal planning
program.

~~~
aspenmayer
Very nice insight. I might even go a step further and archive the entire
page[1]; hard drive space is cheap, and how many recipes is one person going
to save, honestly? 1-2 LOCs worth? Then you can just parse the content you
want, with the ability to drop down into the original page as you first saw
it.

As a person with better visual memory for certain kinds of data, having the
original page content may have as much meaning as the recipe, for entirely
different reasons. Food can be very personal, and recipe books doubly so. A
recipe archive can be as personal as we like, or all of that can be abstracted
away when we don’t need it.

[1] [https://www.gwern.net/Archiving-URLs](https://www.gwern.net/Archiving-URLs)

~~~
gindely
LOC? Line of Code? Surely 1-2 LOCs is significantly less than any recipe.

~~~
icegreentea2
I'm guessing it's "Library of Congress"

~~~
aspenmayer
Yeah, that’s what I meant. I was going for a visual metaphor that might be
lost if I just said some number of bytes.

------
adrianmonk
My quick and dirty trick for finding the actual recipe in a long blog post:

Ctrl+F "print".

Probably 90+% of websites use some tool to format the actual recipe, so the
actual recipe pretty consistently includes a link for printing it. Search for
the word "print", and it takes you to the part of the page that has the
recipe.

(You don't actually have to click the link. It's just a landmark for
navigation within the page.)

------
bjoli
Neat! Now make a website that instantly electrocutes or at least painfully
shocks people that publish recipes as Instagram stories! The accumulated anger
I have after trying to follow those will mean a hefty donation from my side :)

As an aside: I live in a jurisdiction where recipes cannot be copyrighted, so
I have collected all recipes I remotely liked on a web page with only text.
All recipes except some of the very last ones, under the "untested, potentially
disgusting" headline, are in Swedish though:
[https://koketteriet.se/skrivet/Recept/recept.html](https://koketteriet.se/skrivet/Recept/recept.html)

~~~
perk
Wow, your collection of recipes is a goldmine, thanks!

Didn't know that recipes couldn't be copyrighted here in Sweden.

~~~
bjoli
It is due to a very old ruling saying that just listing ingredients and steps
is not enough to reach a "threshold of originality" (verkshöjd).

~~~
TheSpiceIsLife
Maybe that’s why modern recipes on the web are 90,000-word behemoth life
stories with one step of the recipe every 70 screens of indirection.

~~~
bjoli
We can still just copy the how and what, and not the accompanying epic novel
:)

I think I managed to give credit where it is due (in the intro, at least), but
about 60% of the recipes are things we veganised ourselves from whatever old
family recipes we have.

~~~
TheSpiceIsLife
Plugging a friend's books here, all vegan, from Byron Bay in New South Wales:

[http://organicpassioncatering.com/](http://organicpassioncatering.com/)

Anyway, I find, as I rapidly approach the big four-oh, I don’t need recipes
much anyways, so I have a sleeve-book with some recipes, some pages are just
dish names, and others are flavours or ingredients that go together.

------
patrickbolle
This is fantastic! I'd love it if I could pair this with an RSS aggregator to
get new, plain-text recipes in my RSS feed every day/week, etc. Right now I
get an RSS feed of my favourite recipe sites but have to get through so much
junk to see the actual recipe.

I recently went down the rabbit hole of recipe websites while I built a little
side project Shopify app for creating recipes on ecommerce stores[1]. Having a
plaintext version has been on my backlog to-do list forever, but it seems the
vast majority of store owners aren't keen on it. It's been a massive learning
experience for me; and I never realized how... bad the recipe website
experience was for so many people.

Question: how do you think we, as in us as the people building the current
web, can improve the standard recipe display on the web? Obviously removing
the 1300 word novel before recipes is a big plus, but what else do you think
would improve your day-to-day recipe browsing?

[1] [https://recipekit.app/](https://recipekit.app/)

~~~
bagacrap
the obvious answer is to remove the insane number of enormous ads you have to
scroll past or click through on your way past that novella

~~~
patrickbolle
That's valid. I guess I haven't seen ads in a long time thanks to ublock, but
makes sense for the general reader for sure.

------
butler14
As this only works on certain websites, that implies you're not pulling from
something consistent e.g. structured data, which the vast majority of popular
/ visible websites should use

e.g. [https://search.google.com/structured-data/testing-tool/u/0/#...](https://search.google.com/structured-data/testing-tool/u/0/#url=https%3A%2F%2Fwww.everydayhealthyrecipes.com%2Fauthentic-polish-bigos-stew-recipe%2F)

out of interest, did you try using structured data to scale the tool?

~~~
C1sc0cat
Yes I would have thought that handling json mark-up would be the first thing
that would be implemented.

Having said that a lot of users of recipe mark-up have problems implementing
it properly.

------
jspash
Now if you can add a feature to convert Cups -> Grams _correctly_ then you've
got my money!
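Part of why this is hard to do correctly: a cup is a volume, so the gram figure depends on the ingredient's density (and on how tightly it is packed). A rough sketch, where the density table holds approximate, illustrative values rather than authoritative ones, and `cups_to_grams` is a made-up name:

```python
US_CUP_ML = 236.588  # one US customary cup in millilitres

# Approximate densities in grams per millilitre. These are illustrative
# assumptions; real values vary with brand, humidity and packing.
GRAMS_PER_ML = {
    "water": 1.00,
    "granulated sugar": 0.85,
    "all-purpose flour": 0.53,
}


def cups_to_grams(cups, ingredient):
    """Convert a volume in US cups to grams for a known ingredient."""
    try:
        density = GRAMS_PER_ML[ingredient]
    except KeyError:
        # Refuse to guess: without a density there is no correct answer.
        raise ValueError(f"no density known for {ingredient!r}") from None
    return cups * US_CUP_ML * density
```

Raising an error for unknown ingredients, instead of falling back to the density of water, is a deliberate choice here: a silently wrong conversion is worse than none.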

~~~
encom
I don't know how Americans manage to feed themselves, while measuring things
with cups! It's such a bizarre and archaic unit. Like the English stone.

[https://en.wikipedia.org/wiki/Stone_(unit)](https://en.wikipedia.org/wiki/Stone_\(unit\))

~~~
jeremydavid
I'm not American (Canadian living in Germany) and I have to say that I find
using cups/spoons is much easier than pulling out a scale and weighing
everything. Just grab a measuring cup / spoon (every house has a set of
standard-sized measuring cups and spoons), fill it up to the right amount and
dump it in.

~~~
Aeolun
How do you fill a cup with apple slices?

~~~
adrianmonk
It's quite easy to fill a cup with apple slices. You won't get an accurate,
reproducible measurement this way, but the procedure itself isn't difficult.
If accuracy isn't that important, then it works okay.

------
langitbiru
This is awesome. But perhaps this product is better packaged as a browser
extension. So when you browse a recipe page, the product would turn the page
into a plain text recipe.

~~~
bagacrap
I'd rather not install an extension (trust) and not all browsers I use (e.g.
on mobile) support extensions. You can use a bookmarklet pretty much anywhere
though:

javascript:location.href='[https://plainoldrecipe.com/recipe?url='+encodeURIComponent(l...](https://plainoldrecipe.com/recipe?url='+encodeURIComponent\(location.href\))
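The link markup above mangles the bookmarklet, but the idea appears to be: percent-encode the current page's address and pass it as the `url` query parameter of plainoldrecipe.com/recipe. The same URL can be built outside the browser; a small Python sketch, with `plain_recipe_url` as a made-up helper name and the endpoint taken only from the bookmarklet above, not from any documented API:

```python
from urllib.parse import quote


def plain_recipe_url(recipe_url):
    """Build the plainoldrecipe.com link for a given recipe page URL."""
    # safe="" makes quote() escape "/" and ":" as well, which is close to
    # what JavaScript's encodeURIComponent does for typical URLs.
    return "https://plainoldrecipe.com/recipe?url=" + quote(recipe_url, safe="")


if __name__ == "__main__":
    print(plain_recipe_url("https://example.com/some-recipe/"))
```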

~~~
SilasHaslam
Thanks for the great idea. The code got broken in the comment, but was easy
enough to reconstruct. Another bookmarklet I use is PrintFriendly. It uses
some of its own JS, and also works well on recipe sites - giving more options
for what to keep/remove.

------
Polylactic_acid
Recipe websites are the worst of web design. To make things even worse they
are mostly used on mobile where bloat/advertisements/janky loading is even
more of a pain.

~~~
somehnguy
Not to mention the 1200 word partial life story preceding every recipe.

I could not care less what pivotal moment in your childhood led you to like
chocolate cake. I just want to make one.

~~~
newswasboring
>Not to mention the 1200 word partial life story preceding every recipe.

Everyone I know balks at this, yet it exists. Can someone explain why? Where is
this tradition coming from? Does this drive page views? I know nothing about
the cooking subculture. Is it some sort of normies vs "really into this" thing
(and please excuse the phrasing, I could not come up with something better).
Because that would mean the page is just not targeted at me. But then again
that argument would make sense if the monetization were not through
advertisement and view volume.

I didn't realize how curious I was about this.

Edit: just had a thought. Can it be something to do with SEO? Does Google like
bigger articles with lots of personal text? If this is the reason it would
blow my mind. Without knowing it, Google's algo would have made the internet a
shittier place, this is very amusing to me.

~~~
twalla
It's mostly SEO and "engagement" but also possibly copyright law (at least
that's the case with cookbooks that include non-recipe fluff content)

From the US Copyright Office:

"Mere listings of ingredients as in recipes, formulas, compounds, or
prescriptions are not subject to copyright protection. However, when a recipe
or formula is accompanied by substantial literary expression in the form of an
explanation or directions, or when there is a combination of recipes, as in a
cookbook, there may be a basis for copyright protection."

~~~
gruez
Adding your life story into your recipe might make it eligible for copyright,
but what's the point? Someone can simply strip out your life story and then
they'll be allowed to copy it.

------
mobasirhassan
I am eagerly waiting for something like that to happen in the coming days.
There are lots of things like photos, video, ingredients, nutritional value,
serving tips, recipe notes and of course structured data on my site:
[https://hassanchef.com](https://hassanchef.com)

------
furstenheim
Very interesting. I'm now starting a project to organise my recipes. With
[https://github.com/domchristie/turndown](https://github.com/domchristie/turndown)
it transforms the HTML from the clipboard into Markdown; the result is really
good and it's generic to all recipes.

Of course you don't have the structure of the recipe, and it doesn't work with
just a URL.

------
wimagguc
I wonder if this could work as a DuckDuckGo clip? You search for a recipe, and
see the result right there on top, without all the fluff.

------
SamuelAdams
Neat idea! The fiance and I have taken to simply printing recipes on paper. It
has the benefit of eliminating a lot of the cruft that the website has -
condensing the printed material to just ingredients and instructions.

And it makes it really easy to reference a week or two later if we want to remake
it - just pull it out of the binder.

~~~
mhb
I've tried that. After the recipe binder got to be a couple of inches thick,
it would take more discipline than I have to keep it organized in some kind of
effective way. Plus I don't know what way that would be. So I pretty much end
up either looking at the laptop while cooking or printing out a new copy each
time.

------
dmje
This is ace!

I do often wonder if many recipe sites ever really think about the user
experience. It's almost always going to be on mobile and there's either got to
be a simplicity to the presentation (which is what this does brilliantly) OR a
really simple way to flick between ingredients and method.

The one that stands out for me is Jamie Oliver (example:
[https://www.jamieoliver.com/recipes/vegetables-recipes/veggi...](https://www.jamieoliver.com/recipes/vegetables-recipes/veggie-chilli/)) - on mobile there's this
super simple tab that flips you between ingredients / method. It's simple, but
exactly what you want, especially when you're covered in flour / oil / etc... :-)

------
kaonwarb
Thank you for this. For an app that does something similar (not free):
[https://www.paprikaapp.com/](https://www.paprikaapp.com/)

~~~
diroussel
I've been using the Paprika app for years, and I would recommend it.

As well as importing from websites, you can enter your own recipes. And my wife
and I can sync between our phones and laptops so we can follow and edit the
same recipes. Very handy.

------
btbuildem
Very cool! Does a good job on select websites.

Next stop: the holy grail of recipe website cleanup, removing the mindless
drivel about the author's.. whatever / whoever, their inane anecdotes, and
generally any trace of their annoying personalities (here's an offending
example: [https://www.ibreatheimhungry.com/easy-roasted-pork-shoulder-...](https://www.ibreatheimhungry.com/easy-roasted-pork-shoulder-3/))

------
vestrigi
I was already excited when Apple showcased an Extension for the new Safari in
their keynote that did something similar.

I wonder if any of these websites will find a way to prevent these recipe
scrapers from working so that people read the damn ads in their blog texts.
Instapaper’s article extraction gets rejected from time to time and Safari’s
reader mode too, but it’s mostly on major news sites.

------
tlofreso
Nice work! I built something very similar:
[https://recipemincer.com](https://recipemincer.com)

It seems you're using the same Python scraper I am:
[https://github.com/hhursev/recipe-scrapers](https://github.com/hhursev/recipe-scrapers)

------
richardgill88
This is really cool, I've been building a site to help build templated recipes
with markdown and javascript and I'll definitely be using this when I
transcribe recipes!

[https://programmablerecipes.com](https://programmablerecipes.com)

------
Lerain
This is pretty cool! Are you open to PRs for foreign language recipe sites
(German in my case)?

------
ErikAugust
Trim is a general purpose remover of cruft, but seems to work on recipe sites:
[https://beta.trimread.com/articles/23286](https://beta.trimread.com/articles/23286)

------
reactchain
I want a tool like this for _any_ website, that intelligently extracts just
the useful part as plain text. Reader view and so forth just don't work that
well. Are there any tools around like this?

------
bennettfeely
Related: I just launched a website with a collection of easily
copy-and-paste-able lists in plain text.

[https://copypastelist.com/](https://copypastelist.com/)

------
xwdv
Recipes are so common on the web that we need a <recipe> html tag along with
<rs> and <ri> for steps and ingredients.

~~~
2038AD
there is actually a standard using attributes :)
[https://schema.org/Recipe](https://schema.org/Recipe)
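For pages that use those attributes (the microdata flavour, with `itemprop="..."` on ordinary tags), pulling out ingredients and steps could look roughly like this. It's a simplified sketch using only Python's standard library: `ItempropCollector` and `extract_microdata` are made-up names, it assumes reasonably well-formed HTML, and it ignores `itemprop` attributes that list several names at once.

```python
from html.parser import HTMLParser


class ItempropCollector(HTMLParser):
    """Collect the text content of elements carrying a wanted itemprop."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.values = {name: [] for name in self.wanted}
        self._prop = None   # itemprop currently being captured, if any
        self._tag = None    # tag name of the element being captured
        self._depth = 0     # nested same-named tags inside that element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if self._prop is not None:
            if tag == self._tag:
                self._depth += 1
            return
        prop = dict(attrs).get("itemprop")
        if prop in self.wanted:
            self._prop, self._tag, self._depth, self._buffer = prop, tag, 0, []

    def handle_data(self, data):
        if self._prop is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._prop is None or tag != self._tag:
            return
        if self._depth:
            self._depth -= 1
            return
        # Collapse runs of whitespace left over from the markup.
        text = " ".join("".join(self._buffer).split())
        self.values[self._prop].append(text)
        self._prop = None


def extract_microdata(html, wanted=("recipeIngredient", "recipeInstructions")):
    """Return a dict mapping each wanted itemprop to the texts found for it."""
    collector = ItempropCollector(wanted)
    collector.feed(html)
    collector.close()
    return collector.values
```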

------
MaxBarraclough
Tried it with two websites, and neither was supported.

Is this aiming to be a domain-specific equivalent of Outline.com and
reader-mode?

------
bovermyer
Recently I started putting my recipes on my personal website in a minimalist
way. No stories, no ads, no images.

------
blcarson
The title on the homepage is spelled wrong but otherwise nice work! :)

~~~
HenryBemis
"Plan" instead of "Plain". No spellchecker will pick this up :) Many will see,
few will notice, and typoglycemia will do the rest :)

~~~
cstuder
A German synonym for typoglycemia is "Buchstabensalat" - letter salad. So
maybe it's a very subtle joke.

Otherwise there's a PR open:
[https://github.com/poundifdef/plainoldrecipe/pull/1](https://github.com/poundifdef/plainoldrecipe/pull/1)

------
cburger
This is awesome. Thanks !

------
Jolter
Got a certificate security error from Firefox, so didn't visit.

------
greenie_beans
Yes!! Thank you

------
rawoke083600
OH AWESOME !

