PageToSheet: Download Tables from Any Webpage to Excel (pagetosheet.com)
52 points by yolo123 on June 12, 2022 | 20 comments


Firefox has a really nice built-in table selection mode I use to copy into spreadsheets all the time. Just hold down CMD (Mac) or Control (others) and you can select table rows and columns.


Where are these tricks documented?


`=IMPORTHTML` in Google Sheets works great and surprisingly many people who could benefit from it a lot don't know it exists.


`IMPORTHTML("http://example.com", "table", 4)` – https://support.google.com/docs/answer/3093339?hl=en
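For anyone wondering what the three arguments are: the page URL, the element type ("table" or "list"), and a 1-based index saying which table on the page to take. Here is a rough sketch of the same "give me the Nth table" idea in standard-library Python. It runs on a small inline HTML string instead of fetching a live page, it ignores nested tables, and `TableCollector`, `nth_table` and `SAMPLE` are names made up for this illustration, not part of any library:

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collects every <table> as a list of rows of cell text (no nested tables)."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table> seen so far
        self._row = None   # current row, or None when outside a <tr>
        self._cell = None  # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def nth_table(html, index):
    """Return table number `index`, counting from 1 like IMPORTHTML does."""
    parser = TableCollector()
    parser.feed(html)
    return parser.tables[index - 1]


# Hypothetical sample page with two tables.
SAMPLE = """
<table><tr><th>a</th></tr><tr><td>1</td></tr></table>
<table><tr><th>name</th><th>qty</th></tr><tr><td>apple</td><td>3</td></tr></table>
"""

print(nth_table(SAMPLE, 2))
```

Real pages are messier (missing closing tags, colspans, nested tables), which is a good part of why the built-in spreadsheet function is handy.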

Pretty snazzy. They also have `IMPORTFEED` for RSS which is giving me all sorts of neat ideas.


Duly noted - thanks!



Doesn't Excel already do this on its own? It's based on an IE control in my ancient version (2010), but perhaps it has been updated by now.


It's buggy and unpredictable. I feel the crawling engine hasn't been kept up to date.


In newer versions of Excel, Power Query makes it easy, if you can be bothered to learn Power Query...it doesn't matter if the tables are a bit off, because it is a complete functional-esque language. Like SQL plus easy higher order functions.

PDFs can often be extracted too in a similar way.

I feel exhausted every time I see another thread on HN about what to do when you will not learn Excel & Office because it's for PHBs and secretaries.


And it’s *so* slow, on every PC I’ve used. It’s usually easier to just copy and paste the table, if you can.


The point of automation is making sure you get things correct.


I use copytables https://merribithouse.net/copytables/. It has a number of options for selection and also keyboard shortcuts.


There's a neat Chrome extension to do this https://www.georgemike.com/tablecapture/


If you are a Mac user using Numbers and Safari, you can do this by literally selecting the table, and copy pasting it right into Numbers. I do this quite often.


The page is 502ing so maybe I'm missing something but I just copy and paste from my browser into Numbers. Is this for automation?


I use pd.read_html


Woah, didn't know about this, thanks for bringing it to my attention!

Found this article that shows its use:

https://pbpython.com/pandas-html-table.html
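For a quick feel of it, here is a minimal sketch using pandas' `read_html`. It assumes pandas plus an HTML parser such as lxml are installed, and the inline `HTML` string is a made-up stand-in for a real page:

```python
from io import StringIO

import pandas as pd

# Hypothetical page content; in practice you could pass a URL instead.
HTML = """
<table>
  <tr><th>name</th><th>qty</th></tr>
  <tr><td>apple</td><td>3</td></tr>
  <tr><td>pear</td><td>5</td></tr>
</table>
"""

# read_html returns a list with one DataFrame per <table> it finds.
tables = pd.read_html(StringIO(HTML))
df = tables[0]

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```

From there `df.to_csv("tables.csv", index=False)` writes a file any spreadsheet can open, and `df.to_excel(...)` should also work if an Excel writer package such as openpyxl is installed.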


I guess still better off pdf'ing it and running it through tabula (free) or maybe abbyy (paid)


Looks like they need to beef up their web hosting solution.


All sites of this type (static product descriptions, docs, landing pages, etc.) should be thrown on top of Netlify/Vercel/Firebase Hosting/similar.

Zero maintenance, free (well, unless your traffic is huge, but then with the maintenance overhead it still usually comes out cheaper), zero security risk (for you).



