
snake117 · on Nov 11, 2023

Looks interesting, and thank you for sharing this! One common issue with scraping web pages is dealing with data that is dynamically loaded. Is there a solution for this? For example, when using Scrapy, you can have Splash running in Docker via scrapy-splash (https://github.com/scrapy-plugins/scrapy-splash).

philippta · on Nov 11, 2023

Thanks! As mentioned in another comment, there is no built-in support for this yet.

As a workaround, one could use a service like ScrapingBee (not affiliated) as a proxy that renders the page in a browser for you.

Of course, relying on a service for this is not always ideal. I am also working on a small wrapper that turns Chrome into an HTTPS proxy, which you could plug right into flyscrape. Unfortunately, it is still very experimental and not public yet. I have not yet decided whether to release it as part of flyscrape or as a separate project.
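
For illustration only, here is a rough Python sketch of the proxy idea, outside of flyscrape: requests are routed through a proxy that renders the page in a browser and returns the final HTML. The proxy address and both function names are assumptions made up for this example. It does not show how ScrapingBee or the unreleased Chrome wrapper is actually configured.

```python
import urllib.request

# Hypothetical address of a rendering proxy (a hosted service or a local
# browser-backed proxy). Replace with whatever your provider documents.
PROXY_URL = "http://localhost:8080"


def build_proxy_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Return an opener that sends both HTTP and HTTPS traffic via the proxy."""
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    return urllib.request.build_opener(proxy_handler)


def fetch_rendered(url: str, opener: urllib.request.OpenerDirector) -> str:
    """Fetch a page through the proxy and return the response body as text."""
    with opener.open(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")
```

Usage would look like `fetch_rendered("https://example.com/", build_proxy_opener(PROXY_URL))`, assuming something is actually listening at that address.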

figmert · on Nov 11, 2023

Can't you load the URL that is being dynamically loaded directly within your scraper?

mdaniel · on Nov 11, 2023

Not only can you, in my experience it is substantially less drama and arguably less load on the target system, since the full page may make many other requests that a presentation layer cares about but I don't.

The trade-offs usually fall into:

- authing to the endpoint can sometimes be weird

- it for sure makes the traffic stand out since it isn't otherwise surrounded by those extraneous requests

- it, as with all good things scraping, carries its own maintenance and monitoring burden
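
A minimal sketch of the direct-endpoint approach discussed above, using only the Python standard library. The endpoint URL, the `"items"` key, the user agent string, and the function names are all hypothetical. In practice you would find the real endpoint in the browser's network tab and adapt the parsing to whatever it returns.

```python
import json
import time
import urllib.request

# Hypothetical endpoint, standing in for the request the page's own
# JavaScript makes when it loads its data.
API_URL = "https://example.com/api/items?page=1"


def build_request(url: str, user_agent: str = "my-scraper/0.1") -> urllib.request.Request:
    """Build a GET request that asks for JSON, as the page's script would."""
    return urllib.request.Request(
        url,
        headers={"Accept": "application/json", "User-Agent": user_agent},
    )


def parse_items(body: bytes) -> list:
    """Decode a JSON body; assumes the records sit under an "items" key."""
    payload = json.loads(body.decode("utf-8"))
    return payload.get("items", [])


def fetch_items(url: str, delay_seconds: float = 1.0) -> list:
    """Fetch one endpoint, then pause so repeated calls stay rate limited."""
    with urllib.request.urlopen(build_request(url), timeout=10) as response:
        items = parse_items(response.read())
    time.sleep(delay_seconds)
    return items
```

Calling `fetch_items(API_URL)` issues a single request instead of a full page load. Authentication, if the endpoint needs it, would have to be added to the headers in `build_request`.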

However, similar to those trade-offs, it's also been my experience that a full page load offers a ton more tracking opportunities that are not present in a direct endpoint fetch. I mean, look at how many "stealth" plugins are out there, designed to mask the fact that a headless browser is headless.

But, having said all of that: without question the biggest risk to modern-day scraping is Cloudflare and Akamai gatekeeping. I do appreciate the arguments of "but ddos!11" and yet I would rather that only actors actually exhibiting bad behavior[1] be blocked, instead of everyone with a copy of Python who has set reasonable rate limits.

[1] - this is setting aside that "bad behavior" can be defined as "downloading data that the site makes freely available to Chrome but not freely available to Python"

snake117 · on Jan 22, 2023

I didn't have enough time to dive into all the information or test out the editor. However, one question I have is what sets ERD Lab apart from other existing solutions. What motivated you to make your own DB design tool? If a prospective customer were weighing the pros and cons of several DB design tools, or already using another tool, what would compel them to choose ERD Lab?

Maybe consider including a table highlighting the differences in features/pricing from other similar tools on the homepage (after the list of features) or a separate page entirely highlighting this information? For example, Render.com has a page specifically highlighting how their product is a better PaaS solution compared to Heroku [0]. This is just a thought.

Personally, I'm not familiar enough with DB design editors to know what I should be looking for in such a tool. Moreover, about a year ago, I was looking for a solution to this exact problem, so I am genuinely curious about this.

Finally, just wanted to mention that while I don't need this at the moment, I did bookmark it to consider using in the future.

Otherwise, great job, and I wish you the best!

[0] - https://render.com/render-vs-heroku-comparison

snake117 · on Oct 26, 2020

Probably one of my favorite courses during my undergraduate studies at the University of Michigan. Recently, I needed to brush up on statistics and was excited when I found this entire course available online.

All of the course materials are available under the 'Materials' tab, including the labs.

The lecture playlist on YouTube can be found here: https://www.youtube.com/playlist?list=PL432AB57AF9F43D4F

Back when I took this course, they were teaching us how to use SPSS, but it seems like they have since transitioned to teaching R. Definitely worth checking out if you need to review the fundamentals of statistics and probability.

snake117 · on Aug 25, 2020

I've read a lot of comparisons between Asana and Jira, but I was curious if anyone can explain the difference between Asana and Basecamp? I don't have experience with either service.

snake117 · on June 17, 2020

Great news! Also, thank you and everybody for all your hard work :)

Will there still be the $15k grant consideration if you become certified when you submit your app for the YC Core program?

kcorbitt · on June 17, 2020

We aren't currently evaluating companies for grants, but are likely to restart a version of that program later in the year. If and when we do, founders that participate in the program starting now will be eligible!

snake117 · on May 29, 2020

Thanks for sharing! Do you mind me asking what admin template you used for the app? I'm searching for a decent admin template right now with a similar color scheme.

snake117 · on Feb 6, 2020

The only other company I know attempting to revolutionize email is Superhuman [0]. It will be interesting to compare these services. One thing I can say is that Basecamp will certainly be competitive if they can offer their service for less than $30 per month, which is how much Superhuman charges currently [1].

[0] - https://superhuman.com/

[1] - https://techcrunch.com/2019/06/27/my-six-months-with-30-mont...

davidivadavid · on Feb 7, 2020

I don't think Hey.com is supposed to be an email client, though, is it?

omarchowdhury · on Feb 8, 2020

Not enough information provided to determine that.

snake117 · on Jan 27, 2020

Along with the other resources mentioned, I would check out https://elixircasts.io/

I find the videos to be great, both in terms of quality and pacing. Make sure to use the coupon code 'elixirforum' (without quotes) to get 10% off!

snake117 · on Jan 15, 2020

This is an awesome post. Thank you for sharing your story with the HN community.

I was browsing your GitHub and played around a little with IsoCity [0]. I really like it! Small projects like these are great and you can learn so much from them. You said it very well:

> I don't do projects to gather attention, I do cause I have fun doing them.

Thanks again for sharing and all the best :)

[0] - https://github.com/victorqribeiro/isocity

atum47 · on Jan 16, 2020

Yeah, I was surprised by all the attention that project got. I made sure to link to the artist's portfolio so people would give him some love too.

snake117 · on Jan 2, 2020

This brings to mind Dr. Greer and the Sirius Disclosure project [0]. The first documentary, Sirius, is available on YouTube [1] and the second documentary, Unacknowledged, can be viewed on Netflix.

To be clear, I'm not sure if I entirely believe in all of this myself. However, I don't regret watching any of it. At the very least I got some inspiration for short stories/screenplays.

[0] - https://siriusdisclosure.com/ and https://www.youtube.com/channel/UCC6B4Y0oFACv9QBlf0ebBcg

[1] - https://www.youtube.com/watch?v=5C_-HLD21hA