I remember reading about a project whose sole purpose is to provide a large index of the open web for free, so that anyone can download it. I forgot the name of the project.
Why can’t mwmbl download their index?
Also, is mwmbl planning on providing their crawled index for free? Like, can I also download it later?
If that is the case, I’d happily download their FF extension.
I installed the extension because why not, then I noticed it was only crawling spam pages and redirect links that were being abused for spam... I guess that's kind of expected, but I'm not sure how I feel about it.
Though, it's still a very cool idea, maybe an option to crawl sites I visit would be nice?
> Though, it's still a very cool idea, maybe an option to crawl sites I visit would be nice?
I had an extension installed for a while that submitted pages to the Internet Archive automatically, but it was a constant battle to remember to denylist any sites that were personal (bank, doctor, whatever) before visiting them. That was before I was heavy into the Firefox containers setup, so if I were to try that again I'd try to find a way to disable it for those containers (which, come to think of it, may be yet another container feature request).
Having thought a little further about your suggestion, I could imagine an extension that merely submitted the window.location.origin to the search engine and let it index the site, as a heuristic for "this site is popular enough to have received a visit in the past hour/day/whatever". With Mwmbl specifically, though, that would put things back in a loop, since it would send the site back to your browser to index it for them.
I sure do hope Mwmbl's extension is not using the full browser context to make requests, otherwise any request to index mail.google.com would be no bueno
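To make the origin-only idea above a bit more concrete, here is a rough sketch of the decision logic such an extension might use. Everything in it is hypothetical: the names DENYLIST, MIN_INTERVAL_MS and originToSubmit are made up for illustration, and none of this is taken from Mwmbl's extension or the Internet Archive one. It only decides what to report; actually sending it anywhere is left out.

```typescript
// Hypothetical sketch: these names are illustrative, not from any real extension.

// Hosts that should never be reported (bank, doctor, webmail, ...).
const DENYLIST: ReadonlySet<string> = new Set([
  "mail.google.com",
]);

// Report each origin at most once per hour (an arbitrary choice).
const MIN_INTERVAL_MS = 60 * 60 * 1000;

// Remembers when each origin was last reported, in ms since the epoch.
const lastReported = new Map<string, number>();

// Returns the origin to submit, or null if this visit should stay private
// or was already reported recently.
function originToSubmit(pageUrl: string, now: number): string | null {
  let url: URL;
  try {
    url = new URL(pageUrl);
  } catch {
    return null; // not a parseable URL
  }
  // Skip about:, file:, moz-extension: and similar internal pages.
  if (url.protocol !== "https:" && url.protocol !== "http:") return null;
  if (DENYLIST.has(url.hostname)) return null;

  const last = lastReported.get(url.origin);
  if (last !== undefined && now - last < MIN_INTERVAL_MS) return null;

  lastReported.set(url.origin, now);
  // Only scheme + host + port leave the browser, never the path or query.
  return url.origin;
}
```

In a real extension this would presumably be called with window.location.href and Date.now() on each page load. A denylist has the same weakness as before, though: you still have to remember to add the personal sites before you visit them, so an allowlist might be the safer default.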