
daoudc · 2023-12-30T17:46:31

Here's an alternative to streaming services that I've been trying out: buy second-hand DVDs, rip them, then serve them with Plex. You can get 5 DVDs for £10, which is more than I can watch in a month and less than I'd pay for Netflix, and the choice is huge, even if I restrict myself to these cheap ones.

trevyn · 2023-12-30T17:53:16

Let me guess, you have also sworn fealty to your Lord and Master the King?

daoudc · 2023-11-30T21:50:33

I can't comment on the question as I'm not a designer and not in Germany, but there's a typo on the GitHub page: "tought" -> "thought".

pb82 · 2023-12-01T10:10:11

Thanks, good spot!

daoudc · 2023-11-29T21:55:33

Hi HN! Would love to get your feedback on this idea and the feature. It is super early and there are lots of issues with it, but the basic idea is there.

anenefan · 2023-11-29T22:47:33

I worked on an idea some years ago for a couple of months before shelving it (it was beyond my capabilities), once it became clear that the workable way forward was for sites themselves to identify which labels would best cover each page.

Nonetheless, I slowly deduced that, apart from clear spam, people would be saved a lot of time in searches if two main types of site could easily be identified in search results, and either included or excluded depending on the nature of their search.

The first being billboard or banner types, where a business has thrown up a large-looking site that really has no working data apart from address, contact and about info: a quick "really, you knew this" summary of their organisation or company.

The second are what I refer to as redirection-type sites: sites that don't actually have any, or much, of their own data. They're just coasting on already existing services [this might have caught people who refashion Google Maps with additional overlays, but so many now do not], or they're an indirect way to get their parent services out through child sites. I'm one who'd search excluding both if I'm after hard information. Generally people can use regular searches to get addresses and contact phone numbers for physical sales and service outlets.

daoudc · 2023-11-02T18:13:54

All I can see is a "continue with Google" button...

daoudc · 2023-10-07T21:45:58

This is so cool, the new machine is a beast!

Do you know roughly how many pages you have in your index?

marginalia_nu · 2023-10-07T22:18:25

I'm at about 164 million docs now. So hopefully this will take me into the billions :D

daoudc · 2023-10-07T21:40:48

I've been using dokku for Mwmbl and have been fairly happy with it.

Congrats on the recent success by the way ;)

daoudc · 2023-09-22T14:00:43

That seems to be because it's written as "kamilkazani", and automatically splitting such names is a hard problem.

daoudc · 2023-09-19T13:33:27

The more people that join, and help us crawl, the better it gets.

daoudc · 2023-09-19T13:28:24

Yes, I took a break for a while, my fourth child was just born! Still committed to the project though and working on it when I get time.

Thanks for your encouragement. Would love to have a chat some time.

marginalia_nu · 2023-09-19T13:41:02

Feel free to shoot me an email, I love to talk search engines :D

daoudc · 2023-09-19T13:26:48

We actually just take the union and then re-rank. Because the lists are all small, this is cheap.
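As a rough illustration of "take the union and then re-rank", here is a minimal sketch. The function name `union_and_rerank`, the `score` callback and the `top_k` cut-off are all hypothetical and only there for the example; this is not the project's actual code or ranking function.

```python
from typing import Callable, Iterable


def union_and_rerank(
    posting_lists: Iterable[list[str]],
    score: Callable[[str], float],
    top_k: int = 10,
) -> list[str]:
    """Merge small per-term result lists, then order the merged set by score.

    `score` is a stand-in for whatever ranking function is used (an assumption).
    """
    # Union: every document that appears in at least one list is a candidate.
    candidates: set[str] = set()
    for postings in posting_lists:
        candidates.update(postings)

    # Re-rank: highest score first, ties broken by document ID so the
    # output is deterministic.
    return sorted(candidates, key=lambda doc: (-score(doc), doc))[:top_k]


# Example: score each document by how many term lists it appears in.
lists = [["a", "b"], ["b", "c"]]
hits = {"a": 1, "b": 2, "c": 1}
print(union_and_rerank(lists, lambda doc: hits[doc]))
```

The sort costs O(n log n) in the number of candidates, so it stays cheap only while the lists themselves are short.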

marginalia_nu · 2023-09-19T13:39:16

Point is, with a skip list (or similar), the lists don't need to be small. You can intersect data sets that are enormous very quickly using this algo[1] where a single linear read of both lists is the worst case scenario.

[1] https://nlp.stanford.edu/IR-book/html/htmledition/faster-pos...
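For anyone who wants to see the shape of it, here is a minimal sketch of intersecting two sorted posting lists with skip pointers. It assumes each list holds unique, ascending document IDs, and it places an implicit skip pointer every sqrt(n) entries, which is a common heuristic rather than anything stated above. The function name `intersect_with_skips` is made up for the example.

```python
import math


def intersect_with_skips(a: list[int], b: list[int]) -> list[int]:
    """Intersect two sorted lists of unique doc IDs using skip pointers."""
    # Skip pointers sit at indexes that are multiples of the skip length
    # (assumed spacing: sqrt of the list length, at least 1).
    skip_a = max(1, int(math.sqrt(len(a))))
    skip_b = max(1, int(math.sqrt(len(b))))

    i = j = 0
    result: list[int] = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            # Follow a skip only if its target does not overshoot b[j];
            # everything jumped over is smaller than b[j], so no match is lost.
            if i % skip_a == 0 and i + skip_a < len(a) and a[i + skip_a] <= b[j]:
                while i % skip_a == 0 and i + skip_a < len(a) and a[i + skip_a] <= b[j]:
                    i += skip_a
            else:
                i += 1
        else:
            if j % skip_b == 0 and j + skip_b < len(b) and b[j + skip_b] <= a[i]:
                while j % skip_b == 0 and j + skip_b < len(b) and b[j + skip_b] <= a[i]:
                    j += skip_b
            else:
                j += 1
    return result


print(intersect_with_skips([2, 4, 8, 16, 19, 23, 28, 43],
                           [1, 2, 3, 5, 8, 41, 51, 60, 71]))
```

When no skip can be taken, each index only advances by one per step, so the worst case is a single linear pass over both lists.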