
Show HN: Privacy conscious, discovery focused search engine – Whize (Alpha) - Grimm1
https://alpha.whize.co
======
anotheryou
Well, for now I sadly don't see any advantage over GitHub's own search.

What will the first dedicated "discovery" feature be?

Focusing on discovery sounds amazing, but just adding noise to a "perfect"
Google-like ranking would only degrade the search results overall.

edit: I could imagine ranking harshly for quality of content and focusing
less on direct relevance (e.g. keyword occurrence) and more on "proximity" of
the results. By proximity I mean a topical or social relation between your
query and the result. Search for WordPress, find other CMSs and WP plugins.

But quality and proximity are both so much harder to measure than just pulling
the keyword matches to the top.

edit2: for visibility of startups I could rather imagine a hand-curated and
sorted list of tools

~~~
Grimm1
To the point of your first edit: that is basically what we demo here with
GitHub. We have some metrics for the repos we've been tracking, and we have
various knobs that control how fast a top result degrades from its position
while trying to maintain topical relevance in the results. The difficult part
is that GitHub gives you some canned metrics you can use as an approximation
for quality and novelty, and the equivalents aren't immediately obvious for
the rest of the web. We have some thoughts on how to tackle that, though.
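
A minimal sketch of what such a knob could look like, purely as an
illustration and not Whize's actual ranking: every name here (`Repo`,
`discovery_score`, `half_life_days`, `relevance_floor`) is hypothetical, and
the exponential half-life decay is just one assumed way to let a top result
drift down while a relevance floor keeps off-topic results out.

```python
from dataclasses import dataclass


@dataclass
class Repo:
    name: str
    relevance: float    # topical match to the query, 0..1 (assumed input)
    quality: float      # e.g. derived from canned GitHub metrics, 0..1 (assumed)
    days_on_top: float  # how long the repo has already held a top position


def discovery_score(repo, half_life_days=14.0, relevance_floor=0.3):
    """Score a repo; the score halves every `half_life_days` spent on top."""
    if repo.relevance < relevance_floor:
        return 0.0  # never trade away topical relevance for novelty
    decay = 0.5 ** (repo.days_on_top / half_life_days)
    return repo.relevance * repo.quality * decay


def rank(repos, **knobs):
    """Return repos with a positive score, best first."""
    scored = [(discovery_score(r, **knobs), r) for r in repos]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for score, r in scored if score > 0]
```

With a short half-life a long-standing top result gives way to fresher
repos; with a very long one the ranking behaves like a plain
relevance-times-quality sort.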

For number 2, it's interesting that you mention that. Our initial inspiration
for this was the awesome lists, which, if you look them up, are hand-curated
lists on different topics that people have put together. We wanted something
that could accommodate a much wider array of interests, and we thought that by
figuring something out algorithmically we could do it for anything we've
crawled. What we didn't like about those lists is that they're static: the top
item in the list is going to stay that way. We wanted to create something that
refreshes itself and stays useful for discovering new things about a topic of
interest.

~~~
anotheryou
Maybe you could show what is novel about each search result a bit better?

~~~
Grimm1
So like some human-understandable description derived from the metrics we are using?
"This website is fairly new and is seeing a lot of active contribution." "This
website's content is different from other websites in the same category."
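
As a rough sketch of that idea, assuming made-up metric names
(`age_days`, `recent_commits`, `distinctiveness`) and arbitrary thresholds
rather than anything Whize actually tracks, the mapping from metrics to
sentences could be as simple as:

```python
def explain(metrics, max_age_days=180, min_recent_commits=20,
            min_distinctiveness=0.7):
    """Turn raw per-result metrics into short human-readable reasons.

    All metric keys and thresholds are hypothetical placeholders.
    """
    reasons = []
    is_new = metrics.get("age_days", float("inf")) <= max_age_days
    is_active = metrics.get("recent_commits", 0) >= min_recent_commits
    if is_new and is_active:
        reasons.append("This website is fairly new and is seeing a lot of "
                       "active contribution.")
    if metrics.get("distinctiveness", 0.0) >= min_distinctiveness:
        reasons.append("This website's content is different from other "
                       "websites in the same category.")
    return reasons
```

Each result would then carry zero or more of these reasons next to its link.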

~~~
anotheryou
Exactly.

Or if you mix them in a smart way maybe explain that: ("we promote projects
that were at least a bit active within the last x months and ...").

Maybe "top lists" could work? most active, most unique, most collaboration

------
Grimm1
To give some context, for this demo we crawled the entirety of public GitHub
to show off our idea of what we mean by novel. We chose GitHub because we
figured HN might like to see some cooler repos for things they are interested
in.

We have the capability to crawl about a hundred million sites a day and are
figuring out where to go from here.

If you want to learn more about what we're doing:
https://medium.com/@iantbutler01/existing-search-engines-fail-independent-and-small-businesses-enter-whize-404958949534

~~~
kburman
> entirety of public GitHub to show off our idea

Well, I tried searching for my public repo and ended up with 0 results. Same
with my username.

~~~
Grimm1
Worth noting, too, that you may just not meet our idea of novel: if your score
is sufficiently low, you are not ranked.

~~~
kburman
> meet our idea of novel

Thanks for letting me know.

~~~
Grimm1
Always welcome to build a competing service with your idea of what is novel.

