
I genuinely think Kagi has led the way on this one. Simplicity is beautiful and effective, and Kagi has (IMHO) absolutely nailed it with their AI approach. It's one of those things that in hindsight seems obvious, which is a pretty good measure of how good an idea is IMHO.

Google could have done it and kind of tried, although they're AI sucks too much. I'm very surprised that OpenAI hasn't done this sooner as well. They're initial implementation of web search was sad. I don't mean to be super critical as I think generally OpenAI is very, very good at what they do, but they're initial browse the web was a giant hack that I would expect from an intern who isn't being given good guidance by their mentors.

Once mainstream engines start getting on par with Kagi, there's gonna be a massive wave of destruction and opportunity. I'm guessing there will be a lot of new pay walls popping up, and lots of access deals with the search engines. This will even further raise the barrier of entry for new search entrants, and will further fragment information access between the haves and have-nots.

I'm also cautiously optimistic though. We'll get there, but it's gonna be a bit shakey for a minute or two.




> I'm also cautiously optimistic though. We'll get there, but it's gonna be a bit shakey for a minute or two.

But I don't understand how all of these AI results (note I haven't used Kagi so I don't know if it's different) don't fundamentally and irretrievably break the economics of the web. The "old deal" if you will is that many publishers would put stuff out on the web for free, but then with the hope that they could monetize it (somehow, even just with something like AdSense ads) on the backend. This "deal" was already getting a lot worse over the past years as Google had done more and more to keep people from ever needing to click through in the first place. Sure, these AI results have citation results, but the click-through rates are probably abysmal.

Why would anyone ever publish stuff on the web for free unless it was just a hobby? There are a lot of high quality sites that need some return (quality creators need to eat) to be feasible, and those have to start going away. I mean, personally, for recipes I always start with ChatGPT now (I get just the recipe instead of "the history of the domestication of the tomato" that Google essentially forced on recipe sites for SEO competitive reasons), but why would any site now ever want to publish (or create) new high quality recipes?

Can someone please explain how the open web, at least the part of the web that requires some sort of viable funding model for creators, can survive this?


> Why would anyone ever publish stuff on the web for free unless it was just a hobby

That's exactly what the old deal was, and it's what made the old web so good. If every paid or ad-funded site died tomorrow, the web would be pretty much healed.


That's a bit too simple. There are way fewer people producing quality content "for fun" than people that aim or at least eventually hope to make money from it.

Yes a few sites take this too far and ruin search results for everyone. But taking the possibility away would also cut the produced content by a lot.

YouTube, for example, had some good content before monetization, but there are a lot of great documentary-like channels now that simply wouldn't be possible without ads. There is also clickbait trash, yes, but I'd rather have both than neither.


Demonetizing the web sounds mostly awesome. Good riddance to the adtech ecosystem.


The textual web is going the way of cable TV - pay to enter. And now streaming. "Alms for the poor..."

But, like on OTA TV, you can get all the shopping channels you want.


Not to be the downer, but who pays for all the video bandwidth, who pays for all the content hosting? The old web worked because it was mostly a public good, paid for by govt and universities. At current webscale that's not coming back.

So who pays for all of this?

The web needs to be monetized, just not via advertising. Maybe it's microtransactions, maybe subscriptions, maybe something else, but this idea of "we get everything we want for free and nobody tries to use it for their own agenda" will never return. That only exists for hobby technologies. Once they are mainstream they get incorporated into the mainstream economic model. Our mainstream model is capitalism, so it will be ever present in any form of the internet.

The main question is how people/resources can be paid for while maintaining healthy incentives.


No one paid you to write that?


Except I also pay my network provider to run the infrastructure

I think you forgot that


It costs the Internet Archive $2/GB to store a blob of data in perpetuity, their budget for the entire org is ~$37M/year. I don't disagree that people and systems need to be paid, but the costs are not untenable. We have Patreon, we have subscriptions to your run of the mill media outlets (NY Times, Economist, WSJ, Vox, etc), the primitives exist.

The web needs patrons, contributions, and cost allocation, not necessarily monetization and shareholder capitalism where there is a never ending shuffle of IP and org ownership to maximize returns (unnecessarily imho). How many times was Reddit flipped until its current CEO juiced it for IPO and profitability? Now it is a curated forum for ML training.

I (as well as many other consumers of this content) donate to APM Marketplace [1] because we can afford it and want it to continue. This is, in fits and starts, the way imho. We piece together the means to deliver disenshittification (aggregating small donations, large donations, grants, etc).

(Tangentially, APM Marketplace has recently covered food stores [2] and childcare centers [3] that have incorporated as non profits because a for profit model simply will not succeed; food for thought at a meta level as we discuss economic sustainability and how to deliver outcomes in non conventional ways)

[1] https://www.marketplace.org/

[2] https://www.marketplace.org/2024/10/24/colorados-oldest-busi...

[3] https://www.marketplace.org/2024/08/22/daycare-rural-areas-c...


> There are way fewer people producing quality content "for fun" than people that aim or at least eventually hope to make money from it...But taking the possibility away would also cut the produced content by a lot.

....is that a problem? most of what we actually like is the stuff that's made 'for fun', and even if not, killing off some good stuff while killing off nearly all the bad stuff is a pretty good deal imo.


Agreed. The entire reason why search is so hard is because there's so much junk produced purely to manipulate people into buying stuff. If all of that goes away because people don't see ads there anymore, search becomes much easier to pull off for those of us who don't want to stick to the AI sandbox.

There's a slight chance we could see the un-Septembering of the internet as it bifurcates.


Unless the reason for the death of the paid content deal is because of AI vacuuming up all the content and spitting out an anonymous slurry of it.

Why would anyone, especially a passionate hobbyist, make a website knowing it will never be seen, and only be used as a source for some company's profit?


> and only be used as a source for some company's profit?

Are we forgetting the main beneficiaries? The users of LLM search. The provider makes a loss or pennies per million tokens, while the users solve actual problems. Could be education, could be health, could be automating stuff.


The problem is not the ad sites dying. The problem is that even the good sites will not have any readers, as the content will be appropriated by the AI du jour. This makes it impossible to heal the web, because people create personal sites with the expectation of at least receiving visitors. If nobody finds your site, it is as if it didn't exist.


I'm not so sure.

I think the best bloggers write because they need to express themselves, not because they need an audience. They always seem surprised to discover that they have an audience.

There is absolutely a set of people who write in order to be read by a large audience, but I'm not sure they're the critical people. If we lost all of them because they couldn't attract an audience, I don't think we'd lose too much.


Exactly. Even if people don't publish information for money, a lot of them do it for "glory" for lack of a better term. Many people like being the "go to expert" in some particular field.

LLMs do away with that. 95% of folks aren't going to feel great if all of the time spent producing content is then just "put into the blender to be churned out" by an LLM with no traffic back to the original site.


ChatGPT puts trillions of tokens into human heads per month, and collects extensive logs of problem solving and outcomes of ideas tested there. This is becoming a new way to circulate experience in society. An experience flywheel. We don't need blogs, we get more truthful and aligned outcomes from human-AI logs.


You, for one, welcome our new AI overlords?

Blogs have the enormous advantage of being decentralized and harder to manipulate and censor. We get "more truthful and aligned outcomes" from centralized control only so long as your definition of "truth" and "alignment" match the definitions used by the centralized party.

I don't have enough faith in Sam Altman or in all current and future US governments to wish that future into existence.


But wouldn't it disincentivize those who create knowledge? AFAIK, most of the highly specific knowledge comes from small communities where a shared goal and socialization with like-minded individuals are the incentive to keep acquiring and describing knowledge for community members. Would it really be helpful to put an AI between them?


First issue, silos of information.

Second issue: who decides the weights of sources? This is the reason why every nation must have culturally aligned AIs defending their ways of living in the information sphere.


Yet 300M users are creating interactive sessions on ChatGPT, which can be food for self-improvement. AI has a native way to elicit experience from users.


Only middle-class and rich people could participate in "the old deal" Internet made by and for hobbyists. I think people forget this. It was not so democratized and open for everyone – you first had to afford a computer.

If you're a member of a yacht club, you can probably expect other members to help you out with repairs while you help them. But when a club has half the world population as members, those arrangements don't work anymore.


As if OpenAI won't end up offering paid access to influence these results, or advertise inside them. Of course they will, just like how Google started without ads.

It will be even more opaque and unblockable.


To quote Prince: ahh, now people can finally go back to making music for the sake of making music.


Remember in that time, less web content meant major media outlets dominated news and entertainment on TV and newspapers.


Paging Sergey


The internet was great before the great monetization of it, had tons of information provided for free with no ads. After ads, it will still have tons of information. Stack Overflows will still exist, Wikipedias, corporate blogs that serve just to boost the company, people making courses and other educational content, personal blogs (countless of which make their way here), all of those will continue to exist.

Ad-driven social networks will continue to exist as well.

The age of the ad-driven blog website is probably at an end. But there will be countless people posting stuff online for free anyway.


Nobody will visit Stack Overflow, because AI, through its reasoning and back and forth with users, will have solved the problems. This process creates training data for future AIs of that particular company, unavailable to any other.


Many people have an intrinsic motivation to share knowledge. Have a look at Wikipedia. There are enough of these people that we don't need to destroy the open Internet to accommodate those who only write when they expect to be paid.


[flagged]


You have some examples of that?


Their stance during Covid to ban any mentions of the lab leak theory. Even if not considered the most likely, it had always been a possibility, and not an absurdist one.


> "the history of the domestication of the tomato" that Google essentially forced on recipe sites for SEO competitive reasons

That may help with SEO, but another reason is copyright law.

Recipes can't be copyrighted, but stories can. Here is how ChatGPT explained it to me:

> Recipes themselves, particularly the list of ingredients and steps, generally can't be copyrighted because they're considered functional instructions. However, the unique way a recipe is presented—such as personal stories, anecdotes, or detailed explanations—can be copyrighted. By adding this extra content, bloggers and recipe creators can make their work distinctive and protectable under copyright law, which also encourages people to stay on their page longer (a bonus for ad revenue).

> In many cases, though, bloggers also do this to build a connection with readers, share cooking tips, or explain why a recipe is special to them. So while copyright plays a role, storytelling has other motivations, too.


> can't be copyrighted because they're considered functional instructions

by that logic software shouldn't be copyrighted either!


Would like to read more about this. Has anybody used this technique to actually successfully sue someone for infringing their copyright on an instructional website or is it only theoretically possible?


> Why would anyone ever publish stuff on the web for free...?

Why indeed, person who posted for free* on the Internet?

As a side note, consider that ads can be woven into and boosted in LLM results just as easily as in index lookups.

* assuming that you're not shilling here by presenting the frame that the new shiny is magically immune to revenue pressures


It can be like the youtube premium model. The search app is subscription based. So every time your content is served you will get paid. But you have to make your content available to the AI for crawling and mention your monetisation preferences.
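A minimal sketch of how such a payout could be computed, assuming a simple pro-rata split of the subscription pool by number of serves. The function name, the site names, and the rounding rule are all hypothetical, not anything an existing search app is known to do.

```python
from collections import Counter

def split_revenue(pool_cents: int, serve_log: list[str]) -> dict[str, int]:
    """Split a subscription pool across sites in proportion to how often each was served."""
    counts = Counter(serve_log)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Integer division keeps payouts in whole cents; any remainder stays in the pool.
    return {site: pool_cents * n // total for site, n in counts.items()}

# Hypothetical month: a $10.00 pool and four served answers.
payouts = split_revenue(
    1000,
    ["recipes.example", "recipes.example", "blog.example", "news.example"],
)
print(payouts)
```

The hard part in practice is probably deciding what counts as a "serve" when an answer blends several sources, which this sketch does not attempt.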


The recipe trade-off doesn't make sense: while it's trivial to skip the history, you can't skip the false ingredients of the gpt variety

Then this whole category is not known for "high quality recipes", so the general state wouldn't change much?


That's why you're seeing media companies making deals with companies like OpenAI to allow them to access their content for AI learning/parsing purposes, in exchange for the media company getting paid yearly royalties.

Since anyone creating content (whether that's a big media corp or a small cooking blog) holds copyright over their content, they get to withhold the permission to scrape their content unless these AI platforms make a deal with them.


By the same logic they'd get to sue over the scraping done to originally train the models. If royalties need to be paid for additional use, they would've needed to be paid for the original use.


> Can someone please explain how the open web, at least the part of the web that requires some sort of viable funding model for creators, can survive this?

The funding model for the open web will be for the open web content to be the top of the funnel for curated content and/or walled gardens.

I think many business models already treated the web this way. Specifically, get people away from the 800-pound gorilla rent-seekers like Google and Amazon, and get them into your own ecosystem.


> fundamentally and irretrievably break the economics of the web

Good riddance, it is a surefire way to get slop by having misaligned incentives for publication.


> Why would anyone ever publish stuff on the web for free unless it was just a hobby?

So that ChatGPT mentions you, not your competitor, in the answer to the user. I have seen multiple SEO agencies already advertise that.


Wait, did google force "the history of the domestication of the tomato" to be part of recipes on the web for SEO reasons?


Yep, I was incredibly skeptical about Kagi but I tried it and never looked back. Now my wife, friends, and several coworkers are customers.

The ChatGPT approach to search just feels forced and not as intuitive.


Once Kagi implements location aware search that is actually useful I’ll be interested in Kagi. That’s what made me leave the engine besides loving it otherwise.


Google Maps is quite the moat. I suspect they'll need to find a way to license the data, e.g. via API. Apple has not (yet) been as successful at building out a database of local places with reliable hours of operations, reviews, etc.


There already is an official Google Maps API, but it is already very expensive with prices rising from time to time. There is no other company (other than maybe Meta) that has this much POI data in the western world.

So that is a solid advantage that Google is going to have, but the maps business alone wouldn't be able to keep it in the S&P list for long.


Why not? I can see a future where Maps becomes the core of Google's business. It's already their strongest offering, together with YouTube. In every other field Google has been beaten.


Same here - I do a lot of location aware searches. When I left Kagi after trying it out for a while, I wrote detailed feedback hoping it would be useful to the Kagi team.


If you go to maps.kagi.com and allow access to your location, local results should be better. If it doesn't ask for access to your location, there is a small icon on the bottom right-hand side that shows whether it has access.


That's great if I'm trying to find a location, but that's not what local results is about.

Local results means that if I search for "driving laws", Google gives me .gov sites for my state as the top results, while Kagi's first page gives me results for 8 other states (including Alaska!) but not for my state.

There are a lot of kinds of queries that benefit from knowing the user's location even though they aren't actually looking for a place that exists on a map.

(I'm a happy paying Kagi user, but OP is right that this is its weakest point by far.)


Isn't the alternative to just simply type "driving laws for [state]"? That doesn't seem too odious.


This boils the problem down to a dichotomy which isn’t how it works in the real world. Most of the searches I make that aren’t tech related searches have a location based aspect to them. Anything I do in my day to day life involving logistics has a high chance of needing some location based search. Kagi (and DDG) performs at a range of 0% usefulness to 70% usefulness on average for these kinds of searches. Usually it’s 0%. There is simply a huge gap here in what Kagi offers when you need to search for results near you vs the leading competitor


Yep. Or county or city or whatever is relevant.

It's not terrible—as I said, I'm a happy customer—but it's not a habit I have and it feels like something that should be configurable once in a settings menu. I don't even really want to have it detect my location live, I just want to be able to tell it where I live and have it prioritize content that's local when given the chance.


For what it’s worth DuckDuckGo is flawed in the exact same way. I ended up leaving DDG for the exact same reason years ago


The entire selling point of DDG is that its search results are not personalised. This is not a flaw.


Personalized != contextualized. You could have a search engine that uses geolocation without building any sort of cross-request profile on the individual making the search.


But OP is right that this would actually be serving their target demographic less well than serving everyone the same results regardless of context. The fact that the results don't know where the user is is reassuring for the kind of user who wants to use a privacy-oriented search engine, regardless of whether localized results could technically be provided in a privacy-preserving way.


These things are not mutually exclusive. Allow me to specify a city or state or county or country or zip code as a bang in my search and show me good results based on that. Both problems immediately solved. I wouldn’t be any more or less reassured about a search engine’s privacy stance if that feature was offered to me. This is a feature I can absolutely use in a private way (I can do that search over tor or a vpn with two hops if I so desire), and it gives me the control over what I provide the search engine and how and when.

Right now search engines don’t provide an interface for good location aware searches that you can manually specify - you have to let them build a shadow profile on you via all sorts of privacy violating fingerprints or just give up location aware searches altogether. There’s no reason it has to be that way though.
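A rough sketch of what a manually specified location could look like, assuming a made-up `!loc:<place>` token and results that carry a hypothetical `regions` tag. No search engine is known to support this exact syntax. The point is only that the location travels with the single query and nothing is stored between requests.

```python
import re
from typing import Optional

# Hypothetical syntax: "!loc:<place>" anywhere in the query, e.g. "driving laws !loc:ohio".
LOC_BANG = re.compile(r"(?:^|\s)!loc:([\w-]+)")

def parse_query(raw: str) -> tuple[str, Optional[str]]:
    """Return (query without the bang, location or None). Keeps no state between calls."""
    match = LOC_BANG.search(raw)
    if match is None:
        return raw.strip(), None
    cleaned = raw[:match.start()] + raw[match.end():]
    return " ".join(cleaned.split()), match.group(1).lower()

def boost_local(results: list[dict], location: Optional[str]) -> list[dict]:
    """Stable sort: results tagged with the requested location first, the rest in original order."""
    if location is None:
        return results
    return sorted(results, key=lambda r: location not in r.get("regions", ()))

query, place = parse_query("driving laws !loc:Ohio")
hits = [
    {"url": "https://dmv.alaska.example", "regions": ["alaska"]},
    {"url": "https://bmv.ohio.example", "regions": ["ohio"]},
    {"url": "https://generic.example"},
]
ordered = boost_local(hits, place)
```

Because the location is an explicit part of the query string, it works the same over Tor or a VPN and the user decides per search whether to send it.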


> These things are not mutually exclusive. Allow me to specify a city or state or county or country or zip code as a bang in my search and show me good results based on that. Both problems immediately solved.

Do you actually find that attaching your location to the end of the query doesn't work? I don't do it naturally, but when I do do it I'm rarely disappointed.


I wouldn't usually point this out, but as you did it repeatedly: "they're" is a contraction of "they are". You're looking for the possessive, "their".

- Your local grammar pedant


I gave Kagi a shot two weeks ago, and it instantly impressed me. I didn't realize how much search could be improved. It's a beautiful, helpful experience.


Yeah, it’s wonderful. Especially once you take the time to up/downrank domains.


Could it be that Kagi benefits from being niche, though? Google search gets gamed because it’s the most popular and therefore gaming it gives the best return. I wonder if Kagi would have the same issues if it was the top dog.


I think they absolutely benefit from being niche, but there are a few other things they have going for them that won't go away if they become popular:

* They're not ad funded. Sergey Brin and Larry Page called this out in 1998 and it is just as true as ever: you need the economics to align. Kagi wins if people keep paying for it. Google wins if you click on Search ads or if you visit a page filled with their non-Search ads.

* Partially because of the economic alignment, Kagi has robust features for customizing your search results. The classic example is that you can block Pinterest, but it also allows gentler up- and down-weights. I have Wikipedia get a boost whenever its results are relevant, which is by itself a huge improvement over Google lately. Meanwhile, I don't see Fandom wikis unless there's absolutely nothing else.

I hope to see more innovation from Kagi in the customization side of things, because I think that's what's going to make the biggest difference in preventing SEO gaming. If users can react instantly to block your site because it's filled with garbage, then it won't matter as much if you find a brief exploit that gets you into the first page of the natural search results. On Google, Fandom is impossible to avoid. On Kagi it just takes one click.
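To make the block/up-weight/down-weight idea concrete, here is a minimal sketch assuming each result has a base relevance score and each user keeps a per-domain preference. This is not Kagi's actual ranking; the `BLOCK` sentinel, the multipliers, and the scores are invented for illustration.

```python
from typing import Union

BLOCK = "block"  # hypothetical sentinel meaning "never show this domain"

def personalize(
    results: list[tuple[str, float]],
    prefs: dict[str, Union[str, float]],
) -> list[tuple[str, float]]:
    """Drop blocked domains, scale the rest by the user's multiplier, then re-sort by score."""
    adjusted = []
    for domain, score in results:
        pref = prefs.get(domain, 1.0)  # domains without a preference keep their base score
        if pref == BLOCK:
            continue
        adjusted.append((domain, score * pref))
    return sorted(adjusted, key=lambda item: item[1], reverse=True)

# Made-up base scores from the engine, and one user's preferences.
results = [("pinterest.com", 0.9), ("fandom.com", 0.8), ("wikipedia.org", 0.6), ("example.org", 0.7)]
prefs = {"pinterest.com": BLOCK, "fandom.com": 0.1, "wikipedia.org": 2.0}
ranked = personalize(results, prefs)
print([domain for domain, _ in ranked])
```

Since the adjustment happens per user after the base ranking, a site that briefly games the base score still gets dropped or pushed down for anyone who has already marked it.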


I don’t understand how it’s different to Perplexity, looks pretty much the same. Can you enlighten me?


Not op, but Kagi user. Also have perplexity but usually use kagi.

I would say: 1) The UI. You’re still performing normal searches in Kagi. But if you hit q, or end your query with a question mark, you get an llm synthesized answer at the top, but can still browse and click through the normal search results.

2) Kagi has personalization, ie you uprank/downrank/block domains, so the synthesized llm answer should usually be better because it has your personalized search as input.
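Those two points can be sketched together, assuming the trailing question mark is the only trigger and that the summary is built from the first few results the user has not blocked. The function name and the cutoff of three sources are assumptions, and the LLM call itself is left out.

```python
def plan_search(query: str, blocked: set[str], hits: list[str], top_n: int = 3) -> dict:
    """hits are domains in rank order; returns what to display and what to summarize."""
    visible = [h for h in hits if h not in blocked]
    # Assumed trigger: a query ending in "?" asks for a synthesized answer on top.
    wants_answer = query.rstrip().endswith("?")
    return {
        "results": visible,  # the normal result list is always shown
        "summarize_from": visible[:top_n] if wants_answer else [],
    }

hits = ["pinterest.com", "wikipedia.org", "example.org"]
with_answer = plan_search("is a tomato a fruit?", {"pinterest.com"}, hits)
plain = plan_search("tomato", {"pinterest.com"}, hits)
```

The summary only ever sees sources that survived the user's own filters, which is why personalization should carry over to the synthesized answer.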


Paying customer of Kagi here.

In addition to all that's been written above, you can configure personal filters, so that (for example) you never ever see a pinterest page in your search results. Things like that are IMO killer features today.


Are you referring to Kagi Assistant?

https://help.kagi.com/kagi/ai/assistant.html


I am referring to all of their AI stuff, Kagi Assistant included. Personally, the best feature is the quick answer. It essentially scans the top several hits, uses an LLM to read them and see if they answer your question, and displays a summary that also includes links to the full sources. I find that feature to be wonderful. I will usually look through the quick answer to see if a site actually answers the question I have, and then I'll click through. If everyone implemented it like this, it's possible it could save the current model.


> absolutely nailed it with their AI approach.

Thankfully, Kagi also have a toggle to completely turn that crap (AI) off so it never appears.

Personally, I have absolutely no use for a product that can randomly generate false information. I'm not even interested until that's solved.

(If/when it ever is though, at that point I'm open to taking a look)

So yeah, Kagi definitely "leads the way" on this. By giving the user a choice to not waste time presenting AI crap. :)


You have no use for a product that can randomly generate false information but you trust google to provide you with search results based on how much they were paid for those keywords?…

give me ai hallucinations over google every day of the week and twice on sunday…


> but you trust google

???

Looks like my comment wasn't as clear as I thought. I do not trust Google at all, and don't use it. That's why I pay for Kagi.

And Kagi has an option to disable the AI crap, so it's just like "a good search engine" instead, which is all I need.

A high quality search engine without ads, and without hallucinated bullshit.


>but you trust google to provide you with search results based on how much they were paid for those keywords?

Google isn't paid for keywords, that's not how search works. They sell ad space, Google does not rank up search content for payment.

And also the obvious point is, you don't need to trust Google because they merely point you to content, they don't produce the content. They're an index for real existing content on the web which you can judge for yourself. A search index unlike an AI model, does not output uniform or even synthetic content.


Google makes money if you click through to a site that displays Google ads. This includes doubleclick, which is pay per impression and is owned by Google.


“Google does not rank search content for payment” might be the funniest thing I read all year on this site…


[flagged]


I know the internet lately incentivizes low effort comments like this but be better.



