Hacker News | aoeusnth1's comments

or, hear me out, organic machines have conscious experience because existence itself is divine. Humans don't have a special soul separate from the universe, they have a soul because they are the universe: materialism.

That argument might make sense if Bing was as good, but Bing is worse. Wasting money building a bad product does not entitle you to market share.

You just left two comments that say the same thing. I replied to the one you left first.

https://news.ycombinator.com/item?id=43946577


all that would make sense, but Bing is worse. Wasting money building a bad product does not entitle you to market share.

That's (a) a different argument than the competition is "too risk averse", (b) subjective, and (c) arguably the result of a number of flywheel effects. That is, Bing's ability to compete is hampered by the fact that Google already has an overwhelming majority of search traffic from which to learn and improve.

For example, from the second filing I linked to:

> After search began appearing on phones, Google started logging information about user location, swipes, and other user-related movements. PFOF ¶¶ 1003–1004. This data is now vital to every aspect of search, including figuring out where and when to crawl specific websites, how to index the information retrieved from that crawl, what documents to retrieve from the index in response to a user query, and how to rank the retrieved items. Some elements of Google’s search engine are trained on 13 months of data—a volume that would take Bing over 17 years to accumulate.


Also, what is Bing's retention on windows? They try to cram it down your throat, but people still go straight for chrome/google.

LMArena is a joke, though

This post is a very weak and incoherent criticism of a well-formulated benchmark: the task length at which a model succeeds 50% of the time.

- Gary says: This is just the task length that the models were able to solve in THIS dataset. What about other tasks?

Yeah, obviously. The point is that models are improving on these tasks in a predictable fashion. If you care about software, you should care how good AI is at software.

- Gary says: Task length is a bad metric. What about a bunch of other factors of difficulty which might not factor into task length?

Task length is a pretty good proxy for difficulty, that's why people grade a bug in days. Of course many factors contribute to this estimate, but averaged over many tasks, time is a great metric for difficulty.

Finally, Gary just ignores that, despite his view that the metric is meaningless, it has extremely strong predictive value. This should give you pause: how can an arbitrary metric with no connection to the true difficulty of a task, and no real way of validating it across tasks or across task-takers, produce such a smooth curve in retrospect and so closely predict the recent data points from Sonnet and o3? Something IS going on there, which cannot fit into Gary's ~spin~ narrative that nothing ever happens.
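For anyone who hasn't looked at what a "50% task length" means in practice, here is a minimal sketch of one way to estimate it. Big caveats: the data is made up, the function names are mine, and fitting a logistic curve against log task length is my assumption about a reasonable method, not a claim about the benchmark authors' exact procedure.

```python
import math

def fit_logistic(xs, ys, steps=20000, lr=0.1):
    """Fit P(success) = sigmoid(a + b * x) by gradient ascent on the mean log-likelihood."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            grad_a += y - p
            grad_b += (y - p) * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b

def fifty_percent_horizon(task_minutes, successes):
    """Return the task length (in minutes) at which the fitted success rate is 50%."""
    logs = [math.log2(m) for m in task_minutes]
    mean = sum(logs) / len(logs)
    # Center the inputs so plain gradient ascent converges quickly.
    a, b = fit_logistic([x - mean for x in logs], successes)
    # sigmoid(a + b * (x - mean)) = 0.5 exactly when x = mean - a / b.
    return 2.0 ** (mean - a / b)

# HYPOTHETICAL results: how long each task takes a human, and whether the model solved it.
task_minutes = [1, 2, 4, 8, 15, 30, 60, 120, 240, 480]
successes = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]

horizon = fifty_percent_horizon(task_minutes, successes)
print(f"estimated 50% horizon: {horizon:.0f} minutes")
```

The point of collapsing the results to one number per model is that you can then plot that number against release date and see whether the trend is smooth.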


In principle, math proofs are another relatively easy-to-verify problem. In the extreme case, you can express any math proof in a computer-verifiable formalism, and checking it requires no intelligence. Step back one step, and you could have a relatively weak model translate a proof into a verifiable formalism and then use a tool call to run the verification. Coming up with the proof is an expensive search process, while verifying it is more mechanical. Even if it is not completely trivial to make the proof computer-verifiable, it might still be a vastly easier task than finding the proof in the first place.
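To make "computer-verifiable formalism" concrete, here is a tiny sketch in Lean 4. The theorem name is made up for illustration; `Nat.add_comm` is a lemma from Lean's core library. The checker either accepts the file or rejects it, and no judgment is involved.

```lean
-- A statement together with its proof term. If the proof were wrong,
-- Lean would refuse to accept the file.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete arithmetic fact, verified by computation alone.
example : 2 + 2 = 4 := rfl
```

Finding the right lemma or proof term is the search problem; checking it is just running the tool.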

I have a hard time believing that he hadn't already made up his mind to make an open source model when he posted the poll in the first place

Why not make the strong model compile a non-ai-driven test execution plan using selectors / events? Is Moondream that good?


Definitely a good question. Using an actual LLM as the execution layer allows us to more easily swap to the planner agent in the case that the test needs to be adapted. We don’t want to store just a selector based test because it’s difficult to determine when it requires adaptation, and is inherently more brittle to subtle UI changes. We think using a tiny model like Moondream makes this cheap enough that these benefits outweigh an approach where we cache actual playwright code.


100 years optimistically!? That's an incredibly pessimistic timeline, maybe one of the most hardline "nothing ever happens" outlooks I've ever heard articulated.


that's crazy to say. mars is very cold and very dry and not shielded from radiation and doesn't have much air and that air isn't breathable.

i wouldn't say we've settled antarctica, which is on our planet and has air.

100 years would be a wild amount of time for us to settle mars.


It's also particularly awkward to land on, as it has just enough atmosphere to be annoying, but not enough to be particularly helpful. Most Mars landings have involved some sort of ridiculous Rube Goldberg machine or other (see https://en.wikipedia.org/wiki/Sky_crane_(landing_system) , https://en.wikipedia.org/wiki/Mars_Pathfinder#Entry,_descent... ) which would not be viable for humans (and were only arguably viable for the probes they were used for; the risk of failure was high).


Add to that a soil and dust that's toxic to humans. Our biology, unsurprisingly, is only compatible with a single planet.


We purposefully decided not to settle there socially, yet we have settled there permanently with research and military stations.

Just as we will with Mars.

And yes we grow things there, even if just green onions and herbs.

Not to mention the reason for this isn't that it is insurmountable, merely that far better land is close by.

100 years is beyond pessimistic. We could easily have settled Mars with 1970s tech.


> the reason for this isn't that it is insurmountable, merely that far better land is close by.

times a few orders of magnitude and this is the main reason to doubt a settled Mars colony.

Possibly a research outpost, but why would that be staffed by humans rather than robots?


There is no 'few orders of magnitude', because it's conquered, you're not allowed to settle there, and it's not a new frontier.

Yet Mars is a new frontier, and endless, massive numbers of humans would go to Mars in a heartbeat.

Frankly, settlers travelling to the new world in the 1400s faced a far more dangerous journey and living conditions than a trip to Mars.

We're talking first explorers here, so many died on boats, of starvation the first year, on and on.

Modern tech does not remove said risk, but it tips the playing field.


Given that no one has yet traveled to Mars, “faced a far more dangerous journey” seems a ridiculously hyperbolic statement. (Thinking even about the lost colony at Roanoke.)


Colonizing Mars isn't a problem. Colonizing Mars is a goal. Making that happen requires addressing a ridiculous number of problems and sub-problems.

If history teaches us anything, the biggest problem is supply chains - and supply chains have been so difficult to get right that they've led to countless famines, lost wars, failed businesses and economic crises. And those have all been supply chains here on Earth, mostly between fixed locations at fixed distances with relatively few environmental hazards and risks compared to space travel.

If we want to create a sustainable multi-planetary future, we need to solve this incrementally. Colonizing the moon would be a logical stopgap. But as it stands now we haven't even established a presence on the moon - let alone a permanent one. The only presence we have off-planet is the ISS and that one's still in Low Earth Orbit, no different from regular communication satellites, so that only qualifies as "off-planet" by not being on the surface of the planet.

Remember that we can't just scale up space travel independently either. Even if SpaceX figures out how to do space launches every other day, that still requires a supply chain for fuel, parts, refinement, resource extraction, etc., all of which also needs to be scaled up accordingly. And that's just for launching stuff into space, which so far has mostly meant LEO.


Have you read about the natural conditions on Mars?

I doubt there will be a permanent settlement in a thousand years.


I think you are both on different pages about settlement vs just a visit.


Eh, it's a reasonable prior. The timeline is "it will never happen" until the leap forward happens that makes it "within 2 years." Basically the same as air flight.

You can't know when the leap will happen so it's basically picking a year that seems far enough off to be pretty darn sure.


It doesn't require a leap forward, we could put boots on the ground with 1990s tech.


aye we could have. and they'd all be long dead on the surface of Mars by this point. getting them there isn't enough.


Keeping them alive and returning them doesn't require "a leap" which is the central point of OP I am disagreeing with. We have all the technology, material science etc to do it.

Sure, it requires some research, engineering and a crapload of investment, but it doesn't require anything that is currently "science fiction".


Their inference costs are the lowest in the business.

