Hacker News | zhyder's comments

Looks like the best display you can get in laptops at this price: 2408x1506 resolution, 500 nits, antireflective coating (!). And bonus points for no silly notch.


I guess it could warn about it, but the VM sandbox is the best part of Cowork. The sandbox itself is necessary to balance the power you get from generated code (that's hidden from the user) with the security you need for non-technical users. I'd go even further and make the user grant host filesystem access only to specific folders, and warn about anything with write access: I can think of lots of easy-to-use UIs for this.
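For what it's worth, the folder-grant check is simple to sketch (a toy illustration, not Cowork's actual design; all paths and the grant table are hypothetical): resolve any path the agent requests and test it against the user-granted roots, treating write access more strictly.

```python
from pathlib import Path

# Hypothetical grant table: host folders the user has explicitly shared,
# each flagged read-only ("ro") or read-write ("rw").
GRANTS = {
    Path("/home/user/projects/demo").resolve(): "rw",
    Path("/home/user/docs").resolve(): "ro",
}

def check_access(requested: str, write: bool) -> bool:
    """Allow access only inside a granted folder; writes need an 'rw' grant."""
    p = Path(requested).resolve()  # collapses ../ tricks before checking
    for root, mode in GRANTS.items():
        if p == root or root in p.parents:
            return mode == "rw" if write else True
    return False  # anything outside all grants is denied

print(check_access("/home/user/docs/notes.txt", write=False))  # read inside an ro grant
print(check_access("/home/user/docs/notes.txt", write=True))   # write to an ro grant
print(check_access("/etc/passwd", write=False))                # outside all grants
```

Resolving before checking matters: a request for `/home/user/docs/../secrets` normalizes to a path outside the grants and gets denied.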


Model card: https://deepmind.google/models/model-cards/gemini-3-1-flash-...

Pretty close to Gemini 3 Pro Image (aka Nano Banana Pro) in most benchmarks, even without thinking+search, and even exceeding it in the two most important ones, 'Overall Preference' and 'Visual Quality'. I'm excited about the big jump in Infographics/Factuality (even without thinking+search; I'm surprised that text+image search grounding doesn't make an even bigger dent).


Surprisingly big jump in ARC-AGI-2 from 31% to 77%, guess there's some RLHF focused on the benchmark given it was previously far behind the competition and is now ahead.

Apart from that, the usual predictable gains in coding. It's still a great sweet spot for performance, speed, and cost. Need to hack Claude Code to use its agentic logic+prompts but with Gemini models.

I wish Google also updated Flash-lite to 3.0+, would like to use that for the Explore subagent (which Claude Code uses Haiku for). These subagents seem to be Claude Code's strength over Gemini CLI, which still has them only in experimental mode and doesn't have read-only ones like Explore.


>I wish Google also updated Flash-lite to 3.0+

I hope every day that they have made gains on their diffusion model. As a sub agent it would be insane, as it's compute light and cranks 1000+ tk/s


Agree, can't wait for updates to the diffusion model.

Could be useful for planning too, given its tendency to think big picture first. Even if it's just an additional subagent to double-check with an "off the top of your head" or "don't think, share your first thought" type of question. More generally, I'd like to see how sequencing autoregressive thinking with diffusion over multiple steps might help with better overall thinking.


The only thing I can notice is that Deep Research is better: much closer to outputting an arXiv-quality paper straight away.

I am really the bottleneck now: deciding what to do with all this new information.


"the value of a human eyeball" / attention is and always will be the limited resource. But I wish the way the economy worked wasn't that attention is sold for money, which makes money the moat, and sets a floor on how low-priced things can get for customers too. Is this really the best the economy can do? Or is it possible to have a fair LLM-based search engine that matches customer need description with stated product descriptions from providers (while weighing customer reviews, etc)?


Hmm the whole point of checkpoints seems to be to reduce token waste by saving repeat thinking work. But wouldn't trying to pull N checkpoints into context of the N+1 task be MUCH more expensive? It's at odds with the current practice of clearing context regularly to save on input tokens. Even subagents (which I think are the real superpower that Claude Code has over Gemini CLI for now) by their nature get spawned with fresh near-empty context.

Token costs aside, arguably fresh context is also better at problem solving. When it was just me coding by hand, I didn't save all my intermediate thinking work anywhere: instead thinking afresh when a similar problem came up later helped in coming up with better solutions. I did occasionally save my thinking in design docs, but the equivalent to that is CLAUDE.md and similar human-reviewed markdown saved at explicit -umm- checkpoints.
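The token arithmetic backs this up. A rough sketch with made-up numbers (both the checkpoint and prompt sizes are hypothetical): if task N+1 pulls all N prior checkpoints into context, its input tokens grow linearly in N, so total input over a session grows quadratically, which is exactly what regular context-clearing avoids.

```python
def input_tokens(task_prompt: int, checkpoint_size: int, n_prior: int) -> int:
    """Input tokens for a task whose context carries all prior checkpoints."""
    return task_prompt + checkpoint_size * n_prior

FRESH_PROMPT = 5_000   # hypothetical per-task prompt + repo context
CKPT = 20_000          # hypothetical tokens per saved checkpoint transcript

# Fresh-context cost stays flat; checkpoint-carrying cost climbs per task.
for n in (0, 5, 20):
    print(n, input_tokens(FRESH_PROMPT, CKPT, n))
```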


Yes, I also don't see how having persistent context would help. For one, I don't want to read the slop the AI produced while it wrote the code. It doesn't have intent or thinking; it's a completion machine. The code is the artifact that matters, not the thousands of lines of "You're absolutely right --- I was wrong to ..."

Also, sometimes it gets something very wrong. I don't want to then poison every subsequent session with the wrong thing it learned. This has been a major issue for me at $WORK with AGENTS.md files that my colleagues write: they make my agent coding much worse, so I need to manually delete them often.


So 2.5x the speed at 6x the price [1].

Quite a premium for speed. Especially when Gemini 3 Pro is 1.8x the tokens/sec speed (of regular-speed Opus 4.6) at 0.45x the price [2]. Though it's worse at coding, and Gemini CLI doesn't have the agentic strength of Claude Code, yet.

[1] - https://x.com/claudeai/status/2020207322124132504
[2] - https://artificialanalysis.ai/leaderboards/models


6x price/token, so 15x price/second, and only at the API pricing level, not the far cheaper (per token) subscription pricing.
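(The 15x falls out of multiplying the two ratios, since dollars burned per wall-clock second = price per token × tokens per second. A quick sanity check of that multiplication, using the numbers from the thread:)

```python
price_ratio = 6.0   # fast mode's price per token vs regular, per the thread
speed_ratio = 2.5   # fast mode's tokens/sec vs regular, per the thread

# Cost per second of generation scales with both price and speed:
cost_per_second_ratio = price_ratio * speed_ratio
print(cost_per_second_ratio)  # 15.0
```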

Definitely an interesting way to encourage whales to spend a lot of money quickly.


I didn’t quite understand why they were randomly giving people $50 in credits. But I think this is why?


No, it’s for Max subscribers to enable “use API when running out of session limit”. The assumption (probably) being that many will forget to turn it off, and they’ll earn it back that way.


This was my first thought, but by default you have no automatic reload of your prepaid account, which I think is, for once, user-friendly. They could have applied a dark pattern here.


Gemini is pretty good for frontend tasks


> Though it's worse at coding, and Gemini CLI doesn't have the agentic strength of Claude Code, yet.

You can use OpenCode instead of Gemini CLI.


or you can proxy Gemini through Claude Code


That sounds pretty nice. How are you achieving that?


Litellm makes it easy


Love it. Wonder if it's viable for citizen journalism in warzones and areas of civil unrest, with the larger size of photos (and short videos), given the inherently slow transfer rates and battery life implications of going thru multiple hops before Internet-exiting the area that's otherwise Internet-offline. What's the back-of-the-envelope math here on viable bandwidth?
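My own back-of-the-envelope attempt (every number here is my assumption, not from the project): take a ~3 MB compressed photo, ~10 kB/s of effective goodput per Bluetooth LE hop, and 5 store-and-forward hops to reach a node with Internet access. Each hop re-transmits the whole payload, so transfer time scales with hop count.

```python
def transfer_minutes(size_mb: float, goodput_kbps: float, hops: int) -> float:
    """Store-and-forward transfer time: every hop re-sends the full payload."""
    seconds_per_hop = (size_mb * 1024) / goodput_kbps  # kB divided by kB/s
    return seconds_per_hop * hops / 60

# Assumed: 3 MB photo, ~10 kB/s per BLE hop, 5 hops to the Internet exit.
print(round(transfer_minutes(3, 10, 5), 1))  # minutes for one photo
```

Tens of minutes per photo is workable for occasional stills from a cut-off area, but video quickly becomes impractical at these rates.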

Wifi obviously has higher bandwidth, but I guess it isn't viable as a mesh, or is there any trick with turning on/off hotspots on phones dynamically that'd make it viable? (Afaik older phones made you pick between being a hotspot or being a regular wifi client, but at least some newer ones seem to allow both simultaneously.)

I'm definitely hoping for a future with wider support for C2PA (content credentials on images) on phone cameras to make these photos power citizen journalism. So far Samsung S25 and Pixel 10 support C2PA in the camera hardware: need other phone makers (especially Apple) to get on board already... if you're an iPhone user, please help yell at Apple support etc!

Aside: I registered a domain and plan to build a citizen journalism news feed for such photos (and uncut videos). I see it as the antidote to Instagram et al's feeds that're full of AI slop (and plenty of fakery even before AI-generated imagery got big). And it's essential to truth, democracy and ultimately (maybe I'm too idealistic here) peace. Aside to the aside: wish some of us techies banded together to build "peace tech" as a new sector in tech, DM if interested in brainstorming or working together.


Sounds like antirez, simonw, et al are still advocating reviewing the code output of these agents for now. But presumably soon (within months?) the agents will be good enough such that line-by-line review will no longer be necessary, or humanly possible as we crank the agents up to 11.

But then how will we review each PR enough to have confidence in it?

How will we understand the overall codebase too after it gets much bigger?

Are there any better tools here other than just asking LLMs to summarize code, or flag risky code... any good "code reader" tools (like code editors but focused on this reading task)?


We will review fully until they reach superhuman perfection.


Most car manufacturers made this mistake because they started mimicking the then leader for innovation (and customer satisfaction), Tesla, too much.

General cautionary tale: just because a company is successful doesn't mean it's doing _everything_ right. Plenty of folks who love their Teslas would prefer a few more buttons (and door handles on the inside, etc.) if given the choice. Could say similar things about some choices Apple made.


I own a Tesla, and I agree.

1. What Tesla did right was put a big screen in the center of the car, and then actually think about the UX, and how to improve the software to avoid having to fiddle every other minute with controls on the screen (e.g. climate control is usually amazing, I rarely touch the temperature). What other companies did was just put the screen and slap on sub-par software without much regard for UX, so of course it sucks, even if you have the big screen.

2. Yes, I'd have loved a couple extra buttons, perhaps programmable. My main gripe for instance is/was the air re-circulation (used to live in a country with lots of tunnels), but I'm sure others would have liked some other button. I'd have been very happy to have 3-4 software-programmable buttons for the most used functions.


> What Tesla did right was put a big screen in the center of the car

I would disagree with that. You do not need a big flashing distract-o-tron in the middle of the dashboard.

Cars should have exactly zero screens.


> I would disagree with that. You do not need a big flashing distract-o-tron in the middle of the dashboard.

Except my car's screen is not distracting: I set it up for my destination, I give it a glance when needed for navigation, and I basically don't touch it until I'm done driving, because (second part of the previous comment) the UX is so well done that I don't have to. Worst case, voice control works well enough for e.g. changing playlists and songs or changing destination mid-trip.

> Cars should have exactly zero screens.

People have been attaching TomToms and mobiles to the windscreen for the past 30 years anyway to solve exactly the same problem (navigation), and they were always inferior solutions to a well-done integrated screen: detaching on a bump, leaving forever-smudges, having to update all maps offline, removable meaning easier to steal, limited functionality... So I disagree. I'd rather have governing bodies evolve to take screen UX into account in regulation: if they did, most cars with screens couldn't have been sold.


You shouldn't have things like TomToms and mobiles attached to the windscreen.

Turn all that off. Don't drive distracted.


> Cars should have exactly zero screens.

Backup / 360 view cameras and navigation? I'd argue those are a lot safer than no camera looking backwards and fiddling with maps / phones.


The display dims adequately, and is far less distracting than competitors, who usually have multiple displays and flashing lights. Especially luxury brands, who do the above and have "bejeweled" decorative LEDs all over the cockpit.

Tesla has the most subdued interior of any brand on the market.


But why do you want a massive glaring floodlight shining in your face when you're driving at all?

The screen is not useful.


Sure, I would prefer 90s interfaces if I had the choice, but given the products on the market, Tesla's attentiveness to the driver experience (low LCD brightness, moderate-contrast UI, reduced demands on the driver) exceeds all competitors by a large margin: better than luxury brands, better than German cars.


Even leaving the big distracting floodlight in the middle of the dash out of it, I don't like Teslas because I don't think an 80 grand car should feel like a 30 grand car.

If they want to sell cars at that price they need to not feel like a base-spec Skoda.


My car was $37k out the door (Model Y LR). It was priced just below a RAV4 where I live.

The features are better than mid-tier luxury. Fit and finish is adequate, still better than a RAV4.

You sound like you're my age (given your preferences and experience with cars). The car market is very different now.

We rented a $35k 2025 Prius and it felt like a 2005 Corolla.

Believe me, Teslas are a great value.


I've driven a Model 3 and frankly it was a rattly plasticky piece of shit.

My mum's 14-year-old Fiat felt more solid.


not within the past 5 years, then


CarPlay is wonderful and Google Maps on a display is a hell of a lot safer than paper maps.


I don’t think they were following Tesla. It’s a trend that affects everything, including washing machines. Tesla is a mere symptom


It's not (just) imitation/fashion/aesthetics. Shoving everything into a display allows manufacturers to:

* compress and de-risk production timelines because changes can be made in software instead of requiring retooling/replacing parts.

* reduce cost; a display is basically required by legislation mandating back-up cameras. Add in a few settings or a map view and it has to be a touch screen. Consolidating everything else into a part you are already mandated to include reduces cost.

* meet customer requirements; except at the very bottom end of the market, customers expect cars to have space to display a map and be able to use music streaming services. CarPlay/Android Auto is also a requirement for some users.

