maxdo · 2026-06-01T18:41:48 1780339308

Lantern | Software Engineers (Python, Full-Stack) | NYC (Onsite) | Full-time | withlantern.com

Lantern is building AI-powered revenue intelligence—autonomous agents that turn cold outreach into warm conversations. We detect buying signals across your tech stack, prioritize opportunities, and help B2B sales teams focus on deals that actually close. We're early, agentic-first, and moving fast.

Senior Software Engineer: You'll own complex technical problems end-to-end—translating customer needs into scalable architecture, optimizing AI model efficiency through prompt engineering, and shipping features that directly impact how revenue teams operate. We want bold ideas and people who push boundaries.

Junior Software Engineer: You'll work across our stack (Python, FastAPI, Next.js, React) alongside senior engineers, contributing to real features from day one. Great fit if you're hungry to learn, ship fast, and grow with a startup building cutting-edge agentic systems.

New Grad Software Engineer: Entry point for recent grads excited about AI and agentic systems.

You'll get hands-on experience with our modern data stack (Postgres, Elasticsearch, ClickHouse, DuckDB), Kubernetes/AWS deployment, Temporal.io, pi.dev, context engineering, and workflow orchestration while shipping production code.

This role is based in our Manhattan office (4 days/week in-person). Full health and dental coverage.

Apply: https://forms.gle/kEi5MDDiaJKdyN3fA or email maksym at withlantern.com with your resume.

maxdo · 2026-05-30T00:33:25 1780101205

The great-UI cult could be an early pre-AI phenomenon, not a lost decade.

You will need 100 times less front end if any action can be requested or explained in voice, text, video, or slides, always customized to your request. You don't need navigation or complex forms for structure. Just text, an animation while you wait, and a few ways to embed more complex structures, like text and basic HTML with charts.

This sounds like "almost UI", but in reality it's not. This is text plus some ad hoc custom visualizations.

maxdo · 2026-05-29T19:15:51 1780082151

Oh, the most prominent EU AI company. Without reading the article, I'll predict what comes next and update after:

1. They give up on building competitive models. It’s time to drink wine, not to struggle with competition.

2. Because of #1, they will talk a bit about something around LLMs, maybe coding agents, and then start talking about sovereignty.

lejalv · 2026-05-30T06:14:29 1780121669

Unlike you, who drank the wine before writing the comment.

FinnKuhn · 2026-05-29T19:20:29 1780082429

3. They are going to start focusing on B2B implementation and deployment.

See what happened to Aleph Alpha...

maxdo · 2026-05-27T03:07:34 1779851254

Benchmarks we deserve: Google Search quick AI answers vs. a full LLM :)

irthomasthomas · 2026-05-27T09:44:24 1779875064

search answers use Flash 3.5

maxdo · 2026-05-27T14:31:02 1779892262

They use a "low" flavor of it to scale it to billions of users.

maxdo · 2026-05-27T02:40:47 1779849647

It's not that huge of a deal if you compare commercial costs in China and the cheapest US states, and electricity is only one of the factors.

The real reason: Anthropic + OpenAI just cut the reasoning output to prevent distillation, and hence you see the rise of Chinese models to establish contracts globally.

stingraycharles · 2026-05-27T02:57:44 1779850664

“and hence you see the rise of chinese models to establish contracts globally”

How will that help them work around the distillation issue?

gessha · 2026-05-27T03:09:50 1779851390

Collecting user data directly by competing on price. The next step would be figuring out how that data can bring them closer to SOTA.

stingraycharles · 2026-05-27T03:38:29 1779853109

Yes, OK, but that doesn’t give them the thinking tokens (how to reason about the prompt), which is precisely what’s most important.

maxdo · 2026-05-27T02:12:56 1779847976

A flagship with no pirates, all fired due to AI.

maxdo · 2026-05-22T21:58:58 1779487138

Flash on its own is not a very competitive model; its pricing is within the range of everything else on the market.

Probably the most direct competitors of the Flash model:

GPT 5.4 mini: Cache Read $0.075 /M tokens

Gemini 3 flash: Cache Read $0.05 /M tokens

i.e., nothing very magical or groundbreaking.

freehorse · 2026-05-22T22:37:39 1779489459

Cache read for dp4-flash is $0.0028 /M tokens, which is more than 10 times cheaper (and also much cheaper for cache miss and output tokens).

Have not actually compared it to other models, but I would not consider it in the same price range.
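
The "more than 10 times cheaper" claim can be checked with a few lines of Python. This is only a rough sketch: it uses the cache-read prices exactly as quoted in this thread (not verified against any provider's price list), and the dictionary name is made up for illustration.

```python
# Cache-read prices in $ per million tokens, as quoted in this thread.
# These are the commenters' figures, not verified against official pricing.
cache_read_price = {
    "GPT 5.4 mini": 0.075,
    "Gemini 3 flash": 0.05,
    "dp4-flash": 0.0028,
}

# Express every price as a multiple of the cheapest quoted one.
baseline = cache_read_price["dp4-flash"]
ratios = {name: price / baseline for name, price in cache_read_price.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x the dp4-flash cache-read price")
```

On these numbers both of the other models come out well above 10x, which is consistent with the comment, though it only compares cache reads and says nothing about cache-miss or output pricing.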

maxdo · 2026-05-23T02:47:13 1779504433

This price is only available if you are OK with sending your data to Beijing Volcano Engine Technology Co.; for the rest of the OpenRouter vendors it is not the same.

csunoser · 2026-05-24T19:40:05 1779651605

Not sure why you are downvoted, this is essentially correct (assuming Volcano Engine tech refers to Deepseek as provider).

maxdo · 2026-05-22T21:51:16 1779486676

Sonnet : Cache Read $0.30

Gemini 3.5 flash : Cache Read $0.15

minimaxir · 2026-05-22T21:59:43 1779487183

For Sonnet, that's 10% of input cost (and requires paying for the cache)

For Gemini 3.5 Flash, it's also 10% of input cost.

Which is why 2%/0.8% change the economics in a meaningful way, given the input/cache-heavy way agents operate.

throwdbaaway · 2026-05-23T01:28:45 1779499725

And their disk-based caching is amazing. I had a long 700k-context session spanning more than a week, with pauses in between that were longer than a day, and some rewinds mixed in as well.

Stats from pi:

↑400k ↓438k R432M 71.9%/1.0M

Half a billion tokens, $2.12

kingstnap · 2026-05-23T01:14:19 1779498859

Anthropic's caching requires you to pay a surcharge of $0.75/MTok for Sonnet and $1.25/MTok for Opus on top of the original input token cost. It's not even automatic.

If you are reading ~8 times (8 total back-and-forth tool calls), that means cache reads in some sense cost ~$0.4/MTok (amortizing the write surcharge over all reads).

It's really quite ridiculously expensive considering that what you are paying for is some residency in VRAM that sometimes gets offloaded to NVMe.
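
The amortization above can be sketched roughly as follows. It assumes the $0.30/MTok Sonnet cache-read price quoted elsewhere in this thread and that the write surcharge is paid once and spread evenly over the reads; the function name is made up for illustration, and this may not be exactly how the figure was derived.

```python
def amortized_cache_read_cost(read_price: float, write_surcharge: float, reads: int) -> float:
    """Effective $/MTok per cache read when one write surcharge is spread over `reads` reads."""
    if reads <= 0:
        raise ValueError("reads must be positive")
    return read_price + write_surcharge / reads


# Sonnet figures as quoted in this thread (assumed, not verified):
# $0.30/MTok cache read, $0.75/MTok write surcharge, 8 reads per cached prefix.
sonnet = amortized_cache_read_cost(read_price=0.30, write_surcharge=0.75, reads=8)
print(f"effective cache read cost: ${sonnet:.2f}/MTok")
```

With 8 reads this lands a little under $0.40/MTok, in line with the ~$0.4 estimate; with fewer reads per cached prefix the effective price climbs quickly.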

maxdo · 2026-05-22T21:53:51 1779486831

GPT 5.4 Cache Read ≤272K $0.25

And it's multimodal, and available at whatever rate limits you might imagine.

maxdo · 2026-05-22T21:28:20 1779485300

Maybe I need to give it a second chance. Surprisingly, Kimi 2.6 consistently fails to even generate a valid JSON plan, whereas Gemma 4 was doing really well, but slowly.

JSR_FDED · 2026-05-23T12:49:49 1779540589

Are you going through OpenRouter or direct? I’ve had nothing short of excellent results from Kimi.

maxdo · 2026-05-22T21:25:36 1779485136

I've been trying that. In reality, every time you try to save, it's not worth it. The cost of a mistake is so high: you can spend 2-3h on just one wrong assumption, and you lose your time and all the burned tokens.