But this is a blatant misapplication of the technology in an obviously sensitive use case with an implementation that's so exploitable the people driving it have certainly never heard the term "jailbreak" once in their lives.
Reminds me of a consulting call that I had with a very large internet provider about their new agentic chat support system.
"We're going to start with the request routing layer and move that to AI agents, and then work through the individual services."
I thought it was a wild architectural decision that they would choose to roll every single action that the system handled through an experimental layer. My advice was to start with a safe, repeatable process to validate the effectiveness in the wild, and then expand in the same manner, bringing edges in as they had "solved" the individual implementations.
So, while this is almost the exact opposite of that, choosing a high-value target with real repercussions as their leaf implementation still baffles me. Step zero of any AI integration plan should be prioritization. Companies are routinely failing at this very simple, not-even-technical aspect.
Losing two weeks while you try out a candidate's fit costs far less than bringing the wrong person on formally and spending the next year debating whether they were the right choice, with all of the associated "soft" overhead.
This is my preferred proposal to new contacts as well (I set it up as a contract so there's a little less red tape, but even people who pursue me for traditional employment afterward usually land on an extended contract).
Two things it solves. First, you get to evaluate me: my ability to deliver and how I interact with your team. I either bring real value within two weeks, or I don't. I can tell you verbally that I am an indispensable asset, or I can show you; other people have ruined the verbal trust layer, which is why this whole debacle exists in the first place, btw.
And more importantly, but less communicated, I get to evaluate you: how your team works, the level of talent present, management's ability to keep direction, and whether I genuinely enjoy what you have for me to work on.
My experience with this is that contract is usually better paid than perm. When it comes time to convert to perm, I am not willing to take the pay cut, so I either go contract somewhere else or remain with them as a repeat contractor.
HR usually stops my clients from hiring me perm at 2X the rate they pay their full time employees, despite the strong demonstration of competence that I’ve shown over the last few months.
My clients are usually unable to make permanent employment with them more appealing than repeat contracting with them.
Fast is a Netflix product so the fact that you've even heard of it is in direct relation to the weight of the brand that launched it.
speedtest.net has been the first search result on Google for "speed test" for decades. Partly the boost of domain SEO and partly the boost of it being an effective exit node for searches for that term for that long.
(Nobody searches "ookla" and nobody is going to search your tier-3 .com)
This exact thing happened to me when I ran https://www.fatherly.com/ circa 2017. Google just shut down our account without notice. We were spending like $10k/month. It also locked us out of our premium support account, so we couldn't even get anyone there to notice that they'd locked us out.
After about 8 hours, a random Google support tech said it was because we were mining bitcoin, which was laughably untrue. We had CPU usage graphs and logs for the whole time and there was no spike. At around 12 hours, they turned it back on, said it was "misconfiguration of our abuse detection" and gave us like $100 in credit.
Absurd. Say what you will about AWS, they would never do that to a customer without a rep reaching out to you first. I have not trusted GCP since.
Google thinks everything should be replaced with automation.
Remember knowledge cards? Prior to the LLM AI revolution, they had an extraordinarily crappy AI system digest the entire internet to figure out the wrong facts about stuff and then present it to users as solid truth, with no human review and no way to report inaccuracies.
They just don't care. If the task requires a person to look at a thing and tell if it's right, they only do that for like 5 examples and then train a classifier, then deploy said classifier without thinking twice because "at internet scale" or whatever crap.
Google is the epitome of expecting happy path results to always be the end result. I could absolutely see someone writing this knowledge card system, then realizing how much work it would be to correct, with some PM not wanting to admit the project was a failure that needed serious amounts of human effort, and just releasing it as is. Gotta earn those KPIs for that next promotion, and then it's someone else's problem!
The report at this point is pretty much just a timeline of what happened. No explanation of why, no accusations, no blame. A PR piece, to Railway's customers, reassuring them that "we're not ignoring this."
Now the lawyers are huddling. IMO there won't be a lot more said publicly by either side, at least until any threat of lawsuits for damages is settled.
I don't think you're typically told why for these things, and it's mostly automated from what I can tell. The automated systems make mistakes but more importantly they're completely opaque. Nobody, not even Google, knows how they work exactly.
Google knows why there is no human oversight: because that is expensive (both in terms of the labor doing review and the ongoing fraud likely happening while the human review happens).
Really? This isn't the first time their automation took down a big customer (UniSuper in 2024) by accident. In that case the automation actually deleted the resources and GCP had to recover them.
Who is the "You" in "you haven't addressed the root cause"?
If you are asking Railway to spend effort doing this rather than simply moving away from GCP, I'm not sure why they would unless they want to sue GCP to recover damages to brand and long term customer retention.
The moment GCP shut off without any forewarning, it's a done deal; no need to ask any further questions.
To play devil's advocate, the majority of the internet traffic in general is US company based. So it stands to reason any disruption would impact them the most.
They changed the article? Or you missed the 3rd paragraph?
But Tasnim and Fars, both Iranian state-linked media channels, laid out more detailed proposals on how Iran could charge license fees to US tech giants for the use and maintenance of undersea cables carrying regional Internet traffic, according to The Guardian. For example, the Tasnim plan described charging tech companies—specifically naming Meta, Google, Amazon, and Microsoft—license fees for cable usage while also claiming that Iran alone has the right to repair and maintain the subsea cables.
Woah. This is ars. The pinnacle of tech reporting. Not some washed up rag being squeezed of value by its corporate overlords at Conde Nast. How dare you bring up this stuff.
I appreciate the effort you put into mapping semantics so language constructs can be incorporated into this. You’re probably already seeing that the amount of terminology, how those terms interact with each other, and the way you need to model it have ballooned into a fairly complex system.
The fundamental breakthrough with LLMs is that they handle semantic mapping for you and can (albeit non-deterministically) interpret the meaning and relationships between concepts with a pretty high degree of accuracy, in context.
It just makes me wonder if you could dramatically simplify the schema and data modeling by incorporating more of these learnings.
I have a simple experiment along these lines that’s especially relevant given the advent of one-million-token context windows, although I don’t consider it a scientifically backed or production-ready concept, just an exploration: https://github.com/tcdent/wvf
Thanks for the careful read — the "schema is ballooning" observation is real and I've felt it building this. You're pointing at a genuine design tension.
My counter, qualified: deterministic consolidation is cheap and reproducible in a way LLM-in-the-loop consolidation isn't, at least today. Every think() invocation is free (cosine + entity matching + SQL). If I put an LLM in the loop the cost is O(N²) LLM calls per consolidation pass — for a 10k-memory database, that's thousands of dollars of inference per tick. So for v1 I'm trading off "better merge decisions" against "actually runs every 5 minutes without burning a budget."
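To put rough numbers on the O(N²) claim, here is a back-of-the-envelope sketch. It assumes the naive case where every unordered pair of memories needs one LLM call, with no candidate pruning. The per-call price is a made-up placeholder, not a quote from any provider, and the function names are hypothetical.

```python
def pairwise_comparisons(n: int) -> int:
    """Number of unordered pairs among n memories: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def llm_pass_cost(n: int, price_per_call_usd: float) -> float:
    """Cost of one consolidation pass if every pair needs one LLM call."""
    return pairwise_comparisons(n) * price_per_call_usd


# ASSUMPTION: hypothetical price per pairwise LLM comparison, for illustration only.
ASSUMED_PRICE_PER_CALL_USD = 0.0001

pairs = pairwise_comparisons(10_000)
cost = llm_pass_cost(10_000, ASSUMED_PRICE_PER_CALL_USD)
print(f"{pairs:,} pairs, ~${cost:,.2f} per pass")
```

For 10k memories that is 49,995,000 pairs, so even at a fraction of a cent per call a single pass lands in the thousands of dollars. Pruning candidates with a cheap similarity filter first would shrink the pair count, but that is already most of the way to the deterministic approach.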
On 1M-context-windows: I think they push the "vector DB break point" out but don't remove it. Context stuffing still has recall-precision problems at scale (lost-in-the-middle, attention dilution on unrelated facts), and 1M tokens ≠ unbounded memory. At 10M memories no context window saves you.
wvf is interesting — just read through. The "append everything, let the model retrieve" approach is the complement of what I'm doing: you lean fully into LLM semantics, I try to do the lookup deterministically. Probably both are right for different workloads. Yours wins when you have unbounded compute + a small corpus; mine wins when you have bounded compute + a large corpus that needs grooming.
Starring wvf now. Curious if you're seeing meaningful quality differences between your approach and traditional retrieval at scale.
Absolutely agree the deterministic performance-oriented mindset is still essential for large workloads. Are you expecting that this supplements a traditional vector/semantic store or that it supersedes it?
My focus has absolutely been on relatively small corpora, which is supported by forcing a subset of data to be included by design. There are intentionally no conventions for things like "we talked about how AI is transforming computing at 1AM" and instead it attempts to focus on "user believes AI is transforming computing", so hopefully there's less of the context poisoning that happens with current memory.
Haven't deployed WVF at any scale yet; just a casual experiment among many others.
Supplements, definitely — for a specific workload. General document retrieval at scale (millions of chunks, read-heavy, doc-search patterns) is well-served by existing vector stores; YantrikDB doesn't compete on throughput. Where it's meant to supersede is the narrow case of agent memory: small-to-medium corpus, write-heavy with paraphrases every turn, lives for the lifetime of an agent identity, nothing curating the input.
Your "user believes X" framing is exactly the episodic/semantic split that cognitive psych has been describing for decades. YantrikDB exposes it via memory_type ∈ {episodic, semantic, procedural}. Your intuition about context poisoning from over-specific episodic details lines up with how I've been thinking about it: "we talked about AI at 1am" is high-noise, low-signal for future retrieval. The design bet is that consolidation + decay should burn episodic into semantic over time, and episodic-only memories should fade faster.
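As a rough illustration of "episodic fades faster", here is a minimal sketch of type-dependent exponential decay. The half-life values, the Memory class, and the retention function are all assumptions made up for this example; they are not YantrikDB's actual schema or parameters. Only the three memory_type values come from the description above.

```python
from dataclasses import dataclass

# ASSUMPTION: illustrative half-lives in days, not YantrikDB's actual values.
HALF_LIFE_DAYS = {"episodic": 7.0, "semantic": 90.0, "procedural": 180.0}


@dataclass
class Memory:
    text: str
    memory_type: str  # one of "episodic", "semantic", "procedural"
    age_days: float
    importance: float = 1.0


def retention(memory: Memory) -> float:
    """Exponentially decayed weight: halves every half-life for its type."""
    half_life = HALF_LIFE_DAYS[memory.memory_type]
    return memory.importance * 0.5 ** (memory.age_days / half_life)


episodic = Memory("we talked about AI at 1am", "episodic", age_days=14)
semantic = Memory("user believes AI is transforming computing", "semantic", age_days=14)

print(retention(episodic))  # 0.25, two episodic half-lives have elapsed
print(retention(semantic))  # higher, since the semantic half-life is much longer
```

At the same age, the episodic detail has dropped to a quarter of its weight while the semantic belief is barely touched, which is the behavior the design bet is after.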
What does WVF stand for? Curious what you've been experimenting with.
To me, the OP’s reply reeks of LLM, along with many others from them in this thread.
I would hope that their replies are from an actual person, knowing they’re interacting with people in a similar field as themselves, and asking for criticism from real people in the top comment.
And, yes, the current tech is pretty dumb.