If the original framing was too generous, the response is at least as ungenerous. Table saws aren't deterministic tools either, and anyone who has used one for more than a minute can tell you that getting it to consistently cut the straight line you want takes skill.
As with all uses of current AI (meaning generative AI LLMs), context is everything. I say this as a person who is both a lawyer and a software engineer. It is not surprising that general-purpose models wouldn't be great at writing a legal brief -- the training data likely doesn't contain much of the relevant case law, because while it is theoretically publicly available, practicing attorneys universally use proprietary databases like Lexis and WestLaw to surface it. The alternative is spelunking through public court websites that look like they were designed in the '90s, or paying for case records on PACER.
At the same time, even if you have access to proper context -- say, if your model can query Lexis or WestLaw via tool use -- surfacing appropriate matches from case law requires more than word/token matching. LLMs are statistical models that tend to converge on the most likely answer. But in the context of a legal brief, a lawyer typically isn't trying to find the most likely answer, or even the objectively correct answer; they are trying to find relevant precedent with which they can make an argument that supports the position they are advancing. An LLM, by its nature, can't do that without help.
Where you're right, then, is that law and software engineering have a lot in common when it comes to how effective baseline LLMs are. Where you're wrong is in calling them glorified auto-complete.
In the hands of a novice they will, yes, generate answers that are plausible but mostly incorrect, or technically correct but unusable in some way. Properly configured, with access to appropriate context, in the hands of an expert who understands how to communicate what they want the tool to produce? Oh, that's quite a different matter.
> As with all uses of current AI (meaning generative AI LLMs), context is everything.
But that's the whole point. You can't fit an entire legal database into the context; it's not big enough. The fact that you have to rely on "context is everything" as a cope is precisely why I'm calling them a glorified autocomplete.
I recommend the context7 MCP tool for this exact purpose. I've been trying to really push agents lately at work to see where they fall down and whether better context can fix it.
As a test recently, I instructed an agent using Claude to create a new MCP server in Elixir based on some code I provided that was written in Python. I know that, relatively speaking, Python is over-represented in training data and Elixir is under-represented. So, when I asked the agent to begin by creating its plan, I told it to reference current Elixir/Phoenix/etc. documentation using context7 and to search the web using the Kagi Search MCP for best practices on implementing MCP servers in Elixir.
It was very interesting to watch how the initially generated plan evolved after using these tools, and how the model then identified an SDK I wasn't even aware of that fit the purpose perfectly (Hermes-mcp).
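If anyone wants to reproduce the setup, it's just two entries in your MCP client config. Something like this -- the package names and the Kagi env var are from memory, so check each server's README before trusting them:

    {
      "mcpServers": {
        "context7": {
          "command": "npx",
          "args": ["-y", "@upstash/context7-mcp"]
        },
        "kagi": {
          "command": "uvx",
          "args": ["kagimcp"],
          "env": { "KAGI_API_KEY": "YOUR_KAGI_API_KEY" }
        }
      }
    }

Once those are registered, all I did beyond that was tell the agent in the prompt to consult context7 for current docs and Kagi for web results while drafting its plan.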
As someone who is a (current) software engineer and a (former) lawyer, I find this interesting. Not sure I'm willing to bet on big uptake, though, unless it came through an acquisition by one of the big e-discovery companies.
That's assuming the business still employs those Sr Devs so they can do the wrist smacking.
To be clear, I think any business that dumps experienced devs in favor of cheaper vibe-coding mids and juniors would be making a foolish mistake, but something being foolish has rarely stopped business types from trying.
There's a difference between having the humility to admit that you might not be able to hit another home run and claiming that "all the good ideas are taken." At best, the latter is an admission of a lack of desire to even try anymore; at worst, it shows a stunning lack of curiosity and creativity.
Yeah, but there's a significant difference between the leader of a company shifting its goals, product, or market in a top-down way and a leader attempting to shift the implementation details of those goals in a top-down way. The former is the entire job of a CEO; the latter is micromanaging of the first order.
Yes, but the message from Shopify leadership is "it's part of your job to mess around with this stuff and see what works". Not "use AI at all costs".
The general feeling I'm getting is that using this AI stuff is important, but it's a learned skill, and we want as many people as possible to get familiar enough with it to have actual opinions.
That's one reading, and if it happens to be the correct one, then I agree it's unobjectionable. To me, though, making it part of a performance review process moves it closer to the "use AI at all costs" requirement than to a request for devs to mess around with new technologies.
There was a 1-5 Likert-scale self-rating on "leveraging AI" and a free-text box. I rambled about using Claude Code to help summarize my daily notes, Cursor for implementation, the ChatGPT UI for broad questions (what happened to internal project X, how do I configure Airflow again, etc.), then experiments with the find-the-right-table-for-you SQL generator. That seemed like about the level folks were going for.
There are some people who are really into it. The SQL generator's great for PMs; ops are experimenting with moderation triage. I personally have mixed feelings, but I'll futz with it on company time (and API $) to see if I can get it to do something useful. It'll mess up tensor alignment, but I can fix that.
So, yes, it was in the performance review. No, it wasn't a big deal. Yes, it seems to me like a reasonable nudge to get over the activation energy of learning to use the thing.
It's internal. But it's basically hooking SQL code generation up to the docs and schemas of most of the tables in the data warehouse. I don't know the details, but there's probably also some extra stuff in the prompt to give a bit of context. It started as a hack over a few days that then gained momentum.
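The basic shape is easy to guess at, though. Here's a hypothetical minimal sketch, not their actual code -- the schema, model name, and prompt are all made up, and the real thing presumably pulls table docs out of the warehouse automatically rather than hardcoding them:

    # Hypothetical sketch: stuff table docs/schema into the prompt,
    # ask the model for a single SQL query back.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # Invented schema docs for illustration; a real tool would pull these
    # from the warehouse's information schema and documentation.
    SCHEMA_DOCS = """
    orders(order_id bigint, customer_id bigint, created_at timestamp, total_usd numeric)
      -- one row per order; created_at is UTC
    customers(customer_id bigint, country text, signup_at timestamp)
    """

    def generate_sql(question: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o",  # arbitrary choice; any capable model works
            messages=[
                {"role": "system", "content": (
                    "You write SQL for our data warehouse. Use only these "
                    "tables and columns:\n" + SCHEMA_DOCS +
                    "\nReturn a single SQL query and nothing else."
                )},
                {"role": "user", "content": question},
            ],
        )
        return resp.choices[0].message.content

    print(generate_sql("Monthly revenue by country for the last year?"))

Which also explains why it works so well for PMs: the schema docs in the prompt do most of the heavy lifting, and the model just has to translate the question.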
Because so many of them are spoiled brats who can't deal with even the thought of putting others before themselves? Just guessing. Effective Altruism is a hell of a drug.