latenightcoding's comments

When I used to crawl the web, battle-tested Perl regexes were more reliable than anything else; commented-out URLs would have been added to my queue.

DOM navigation for fetching some data is for tryhards. Using a regex to grab the correct paragraph or div or whatever is fine, and it's more robust against things moving around on the page.
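To make that concrete, here's a rough sketch of the approach in Kotlin (the URL and the "price" class are made up for illustration): anchor the regex on one stable attribute and lazily match the contents, instead of walking the whole DOM.

    import java.net.URL

    fun main() {
        // Fetch the raw HTML; no DOM parser involved.
        val html = URL("https://example.com/product/123").readText()

        // Lazy match inside a div identified by a stable class name; this keeps
        // working through most layout reshuffles as long as the class survives.
        val price = Regex("""<div class="price"[^>]*>(.*?)</div>""", RegexOption.DOT_MATCHES_ALL)
            .find(html)
            ?.groupValues?.get(1)
            ?.trim()

        println(price ?: "price div not found")
    }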

Doing both is fine! It's just that once you've figured out your regex and such, hardening/generalizing demands DOM iteration. It sucks but it is what it is.

But not when crawling. You don't know the page format in advance - you don't even know what the page contains!

Very cool concept, but it doesn't work too well.


With the rise of LLM coding, do these specialized/niche languages lose their edge (e.g., prototyping speed, job security)?


In this case it is a little beyond specialized and niche. It is right to left too.

https://news.ycombinator.com/item?id=44498766


No LLM can write KDB yet


As most KDB code is proprietary, one wonders if there's even enough available code to train/steal from.


KDB code is as proprietary as the Java, Python, and C++ code found at other financial institutions. What's proprietary is the Q language, not the code you write with it.


I think what they mean is that there is less publicly available KDB code for LLMs to be trained on.


No "public" LLM can write KDB yet.


LLMs make me 10-20x more productive in frontend work, which I barely do. But when it comes to low-level stuff (C/C++) I personally don't find them too useful; they just replace my need to search Stack Overflow.

Edit: I should have mentioned that the low-level stuff I work on is mature code and, a lot of the time, novel.


This is good if front end is something you just need to get through. It's terrible if your work is moving to involve a lot of frontend - you'll never pick up the skills yourself


As the fullstacker with a roughly 65/35 split BE/FE on the team who has to review this kinda stuff on the daily, there's nothing I dread more than a backender writing FE tickets and vice versa.

Just last week I had to review some monstrosity of a FE ticket written by one of our backenders, with the comment of "it's 90% there, should be good to takeover". I had to throw out pretty much everything and rewrite it from scratch. My solution was like 150 lines modified, whereas the monstrous output of the AI was non-functional, ugly, a performance nightmare and around 800 lines, with extremely unhelpful and generic commit messages to the tune of "Made things great!!1!1!!".

I can't even really blame them, the C-level craze and zeal for the AI shit is such that if you're not doing crap like this you get scrutinized and PIP'd.

At least frontenders usually have some humility and will tell you they have no clue if it's a good solution or not, while BEnders are always for some reason extremely dismissive of FE work (as can be seen in this very thread). It's truly baffling to me


Interesting, I find the exact opposite. Although to a much lesser extent (maybe 50% boost).

I ended up shoehorned into backend dev in Ruby/Py/Java and don't find it improves my day-to-day a lot.

Specifically in C, it can bang out complicated but mostly common data structures without fault, where I would surely make off-by-one errors. I guess since I do C as a hobby I tend to solve more interesting and complicated problems, like generating a whole array of dynamic C dispatchers from a UI-library spec in JSON that allows parsing and rendering a UI specified in YAML. Gemini Pro even spat out a YAML-dialect parser after a few attempts/fixes.

Maybe it's a function of familiarity and the problems you end up using the AI for.


As in, it seems to be best at problems that you’re unfamiliar with in domains where you have trouble judging the quality?


>it seems to be best at problems that you’re unfamiliar with

Yes.

>in domains where you have trouble judging the quality

Sure, possibly. Kind of like how you think the news is accurate until you read a story that's in your field.

But not necessarily. Might just be more "I don't know how to do <basic task> in <domain that I don't spend a lot of time in>", and LLMs are good at doing basic tasks.


This is exactly my experience as well. I've had agents write a bit of backend code, always small parts. I'm lucky enough to be experienced enough with code I didn't write to be able to quickly debug it when it fails (and it always fails from the first run). Like using AI to write a report, it's good for outlines, but the details are always seemingly random as far as quality.

For frontend though? The stuff I really don't specialize in (despite some of my first html beginning on FrontPage 1997 back in 1997), it's a lifesaver. Just gotta be careful with prompts since so many front end frameworks are basically backend code at this point.


I've been hacking on some somewhat systemsy Rust code, and I've used LLMs from a while back (early Copilot, about a year ago) on a bunch of C++ systems code.

In both of these cases, I found that just the smart auto-complete is a massive time-saver. In fact, it's more valuable to me than the interactive or agentic features.

Here's a snippet of some code that's in one of my recent buffers:

    // The instruction should be skipped if all of its named
    // outputs have been coalesced away.
    if ! self.should_keep_instr(instr) {
      return;
    }

    // Non-dropped should have a choice.
    let instr_choice =
      choices.maybe_instr_choice(instr_ref)
        .expect("No choice for instruction");
    self.pick_map.set_instr_choice(
      instr_ref,
      instr_choice.clone(),
    );

    // Incref all named def inputs to the PIR choice.
    instr_choice.visit_input_defs(|input_def| {
      self.def_incref(input_def);
    });

    // Decref all named def inputs to the SIR instr.
    instr.visit_inputs(
      |input_def| self.def_decref(input_def, sir_graph)
    );
The actual code _I_ wrote was the comments. The savings from not having to type out the syntax are pretty big; about 80% of the time in manual coding would have been spent on that - little typos, little adjustments to get the formatting right.

The other nice benefit is that I don't have to trust the LLM. I can evaluate each snippet right there and typically the machine does a good job of picking out syntactic style and semantics from the rest of the codebase and file and applying it to the completion.

The snippet, if it's not obvious, is from a bit of compiler backend code I'm working on. I would never have even _attempted_ to write a compiler backend in my spare time without this assistance.

For experienced devs, autocomplete is good enough for massive efficiency gains in dev speed.

I still haven't warmed to the agentic interfaces because I inherently don't trust the LLMs to produce correct code reliably, so I always end up reviewing it, and reviewing greenfield code is often more work than just writing it (esp now that autocomplete is so much more useful at making that writing faster).


What exact tool are you using for your smart auto-complete?


Whatever copilot defaults to doing on vscode these days. I didn't configure it very much - just did the common path setup to get it working.


It works with low-level C/C++ just fine as long as you rigorously include all relevant definitions in the context window, provide non-obvious context (like the lifecycle of various objects), and keep your prompts focused.

Things like "apply this known algorithm to that project-specific data structure" work really well and save plenty of time. Things that require a gut feeling for how things are organized in memory don't work unless you are willing to babysit the model.


This feels like a parallel to the Gell-Mann amnesia effect.

Recently, my company has been investigating AI tools for coding. I know this sounds very late to the game, but we're a DoD consultancy and one not traditionally associated with software development. So most of the people in the company are very impressed with the AI's output.

I, on the other hand, am a fairly recent addition to the company. I was specifically hired to be a "wildcard" in their usual operations. Which is to say, maybe 10 of us in a company of 3000 know what we're doing regarding software (but that's being generous because I don't really have visibility into half of the company). So, that means 99.7% of the company doesn't have the experience necessary to tell what good software development looks like.

The stuff the people using the AI are putting out is... better than what the MilOps analysts pressed into writing Python-scripts-with-delusions-of-grandeur were doing before, but by no means what I'd call quality software. I have pretty deep experience in both back end and front end. It's a step above "code written by smart people completely inexperienced in writing software that has to be maintained over a lifetime", but many steps below "software that can successfully be maintained over a lifetime".


Well, that's what you'd expect from an LLM. They're not designed to give you the best solution. They're designed to give you the most likely solution. Which means that the results would be expected to be average, as "above average" solutions are unlikely by definition.

You can tweak the prompt a bit to skew the probability distribution with careful prompting (LLMs that are told to claim to be math PhDs are better at math problems, for instance), but in the end all of those weights in the model are spent to encode the most probable outputs.

So, it will be interesting to see how this plays out. If the average person using AI is able to produce above average code, then we could end up in a virtuous cycle where AI continuously improves with human help. On the other hand, if this just allows more low quality code to be written then the opposite happens and AI becomes more and more useless.


I have no doubt which way it is going to go.


Before the industrial revolution, a cabinetmaker would spend a significant amount of time advancing from apprentice to journeyman to master using only hand tools. Now master cabinetmakers who only use hand tools are exceedingly rare; most furniture is made with power tools and a related but largely different skillset.

When it comes to software the entire reason maintainability is a goal is because writing and improving software is incredibly time consuming and requires a lot of skill. It requires so much skill and time that during my decades in industry I rarely found code I would consider quality. Furthermore the output from AI tools currently may have various drawbacks, but this technology is going to keep improving year over year for the foreseeable future.


Maintainable software is also more maintainable by AI. The required standards may be a bit different - for example, there may be less emphasis on whitespace styling - but complexity in the form of subtle connections between different parts of a system is a burden for both humans and AI. AI isn't magic; it still has to reason, it fails on complexity beyond its ability to reason, and maintainable code is code that is easier to reason about.


Same. It’s amazing for frontend.


As a front-of-the-frontend guy, I think it's terrible with CSS and SVG and just okay with HTML.

I work at a shop where we do all custom frontend work and it's just not up to the task. And, while it has chipped in on some accessibility features for me, I wouldn't trust it to do that unsupervised. Even semantic HTML is a mixed bag: if you point out something is a figure/figcaption it'll probably do it right, but I haven't found that it'll intuit these things and get it right on the first try.

But I'd imagine if you don't care about the frontend looking original or even good, and you stick really closely to something like tailwind, it could output something good enough.

And critically, I think a lot of times the hardest part of frontend work is starting, getting that first iteration out. LLMs are good for that. Actually got me over the hump on a little personal page I made a month or so ago and it was a massive help. Put out something that looked terrible but gave me what I needed to move forward.


It's astonishing. A bit scary actually. Can easily see the role of front-end slowly morphing into a single person team managing a set of AI tools. More of an architecture role.


Ah yes, I'm more of a backend guy who loves Tailwind.


Is this because they had the entire web to train on, code + output and semantics in every page?


I guess it’s because modern front-end “development” is mostly about copying huge amounts of pointless boilerplate and slightly modifying it, which LLMs are really good at.


It's moreso that a backend developer can now throw together a frontend and vice-versa without relying on a team member or needing to set aside time to internalize all the necessary concepts to just make that other part of the system work. I imagine even a full-stack developer will find benefits.


So we are all back to being webmasters :)


This has nothing to do with what they asked.


Copilot is going to feel "amazing" at helping you quickly work within just about any subject that you're not already an expert in.

Whether or not a general purpose foundation model for coding is trained on more backend or frontend code is largely irrelevant in this specific context.


Found the bot.


I'm not sure how this was extended and refined, but there sure are a lot of signs of open source code being used heavily (at least early on). It would make sense to test model fit with the web at large.


It's crazy this app has barely changed in like a decade; they even went public, but the learning experience is just worse.


It is in some projects; OP gave a good example: hedge funds.


pandas has been around for years and never tried to sell me a service.


Their (Polars) FOSS solution isn't at all neutered; imo that's a little bit of an unfair criticism. Yeah, they are trying to make their distributed query engine for-profit, but as a user of the single-node solution, I haven't been pressured at all to use their cloud offering.


To be fair, almost everything in quantum is poorly named. That's how they attract funding.


Except for "quantum supremacy", which is the best name in the entire multiverse.


Some of these companies are straight up inept. Not an AI company, but "babbar.tech" was DDoSing my site; I blocked them and they still revisit thousands of pages every other day, even though it just returns a 404 for them.


> we decided that the only way to leverage the full value of Kotlin was to go all in on conversion

Could someone expand on this, please?


In addition to what @phyrex already pointed out, without any Java in the code base, they probably hope to hire from a different cohort of job seekers.

Younger developers are far less likely to know Java today than Kotlin, since Kotlin has been the lingua franca for Android development for quite some time. Mobile developers skew younger, and younger developers skew cheaper.

With Java out of the code base they can hire "Kotlin developers" wanting to work in a "modern code base".

I'm not suggesting there's something malevolent about this, but I'd be surprised if it wasn't a goal.


I think you're on to something here. When recruiters contact me about Java jobs, I tell them my level of interest in a Java job is about as high as RPG or COBOL, and that I'm wary of spending time on a "legacy" technology. Most of them are fairly understanding of that sentiment, too.

If I had someone call me about Kotlin, I would assume the people hiring are more interested in the future than being stuck in the past.


You're already expected to learn a number of exotic (Hack) or old (C++) languages at Meta, so I'm pretty sure that's not the reason.

To quote from another comment I made:

> I don't have any numbers, but we know that the Meta family of apps has ~3B users, and that most of them are on mobile. Let's assume half of them are on Android, and you're easily looking at ~1B users on Android. If you have a null pointer exception in a core framework that somehow made it through testing, and it takes 5-6 hours to push an emergency update, then Meta stands to lose millions of dollars in ad revenue. Arguably even one of these makes it worth moving to a null-safe language!


An NPE means an incomplete feature was originally pushed to production. It would still be incomplete or incorrect in Kotlin, and would still need a fix pushed to production.

It's even worse with Kotlin: without the NPE to provide the warning that something is wrong, the bug could persist in prod much longer, potentially impacting the lives of 1 billion users far longer than it would have if the code had remained in the sane Java world.


How would a bug persist in production if you get a compile time error that prevents you from running the application? You don't seem like you know what you're talking about.

Even if I am charitable with my interpretation, I'm not sure I get your point. If you refuse to handle the case where something is nullable and you convert it to non-null via .unwrap() (Rust perspective, I haven't used Kotlin), then you will get your NullPointerException in that location, so Kotlin is just as capable of producing NPEs as Java. But here is the thing: the locations where you can get NPEs are limited to the places where you have done .unwrap(), which is much easier to search through than the entire codebase, which is what you'd have to do in Java, where every single line could produce an NPE. So in reality, if you push incomplete code to production, you will have strong markers in the code that indicate that it is unfinished.
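A minimal Kotlin sketch of that point (the `User`/`findUser` names are hypothetical; `!!` is Kotlin's rough equivalent of `.unwrap()`):

    data class User(val email: String?)

    // Pretend lookup that sometimes fails and sometimes returns a user with no email.
    fun findUser(id: Int): User? = if (id == 42) User(email = null) else null

    fun main() {
        val user: User? = findUser(7)

        // println(user.email)   // does not compile: `user` is nullable
        println(user?.email)     // safe call: prints "null" instead of throwing

        // The only lines that can throw an NPE are the explicit `!!` sites,
        // which makes unfinished null handling easy to grep for.
        val forced = findUser(42)!!        // would throw right here if the lookup failed
        println(forced.email!!.length)     // throws NullPointerException: email is null
    }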


"The" reason is not what I'm speculating on, because I don't think a singular reason is likely to exists.

There is likely a mix of reasons -- of which NPE avoidance is almost certainly one. And hiring/talent management is almost always another, when making technology choices. Particularly when choices are coupled with a blog post on the company's tech blog.


From the article:

> The short answer is that any remaining Java code can be an agent of nullability chaos, especially if it’s not null safe and even more so if it’s central to the dependency graph. (For a more detailed explanation, see the section below on null safety.)


One of my biggest gripes with an otherwise strictly typed language like Java is its decision to allow nulls. It is particularly annoying since implementing something like nullable types would have been quite trivial in Java.


Would it have been trivial and obvious for Java (and would Java still have been "not scary") back in the 90s when it came out?


It wouldn't have been particularly hard from a language, standard library, and virtual machine perspective. It would have made converting legacy C++ programmers harder (scarier). Back then the average developer had a higher tolerance for defects because the consequences seemed less severe. It was common to intentionally use null variables to indicate failures or other special meanings. It seemed like a good idea at the time


> It would have made converting legacy C++ programmers harder (scarier).

And that, right there, is all the reason they needed back then. Sun wanted C++ developers (and C developers, to some extent) to switch to Java.


It would have been trivial for record types to be non-nullable by default.

Record types are 3 years old and they are already obsolete with regard to compile-time null checking. This is a big problem in Java. A lot of new features have become legacy code and are now preventing future features from being included out of the box.
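For contrast, a minimal Kotlin sketch (hypothetical `Point`/`Annotated` types) of what non-nullable-by-default looks like for a record-like type - nullability has to be opted into per component:

    // Components are non-nullable unless declared with `?`.
    data class Point(val x: Int, val label: String)

    data class Annotated(val note: String?)    // explicit opt-in to nullability

    fun main() {
        // Point(1, null)          // does not compile: null is not a String
        val p = Point(1, "origin")
        println(p.label.length)    // no null check needed; enforced at compile time

        val a = Annotated(null)
        println(a.note?.length)    // nullable component forces explicit handling
    }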

This is why the incremental approach to language updates doesn't work. You can't change the foundation and the foundation grows with every release.

I am awaiting the day Oracle releases class2 and record2 keywords for Java with sane defaults.


Tony Hoare (the guy who originally introduced the concept of null for pointers in ALGOL W) gave a talk on it being his "billion dollar mistake" in 2009: https://www.infoq.com/presentations/Null-References-The-Bill...

Now, this wasn't something that just dropped out of the blue - the problems were known for some time before. However, it was considered manageable, treated similarly to other cases where some operations are invalid on valid values, such as division by zero triggering a runtime error.

The other reason why there was some resistance to dropping nulls is that they make a bunch of other PL design decisions a lot easier. Consider this simple case: in Java, you can create an array of object references like so:

   Foo[] a = new Foo[n];  // n is a variable so we don't know size in advance
The elements are all initialized to their default values, which for object references is null. If Foo isn't implicitly nullable, what should the elements be in this case? Modern PLs generally provide some kind of factory function or equivalent syntax that lets you write initialization code for each element based on index; e.g. in Kotlin, arrays have a constructor that takes an element initializer lambda:

   a = Array(n) { i -> Foo(...) } 
But this requires lambdas, which were not a common feature in mainstream PLs back in the 90s. Speaking more generally, it makes initialization more complicated to reason about, so when you're trying to keep the language semantics simple, this is a can of worms that makes it that much harder.

Note that this isn't specific to arrays, either. For objects themselves, the same question arises wrt not-yet-initialized fields, e.g. supposing:

   class Foo {
      Foo other;   
      Foo() { ... }
   }
What value does `this.other` have inside the constructor, before it gets a chance to assign anything there? In this simple case the compiler can look at control flow and forbid accessing `other` before it's assigned, but what if instead the constructor does a method call on `this` that is dynamically dispatched to some unknown method in a derived class that might or might not access `other`? (Coincidentally, this is exactly why in C++, classes during initialization "change" their type as their constructors run, so that virtual calls always dispatch to the implementation that will only see the initialized base class subobject, even in cases like using dynamic_cast to try to get a derived class pointer.)
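For what it's worth, here's a small Kotlin sketch (hypothetical `Base`/`Derived` names) of that exact trap; Kotlin compiles it with a warning about calling a non-final function in a constructor, and the "impossible" null is still observable during construction:

    open class Base {
        init {
            // Dispatches to Derived.describe() before Derived's fields are initialized.
            println("during construction: ${describe()}")
        }

        open fun describe(): String = "base"
    }

    class Derived : Base() {
        private val label: String = "derived"

        // `label` hasn't been assigned yet when Base's init block calls this,
        // so the backing field is still null despite the non-nullable type.
        override fun describe(): String = "label=$label"
    }

    fun main() {
        val d = Derived()        // prints "during construction: label=null"
        println(d.describe())    // prints "label=derived" once construction is done
    }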

Again, you can ultimately resolve this with a bunch of restrictions and checks and additional syntax to work around some of that, but, again, it complicates the language significantly, and back then this amount of complexity was deemed rather extreme for a mainstream PL, and so hard to justify for nulls.

So we had to learn that lesson from experience first. And, arguably, we still haven't fully done that, when you consider that e.g. Go today makes this same exact tradeoff that Java did, and largely for the same reasons.

