It sounds like the people you work with are “phonies”. You may have already heard this advice, but try to make friends outside of work. Maybe with other parents?
It can seem untrue, but there are still lots of communities online and offline.
Also, regarding “people in the US are friendly, people elsewhere are unfriendly” (which IMO is incorrect, but users are being too harsh on you): most people in the southern US are known for appearing friendly and extroverted, while most in Eastern Europe appear “cold” and introverted. It’s a culture thing. But there are people who act friendly while spreading rumors behind your back (as you’ve experienced) and never committing to anything; likewise, some cold people are very nice once you get to know them and would immediately help anyone in need, they just don’t like small talk with strangers. “You can’t judge a book by its cover”: there are friendly and unfriendly people everywhere, so look for those who demonstrate commitment, i.e. who act friendly and help others in ways that require effort and don’t just polish their image.
They're AI tells. No human would write that "corporate friction" "introduces friction at the infrastructure layer", except maybe someone trying to use big words to sound smart, because it barely makes sense (what "friction"? "Infrastructure layer"? What are the other layers?)
AI didn't invent the terms, they were a part of the training data given to it.
The real tell is that you haven't been around the people who use these terms often enough to think they're normal.
It's like the em-dash alarmism: AI never invented the em dash, nor did it invent using it frequently. Its training data was full of examples, so many that AI picked up using it frequently.
> like the em-dash alarmism: AI never invented the em dash, nor did it invent using it frequently. Its training data was full of examples, so many that AI picked up using it frequently
Look at my comment history. I emdash. But I adapted by removing the spaces around them—AI hasn’t similarly adapted.
Most comments on HN with emdashes aren’t slop. But if it starts getting into Wernicke word-salad territory and there are emdashes? With spaces? At that point, it’s fair to flag.
I'm laughing not at you but at the ludicrousness of the times - I use endash heavily, have done for a minute, but now I see endash used by LLMs with no surrounding spaces.
I think that the "identify AI by some artefact" is just another game of whack-a-mole, and the better approach is to look at the quality of what's being presented.
I have argued before, and still feel strongly, that LLM/AI generated images/audio/text is causing a stronger inspection of what's being presented as fact, which is a healthy thing (how far that will go is yet to be determined, much as when the availability of Photoshop-generated content exploded)
> now I see endash used by LLMs with no surrounding spaces
Goddamit. (Flippity floppity floop.)
> LLM/AI generated images/audio/text is causing a stronger inspection of what's being presented as fact, which is a healthy thing
If it is, I agree. What I think is actually happening is folks are skimming and then concluding on vibes. Unfortunately, that means “I don’t agree” gets lumped in with “this is slop.”
dang should just give users with 10,000+ karma and 10+ years on HN the power to nuke from orbit "green" accounts that feel like they're posting AI slop. No due process, no recourse: just give users with lots of reputation and old accounts the ability to eliminate new accounts. I don't say this because I crave "power": I only have 9,400 karma. I say this because I'm tired of people creating new accounts to destroy our community.
Having one red flag is something I wouldn’t nuke on, but new account + em dash + other AI-style talking points is just too much.
I feel like we’re eventually going to end up with shibboleths, or something like the thieves’ cant, that update every time a new model launches, just to distinguish the humans.
This was discussed before. People will age accounts and buy/hack inactive ones. Meanwhile, often a link gets posted, the project owner (or someone affiliated) finds out, and they make a new account to comment; it would be a shame to lose these people.
I think em-dashes were once a reliable indicator (though never proof), but recent models have been fine-tuned to use them much less; lots of recent AI-generated writing I've seen has no em-dashes at all. Meanwhile, I've heard many people say that they naturally use em-dashes and were already (or now are) afraid of being accused of using AI; so, ironically, this rumor may be causing people to use their own voice less.
Before, I naturally used hyphens as if they were em-dashes. The kerfuffle over LLM use of em-dashes motivated me to figure out how to type them properly (and configure my system to make that easier). Now I even go over old writing to fix the hyphens.
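(If anyone else wants to retrofit old plain-text notes the same way, here's a minimal sketch; the notes/ directory and the "word - word" pattern are my assumptions, not anything the commenter described:)

    # hypothetical cleanup: turn spaced hyphens used as dashes into em dashes
    import pathlib, re

    for path in pathlib.Path("notes").glob("*.txt"):  # assumed location
        text = path.read_text(encoding="utf-8")
        # "word - word" was standing in for an em dash; join it tightly
        fixed = re.sub(r"(?<=\w) - (?=\w)", "\u2014", text)
        path.write_text(fixed, encoding="utf-8")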
These are guidelines. I'm sure asking an AI about your comment (not pasting its text, so it's still your words) isn't an issue. The main target is obvious slop like https://news.ycombinator.com/threads?id=patchnull
Another example of a top comment that was definitely written by an LLM.
And to be clear, style isn't the only problem. This comment can be summarized as "WebAssembly can now interact with the DOM directly instead of through JavaScript, making it the better choice for more types of problems". One sentence instead of a paragraph of cliches ("...change how people think about this...chicken-and-egg loop..."), uncanny phrases ("...the hot-path optimization niche"), and inaccurate claims ("...the only viable use cases were compute-heavy workloads like codecs and crypto").
(For anyone who doesn't believe me, check the user's comment history)
Pinning down the incentive of the human who deployed it (at one remove or another) would require knowing more. But the more likely cases are easy to guess at, e.g., someone is playing with OpenClaw. I'd guess "someone is playing with OpenClaw and intends to write something about it to boost their brand; could be a Show HN, could be a LinkedIn screed they hope goes viral."
There's money to be made if you can build an audience. There are many ifs on the way, of course, but some people do earn handsomely from publishing. They're called content creators or influencers.
There’s an accurate way to confirm fraud: look for inconsistencies and replicate experiments.
If the fraudsters “fail to replicate” legitimate experiments, ask them for details/proof, and replicate the experiment yourself while providing more details/proof. Either they’re running a different experiment, their details have inconsistencies, or they have unreasonable omissions.
Of course this is slightly messy too. Fraudsters are probably always incorrect (though they could have stolen real data), but being incorrect doesn't mean you're intentionally committing fraud.
That would be great if journals bothered publishing replication studies. But since they don't, researchers can't get adequate funding to perform them, and since they can't perform them, they don't exist.
We can't look for failed replication experiments if none exist.
>95% of the time, the fraudsters get off scot-free. Look at Dan Ariely: caught red-handed faking data in Excel using the stupidest approach imaginable, and outed as a sex pest in the Epstein files. Duke is still giving him their full backing.
It’s easy to find fraud, but what’s the point if our institutions have rotted all the way through and don’t care, even when there’s a smoking gun?
What do you think it is about machine learning that makes it hard to replicate? I'm an outsider to academic research, but it seems like computer based science would be uniquely easy - publish the code, publish the data, and let other people run it. Unless it's a matter of scale, or access to specific hardware.
Lack of will.
That was one of the main results of Whitaker's 2020 survey.
Making your code reusable and easy to understand is significant work that has no direct benefit for a researcher's career, particularly because research code grows wildly as researchers keep trying things. Working on the next paper is seen as the better choice.
Moreover, if your code is easy for others to run, then you're likely to be hit with people wanting support, or even open yourself to the risk of someone finding errors in your code (the survey's results, not my own beliefs).
There are other issues, of course. Just running the code doesn't mean something is replicable. Science is replicated when studies are repeated independently by many teams.
There are many other failure modes: SOTA-hacking, benchmark gaming, and a lack of rigorous analysis of results, for example. And that's ignoring data leakage and other sillier mistakes (which still happen in published work! Even in work published in very good venues).
Authors don't do much of anything to disabuse readers of the idea that they simply got really lucky with their pseudorandom number generators during initialization, shuffling, etc. As long as it beats SOTA, who cares if it is actually a meaningful improvement? Of course, doing multiple runs with a decent bootstrap to get some estimate of the average behavior is often really expensive and really slow, and deadlines are always so tight.

There is also the matter that the field converged on an experimentation methodology that isn't actually correct. Once you start reusing test sets, your experiments stop being approximations of a random sampling process, and you quickly find yourself outside the guarantees provided by statistical theory (a similar sort of mistake to the one scientists in other fields make when interpreting p-values). There be dragons out there; statistical demons might come to eat your heart, or your network could converge to an implementation of nethack.
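(To make the "multiple runs with a decent bootstrap" point concrete, here's a minimal sketch; the scores are made-up numbers standing in for test metrics from ten independently seeded training runs:)

    import numpy as np

    rng = np.random.default_rng(0)
    # made-up test accuracies from 10 training runs with different seeds
    scores = np.array([0.812, 0.807, 0.825, 0.798, 0.819,
                       0.803, 0.816, 0.809, 0.821, 0.800])

    # bootstrap the mean: resample the runs with replacement many times
    boot_means = np.array([rng.choice(scores, size=scores.size).mean()
                           for _ in range(10_000)])
    lo, hi = np.percentile(boot_means, [2.5, 97.5])
    print(f"mean={scores.mean():.3f}  95% CI=({lo:.3f}, {hi:.3f})")

A wide interval here is exactly the "maybe they just got lucky" problem: a one-point SOTA gain that sits inside the noise band of your own reruns isn't a result.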
Scale also plays into that, of course, and use of private data as the other comment mentioned.
Ultimately, machine learning research is just too competitive and moves too fast. There are tens of thousands (hundreds of thousands, maybe?) of people all working on closely related problems, all rushing to publish their results before someone else publishes something that overlaps too much with their own work. Nobody is going to be as careful as they should be, because they can't afford to. It's more profitable to carefully find the minimal publishable unit of work and do that, splitting a result into several small papers you can pump out every few months. The first thing that tends to get sacrificed in that process is reliability.
A lot of things are easy if you ignore the incentive structure. E.g. a lot of papers would no longer be published if the data had to be published; you’d lose all published research from ML labs. Many people like you would say “that’s perfectly okay; we don’t need them”, but others prefer to be able to see papers like Language Models Are Few-Shot Learners https://arxiv.org/abs/2005.14165
So the answer is that we still want to see a lot of the papers we currently see because knowing the technique helps a lot. So it’s fine to lose replicability here for us. I’d rather have that paper than replicability through dataset openness.
But the lab must publish at least the general category of data, and if that doesn't replicate, then the model only works on a more specific category than they claim (e.g. only their dataset).
Even with the exact same dataset and architecture, ML results aren't perfectly replicable due to random weight initialisation, training data order, and non-deterministic GPU operations. I've trained identical networks on identical data and gotten different final weights and performance metrics.
This doesn't mean the model only works on that specific dataset - it means ML training is inherently stochastic. The question isn't 'can you get identical results?' but 'can you get comparable performance on similar data distributions?'
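(For what it's worth, much of that run-to-run variance can be pinned down when you need it to be; a minimal PyTorch sketch, with the usual caveats: some CUDA ops have no deterministic implementation, and data-loader workers need their own seeds:)

    import os, random
    import numpy as np
    import torch

    def seed_everything(seed: int = 42):
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)  # also seeds the CUDA RNGs
        # ask for deterministic kernels; warn instead of erroring
        # when an op has no deterministic implementation
        torch.use_deterministic_algorithms(True, warn_only=True)
        # needed for deterministic cuBLAS; set before the first CUDA call
        os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"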
Then researchers should re-train their models a couple times, and if they can't get consistent results, figure out why. This doesn't even mean they must throw out the work: a paper "here's why our replications failed" followed by "here's how to eliminate the failure" or "here's why our study is wrong" is useful for future experiments and deserves publication.
As per my previous comment - we are discussing stochastic systems.
By definition, they involve variance that cannot be explained or eliminated through simple repetition. Demanding a 'deterministic' explanation for stochastic noise is a category error; it's like asking a meteorologist to explain why a specific raindrop fell an inch to the left during a storm replication.
Did you RTA? The author is predicting that those employees (at least in software dev) will get laid off; so they should get out and find some way to create real value (or make some other change) for their own sake, because they’re about to lose even “paycheck to paycheck”. You should debate this instead, because if true, it makes your point irrelevant.