Many parts of TensorFlow required Python; at least when I worked there a few years ago, it was nearly impossible to compile XLA into a SavedModel and execute it from pure C++ code.
Not sure about that specific combination, but since everything in JAX is functionally pure, it's generally really easy to compose libraries. E.g. I've written code that embedded a Flax model inside a Haiku model without much effort.
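For a rough sense of why this works, here's a minimal sketch (not the commenter's actual code; the module names are made up). Both libraries reduce to pure init/apply functions over parameter pytrees, so a Flax module's apply is just another pure function you can call inside a Haiku-transformed forward pass, threading its params in as an extra argument:

    import jax
    import jax.numpy as jnp
    import haiku as hk
    import flax.linen as nn

    # Inner Flax module (hypothetical example).
    class FlaxBlock(nn.Module):
        features: int

        @nn.compact
        def __call__(self, x):
            return nn.relu(nn.Dense(self.features)(x))

    flax_block = FlaxBlock(features=32)

    # Outer Haiku model: the Flax apply is pure, so it composes with
    # Haiku layers as long as its params are passed through explicitly.
    def forward(x, flax_params):
        h = hk.Linear(32)(x)                   # Haiku layer
        h = flax_block.apply(flax_params, h)   # Flax layer
        return hk.Linear(1)(h)                 # Haiku layer again

    model = hk.transform(forward)

    rng = jax.random.PRNGKey(0)
    x = jnp.ones((4, 8))
    flax_params = flax_block.init(rng, jnp.ones((4, 32)))
    hk_params = model.init(rng, x, flax_params)
    out = model.apply(hk_params, rng, x, flax_params)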
Completely agree. He ends up sounding kinda amateurish but that's only because (unlike most other podcasters) he's willing to ask questions deep inside the domain of the interviewee.
(Amateurish w.r.t. the domain, I mean, not as an interviewer.)
At the end of the day all the arrays are one-dimensional, and thinking of them as two-dimensional is just an indexing convenience. A matrix multiply is a bunch of vector dot products in a row. Higher-order tensor contractions can be built out of lower-dimensional ones, so I don't think it's really fair to say the hardware doesn't support it.
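To make that concrete, a quick illustrative sketch (my example, in NumPy): the 2-D view is flat memory plus index arithmetic, a matmul is a grid of dot products, and a batched contraction reduces to a loop of plain matmuls:

    import numpy as np

    # A "2-D" array is a flat buffer plus a row-major indexing convention.
    A = np.arange(6).reshape(2, 3)
    flat = A.ravel()
    assert flat[1 * 3 + 2] == A[1, 2]   # index (i, j) -> i * ncols + j

    # Matrix multiply = a grid of vector dot products.
    B = np.arange(12).reshape(3, 4)
    C = np.empty((2, 4))
    for i in range(2):
        for j in range(4):
            C[i, j] = np.dot(A[i, :], B[:, j])
    assert np.allclose(C, A @ B)

    # A higher-order contraction built from lower ones: batched matmul
    # as a loop of ordinary matmuls.
    X = np.random.rand(5, 2, 3)
    Y = np.random.rand(5, 3, 4)
    Z = np.stack([X[b] @ Y[b] for b in range(5)])
    assert np.allclose(Z, np.einsum('bij,bjk->bik', X, Y))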
Einsum notation makes it natural to formulate your model/layer as multi-dimensional arrays connected by (loosely) named axes, without worrying too much about breaking it down into primitives yourself. Once you get used to it, the terseness is liberating.
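As a hedged illustration (my example, not the commenter's): multi-head attention scores written as one einsum over named axes, with no manual transpose/reshape plumbing:

    import jax
    import jax.numpy as jnp

    # Axes: b = batch, q/k = query/key positions, h = heads, d = head dim.
    key = jax.random.PRNGKey(0)
    q = jax.random.normal(key, (8, 128, 4, 64))   # (b, q, h, d)
    k = jax.random.normal(key, (8, 128, 4, 64))   # (b, k, h, d)

    # "Contract d, keep everything else" in one line; the axis labels
    # carry the intent that reshapes and transposes would obscure.
    scores = jnp.einsum('bqhd,bkhd->bhqk', q, k) / jnp.sqrt(64.0)
    print(scores.shape)  # (8, 4, 128, 128)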
Welllll, there seems at least to be some mathematical cheating, in that this is representing a non-well-founded set. (Declare each pixel to be the set of subpixels at the next level down defining it; this forms an infinite descending chain.)
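Spelled out (my notation, not the commenter's): if pixel p_n is the set of its subpixels, one of which is p_{n+1}, you get a descending membership chain, which is exactly what the axiom of foundation rules out:

    % Each pixel p_n contains its subpixels as elements, with p_{n+1}
    % among them, giving an infinite descending membership chain:
    \[
      p_0 \ni p_1 \ni p_2 \ni \cdots, \qquad p_{n+1} \in p_n \text{ for all } n,
    \]
    % which the axiom of foundation (regularity) forbids: no set may sit
    % atop an infinite descending \(\in\)-chain.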
At a high level it's the right answer to the data-center electricity demand problem, which is that we need to make AI hardware more efficient.
Pragmatically, it doesn't make much sense, given that even in a best-case scenario it would take years for this approach to have any real-world use cases. It seems way more likely that efficiency gains in digital chips will happen first, making these chips less economically valuable.
Edit: I guess I'm not sure whether large training runs count as prod or not. They're certainly expensive and mission-critical.