For anyone confused, the first part is an approximate transliteration into Cyrillic of the English sentence “Much like how you can/could spell English in Cyrillic.”
The legal premise of training LLMs on everything ever written is that it’s fair use. If it is fair use (which is currently being disputed in court), then the license you put on your code doesn’t matter: it can be used under fair use regardless.
If the courts decide it’s not fair use then OpenAI et al. are going to have some issues.
Quite possibly. If they care a great deal about not contributing to training LLMs then they should still be aware of the fair use issue, because if the courts rule that it is fair use then there’s no putting the genie back in the bottle. Any code that they publish, under any license whatsoever, would then be fair game for training and almost certainly would be used.
A few points because I actually think Lindley’s paradox is really important and underappreciated.
(1) You can get the same effect with a prior distribution concentrated around a point instead of a point prior. The null hypothesis prior being a point prior is not what causes Lindley’s paradox.
(2) Point priors aren’t intrinsically nonsensical. I suspect that you might accept a point prior for an ESP effect, for example (maybe not—I know one prominent statistician who believes ESP is real).
(3) The prior probability assigned to each of the two models also doesn’t really matter: Lindley’s paradox arises from the marginal likelihoods (which depend on the priors for parameters within each model, but not on the prior probability of each model).
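To make (1) and (3) concrete, here is a minimal sketch with invented numbers (50,400 heads in 100,000 flips, nothing from the thread). It compares marginal likelihoods under an exact point prior at 0.5, a Beta prior tightly concentrated around 0.5, and a vague uniform alternative:

```python
import numpy as np
from scipy import stats
from scipy.integrate import trapezoid

n, k = 100_000, 50_400                      # invented data
t = np.linspace(1e-6, 1 - 1e-6, 200_001)    # fine grid over theta
lik = stats.binom.pmf(k, n, t)              # binomial likelihood at each theta

m0_point = stats.binom.pmf(k, n, 0.5)                            # exact point prior at 0.5
m0_conc = trapezoid(lik * stats.beta.pdf(t, 50_000, 50_000), t)  # concentrated prior, sd ~ 0.0016
m1_vague = trapezoid(lik, t)                                     # uniform prior on [0, 1]

print(m0_point / m1_vague)                # Bayes factor ~ 10 in favor of the null
print(m0_conc / m1_vague)                 # concentrated prior: same direction
print(2 * stats.binom.sf(k - 1, n, 0.5))  # yet the two-sided p-value is ~ 0.01
```

Both versions of the null beat the vague alternative even though the frequentist test rejects at p ~ 0.01, and the model-level prior probabilities never entered the computation.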
Are you seriously saying that, because a point distribution may well make sense when the point in question is zero (or one), other points are plausible too? Srsly?
The nonsense isn't just that they're assuming a point prior; it's that, conditional on that point prior not being true, there's only a 2% chance that theta is 0.5 ± 0.01. Whereas the actual a priori probability is more like 99.99%.
> The nonsense isn't just that they're assuming a point prior; it's that, conditional on that point prior not being true, there's only a 2% chance that theta is 0.5 ± 0.01. Whereas the actual a priori probability is more like 99.99%.
The birth sex ratio in humans is about 51.5% male and 48.5% female, well outside of your 99.99% interval. That’s embarrassing.
You are extremely overconfident in the ratio because you have a lot of prior information (but not enough, clearly, to justify your extreme overconfidence). In many problems you don’t have that much prior information. Vague priors are often reasonable.
In a perfect world everybody would be putting careful thought into their desired (acceptable) type I and type II error rates as part of the experimental design process before they ever collected any data.
Given rampant incentive misalignments (the goal in academic research is often to publish something as much as—or more than—to discover truth), having fixed significance levels as standards across whole fields may be superior in practice.
The real problem is that you very often don't have any idea what your data are going to look like before you collect them; type I/II error rates depend a lot on how big the sources of variance in your data are. Even a really simple case -- e.g. do students randomly assigned to AM vs PM sessions of a class score better on exams? -- has a lot of unknown parameters: the variance of exam scores, the variance in baseline student ability, the variance of the rate of change in scores across the semester, whether you can approximate scores as Gaussian or need a beta, ordinal, or some other model, etc.
Usually you have to go collect data first, then analyze it, then (in an ideal world where science is well-incentivized) replicate your own analysis in a second wave of data collection doing everything exactly the same. Psychology has actually gotten to a point where this is mostly how it works; many other fields have not.
This is an interesting post but the author’s usage of Lindley’s paradox seems to be unrelated to the Lindley’s paradox I’m familiar with:
> If we raise the power even further, we get to “Lindley’s paradox”, the fact that p-values in this bin can be less likely than they are under the null.
Lindley’s paradox as I know it (and as described by Wikipedia [1]) is about the potential for arbitrarily large disagreements between frequentist and Bayesian analyses of the same data. In particular, you can have an arbitrarily small p-value (p < epsilon) from the frequentist analysis while at the same time having arbitrarily large posterior probabilities for the null hypothesis model (P(M_0|X) > 1-epsilon) from the Bayesian analysis of the same data, without any particularly funky priors or anything like that.
I don’t see any relationship to the phenomenon given the name of Lindley’s paradox in the blog post.
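For what it's worth, the version I'm describing is easy to reproduce numerically. A toy sketch (my own setup, not the blog's): hold the z-score, and hence the p-value, fixed near 2 while n grows. With a uniform prior on theta under M_1 (whose marginal likelihood works out to exactly 1/(n+1)) and equal prior odds on the two models, the posterior probability of the point null climbs toward 1:

```python
from scipy import stats

for n in [100, 10_000, 1_000_000, 100_000_000]:
    k = n // 2 + int(n ** 0.5)           # keeps z ~ 2, so two-sided p ~ 0.046 throughout
    m0 = stats.binom.pmf(k, n, 0.5)      # marginal likelihood of the point-null model M_0
    m1 = 1.0 / (n + 1)                   # uniform prior on theta gives exactly 1/(n+1)
    print(n, m0 / (m0 + m1))             # P(M_0 | X) with equal prior odds on the models
```

Same data in each row: the p-value stays below 0.05 throughout, while P(M_0|X) climbs from roughly 0.5 to nearly 1.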
The Secretary Problem tells us that once you’ve lived 1/e (~37%, 30ish years) of your life[1], the next time you see something that’s stupider than everything you’ve seen before, there’s a 1/e chance that it’s the stupidest thing you’ll ever see.
[1] Strictly speaking it would be 1/e of your stupidity sightings, which may not be 1/e of your life. If you intend to retire early and become a hermit you may want to stop the exploration phase earlier.
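If anyone wants to sanity-check the 1/e rule, here's a throwaway simulation (all parameters mine, purely illustrative): skip the first n/e items, then commit to the first record-breaker after that.

```python
import math, random

def simulate(n=100, trials=50_000):
    """Fraction of runs where the 1/e stopping rule finds the overall maximum."""
    cutoff = round(n / math.e)
    wins = 0
    for _ in range(trials):
        xs = [random.random() for _ in range(n)]
        best_seen = max(xs[:cutoff])                               # exploration phase
        pick = next((x for x in xs[cutoff:] if x > best_seen), xs[-1])
        wins += pick == max(xs)                                    # did we pick the max?
    return wins / trials

print(simulate())   # hovers around 0.37, i.e. about 1/e
```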
It has a finger on the long-term trend of decreasing relevance for labor and increasing relevance for capital as factors of production, but it's certainly not a metric I'd choose, and that's why I tried so hard to steer towards something better.
One can imagine a world where productivity increases, the need for old jobs is reduced, but newer, better jobs more than replace them because the economy is experiencing genuine growth. Self-serving capital rhetoric will push you to always imagine it this way, self-serving labor rhetoric will push you to never imagine it this way, but good policy lies in figuring out what's actually happening in aggregate and responding accordingly (the framing I tried to push).
It's not productivity itself; it's the decoupling of productivity from wages. If I'm creating 3 times as much value as my equivalent in 1970, why aren't I getting paid 3 times as much inflation-adjusted money, hmm? It's not even unfair to shareholders - they'd also get 3 times as much as in 1970. But instead they get 10 times as much and I get 0.7 times as much, or something like that. What's the deal?
> If I'm creating 3 times as much value as my equivalent in 1970, why aren't I getting paid 3 times as much inflation-adjusted money, hmm?
Because that increase in productivity comes almost entirely from technology owned by your employer.
To look at it in a contrived example, let's take textiles. There is a textile factory employing weavers who weave fabric by hand, and the factory owner buys a new automated weaving machine that makes the weavers each 3 times more productive. The maker of the machine created the technology, and is paid for it; the owner of the factory made the investment to bring in the technology, and profits from it.
This is basically exactly what has happened to modern productivity.
Except in technology where the gains come from my personal investment in skills. I'm spending hours every week keeping up with the field of software engineering. I've been investing in learning my craft since I was 14 or so.
I'd argue the same goes for many types of digital creators, artists, video editors, animators, and so forth.
> Except in technology where the gains come from my personal investment in skills.
Not really. That's essentially a weaver learning to use the new automated weaving machine. That is what you do to remain qualified for the job. Now, if you were a framework or key system creator, building the underlying platforms that get adopted throughout the industry, I would agree. But just learning to use the tooling the industry creates isn't that different, other than the rate of change you have to keep up with.
A weaver who knows how to use an automated weaving machine produces 3 times as much cloth as one who doesn't, so why don't they get paid 3 times as much? This is the problem of the decoupling of productivity and wages. It started happening at precisely the moment the gold standard was ended - weird.
> A weaver who knows how to use an automated weaving machine produces 3 times as much cloth as one who doesn't, so why don't they get paid 3 times as much?
An automatic weaving machine, operated by a capable operator, produces 3 times as much as a manual weaver. The productivity increase is the machine, not the operator. That's my entire point.
The owner of the machine reaps the surplus, not its operator.
> This is the problem of the decoupling of productivity and wages. It started happening at precisely the moment the gold standard was ended - weird.
You'll get no argument from me about the ills caused by the financialization of the economy, but I don't think that's what's going on here.
>>A weaver who knows how to use an automated weaving machine produces 3 times as much cloth as one who doesn't, so why don't they get paid 3 times as much?
> An automatic weaving machine, operated by a capable operator, produces 3 times as much as a manual weaver. The productivity increase is the machine, not the operator. That's my entire point.
An automatic weaving machine operator, operating a capable machine, produces 3 times as much as the lack of a machine operator. The productivity increase is the operator, not the machine. That's my entire point.
What's different between what I just said and what you just said? Nothing. In fact they can both be true. Both parties can get 3 times as much money as they did previously. Why don't they? Why does one party get 10x and the other party get 0.7x?
If the productivity increase is entirely caused by machines, why did it take until 1971 for wages to decouple? The reality is that both workers and owners would like their share to be as high as possible. In 1971, however, owners seized control of the money printer and have never let go of it since.
> Both parties can get 3 times as much money as they did previously.
Increased productivity shifts the supply curve, which will (unless demand has zero elasticity, which is unrealistic) lower the market price of the good. So tripling productivity does not triple the amount of revenue per hour worked.
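A toy illustration of that point, with entirely made-up numbers: a linear demand curve p = 10 - 0.01*Q and a workforce whose hourly output triples.

```python
A, B = 10.0, 0.01                # invented linear demand curve: price = A - B * quantity
WORKERS = 100

def revenue_per_worker_hour(output_per_hour):
    quantity = WORKERS * output_per_hour      # total supply per hour
    price = max(A - B * quantity, 0.0)        # market-clearing price off the demand curve
    return price * output_per_hour            # revenue attributable to one worker-hour

print(revenue_per_worker_hour(1.0))   # before: 9.0
print(revenue_per_worker_hour(3.0))   # after: 21.0, only ~2.3x despite 3x productivity
```

Revenue per worker-hour rises from 9 to 21, about 2.3x rather than 3x, because the extra supply pushes the price down.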
> Why does one party get 10x and the other party get 0.7x?
Because the people purchasing labor (capital) are able to get the labor they need at that price. Automatic weaving machine operators are trainable, and if they were getting paid 3 times what weavers were paid then people would rush into that space, driving down labor prices—in other words, the supply of automatic weaving machine operators has high elasticity. The demand for automatic weaving machine operators (i.e. the supply of factories full of automatic weaving machines) has much lower elasticity, so capital (demand for labor) gets most of the economic surplus.
It comes down to the capital owners owning the money printer. And nothing else.
I'm aware of a few attempts to create a labour-owned money printer (using the ideas of cryptocurrency) but none that are getting off the ground. Bitcoin is not one - it was a good idea to try, but it got captured by capital just the same as fiat money did.
You can grasp for vague conspiracy theories about “the money printer”, or you can sit down and think about the concrete factors that make demand for labor (i.e. capital investments in buildings and equipment) less elastic than the supply of labor. Here are a few:
- It’s fundamentally more difficult to raise and organize millions of dollars to build a factory and fill it with automatic weaving machines than it is for someone to train for a few weeks to become an automatic weaving machine operator.
- Various government regulations, from environmental protections and zoning laws that make it harder to build factories to safety regulations for operating factories, make it harder to open new factories and so decrease the elasticity of labor demand. I want to be explicit here that I am not saying these regulations are bad—but we must recognize the side effects they have.
- Long lead times on capital investments greatly increase the risk of market movements or technological advances making the business plan untenable before it gets off the ground.
- Organizational inertia slows staffing changes. Corporations often make decisions at glacial speeds. Want to hire a new team? Who is going to manage them? Who do they report to? Where will they work? These discussions can drag on for months, at which point the market has changed and ehhhh maybe we don’t want to hire a new team after all.
- High cost and difficulty of firing people makes hiring for a possibly short-term market opening less attractive. Think union contracts, severance pay, etc. Again, I want to be explicit that I’m not saying these are bad things, but we need to understand the effects they have.
No it’s not. If the increased productivity is realized by multiple industries, then they all compete on price and the price of their goods comes down. That means the consumers of the product capture the gains in productivity.
Farmers using machinery instead of labor has meant cheaper food for everyone, not rich farmers.
I think that if we look at inflation-adjusted productivity and inflation-adjusted average income, that would indeed prove increasing inequality, right?
I believe the chart in this link is adjusted for inflation, showing overall the same trend:
I haven’t seen this representation before—I suppose the vertices of the graph are the chessboard squares, the edges are adjacency (white squares can only be adjacent to black squares and vice versa, which gives the bipartiteness), and covering two squares corresponds to removing those two vertices from the graph?
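If I'm reading it right, the representation would look something like this sketch (my guess; the covered squares below are a hypothetical placeholder, since the original puzzle isn't shown here):

```python
def board_graph(n=8):
    """Vertices are squares of an n x n board; edges are side-adjacency."""
    squares = [(r, c) for r in range(n) for c in range(n)]
    edges = [((r, c), (r + dr, c + dc))
             for r, c in squares
             for dr, dc in ((1, 0), (0, 1))
             if r + dr < n and c + dc < n]
    return squares, edges

squares, edges = board_graph()

# Bipartiteness: every edge joins squares of opposite chessboard color.
assert all((r1 + c1) % 2 != (r2 + c2) % 2 for (r1, c1), (r2, c2) in edges)

# Covering two squares = deleting those two vertices (and their incident edges).
covered = {(0, 0), (0, 1)}    # hypothetical covered pair
squares = [s for s in squares if s not in covered]
edges = [(u, v) for u, v in edges if u not in covered and v not in covered]
```

If the pieces doing the covering are dominoes, each one corresponds to an edge, so a complete cover would be a perfect matching of this bipartite graph.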
Knowing that the solution is unique makes this trivial to solve in a couple minutes just by scribbling on a piece of paper (I just did). It does not seem more subtle than the original.
Proving that the solution is unique may be more subtle.
Do you have any evidence that the cancer is a type that would have been caught by a screening regime currently in place in other countries which is not in place in the US?
Without such evidence your post reads more like propagandizing a death for political purposes than an honest argument.
> Do you have any evidence that the cancer is a type that would have been caught by a screening regime currently in place in other countries which is not in place in the US?
Do you have any evidence that it wasn't?
I honestly don't know if earlier detection was possible, or would have helped her out or not. What I can tell you is that given the state of health care in this country, you can bet that my default assumption would be "yes" until proven otherwise.
Starting with the assumption of "no" gives our system more slack than it deserves.
Most types of cancers are not routinely screened for. The post says that the cancer was in her liver and lungs, and neither liver cancer nor lung cancer is routinely screened for (lung cancer screening is recommended for people with a history of heavy smoking).
> What I can tell you is that given the state of health care in this country, you can bet that my default assumption would be "yes" until proven otherwise.
This is clearly a politically motivated point rather than one grounded in science or reality. Cancer screening in the US is generally more aggressive, not less aggressive, than in other developed countries. For example, the US has historically recommended annual mammograms starting at age 40, while Europe doesn't start until age 50 and only does them every two years. US guidelines are to start screening for colon cancer at age 45 (cf. 50 in most of Europe), and the US uses a much more invasive (and costlier) approach to colon cancer screening on top of the age gap.
If anything the US probably overinvests in cancer screening. The evidence in favor of starting mammograms at 40 is extremely dubious, as is the evidence for invasive and expensive colonoscopies (standard US practice) over fecal matter tests (standard European practice) for colon cancer screening.
> The post says that the cancer was in her liver and lungs, and neither liver cancer nor lung cancer is routinely screened for ...
If you have got cancer in your liver and lungs, then those are probably metastases, and most often the original cancer is in the colon.
> the evidence for invasive and expensive colonoscopies (standard US practice) over fecal matter tests (standard European practice) for colon cancer screening [is extremely dubious].
Fecal matter tests will tell if you have got a tumour and that tumour is bleeding. But not all colon tumours bleed.
Colon cancer can be a silent killer that often goes without symptoms for years until it has metastasised and become terminal.
A colonoscopy will tell you if you have got a polyp — an early pre-stage of cancer. And a polyp can be removed right then and there during the procedure with a tiny wire-loop or claw at the end of the instrument — and then you're safe.
I recommend that everyone who is 45 y/o or older get a colonoscopy every ten years. That is how long a polyp takes to develop into a tumour... for normal people.
Myself, I have Lynch syndrome, so I have had to start earlier and get a colonoscopy every year. I had my fourteenth two days ago.
A COLONOSCOPY IS NO BIG DEAL. It is not invasive, it is not sexual, it is not demeaning. Everyone is professional, interested in your intestine, not your butt. It usually does not hurt, and if it does, it is because of gas, as there are no other types of sensory nerves in the colon. If you are otherwise healthy, it is not dangerous. You can get it done medicated, or even sedated if you want. I usually do it without any such drugs. The worst part is not the procedure but the prep — because laxatives taste bad. But if you are healthy and ask for it, a doctor can give you a stronger laxative that you don't have to drink as much of.
Colonoscopies, involving inserting instruments into the body, are definitely invasive medical procedures.
> An invasive procedure is one where purposeful/deliberate access to the body is gained via an incision, percutaneous puncture, where instrumentation is used in addition to the puncture needle, or instrumentation via a natural orifice. It begins when entry to the body is gained and ends when the instrument is removed, and/or the skin is closed. Invasive procedures are performed by trained healthcare professionals using instruments, which include, but are not limited to, endoscopes, catheters, scalpels, scissors, devices and tubes.
[1], emphasis added.
> A medical procedure that invades (enters) the body, usually by cutting or puncturing the skin or by inserting instruments into the body.
[2], emphasis added.
> An invasive procedure is one in which the body is "invaded", or entered by a needle, tube, device, or scope.
[3], emphasis added.
Is it a big deal? Maybe not to you, maybe to other people. Is it better than a much cheaper (and non-invasive) FOBT? Questionable.
NordICC [4] found an 18% reduction in colon cancer incidence after 10 years with a colonoscopy screening program, but no statistically significant reduction in mortality (either colon cancer or all-cause). Hardcastle et al. [5] found no reduction in colon cancer incidence but a 15% reduction in colon cancer mortality after 7.8 years with an FOBT screening program.
Everyone's gungho about evidence-based medicine until the evidence fails to support their preferred procedures.
Looking at corporate profit levels versus wage levels over the past twenty years, the U.S. as a capitalist country can afford a great deal more healthcare inflation in order to raise the quality of life of its population.
Should its businesses afford that out of their profits?
Households can’t afford eggs, much less health care costs, at the wages paid by businesses, so this decision is up to firms rather than households. Founders, your input would especially be appreciated here.
Even if inappropriate, this reads like a normal expression of grief to me.
It's normal to be upset about the circumstances under which someone died, and to be angry if you believe it was avoidable. Under the five stages model, this would be bargaining and anger.