
You're right, the thing limiting the value is not demand - there are more than enough faculty and high-level staff that want to live there.

The thing limiting the value is the other restrictions that Stanford imposes on these houses - namely they essentially control the price the houses sell for because they all have to be financed through a Stanford-controlled lending program.


FYI, if you make a realistic calculation (including tax and multiple winners), the EV of a lottery ticket is still negative.

Depending on your tax bracket, the non-jackpot prizes have about $0.19 of EV.

In a state where you don't pay tax on lottery winnings, like CA, your cash option post-tax is currently estimated at $408,403,045.

My estimate of the multiple winner correction to the EV is 0.8, based on numbers of winners for prizes >$500M.

Your jackpot EV is therefore $1.08 and the total winnings EV is therefore $1.27, meaning the overall EV is -$0.73.

For there to be positive EV, that delta needs to all come from the jackpot, so the headline jackpot number would need to go from $1.1B to $1.8B to get it there.
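
For anyone who wants to redo the arithmetic, here it is spelled out in a few lines of Python. The cash option, the 0.8 split correction, and the $0.19 non-jackpot EV are the numbers quoted above; the 1-in-302,575,350 jackpot odds are my own assumption (the published Mega Millions jackpot odds, consistent with the roughly 1 in 300 million figure mentioned elsewhere in the thread).

    # EV per $2 ticket, using the figures discussed above
    jackpot_post_tax = 408_403_045      # post-tax cash option (no state tax, e.g. CA)
    jackpot_odds     = 1 / 302_575_350  # assumed Mega Millions jackpot odds
    split_correction = 0.8              # discount for possibly splitting the jackpot
    non_jackpot_ev   = 0.19             # EV of the smaller prizes, after tax
    ticket_price     = 2.00

    jackpot_ev = jackpot_post_tax * jackpot_odds * split_correction
    total_ev   = jackpot_ev + non_jackpot_ev
    net        = total_ev - ticket_price
    print(f"jackpot EV ≈ ${jackpot_ev:.2f}, total EV ≈ ${total_ev:.2f}, net ≈ ${net:.2f}")
    # prints roughly: jackpot EV ≈ $1.08, total EV ≈ $1.27, net ≈ $-0.73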


Great analysis. I want to share my thoughts on EV because they seem to go against what pretty much everyone else is saying. I think even if the EV was >$2, it would still be irrational to play. I say this because I think it's silly to talk about the EV of buying a single ticket when the odds of winning are 1 in 300 million.

Like, you still aren't going to win the jackpot. Your odds didn't improve just because the jackpot got bigger. I guarantee you are going to find out tonight that you wasted $2, just as you would have wasted $2 at any other time when the jackpot is/was smaller.

I think it's just people rationalizing what they know is an irrational behavior. They're fooling themselves into believing they are making a rational choice because they want in on the fun/excitement despite paying the "poor tax."

Anyway, good luck to anyone who decided to buy a ticket and enjoy a few hours of fantasizing about what you'd do with the winnings :)


Most of what you're describing is captured by the Kelly Criterion: it tells you the "optimal" fraction of your bankroll you should bet to maximize returns given the odds and EV. For a game with positive EV and 1 in 300M odds, you should be betting a very small fraction of your overall bankroll -- well below $2 for anyone not already in the 0.01% wealth bracket. So indeed, it would be irrational (suboptimal) to participate in the lottery unless you were already very wealthy.

Even beyond the Kelly Criterion, you could also tweak your EV based on the marginal utility of a dollar -- i.e. going from $0 in savings to $10M would have a massive impact on your quality of life, whereas 10x'ing from there ($100M) would provide comparatively marginal additional benefit. Thus, (ignoring Kelly) the "quality of life EV" for a $10M jackpot with 1 in 5M odds is vastly better than a $100M jackpot with 1 in 50M odds despite having the same EV.

https://en.wikipedia.org/wiki/Kelly_criterion
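
To make the Kelly point concrete, here's a rough sketch with made-up but plausible numbers: a single jackpot prize at 1-in-300M odds and a hypothetical post-tax payout big enough for positive EV, ignoring the smaller prizes and jackpot splitting. For a binary bet, Kelly says to stake the fraction f* = p - (1 - p)/b of your bankroll, where p is the win probability and b is the net payout per dollar staked.

    def kelly_fraction(p, b):
        # Kelly fraction for a binary bet: win probability p, net odds b
        return p - (1.0 - p) / b

    ticket = 2.0
    p = 1 / 300_000_000          # roughly the jackpot odds discussed above
    payout = 900_000_000         # hypothetical post-tax payout, so EV ≈ $3 per $2 ticket
    b = payout / ticket - 1      # net payout per dollar staked

    f = kelly_fraction(p, b)
    print(f"Kelly fraction: {f:.1e}")                                  # ~1e-9 of bankroll
    print(f"Bankroll where one $2 ticket is 'optimal': ${ticket / f:,.0f}")

Even with a comfortably positive EV, the optimal stake only reaches $2 for a bankroll in the billions, which is the "very small fraction" point above.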


Nice, I had heard of the Kelly Criterion before but never thought to consider applying it to this situation. It's fantastic if the mathematically correct amount to wager (for any ordinary person) is less than $2, which rounds to 0 tickets and supports my intuition not to play.

Your second paragraph touches on another thought I had, which is why people think it's only worthwhile to play when the jackpot is $600M+. Any of us would be happy to win just $20M, so why not play for every jackpot? Again it comes down to that misleading EV calculation, which I believe doesn't even matter if you are only going to buy one ticket. Wagering based on the EV would only make sense if it were feasible to buy on the order of 100M tickets. Then you'd have a reasonable shot at winning each jackpot, and playing enough jackpots, you would come out ahead.


They say you can't win if you don't play (pay). But that's not true! There is an epsilon chance that someone dropped their lottery ticket, the wind blew it away, they gave up on getting it back, and while you're taking your groceries to your vehicle in a parking lot it blows by you and you pick it up. And there's an epsilon prime chance that it will be the winning ticket. So there is an epsilon times epsilon prime chance that you win the lottery without actually playing (paying into) it. ;)


I like to say that your odds don't significantly change whether you pay for a ticket or not. Both are non-zero but very close.


You could flip the perspective and look at it as helping out the winner.

Still, for most people, their biggest return is being able to daydream for 5 minutes about what it would be like to win.


Totally. And it becomes a cultural phenomenon - neighbors and coworkers are chatting about it … you want to be part of that moment. Here’s what I wonder - what is the maximum frequency of big media hyped jackpots? Would we all do this again next week, next month? I don’t think so. Maybe next year?


About a decade ago, Megabucks in Wisconsin (a Wisconsin State pick-six game) had a positive EV even after taxes were taken out. I think it was maybe a penny at best. However, buying more than one ticket is still foolish in my book. 1 in 14 million vs 1 in 1.4 million vs ... they're all ridiculously long odds. You're more likely to die driving to get your ticket.


There are four questions at the end of the post.

For sure, the second one is answered - it is possible to parallelize GNNs to the billion scale while still using message passing. It requires rethinking how message passing is implemented, modifying objective functions so that they work in parallel, and changing ML infrastructure. You're not going to get to large graphs with generic distributed TensorFlow.

I don't know if the third question is fully answered, but there are many approaches to preserving locality, either by changing architectures or changing objective functions.
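
(Just to anchor the terminology for readers who haven't worked with GNNs: below is a toy, single-machine sketch of what one message-passing step computes. It says nothing about the distributed implementation being discussed, and the shapes and mean aggregator are arbitrary illustrative choices.)

    import numpy as np

    def message_passing_step(adj, h, w_self, w_neigh):
        """adj: (n, n) adjacency matrix, h: (n, d) node features."""
        deg = adj.sum(axis=1, keepdims=True).clip(min=1)
        neigh_mean = (adj @ h) / deg                    # aggregate neighbours' messages
        return np.maximum(h @ w_self + neigh_mean @ w_neigh, 0.0)  # ReLU update

    rng = np.random.default_rng(0)
    n, d = 5, 8
    adj = (rng.random((n, n)) < 0.4).astype(float)
    np.fill_diagonal(adj, 0)
    h = rng.normal(size=(n, d))
    h_next = message_passing_step(adj, h, rng.normal(size=(d, d)), rng.normal(size=(d, d)))
    print(h_next.shape)  # (5, 8): same nodes, updated features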

Also, erratum: PinSage was developed for Pinterest, not Etsy (hence, not EtsySage).


I’m a researcher working in the fraud detection domain.

Do you have some pointers on scaling GNNs to such large problems?


Thank you for pointing that out, we've corrected that in the article!


It's unbelievable that nobody has done this well - most people I know, myself included, are tracking their job searches in a spreadsheet.

Pretty obvious value-add for job searchers. Not to mention that having access to this data would enable tons of other product features. Shows that most services/sites don't care that much about the applicant experience.



To be fair, 50 PU is above the design peak luminosity, let alone the mean. And I'm sure I've seen plots from both ATLAS and CMS at the end of LS1 showing improvements in the processing time at 100 PU by factors of roughly 10.


The Hough transform is basically useless for precision tracking in a high-multiplicity environment. Tracks have 5 degrees of freedom, so the memory costs make it infeasible. I think ATLAS uses it in the trigger, where you don't need to actually reconstruct all tracks, just find out whether there are a couple passing certain criteria.
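
A quick back-of-envelope on the memory point, with purely illustrative bin counts: a dense accumulator over 5 track parameters blows up even for modest binning.

    # 4-byte counters in a dense 5D accumulator
    for bins_per_dim in (50, 100, 500):
        n_bins = bins_per_dim ** 5
        gib = n_bins * 4 / 2**30
        print(f"{bins_per_dim} bins/dim -> {n_bins:.1e} bins ≈ {gib:,.0f} GiB")
    # 50 -> 3.1e+08 bins ≈ 1 GiB; 100 -> 1.0e+10 ≈ 37 GiB; 500 -> 3.1e+13 ≈ 116,415 GiB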


You don't have to accumulate the transformation result into an actual five-dimensional array; you can just transform the points, keep them in a list, a quad tree, or whatever you like, and then run a clustering algorithm on the transformed points. This is probably complicated by the fact that every point can vote for several parameters, so you are not actually clustering points but something like lines or planes associated with the points.

That also seems to be - though I did not look at the code - more or less what the creators of the challenge implemented and submitted as a benchmark implementation, admittedly with the expected poor performance score of only about 20%.


If you know a way to find clusters of intersections of hyperplanes, I'm pretty sure you can get a highly acclaimed paper out of it, but that is not what a Hough transform is. The Hough transform is an approximate solution to that problem which works by sampling the hyperplanes and then polling them. There's no way to perform a Hough transform without having many more transformed points than input points, and the more precision you need, the more points you need.
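
For anyone unfamiliar with the mechanism, here's a minimal 2D line-finding Hough transform (the toy case, not 5-parameter tracks), which shows the sampling-and-polling directly: each input point casts one vote per sampled theta, so the number of transformed points is n_points * n_theta, and it grows with the precision you ask for.

    import numpy as np

    def hough_lines(points, n_theta=180, n_r=200, r_max=10.0):
        thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
        acc = np.zeros((n_theta, n_r), dtype=np.int32)   # the accumulator being "polled"
        for x, y in points:
            r = x * np.cos(thetas) + y * np.sin(thetas)  # one curve of votes per point
            r_bin = np.clip(((r + r_max) / (2 * r_max) * n_r).astype(int), 0, n_r - 1)
            acc[np.arange(n_theta), r_bin] += 1
        return acc, thetas

    # Three hits on the line y = x plus one stray hit
    hits = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (5.0, -3.0)]
    acc, thetas = hough_lines(hits)
    t_idx, r_idx = np.unravel_index(acc.argmax(), acc.shape)
    print("votes cast:", len(hits) * len(thetas), "for", len(hits), "hits")   # 720 vs 4
    print("best bin:", acc[t_idx, r_idx], "votes near theta =", round(float(thetas[t_idx]), 2))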


I see, they actually did two different implementations, one a DBSCAN-based clustering approach and one based on a Hough transform, and submitted the former as a benchmark.

Notwithstanding that, I am not yet convinced that a Hough transform combined with something similar to a quad tree could not work. More specifically, I am thinking of delaying the creation of votes. Roughly, the first point just becomes a node in the tree corresponding to the bounding box of its entire possible parameter space. Only when we encounter a second point whose possible parameter space overlaps with that of the first one do we split the two volumes into one volume for which both points vote and a few volumes for which only one of the points votes.

This obviously requires that the possible parameter spaces do not have terrible shapes that are hard to bound, and I could also see nearly perfectly overlapping volumes causing issues, since the imperfection in the overlap generates many small volumes. There are probably more issues and possibly even showstoppers, but without picking up a pencil and really thinking about it, I can't really tell whether or not it could work out. But, as said, I am also unable to see immediately why this could never work.
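
To make the delayed-vote idea slightly more concrete, here is a deliberately crude 1D toy (intervals instead of 5D volumes, no tree, just a linear scan). It only illustrates the bookkeeping of splitting regions into "both vote" and "only one votes" pieces; whether this scales to real parameter-space shapes is exactly the open question above.

    def insert_interval(regions, new_iv, new_id):
        """regions: list of ((lo, hi), ids). Intervals stay pairwise disjoint."""
        lo, hi = new_iv
        out = []
        uncovered = [(lo, hi)]   # parts of the new interval overlapping nothing yet
        for (a, b), ids in regions:
            left, right = max(a, lo), min(b, hi)
            if left >= right:                        # no overlap: keep region untouched
                out.append(((a, b), ids))
                continue
            if a < left:                             # remainder voted for by old ids only
                out.append(((a, left), set(ids)))
            if right < b:
                out.append(((right, b), set(ids)))
            out.append(((left, right), ids | {new_id}))   # overlap: everyone votes
            uncovered = subtract(uncovered, (left, right))
        out.extend((piece, {new_id}) for piece in uncovered)
        return out

    def subtract(pieces, cut):
        """Remove the interval `cut` from a list of disjoint intervals."""
        c0, c1 = cut
        result = []
        for p0, p1 in pieces:
            if p1 <= c0 or p0 >= c1:
                result.append((p0, p1))
                continue
            if p0 < c0:
                result.append((p0, c0))
            if c1 < p1:
                result.append((c1, p1))
        return result

    regions = []
    for i, iv in enumerate([(0.0, 3.0), (2.0, 5.0), (2.5, 2.8)]):
        regions = insert_interval(regions, iv, i)
    print(sorted(regions, key=lambda r: r[0]))
    # the slice (2.5, 2.8) ends up with votes from all three "points"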


Depends exactly how you limit the problem, but going from detector hits to abstract arcs takes a few seconds.


The main thing of interest here is whether a system that knows about physics (traditional approaches) can be beaten by a system that has no a priori knowledge of physics (OOB DL), or whether someone will find a way to integrate physics knowledge into a DL approach.


>That number is calculated without including info about any theory/evidence regarding the Higg's [sic] boson.

That's not true. You can see plots in the 2011 and 2012 papers that give these calculations as a function of mass. It's basically impossible to make a mass-independent calculation. (Notwithstanding the fact that there's no Higgsless theory to do calculations for the background.)

>As an example, I'd imagine detecting the Higg's [sic] boson at 1 eV energy levels is theoretically predicted to be even less likely than billions/trillions to one odds, therefore detector noise would be a more likely explanation for such results, despite the low p-value.

You're right, the p-value is much lower. Previous experiments have long since excluded such a low Higgs mass. Also, if the Higgs mass were so low, we probably wouldn't exist.


1) Typos fixed.

2) "That's not true. You can see plots in the 2011 and 2012 papers that give these calculations as a function of mass."

- Can you explain what you mean via figure 1 in this paper: https://arxiv.org/abs/1207.7235 ?

- I don't see the relevance of calculating p-values as a function of mass to my comment

3) "Not withstanding the fact that there's no Higgsless theory to do calculations for the background"

- Then what model did they use to calculate the p-values? (I do not know the details but am fairly certain it is one where there is no Higgs boson at any given mass)

4) "Also, if the Higgs mass were so low, we probably wouldn't exist."

- Ok, but I've also read headlines like "CERN proves the universe shouldn't exist", clearly this is just because their model of the universe is wrong. I'm sure in a pinch people could come up with some kind of balancing out of whatever problems would arise from such a small Higgs mass. The point was that assuming the current theory is correct, the Higgs would be a much worse explanation than detector noise.


RE 2: What I mean is that "N-sigma" and local p-value are being used interchangeably, which you can see from this plot (a quick conversion sketch is below). There's no "number ... calculated without including info about any theory/evidence regarding the Higgs boson".

RE 3: The model used to calculate the background is unphysical in the sense that they set the Higgs production cross section to zero without changing anything else. There's no physical Higgsless theory.

RE 4: I was agreeing with you and adding some facts.
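
Since "N-sigma" and local p-value keep coming up: the quoted significances are just one-sided Gaussian tail probabilities, so the conversion mentioned in RE 2 is a one-liner (a sketch, not taken from the papers themselves).

    from scipy.stats import norm

    for n_sigma in (3, 5):
        p = norm.sf(n_sigma)       # one-sided Gaussian tail probability
        print(f"{n_sigma} sigma -> local p-value ≈ {p:.2e} (about 1 in {1/p:,.0f})")
    # 3 sigma -> ~1.3e-03 (1 in ~741); 5 sigma -> ~2.9e-07 (1 in ~3.5 million)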


>'What I mean is that "N-sigma" and local p-value are being used interchangeably, which you can see from this plot. There's no "number ... calculated without including info about any theory/evidence regarding the Higgs boson".'

Sorry, I still don't see what information is being used that requires theory/evidence regarding the Higgs boson. Whether or not anyone knows about the Higgs boson they could be plotting p-value by mass for these experiments.

>"The model used to calculate the background are unphysical in the sense that they set the Higgs production cross section to zero without changing anything else. There's no physical Higgsless theory."

So they actually change their model to be surely false, then go on to prove the known false model they just created... is false. This is pointless.


> they could be plotting ... by mass

He was saying that the mass calculation depends on the theory about the Higgs? They can't just put it on a scale :)


You may be right on that point. For some reason I was thinking they just summed up the velocities, etc., from whatever hits the detector. Perhaps somewhere they assume something about the Higgs boson, though.

This makes the significance testing they do even more ridiculous though. At first I thought the model of background noise was simply not providing much info about the topic of interest: the existence and mass of the Higgs.

According to what I have learned here, it is much worse. Not only are they testing a purposefully rendered false model (and taking rejection of that known-false model as evidence for the Higgs), but they are also assuming the Higgs exists (and has whatever properties you all are referring to) as part of this process. As a result, the Higgs exists either way (whether the background model is rejected or not) according to this process.

Doing this test sounds pretty meaningless to me.


So there's no obvious practical application (right now).

Also, the proton is not 4% smaller. Protons are obviously whatever size they are.

The discrepancy comes from the fact there are two techniques to measure the proton size. Both experiments do their thing and then there's a way to interpret the results that would tell you the size of the proton (look up proton form factors).

However, when you do the interpretations, which depend on some theoretical calculations, you get different results. The general thinking around this result, because nobody has found any issue with the experimental results, is that there are some additional interactions that are stronger than expected that need to be accounted for (there are some unknown quantities that allow this).

One of the interactions would only affect the muonic hydrogen measurement - basically, because of the muon's mass, some interactions between muons and protons differ from those between electrons and protons, and those might be different than originally thought.

The other is a type of interaction that could affect both normal and muonic hydrogen. This new measurement shows that the interactions that affect both have to play an important role in understanding this discrepancy. There are other measurements trying to measure this effect independently (not using hydrogen at all).

