Hacker News | maxminminmax's comments

Yes, but that’s not the relevant datum, because of selection effects. The relevant question is how well employed is the person who had a choice to do a Ph.D. or not compared to the counterfactual person who made an opposite choice.

As an example, an Ivy graduate makes more than a state school graduate on average, but there was a study showing that those offered Ivy admission but deciding to go to a state school made just as much (that study setup has its own selection bias issues, but hopefully that gives an idea of what I mean).
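
To make the selection-effect point concrete, here is a toy simulation, not a model of any real study. Every number in it is an assumption chosen for illustration: ability is standard normal, admission is a noisy cut on ability, admitted students choose their school by coin flip, and the school itself has zero effect on income.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

# All parameters below are illustrative assumptions, not data.
ivy, state_admitted, everyone_else = [], [], []
for _ in range(200_000):
    ability = random.gauss(0, 1)
    admitted = ability + random.gauss(0, 0.5) > 1.0    # noisy admission cut
    attends_ivy = admitted and random.random() < 0.5   # coin-flip choice
    income = ability + random.gauss(0, 1)              # school has NO effect here
    if attends_ivy:
        ivy.append(income)
    elif admitted:
        state_admitted.append(income)
    else:
        everyone_else.append(income)

# Naive comparison: Ivy attendees vs. everyone who did not attend.
naive_gap = mean(ivy) - mean(state_admitted + everyone_else)
# Comparison within the group that actually had the choice.
admitted_gap = mean(ivy) - mean(state_admitted)

print(f"Ivy vs. all non-Ivy:           {naive_gap:+.2f}")
print(f"Ivy vs. admitted-but-declined: {admitted_gap:+.2f}")
```

In this setup the naive gap is large even though attending has no causal effect by construction, while the gap among those who had the choice is close to zero.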


> because of selection effects

We're literally measuring a selection effect: that of pursuing a graduate degree.

> there was a study showing that those offered Ivy admission but deciding to go to a state school made just as much

Source?

I'm not rejecting the hypothesis that this is a measurement error. But it's been observed across multiple countries for several generations. The burden of proof is on the hot take that graduate degrees in general are a bad economic bet. (Note: I don't have a PhD. I went to a state school. So your hypothesis is tempting to believe, hence my scepticism.)


It’s ironic to see the link to that post, which implicitly assumes that functions are defined on R^n, under a post about the Legendre transform, whose point is that functions are defined on state spaces of systems and only become represented by functions on R^n once we parameterize the state spaces by state variables. So the value of f at a given point doesn’t depend on what your favorite letter (aka state variable) is, but the value of f’ certainly does. And the Legendre transform, as is actually explained, albeit cryptically, in Goldstein, comes from the fact that we have a 2d state space (the phase space of a 1d system with config variable y, velocity variable x, and momentum variable u) on which we have non-linearly related variables x and u.

In the Legendre transform, what we have (the y variable is a red herring, and I will ignore it; everything happens "pointwise in y") is a curve in the u-x plane, which we lift to u-x-z space in two ways -- that is, we find functions f and g defined for the points on that curve such that: 1) if the curve is parametrized by x, so that f is a function of x, then df/dx = u; 2) if the curve is parametrized by u, so that g is a function of u, then dg/du = x. (Why do we want this? Presumably because when x is velocity and f is energy, u is momentum, and we want g to have the same property going back. And yes, there are conditions under which one can parametrize a curve by one of the coordinates, either locally or globally; one such is that u is a monotone increasing function of x - that corresponds to convexity of f.) Of course now "derivatives of f and g are inverse" is tautological.

If we already know f(x), but know neither u nor g, we could set u = df/dx and try to compute g. Or we could do it the way Goldstein does it: dg/du = x, so dg = x du (this is an ODE), and integrating it "by parts" gives g = int x du = xu - int u dx = xu - f.
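
As a concrete check of the integration by parts above, here is a minimal worked case, assuming the quadratic f(x) = m x^2/2 (the constant m > 0 and this choice of f are illustrative assumptions, not taken from Goldstein):

```latex
\begin{align*}
f(x) &= \tfrac{1}{2} m x^2, \qquad u = \frac{df}{dx} = m x \quad\Longrightarrow\quad x = \frac{u}{m}, \\
g(u) &= x u - f = \frac{u^2}{m} - \frac{u^2}{2m} = \frac{u^2}{2m}, \\
\frac{dg}{du} &= \frac{u}{m} = x .
\end{align*}
```

Here u = mx is monotone increasing in x, so the curve can be parametrized by either coordinate, and g recovers x as its derivative, as required.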

(In advanced speak, the u-x curve is a Lagrangian in the u-x plane, which is symplectic, as every sum of a vector space and its dual is; the functions f and g correspond to lifts of this Lagrangian to Legendrians based on the choice of "canonical" 1-forms u dx and x du, respectively, so that df - u dx = 0 and dg - x du = 0.)


That link is about something more fundamental wrt the underlying calculus: how the differential notation used in these explanations can often be confusing, especially when you have differentials whose variable is a function of other variables. That's all it's trying to clear up.


*then dg/du = x and dg - xdu =0


This is one of the most strawman (to put it mildly) things I have ever read.


Where is it?


>but probabilistically the strategy is optimal.

For what value function? It is basically never the case that my value function is "all choices other than the optimal are equally bad" -- which is what this rule is based on.

As a personal opinion, this drives me up the wall. There is a great problem here, and there is a whole area (several of them, actually!) of applied math dedicated to it (Statistical Decision Theory, Reinforcement Learning, you name it). Instead we get this toy version -- which at best is an oversimplified intro to the subject, and at worst an excuse to bamboozle with math-fairy-dust -- brought out as some kind of rule "to live by". Your algorithm is bad, and you should feel bad.
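
A small sketch of the value-function point, assuming the standard secretary setup with 8 candidates (the size, the scoring of candidates by rank 1..8, and the "take the last one if nobody qualifies" fallback are all assumptions made for illustration). It enumerates every arrival order and compares two objectives for the same skip-then-pick rule: the chance of getting the very best candidate, and the average rank of whoever is picked.

```python
from itertools import permutations

def pick(order, skip):
    """Skip the first `skip` candidates, then take the first one that
    beats all of them; if nobody does, settle for the last candidate."""
    threshold = max(order[:skip], default=0)
    for value in order[skip:]:
        if value > threshold:
            return value
    return order[-1]

def evaluate(n):
    """Exhaustively score every cutoff over all n! arrival orders."""
    wins = [0] * n     # times the single best candidate (rank n) was picked
    totals = [0] * n   # sum of the ranks of the picked candidates
    for order in permutations(range(1, n + 1)):
        for skip in range(n):
            chosen = pick(order, skip)
            wins[skip] += (chosen == n)
            totals[skip] += chosen
    return wins, totals

N = 8  # illustrative size; small enough to enumerate exactly
wins, totals = evaluate(N)
best_for_top = max(range(N), key=wins.__getitem__)      # "best or nothing"
best_for_value = max(range(N), key=totals.__getitem__)  # average rank matters

print("cutoff maximizing P(pick the best):", best_for_top)
print("cutoff maximizing average rank:   ", best_for_value)
```

For this size the two objectives already disagree about how many candidates to skip (3 versus 2), which is the sense in which the famous cutoff is optimal only for an all-or-nothing value function.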


I'm confused, isn't this literally one of the founding problems of "Statistical Decision Theory"?

That is, this may be a simplified version of the problem, but it is a legit problem from that field. And the results being presented here don't disagree with the legit problem, do they?

Now, is it a simplification of a simplification? Sure. I'm not clear on why it is as bad as you are putting forth, though.


People discuss it as though it has basically any real life applicability, but the assumptions are violated by basically every important real life decision ever.

Don't get me wrong, I love algorithms, CS and math and very much liked learning the secretary problem and solution. I just wouldn't think of it as practically useful.


I'm guessing you see/hear this discussed way more than I do, then. :D



105B/39.51M=2657. Can I just take the money?

