Good news is that a year in consulting isn't long enough to look like a detriment. Bad news is that the job market is crap right now.
> Is anyone aware of any firms that do formal-methods-like activities
Don't limit yourself to what you studied in your PhD. A big chunk of commercial research positions will be focused on AI.
> Has anyone got experience making a shift from a non-technical to a technical role?
Don't think of it as a shift from a non-technical to a technical role. Think of it as "I finished my CS PhD about a year ago and now I'm looking for a research/software job".
Yes, it is. And I say this as someone who explored significantly more than the surface. Even the way you “search” is different, as is the number of results and contrasting opinions you see.
It was not Helsing, sorry. The challenge was to build a zero-knowledge implementation from scratch as an auth/login server. I was able to get it working, but I spent a non-trivial amount of time on it, which is the real story of my job search lately. It's a part-time job just to keep up with code challenges while I'm applying this often for work.
I don't understand why Python gets shit for being a slow language when it's slow but no credit for being fast when it's fast just because "it's not really Python".
If I write Python and my code is fast, to me that sounds like Python is fast, I couldn't care less whether it's because the implementation is in another language or for some other reason.
Because for any nontrivial case you would expect Python + compiled library and associated marshaling of data to be slower than that library in its native implementation without any interop/marshaling required.
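To make the marshalling point concrete, here's a rough sketch (numbers are illustrative and machine-dependent; numpy is just a stand-in for "compiled library"):

    import timeit
    import numpy as np

    small = np.arange(10)
    large = np.arange(10_000_000)

    # For tiny inputs the cost of crossing the Python/C boundary dominates:
    # a pure-Python sum over ten ints is competitive with the numpy call.
    print(timeit.timeit(lambda: small.sum(), number=100_000))
    print(timeit.timeit(lambda: sum(range(10)), number=100_000))

    # For large inputs the compiled loop inside numpy wins by a wide margin.
    print(timeit.timeit(lambda: large.sum(), number=10))
    print(timeit.timeit(lambda: sum(range(10_000_000)), number=10))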
When you see an interpreted language faster than a compiled one, it's worth looking at why, because most of the time it's because there's some hidden issue causing the other to be slow (which could just be a different and much worse implementation).
Put another way, you can do a lot to make a Honda Civic very fast, but when you hear one goes up against a Ferrari and wins your first thoughts should be about what the test was, how the Civic was modified, and if the Ferrari had problems or the test wasn't to its strengths at all. If you just think "yeah, I love Civics, that's awesome" then you're not thinking critically enough about it.
Yep, and the next logical question when both implementations are for the most part bare metal (compiled and low-level) is: why is there a large difference? Is it a matter of implementation/algorithm, inefficiency, or a bug somewhere? In this case, that search turned up a hardware issue that should be addressed, which is why it's so useful to examine these things.
> Because for any nontrivial case you would expect Python + compiled library and associated marshaling of data to be slower than that library in its native implementation without any interop/marshaling required.
> When you see an interpreted language faster than a compiled one, it's worth looking at why, because most of the time it's because there's some hidden issue causing the other to be slow (which could just be a different and much worse implementation).
On the contrary, the compiled languages tend to only be faster in trivial benchmarks. In real-world systems the Python-based systems tend to be faster because they haven't had to spend so long twiddling which integers they're using and debugging crashes and memory leaks, and got to spend more time on the problem.
I don't doubt that can happen, but I'm also highly doubtful that it's the norm for large, established, mature projects with lots of attention, such as popular libraries and the standard library of popular languages. As time spent on the project increases, I suspect that any gain an interpreted language has over an (efficient) compiled one not only gets smaller, but eventually reverses in most cases.
So, like in most things, the details can sometimes matter quite a bit.
> I don't doubt that can happen, but I'm also highly doubtful that it's the norm for large, established, mature projects with lots of attention, such as popular libraries and the standard library of popular languages.
Code that has lots of attention is different, certainly, but it's also the exception rather than the rule; the last figure I saw was that 90% of code is internal business applications that are never even made publicly available in any form, much less subject to outside code review or contributions.
> As time spent on the project increases, I suspect that any gain an interpreted language has over an (efficient) compiled one not only gets smaller, but eventually reverses in most cases.
In terms of the limit of an efficient implementation (which something like Python is certainly nowhere near), I've seen it argued both ways; with something like K the argument is that a tiny interpreter that sits in L1 and takes its instructions in a very compact form ends up saving you more memory bandwidth (compared to what you'd have to compile those tiny interpreter instructions into if you wanted them to execute "directly") than it costs.
I think there's something to the idea of keeping the program in the instruction cache by deliberately executing parts of it via interpreted bytecode. There should be an optimum around zero instruction cache misses, either from keeping everything resident, or from deliberately paging instructions in and out as control flow in the program changes which parts are live.
There are complicated tradeoffs between code specialisation and size. Translating some of it back and forth between machine code and bytecode adds another dimension to that.
I fear it's either the domain of extremely specialised handwritten code - luajit's interpreter is the canonical example - or of the sufficiently smart compiler. In this case, a very smart compiler.
> On the contrary, the compiled languages tend to only be faster in trivial benchmarks. In real-world systems the Python-based systems tend to be faster because they haven't had to spend so long twiddling which integers they're using and debugging crashes and memory leaks, and got to spend more time on the problem.
This is an interesting premise.
Python in particular gets an absolute kicking for being slow. Hence all the libraries written in C or C++ then wrapped in a Python interface. Also why "Python was faster than Rust at anything" is headline-worthy.
I note your claim is that Python systems in general tend to be faster (outside of trivial benchmarks, whatever the scope of that is). Can you cite any single example where this is the case?
> Can you cite any single example where this is the case?
Plenty of line-of-business systems I've seen, but systems big enough to matter tend not to be public. Bitbucket's cloud and on-prem versions are the only case I can think of where you can directly compare something substantial between an implementation known to be written in Python and an implementation that's known to be written in C/C++ (and even then I'm not 100% sure that's what they use).
I wonder if it's because we're sometimes talking at cross purposes.
For me, coding is almost exclusively using Python libraries like numpy to call out to other languages like C or Fortran. To me it feels silly to say I'm not coding in Python.
On the other hand, if you're writing those libraries, coding to you is mostly writing Fortran and C optimizations. It probably feels silly to say you're coding in Python just because that's where your code is called from.
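To make that concrete, here's roughly what "coding in Python" looks like in that mode (a toy illustration; almost none of the time goes to Python bytecode, it's spent inside the BLAS/LAPACK routines numpy calls out to):

    import numpy as np

    a = np.random.rand(1000, 1000)
    b = np.random.rand(1000, 1000)

    c = a @ b                   # matrix multiply: dispatched to a compiled BLAS routine
    w = np.linalg.eigvals(a)    # eigenvalues: LAPACK does the heavy lifting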
There is a version of BASIC, a QuickBASIC clone called QB64, that is lightning fast because it transpiles to C++. By your reasoning, a programmer should think that BASIC is fast because he only does BASIC and does not care about the environment details?
It's actually the opposite: a Python programmer should know how to offload most of the work out of Python into C, or use the libraries that do so. He should not be oblivious to the fact that any decent Python performance comes from shrinking the ratio of actual Python instructions to native instructions.
I think maybe it's just semantics as long as everyone agrees where the speedup is happening (at the low level language calls).
I noticed that you're pretty hard in the "BASIC isn't fast, the thing it transpiles to is fast" camp, but still accidentally said "there is a version of BASIC [...] that is lightning fast", which I'm not sure you actually think? It highlights just how tricky it is to talk about where speed lives.
There is a clear distinction between the original language design (an interpreter) and a project aiming to recreate a sub-standard of that language and support its legacy codebase via a transpiler.
But you will care if that "Python" breaks - you get to drop down to C/C++ and debug native code. Likewise for adding features or understanding the implementation. Not to mention having to deal with native build tooling and platform-specific stuff.
It's completely fair to say that's not python because it isn't - any language out there can FFI to C and it has the same problems mentioned above.
It's pretty hard to draw this line in Python because all built-in types and functions are effectively C extensions, just compiled directly into the interpreter.
Conversely, you can have pure C code just using PyObjects (this is effectively what Cython does), with the Python bytecode interpreter completely out of the picture. But the perf improvement is nowhere near what people naively expect from compiled code, usually.
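You can see the first point directly from the interpreter (a small illustration; the exact output wording varies by CPython version):

    import inspect

    # len is implemented in C and compiled into the interpreter, so there is
    # no Python source behind it.
    print(type(len))               # <class 'builtin_function_or_method'>
    print(inspect.isbuiltin(len))  # True
    try:
        inspect.getsource(len)
    except TypeError:
        print("no Python source here: len lives in C")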
Yes, which is why I would argue that IO is a particularly bad benchmark here, since everything is just a thin layer on top of the actual syscall, and those layers don't do any real work worth comparing.
The only thing that makes sense to compare when talking about Python's performance is how many instructions it needs to compute something, versus the instructions needed to compute the same thing in C. Those are probably a few orders of magnitude apart.
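You can get a feel for that ratio with the dis module: even a one-liner expands into several bytecode operations, each of which goes through the interpreter loop and dynamic dispatch, where C would emit roughly a single add (a rough illustration, not an exact count):

    import dis

    def add(a, b):
        return a + b

    # Prints a handful of ops (LOAD_FAST, BINARY_ADD/BINARY_OP, RETURN_VALUE, ...),
    # each of them far more expensive than a native add instruction.
    dis.dis(add)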
Usually, yes, but when it's a bug in the hardware, it's not really that Python is fast, more like that CPython developers were lucky enough to not have the bug.
The PyObject header is a target for optimisation. Performance regressions are likely to be noticed, and if a different header layout is faster, then it's entirely possible that it will be used for purely empirical reasons. Trying different options and picking the best performing one is not luck, even if you can't explain why it's the best performing.
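For anyone wondering what that header amounts to in practice, it's easy to poke at (typical 64-bit CPython figures; exact numbers vary by version and build):

    import sys

    # Every CPython object starts with the PyObject header (refcount + type
    # pointer); richer objects add their payload on top of it.
    print(sys.getsizeof(object()))  # bare object: just the header, typically 16 bytes
    print(sys.getsizeof(1))         # small int: header plus the integer payload
    print(sys.getsizeof([]))        # empty list: header plus list bookkeeping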
You can expect the Python developers to look very closely at any benchmark that significantly benefits from adding random padding to the object header. Performance isn’t just trying a bunch of random things and picking whatever works the best, it’s critical to understand why so you know that the improvement is not a fluke. Especially since it is very easy to introduce bias and significantly perturb the results if you don’t understand what’s going on.
We're not talking about random changes. We're talking about paying attention to the measured performance of changes made for other reasons.
Just like in this article. The author measured, wondered, investigated, experimented, and finally, after a lot of hard work, made the C/Rust programs faster. You wouldn't call that luck, would you? If there had been a similar performance regression in CPython, then a benchmark could have picked up on it, and the CPython developers would then have done the same.
You can look at the history of PyObject yourself: https://github.com/python/cpython/commits/main/Include/objec.... None of these changes were done because of weird CPU errata that meant that making the header bigger was a performance win. That isn't to say that the developers wouldn't be interested in such effects, or be able to detect them, but the fact that the object header happens to be large enough to avoid the performance bug isn't because of careful testing but because that's what they ended up with for other reasons, far before Zen 3 was ever released. If it so happened that Python was affected because the offset needed to avoid a penalty was 0x50 or something then I am sure they would take it up with AMD rather than being content to increase the size of their header for no reason.
What you don't see in the logs are the experiments and branches that weren't pursued further because they didn't perform well enough.
Also: If you're going to prove that changes informed by performance measurements are absent from the commit logs, then you'll need to look in the logs for all the relevant places, which means also looking at I/O and bytes and allocator code.
Given that the performance is only affected by the size of that object header, the file I linked is all you'd need to see changes in. Look, the Python project is not picking their object sizes because it performs well on a quirk of Zen 3. End of story. I did performance work professionally in the past and now do it recreationally, and this specific instance is 100% luck. This is not because I think the runtime people aren't smart or anything, but this would be an insane thing to do on purpose.
I think the confusion comes from people not having a good understanding of what an interpreted programming language does, and what actual portion of time is spent in high versus low level code. I've always assumed that most of my programs amount to a bit of glue thrown in between system calls.
Also, when we talk about "faster" and "slower," it's often not clear what order of magnitude we're talking about.
Maybe an analysis of actual code execution would shed more light than a simplistic explanation that the Python interpreter is written in C. I don't think the BASIC interpreter in my first computer was written in BASIC.
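A low-effort way to do that kind of analysis on your own code (a sketch; swap the toy function for your actual entry point):

    import cProfile
    import pstats

    def work():
        # toy stand-in for a real program: mostly glue around C-level built-ins
        data = [str(i) for i in range(100_000)]
        return len(",".join(data))

    cProfile.runctx("work()", globals(), locals(), "profile.out")
    # The top entries show how much time lands in C-implemented built-ins and
    # syscalls versus pure-Python frames.
    pstats.Stats("profile.out").sort_stats("cumulative").print_stats(5)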
Agreed. The speed of a language is inversely proportional to the number of CPU instructions emitted to do something meaningful, e.g. solve a problem. Not whether it can target system calls without overhead and move memory around freely. That's a given.
>I don't understand why Python gets shit for being a slow language when it's slow but no credit for being fast when it's fast just because "it's not really Python".
What's there to understand? When it's fast it's not really Python, it's C. C is fast. Python can call out to C. You don't have to care that the implementation is in another language, but it is.
I constantly get low-key shade for choosing to build everything in Python. It’s really interesting to me. People can’t break out of thinking, “oh, you wrote a script for that?”. Actually, no, it’s software, not a script.
99% of my use cases are easily, maintainably solved with good, modern Python. The Python execution is almost never the bottleneck in my workflows. It’s disk or network I/O.
I’m not against building better languages and ecosystems, and compiled languages are clearly appropriate/required in many workflows, but the language parochialism gets old. I just want to build shit that works and get stuff done.
I keep queries in .sql files in a git repo. I run longer queries by writing them in a file and including/running it with \i. There's also \e to open $PSQL_EDITOR.
I find it funny how the Bryan Johnsons of the world take like 123 pills every day and optimize all the fun out of life, and we still don't really know if it works or not.
And then old geezers like Munger and Buffett do whatever the hell they want and outlive everybody.
Jack LaLanne died at 96 years old, yet my grandpa who's eaten tons of Taco Bell for most of his life is alive at just a hair from that age. I wonder what people will think about the obsession with "longevity" in the likely outcome of David Sinclair or Andrew Huberman dying in their 80s or earlier.
There's a tribe in Ecuador with reduced height, studied by Valter Longo and others. They have a mutation in their growth hormone receptor, which means they experience less effect from human growth hormone, hence the reduced height. And their lifestyle is rather unhealthy: alcohol, smoking, sugar, junk food, obesity, etc. Yet they rarely experience diabetes, cancer, etc., probably due to reduced mTOR pathway activation.
In the context of this specific post regarding Charlie Munger, you can't say that it's genetics unless you measure specific genes. He could be 100x Bryan Johnson, but Bryan Johnson at least makes his protocols open for use. And Munger, for all his curiosity, didn't even bother to get a genetic test, thus providing no essential value to human civilization.
Genetics. They matter a ton. Read the bios of really old people (100+ years old) and nothing really stands out beyond having a lot of family members who also lived a long time. I think a low-stress lifestyle helps a lot too.
Commenting only on their professional lives: their investment strategy was generally not a high-stress one, and they clearly enjoyed their work, so one could assume their professional lives were generally low stress.
They lived frugal lives, with everything they needed covered for probably 10,000 years ahead. Large families, plenty of friends, and a lot of wisdom. Yeah, I think Charlie and Warren's lives have way less stress than most people's.
This is a fair point. If you exclude the early part of his life, working as an asset manager is easy work. Most jobs are much more stressful. Basically, you sit around and read annual/quarterly reports and try to find the next company to buy. To be clear, this type of asset management is similar to private equity. They are buying whole companies. This business is much lower volatility than buying/selling stocks for fund management. Also: Fewer transactions. So I would say the 2nd half of his life was low stress (outsider's view, of course).
I genuinely think that this whole argument is a waste of time.
What matters is whether the outputs are useful and the outputs don't change based on whether you call it "thought", "AGI" or "probabilistic word selection".
Trees have been given rights in some places. Some people believe dogs and cats have rights.
Humans have not been given rights in some places. Some people believe some humans don't have rights.
It's not about "sentience" or "consciousness" -- in reality, these concepts are religious ones, like the soul, and don't map to anything objectively meaningful.
A better way to think about rights, I think, is that they stem from the barrel of a gun. Can a thing reciprocate a social contract with me usefully? Can it help me if I'm nice to it? Can it hurt me if I'm mean to it? Alternatively, will entities that can help me do so if they see me help this thing or harm me if they see me harm this thing? That's all that matters; I'll be game-theoretically forced to grant it personhood. I'm programmed by my brain to show empathy to you, and, when that fails, you or others will harm me if I hurt you. Or, if I become a powerful dictator, I might execute you for being inconvenient to me. For all I know, you're a p-zombie that I could otherwise hurt with a clean conscience. None of that really matters; it's all about what I'm forced to do, from within or without.
Your ethics are leading you to believe this. For other people it doesn’t make sense at all that a computer program should have rights. It makes even less sense to people who know what a computer program is.
We don't give single cells rights, but we do give the complex organization of cells rights. Why should binary code never be given the same consideration as dna code? What defining factor is at play here?
To think that it is "just a program" could be like saying we are just machines without self-determination. This view may well come to be seen as "racist" or "bigoted".
Individual ethics will determine the societal ethics that get codified into law. I have a hard time seeing how giving rights to intelligent enough machines won't happen, based on our existing ethics, laws, and the history therein.
> It makes even less sense to people who know what a computer program is.
I write code professionally and my beliefs are not what you claim them to be. Perhaps your opinion is the minority opinion? You should certainly not be claiming it as the de facto belief among programmers.
Severance (the TV show) is a pretty entertaining exploration of this issue.
Still... maybe it's not a good analogy. LLMs are infinitely replicable and editable. The "conscious experience," if you will, is discontinuous even if you assume the architecture will advance massively. We definitely don't need to be talking about rights yet.
Rights are part of ethics, which we most certainly need to be talking about.
As I said, what we have today is not deserving of such considerations imho, but I do expect to see someone trying to marry an AI before I die, so this will become an issue not too far off.
(in fact, someone already married an AI in Japan, and then the company that ran it closed, iirc)
Seems like a very bad reason to switch. Data engineering is different (and much worse than SWE in my opinion), and it's not like you're certain that you can avoid LC interviews if you try to switch.