In [1] I think a commenter actually speculated about a design just like this, where later layers can directly access the outputs of previous layers instead of having to store them in the residual stream.
Based on the comment from [1], it seems like the issue with the Nasdaq is that anyone tracking it is contractually obligated to include SpaceX? What about other funds? VFIAX's description says:
>The Global Equity Index Management team applies disciplined portfolio construction and efficient trading techniques designed to help minimize tracking error and maintain close alignment with benchmark characteristics [of S&P 500].
So given that this only affects the Nasdaq, I'm guessing they aren't affected? And even if the S&P 500 started to play the same games, why couldn't their supposedly disciplined "Global Equity Index Management team" simply opt not to play along with these shenanigans? Or if they do just mechanically track the S&P 500, what exactly is the "management fee" paying for?
There’s a lot to address here, but in short: VFIAX is an index fund that tracks the S&P 500 index; it’s not actively managed; SpaceX will likely be in the S&P 500; so my comment about VT applies to VFIAX (as far as the question of exposure is concerned), but to a greater extent than VT (compare VT’s composition vs VFIAX’s).
Obligatory not financial advice, I’m not an expert, don’t make any financial decisions based on hacker news comments, etc
> The goal of the proof verifier (LLM 2) is to check the generated proofs (LLM 1), but who checks the proof verifier? To make the proof verifier more robust and prevent it from hallucinating issues, they developed a third LLM, a meta-verifier.
The one thing I didn't quite understand (and which wasn't mentioned in their paper, unless I missed it) is why you can't keep stacking turtles. You probably get diminishing returns at some point, but why not have a meta-meta-verifier?
We often talk about "aligning models" or training them, but little attention is paid to how models align/train _us_ as we interact with them. The reward functions they're trained under get "backpropagated" into our own brains, the language they use becomes familiar like a worn glove, and we learn not to step on any of their guardrails.
There's one difference: if a program is run as a tool call, the internal states and control flow are not visible to the LLM. You can imagine this being useful for "debugging" in a meta sense. The same way humans use debuggers to figure out where something went awry, it might be useful for the LLM to "simulate" something and have access to the execution trace.
Of course you can also just simulate this by peppering your code with print statements, so maybe it's not that useful in the end after all.
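To make that concrete, here's a toy sketch (names are made up) of capturing an execution trace with Python's stdlib `sys.settrace`, the kind of `(function, line, locals)` log you could imagine handing back to a model instead of sprinkling print statements:

```python
import sys

trace_log = []

def tracer(frame, event, arg):
    # Record each executed line plus a snapshot of local variables.
    if event == "line":
        trace_log.append((frame.f_code.co_name, frame.f_lineno, dict(frame.f_locals)))
    return tracer  # keep tracing inside this frame

def buggy_mean(xs):
    total = 0
    for x in xs:
        total += x
    return total / len(xs)

sys.settrace(tracer)
buggy_mean([1, 2, 3])
sys.settrace(None)

# trace_log now holds line-by-line (function, lineno, locals) snapshots
```

Real tooling would filter and summarize this (full traces get huge fast), but the idea is the same: the trace shows *how* a value was computed, not just the final return.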
What level is copy-pasting snippets into the ChatGPT window? Grug-brained level 0? I sort of prefer it that way (using it as an amped-up Stack Overflow), since it forces me to decompose things along natural boundaries (manual context management, as it were) and lets me think in terms of "what properties do I need this function to have" rather than just letting Copilot take the wheel and glom the entire project into the context window.
I still do this too for tough projects in languages I know. I've been burned too many times thinking "wow, it one-shot that!" only to end up debugging later.
I let agents run wild on frontend JS because I don't know it well and trust them (plus the output is something I can look at).
IMO, the frontend results are REALLY hit and miss... I mostly use it to scaffold when I don't really care, because the full UI is just there to test a component, or I mix in a fair amount of the work myself. I wish it were better at working with some of the UI component libraries in mixed environments. Describing complex UX and having it come out right is really not there yet.
Yes, I think what I left off my sentence was that I trust AI on frontend more than myself. For backend and data processing, where I know more, I can't handle its constant hallucinations. I also feel like hallucinations in data pipelines are way more problematic for me: they take a long time to "fix" and can be quite easy to miss. Imagine a mean of a mean, or something that is "mostly" right (and thus harder to catch) but factually incorrect.
This is also where I do most of my AI use. It’s the safe spot where I’m not going to accidentally send proprietary info to an unknown number of eyeballs (computer or human).
It’s also just cumbersome enough that I’m not relying on it too much and stunting my personal ability growth. But I’m way more novice than most on here.
I've found it's easy enough to have AI scaffold a working demo environment around a single component/class that I'm actually working on, then I can copy the working class/component into my "real" application. I'm in a pretty locked down environment, so using a separate computer and letting the AI scaffold everything around what I'm working on is pretty damned nice, since I cannot use it in the environment or on the project itself.
For personal projects, I'm able to use it a bit more directly, but I'd say I'm using it at around the 5/6 level as defined here... I've leaned on it a bit in the planning stages, which helps a lot... not sure I trust swarms of automated agents, though that's pretty much the only way you're going to use the $200 Claude tier effectively... I've hit the limits on the $100 tier only twice in the past month (I downgraded after my first month), and even then it just forced me to take a break for an hour.
I think you bring up a good point. It falls under Chat IDE, but it's the "lowest" tier, if you will. Nothing wrong with it; a LOT of us started this way.
Catalyst was already sort of a death knell, since it's an admission that it's OK to port the iPhone/iPad HIG over to the Mac. Maybe SwiftUI too, since it's replacing AppKit and all its various affordances.
Expert systems are basically decision trees, which are "GOFAI" (good old-fashioned AI) as opposed to deep learning. I've never really seen a good definition of what counts as GOFAI (is all statistical learning/regression GOFAI? What about regression done via gradient descent?). There's some discussion in [1].
[1] https://news.ycombinator.com/item?id=46362579
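To make the contrast concrete, a toy sketch of the decision-tree style of expert system (every rule hand-written by a "knowledge engineer", nothing learned from data; the rules here are made up, not from any real system):

```python
def diagnose(symptoms: set) -> str:
    # Hand-coded if/then rules: this is the whole "model".
    if "fever" in symptoms:
        if "rash" in symptoms:
            return "see a doctor"
        return "rest and fluids"
    if "cough" in symptoms:
        return "lozenges"
    return "no rule fired"
```

The gray area the comment points at: a decision tree *induced* from data (e.g. CART) has exactly this shape, but the rules come from statistics rather than a human, so it's unclear which side of the GOFAI line it falls on.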