
Maybe look into implicit in-order forests (https://thume.ca/2021/03/14/iforests/)


Figure 3 on p.40 of the paper seems to show that their LLM-based model does not statistically significantly outperform a 3-layer neural network using 59 variables from 1989.

  This figure compares the prediction performance of GPT and quantitative models based on machine learning. Stepwise Logistic follows Ou and Penman (1989)’s structure with their 59 financial predictors. ANN is a three-layer artificial neural network model using the same set of variables as in Ou and Penman (1989). GPT (with CoT) provides the model with financial statement information and detailed chain-of-thought prompts. We report average accuracy (the percentage of correct predictions out of total predictions) for each method (left) and F1 score (right). We obtain bootstrapped standard errors by randomly sampling 1,000 observations 1,000 times and include 95% confidence intervals.
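For anyone curious what that bootstrap procedure looks like in practice, here is a minimal sketch; the y_true/y_pred arrays are placeholders, not the paper's data:

  # Minimal sketch of the bootstrap described above: resample 1,000
  # observations 1,000 times and take a 95% confidence interval for accuracy.
  # y_true / y_pred are placeholder arrays, not the paper's data.
  import numpy as np

  rng = np.random.default_rng(0)
  y_true = rng.integers(0, 2, size=5000)
  y_pred = rng.integers(0, 2, size=5000)

  accs = []
  for _ in range(1000):
      idx = rng.integers(0, len(y_true), size=1000)   # sample 1,000 observations
      accs.append((y_true[idx] == y_pred[idx]).mean())

  lo, hi = np.percentile(accs, [2.5, 97.5])
  print(f"bootstrapped SE: {np.std(accs):.4f}, 95% CI: [{lo:.3f}, {hi:.3f}]")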


Not to mention, as somebody who works in quant trading doing ML all day on this kind of data: that ANN benchmark is nowhere near state of the art.

People didn't stop working on this in 1989 - they realised they can make lots of money doing it and do it privately.


I never traded consistently and successfully, but I did do a startup with a seasoned quant trader with the ambition of using bigger models to generate novel alpha. We mopped the floor with the academics who publish, but that is whiffle ball compared to a real prop outfit that lasts.

Not having made it big myself I obviously don’t know the meta these days, but last I had any inside baseball, the non-stationarity and friction just kill you on trying to get fancy as opposed to just nailing it on the fundamentals.

Extreme execution quality is a game: people make money in both traditional liquidity provision and agency execution by being fast as hell and managing risk well.

Signals that are individually somewhat mundane but composed well via straightforward linear-ish regressions are another game: people get (ever-decaying) alpha out of bright ideas (and rotate new signals in).

And I’m sure that LLMs have started playing a role; there’s a legitimate capability increase in spite of the dubious production-worthiness.

But as a blind wager, I bet prop trading is about what it was 5 years ago on better gear: elite execution (no pun intended) on known-good ways to generate alpha.


I think I understood 7 words you just said mister


I think what he is saying is:

1. Your automated system should be as fast as possible.

2. Stick with known, basic fundamental strategies.

3. Try new ideas around how to give those same strategies more predictive power (signal).

#1 is straight technical execution.

#3 is constantly evolving.

Is how I understood this.

And, as sort of an afterthought, I guess the better you are at #1 the less good you need to be at #3, and the worse you are at #1 the better you need to be at #3?


>People didn't stop working on this in 1989 - they realised they can make lots of money doing it and do it privately.

Mind elaborating?


Speaking for myself and likely others with similar motivations, yes we can "figure it out" and publish something to show our work and expand the field of endeavor with our findings - OR - we can figure something profitable out on our own and use our own funds to trade our strategies with our own accounts.

Anyone who has figured out something relatively profitable isn't telling anyone how they did it.


> Anyone who has figured out something relatively profitable isn't telling anyone how they did it.

Corollary: someone who is selling you tools or strategies on how to make tons and tons of money is probably not making tons and tons of money employing said tools and strategies, but instead making their money by having you buy their advice.


I think I could probably make more money selling a tool or strategy that consistently, reliably makes ~2% more than government bonds than I could make off it myself, with my current capital.


You can't do it because there are lots of fraudulent operators in the space. Think about it: someone comes up to you offering a way to give you risk-free return. All your Ponzi flags go up. It's a market for lemons. If you had this, the only way to make it is to raise money some other way and then redirect it (illegal, but you'll most likely get away with it), or to slowly work your way up the ranks proving yourself till you get to a PM and then have him work your strat for you.

The fact that you can't reveal how means you can't prove you're not Ponzi. If you reveal how, they don't need you.


It's been done again and again by funds. A new fund or fund company convinces someone with some name and reputation to come work for them, and they become the name and reputation that gives some credibility and sells the new fund. Clients start by putting in just a little money and more later. Nobody knows the hard details outside of the new firm. Sometimes it goes nowhere and nobody hears of the thing again; sometimes it works out and they make sure nobody remains unaware.


> The fact that you can't reveal how means you can't prove you're not Ponzi. If you reveal how, they don't need you.

This is why I am wary of all those 10+ minute YT vids telling you how you can make significant amounts of money quickly and reliably with very limited capital.


Seems like the money here would be in building a shiny, public-facing version of the tool behind a robust paywall and building a relationship with a few Broker Dealer firms who can make this product available to the Financial Advisors in their network.

If you were running this yourself with $1M input capital, that'd be $20k/year per $1M of input - so $20k is a nice number to try and beat selling a product that promulgates a strategy.

But you're going to run into the question from people using the product: "Yeah - but HOW DOES IT WORK??!!!" And once you tell them, does your ability to get paid disappear? Do they simply re-package your strategy as their own and cease to pay you (and, worse, start charging for your work)? Is your strategy so complicated that the value of the tool itself doing the heavy lifting makes it sticky?

Getting people to put their money into some Black Box kind of strategy would probably be challenging - but I've never tried it - it may be easier than giving away free beer for all I know. Sounds like a fun MVP effort really. Give it a try - who knows what might happen.


As far as I know, the more people use the strategy the worse it performs; the market is not static, it adapts. Other people react to the buy/sell of your strategy and try to exploit the new pattern.


This is an interesting observation in combination with the popular pension strategy to continually buy index funds regardless of performance.


The average return from index funds is the benchmark that all those others are trying to beat, but all the competitors trying to beat the average have a tendency to push successful strategies towards the average.


It only works until most people do it.


If you can prove it works you won't have any difficulty raising capital.


Or just sell it to exactly one buyer with a lot of capital to invest.


That hypothetical person or organization already has an advisor in charge of their money at the smaller end, or an entire private RIA on the Family Office side of things. This approach is a fool's errand.


Lol try it and get back to us.


Well, see, I don't actually have a method for that. But if I did, I think my capital is low enough that I'd have more success selling it to other people than trying to exploit it myself, since the benefit would be pretty minimal if I did it with just my own savings, but could be pretty dramatic for, say, banks.


Strats tend to have limits. What works for you may fall apart with large amounts of capital. Don't discount compound interest. $10,000 compounding 30% over 20 years is 2 million without any additional capital.
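(The arithmetic roughly checks out, for what it's worth:)

  # 30% compounded annually for 20 years on $10,000 (illustrative check)
  print(10_000 * 1.3 ** 20)   # ~1,900,496 -> roughly $2M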


Absolutely correct - and moreover, when you do sit someone down (in my case, someone with a "superior education" in finance compared to my CS degree) and explain things to them, they simply don't understand it at all and assume you're crazy because you're not doing what they were taught in Biz School.


Why not then publish the strategies once outmoded, or are they in fact published? Can I go see somewhere what strategies big funds used in the 90s to make bank, which presumably no longer offer a competitive advantage? The way I can go see what computer exploits/hacks used to work when they were still secret?

Maybe it's just what I know, but I can't help but think the "strategies" are a lot like security exploits--some cleverness, some technical facility, but mainly the result of staring at the system for a really long time and stumbling on things.


> Why not then publish the strategies once outmoded

Because then your competition knows which strategies don't work, and also what types of strategies you work on.

Don't leak information.


Why not? Because you won't know which of your strategies has been outmoded by something new, since that group isn't publishing their strategy - the one like yours but on steroids - either.

And then everything regresses to the Dark Forest game theory.


Wouldn't publishing also influence the performance itself because it would also make an impact on the data? And if you'd calculate that in and the method is spreading, wouldn't that in turn have to be calculated in also, which would lead to a spiral?


At a SciPy meeting where someone in finance was presenting an intro on some tools, someone asked if they ever contribute code to those open source projects. Their answer was "Yes, but only after we've stopped making money with them."


Seems simple: Why share your effective strategies in an industry full of competition and those striving to gain a competitive edge?


> Mind elaborating?

I am assuming he/she minds a lot.


Do you use llama 3 for your work?


No hedge fund registered before the last 2 weeks will use Llama3 for their "prod work" beyond "experiments".

Quant trading is about "going fast" or "being super right", so either you'd need to be sitting on some huge llama.cpp/transformer improvement (possible but unlikely) or it's more likely just some boring math applied faster than others.

Even if they are using a "LLM", they wont tell you or even hint at it - "efficient market" n all that.

Remember, all quants need to be "the smartest in the world" or their whole industry falls apart; wait till you find out it's all "high school math" based on algos largely derived 30/40 years ago (okay, not as true for "quants", but most "trading" isn't as complex as they'd like you/us to believe).


Well, I work in prop trading and have only ever worked for prop firms - our firm trades its own capital and distributes it to the owners and us under profit share agreements - so we have no incentive to sell ourselves as any smarter than the reality.

Saying it's all high school math is a bit of a loaded phrase. "High school math" incorporates basically all practical computer science and machine learning and statistics.

I suspect you could probably build a particle accelerator without using more math than a bit of calculus - that doesn't make it easy or simple to build one.

Very few people I've worked with have ever said they are doing cutting edge math - it's more like scientific research. The space of ideas is huge, and the ways to ruin yourself innumerable. It's more about people who have a scientific mindset who can make progress in a very high noise and adaptive environment.

It's probably more about avoiding blunders than it is having some genius paradigm shifting idea.


Would you ever go off on your own to trade solo or is that something that just does not work without a ton (like 9 figures) of capital and a pretty large team?


Going solo in trading is a very different beast compared to trading at a prop firm. Yes, capital is a significant factor. The more you have, the more you can diversify and absorb losses which are inevitable in trading. However, it's not just about the capital. The infrastructure, data access, and risk management systems at a prop firm are usually far superior to what you could afford or build on your own as an individual trader.

Moreover, the value of the collaborative environment at a prop firm can't be overstated. Ideas and strategies are continuously debated, tested, and refined. This collective brainpower often leads to more robust strategies than what you might come up with on your own.

That said, there are successful solo traders, but they often specialize in niche markets where they can leverage unique insights or strategies that aren't as capital intensive. It's definitely not for everyone and comes with its own set of challenges and risks.


It's like any other business, there are factors of production that various actors will have varying access to, at varying costs.

A car designer still needs a car factory of some sort, and there's a negotiation there about how the winnings are divided.

In the trading world there are a variety of strategies. Something very infra dependent is not going to be easy to move to a new shop. But there are shops that will do a deal with you depending on what knowledge you are bringing, what infra they have, what your funding needs are, what data you need, and so on.


> It's probably more about avoiding blunders than it is having some genius paradigm shifting idea.

I too believe this is key towards successful trading. Put in other words, even with an exceptionally successful algorithm, you still need a really good system for managing capital.

In this line of business, your capital is the raw material. You cannot operate without money. A highly leveraged setup can get completely wiped out during massive swings - triggering margin calls and automatic liquidation of positions at the worst possible price (maximizing your loss). Just ask ex-billionaire investor/trader Bill Hwang[1].

1. https://www.bloomberg.com/news/features/2021-04-08/how-bill-...


Here is a simple way to think about it. The markets follow a random walk and there is a 50/50 chance of being right or wrong. If you can make more when you are right, and lose less when you are wrong, you are on your way to being profitable.
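As a toy illustration of that asymmetry (the numbers are made up):

  # Toy expectancy for the asymmetric payoff described above: a 50/50 coin,
  # but wins are bigger than losses. Numbers are made up for illustration.
  p_win, avg_win, avg_loss = 0.5, 1.5, 1.0
  expectancy = p_win * avg_win - (1 - p_win) * avg_loss
  print(expectancy)   # 0.25 per unit risked, before costs and slippage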


>Saying it's all high school math is a bit of a loaded phrase. "High school math" incorporates basically all practical computer science and machine learning and statistics.

I'm responding to the comment "do you use llama 3", not "break down your strat".

> Very few people I've worked with have ever said they are doing cutting edge math - it's more like scientific research . The space of ideas is huge, and the ways to ruin yourself innumerable. It's more about people who have a scientific mindset who can make progress in a very high noise and adaptive environment.

This statement is largely true of any "edge research"; as I watch the loss totals flow by on my 3rd monitor, I can think of 30 different avenues of exploration (none of which are related to finance).

Trading is largely high school math, on top of very complex code, infrastructure, and optimizations.


Do you work for rentech?


> but most "trading" isn't as complex as they'd like you/us to believe

I know nothing about this world, but with things like "doctor rediscovers integration" I can't help but wonder if it's not deception but ignorance - that they think that really is where math complexity tops out.


Drs rediscover integration is about people stepping far outside their field of expertise.

It is neither deception nor ignorance.

It's the same reason some of the best physics students get PhD studentships where they are basically doing linear regression on some data.

Being very good at most disciplines is about having the fundamentals absolutely nailed.

In chess for example, you will probably need to get to a reasonably high level before you will be sure to see players not making obvious blunders.

Why do tech firms want developers who can write bubble sort backward in assembly when they'll never do anything that fundamental in their career? Because to get to that level you have to (usually) build solid mastery of the stuff you will use.

Trading is truly a complex endeavour - anybody who says it isn't has never tried to do it from scratch.

I'd say the industry average for somebody moving to a new firm and trying to replicate what they did at their old firm is about 5%.

I'm not sure what you'd call a problem where somebody has seen an existing solution, worked for years on it and in the general domain, and still would only have a 5% chance of reproducing that solution.


> Drs rediscover integration is about people stepping far outside their field of expertise.

> It is neither deception nor ignorance.

How is it not ignorance of math?


> Being very good at most disciplines is about having the fundamentals absolutely nailed.

> In chess for example, you will probably need to get to a reasonably high level before you will be sure to see players not making obvious blunders.

To extend the chess analogy, having the fundamentals absolutely nailed is critical at even a mid-level, because the payoff/effort ratio in avoiding blunders/mistakes is much higher than innovating or being creative.

The process of getting to a higher level involves rote learning of common tactics so you can instantly recognize opportunities, and then eventually learning deep into "opening theory" which is memorizing 10 starting moves + their replies because people much better than you have written lengthy books on the long-term ramifications of making certain moves. You're learning a vast repertoire of "existing solutions" so you can reproduce them on-demand, because those solutions are battle-tested to not have weaknesses.

Chess is a game where the amount you have to lose by being wrong is much higher than what you gain by being right. Fields where this is the case want to ensure to a greater extent that people focus on the fundamentals before they start coming up with new ideas.


write bubble sort backward in assembly

you mean backporting a high-level implementation to assembly? Or is writing code "backward" some crazy challenge interviewees have to do now?


Spell the assembly backwards out loud with no prior notes while juggling knives (shows boldness in the way you approach problems!) and standing on a gymnastics ball (shows flexibility and well-roundedness)...


> Id say the industry average for somebody moving to a new firm and trying to replicate what they did at their old firm is about 5%.

Because 95% of experienced candidates in trading were fired or are trying to scam their next employer.

“Oh, yeah, my <insert HFT pipeline or statarb model> can do sharpe <random int 1 to 10> for <random int 10 to 100> million pnl per year. Trust me bro”. Fucking annoying


Obviously not true. The deals for most of these setups are that team founders/PMs are paid mostly by profit share. So the only scam is scamming yourself into a low salary position for a couple years till they fire you.

Orders of magnitude more leave their jobs of their own choosing than are fired.


> The deals for most of these set ups are team founders/pms are paid mostly by profit share.

These PMs are not the ones job hopping every year.

And 95% of interview candidates are not PMs.

> So the only scam is scamming yourself into a low salary position for a couple years till they fire you.

200k-300k USD salary is not low.

And 1 year garden leave / non compete? That’s literally 0.5M over 2 years for doing jack shit.

This is very appealing for tech SWEs or MBA product managers who are all talk and no walk.

But even with profit share / pnl cut, many firms pay you a salary, even before you turn a profit. It eventually gets deducted when you turn a profit.

> Orders of magnitude more leave their jobs of their choosing than are fired.

Hedge fund, maybe. Prop trading, no.


They hire people who know that maths doesn't "top out here", so they can point to them and say "look at the mathematicians/physicists/engineers/PhDs we employ - your $20Bn is safe here". Hedge funds aren't run by idiots, just a different kind of "smart" to an engineer.

The engineers are incredibly smart people, and so the bots are "incredibly smart", but "finance" is criticised by "true academics" because finance is where brains go to die.

To use popular science: "the three body problem" is much harder than "arb trade $10M profitably for a nice life in NYC", you just get paid less for solving the former.


It is just a different (applied) discipline.

It's like math v engineering - you can come up with some beautiful PDE theory to describe how this column in a building will bend under dynamic load, and use it to figure out the proportions exactly.

But engineering is about figuring out "just make its ratio of width to height greater than x"

Because the goal is different - it's not about coming up with the most pleasing description or finding the most accurate model of something. It's about making stuff in the real world in a practical, reliable way.

The three body problem is also harder than running experiments in the LHC or analysing Hubble data or treating sick kids or building roads or running a business.

Anybody who says that finance is where brains go to die might do well to look in the mirror at their own brain. There are difficult challenges for smart people in basically every industry - anybody suggesting that people not working in academia are in some way stupider should probably reconsider the quality of their own brain.

There are many many reasons to dislike finance. That it is somehow pedestrian or for the less clever people is not true. Nobody who espouses the points you've made has ever put their money where their mouth is. Why not start a firm, make a billion dollars a year because you're so smart, and fund fusion research with it? Because it's obviously way more difficult than they make out.


> The three body problem is also harder than running experiments in the LHC or analysing Hubble data or treating sick kids or building roads or running a business

Not that it's particularly relevant to this discussion, but the three body problem is easy. You can solve it numerically on a laptop with insane precision (much more precisely than would be useful for anything) or also write down an analytic solution (which is ugly and useless because it converges extremely slowly, but still. See wikipedia.org/wiki/Three-body_problem).
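For what it's worth, here is a minimal numerical sketch of that claim; the units, masses, approximate figure-eight initial conditions and use of scipy are my own illustrative assumptions:

  # Minimal sketch of integrating the planar three-body problem numerically,
  # as described above. Units, masses and initial conditions are illustrative.
  import numpy as np
  from scipy.integrate import solve_ivp

  G = 1.0
  m = np.array([1.0, 1.0, 1.0])

  def deriv(t, y):
      # y = [positions (3x2, flattened), velocities (3x2, flattened)]
      pos, vel = y[:6].reshape(3, 2), y[6:].reshape(3, 2)
      acc = np.zeros_like(pos)
      for i in range(3):
          for j in range(3):
              if i != j:
                  r = pos[j] - pos[i]
                  acc[i] += G * m[j] * r / np.linalg.norm(r) ** 3
      return np.concatenate([vel.ravel(), acc.ravel()])

  # Approximate figure-eight initial conditions (illustration only)
  y0 = [-0.970, 0.243, 0.970, -0.243, 0.0, 0.0,
        0.466, 0.432, 0.466, 0.432, -0.932, -0.864]
  sol = solve_ivp(deriv, (0, 10), y0, rtol=1e-10, atol=1e-10)
  print(sol.y[:6, -1])   # final positions after 10 time units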


From your link:

> Unlike the two-body problem, the three-body problem has no general closed-form solution,[1] and it is impossible to write a standard equation that gives the exact movements of three bodies orbiting each other in space.

This seems like the opposite of your claim.


The crucial parts of that are "closed-form" and "standard". The analytic solution is "non-standard" because it involves the kind of power series that nobody knows or cares about (because they are only about 100 years old and have no real useful applications in engineering).

A similar claim is that roots of polynomials of degree 5 (and over) have no "general closed form solution" (with, as usual, the implicit qualification: "in terms of functions I'm currently comfortable with because I've seen them a lot"). That doesn't mean it's a difficult problem.

The two problems have in common that they are significantly harder than their smaller versions (two bodies, or degree 4). Historically, people spent a lot of time trying to find solutions for the larger problems in terms of the same functions that can be used to solve the smaller problems (conic sections, radicals). That turned out to not be possible. This is the historical origin of the meme "three body problem is unsolvable".


I'll probably go look this up, but do you mean functions of a higher type than normal powers, like e.g. tetration, or something more complicated (am I even on the right track?)


I mean functions defined by power series (just like sin(x) is defined in analysis courses). For the three body problem, see http://oro.open.ac.uk/22440/2/Sundman_final.pdf (Warning, pdf!). This is what Wikipedia cites when talking about the solution to the three body problem. The document gives a lot of historical context.

For polynomial roots, see wikipedia.org/wiki/Elliptic_function.


> ... suggesting that people not working in academia are in some way stupider ...

My interpretation of "finance is where brains go to die" is more along the lines of finance being less good for society at large compared to pure science. Like if someone invents something new and useful in a lab for their PhD, and then goes to find a job in finance. The brain died because it was onto something and then abandoned it to become a cog in the machine.


Claiming that being smart isn't required for trading is not the same as claiming that people doing trading aren't smart.

(Note that I personally have no opinion on this topic, as I'm not sufficiently informed to have one.)


I was specifically addressing the "being smart isn't necessary for trading" claim.

The OP is making some implication across numerous posts that it's all basically a big con and it's all very simple.

It is like claiming you don't need to be rocket scientist to go to the moon because they just use metal and screws.

The individual parts might be simple in isolation. But the complexity of conducting large scale, large scope research in an environment that gives you limited feedback and will adapt to your own behaviour changes - that is where the smarts are needed.

OP seems to not understand the inherent difficulty of doing any research.

Almost anybody could be taught to make a simple circuit and battery from some basic raw materials. The fact that it is simple and easy now that we know the answer does not mean it was simple or easy to discover. Some of the greatest minds dedicated their entire lives to discovering things that now most 10-year-olds understand. That doesn't imply you only need the intellect of a 10-year-old to make fundamental breakthroughs in science.

Working in quant trading is almost pure research - and so it requires a certain level of intellect - probably at least the intellect required to pursue a quantitative PhD successfully (not that they need the PhD but they need the capacity to be able to do one).


You misunderstand the quote. It’s where brains go to die from a societal perspective. It might be stimulating and difficult for the individual but it’s useless to science.


Many advancements in computer science have come from the finance world.

e.g. LMAX Disruptor was a pretty impressive concurrency library a decade ago:

https://lmax-exchange.github.io/disruptor/


Who is using it besides LMAX?


Please cite your references, lest you run afoul of the lulgodz:

https://diabetesjournals.org/care/article/17/2/152/17985/A-M...


It’s impressive how incorrect so much of this information is. High frequency trading is about going fast. There is a huge mid and low freq quant industry. Also most quant strategies are absolutely not about being “super right”…that would be the province of concentrated discretionary strategies. Quant is almost always about being slightly more right than wrong but at large scale.

What algos are you referring to derived 30 or 40 years ago? Do you understand the decay for a typical strategy? None of this makes any sense.


Quantitative trading is simply the act of trading on data, fast or slowly, but I'll grant you for the more sophisticated audience there is a nuance between "HFT" and "Quant" trading.

To be "super right" you just have to make money over a timeline, you set, according to your own models. If I choose a 5 year timeline for a portfolio, I just have to show my portfolio outperforming "your preferred index here" over that timeline - simple (kind of, I ignore other metrics than "make me money" here).

Which algos you use will depend on what you're trading, but the way to calculate the price of an Option/Derivative hasn't changed in my understanding for 20/30 years - how fast you can calculate, forecast, and trade on that information has.

My statement won't hold true in a conversation with an "investing legend", but to the audience who asks "do you use llama3" it's clearly an appropriate response.


I don't really understand your viewpoint - I assume you don't actually work in trading?

Aside from the "theoretical" developments the other comment mentioned, your implication that there is some fixed truth is not reflected in my career.

Anybody who has even a passing familiarity with doing quant research would understand that Black-Scholes and its descendants are very basic results about basic assumptions. It says that if the price is a certain type of random walk - and also, crucially, a martingale and Markov - then there is a closed form answer.
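(The closed form answer in question is the textbook Black-Scholes call price; a minimal sketch, not anything a desk would actually run:)

  # Textbook Black-Scholes price for a European call (no dividends).
  # Purely illustrative of the "closed form answer" above, not production code.
  from math import log, sqrt, exp
  from statistics import NormalDist

  def bs_call(S, K, T, r, sigma):
      N = NormalDist().cdf
      d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
      d2 = d1 - sigma * sqrt(T)
      return S * N(d1) - K * exp(-r * T) * N(d2)

  print(bs_call(S=100, K=105, T=0.5, r=0.03, sigma=0.2))   # ~4.2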

First and foremost, Black-Scholes is inconsistent with the market it tries to describe (vol smiles, anyone??), so anybody claiming it's how you should price options has never been anywhere near trading options in a way that doesn't shit money away.

In reality the assumptions don't hold - log returns aren't Gaussian, and the process is almost certainly neither Markov nor martingale.

The guys doing the very best option pricing are building empirical (so not theoretical) models that adjust for all sorts of stuff like temporary correlations that appear between assets, dynamics of how different instruments move together, autocorrelation in market behaviour, spikes and patterns of irregular events, and hundreds of other things.

I don't know of any firm anywhere that is trading profitably at scale and is using 20 year old or even purely theoretical models.

The entire industry moved away from the theory driven approach about 20 years ago for the simple reason that it is inferior in every way to the data driven approach that now dominates.


There's no way this person works as a quant. Almost every statement they've made is wrong...


> the way to calculate the price of an Option/Derivative hasn't changed in my understanding for 20/30 years

That’s not true. It is true that the Black-Scholes model was found in the 1970s, but since then you have:

- stochastic vol models

- jump diffusion

- local vol or Dupire models

- Levy processes

- binomial pricing models (see the sketch below)

all of which came well after the initial model was derived.

Also, a lot of work has gone into calculating vols or prices far faster.

The industry has definitely changed a lot in the past 20 years.
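As a small illustration of the last item on that list, here is a Cox-Ross-Rubinstein binomial tree - a sketch with assumed parameters, not anyone's production model:

  # Minimal Cox-Ross-Rubinstein binomial tree for a European call, as an
  # illustration of the "binomial pricing models" above. Parameters assumed.
  from math import exp, sqrt

  def crr_call(S, K, T, r, sigma, steps=200):
      dt = T / steps
      u = exp(sigma * sqrt(dt))          # up move
      d = 1 / u                          # down move
      q = (exp(r * dt) - d) / (u - d)    # risk-neutral probability of an up move
      disc = exp(-r * dt)
      values = [max(S * u ** j * d ** (steps - j) - K, 0.0) for j in range(steps + 1)]
      for _ in range(steps):             # backward induction through the tree
          values = [disc * (q * values[j + 1] + (1 - q) * values[j])
                    for j in range(len(values) - 1)]
      return values[0]

  print(crr_call(S=100, K=105, T=0.5, r=0.03, sigma=0.2))   # converges to the BS price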


Very few of the fancy models are actually used. Dupire's non-parametric model has been the industrial workhorse for a long time. Heston-like SVs and jump diffusions promised a lot and did not work in practice (calibration, stability issues). Some form of local-stochastic model gets used for certain products. In general, it is safe to say that Black-Scholes and its deterministic extension, local vol, have held up well.


Not only that, but Dupire’s local vol, stochastic vol (Heston in rates, or on the equity side models that combine local vol with a stoch vol component to calibrate to implied vols perfectly) and jump diffusion were basically in production 15 years ago.

Since the GFC it’s not about crazy new products (on derivatives desks), but it’s about getting discounting/funding rates precisely right (depending on counterparty, collateral and netting agreements, onshore/offshore, etc), and about compliance and reporting.


> the way to calculate the price of an Option/Derivative hasn't changed in my understanding for 20/30 years

Not true. Most of the magic happens in estimating the volatility surface, BSM's magic variable. But I've also seen interesting work in expanding the rates components. All this before we get into the drift functions.


While the industry has changed substantially since the GFC, all foundational derivatives models were basically in place back then.


> all foundational derivatives models were basically in place back then

In vanilla equity options, sure. But that’s like saying we solved rockets in WWII. The foundational models were derived by then; everything that followed was refinement, extension and application.


> how fast you can calculate , forecast, and trade on that information has.

How you can calculate fast, forecast, and trade on that information has

There. Fixed it for you. ;)


Leveraging "hidden" risk/reward asymmetries is another avenue completely that applies to both quant/HFT, adding a dimension that turns this into a pretty complex spectrum with plenty of opportunities.

The old joke of two economists ignoring a possible $100 bill on the sidewalk is an ironic adage. There are hundreds of bills on the sidewalk; the real problem is prioritizing which bills to pick up before the 50mph steamroller blindsides those courageous enough to dare play.


Algo trading is certainly about speed too, but it's not HFT, which is literally only about speed and scalping spreads. It's about the speed of recognizing trends and reacting to them before everyone else realizes the same trend and thus alters the trend.

It's a lot like quantum mechanics, or whatever it is that makes observing a photon change it. Except with the caveat that the first to recognize the trend can direct its change (for profit).


The math might not be complicated for a lot of market making stuff but the technical aspects are still very complicated.


>Quant trading is about "going fast" or "being super right",

Going fast means scalping?


Are there any learning resources that you know of?


llama3 is all high school math too.


Was going to point out the same. Glad to have the paper to read but I don't think the findings are significant.


I agree this isn't earth shattering, but I think the benefit here is that it's a general solution instead of one trained on financial statements specifically.


That is not a benefit. If you use a tool like this to try to compete with sophisticated actors (e.g. all major firms in the capital markets space) you will lose every time.


We come up with all sorts of things that are initially a step backwards, but that lead to eventual improvement. The first cars were slower than horses.

That's not to suggest that Renaissance is going to start using Chat GPT tomorrow, but maybe in a few years they'll be using fine tuned versions of LLMs in addition to whatever they're doing today.

Even if it's not going to compete with the state of the art models for something, a single model capable of many things is still useful, and demonstrating domains where they are applicable (if not state of the art) is still beneficial.


It seems to me that LLMs are the metaphorical horse and specialized algorithms are the metaphorical car in this situation. A horse is an extremely complex biological system that we barely understand and which has evolved many functions over countless iterations, one of which happens to be the ability to run quickly. We can selectively breed horses to try to get them to run faster, but we lack the capability to directly engineer a horse for optimal speed. On the other hand, cars have been engineered from the ground up for the specific purpose of moving quickly. We can study and understand all of the systems in a car perfectly, so it's easy to develop new technology specialized for making cars go faster.


Far too much in the way of "maybe in a few years" LLM prediction relies on the unspoken assumption that there will not be any gains in the state of the art in the existing, non-LLM tools.

"In a few years" you'd have the benefit of the current, bespoke tools, plus all the work you've put into improving them in the meantime.

And the LLM would still be behind, unless you believe that at some point in the future, a radically better solution will simply emerge from the model.

That is, the bet is that at some point, magic emerges from the machine that renders all domain-specialist tooling irrelevant, and one or two general AI companies can hoover up all sorts of areas of specialism. And in the meantime, they get all the investment money.

Why is it that we wouldn't trust a generalist over a specialist in any walk of life, but in AI we expect one day to be able to?


> That is, the bet is that at some point, magic emerges from the machine that renders all domain-specialist tooling irrelevant, and one or two general AI companies

I have a slightly more cynical take: Those LLMs are not actually general models, but niche specialists on correlated text-fragments.

This means human exuberance is riding on the (questionable) idea that a really good text-correlation specialist can effectively impersonate a general AI.

Even worse: Some people assume an exceptional text-specialist model will effectively meta-impersonate a generalist model impersonating a different kind of specialist!


> Even worse: Some people assume an exceptional text-specialist model will effectively meta-impersonate a generalist model impersonating a different kind of specialist!

Eloquently put :-)


Specialists exist because the human generalist can no longer possibly learn and perfect all there is to learn in the world, not because the specialist has magic powers the generalist doesn't.

If there were some super generalist that could, then the specialist would have no power.


The technocrat thinks that the AI is that generalist and will impose it on you whether you want it or not:

"I didn't violate a red light. I wasn't even driving, the AI was!"

"The AI said you did, that's 50,000 yuan please."


>Why is it that we wouldn't trust a generalist over a specialist in any walk of life, but in AI we expect one day to be able to?

The specialist is a result of his general intelligence though.


If you don't look, you will never see.


agreed. most people can't create a custom tailored finance statement model. but many people can write the following sentence: "analyze this financial statement and suggest a market strategy." and if that sentence performs as well as an (albeit old) custom model, and is likely to have compound improvements in its performance over time with no changes to the instruction sentence...


But it can't come up with a particularly imaginative strategy; it can only come up with a mishmash of existing stuff it has seen, equivocate, or hallucinate a strategy that looks clever but might not be.

So it all needs checking. It's the classic LLM situation. If you're trained enough to spot the errors, the analysis wouldn't take you much time in the first place. And if you're not trained enough to spot the errors...

And let's say it does work. It's like automated exchange betting robots. As soon as everyone has access to a robot that can exploit some hidden pattern in the data for a tiny marginal gain, the price changes and the gain collapses.

So if everyone has the same access to the same banal, general analysis tools, you know what's going to happen: the advantage disappears.

All in all, why would there be any benefits from a generalised model?


"buy and hold the S&P 500 until you're ready to retire"


> "buy and hold the S&P 500 until you're ready to retire"

That is bad advice.

The VGT Vanguard Technology ETF has outperformed the S&P 500 over the past 20 years.

All the people who say “VTSAX and chill” disappeared in the past 3-4 years because their cherished total passive index fund is no longer the best over long horizons. And no, the markets are not efficient.


> VGT Vanguard Technology ETF

Given the techie audience here, I want to caution that investing in the same industry as your job is a kind of anti-diversification.

A really severe example would be all the people who worked at Enron and invested everything in Enron stock.

Even if your employer/investments aren't quite so fraudulent, you don't want to be in a situation where you are long-term unemployed and are forced to "sell low" in order to meet immediate needs. If only one or the other is hit, you can ride things out more effectively.


Need to invest in VT not VGT. Markets are efficient.


No


But I bet it uses way more energy.


The infamous 1/N portfolio comparison is missing. 1/N puts many strategies to shame.
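(For anyone unfamiliar: 1/N just means equal-weighting every asset. A toy sketch with made-up returns:)

  # Toy 1/N (equal-weight) portfolio; the returns matrix is made up.
  import numpy as np

  rets = np.random.default_rng(0).normal(0.0005, 0.01, size=(252, 10))  # 252 days, 10 assets
  w = np.full(10, 1 / 10)            # the entire "strategy": equal weights
  port = rets @ w                    # daily portfolio returns
  print(port.mean(), port.std())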


For dimly-lit environments, the human eye's peak sensitivity for scotopic vision is around 498nm (https://en.wikipedia.org/wiki/Scotopic_vision), which is blueish-green.


> Scotopic vision occurs at luminance levels of 10^-3 to 10^-6 cd/m^2

They should have more than enough brightness to be clearly visible in those light conditions at almost any visible wavelength they choose for the laser, so it's weird if they optimize for this instead of outdoor performance.


And yet, yellow is the most easily visible color at night...

However, apparently I'm the idiot, because red, not blue, is hardest to see, due to various reasons. At night we are all but blind to it.


It looks like Llama 2 7B took 184,320 A100-80GB GPU-hours to train[1]. This one says it used a 96×H100 GPU cluster for 2 weeks, or 32,256 GPU-hours. That's 17.5% of the number of hours, but H100s are faster than A100s [2] and FP16/bfloat16 performance is ~3x better.

If they had tried to replicate Llama 2 identically with their hardware setup, it'd cost a little bit less than twice their MoE model.

[1] https://github.com/meta-llama/llama/blob/main/MODEL_CARD.md#...

[2] https://blog.ori.co/choosing-between-nvidia-h100-vs-a100-per...
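(The back-of-the-envelope numbers above, spelled out:)

  # Back-of-the-envelope GPU-hour comparison from the figures above
  a100_hours = 184_320            # Llama 2 7B, per the model card [1]
  h100_hours = 96 * 24 * 14       # 96 GPUs x 24 h/day x 14 days = 32,256
  print(h100_hours, h100_hours / a100_hours)   # 32256, ~0.175 (17.5%)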


They mention the cost was ~$80,000 USD, so for 32,256 hours it comes to ~$2.48 an hour. Amazing how cost effective the compute actually is.


I was paying $1.10 per A100-hour more than a year ago. $2.48 is crazy expensive.


It was for a 96 x H100 cluster. Their provider was exabits.ai, which bills itself as a decentralised computing marketplace.


They cite straight-through estimators in the previous work on (actual binary) BitNet, which has many of the same authors.
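For anyone unfamiliar with the trick, a straight-through estimator looks roughly like this in PyTorch (a generic sketch, not BitNet's actual code):

  # Generic straight-through estimator (STE) sketch, not BitNet's actual code:
  # quantize in the forward pass, but backpropagate as if it were the identity.
  import torch

  def binarize_ste(x: torch.Tensor) -> torch.Tensor:
      # forward: sign(x); backward: gradient flows straight through to x
      return x + (torch.sign(x) - x).detach()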


Looks like it is possible to download it locally, but as far as I can tell you have to manually copy all the various files from the Artifacts folder individually.


Could you post a magnet link?


What makes you think they have a magnet link?


"Could you...?" is not necessarily a question about one's ability. It is sometimes a colloquial request to do something, similar to "Would you mind...?"


Never mind, it is already on HuggingFace: https://huggingface.co/microsoft/phi-2/tree/main


The code generation tool better be called "Tcl me NeMo"


Author here- I'm sorry about the camera controls! Happy to accept pull requests that replace it with something more sensible

The original idea was to be able to navigate around with just arrow keys (conceptually by turning yourself around in place and being able to walk back and forward).


This is insanely cool!

If you integrate this with ThreeJS you'd have a lot of control options for free!

Whilst you're here, I have a question for you: it seems like you don't render real gaussians (I see sharp edges in many cases). Is this a bug on my side, or is this an optimization made to be able to run fast? I created an issue to discuss, if you prefer: https://github.com/antimatter15/splat/issues/2


If you do an update, consider this a vote for WASD + mouselook. It's a ubiquitous scheme among everyone with an interest in real time computer graphics


No need to apologize, it's a minor thing! Anyway really neat stuff and I love seeing it here.


Seems interesting that there appears to have been a patent application for LK-99 (https://patents.google.com/patent/KR20230030188A/en?oq=WO202...) filed two years ago in August 2021 (a year earlier than the article suggests).

If true, it seems wild to sit on this kind of discovery for over two years.

Update: Seems like there might be even more history given the name LK-99 apparently comes from the names of its discoverers Dr. Lee and Dr. Kim, and the year of its discovery, 1999 (https://kr.linkedin.com/in/ji-hoon-kim-03508b80).


Quick googling appears to indicate that it was produced in trace amounts as a consequence of other experiments at that time and was only investigated in and of itself, in larger amounts, more recently? Awaiting more info from better informed people.


Naive question: is that the original content of the application from 2021, or could it have been updated later? Is it the same material as LK-99 in the published paper?


I've recently been enjoying OrbStack (https://orbstack.dev/), which I've found easier to get started with than Lima, starts up faster, and automatically mounts volumes so you can access things from Finder

It's unfortunately not fully open source


Thanks for mentioning OrbStack!

I thought I'd elaborate on a few specific ways OrbStack improves on apps and issues mentioned elsewhere in this thread.

- Network hangs and connection issues: I wrote a new virtual network stack in userspace and made sure to address issues plaguing other virtual networking solutions (VPN compat, DNS failures, etc.).

- Inexplicable errors: Can't say it's perfect, but I do take every issue seriously. For example, sending OOM kill notifications on macOS instead of silently killing processes.

- Running x86 containers: Builtin fixes and workarounds for many Rosetta bugs. Since the bugs are on Apple's side, they affect all other apps (as of writing) and often cause issues when running linux/amd64 containers. If you're used to slow QEMU emulation, then well, it should be a major improvement.

- Multipass: OrbStack can run 15 different distros (https://docs.orbstack.dev/machines/distros#linux-distributio...). It's not limited to Ubuntu, and cloud-init support is also planned for most distros.

- UTM: OrbStack doesn't do graphics yet, so you'll have to stick to UTM for GUI, but the CLI and other integration is designed to be on WSL2 level. It also boots much faster: baseline in ~1 sec, each machine in ~250 ms, total ~1.2 sec.

- Bind mounts and file sharing: It uses VirtioFS which isn't affected by sshfs consistency issues, plus caching and optimizations to give it an edge.

- Colima: https://docs.orbstack.dev/compare/colima


Thank you for your work on OrbStack. Just tried it after reading about in this thread and it looks really great so far, both as Docker replacement and absolutely delightful to launch full Linux VMs.

Noticed you are using a very recent kernel, Linux ubuntu 6.3.12-orbstack, which is great for testing the latest revisions of Linux system calls (e.g. io_uring) locally, compared to Docker's old 5.x kernels, which I gave up trying to figure out how to upgrade.

Any way to select a specific kernel version for VM or container? That would be a killer feature for regression testing.


Glad to hear that!

There's currently no support for changing the kernel version. I think it may not be feasible to support many versions because a lot of OrbStack's improvements are very closely tied to the kernel, and maintaining the patches for multiple versions wouldn't be worth the work. Outside of regression testing, it's rare that anything breaks and in the event that such changes occur upstream, I try to hold off on updating until they're fixed.

Are you mainly interested in 5.15/6.1 LTS or other specific versions? Having an alternate LTS kernel (for a total of two options) might be a possibility in the (long term) future.


Having the option of an "older" LTS kernel (say 5.15 or 5.10) would be useful so as to match the kernel used in a lot of commonly used cloud images (including Ubuntu 22.04 LTS and Amazon Linux 2).


I love OrbStack, like actually love it, to the point where now I have to complain about its issues a lot because I can't use anything else. It still has a lot of things to improve on in general, which are annoying when you encounter them suddenly. Most frustrating for me is the "unlimited permission" setup. Sometimes that's useful as hell. Other times I would indeed like to run a service using the convenience of OrbStack but, you know, a little more sandboxed... I loved /Users/yyy/Documents being tied to my iCloud Drive. Had to disable that once I started using OrbStack, for personal reasons. orbctl / orb really need much more explanation of a lot of the options, especially whatever the config options do.

Or let me bind to other network interfaces, not just the one :(

But I love OrbStack. Hard to go back after finding it. Impossible, actually. Haven't been at the computer in like two weeks so maybe it's changed by now.


Totally fair! There are definitely still limitations.

Support for "isolated" machines that don't have bind mounts is planned (https://github.com/orbstack/orbstack/issues/169). This is actually mostly implemented internally, but I'm not exposing it until a few remaining security gaps are plugged or it would just give a false sense of security.

If you meant binding servers to specific interface IPs, it might be possible one day but it's very challenging to implement as all of the host's IPs need to be assigned to an interface in the guest and managed accordingly. If you meant connecting machines directly to your LAN, it'll be supported eventually but it's low priority due to unavoidable compatibility issues. https://github.com/orbstack/orbstack/issues/342


Also would be great to know which "orb" flags you find confusing. Config options are now documented: https://docs.orbstack.dev/settings


OrbStack is nice to use, but it's not open source and who knows what they're going to charge for it; once VC gets its dirty hands in there, you know it'll become expensive.


(dev here) The info has been on Twitter for a while but I've just added it to the docs: https://docs.orbstack.dev/faq#free


Thanks for sharing this. I'd stayed away from it, worried that it'd be expensive, but I'll try it now as it's free for personal use.


It is a really nice looking product but I wish they would set down a firmer plan for how they are going to charge for it.


I honestly think this is a feature and not a bug. The FAQ shows an attention to detail for the trade-offs of various pricing models[0]. It's clear that Danny cares about monetizing the project in a thoughtful way.

I moved away from Docker Desktop to colima for a couple years and would not pay for Docker Desktop, but after a few weeks of swapping back to OrbStack now that it's public beta, I can definitely see myself paying. OrbStack just works and gets out of the way.

[0]: https://docs.orbstack.dev/faq#free


This is a couple months old, but is a reasonably concrete proposal: https://twitter.com/OrbStack/status/1656326409995055104


(dev here) Yes, this is still the proposal that I'm planning to move forward with. It's always possible that it doesn't work out and will need to be changed, but I think it's a pretty solid start.

Just updated the docs with this info: https://docs.orbstack.dev/faq#free


Nice one, thanks for that.


That jumped right out at me too


I gave up on Docker for macOS a year ago. A couple months ago I found OrbStack, and I've been using "Docker" again on Mac since then. It's fantastic.


Big fan of OrbStack here, too.


Thanks for mentioning this, even though it’s not open source. I wanted to check the business model and pricing, and found the FAQ on pricing refreshing to see. [1] Unfortunately, a $5 per user per month kind of fee may be out of reach for personal use. So I hope the free plan for personal use stays (or they come up with an even cheaper plan).

[1]: https://docs.orbstack.dev/faq#free


The free plan for personal use should be here to stay. What's not set in stone yet is whether there will be a Pro plan for more advanced features, and if so, what said advanced features would be (likely cloud services / Kubernetes). But I'd expect the core Docker and Linux machines functionality to stay free.

