To me, this basically says "LLMs aren't pre-trained on enough 1D timeseries data" - there's a classic technique in time series analysis where you just do a wavelet or FFT on the time series and feed it into a convnet as an image, leveraging the massive pre-training on, e.g. ImageNet. This "shouldn't" be the best way to do it, since a giant network should learn a better internal representation than something static like FFT or a wavelet transform. But there's no 1D equivalent of ImageNet so it still often works better than a 1D ConvNet trained from scratch.
Same applies here. An LLM trained on tons of time series should be able to create its own internal representation that's much more effective than looking at a static plot, since plots can't represent patterns at all scales (indeed, a human plotting to explore data will zoom in, zoom out, transform the timeseries, etc.). But since LLMs don't have enough 1D timeseries pretraining, the plot-as-image technique leverages the massive amount of image pre-training.
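For concreteness, the classic transform-then-treat-as-image trick looks roughly like this: compute a short-time FFT of the 1D series and hand the resulting 2D array to a pretrained image model. A minimal NumPy-only sketch (the downstream pretrained convnet is elided; window and hop sizes are arbitrary choices here):

```python
import numpy as np

def spectrogram_image(x, win=64, hop=16):
    """Short-time FFT of a 1D series, returned as a 2D log-magnitude
    'image' of shape (frequencies, time frames)."""
    frames = np.stack([x[i:i + win] * np.hanning(win)
                       for i in range(0, len(x) - win + 1, hop)])
    spec = np.abs(np.fft.rfft(frames, axis=1)).T   # (freq bins, time frames)
    return np.log1p(spec)                          # compress dynamic range

# Toy series: two sinusoids at different frequencies
t = np.linspace(0, 10, 1024)
x = np.sin(2 * np.pi * 3 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)
img = spectrogram_image(x)   # this 2D array is what you'd feed an image model
```

The point is that `img` can now be resized and normalized like any grayscale image, so an ImageNet-pretrained backbone applies directly.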
For training AI MLPs to predict time-series data that's known to have sinusoidal behavior (which might lead to 'reasoning' like it did in LLMs), I bet it's more efficient to first curve-fit the data onto continuous data points, then convert to the frequency domain (like you said, FFT), and then do all the training on purely frequency-domain datasets. The way the AI would "predict" (run inference) would then be by spitting out frequency-domain predictions, which have to be converted back to the time domain to get the 'real' output.
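The round trip described above is cheap to sketch: transform to the frequency domain, let the model operate on the coefficients, then invert back. A minimal NumPy sketch, with the model step stubbed out as a pass-through:

```python
import numpy as np

# Toy signal with two sinusoidal components
t = np.linspace(0, 1, 256, endpoint=False)
x = np.sin(2 * np.pi * 5 * t) + 0.3 * np.sin(2 * np.pi * 20 * t)

coeffs = np.fft.rfft(x)          # frequency-domain representation (model input/target)
# ... a trained model would emit predicted coefficients here;
#     we just pass the true ones through as a placeholder ...
x_rec = np.fft.irfft(coeffs, n=len(x))   # convert prediction back to time domain
```

The `rfft`/`irfft` round trip is exact up to floating-point error, so nothing is lost in the representation change itself; whether training on coefficients actually helps is the empirical question.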
I'm sure the audio-processing AI systems out there are doing something like this already so it would be interesting to try to leverage that stuff by sending it "audio" that's actually just arbitrary time-series data rather than PCM of sound waves.
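Packing arbitrary series as "audio" is mechanically trivial: normalize, quantize to 16-bit PCM, and wrap in a WAV container. A stdlib-plus-NumPy sketch (the sample rate is an arbitrary choice; nothing here is specific to any particular audio model):

```python
import io
import wave
import numpy as np

def series_to_wav_bytes(series, sample_rate=16000):
    """Pack an arbitrary 1D series as mono 16-bit PCM WAV bytes,
    so it can be fed to audio-processing systems as if it were sound."""
    x = np.asarray(series, dtype=float)
    x = (x - x.min()) / (x.max() - x.min() + 1e-12)   # normalize to [0, 1]
    pcm = ((x * 2 - 1) * 32767).astype(np.int16)      # rescale to int16 range
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)        # mono
        w.setsampwidth(2)        # 16-bit samples
        w.setframerate(sample_rate)
        w.writeframes(pcm.tobytes())
    return buf.getvalue()

wav = series_to_wav_bytes(np.sin(np.linspace(0, 100, 16000)))
```

Whether an audio model trained on speech or music extracts anything useful from non-acoustic data is the open question, but the plumbing is this simple.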
There are some recent foundation models pre-trained on time-series data. For example TimesFM from Google. Of course, it's not directly built for classification, and it's meant for univariate datasets, so it would take some work to adapt it to these problem domains.
It kind of feels criminal to do time-series analysis with multimodal models and not use any traditional numerical models to provide a baseline result. It's an interesting result though.
They mention using an IMU dataset that is collected using an APDM Opal.
https://www.apdm.com/wp-content/uploads/2015/05/Opal-Publica...
This publication mentions a paper on p. 5839 (p. 13 of the PDF) where a single sensor on the waist (as used in the Google research) would lead to an F1 score of 0.77, if I did my math correctly. In other words, pretty close to a >1-shot plot analysis of GPT-4o and Gemini 1.5 Pro.
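For anyone checking the arithmetic: F1 is just the harmonic mean of precision and recall. A quick sketch with illustrative numbers (not the paper's actual precision/recall, which I don't have in front of me):

```python
def f1(precision, recall):
    """F1 score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Illustrative values only: precision 0.80 and recall 0.74 land near 0.77
score = round(f1(0.80, 0.74), 2)   # -> 0.77
```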
I would also be interested in how the LLMs would hold up against the free-fall interrupt that's built into some consumer-grade IMUs (the BMA253, for instance). Anyone here with experience in this use case?
I don't want to sound too dismissive of someone's hard work, but I was kind of hoping for something more sophisticated than showing an LLM the image of a plot. Using the article's example, I would be interested in understanding causes (or even just correlations) of near falls - is it old people, or people who didn't take their vitamins, or people who recently had an illness, etc.? What's the best way of discovering these that isn't me slicing the data by X and looking at the plot?
The fact that you can show an LLM an image of a plot and it'll give you a good-enough-ish classification is, I think, the interesting part. It really is just prompt engineering all the way down...
There is a surprisingly common use case for "quick and dirty univariate time series forecasts" that are basically equivalent to giving a small child a pencil and asking them to draw out the trendline. The now-deprecated Prophet model from Facebook (which was just some GAM) was often used for this. Auto-ARIMA, ETS, etc. are also still really commonly used. I also see people try boosted trees, or deep learning stuff like DeepAR or N-BEATS, even though it's rarely appropriate for their 1k-datapoint univariate time series, just because it gives off the impression of serious methodological work.
There are a lot of use cases in business where what's needed is just some basic, reasonable-ish forecast. I actually think this new model is really neat because it completely dispenses with the pretense that we're doing some really serious and methodologically-backed thing, and we're really just looking at a basic curve fit that seems pretty reasonable to human intuition.
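The "pencil and trendline" baseline mentioned above is a one-liner in practice: fit a straight line and extend it. A minimal NumPy sketch (the history values are made up for illustration):

```python
import numpy as np

def trendline_forecast(y, horizon):
    """The pencil-and-ruler baseline: least-squares line through the
    history, extended `horizon` steps into the future."""
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, deg=1)   # straight-line fit
    future = np.arange(len(y), len(y) + horizon)
    return slope * future + intercept

hist = np.array([10.0, 11.0, 12.1, 12.9, 14.2])  # illustrative history
fc = trendline_forecast(hist, horizon=3)
```

Any fancier method ought to at least beat this on a holdout before it earns its complexity, which is exactly the baseline complaint elsewhere in this thread.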
This is not curve fitting or forecasting: it's pattern matching.
It's also a serious methodological approach. A fall on a sensor graph has a certain look to it just like an abnormality on an EKG that a human can detect. You can train multimodal models to detect these too with decent accuracy. What's methodologically unsound about that? If anything, it demonstrates you don't necessarily need a class of hyper-specific models to do pattern matching.
IMO, if you're doing this, you should avoid text in the charts entirely, since titles can lead the models astray: a title mentioning clustering, for example, will I think bias the model to find clusters even if none exist. This presumes you are the one making the chart and not just prompting with someone else's image.
I've also heard of converting stock data into sound, to try to listen to it as music so you can sort of intuitively use the audio part of your brain to predict where the stock market will go next. It's such an obvious idea I'm sure some large investment institutions have tried this. But I bet it failed, because music tends to lock into certain notes, and jump octaves in ways that markets definitely do not!
Logically it won't be long until we all have our own micromodels instead of hedge fund managers, trained on random factors that seemingly have no relation to anything at all but whose correlation with the market is absurdly strong. With enough data collected and compute getting cheap enough, such a model is certainly possible. I bet those of us in the peasant class won't get to leverage it when it comes out, of course.
I agree. AI is going to be (or already is) able to not only predict markets, but also uncover and plan strategies to MANIPULATE markets as well, through both legal and illegal means.
Perhaps Renaissance has been doing this all along; they just had data and compute at a time (the '80s and '90s) when most people hadn't heard of these things.
In the end, one needs a small edge and thousands of low-correlation trades to take advantage of the law of large numbers (LLN).
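The small-edge-plus-many-trades point is easy to demonstrate numerically: with a 51% per-trade win probability, the realized hit rate over thousands of independent trades concentrates tightly around 0.51. A toy simulation (numbers are illustrative, not a trading model):

```python
import numpy as np

rng = np.random.default_rng(0)
edge = 0.51                              # tiny per-trade win probability
n_trades = 10_000                        # many (assumed independent) trades
wins = rng.random(n_trades) < edge       # simulate each trade's outcome
hit_rate = wins.mean()                   # realized win rate concentrates near 0.51
```

With 10,000 independent trades the standard deviation of the hit rate is about 0.005, so a 51% edge is reliably distinguishable from a coin flip; with only a handful of trades it isn't. That's the LLN doing the work.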
If you had a million monkeys commanding substantial enough funds it doesn’t matter what you model as the market will react to your moves. Which you can then anticipate and profit further from. Show them an image of a rotting banana, they all panic sell, then the puts your orangutans bought and the calls your capuchins sold will be looking pretty.
How dare you insult monkeys. :0 lol. They have better short-term memories than humans, per a 2007 study by Tetsuro Matsuzawa and colleagues at the Primate Research Institute at Kyoto University in Japan.
Not sure how much, or whether at all, anything valuable was unlocked. Given that amount of paid talent and that many people involved, surely the amount being unlocked should be proportional; was it?
Has anyone seen an example of time series analysis via transfer learning / fine-tuning an LLM to process and predict multivariate data as XML or something? e.g.:
<speed 45>
<speed 46>
<heading 123>
<speed 47>
<speed 47>
...etc
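A serializer for that kind of format is trivial to write. A sketch (note the `<name value>` tag style above is a compact made-up token format, not well-formed XML; a fine-tune would learn it either way):

```python
def serialize(readings):
    """Serialize (name, value) time-series readings into one
    tag-style token line per step, matching the format above."""
    return "\n".join(f"<{name} {value}>" for name, value in readings)

text = serialize([("speed", 45), ("speed", 46), ("heading", 123)])
# Each line becomes one timestep token for the fine-tuning corpus
```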