AI has transformed text, images, and code—but structured data remains overlooked. Prior Labs is an early-stage startup building Foundation Models for tabular data, unlocking a new AI modality with the potential to transform science, finance, healthcare, and data science itself. Our model, published in Nature, is already state-of-the-art for small datasets, and we’re scaling this into a step-change for data science.
We’re backed by Balderton, XTX Ventures, and leaders from Hugging Face, Black Forest Labs, DeepMind, DataRobot, and others. Our team includes world-class researchers and engineers from top AI labs, and we’re growing fast.
We're hiring founding engineers & builders to help define this new category:
Software Engineer – Build scalable infrastructure and APIs to integrate our models into real-world applications.
Product Manager – Define and execute the vision for AI-native tools for structured data.
Developer Relations – Grow the developer community, drive adoption, and showcase use cases.
ML Engineer – Optimize and scale our foundation models for structured data.
Location: On-site in Berlin or Freiburg (we believe in building together).
Why Join? Competitive salary + equity, shape the future of foundation models for structured data from day one.
Thanks a lot! We currently have an open issue about documenting how to use the model with more samples: https://github.com/PriorLabs/TabPFN/issues/129. We'll get to it soon; if it matters to you, an upvote there helps us prioritize.
Yes! This makes sense from a learning perspective: more samples add evidence that the datapoint is actually what you observed. Based on a single sample, the model stays closer to a regression to the mean (which, in classification, translates to more balanced class probabilities).
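As a hedged analogy (not TabPFN's actual mechanism), a Beta-Bernoulli posterior with a uniform prior shows the same effect: one observation barely moves the estimate away from the balanced 0.5, while repeated observations sharpen it:

```python
# Posterior mean of a Bernoulli rate under a uniform Beta(1, 1) prior,
# after observing k positives in n samples: (k + 1) / (n + 2).
def posterior_mean(k, n):
    return (k + 1) / (n + 2)

print(posterior_mean(1, 1))    # ~0.67: one positive sample, still near 0.5
print(posterior_mean(10, 10))  # ~0.92: ten positives, much more confident
```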
Transformers have trouble counting repeated entries (there was a famous failure case of ChatGPT, asking it to count the number of 1s and 0s in a string). This model has some tricks to solve this.
Thanks a lot! We don't see clear artifacts from the synthetic data. Part of the "trick" is to keep the capacity of our model low: it has only about 11M parameters. That forces the model to "learn an in-context learning algorithm", or in other words to "do in-context learning rather than in-weights learning".
Adding real data on top will help, agreed! The synthetic data is very broad: we started with a synthetic-data prior that was just samples from Bayesian neural networks (BNNs) of differing sizes, and thus super broad. Our new prior samples simpler-to-explain functions more densely, but can still sample almost any function (with the constraint that our networks aren't infinitely complex).
If you're predicting on text data: our public models don't handle free text; they would encode text columns as categorical classes. Our API (https://github.com/PriorLabs/tabpfn-client/) has experimental support.
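For intuition, "encode as classes" roughly means ordinal-encoding each distinct string as an integer category, something like this sketch (the library's actual preprocessing may differ):

```python
import numpy as np

def encode_as_classes(column):
    # Map each distinct string to an integer class id (sorted for determinism).
    categories = sorted(set(column))
    mapping = {cat: i for i, cat in enumerate(categories)}
    return np.array([mapping[v] for v in column]), mapping

codes, mapping = encode_as_classes(["red", "blue", "red", "green"])
print(codes)  # [2 0 2 1] with mapping {'blue': 0, 'green': 1, 'red': 2}
```

Note that this drops any semantic meaning the text carries, which is exactly what the API's experimental text support aims to improve on.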
Looks like a great use case! We have a method specifically for imputation in the tabpfn-extensions package (https://github.com/PriorLabs/tabpfn-extensions/blob/dbc3f5da...). It needs some cleaning up before I want to highlight it in the notebooks and docs.
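The general recipe is "train on the rows where the column is observed, predict the missing ones". A minimal sketch of that idea (the actual tabpfn-extensions API differs; `fit_predict` here is a hypothetical stand-in for a TabPFN estimator):

```python
import numpy as np

def impute_column(X, col, fit_predict):
    # Fill NaNs in column `col` by modelling it from the other columns.
    X = X.copy()
    mask = np.isnan(X[:, col])
    features = np.delete(X, col, axis=1)
    if mask.any():
        preds = fit_predict(features[~mask], X[~mask, col], features[mask])
        X[mask, col] = preds
    return X

# Toy model: predict the training mean (plug a real estimator in here).
mean_model = lambda X_tr, y_tr, X_miss: np.full(len(X_miss), y_tr.mean())
X = np.array([[1.0, 2.0], [3.0, np.nan], [5.0, 6.0]])
print(impute_column(X, 1, mean_model))  # NaN in row 1 replaced by 4.0
```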
Author here! The fundamental challenge is that LLMs like o1 and Claude 3.5 simply aren't built for the unique structure of tabular data. When processing tables through LLMs, the inefficiency quickly becomes apparent: tokenizing a 10,000 x 100 table as a sequence, with numerical values split into individual tokens, is massively wasteful.
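A back-of-the-envelope calculation makes the point, assuming a rough ~3 tokens per cell (digits plus separator; the exact count depends on the tokenizer):

```python
rows, cols = 10_000, 100
tokens_per_cell = 3            # rough guess: sign + digits + separator
total_tokens = rows * cols * tokens_per_cell
print(f"{total_tokens:,}")     # 3,000,000 tokens for a single table
```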
There's some interesting work on using LLMs for tabular data (TabLLM: https://proceedings.mlr.press/v206/hegselmann23a.html), but this only works for datasets with tens of samples rather than the thousands of rows needed in real-world applications.
What o1 and other LLMs typically do is wrap around existing tabular tools like XGBoost or scikit-learn. While this works, they're ultimately constrained by these tools' limitations. We're taking a fundamentally different approach - building foundation models that natively understand tabular relationships and patterns. Our approach combines the benefits of foundation models with architectures specifically designed for tabular data structures.
Author here! The breast cancer dataset is simple and heavily saturated, so small differences between methods are expected. As you say, results from a single train/test split can be noisy due to the randomness of the split, especially on a saturated dataset like this one. Cross-validation reduces this variance by averaging over multiple splits. I just ran this below:
TabPFN mean ROC AUC: 0.9973
SVM mean ROC AUC: 0.9903
TabPFN per split: [0.99737963 0.99639699 0.99966931 0.99338624 0.99966465]
SVM per split: [0.99312152 0.98788077 0.99603175 0.98313492 0.99128102]
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC
from tabpfn import TabPFNClassifier
import numpy as np

data = load_breast_cancer()
X, y = data.data, data.target

# TabPFN
tabpfn_clf = TabPFNClassifier()
tabpfn_scores = cross_val_score(tabpfn_clf, X, y, cv=5, scoring='roc_auc')
print("TabPFN per split:", tabpfn_scores)
print("TabPFN mean ROC AUC:", np.mean(tabpfn_scores))

# SVM
svm_clf = LinearSVC(C=0.01)
svm_scores = cross_val_score(svm_clf, X, y, cv=5, scoring='roc_auc')
print("SVM per split:", svm_scores)
print("SVM mean ROC AUC:", np.mean(svm_scores))
It's hard to communicate this properly; we should probably keep a more favourable example ready, but we just included the simplest one!
LLMs meet AutoML: in an effort to integrate user knowledge into AutoML, our new tool CAAFE uses LLMs to generate semantically meaningful features for tabular data (and also explains them), a step towards an AI assistant for human data scientists.
CAAFE uses GPT-4 and a textual description of the dataset to iteratively generate Python code that creates new features and explanations for their utility. It thereby harnesses the creative power of LLMs in combination with a systematic verification process that interacts with the LLM in an iterative fashion.
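Schematically, the iterative verification loop can be sketched like this (simplified; the real tool prompts GPT-4, executes the generated feature code, and uses cross-validated scores):

```python
def caafe_loop(score, propose_feature, n_rounds=3):
    # Keep an LLM-proposed feature only if it improves the validation score.
    # `propose_feature` stands in for the LLM call; `score` evaluates a
    # feature set (e.g. cross-validated AUC of a downstream model).
    features, best = [], score([])
    for _ in range(n_rounds):
        candidate = propose_feature(features)   # ask the LLM for a new feature
        new = score(features + [candidate])     # systematic verification step
        if new > best:                          # keep only what helps
            features.append(candidate)
            best = new
    return features, best

# Stubbed demo: the score saturates after two useful features.
feats, best = caafe_loop(lambda fs: min(len(fs), 2) * 0.1,
                         lambda fs: f"feature_{len(fs)}")
print(feats, best)  # ['feature_0', 'feature_1'] 0.2
```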
While hyperparameter optimization can also benefit from domain knowledge, we believe that the pipeline steps closest to the data and the user, such as feature engineering, stand to gain much more from additional semantic information. This may open exciting possibilities for broadening the range of applications of #AutoML to help practitioners with more of the data science pipeline.
Executing AI-generated code requires careful consideration. We've implemented a whitelist of safe python commands, but risks remain. Also, AI can replicate or even exacerbate biases present in the training data. Much more work is needed to avoid this. Please use CAAFE cautiously and examine its generated features critically, especially with an eye on principles from algorithmic fairness.
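To illustrate the flavour of such a check (purely illustrative; AST filtering alone is not a security boundary, and CAAFE's actual safeguards differ), a whitelist over called names might look like:

```python
import ast

ALLOWED_CALLS = {"abs", "len", "min", "max", "sum"}  # toy allow-list

def is_safe(code):
    # Reject any code that calls a function outside the allow-list.
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Call):
            name = getattr(node.func, "id", None) or getattr(node.func, "attr", None)
            if name not in ALLOWED_CALLS:
                return False
    return True

print(is_safe("x = max(a, b)"))     # True
print(is_safe("__import__('os')"))  # False
```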
Why not let GPT generate your features directly, or simply use OpenAI's Code Interpreter? While GPT-4 is a powerful model, it's not specifically designed for ML. CAAFE adds a systematic verification process to ensure the generated features are actually useful for the task at hand, and feeds that result back to the LLM.
Apply now: https://jobs.ashbyhq.com/prior-labs
Questions? Reach out at noah@priorlabs.ai