For the last year I’ve been developing Hyperparam — a collection of small, fast, dependency-free open-source libraries that let data scientists and ML engineers actually look at their data.
- Hyparquet: Read any Parquet file in the browser or Node.js (sketch after this list)
- Icebird: Explore Iceberg tables without needing Spark/Presto
- HighTable: Virtual scrolling of millions of rows
- Hyparquet-Writer: Export Parquet easily from JS (sketch after this list)
- Hyllama: Read llama.cpp .gguf LLM metadata efficiently
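For example, reading a slice of a remote Parquet file with Hyparquet looks roughly like this (a sketch based on my reading of the hyparquet README; the URL and column names are placeholders, and exact function/option names may vary by version):

    import { asyncBufferFromUrl, parquetReadObjects } from 'hyparquet'

    // Wrap the remote file so hyparquet can issue HTTP range requests
    // and fetch only the bytes it needs (footer first, then row groups).
    const file = await asyncBufferFromUrl({ url: 'https://example.com/data.parquet' }) // placeholder URL

    // Read the first 10 rows of two columns as plain JS objects.
    const rows = await parquetReadObjects({
      file,
      columns: ['id', 'text'], // assumed column names, for illustration
      rowStart: 0,
      rowEnd: 10,
    })
    console.log(rows)

And going the other way with Hyparquet-Writer (again a sketch; check the repo README for the exact column spec):

    import { parquetWriteBuffer } from 'hyparquet-writer'

    // Build a Parquet file entirely in memory from column arrays.
    const arrayBuffer = parquetWriteBuffer({
      columnData: [
        { name: 'id', data: [1, 2, 3], type: 'INT32' },
        { name: 'text', data: ['a', 'b', 'c'], type: 'STRING' },
      ],
    })

Both snippets use top-level await, so they assume an ES module context (a module script in the browser, or an .mjs file in Node).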
CLI for viewing local files: npx hyperparam dataset.parquet
Example dataset on Hugging Face Space: https://huggingface.co/spaces/hyperparam/hyperparam?url=http...
No cloud uploads. No backend servers. A better way to build frontend data applications.
GitHub: https://github.com/hyparam
Feedback and PRs welcome!
> This stems from an industry-wide realization that model performance is ultimately bounded by data quality, not just model architecture or hyperparameters.
Generally we think of model architecture plus weights (parameters) as making up the model itself, while hyperparam(s|eters) are more relevant to how one arrives at those weights -- and for this reason they bear more on the efficacy of training than on the performance of the resultant model.