I guess for self-evaluation and generation we'd want to choose a model that's performant for the job. This means that if the 70B is fine-tuned, the fine-tuned model is probably what acts as the judge + augmentor, rather than a generic model.
Also, I think the paper shows the win rate using Mistral Medium on a preliminary benchmark (Table 2).
But I liked the idea that the reward model is not static, and if the user is presented with multiple options, the extra score might help break the tie.
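Roughly, the tie-breaking idea could look like this. To be clear, this is my own toy sketch, not code from the paper; the `reward_model` callable and the length-based stand-in are hypothetical placeholders for a real judge model:

```python
def pick_best(candidates, reward_model):
    """Return the candidate response with the highest reward score.

    `reward_model` is any callable mapping text -> float score
    (hypothetical stand-in for a fine-tuned judge model). When the
    user is shown multiple otherwise-equivalent options, this score
    is what breaks the tie.
    """
    return max(candidates, key=reward_model)


# Toy reward model for illustration only: score = response length.
toy_reward = len

best = pick_best(["ok", "a fuller answer", "hi"], toy_reward)
print(best)  # -> "a fuller answer"
```

Since the reward model isn't static, `reward_model` here could be swapped for an updated judge between iterations without changing the selection logic.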
maybe? I wasn't even sure what Databricks was until I looked it up. I might not be the right person to answer this question, so you might need to narrow down what your audience is :)
My POV comes from academia. Our data is rarely in any sort of database, and is often just in various files we have to parse or otherwise ingest and analyze. So the notebooks tend to live next to the data (on our laptop or maybe a group server).
As an aside, academics often have a very hard time paying for services. Universities want to get involved in any sort of recurring subscription: have the legal department look at the contracts, maybe even negotiate, etc. This is true even for $10/month. So we often look to do things locally ("shadow IT" is kind of a related phenomenon).
However, researchers and data analysts in private industry probably work very differently, so don't take this as gospel. But as for your original question: yes, many, many scientists use Jupyter locally.