It keeps saying the phrase “model you can run locally”, but despite days of tryi...

valine · on April 30, 2023

I’ve got 4 different llama models running locally with CUDA and can freely switch between them, including LLaVA which is a multimodal LLaMA variant.

None of them are particularly difficult to get running, the trick is to search the project’s github issue tracker. 99% of the time your problem will be in there with steps to fix it.

jiggawatts · on April 30, 2023

> the trick is to search the project’s github issue tracker.

What ever happened to the crazy notion of Dockerfiles that simply build successfully?

Isn’t half the point of containerisation that it papers over the madness of the Python module ecosystem?

avereveard · on April 30, 2023

The problem with that is gpu access from the docker image and iirc Nvidia doesn't have a Windows passtrough cuda driver

UncleEntity · on April 30, 2023

They require someone to take time out of their busy schedule to build?

Complaining that people won’t work for you for free is a bit much, don’t you think?

Nuzzerino · on April 30, 2023

I remember the days when lowering the barrier to entry was considered to be a safe investment since it would pay for itself through increased project interest (and more contributions by the community).

Now it is apparently seen as "working for free for ungrateful people"

valine · on April 30, 2023

Most of the repos for these new LLaMA derivatives are brand new, and put together by people who specialize in ML research not software engineering. I’m just happy to have access to their code. I’m sure most of these teams would be happy to merge a PR that lowered the barrier to entry, why don’t you get on that?

jiggawatts · on April 30, 2023

> specialize in ML research not software engineering.

That has nothing to do with Python tooling being bad. A safe assumption is that Python package managers are being developed by developers, who have no excuse.

If a C++ codebase developed by scientists had null pointer exceptions in it, then I could excuse things. But if the C++ compiler itself introduced unforced null pointer errors, then it absolutely deserves criticism.

It shouldn't be possible for a ML researcher to use Conda or whatever package manager in a way that despite using a formally specified "requirements.txt", it won't build a week later because of how loose the specification of module versions is allowed to be.

The Python attitude and more specifically Conda is at fault here, not the ML researching trying to get his job done.

alchemist1e9 · on April 30, 2023

One point you have I absolutely agree with:

> Conda is at fault here

Conda is so so bad. But trying to explain why to people who have fallen into it’s trap is difficult. People don’t realize the packages are not signed on enough information to reproduce them. The optimizer to find matching versions to make an environment that satisfies your constraints is really bad idea.

As an experienced C++ developer unfortunately and fortunately I’ve concluded the most “correct” solution is to use nixpkgs.

dventimi · on April 30, 2023

What's a good alternative? No, really. Conda has caused me so much pain.

alchemist1e9 · on April 30, 2023

As I mentioned I’ve settled on Nix with nixpkgs however it’s got a steep learning curve and really isn’t appropriate for anyone but fairly experienced unix hackers.

It’s a problem.

qaq · on April 30, 2023

Poetry is fairly reasonable package manager

UncleEntity · on April 30, 2023

There’s a big difference between making a project easy to use and requiring they package it all up in someone’s preferred container format which can’t even legally include all the dependencies.

Facebook seems to be pretty hands off (as is expected since the code is open source) unless you distribute the model weights and then they drop the dmca banhammer.

So, yeah, simply complaining with no effort to understand the problem is kind of ungrateful.

MakeUsersWant · on April 30, 2023

Could you publish a set of known-good versions (pip freeze, OS version, etc)?

nl · on April 30, 2023

Use the HuggingFace Transformer library. Unlike random github repos they are professionally maintained with proper versioning.

Here's the docs: https://huggingface.co/docs/transformers/main/model_doc/llam...

int_19h · on April 29, 2023

All of these things exist in the Python package ecosystem, and are generally much more common outside of ML/DS stuff. The latter... well, it reminds me of coding in early PHP days. Basically, anything goes so long as it works.

kmod · on April 30, 2023

I believe the cuda stuff, via Nvidia licensing restrictions, is forced to live outside of these packaging systems (so that you sign a Nvidia eula). Not saying this is a good thing but I think that none of the systems you mentioned would handle this well either

emikulic · on April 30, 2023

This used to be the case but nowadays you can just pip install e.g. https://libraries.io/pypi/nvidia-cuda-runtime-cu11

crowwork · on April 30, 2023

https://mlc.ai/mlc-llm/

jiggawatts · on April 30, 2023

I love how the response to a complaint about unreproducible builds without any versions being specified is an install script that straight up clones the "current commit" of a Git repo instead of a specific working commit id or tag.

Astonishing.

Spivak · on April 30, 2023

There's some truth to the "Arch" philosophy for some types of software it's actually more stable to just pull from master.

jiggawatts · on April 30, 2023

Not in this case. I’ve tried several repos and they all fail with various version mismatch issues.

Taek · on April 29, 2023

I have it running locally using the oobabooga webui, setup was moderately annoying but I'm definitely no python expert and I didn't have too much trouble.

DustinBrett · on April 30, 2023

I had it running before with Dalai (https://github.com/cocktailpeanut/dalai) but have since moved to using the browser based WebGPU method (https://mlc.ai/web-llm/) which uses Vicuna 7B and is quite good.

KETpXDDzR · on April 29, 2023

llama.cpp was easy to setup IMO

jiggawatts · on April 30, 2023

Can you link to a working Dockerfile?

I've heard several people say that it is easy, but then surely it ought to be trivial to set script the build so that it works reliable in a container!

PostOnce · on April 30, 2023

No need to drag a gigabyte of docker stuff into this, just extract the zip file from github and type make into your terminal

congratulations, it now works.

If you're not a developer, maybe you'll have to type sudo apt install build-essential first. Congratulations, now you too, a non-developer, are running it locally.

https://github.com/ggerganov/llama.cpp

vidarh · on April 30, 2023

Did that, got a compiler error within seconds within seconds. Looks like it might need a newer version of gcc than is in the distro on my laptop. Which is why people ask for Docker. If it really will work just with make with build-essential on a new enough distro image, a Dockerfile that documents that would be trivial, and does not at all stop people from just typing make if their setup is new enough.

kolinko · on April 30, 2023

but what for?

rch · on April 29, 2023

Just use Nixpkgs already.

danieldk · on April 30, 2023

Upstream Hydra doesn't build packages with CUDA because it uses a non-FLOSS license. So they are not in the binary cache. You'll end up rebuilding every CUDA-using package every time a transitive dependency is changed. Yeah, I know, pin the world. But you'll still have to build these packages on every machine. So, you have to run your own binary cache. As you see, the rabbit hole gets deep pretty quickly.

The only recourse is using the -bin flavors of PyTorch, etc. which will just download the precompiled upstream versions. Sadly, the result will still be much slower than other distributions. First because Python isn't compiled with optimizations and LTO in nixpkgs by default, because it is not reproducible. So, you override the Python derivation to enable optimizations and LTO. Python builds fine, but to get the machine learning ecosystem on you machine, Nix needs to build a gazillion Python packages, since the derivation hash of Python changed. Turns out that many derivations don't actually build. They build with the little amount of parallelism available on Hydra builders, but many Python packages will fail to build because of concurrency issues in tests that do manifest on your nice 16 core machine.

So, you spend hours fixing derivations so that they build on many core machines and upstream all the diffs. Or YOLO and you disable unit tests altogether. A few hours/days later (depending on your knowledge of Nix), you finally have a built of all packages that you want, you launch whatever you are doing on your CUDA-capable GPU. Turns out that it is 30-50% slower. Finding out why is another multi-day expedition in profiling and tinkering.

In the end pyenv (or a Docker container) on a boring distribution doesn't look so bad.

(Disclaimer: I initially added the PyTorch/libtorch bin packages to nixpkgs and was co-maintainer of the PyTorch derivation for a while.)

alchemist1e9 · on April 30, 2023

As a heavy nixpkgs user your comment resonates and makes me nervous.

I was thinking if it is possible in nixpkgs to create a branch that attempts to create a version match to specific distributions, especially Ubuntu as the ML world is most using it. My idea is to somehow use the deb package information to “shadow” another distribution.

> First because Python isn't compiled with optimizations and LTO in nixpkgs by default, because it is not reproducible. So, you override the Python derivation to enable optimizations and LTO. Python builds fine, but to get the machine learning ecosystem on you machine, Nix needs to build a gazillion Python packages, since the derivation hash of Python changed. Turns out that many derivations don't actually build. They build with the little amount of parallelism available on Hydra builders, but many Python packages will fail to build because of concurrency issues in tests that do manifest on your nice 16 core machine.

I understand your comments including above and the one about CUDA binaries. Just one clarification on the concurrency in tests failure, do you mean it overloads the machine running multi process tests that then tests fail due to assumptions by the package authors?

My main point is that Nix as a system is so incredibly powerful that perhaps there is an ability to “shadow” boring distributions, especially debian based, in some automated way. The we would have the best of both, baseline stability from the distribution and extensibility of nix.

danieldk · on April 30, 2023

I understand your comments including above and the one about CUDA binaries. Just one clarification on the concurrency in tests failure, do you mean it overloads the machine running multi process tests that then tests fail due to assumptions by the package authors?

I've found that quite some test suites have race conditions (e.g. simultaneous modification of files, etc.), which manifest themselves e.g. when a package uses pytest-xdist (and the machine has enough cores).

My main point is that Nix as a system is so incredibly powerful that perhaps there is an ability to “shadow” boring distributions,

I think things would improve vastly if it was possible to do CUDA builds in Hydra and have the resulting packages in the binary cache. My idea (when I was still contributing to nixpkgs) was to somehow mark CUDA derivations specially, so that they get built but not stored in the binary cache. That would allow packages with CUDA dependencies to get built as well (e.g. PyTorch). Nix would then only have to build CUDA locally (which is cheap, since it only entails unpacking the binary distribution and putting things in the right output paths) and would get everything else through the binary cache (like prebuilt PyTorch). But AFAIR it'd require some coordinated changes between Nix, Hydra, etc.

Then I started working for a company in the Python/Cython ecosystem and quickly found out that Nix is not really viable for most Python development. So I am now just using pyenv and pip, which works fine most of the time (we have some people in our team who are very good at maintaining proper package version bounds).

alchemist1e9 · on April 30, 2023

Have you ever come across anything debian -> nix expression tools?

Ubuntu seems to be winning mindshare across the board and while this would be different than nixpkgs itself I was thinking if it is possible to mass convert deb packages into nix expressions, this combined with overlays would allow rapid incremental testing of marginal modifications to a current distribution’s stacking of versions.

A bit like how Nix community has tools on top of the various language packaging systems but this would be a layer on top of the debian packaging standards.

Maybe it’s crazy but just an idea I’ve been having recently and wondering how hard it might be. Importantly debian deb and apt systems are very reproducible and structured which is a good fit for a Nix based layer.

rch · on April 30, 2023

Granted, but I already have to run isolated container registries, pypi, maven, terraform, CI/CD, etc., etc., and so locally addressing the problems you've described is unavoidable and will realize significant efficiencies in any case. Everything about working in partially or fully air gapped environments is painful - no surprises there.

But I also think it's fine for individuals and researchers working in ML to expect some extra compiling, as long as the outcome is reliable. I'm stuck at home this weekend resurrecting an analysis from 10+ years ago, complete with Python, R, Java, and Fortran dependencies^, and I'm definitely wishing I'd known about Nix back then.

^btw, thanks to whomever included hdf5-mpi in Nixpkgs. Your work is greatly appreciated.

rch · on April 30, 2023

I'll add that I do end up disabling tests more than I'd like, even for very popular packages. Point taken there.

UncleEntity · on April 30, 2023

I learned the very hard way not to mess with the python version the system depends on.

If you absolutely must then build it separately and link (or use) that exactly like blender does with their binaries. Campbell (one of the core blender devs) used to love to bump the python version as soon as it was released and if you wanted to do any dev work you’d have to run another python environment until the distro version caught up. Being as I liked to use the fedora libs as a sort of sanity check this was a bit of a hassle to say the least.

danieldk · on April 30, 2023

I learned the very hard way not to mess with the python version the system depends on.

That's not much of an issue on Nix. You can just override Python for your particular (machine learning) package set. The rest of the system will continue to use Python unmodified.

dagenix · on April 30, 2023

There is a lot that can be improved with python packaging, but calling it "childish" is itself a pretty immature comment.

jiggawatts · on April 30, 2023

Is it?

Literally every example I've seen so far is completely unversioned and mere weeks after being written simply doesn't work as a direct consequence.

E.g: https://github.com/oobabooga/text-generation-webui/blob/ee68...

Take this line:

    pip3 install torch torchvision torchaudio

Which version of torch is this? The latest.

    FROM nvidia/cuda:11.8.0-runtime-ubuntu22.04

Which version of CUDA is this? An incompatible one, apparently. Game over.

Check out "requirements.txt":

    accelerate==0.18.0
    colorama
    datasets
    flexgen==0.1.7
    gradio==3.25.0
    markdown
    numpy
    pandas
    Pillow>=9.5.0
    pyyaml
    requests
    rwkv==0.7.3
    safetensors==0.3.0
    sentencepiece
    tqdm

Wow. Less than half of those have any version specified. The rest? "Meh, I don't care, whatever."

Then this beauty:

    git+https://github.com/huggingface/peft

I love reaching out to the Internet in the middle of a build pipeline to pull the latest commit of a random repo, because that's so nice and safe, scalable, and cacheable in an artefact repository!

The NPM ecosystem gets regularly excoriated for the exact same mistakes, which by now are so well known, so often warned against, so often exploited, so regularly broken that it's getting boring.

It's like SQL injection. If you're still doing it in 2023, if your site is still getting hacked because of it, then you absolutely deserve to be labelled immature and even childish.

dagenix · on April 30, 2023

> you absolutely deserve to be labelled immature and even childish

Do you appreciate that people aren't making technical mistakes on purpose just to spite you? Or that maybe some of the folks writing these libraries are experts in fields other than dependency management? Are you an expert in all things? Would you find it helpful if someone identifies one thing that you aren't great at and then calls you names on the internet over that one thing?

There is a pretty significant difference between making a technical critique and just being rude. And being right about the former doesn't make the latter ok.

forgingahead · on April 30, 2023

You're right, Python's ecosystem and dependency management is a shitshow and everybody involved should be ashamed of themselves. But of course there are many "you're holding it wrong" commenters here who are in denial of this fact. It's an absolute pity that Python has become the de-facto language for public ML projects.

dagenix · on April 30, 2023

> everybody involved should be ashamed of themselves

Have you ever read about how open source project leaders often experience a lot of toxicity and anxiety about trying to keep up with the users they are supporting for free? If not, I suggest you do since this is the exact type of comment that is hurtful and unhelpful.

detrites · on April 30, 2023

You both seem right. While the base criticism is correct, the emotion attached to it is only an impediment. Python+ML seems apparently in a "move fast and break things" phase. We should expect rough edges and be calm if we're to address that well.

dagenix · on April 30, 2023

I agree with the person I'm responding to 100% on the technical merits.