Hacker News | Curiositry's comments

Qwen3.5 9b seems to be fairly competent at OCR and text formatting cleanup running in llama.cpp on CPU, albeit slow. However, I have compiled it umpteen ways and still haven't gotten GPU offloading working properly (which I had with Ollama), on an old 1650 Ti with 4GB VRAM (it tries to allocate too much memory).

I found that the drivers I had were no longer compatible with the newer kernels. After upgrading to newer drivers it was able to offload again.

I have a 1660 Ti, and the CachyOS + aur/llama.cpp-cuda package is working fine for me. With about 5.3 GB of usable memory, I find that the 35B model is by far the most capable one that performs just as fast as the 4B model that fits entirely on my GPU. I did try the 9B model, and it was surprisingly capable. However, the 35B is still better in some of my own anecdotal test cases. Very happy with the improvement. However, I notice that Qwen 3.5 is about half the speed of Qwen 3.

Are you running with all the --fit options and it’s not working correctly? You could try looking at how many layers are being attempted to offload and manually adjust from there. Walk down --n-gpu-layers with a bash script until it loads.
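
A minimal sketch of that last suggestion, for anyone who wants to try it. The `find_max_layers` helper and the `probe_llama` wrapper are hypothetical names, and the exact `llama-cli` invocation (binary name, prompt flags, the `MODEL` variable) is an assumption that may need adjusting for your build; only `--n-gpu-layers` comes from the comment above.

```shell
#!/usr/bin/env bash
# Walk a layer count down from START until a probe command succeeds.
# Usage: find_max_layers START PROBE [ARGS...]
# The probe is invoked as: PROBE ARGS... N   (N = layer count to try)
find_max_layers() {
  local n=$1
  shift
  while [ "$n" -ge 0 ]; do
    if "$@" "$n" >/dev/null 2>&1; then
      echo "$n"        # highest layer count for which the probe succeeded
      return 0
    fi
    n=$((n - 1))
  done
  return 1             # nothing worked, not even 0 layers
}

# Hypothetical probe: load the model, generate one token, exit.
# Assumes llama-cli is on PATH and MODEL points at a .gguf file.
probe_llama() {
  llama-cli -m "$MODEL" --n-gpu-layers "$1" -n 1 -p "hi" </dev/null
}

# Example (not run here):
#   MODEL=./model.gguf
#   find_max_layers 40 probe_llama
```

Starting high and stepping down by one is slow but simple; if each load attempt takes a while, a larger step or a binary search over the same probe would get there faster.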

> GPU offloading working

I had this issue which in my case was solved by installing a newer driver. YMMV.

  sudo apt install nvidia-driver-570

If you’re building from source, the vulkan backend is the easiest to build and use for GPU offloading.

Yes, that's what I tried first. Same issue with trying to allocate more memory than was available.

This has been my main editor for prose and code for a few years now (Sublime Text -> Atom -> Vim -> Helix). Overall, it has been great. Many LSPs work almost out-of-the-box, and my config is a fraction the size of my old .vimrc.

Surprisingly, it didn’t take that long to update my Vim muscle memory. Days or weeks, maybe? However, I still have mixed feelings about modal editors in general, and most of my gripes with Helix are actually about modal editors and/or console editors in general.

Code folding is a feature I’m still waiting for.


Curious to hear your gripes about modal editors! I'm a long time Emacs user (traditional keybindings, not evil-mode), but I also started using Vim in parallel a little over a decade ago. I feel very proficient/productive in both, regularly using many of Vim's more advanced motions and functionality. I generally love the power and composability of Vim text objects, and definitely experience the benefit of using them. But there are some times where I am doing things like many small edits within a line where rapidly changing modes for all of the edits starts to feel cumbersome.

For Emacs, I use multiple cursors and a treesitter-based plug-in for incrementally expanding or reducing the selection by text objects. I also have a collection of my own helper functions for working with text that make my non-modal Emacs approach still feel very comparable to the power of manipulating text in Vim.

Curious to hear if your issues with modal editing are similar.


- It feels like I know all the efficient keybindings, but when someone looks over my shoulder, I become conscious of how much time I spend mashing Esc/CapsLock and i/I/a/A/o/O, compared to how much editing actually happens.

- I have nomouse mode on, to try to learn modal editors properly. But the mouse is actually fairly fast for getting to a specific cursor position. In theory, using Helix motions could be faster (and there's gw if I don't know what motion to use). In practice, the mental process of turning a point on the coordinate plane into the correct series of motions (including i) feels vastly slower.

Still, Vim, Helix, etc are incredible for structural manipulation of text, and I miss what they provide any time I edit text somewhere else, even with the universal keybindings that are available for navigating/selecting/deleting words, lines, etc. I tried Vim mode in Zed and it just didn't cut it.

Some things about Helix that I particularly like: speed and stability (no weird lag on visual block insert!), the jump to diagnostics/changes pattern (]d <Space>a is a surprisingly nice spellcheck interface, with <Space>d for the overview), the jumplist, and the good-by-default fuzzy pickers.


My main gripe with modal editors is that they still use the Escape key to go back to normal mode, even though Escape was chosen for historical reasons (it used to sit much, much closer to the home row on older Unix keyboards). In Linux and macOS I can change it with just one GUI setting, but it's still annoying how everyone went with it. It's not mentioned in most Vim tutorials. According to a Vim Reddit poll, at least half of the users are just using Escape where it is now instead of one of the alternatives. This is beyond me; it feels like someone inventing glasses in order to see better but everyone settling on cast iron frames.

Cassepipe, it’s not a great default for sure. What do you have yours mapped to? I mapped jj to return to normal mode and also save my file. So, as I’m typing, I just hit jj, the jj vanishes, and this command is run:

<Esc>:w<CR>

I could just have it escape instead without saving.
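
A sketch of how that mapping could be added from a shell, assuming it is an insert-mode `inoremap` (the comment above does not say which map command was used) and that your config lives in `~/.vimrc`. The `add_jj_mapping` helper name is made up for illustration.

```shell
# Append an insert-mode "jj" mapping to a vimrc, but only once.
# Usage: add_jj_mapping [VIMRC]   (defaults to ~/.vimrc)
add_jj_mapping() {
  local vimrc="${1:-$HOME/.vimrc}"
  local mapping='inoremap jj <Esc>:w<CR>'
  # -x matches the whole line, -F treats the mapping as a fixed string,
  # so running this twice does not duplicate the line.
  if ! grep -qxF "$mapping" "$vimrc" 2>/dev/null; then
    printf '%s\n' "$mapping" >> "$vimrc"
  fi
}
```

To escape without saving, the mapping line would be `inoremap jj <Esc>` instead.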

If I hadn’t chosen jj it would have been ff, which is also always under an index finger. I do wish I’d been clued into the idea when I started with Vim instead of two years later.


I find the jj/jk hack a bit too clever for my taste. I just map CapsLock to Escape system-wide because it also unlocks quick escaping for shells vi-modes too and I realized that actually Escape is a really nice key to have around in a lot of UIs to get out/go back/cancel what you are doing. I also like that it's a simple gui setting away (or registry key editing in windows).

I either put CapsLock where Escape sits or use both shifts simultaneously (one cancels it) but even then I almost never use it. The rare times I need to type a lot of uppercase together is generally code in vim and visual selection + gU does the job.

The point of my comment was not to shill for a particular solution though but for the vim community to acknowledge the problem publicly instead of it being some insider knowledge you discover in a random internet comment six months into fighting vim (if you haven't dropped out yet)


What key would be a good candidate as a default though? Imagine the memes for exiting vim if you needed a modifier to get into normal mode. Caps lock is truly a useless key and should be escape anyway.

As someone who cut their teeth on a sun "programmer" layout, I really need control to be in that position. I might try mapping the vestigial control key to escape though. Or maybe the hack that dtj1123 describes (tap is escape, hold is control), if I can pull that off on macos.

<ctrl-[> always works out of the box which is less of a stretch than esc.

I do jk as I always find a roll easier on the fingers than a double-tap jj or kk. You could also use space provided you aren't using one of those distros that bases its identity on the spacebar.


Yeah, for me capslock is a systemwide esc. Works great.

I have caps remapped to esc when tapped, and ctrl when held. Takes perhaps a weekend to get used to, but once the muscle memory is there it feels incredibly comfortable and natural.

Which tool can do that kind of wizardry? I've seen either but not both.

I have similar types of bindings. I just found a keyboard that can use ZMK. There's quite a few out there.

ZMK (or its free software cousin QMK) is super flexible, and you can create lots of custom behaviors for keys (tap/hold behaviors, double press, layering, etc.). It takes some time and effort to learn how to set it all up. Some of the more complicated behaviors require using their DSL for mapping the keys instead of their GUI editor. Considering the ridiculous amount of hours I spend at my computer using a keyboard, I felt it was worth the investment in learning.


On macOS I use Karabiner-Elements to do the exact same thing. Also, my config is only applied in terminals, everywhere else the original functionality is kept. So, I'd say it is quite flexible.

Karabiner on macos, keyd on Linux.

Is this macro mapped in vim or OS level? Sounds interesting.

Last time I checked, on all OSes you need to install some third-party software alas. Hopefully I am wrong now.

I believe it's done via a daemon that intercepts keystrokes

I configure it in the firmware of my keyboard with QMK

I have hesitated many times to set this up but I don't want to get used to something that I cannot set up in less than 30 seconds on a new machine.

Not having Escape where CapsLock sits on a new machine already makes it infuriatingly unusable :)


I map caps to ctrl and do ctrl-[ to get to normal mode. The main reason is using Vim bindings in other editors where Esc can get intercepted by other bindings but ctrl-[ has always worked everywhere.

My opinion is that going back to normal mode is too important a key to be a key combo, and a weird one at that (is it [ or ] ?). I am pretty sure you can get used to it but we humans get used to anything really, doesn't make it good. My pressing on CapsLock happens at a subconscious level. Quick edit and then punctuate with CapsLock with the pinkie. Some random key combo is not acceptable.

But again, my point is that the default sucks. You probably learned about Ctrl + [ while looking online for alternatives after realizing the default sucked.


at least on linux you can map caps lock to esc if tapped and ctrl if held

Natively? How?


But you do have to install and configure xcape. By native I meant something that would either be a GUI option in your DE or a simple command from something that is already installed on a Linux distro, like `setxkbmap`.
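
For reference, the xcape approach is usually set up roughly like this. A sketch, assuming an X11 session with `xcape` installed; the `run` and `caps_tap_esc_hold_ctrl` helper names and the `DRY_RUN` switch are made up for illustration.

```shell
# Caps Lock: Escape when tapped, Control when held (X11 only).
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

caps_tap_esc_hold_ctrl() {
  # XKB option shipped with setxkbmap: Caps Lock becomes an extra Control.
  run setxkbmap -option ctrl:nocaps
  # xcape (separate install): emit Escape when Control_L is tapped alone.
  run xcape -e 'Control_L=Escape'
}

# Example (not run here):
#   caps_tap_esc_hold_ctrl
```

Note that with this mapping the physical left Ctrl key also sends Escape when tapped on its own, and the settings do not persist across sessions unless added to a startup script.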

I mean... if people don't mind reaching, so what? I purposefully don't remap my leader key, although I don't have many leader mappings so it's not like I'm reaching for it constantly.

It is not good for your wrists. Don't do that to yourself

My wrists are fine! Unless by "you" you mean the people reaching for escape.

I mean if people don't mind having cast iron glasses, sure

No, but really, Vim's paradigm is that you should go back to normal mode constantly. With the current situation you get posts on the Vim subreddit asking/telling you about insert mode editing commands. You might as well use Emacs at that point; at least it would be the intended workflow.


Ah, so you're saying people would probably not ask these questions if exiting insert mode had a more accessible key?

Exactly

Possibly, though I would have to see data, I guess. I feel like a lot of people are just curious about Vim and spend all their time in insert mode anyway. I have seen a bunch of those "how would I do this without exiting insert mode" questions, though I'm not sure that having a more accessible escape key would change all of their minds. The "I want to use this new thing but I'm only interested in using it in a way that I'm used to" mindset is rampant among programmers.

Last night, I published a directory of indie blog directories on my (indie) blog.

ramkarthikk had built a directory of indie blogs, which included my blog’s RSS feed, and found my directory of directories post in the directory he had built.

This morning, he emailed me with the story of how he found my post, and asked if I’d consider adding his directory to my directory of directories. His directory was so nice I added it to my directory of directories and posted it here :)

This, I think, is how the indieweb is supposed to work.


This is something I really want to exist. But vibe-coded security tooling? Pretty much the last thing I want.


> This is something I really want to exist

How come? Cloudflare's free plan is great, and I can't think of a scenario where you want something more advanced but can't afford to pay for their enterprise plan.


No, but I have wanted to implement this on my site, and I have seen examples of it in the wild (maybe even the same one).

It seems like the hard part would be categorizing the posts accurately, and picking the axes to filter.



Not my product, but I agree it’s confusing. I assume that, like Ollama, it started out with support for one family of models, and then expanded scope and outgrew its name.


This was a breeze to install on Linux. However, I haven't managed to get realtime transcription working yet, à la Whisper.cpp stream or Moonshine.

--from-mic only supports Mac. I'm able to capture audio with ffmpeg, but adapting the ffmpeg example to use mic capture hasn't worked yet:

ffmpeg -f pulse -channels 1 -i 1 -f s16le - 2>/dev/null | ./voxtral -d voxtral-model --stdin

It's possible my system is simply under spec for the default model.

I'd like to be able to use this with the voxtral-q4.gguf quantized model from here: https://huggingface.co/TrevorJS/voxtral-mini-realtime-gguf


I am interested in a way to capture audio not only from the mic, but also from one of the monitor ports so you could pipe the audio you are hearing from the web directly for real-time transcription with one of these solutions. Did anyone manage to do that?

I can, for example, capture audio from that with Audacity or OBS Studio and do it later, so it should be possible to do it in real time too assuming my machine can keep up.


Set -i 1 to -i default or to one of your monitors, look them up with pactl list short sources

https://trac.ffmpeg.org/wiki/Capture/PulseAudio
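
A rough sketch of how that could be wired together. The `pick_monitor` and `transcribe_monitor` helper names are made up, and the `-ar 16000 -ac 1` output settings are an assumption about the input format the model expects; the `./voxtral -d voxtral-model --stdin` invocation is copied from the command earlier in the thread.

```shell
# Pick the first PulseAudio monitor source from `pactl list short sources`
# output on stdin (tab-separated; the source name is the second column).
pick_monitor() {
  awk -F '\t' '$2 ~ /\.monitor$/ { print $2; exit }'
}

# Stream whatever the machine is playing into voxtral as raw PCM.
# Assumption: 16 kHz mono s16le input; adjust -ar/-ac if that is wrong.
transcribe_monitor() {
  local src
  src=$(pactl list short sources | pick_monitor)
  [ -n "$src" ] || { echo "no monitor source found" >&2; return 1; }
  ffmpeg -f pulse -i "$src" -ac 1 -ar 16000 -f s16le - 2>/dev/null |
    ./voxtral -d voxtral-model --stdin
}

# Example (not run here):
#   transcribe_monitor
```

For microphone capture, passing `default` as the source name in place of the monitor should work the same way.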


Does it work if you use ffmpeg to feed it audio from a file? I personally would try file->ffmpeg->voxtral then mic->ffmpeg->file, and then try to glue together mic->ffmpeg->voxtral.

(But take with grain of salt; I haven't tried yet)


Recording audio with FFMPEG, and transcribing a file that’s piped from FFMPEG both work.

Given that it took 19.64 mins to transcribe the 11 second sample wav, it’s possible I just didn’t wait long enough :)
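
For scale, that works out to roughly 107 times slower than realtime. A quick way to compute the ratio for other runs (the `rtf` helper name is made up for illustration):

```shell
# Real-time factor: processing time divided by audio duration, both in seconds.
# Values above 1 mean slower than realtime.
rtf() {
  awk -v proc="$1" -v audio="$2" 'BEGIN { printf "%.1f\n", proc / audio }'
}

# 19.64 minutes of processing for an 11 second clip.
rtf "$(awk 'BEGIN { print 19.64 * 60 }')" 11   # prints 107.1
```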


Ah. In that case... Yeah. Is it using GPU, and does the whole model fit in your (V)RAM?


This is a CPU implementation only.


Oh, that's interesting. The readme talks about GPU acceleration on Apple Silicon and I didn't see anything explicit for other platforms, so I assumed it needs GPU everywhere, but it does BLAS acceleration which a web search seems to agree is just a CPU optimized math library. That's great; should really increase the places where it's useful:)


It should be possible to develop a cuBLAS backend to accelerate BLAS on Nvidia.


From my testing on Linux this model is way too slow for anything close to realtime. The machine I’m using is kinda old, but a 12 minute input file took half a day to process.


If Voxtral can process rapid speech as well as it claims to, an obvious cost optimization would be to speed up normal laconic speech to the maximum speed the model can handle accurately.


I haven’t actually tried step three yet, but:

1. Kobo ereaders are dirt cheap at thrift stores (and run Linux)

2. KOReader is simple to install (I have done this)

3. KOReader has a text editor + terminal built in, and has a setting to switch to USB-OTG mode, which should allow you to plug in a USB-C hub, and a mechanical keyboard.

Boom! Internet connected e-ink writing tablet with excellent battery life, and the best keyboard you have, for ~$5-100.

