Qwen3.5 9b seems to be fairly competent at OCR and text formatting cleanup running in llama.cpp on CPU, albeit slow. However, I have compiled it umpteen ways and still haven't gotten GPU offloading working properly (which I had with Ollama), on an old 1650 Ti with 4GB VRAM (it tries to allocate too much memory).
I have a 1660ti and the cachyos + aur/llama.cpp-cuda package is working fine for me.
With about 5.3 GB of usable memory, I find that the 35B model is by far the most capable one that performs just as fast as the 4B model that fits entirely on my GPU.
I did try the 9B model and it was surprisingly capable. However, the 35B is still better in some of my own anecdotal test cases.
Very happy with the improvement. However, I notice that Qwen 3.5 is about half the speed of Qwen 3.
Are you running with all the --fit options and it’s not working correctly? You could try looking at how many layers are being attempted to offload and manually adjust from there. Walk down --n-gpu-layers with a bash script until it loads.
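The "walk it down until it loads" idea could look roughly like this in Python instead of bash. This is only a sketch: `llama_loads` is a hypothetical helper, the `llama-cli` binary name and the idea that a failed allocation shows up as a non-zero exit code are assumptions about your build, and the starting layer count is whatever your model reports.

```python
import subprocess
from typing import Callable, Optional


def find_max_gpu_layers(loads: Callable[[int], bool], start: int) -> Optional[int]:
    """Try start, start-1, ..., 0 and return the first layer count that loads."""
    for n in range(start, -1, -1):
        if loads(n):
            return n
    return None  # nothing worked, not even 0 offloaded layers


def llama_loads(model_path: str, n_gpu_layers: int, timeout_s: int = 300) -> bool:
    """Hypothetical probe: generate one token and treat exit code 0 as success.

    Assumes a `llama-cli` binary on PATH; adjust the name and flags to your build.
    """
    cmd = [
        "llama-cli",
        "-m", model_path,
        "--n-gpu-layers", str(n_gpu_layers),
        "-p", "hi",
        "-n", "1",
    ]
    try:
        result = subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,  # avoid hanging on an interactive prompt
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0
```

Usage would be something like `find_max_gpu_layers(lambda n: llama_loads("model.gguf", n), start=40)`, where both the file name and the 40 are placeholders. A linear walk is slow but simple; each probe has to load the model, so expect it to take a while on a 4GB card.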
This has been my main editor for prose and code for a few years now (Sublime Text -> Atom -> Vim -> Helix). Overall, it has been great. Many LSPs work almost out-of-the-box, and my config is a fraction of the size of my old .vimrc.
Surprisingly, it didn’t take that long to update my Vim muscle memory. Days or weeks, maybe? However, I still have mixed feelings about modal editors in general, and most of my gripes with Helix are actually about modal editors and/or console editors in general.
Curious to hear your gripes about modal editors! I'm a long time Emacs user (traditional keybindings, not evil-mode), but I also started using Vim in parallel a little over a decade ago. I feel very proficient/productive in both, regularly using many of Vim's more advanced motions and functionality. I generally love the power and composability of Vim text objects, and definitely experience the benefit of using them. But there are some times where I am doing things like many small edits within a line where rapidly changing modes for all of the edits starts to feel cumbersome.
For Emacs, I use multiple cursors and a treesitter-based plug-in for incrementally expanding or reducing the selection by text objects. I also have a collection of my own helper functions for working with text that make my non-modal Emacs approach still feel very comparable to the power of manipulating text in Vim.
Curious to hear if your issues with modal editing are similar.
- It feels like I know all the efficient keybindings, but when someone looks over my shoulder, I become conscious of how much time I spend mashing Esc/CapsLock and i/I/a/A/o/O, compared to how much editing actually happens.
- I have nomouse mode on, to try to learn modal editors properly. But the mouse is actually fairly fast for getting to a specific cursor position. In theory, using Helix motions could be faster (and there's gw if I don't know what motion to use). In practice, the mental process of turning a point on the coordinate plane into the correct series of motions (including i) feels vastly slower.
Still, Vim, Helix, etc are incredible for structural manipulation of text, and I miss what they provide any time I edit text somewhere else, even with the universal keybindings that are available for navigating/selecting/deleting words, lines, etc. I tried Vim mode in Zed and it just didn't cut it.
Some things about Helix that I particularly like: speed and stability (no weird lag on visual block insert!), the jump to diagnostics/changes pattern (]d <Space>a is a surprisingly nice spellcheck interface, with <Space>d for the overview), the jumplist, and the good-by-default fuzzy pickers.
My main gripe with modal editors is that they still use the Escape key to go back to normal mode even though Escape was chosen for historical reasons (it used to sit much closer to the home row on older Unix keyboards).
On Linux and macOS I can change it with a single GUI setting, but it's still annoying that everyone went with it. It's not mentioned in most Vim tutorials. According to a Vim Reddit poll, at least half of the users are just using Escape where it is now instead of one of the alternatives. This is beyond me; it feels like someone invented glasses in order to see better, but everyone settled on cast iron frames.
Cassepipe, it’s not a great default for sure. What do you have yours mapped to? I mapped jj to return to normal mode and also save my file. So, as I’m typing, I just hit jj, the jj vanishes, and this command is run:
<Esc>:w<CR>
I could just have it escape instead without saving.
If I hadn’t chosen jj it would have been ff, which is also always under an index finger. I do wish I’d been clued into the idea when I started with Vim instead of two years later.
I find the jj/jk hack a bit too clever for my taste. I just map CapsLock to Escape system-wide because it also unlocks quick escaping in shells' vi modes, and I realized that Escape is actually a really nice key to have around in a lot of UIs to get out/go back/cancel what you are doing. I also like that it's a simple GUI setting away (or a registry edit on Windows).
I either put CapsLock where Escape sits or use both shifts simultaneously (one cancels it) but even then I almost never use it. The rare times I need to type a lot of uppercase together is generally code in vim and visual selection + gU does the job.
The point of my comment was not to shill for a particular solution, though, but for the Vim community to acknowledge the problem publicly instead of it being some insider knowledge you discover in a random internet comment six months into fighting Vim (if you haven't dropped out yet).
What key would be a good candidate as a default though? Imagine the memes for exiting vim if you needed a modifier to get into normal mode. Caps lock is truly a useless key and should be escape anyway.
As someone who cut their teeth on a Sun "programmer" layout, I really need control to be in that position. I might try mapping the vestigial control key to escape though. Or maybe the hack that dtj1123 describes (tap is escape, hold is control), if I can pull that off on macOS.
<ctrl-[> always works out of the box which is less of a stretch than esc.
I do jk as I always find a roll easier on the fingers than a double-tap jj or kk. You could also use space provided you aren't using one of those distros that bases its identity on the spacebar.
I have caps remapped to esc when tapped, and ctrl when held. Takes perhaps a weekend to get used to, but once the muscle memory is there it feels incredibly comfortable and natural.
I have similar types of bindings. I just found a keyboard that can use ZMK. There's quite a few out there.
ZMK (or its free software cousin QMK) is super flexible, and you can create lots of custom behaviors for keys (tap/hold behaviors, double press, layering, etc.). It takes some time and effort to learn how to set it all up. Some of the more complicated behaviors require using their DSL for mapping the keys instead of their GUI editor. Considering the ridiculous amount of hours I spend at my computer using a keyboard, I felt it was worth the investment in learning.
On macOS I use Karabiner-Elements to do the exact same thing. Also, my config is only applied in terminals, everywhere else the original functionality is kept. So, I'd say it is quite flexible.
I map caps to ctrl and do ctrl-[ to get to normal mode. The main reason is using Vim bindings in other editors where Esc can get intercepted by other bindings but ctrl-[ has always worked everywhere.
My opinion is that going back to normal mode is too important a key to be a key combo, and a weird one at that (is it [ or ] ?). I am pretty sure you can get used to it but we humans get used to anything really, doesn't make it good. My pressing on CapsLock happens at a subconscious level. Quick edit and then punctuate with CapsLock with the pinkie. Some random key combo is not acceptable.
But again, my point is that the default sucks. You probably learned about Ctrl + [ while looking online for alternatives after realizing the default sucked.
But you do have to install and configure xcape. By native I meant something that would either be a GUI option in your DE or a simple command from something that is already installed on a Linux distro, like `setxkbmap`.
I mean... if people don't mind reaching, so what? I purposefully don't remap my leader key, although I don't have many leader mappings so it's not like I'm reaching for it constantly.
I mean, if people don't mind having cast iron glasses, sure.
No, but really, Vim's paradigm is that you should go back to normal mode constantly. With the current situation you get posts on the Vim subreddit asking/telling you about insert mode editing commands. You might as well use Emacs at that point; at least it would be the intended workflow.
Possibly, though I would have to see data, I guess. I feel like a lot of people are just curious about Vim and spend all their time in insert mode anyway. I have seen a bunch of those "how would I do this without exiting insert mode" questions, though I'm not sure that having a more accessible escape key would change all of their minds. The "I want to use this new thing but I'm only interested in using it in a way that I'm used to" mindset is rampant among programmers.
Last night, I published a directory of indie blog directories on my (indie) blog.
ramkarthikk had built a directory of indie blogs, which included my blog’s RSS feed, and found my directory of directories post in the directory he had built.
This morning, he emailed me with the story of how he found my post, and asked if I’d consider adding his directory to my directory of directories. His directory was so nice I added it to my directory of directories and posted it here :)
This, I think, is how the indieweb is supposed to work.
How come? Cloudflare's free plan is great, and I can't think of a scenario where you want something more advanced but can't afford to pay for their enterprise plan.
Not my product, but I agree it’s confusing. I assume that, like Ollama, it started out with support for one family of models, and then expanded scope and outgrew its name.
I am interested in a way to capture audio not only from the mic, but also from one of the monitor ports so you could pipe the audio you are hearing from the web directly for real-time transcription with one of these solutions. Did anyone manage to do that?
I can, for example, capture audio from that with Audacity or OBS Studio and do it later, so it should be possible to do it in real time too assuming my machine can keep up.
Does it work if you use ffmpeg to feed it audio from a file? I personally would try file->ffmpeg->voxtral then mic->ffmpeg->file, and then try to glue together mic->ffmpeg->voxtral.
(But take this with a grain of salt; I haven't tried it yet.)
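For what it's worth, the ffmpeg half of that pipeline could be sketched in Python like this. It is untested against Voxtral: the 16 kHz mono 16-bit PCM format is an assumption about what the transcriber wants, `ffmpeg_pcm_cmd` and `pcm_chunks` are made-up helper names, and whatever consumes the chunks is left out entirely.

```python
import subprocess
from typing import Iterator, List, Optional


def ffmpeg_pcm_cmd(source: str, input_format: Optional[str] = None,
                   sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg command that writes raw mono s16le PCM to stdout.

    `input_format` is for live devices (e.g. "pulse" on Linux builds with
    PulseAudio support); leave it as None when `source` is a file.
    """
    cmd = ["ffmpeg", "-hide_banner", "-loglevel", "error"]
    if input_format:
        cmd += ["-f", input_format]
    cmd += ["-i", source, "-ac", "1", "-ar", str(sample_rate), "-f", "s16le", "-"]
    return cmd


def pcm_chunks(cmd: List[str], chunk_bytes: int = 32000) -> Iterator[bytes]:
    """Yield PCM from ffmpeg's stdout; 32000 bytes is 1 s at 16 kHz mono s16le."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        while True:
            chunk = proc.stdout.read(chunk_bytes)
            if not chunk:
                break
            yield chunk
    finally:
        if proc.poll() is None:
            proc.terminate()  # stop ffmpeg if the consumer quit early
        proc.stdout.close()
        proc.wait()
```

Starting with `pcm_chunks(ffmpeg_pcm_cmd("talk.wav"))` covers the file->ffmpeg step; swapping in `ffmpeg_pcm_cmd("default", input_format="pulse")` would be the mic version, and a monitor source name (system-specific, so check what your audio server lists) would be the "what I'm hearing" version.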
Oh, that's interesting. The readme talks about GPU acceleration on Apple Silicon and I didn't see anything explicit for other platforms, so I assumed it needs GPU everywhere, but it does BLAS acceleration which a web search seems to agree is just a CPU optimized math library. That's great; should really increase the places where it's useful:)
From my testing on Linux this model is way too slow for anything close to realtime. The machine I’m using is kinda old, but a 12 minute input file took half a day to process.
If Voxtral can process rapid speech as well as it claims to, an obvious cost optimization would be to speed up normal laconic speech to the maximum speed the model can handle accurately.
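A rough sketch of what that might involve, assuming ffmpeg's `atempo` filter does the speed-up and assuming billing is per minute of submitted audio (neither is confirmed for Voxtral). The helper names are made up, and how fast the model can go before accuracy drops is something you would have to measure.

```python
from typing import List


def atempo_chain(speed: float, max_step: float = 2.0) -> str:
    """Return an ffmpeg -filter:a value that speeds audio up by `speed`.

    Factors are capped at 2.0 and chained, since ffmpeg's docs note that a
    single atempo above 2 skips samples instead of blending them.
    """
    if speed < 1.0:
        raise ValueError("this sketch only handles speed-ups (speed >= 1.0)")
    factors: List[float] = []
    remaining = speed
    while remaining > max_step:
        factors.append(max_step)
        remaining /= max_step
    factors.append(remaining)
    return ",".join(f"atempo={f:g}" for f in factors)


def billed_minutes(duration_min: float, speed: float) -> float:
    """Audio length after the speed-up, assuming per-minute billing."""
    return duration_min / speed
```

For example, `atempo_chain(1.5)` gives `atempo=1.5`, which could be passed as `ffmpeg -i in.wav -filter:a atempo=1.5 out.wav`, and `billed_minutes(12, 1.5)` is 8.0, so a 12 minute file would be charged as 8 minutes under that billing assumption.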
1. Kobo ereaders are dirt cheap at thrift stores (and run Linux)
2. KOReader is simple to install (I have done this)
3. KOReader has a text editor + terminal built in, and has a setting to switch to USB-OTG mode, which should allow you to plug in a USB-C hub, and a mechanical keyboard.
Boom! Internet connected e-ink writing tablet with excellent battery life, and the best keyboard you have, for ~$5-100.