Looks great! I had looked at Chonkie a few months back, but didn't need it in our pipelines. I was just writing a POC for an agentic chunker this week to handle various formatting and chunking requirements. I'll give Chonkie a shot!
Your tool has been awesome! Seeing what it can do inspired me to write a POC that connected to an enterprise IBM application that I used to implement: https://github.com/karbasia/tririga-data-workbench (also uses DuckDB and Perspective with some additional hacks to make it work with IBM's tool).
I've used AG Grid at many of my companies and found it worth the price.
I've also dabbled in Perspective (https://github.com/finos/perspective), but it may be overkill for you. My project was a fun one, similar to SQL Workbench, where data could be uploaded from a specific IWMS API and consumed for client-side analysis. Perspective itself is quite powerful and open source.
From my experience working with IBM products at an enterprise level, a lot of the integrations only provide basic functionality or they are used to demonstrate a happy path for the sales team. Once you go into the implementation phase, you run into multiple issues and require substantial development to actually make it work within a client's environment.
I believe the issue is that graphics cards require really fast memory, which in turn requires close memory placement (that's why the memory chips sit so close to the GPU die on the board). Expandable memory won't be able to provide the required bandwidth here.
The universe used to have hierarchies. Fast memory close, slow memory far. Registers. L1. L2. L3. RAM. Swap.
The same thing would make a lot of sense here. Super-fast memory close, with overflow into classic DDR slots.
As a footnote, going parallel also helps. Eight sticks of RAM at 1/8 the bandwidth each give the same aggregate bandwidth as one stick at 8x the bandwidth, as long as you don't multiplex them onto the same traces.
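The aggregate-bandwidth point above can be sketched with some quick arithmetic (the GB/s figures below are made up purely for illustration):

```python
# Aggregate bandwidth of parallel memory channels, assuming each stick
# gets its own dedicated traces (no multiplexing onto a shared bus).
def aggregate_bandwidth_gbps(sticks: int, per_stick_gbps: float) -> float:
    return sticks * per_stick_gbps

# One fast stick vs. eight slower sticks in parallel:
one_fast = aggregate_bandwidth_gbps(1, 64.0)   # 1 x 64 GB/s = 64 GB/s
eight_slow = aggregate_bandwidth_gbps(8, 8.0)  # 8 x  8 GB/s = 64 GB/s
assert one_fast == eight_slow
```

The catch, as the parent comment notes, is that real DIMM slots share a channel, so adding sticks rarely scales bandwidth this cleanly.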
It's not so simple... The way GPU architecture works, the GPU needs as-fast-as-possible access to its VRAM. The "overflow memory" for a GPU is your PC's RAM. Adding a secondary memory controller and commodity DRAM to the card itself would only provide a trivial improvement over just using the PC's RAM.
Point of fact: some GPUs don't even use all the PCI Express lanes available to them! Mid-range cards like Nvidia's 4060 Ti only run 8 lanes, which is why some newer GPUs are being offered with M.2 slots so you can add an SSD on the leftover lanes (https://press.asus.com/news/asus-dual-geforce-rtx-4060-ti-ss... ).
GPUs have memory hierarchies too. A 4090 has about 16MB of L1 cache and 72MB of L2 cache, followed by the 24GB of GDDR6X RAM, followed by host RAM that can be accessed via PCIe.
The issue is that GPUs are massively parallel. A 4090 has 128 streaming multiprocessors, each executing 128 "threads" or "lanes" in parallel. If each "thread" works on a different part of memory, that leaves you with 1kB of L1 cache per thread and 4.5kB of L2 cache each. Every clock cycle you might be issuing thousands of requests to your memory controller for cache misses and prefetching. That's why you want insanely fast RAM.
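The per-thread figures above work out like this (a quick sketch using the 4090 numbers from the comment):

```python
# Per-"thread" cache share on a 4090-class GPU, using the figures above.
SMS = 128            # streaming multiprocessors
LANES_PER_SM = 128   # "threads"/"lanes" executing in parallel per SM
L1_BYTES = 16 * 1024 * 1024   # ~16 MB of L1 across the whole chip
L2_BYTES = 72 * 1024 * 1024   # ~72 MB of L2

threads = SMS * LANES_PER_SM          # 16,384 concurrent lanes
l1_per_thread = L1_BYTES / threads    # 1024 bytes -> ~1 kB each
l2_per_thread = L2_BYTES / threads    # 4608 bytes -> ~4.5 kB each
print(l1_per_thread, l2_per_thread)   # 1024.0 4608.0
```

In practice threads in a warp usually share cache lines rather than each touching disjoint memory, so this is the worst case, but it shows why the raw DRAM bandwidth matters so much.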
You can write CUDA code that directly accesses your host memory as a layer beyond that, but usually you want to transfer that data in bigger chunks. You probably could make a card that adds DDR4 slots as an additional level of the hierarchy. It's the kind of weird stuff Intel might do (the Xeon Phi had some interesting memory-layout ideas).
Isn't part of the problem that the connectors add too much inductance, making the lines difficult to drive at high speed? Similar issue to distance I suppose but more severe.
I'm personally working on specialized system-monitor software to address deficiencies in a popular enterprise IWMS. It's aimed at companies that don't want to splurge on Splunk but still need specific sysadmin controls and metrics. There's a forwarder and a backend API, all completely self-hosted. I'm using this project to build some expertise in Go :)
I paid $590 for the minisforum UM700 in 2021 [0], now worth ~$215. Ryzen 7 3750H. Upgraded the 16GB RAM to 32GB, which was another 50 bucks or so, I can't remember exactly.
Been my daily driver desktop since 2021.
I play Victoria 3, Cities: Skylines, and Valheim. And I develop on it: both Windows programs and Linux stuff, SSH-ed into my beefier dev servers.
I've also done some light video recording and editing.
I've run out of disk space now so I think I'll grab an external drive. But I'd buy one of these again.
Pros:
- Great variety of foods around here (I love Hakka!)
- Relatively "affordable" in that you can afford a semi or detached home (though it'll be 100 years old with a smaller backyard)
- Great walkability to small markets, restaurants, breweries and Lake Ontario
- Sense of community
- Pretty good transit for North America (a 24-hour streetcar, a subway line, and local rail within a 5-minute walk)
- Bike lanes becoming more common
Cons:
- Generally cold climate means a short growing season for my victory garden
- Our salaries lag our southern neighbours
- The health care system is showing signs of failing under pressure
How was the effort to migrate from DRF to Django-Ninja? I saw Django-Ninja mentioned in another post and am thinking of switching one of my projects over from DRF. After skimming their documentation, it looks very pleasant and simple and perfect for my use case.
Going from 0 → 1 migrated route was "medium difficulty", I'd say — there was a non-trivial amount of scaffolding I had to build out around auth, error codes, and so on, and django-ninja's docs are still a little lackluster when it comes to edge cases.
Once I had that one migrated route, though, the rest were very simple. (And I am very happy with the migration overall!)