So the deal with AWE (Address Windowing Extensions) is that it lets 32-bit apps access memory above 4GB by essentially doing manual page mapping. You allocate physical pages, then map/unmap them into your 32-bit address space as needed. It's like having a tiny window you keep sliding around over a bigger picture.
The problem is that llama.cpp would need to be substantially rewritten to use it. We're talking:
VirtualAlloc(..., MEM_RESERVE | MEM_PHYSICAL, ...)  // reserve the window
AllocateUserPhysicalPages()  // needs SeLockMemoryPrivilege
MapUserPhysicalPages()       // map a chunk into the window
// do your tensor ops on this chunk
MapUserPhysicalPages(..., NULL)  // passing NULL unmaps; there's no UnmapUserPhysicalPages()
// slide the window, repeat
You'd basically be implementing your own memory manager that swaps chunks of the model weights in and out of your addressable space. It's not impossible, but it's a pretty gnarly undertaking for what amounts to "running AI on a museum piece."
That would be a real issue. I vaguely recall methods to work around this - various mappings, PAE (Intel's Physical Address Extension for addressing memory above 4GB), etc: https://learn.microsoft.com/en-us/windows/win32/memory/addre...
Maybe unrealistic :( I doubt this is drop-in code.