Exactly how much physical memory is installed? (toroid.org)
73 points by todsacerdoti on Nov 29, 2020 | 42 comments

The kernel doesn't necessarily know how much physical memory is installed. It doesn't need to. It only needs to know how much physical memory it can use, and where it lives in physical address space. On PCs, that is provided by the firmware at boot time via the e820 map (or via a UEFI method these days, and there is also the multiboot standard which GRUB will use to pass the info along to the booted kernel).
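For illustration, here is a minimal Python sketch of how the firmware-provided map could be totalled from the kernel boot log on a PC. It assumes the `BIOS-e820: [mem 0x...-0x...] type` line format that x86 Linux prints at boot; the sample log below is hypothetical, not taken from a real machine, and the function name is made up for this example.

```python
import re

# Matches kernel boot-log lines such as:
#   BIOS-e820: [mem 0x0000000000100000-0x00000000bffdffff] usable
E820_LINE = re.compile(
    r"BIOS-e820: \[mem 0x([0-9a-fA-F]+)-0x([0-9a-fA-F]+)\] (.+)$"
)


def usable_bytes_from_e820(log_text):
    """Sum the sizes of all ranges the firmware marked 'usable'."""
    total = 0
    for line in log_text.splitlines():
        match = E820_LINE.search(line)
        if match is None:
            continue
        start, end, kind = match.groups()
        if kind.strip() == "usable":
            # Both ends of the range are inclusive, hence the +1.
            total += int(end, 16) - int(start, 16) + 1
    return total


# Hypothetical sample, for illustration only.
SAMPLE_LOG = """\
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000bffdffff] usable
[    0.000000] BIOS-e820: [mem 0x00000000bffe0000-0x00000000bfffffff] reserved
"""

print(usable_bytes_from_e820(SAMPLE_LOG))
```

Note that this totals what the kernel was told it may use, which is exactly the distinction made above: it says nothing definite about what is physically installed.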

Therefore, there is no portable way to obtain that information that is guaranteed to work, because some platforms might not even provide it at all. Only if the kernel has/needs drivers that interact with the physical memory controller directly, or gets platform metadata such as via DMI, is this information readily available.

Indeed. Also: what if you're inside virtualisation? The platform probably shouldn't tell you that there is more memory than allocated to your VM.

Building systems that rely on looking round the curtain to the hardware has a habit of breaking when new hardware appears that does something surprising.

That would be, all kernels. They all program the physical memory controller to suit their needs. They would all know how much memory is actually provisioned. It's a matter of a uniform way of exposing this information.

> That would be, all kernels. They all program the physical memory controller to suit their needs.

None of the operating system kernels I know of (including Linux, all the BSDs, and Microsoft Windows) program the physical memory controller; it's always done by the platform firmware while running from ROM or cache-as-RAM, before the operating system kernel is loaded into RAM.

Linux: https://www.kernel.org/doc/html/latest/driver-api/memory-dev...

Windows is harder to find (not open source) but issue tracking indicates it has one: https://www.intel.com/content/www/us/en/support/articles/000...

Of course the bootloaders all set up RAM. They have to, to load the operating system. Then the OS sets it up again, to suit itself.

According to https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin... these drivers are only used in some embedded systems. I looked at the kernel configuration for the distribution I'm using (/boot/config-`uname -r`), and CONFIG_MEMORY is not set, so none of these drivers are even available. That is: at least for desktop Linux, the firmware (not the bootloader) is the only one which sets up the physical memory controller (and actually, the only one which knows how to set up the physical memory controller; according to https://en.wikipedia.org/wiki/Coreboot#Initializing_DRAM there's no public information on how to do so for modern desktop platforms).
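A small Python sketch of the config check described above, for anyone who wants to repeat it. The helper name is hypothetical, and it assumes the usual kernel config text format (`CONFIG_FOO=y` for set options, `# CONFIG_FOO is not set` for unset ones).

```python
def config_option_state(config_text, option):
    """Return the value after 'OPTION=' (e.g. 'y' or 'm'), or None if unset."""
    prefix = option + "="
    for line in config_text.splitlines():
        line = line.strip()
        # Unset options appear as comments, so they never match the prefix.
        if line.startswith(prefix):
            return line[len(prefix):]
    return None


# Hypothetical config excerpt, for illustration only.
SAMPLE_CONFIG = """\
# CONFIG_MEMORY is not set
CONFIG_MEMORY_HOTPLUG=y
"""

print(config_option_state(SAMPLE_CONFIG, "CONFIG_MEMORY"))
print(config_option_state(SAMPLE_CONFIG, "CONFIG_MEMORY_HOTPLUG"))
```

On a real system the text would come from a file such as `/boot/config-` followed by the output of `uname -r`, assuming the distribution installs one there.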

The critical issue with such design choices is, how do you fix/update such code? If there's no driver and no kernel module, it gets gnarly (and beyond the average consumer's skills).

btw I work entirely in the embedded domain, thus my view on the subject.

How much memory is reserved by UEFI and why? I would hope it's a negligible amount.

If you have certain GPUs, it can be multiple gigs stolen as that's set aside for video memory

Do you mean physical address space or actual physical memory?

Actual physical memory. I actually have a hack in order to reclaim and use some of this on my laptop.

How can you tell how much video memory is allocated & reserved by UEFI? I’d love to read more about this (and the hacks you used to get around it).

I just noticed the difference between what the manufacturer advertised and what the BIOS provided. Maybe I'll do a write-up of what I did at some point, but it's highly non-portable, and somewhat unsafe. Generally, the BIOS is supposed to allow the amount of video memory taken to be set along with other BIOS options, but there's no option in mine for whatever reason.

Physical memory for integrated GPUs to use as VRAM.

> Interestingly, the subdirectories are not memory0 to memory127, but memory0 to memory20 and memory32 to memory138. There must be some explanation for the 11-block hole in the numbers, but I don't know what it is.

The obvious explanation is the PCI window. Looking at the memoryN/valid_zones file, I can see on my machine that the ones below memory32 have DMA32, while the ones above have Normal; DMA32 is the zone with memory below 4G, while Normal is the zone with memory above 4G. There's a window at the top of the 32-bit physical address space where most PCI devices are mapped; you can see these mappings by looking at /proc/iomem (as root, since the addresses are masked otherwise).

That is, the physical memory which should be at that hole was instead remapped to somewhere else (otherwise it would have been wasted); the directories above memory127 probably correspond to that remapped memory.
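The arithmetic can be checked with a short Python sketch. It assumes 128 MiB memory blocks (the size mentioned elsewhere in this thread; the real value is in `/sys/devices/system/memory/block_size_bytes`) and that block N starts at N times the block size. The function name is made up for the example.

```python
MIB = 1024 * 1024


def block_range(index, block_size=128 * MIB):
    """Return (start, end_exclusive) physical addresses covered by memory<index>."""
    start = index * block_size
    return start, start + block_size


# The missing directories are memory21 through memory31.
hole_start, _ = block_range(21)
_, hole_end = block_range(31)
print(hex(hole_start), hex(hole_end))

# memory32 is the first block above the hole.
print(block_range(32)[0] == 4 * 1024 * MIB)
```

Under those assumptions the hole runs from 0xa8000000 up to the 4 GiB boundary, which fits the PCI window explanation. The 11 missing blocks (21 to 31) are also matched by the 11 extra blocks above memory127 (128 to 138).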

Yes, this is probably it. More specifically, any 128MB areas in the physical address space that do not have "System RAM" will not get a memoryN directory. You can view all the "System RAM" areas in /proc/iomem (if root, otherwise they're all 0's).
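A rough Python sketch of totalling the "System RAM" areas, assuming the usual `start-end : label` line format of `/proc/iomem` with inclusive hexadecimal ranges. The sample text is hypothetical and the function name is invented for the example.

```python
def system_ram_bytes(iomem_text):
    """Sum the sizes of all ranges labelled 'System RAM'."""
    total = 0
    for line in iomem_text.splitlines():
        span, sep, label = line.partition(" : ")
        if not sep or label.strip() != "System RAM":
            continue  # nested entries such as 'Kernel code' are skipped here
        start, _, end = span.strip().partition("-")
        # Ranges are inclusive on both ends, hence the +1.
        total += int(end, 16) - int(start, 16) + 1
    return total


# Hypothetical sample, for illustration only.
SAMPLE_IOMEM = """\
00001000-0009ffff : System RAM
000a0000-000bffff : PCI Bus 0000:00
00100000-bfffffff : System RAM
  01000000-01ffffff : Kernel code
c0000000-ffffffff : PCI Bus 0000:00
100000000-13fffffff : System RAM
"""

print(system_ram_bytes(SAMPLE_IOMEM))
```

When the file is read without root, every address is zero, so the result is meaningless; the sketch does not try to detect that case.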

The most relevant kernel code is here:


I am one of the original authors listed at the top of that file.

Instead of grepping /sys/devices/system/memory/, you can use lsmem(1) from util-linux.
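If you would rather do it from a script, here is a minimal Python sketch that counts online blocks in the same sysfs directory. It is not a description of how lsmem is implemented; it only assumes that `block_size_bytes` holds a hexadecimal number without a `0x` prefix and that each `memoryN/state` file contains `online` or `offline`.

```python
import os
import re


def online_memory_bytes(base="/sys/devices/system/memory"):
    """Return block size multiplied by the number of online memoryN blocks."""
    with open(os.path.join(base, "block_size_bytes")) as f:
        block_size = int(f.read().strip(), 16)  # hex, no 0x prefix
    online = 0
    for name in os.listdir(base):
        if not re.fullmatch(r"memory\d+", name):
            continue  # skip other entries in the directory
        with open(os.path.join(base, name, "state")) as f:
            if f.read().strip() == "online":
                online += 1
    return online * block_size
```

The `base` parameter is there so the function can be pointed at a test directory; on a machine without memory block support the default path may not exist at all.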

I'd have to check how it does/did it, but in the past I used the NHC (node health check) system for monitoring HPC cluster nodes. In this context the relevant check was for DIMMs dropping out. What it read needed a fudge factor, because the value changed at least once with a BIOS update. You couldn't just check the number was the same as when you initially set it up.

Experience with several hundred-node clusters is that the hardware doesn't obey the conservation law many people assume. It's not that uncommon to find cores and DIMMs disappearing from "sight" -- or being absent initially and not spotted by the vendor. (I've yet to see an HPC cluster vendor that understands real system management.)

Well... sorry for the tangent, but this blew my mind (totally knew about pcie and disks):

> On systems with hot-swappable RAM, memoryN/state might be “offline”,

Anyone have any information on this or real world use cases not better suited for having a cluster or multiple redundant machines? Are these still prevalent in some markets in 2020? Looks like, according to really bad results in teh googs, CPUs are hot-swappable too? Is it an extra socket on the motherboard or like some sort of franken dual-board setup?

I'm more curious than anything how systems like that are used or what their niche is... I'm sure they're insanely impractical for almost any conceivable purpose but what are those few cases they aren't? What's it like swapping out a CPU or DIMM in those? Just a lot of holding onto your butt and praying?

Hot-swappable RAM was and is a standard feature on mainframe equipment (along with often hot-swappable CPUs), and was also a common feature on higher-end server equipment through the '90s (for example Sun). It also appeared of course in some of the more in-between architectures such as HP Tandem/NonStop, which also featured hot-swappable CPUs. It was not unusual on these machines to have flashing status lights which would lead a technician to which drawer to open and which module in that drawer to swap out.

There wasn't a particular niche for this equipment; it was just intended to be repairable while in service. At the time, it was typical to spend a larger dollar amount on equipment which was manufactured for very high serviceability. A combination of improvement in the actual reliability of hardware, increasing cost pressure, and, probably as the lesser factor, a shift towards more horizontal architectures (where redundancy was across machines instead of across components in machines) has made this a lot less common, although still not at all unheard of on high-end equipment.

I've upgraded both RAM and CPUs on IBM RS/6000 machines (pSeries these days I believe) while the system was running. Very nice capability to minimize downtime. The procedure is 100% supported. You start by telling the OS (AIX in the cases where I did this) that you are going to remove these RAM modules or these CPUs and the OS lets you know when it has moved everything off of those. Then you remove them and plug in the replacement, then tell the OS that the new hardware can be used. Very smooth process.

Though I think on Linux it was added for VMs, so you can scale those up and down without rebooting.

Banks and airline reservation systems tend to use vertically scaled "mainframe" architectures. In this model you need very high availability so hot-swappability is important. Linux even has hot-swap CPU support for systems like this [1]. Some IBM systems have dozens of hot-swappable hardware elements. Power supplies, CPUs, network cards, RAM, disks.

[1] https://www.kernel.org/doc/html/latest/core-api/cpu_hotplug....

Taking memory offline is pretty common, that’s part of how “chipkill” ECC works. Servers that are even slightly fancy have mirroring support with a spare for every module. HPE ProLiant servers have this feature and they’re just rack mounted PCs.

Actually removing and replacing these parts while the system runs requires electrical and mechanical features that most machines lack.

I recently upgraded an old notebook. It had 2GB of RAM, and dmidecode assured me it would support 4GB. Although dmidecode said that both slots would support 2GB, it turned out only one of them does.

I searched around a bit more and I now suspect the 2GB slot might actually support 4GB, but I don't want to spend money on this bet (it's an old machine with a Core 2 Duo processor).

So now I'm left wondering how dmidecode can be so wrong.

DMI information is supplied by BIOS - and those can be very, very wrong. More often than not the only DMI/ACPI information correctness the manufacturer cares about is "Windows doesn't complain".

Are you sure it's dmidecode that's wrong, and not the DMI itself? dmidecode just parses what's given to it by the BIOS.

Yes, I agree. Still, you can't trust the result.
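For reference, a Python sketch of totalling what the DMI tables claim is installed. It assumes the `Size: <number> <unit>` lines that dmidecode prints for memory device records, with `Size: No Module Installed` for empty slots; the sample text is hypothetical. As discussed above, the result is only as trustworthy as what the BIOS supplied.

```python
import re

UNIT = {"kB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}

# Anchored at the start of the line so that keys such as
# 'Volatile Size:' are not counted a second time.
SIZE_LINE = re.compile(r"^\s*Size: (\d+) (kB|MB|GB|TB)\s*$")


def installed_bytes(dmidecode_text):
    """Sum the 'Size:' fields of populated memory device records."""
    total = 0
    for line in dmidecode_text.splitlines():
        match = SIZE_LINE.match(line)
        if match:  # 'No Module Installed' does not match and is skipped
            total += int(match.group(1)) * UNIT[match.group(2)]
    return total


# Hypothetical excerpt, for illustration only.
SAMPLE_DMI = """\
Memory Device
\tSize: 2048 MB
\tForm Factor: SODIMM
Memory Device
\tSize: No Module Installed
\tForm Factor: SODIMM
"""

print(installed_bytes(SAMPLE_DMI))
```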

Some versions of the Core 2 Duo can only support up to 3GB of memory. It’s possible that your CPU is the limiting factor, not your memory slots.

> I've often wanted to calculate the exact amount of physical memory installed in a system running Linux


The obvious reason to me would be to make sure all your DIMMs are working.

You're so, so, much better off booting into MemTest86+ when in doubt. "Reported existing by BIOS" definitely doesn't mean "working".

Shouldn't the code that takes care of this be part of the kernel?

Is there any syscall to do the equivalent but from C?

e.g. If you were pre-allocating hash tables for a caching service, it would be convenient if the config could stipulate 80% of the system memory instead of the user having to specify the exact amount.
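One possible approach, sketched in Python: glibc's `sysconf(3)` accepts `_SC_PHYS_PAGES` and `_SC_PAGESIZE`, and Python exposes the same values through `os.sysconf`. This assumes a Linux/glibc-like system, and the value reflects the memory the kernel can see rather than what is physically installed. The function names are made up for the example.

```python
import os


def total_ram_bytes():
    """Memory visible to the kernel, as reported by sysconf."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")


def budget_from_sysconf(fraction=0.8):
    """Return the given fraction of visible memory, in bytes."""
    return int(total_ram_bytes() * fraction)


print(budget_from_sysconf(0.8))
```

In C the equivalent would be two `sysconf()` calls multiplied together.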

Use /proc/meminfo. What could a caching service possibly care about the bit of memory that is reserved for system use? It's not getting its hands on those pages under any circumstances.
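A minimal Python sketch of that, assuming the usual `MemTotal:   <n> kB` line format. The sample text is hypothetical, and the function names are invented for the example.

```python
import re


def meminfo_kib(text, field="MemTotal"):
    """Return the value of one /proc/meminfo field, in KiB."""
    pattern = rf"^{re.escape(field)}:\s+(\d+) kB$"
    match = re.search(pattern, text, re.MULTILINE)
    if match is None:
        raise KeyError(field)
    return int(match.group(1))


def budget_from_meminfo(text, fraction=0.8):
    """Return the given fraction of MemTotal, in bytes."""
    return int(meminfo_kib(text) * 1024 * fraction)


# Hypothetical sample, for illustration only.
SAMPLE_MEMINFO = """\
MemTotal:       16303428 kB
MemFree:         1234567 kB
MemAvailable:    9876543 kB
"""

print(budget_from_meminfo(SAMPLE_MEMINFO))
```

On a real system the text would come from reading `/proc/meminfo`. Whether a cache should be sized from `MemTotal` or from `MemAvailable` is a separate design question that this sketch does not settle.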

There is no interface for that information because at the level of abstraction at which userspace applications operate, that information has no meaning.

As a comment above already mentioned, you should not try to work out the size of the physical memory installed, because it often has no correlation to the amount of memory available to the kernel (and to the user).

None of the mentioned alternatives work on a Raspberry Pi 4 (running an armv7l kernel), including lsmem.

The "dmidecode" one-liner does not work on chromebook C710.

dmidecode is a tool to parse legacy BIOS information tables, and one wouldn't expect it to work on chromebooks.

I get it, it is a reflashed chromebook, with seabios.
