Adrian Chadd, known for his extensive WiFi work, writes about his findings on NUMA (non-uniform memory access) in FreeBSD.
I just committed “NUMA” to FreeBSD. Well, no, I didn’t. I did almost no actual NUMA-y work in FreeBSD. I just exposed the NUMA stuff that already existed in FreeBSD and re-enabled it.
FreeBSD-9 introduced basic NUMA awareness in the physical allocator (sys/vm/vm_phys.c). It implemented first-touch page allocation, and then fell back to searching through the domains, round-robin style. It wasn’t perfect, but for some workloads it was apparently okay. It had some shortcomings, though – it wasn’t configurable, UMA and other subsystems didn’t know about NUMA domains, and the scheduler really didn’t know about NUMA domains. So I’m sure there are plenty of workloads which it didn’t work for.
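To make the first-touch-then-round-robin idea concrete, here is a minimal sketch in C. It is a toy simulation, not the code in sys/vm/vm_phys.c: the names (numa_sim, numa_sim_alloc_page, NDOMAINS), the three-domain layout and the per-domain free page counters are all made up for illustration, and the real fallback order may well differ.

```c
#include <assert.h>
#include <stddef.h>

#define NDOMAINS 3	/* hypothetical: three memory domains */

/* Toy model of per-domain free memory; not FreeBSD's real structures. */
struct numa_sim {
	size_t free_pages[NDOMAINS];	/* free pages left in each domain */
	int rr_next;			/* where the next fallback search starts */
};

/*
 * Pick a domain for a page first touched by a thread running in
 * local_domain. Prefer the local domain (first touch); if it has no
 * free pages, search the domains round-robin starting at rr_next.
 * Returns the chosen domain, or -1 if every domain is exhausted.
 */
int
numa_sim_alloc_page(struct numa_sim *s, int local_domain)
{
	assert(local_domain >= 0 && local_domain < NDOMAINS);

	/* First touch: allocate from the domain the thread is running in. */
	if (s->free_pages[local_domain] > 0) {
		s->free_pages[local_domain]--;
		return (local_domain);
	}

	/* Fallback: walk the domains round-robin from the saved cursor. */
	for (int i = 0; i < NDOMAINS; i++) {
		int d = (s->rr_next + i) % NDOMAINS;

		if (s->free_pages[d] > 0) {
			s->free_pages[d]--;
			/* Advance the cursor so the next fallback moves on. */
			s->rr_next = (d + 1) % NDOMAINS;
			return (d);
		}
	}
	return (-1);
}
```

With one free page in domain 0 and two each in domains 1 and 2, repeated allocations from a thread in domain 0 land in domain 0 first and then alternate between the remote domains. That is also why an unpinned thread can end up with its memory scattered across sockets.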
That was all ripped out before FreeBSD-10. FreeBSD-10 NUMA just implements round-robin physical page allocation. It still tracks the per-domain physical memory regions, but it doesn’t do any kind of NUMA-aware allocation. From what I can gather, it was removed until something ‘better’ could land.
However, nothing has landed (yet). So I decided I’d take a look into it. I found that for a lot of simple workloads (i.e., where you’re doing lots of anonymous memory allocation – e.g., you’re doing math crunching) the FreeBSD-9 model works fine. It’s also a perfectly good starting point for experimenting.
So all my NUMA work in -HEAD does is provide an API to exactly the above. It doesn’t teach the kernel APIs about domain-aware allocations – there’s currently no way to ask for memory from a specific domain when calling UMA, or contigmalloc, etc. The scheduler doesn’t know about NUMA, so threads/processes will migrate off-socket very quickly unless you explicitly limit things. Devices don’t yet do NUMA-local work – the ACPI code is in there to enumerate which NUMA domain they’re in, but it’s not used anywhere just yet.
Then what is it good for?