Augustin Cavalier 08c53ca964 kernel/vm: Use more than one object_cache for the page mappings.
Every time a page is mapped into an area on fault, we have to
allocate a mapping object for it. While the object_cache
does have per-CPU depots, these depots only store a limited
number of items, and once they run out the object_cache's lock
must be acquired.

So, to reduce lock contention on SMP systems, create a number
of object caches corresponding to the nearest power of 2
that is equal or smaller than the count of CPUs. (We already
allocate dozens of object caches for the block allocator
no matter how many CPUs there are, so a few more depending
on CPU count shouldn't impact memory use too much. Besides,
the object_caches are wired into the low_resource system.)

This significantly reduces lock contention on SMP systems.
Same benchmark setup as yesterday (compile mime_db and relink
HaikuDepot, VMware, -j4), before:
real    0m16.981s
user    0m14.357s
sys     0m6.060s

after:
real    0m14.522s
user    0m14.194s
sys     0m4.337s

And the page_mappings object_cache locks went from having 200,000+ waits
and ~14 seconds waiting time (across all threads) down to ~900 (yes,
that's not a typo) and ~0.05s wait time (though these numbers were captured
in conjunction with the following commit.)
2024-07-24 16:31:20 -04:00
..
2023-10-06 20:38:24 +00:00
2022-02-18 21:27:06 +00:00
2020-06-13 23:24:27 +02:00
2022-08-17 19:23:51 +00:00
2024-06-10 16:34:25 +00:00
2023-06-29 21:04:40 -04:00
2021-08-27 19:02:20 -04:00
2021-06-19 18:09:25 +00:00