
/*
* Copyright 2009-2010, Ingo Weinhold, ingo_weinhold@gmx.de.
* Copyright 2002-2009, Axel Dörfler, axeld@pinc-software.de.
* Distributed under the terms of the MIT License.
*
* Copyright 2001-2002, Travis Geiselbrecht. All rights reserved.
* Distributed under the terms of the NewOS License.
*/
#ifndef _KERNEL_VM_VM_AREA_H
#define _KERNEL_VM_VM_AREA_H
#include <vm_defs.h>
#include <lock.h>
#include <util/DoublyLinkedList.h>
#include <util/SinglyLinkedList.h>
#include <util/AVLTree.h>
#include <vm/vm_types.h>
struct VMAddressSpace;
struct VMCache;
struct VMKernelAddressSpace;
struct VMUserAddressSpace;
struct VMAreaUnwiredWaiter
: public DoublyLinkedListLinkImpl<VMAreaUnwiredWaiter> {
VMArea* area;
addr_t base;
size_t size;
ConditionVariable condition;
ConditionVariableEntry waitEntry;
};
typedef DoublyLinkedList<VMAreaUnwiredWaiter> VMAreaUnwiredWaiterList;
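
// A VMAreaUnwiredWaiter is normally a stack object used by code that must
// not proceed while (part of) an area is wired. A sketch of the pattern the
// wait_if_*_is_wired() helpers in vm.cpp appear to follow (locking and retry
// logic omitted; the locker name is hypothetical):
//
//	VMAreaUnwiredWaiter waiter;
//	if (area->AddWaiterIfWired(&waiter, base, size)) {
//		addressSpaceLocker.Unlock();
//		waiter.waitEntry.Wait();
//			// woken up once the intersecting range has been unwired
//		// relock and restart the interrupted operation
//	}
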
struct VMAreaWiredRange : SinglyLinkedListLinkImpl<VMAreaWiredRange> {
VMArea* area;
addr_t base;
size_t size;
bool writable;
bool implicit; // range created automatically
VMAreaUnwiredWaiterList waiters;
VMAreaWiredRange()
{
}
VMAreaWiredRange(addr_t base, size_t size, bool writable, bool implicit)
:
area(NULL),
base(base),
size(size),
writable(writable),
implicit(implicit)
{
}
void SetTo(addr_t base, size_t size, bool writable, bool implicit)
{
this->area = NULL;
this->base = base;
this->size = size;
this->writable = writable;
this->implicit = implicit;
}
bool IntersectsWith(addr_t base, size_t size) const
{
return this->base + this->size - 1 >= base
&& base + size - 1 >= this->base;
}
};
typedef SinglyLinkedList<VMAreaWiredRange> VMAreaWiredRangeList;
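
// Life cycle of a wired range, roughly as used by lock_memory_etc() and
// unlock_memory_etc() in vm.cpp (a sketch; allocation details, error
// handling, and the required address space/cache locking are omitted):
//
//	VMAreaWiredRange* range
//		= new(std::nothrow) VMAreaWiredRange(lockBase, lockSize,
//			writable, false);
//	area->Wire(range);		// attach while holding the proper locks
//	// ... pages in [lockBase, lockBase + lockSize) stay mapped ...
//	area->Unwire(range);	// detaches and wakes matching waiters
//	delete range;
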
struct VMPageWiringInfo {
VMAreaWiredRange range;
phys_addr_t physicalAddress;
// the actual physical address corresponding to
// the virtual address passed to vm_wire_page()
// (i.e. with in-page offset)
vm_page* page;
};
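
// VMPageWiringInfo bundles what vm_wire_page() needs to remember about a
// single wired page so that vm_unwire_page() can later undo the wiring; the
// embedded range covers exactly that one page.
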
struct VMAreasTreeNode {
AVLTreeNode tree_node;
};
struct VMArea : private VMAreasTreeNode {
public:
enum {
// AddWaiterIfWired() flags
IGNORE_WRITE_WIRED_RANGES = 0x01, // ignore existing ranges that
// wire for writing
};
public:
area_id id;
char name[B_OS_NAME_LENGTH];
uint32 protection;
uint32 protection_max;
uint16 wiring;
private:
uint16 memory_type; // stored right-shifted by MEMORY_TYPE_SHIFT
public:
VMCache* cache;
off_t cache_offset;
uint32 cache_type;
VMAreaMappings mappings;
uint8* page_protections;
struct VMAddressSpace* address_space;
private:
DoublyLinkedListLink<VMArea> fCacheLink;
public:
typedef DoublyLinkedList<VMArea,
DoublyLinkedListMemberGetLink<VMArea, &VMArea::fCacheLink> > CacheList;
addr_t Base() const { return fBase; }
size_t Size() const { return fSize; }
inline uint32 MemoryType() const;
inline void SetMemoryType(uint32 memoryType);
bool ContainsAddress(addr_t address) const
{ return address >= fBase
&& address <= fBase + (fSize - 1); }
bool IsWired() const
{ return !fWiredRanges.IsEmpty(); }
bool IsWired(addr_t base, size_t size) const;
void Wire(VMAreaWiredRange* range);
void Unwire(VMAreaWiredRange* range);
VMAreaWiredRange* Unwire(addr_t base, size_t size, bool writable);
bool AddWaiterIfWired(VMAreaUnwiredWaiter* waiter);
bool AddWaiterIfWired(VMAreaUnwiredWaiter* waiter,
addr_t base, size_t size, uint32 flags = 0);
protected:
VMArea(VMAddressSpace* addressSpace,
uint32 wiring, uint32 protection);
~VMArea();
status_t Init(const char* name, uint32 allocationFlags);
protected:
friend struct VMAreasTreeDefinition;
friend struct VMKernelAddressSpace;
friend struct VMUserAddressSpace;
protected:
void SetBase(addr_t base) { fBase = base; }
void SetSize(size_t size) { fSize = size; }
protected:
addr_t fBase;
size_t fSize;
VMAreaWiredRangeList fWiredRanges;
};
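
// Example queries against an area once it has been looked up and its address
// space is read-locked (a sketch; "someAddress" is a hypothetical value):
//
//	if (area->ContainsAddress(someAddress)
//		&& (area->protection & B_WRITE_AREA) != 0
//		&& !area->IsWired(someAddress, 1)) {
//		// the byte at someAddress is user-writable and not wired
//	}
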
struct VMAreasTreeDefinition {
typedef area_id Key;
typedef VMArea Value;
AVLTreeNode* GetAVLTreeNode(VMArea* value) const
{
return &value->tree_node;
}
VMArea* GetValue(AVLTreeNode* node) const
{
const addr_t vmTreeNodeAddr = (addr_t)node
- offsetof(VMAreasTreeNode, tree_node);
VMAreasTreeNode* vmTreeNode =
reinterpret_cast<VMAreasTreeNode*>(vmTreeNodeAddr);
return static_cast<VMArea*>(vmTreeNode);
}
int Compare(area_id key, const VMArea* value) const
{
const area_id valueId = value->id;
if (valueId == key)
return 0;
return key < valueId ? -1 : 1;
}
int Compare(const VMArea* a, const VMArea* b) const
{
return Compare(a->id, b);
}
};
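
// Note: VMAreasTreeNode is a private base of VMArea, so GetValue() first
// recovers the VMAreasTreeNode from the embedded AVLTreeNode via offsetof()
// and only then downcasts; the static_cast across the private base works
// here because VMAreasTreeDefinition is declared a friend of VMArea.
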
typedef AVLTree<VMAreasTreeDefinition> VMAreasTree;
struct VMAreas {
static status_t Init();
static status_t ReadLock()
{ return rw_lock_read_lock(&sLock); }
static void ReadUnlock()
{ rw_lock_read_unlock(&sLock); }
static status_t WriteLock()
{ return rw_lock_write_lock(&sLock); }
static void WriteUnlock()
{ rw_lock_write_unlock(&sLock); }
static VMArea* LookupLocked(area_id id)
{ return sTree.Find(id); }
static VMArea* Lookup(area_id id);
static area_id Find(const char* name);
static status_t Insert(VMArea* area);
static void Remove(VMArea* area);
static VMAreasTree::Iterator GetIterator()
{ return sTree.GetIterator(); }
private:
static rw_lock sLock;
static VMAreasTree sTree;
};
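
// Typical lookup by ID (a sketch; the real callers live in vm.cpp).
// Lookup() presumably wraps the same read-lock/find/unlock sequence:
//
//	VMAreas::ReadLock();
//	VMArea* area = VMAreas::LookupLocked(id);
//	if (area != NULL) {
//		// use "area" only while it is known to stay alive, e.g. while
//		// the global lock or its address space lock is held
//	}
//	VMAreas::ReadUnlock();
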
uint32
VMArea::MemoryType() const
{
return (uint32)memory_type << MEMORY_TYPE_SHIFT;
}
void
VMArea::SetMemoryType(uint32 memoryType)
{
memory_type = memoryType >> MEMORY_TYPE_SHIFT;
}
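
// Memory type round trip: the low MEMORY_TYPE_SHIFT bits of a full memory
// type value are assumed to be zero, so storing it right-shifted in a uint16
// is lossless ("fullType" is a hypothetical value):
//
//	area->SetMemoryType(fullType);			// stores fullType >> MEMORY_TYPE_SHIFT
//	uint32 sameType = area->MemoryType();	// reconstructs fullType
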
#endif // _KERNEL_VM_VM_AREA_H