This reverts commit 1db0961121bdb8145fe368fdca0205ddef354729.
It turns out the comment is not obsolete; what it refers to isn't
PAE systems but true 32-bit ones. I'm not sure we should use
64-bit cache offsets even there, but that's a decision for another
time.
B_PATH_NAME_LENGTH == PATH_MAX, and PATH_MAX is inclusive of the final
NULL terminator, so we don't need a + 1 here.
The original KPath default was to not use + 1, but that was changed in
42e3c6f97874f37701385e7027c77e4366d7c450 due to all the consumers that did.
But all those consumers are wrong, it appears; they should just be
using the default length instead. So now we do that.
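For consumers, the change looks roughly like this (a sketch, assuming
KPath's default buffer size is B_PATH_NAME_LENGTH):

    // Before (redundant; PATH_MAX already counts the NULL terminator):
    //     KPath path(B_PATH_NAME_LENGTH + 1);
    // After: the default length already suffices.
    KPath path;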
Previously it was not initialized until "post-VM", but there are
a number of ways VM initialization can go wrong that it would
be nice to know about without needing a serial port.
On architectures which map the whole of physical memory into the
kernel address space (x86_64, at least), we can get the bluescreen
facility initialized using KERNEL_PMAP_BASE. On other architectures,
initialization simply fails at this point, and the usual setup still
happens later on.
A bit of extra code cleanup in blue_screen_init_early:
we now just call module->info.std_ops() rather than a
frame-buffer-console specific method.
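That is, roughly (a sketch, assuming the standard module hooks):

    // Generic module initialization instead of a console-specific call.
    status_t status = module->info.std_ops(B_MODULE_INIT);
    if (status != B_OK)
        return status;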
Applications that don't call open() or similar functions very often,
but call many FD-related methods across multiple threads at once
(like "git status"), now don't wait on the context lock as much.
("git status" performance isn't much improved because threads just
hit the "unused vnodes" lock instead.)
* add SOCK_NONBLOCK and SOCK_CLOEXEC
* also extend the type parameter on socketpair() and socket()
  (usage sketched below)
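A hedged usage sketch (standard POSIX flag semantics; both flags are
applied atomically at creation time):

    #include <sys/socket.h>

    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC,
            0, fds) != 0) {
        // error handling here
    }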
Change-Id: I73570d5bfb57c2da00c1086149c9f07547ba61ce
Reviewed-on: https://review.haiku-os.org/c/haiku/+/8515
Tested-by: Commit checker robot <no-reply+buildbot@haiku-os.org>
Reviewed-by: waddlesplash <waddlesplash@gmail.com>
Previously, there was only platform_init_heap/platform_release_heap,
which allocated a single static region for the heap to use,
and any subsequent heap allocations had to go through the standard
platform_allocate_region, which allocates regions visible both
to the bootloader and the kernel.
But as mentioned in previous changes, it isn't always easy to
release regions allocated that way. And besides, some bootloaders
(like EFI) use a completely separate mechanism to allocate
bootloader-local memory, which will never get "leaked" into
the kernel.
So instead, refactor all platforms to provide two new methods:
platform_{allocate,free}_heap_region. On EFI this is easy to
implement; on most other platforms the logic is based on the old
platform_init_heap or allocate_region.
(On the BIOS loader in particular, we can only fully release
the memory if it's the last thing we allocated in physical
address order. If the "large allocation" threshold is lowered
back to 16 KB, we fail to release memory often enough that we
run past the end of the 8 MB identity map and thus fail to boot.
But with the larger threshold, allocations rarely hit it, and
we don't leak nearly as much.)
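The new interface, roughly (a sketch; exact signatures and semantics
may differ per platform):

    // Allocate memory for the loader heap in whatever way suits the
    // platform best (EFI pool allocations, identity-mapped ranges, ...)
    status_t platform_allocate_heap_region(size_t size, void** _base);
    // Release a region previously returned by
    // platform_allocate_heap_region, if the platform can actually do so.
    status_t platform_free_heap_region(void* base, size_t size);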
This should further reduce the amount of bootloader memory
permanently "leaked" into the kernel's used memory, though
on some platforms it may still be nonzero.
Change-Id: I5b2257fc5a425c024f298291f1401a26ea246383
Reviewed-on: https://review.haiku-os.org/c/haiku/+/8440
Reviewed-by: waddlesplash <waddlesplash@gmail.com>
Will be used in following commits.
Change-Id: Ica89d28cbf6980aca8dc347dfdcb200a0e637e9a
Reviewed-on: https://review.haiku-os.org/c/haiku/+/8442
Reviewed-by: waddlesplash <waddlesplash@gmail.com>
page_num_t is typedef'd to phys_addr_t, so it's 64 bits on 32-bit
platforms with PAE. In fact it's been so since the introduction
of phys_addr_t, so this comment was obsolete from the start...
* This is needed to support syscalls and other exceptions that must
be able to inspect/modify userspace register contents.
Change-Id: I8a638c0c40dd44ed882adad0591ae3bf5493a6b9
Reviewed-on: https://review.haiku-os.org/c/haiku/+/8329
Haiku-Format: Haiku-format Bot <no-reply+haikuformatbot@haiku-os.org>
Reviewed-by: waddlesplash <waddlesplash@gmail.com>
Tested-by: Commit checker robot <no-reply+buildbot@haiku-os.org>
Using the page attribute table to set the memory type on a per page
mapping basis is the more modern and flexible approach to physical
memory type handling compared to using MTRRs.
Most of the needed infrastructure was already in place, as the page
table entry attributes were already being set for the uncacheable
and write-back memory types. Using the PAT now also allows setting
the last remaining memory type, write-combining, through the PTE
flags. The PAT is configured so that entry 4 means write-combining,
and the PAT bit in the PTE is set to select that entry.
When the PAT is supported and not disabled, MTRRs are completely
ignored and left as set up by the system firmware, which is expected
to have configured the basic uncacheable and RAM ranges. These
configurations are then overridden by the PTE flags as needed.
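A minimal sketch of the setup, assuming MSR helpers like Haiku's
x86_read_msr/x86_write_msr (the constant names here are illustrative):

    static const uint32 kMsrPAT = 0x277;    // IA32_PAT
    static const uint64 kWriteCombining = 0x01;

    // Reprogram PAT entry 4 (byte 4 of the MSR) to mean write-combining.
    uint64 pat = x86_read_msr(kMsrPAT);
    pat &= ~(0x7ULL << 32);
    pat |= kWriteCombining << 32;
    x86_write_msr(kMsrPAT, pat);
    // A PTE then selects entry 4 with PAT=1, PCD=0, PWT=0 (index 0b100).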
Change-Id: I0a74b3fc7d3ba9fa384251290ce41621b69d3a02
Reviewed-on: https://review.haiku-os.org/c/haiku/+/8340
Reviewed-by: waddlesplash <waddlesplash@gmail.com>
We allow requesting an explicit memory type when calling
map_physical_memory but default to the uncached B_MTR_UC when not given.
When called without an explicitly requested memory type, allow
arch_vm_set_memory_type to modify and return an effective memory type.
When an overlapping range already exists, the effective memory type
is set to that of the existing mapping. If there is an explicit
memory type request that conflicts with an existing range, or if
multiple overlaps with conflicting types would be produced, the
mapping is disallowed (and a panic is triggered under KDEBUG).
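For example, a hedged usage sketch (assuming the memory type is OR'ed
into the protection flags, as existing drivers do):

    phys_addr_t physicalBase = 0xfd000000;    // illustrative address
    void* virtualBase;
    area_id area = map_physical_memory("some device", physicalBase,
        B_PAGE_SIZE, B_ANY_KERNEL_ADDRESS,
        B_KERNEL_READ_AREA | B_KERNEL_WRITE_AREA | B_MTR_WC, &virtualBase);
    if (area < 0) {
        // refused: this physical range is already mapped with a
        // conflicting memory type (panics under KDEBUG instead)
    }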
This effectively detects and panics when conflicting aliases of physical
memory would be created. This is also useful on an MTRR based setup,
as such overlaps cannot be properly represented.
When using the page attribute table (PAT) to set the memory type on a
per page virtual memory mapping basis, this is needed to prevent
aliasing of the same physical memory with different types. As per the
specs, such aliasing is unsupported and may result in undefined
operations that lead to system failure.
The mechanism is extended to the general arch_vm_set_memory_type as such
aliasing prevention also seems to apply to other architectures (at least
on ARM, aliasing is also strongly discouraged).
Change-Id: I7aaf6ea8415e92e74cd1643b67793a6857619eea
Reviewed-on: https://review.haiku-os.org/c/haiku/+/8339
Tested-by: Commit checker robot <no-reply+buildbot@haiku-os.org>
Reviewed-by: waddlesplash <waddlesplash@gmail.com>
This reverts commit 0caf23319cb4e9f9c8a2ecd30e133e98a1d2b989.
This change is not safe because changing MAIR may invalidate
early mappings. It's also not clear if it's needed, as e.g.
FreeBSD does not use Device GRE mappings.
Change-Id: I95a904ee928281d44989ce707ed1ac59985a308d
Reviewed-on: https://review.haiku-os.org/c/haiku/+/8268
Reviewed-by: waddlesplash <waddlesplash@gmail.com>
Haiku-Format: Haiku-format Bot <no-reply+haikuformatbot@haiku-os.org>
Reviewed-by: Milek7 Milek7 <me@milek7.pl>
This breaks kernel ABI on KDEBUG builds (but not non-KDEBUG builds),
but it does so in order to resolve a long-standing incompatibility
between them: until now, any kernel add-ons built against one which
made use of these lock facilities could not be run on the other;
instead you would get hangs and/or crashes.
After this change, kernel add-ons built with a KDEBUG configuration
should work on a non-KDEBUG kernel, while add-ons built with a
non-KDEBUG configuration will fail to load on a KDEBUG kernel
with unresolved symbols, preventing incorrect and broken operation.
If we just use the kernel entry time, then the pre-syscall tracing
routine (with a debugger message send) will be counted in the syscall's
runtime.
This makes the timing output of strace and strace -c much more
accurate; however, it won't include the "syscall overhead" (time
spent in the syscall entry routines, etc.) But we already can't
account for time spent in the userland-to-kernel transition, so if
that overhead is of interest, it should probably be measured some
other way.
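In other words, roughly (a sketch with hypothetical names):

    status_t
    dispatch_and_time(status_t (*function)(), bigtime_t* _runtime)
    {
        // pre-syscall tracing (the debugger message send) runs before
        // this point and is deliberately not counted
        bigtime_t startTime = system_time();
        status_t status = function();
        *_runtime = system_time() - startTime;
        return status;
    }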
In fact, on architectures which used the generic syscall dispatcher
(e.g. RISC-V), this is the behavior that already existed. So this just
makes x86 consistent with them.
Change-Id: I8cef6111e478ab49b0584e15575172eea77a8760
Reviewed-on: https://review.haiku-os.org/c/haiku/+/8240
Tested-by: Commit checker robot <no-reply+buildbot@haiku-os.org>
Reviewed-by: waddlesplash <waddlesplash@gmail.com>
A basic bump allocator that can handle arbitrary amounts of allocations,
so long as all are allocated and freed in a "stack"-like manner.
(Actually it could be extended to support non-stack-like operation,
but that would require more logic that isn't needed at the moment.)
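A minimal sketch of such an allocator (hypothetical names, not the
committed implementation):

    class BumpAllocator {
    public:
        BumpAllocator(void* buffer, size_t size)
            : fBase((uint8*)buffer), fSize(size), fOffset(0) {}

        void* Allocate(size_t size)
        {
            if (fOffset + size > fSize)
                return NULL;
            void* allocation = fBase + fOffset;
            fOffset += size;
            return allocation;
        }

        // Only valid in reverse order of allocation, stack-style.
        void Free(void* allocation)
        {
            fOffset = (uint8*)allocation - fBase;
        }

    private:
        uint8* fBase;
        size_t fSize;
        size_t fOffset;
    };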
Change-Id: I47077146ea282600130778d312f7d86bd8c032e0
Reviewed-on: https://review.haiku-os.org/c/haiku/+/8238
Tested-by: Commit checker robot <no-reply+buildbot@haiku-os.org>
Reviewed-by: waddlesplash <waddlesplash@gmail.com>
Reviewed-by: Michael Lotz <mmlr@mlotz.ch>
This can be used to replace mutex_trylock/mutex_unlock pairs. Once the
locker has been created, the success of the locking attempt needs to be
checked via locker.IsLocked().
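Usage sketch (the locker class name is assumed from the description):

    MutexTryLocker locker(fLock);
    if (!locker.IsLocked())
        return B_WOULD_BLOCK;
    // ...critical section; the mutex is released automatically
    // when the locker goes out of scope.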
Change-Id: Iba4b4ce21cac5059a3577a84a6eebe28d2cc4058
Reviewed-on: https://review.haiku-os.org/c/haiku/+/8179
Tested-by: Commit checker robot <no-reply+buildbot@haiku-os.org>
Reviewed-by: Jérôme Duval <jerome.duval@gmail.com>
Reviewed-by: waddlesplash <waddlesplash@gmail.com>
So that it has the same naming as DoublyLinkedList.
No functional change. It seems there aren't any users
of this API in the default build at the moment.
* This is the closest thing ARM has to the semantics of the
write-combining memory type on x86.
Change-Id: I12a1582e0af871e2ab729262e90695ffe928c85b
Reviewed-on: https://review.haiku-os.org/c/haiku/+/8223
Tested-by: Commit checker robot <no-reply+buildbot@haiku-os.org>
Reviewed-by: waddlesplash <waddlesplash@gmail.com>
Implemented using __builtin_clz where available, otherwise using
an algorithm derived from "Bit Twiddling Hacks" which is similar
to the one ramfs uses. GCC and Clang seem to unroll the loop on
x86 at least (but it doesn't matter there as the builtin exists,
implemented using the "bsr" instruction.)
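The fallback looks roughly like this (routine name hypothetical; the
builtin path shown for comparison):

    static inline int
    highest_bit_index(uint32 value)
    {
        if (value == 0)
            return -1;
    #ifdef __GNUC__
        return 31 - __builtin_clz(value);
    #else
        int bit = 0;
        if (value & 0xffff0000) { value >>= 16; bit += 16; }
        if (value & 0xff00) { value >>= 8; bit += 8; }
        if (value & 0xf0) { value >>= 4; bit += 4; }
        if (value & 0xc) { value >>= 2; bit += 2; }
        if (value & 0x2) bit += 1;
        return bit;
    #endif
    }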
These are architecture-specific routines, so they deserve proper
architecture-specific naming. The user memory access routines are
already under arch_cpu (arch_cpu_user_memcpy, etc.), and the methods
usually change a CPU flag, so it makes sense to put these there too.
RISC-V had get_ac but nothing else defined or used it, so it's removed.
No functional change intended.
Change-Id: Id4715214e32f73d4a93bc7ba8249411a0878d174
Reviewed-on: https://review.haiku-os.org/c/haiku/+/8106
Reviewed-by: waddlesplash <waddlesplash@gmail.com>
Reviewed-by: X512 X512 <danger_mail@list.ru>
Reviewed-by: Jérôme Duval <jerome.duval@gmail.com>
Tested-by: Commit checker robot <no-reply+buildbot@haiku-os.org>
So we need to check that it didn't when creating areas.
Change-Id: I4342463113046b543722faa7a51ca269ed67e8bf
Reviewed-on: https://review.haiku-os.org/c/haiku/+/8137
Reviewed-by: waddlesplash <waddlesplash@gmail.com>
According to the ARMv7 Reference Manual, "Wait for Interrupt" is
supported only through the WFI instruction on ARMv7.
The currently used ARMv6 equivalent may not work on ARMv7 and newer CPUs.
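The two forms, for reference (function name hypothetical):

    static inline void
    cpu_wait_for_interrupt(void)
    {
        // ARMv6 (CP15 c7 operation; may not work on ARMv7+):
        //     asm volatile("mcr p15, 0, %0, c7, c0, 4" : : "r" (0));
        // ARMv7 and newer: the dedicated instruction.
        asm volatile("wfi");
    }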
Fixes #18520
Change-Id: I69a136870654be33c0c789004e08bf610db3dd97
Reviewed-on: https://review.haiku-os.org/c/haiku/+/8044
Haiku-Format: Haiku-format Bot <no-reply+haikuformatbot@haiku-os.org>
Reviewed-by: waddlesplash <waddlesplash@gmail.com>
ramfs needs to create caches that are both temporary and unmergeable,
so add another flag to make this state possible.
Otherwise, with mmap'ed files from ramfs, VMCache might wind up
trying to merge the caches when the last one is closed, which we
don't want.
The IORequest internally likes to deal with transferEndOffset
rather than transferredBytes, because sub-requests may be prepared
all at once (in some paths in the I/O scheduler); thus fTransferSize
can get incremented in Advance() before we have actually executed
that transfer.
But external consumers much prefer just knowing transferredBytes
not transferEndOffset. And many of them actually named their
variables that (or "bytesTransferred") and just passed the
transferEndOffset through to variables with that name! That's
obviously wrong, and it's surprising it wasn't discovered before now.
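What callers actually want is, roughly (a sketch; accessor names may
differ):

    static generic_size_t
    transferred_bytes(IORequest& request, off_t transferEndOffset)
    {
        // convert the internal end offset into what callers expect:
        // bytes transferred relative to the request's start offset
        return (generic_size_t)(transferEndOffset - request.Offset());
    }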
The problem was uncovered by repeated KDLs in PrecacheIO.
That method used the "bytesTransferred" value as a count of
pages transferred, which would then run past the end of the array
if the transfer start offset was not 0. (Most of the time it would
be 0, since this method gets called on the first mmap() of a file,
probably before any pages have been read in.)
Most other consumers of this API did not check the value, it seems,
or otherwise had some mitigating factor that prevented it from
causing more problems. An exception is the page code, which
may have spuriously considered writes as successful when they
really weren't.
May fix some of the "invalid concurrent access to page" KDLs.
Every time a page is mapped into an area on fault, we have to
allocate a mapping object for it. While the object_cache
does have per-CPU depots, these depots only store a limited
number of items, and once they run out the object_cache's lock
must be acquired.
So, to reduce lock contention on SMP systems, create a number
of object caches corresponding to the largest power of 2
that is equal to or smaller than the number of CPUs. (We already
allocate dozens of object caches for the block allocator
no matter how many CPUs there are, so a few more depending
on CPU count shouldn't impact memory use too much. Besides,
the object_caches are wired into the low_resource system.)
This significantly reduces lock contention on SMP systems.
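The cache selection, sketched (hypothetical names):

    static const uint32 kMaxMappingCaches = 64;
    static object_cache* sMappingCaches[kMaxMappingCaches];
    static uint32 sMappingCacheMask;    // (power-of-two count) - 1

    static inline object_cache*
    current_mapping_cache()
    {
        // spread allocations across caches by masking the CPU index
        return sMappingCaches[smp_get_current_cpu() & sMappingCacheMask];
    }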
Same benchmark setup as yesterday (compile mime_db and relink
HaikuDepot, VMware, -j4), before:
real 0m16.981s
user 0m14.357s
sys 0m6.060s
after:
real 0m14.522s
user 0m14.194s
sys 0m4.337s
And the page_mappings object_cache locks went from having 200,000+ waits
and ~14 seconds waiting time (across all threads) down to ~900 (yes,
that's not a typo) and ~0.05s wait time (though these numbers were captured
in conjunction with the following commit.)