mirror of
https://review.haiku-os.org/buildtools
synced 2025-02-12 08:47:41 +01:00
Old version was from 2012-05-06, 6.1.2 is from 2016-12-16 A lot of support for newer processors and speedups since then See gmp/NEWS for details
163 lines
3.4 KiB
Plaintext
163 lines
3.4 KiB
Plaintext
Copyright 1996, 1999, 2001, 2002, 2004 Free Software Foundation, Inc.
|
|
|
|
This file is part of the GNU MP Library.
|
|
|
|
The GNU MP Library is free software; you can redistribute it and/or modify
|
|
it under the terms of either:
|
|
|
|
* the GNU Lesser General Public License as published by the Free
|
|
Software Foundation; either version 3 of the License, or (at your
|
|
option) any later version.
|
|
|
|
or
|
|
|
|
* the GNU General Public License as published by the Free Software
|
|
Foundation; either version 2 of the License, or (at your option) any
|
|
later version.
|
|
|
|
or both in parallel, as here.
|
|
|
|
The GNU MP Library is distributed in the hope that it will be useful, but
|
|
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
|
|
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
|
|
for more details.
|
|
|
|
You should have received copies of the GNU General Public License and the
|
|
GNU Lesser General Public License along with the GNU MP Library. If not,
|
|
see https://www.gnu.org/licenses/.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This directory contains mpn functions for various HP PA-RISC chips. Code
|
|
that runs faster on the PA7100 and later implementations, is in the pa7100
|
|
directory.
|
|
|
|
RELEVANT OPTIMIZATION ISSUES
|
|
|
|
Load and Store timing
|
|
|
|
On the PA7000 no memory instructions can issue the two cycles after a store.
|
|
For the PA7100, this is reduced to one cycle.
|
|
|
|
The PA7100 has a lookup-free cache, so it helps to schedule loads and the
|
|
dependent instruction really far from each other.
|
|
|
|
STATUS
|
|
|
|
1. mpn_mul_1 could be improved to 6.5 cycles/limb on the PA7100, using the
|
|
instructions below (but some sw pipelining is needed to avoid the
|
|
xmpyu-fstds delay):
|
|
|
|
fldds s1_ptr
|
|
|
|
xmpyu
|
|
fstds N(%r30)
|
|
xmpyu
|
|
fstds N(%r30)
|
|
|
|
ldws N(%r30)
|
|
ldws N(%r30)
|
|
ldws N(%r30)
|
|
ldws N(%r30)
|
|
|
|
addc
|
|
stws res_ptr
|
|
addc
|
|
stws res_ptr
|
|
|
|
addib Loop
|
|
|
|
2. mpn_addmul_1 could be improved from the current 10 to 7.5 cycles/limb
|
|
(asymptotically) on the PA7100, using the instructions below. With proper
|
|
sw pipelining and the unrolling level below, the speed becomes 8
|
|
cycles/limb.
|
|
|
|
fldds s1_ptr
|
|
fldds s1_ptr
|
|
|
|
xmpyu
|
|
fstds N(%r30)
|
|
xmpyu
|
|
fstds N(%r30)
|
|
xmpyu
|
|
fstds N(%r30)
|
|
xmpyu
|
|
fstds N(%r30)
|
|
|
|
ldws N(%r30)
|
|
ldws N(%r30)
|
|
ldws N(%r30)
|
|
ldws N(%r30)
|
|
ldws N(%r30)
|
|
ldws N(%r30)
|
|
ldws N(%r30)
|
|
ldws N(%r30)
|
|
addc
|
|
addc
|
|
addc
|
|
addc
|
|
addc %r0,%r0,cy-limb
|
|
|
|
ldws res_ptr
|
|
ldws res_ptr
|
|
ldws res_ptr
|
|
ldws res_ptr
|
|
add
|
|
stws res_ptr
|
|
addc
|
|
stws res_ptr
|
|
addc
|
|
stws res_ptr
|
|
addc
|
|
stws res_ptr
|
|
|
|
addib
|
|
|
|
3. For the PA8000 we have to stick to using 32-bit limbs before compiler
|
|
support emerges. But we want to use 64-bit operations whenever possible,
|
|
in particular for loads and stores. It is possible to handle mpn_add_n
|
|
efficiently by rotating (when s1/s2 are aligned), masking+bit field
|
|
inserting when (they are not). The speed should double compared to the
|
|
code used today.
|
|
|
|
|
|
|
|
|
|
LABEL SYNTAX
|
|
|
|
The HP-UX assembler takes labels starting in column 0 with no colon,
|
|
|
|
L$loop ldws,mb -4(0,%r25),%r22
|
|
|
|
Gas on hppa GNU/Linux however requires a colon,
|
|
|
|
L$loop: ldws,mb -4(0,%r25),%r22
|
|
|
|
This is covered by using LDEF() from asm-defs.m4. An alternative would be
|
|
to use ".label" which is accepted by both,
|
|
|
|
.label L$loop
|
|
ldws,mb -4(0,%r25),%r22
|
|
|
|
but that's not as nice to look at, not if you're used to assembler code
|
|
having labels in column 0.
|
|
|
|
|
|
|
|
|
|
REFERENCES
|
|
|
|
Hewlett Packard, "HP Assembler Reference Manual", 9th edition, June 1998,
|
|
part number 92432-90012.
|
|
|
|
|
|
|
|
----------------
|
|
Local variables:
|
|
mode: text
|
|
fill-column: 76
|
|
End:
|