Diffstat (limited to 'linden/indra/libgcrypt/libgcrypt-1.2.2/mpi/i586/README')
-rwxr-xr-x | linden/indra/libgcrypt/libgcrypt-1.2.2/mpi/i586/README | 26 |
1 files changed, 26 insertions, 0 deletions
diff --git a/linden/indra/libgcrypt/libgcrypt-1.2.2/mpi/i586/README b/linden/indra/libgcrypt/libgcrypt-1.2.2/mpi/i586/README
new file mode 100755
index 0000000..b698cba
--- /dev/null
+++ b/linden/indra/libgcrypt/libgcrypt-1.2.2/mpi/i586/README
@@ -0,0 +1,26 @@
This directory contains mpn functions optimized for Intel Pentium
processors.

RELEVANT OPTIMIZATION ISSUES

1. Pentium doesn't allocate cache lines on writes, unlike most other modern
processors.  Since the functions in the mpn class do array writes, we have to
allocate the destination cache lines ourselves, by reading a word from each of
them in the loops, to achieve the best performance.

2. Pairing of memory operations requires that the two issued operations refer
to different cache banks.  The simplest way to ensure this is to read/write
two words from the same object.  If we operate on different objects, the
accesses might or might not fall in the same cache bank.
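
The two issues above can be sketched in portable C.  This is only an
illustration, not the assembly in this directory: limb_t, add_n_touch and
cache_bank are hypothetical names, and the bank layout in cache_bank (8 banks
interleaved on 4-byte boundaries) is an assumption about the Pentium data
cache that this README does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;  /* hypothetical stand-in for a 32-bit limb */

/* Add two n-limb numbers into rp and return the carry out (0 or 1).
   Each destination limb is read before it is written, so a cache that
   does not allocate lines on writes already holds the line at store time. */
limb_t add_n_touch(limb_t *rp, const limb_t *ap, const limb_t *bp, size_t n)
{
    limb_t carry = 0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        /* The volatile read stops the compiler from dropping the load;
           the value itself is discarded. */
        (void)*(volatile limb_t *)&rp[i];
        limb_t a = ap[i];
        limb_t s = a + bp[i];
        limb_t c1 = s < a;        /* carry from a + b */
        limb_t r = s + carry;
        limb_t c2 = r < s;        /* carry from adding the incoming carry */
        rp[i] = r;
        carry = c1 | c2;
    }
    return carry;
}

/* ASSUMPTION: 8 banks interleaved on 4-byte boundaries, so the bank is
   selected by address bits 2..4. */
unsigned cache_bank(uintptr_t addr)
{
    return (unsigned)((addr >> 2) & 7u);
}
```

Under that assumed layout, two adjacent 32-bit words of one object always map
to different banks, which is why reading or writing two words from the same
object is the simple way to get pairing.  One read per cache line would be
enough to allocate it; the sketch touches every limb only to stay short.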

STATUS

1. mpn_lshift and mpn_rshift run at about 6 cycles/limb, but the Pentium
documentation indicates that they should take only 43/8 = 5.375 cycles/limb,
or 5 cycles/limb asymptotically.

2. mpn_add_n and mpn_sub_n run at 2 cycles/limb asymptotically.  Due to loop
overhead and other delays (cache refill?), they run at or near 2.5 cycles/limb.

3. mpn_mul_1, mpn_addmul_1, mpn_submul_1 all run 1 cycle faster than they
should...
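
The cycle figures in item 1 are consistent with a simple cost model: a fixed
per-limb cost plus a fixed overhead that is amortized over the limbs handled.
The sketch below is one reading of those numbers, not a measurement.  The
function name is hypothetical, and the split of 43 cycles into 8 limbs at 5
cycles each plus 3 cycles of overhead is an assumption chosen to match
43/8 = 5.375.

```c
#include <assert.h>

/* Average cycles per limb when each limb costs `per_limb` cycles and a
   fixed `overhead` (in cycles) is spread across n limbs. */
double cycles_per_limb(double per_limb, double overhead, unsigned n)
{
    assert(n > 0);
    return per_limb + overhead / n;
}
```

With these assumed inputs, cycles_per_limb(5.0, 3.0, 8) gives 5.375, and the
result approaches 5 as n grows, matching the asymptotic figure quoted for
mpn_lshift and mpn_rshift.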