New version of runstack.py.
- Reworked code to reflect that Python's sort uses (start_offset, run_length) pairs to record runs.
- Two unbounded-integer power implementations, one using a loop and the other division. The loop version implies that, in Python's C implementation, size_t arithmetic would always suffice. The division version shows that 2*d-1 bit unsigned division suffices if the # of array elements fits in d bits (so 64-bit ints in C would suffice for arrays up to 2**32 elements).
- Another power implementation using frexp - unpromising.
- And another that specializes the division method by rounding the array size up to a power of 2, removing the need for division. Maybe worth looking at more, but offhand the results were significantly poorer.
- Added a "bad case" for powersort - surprising! timsort and 2-merge are optimal in this case. powersort "merge cost" is up to a third larger. |