Fastest gap sequence for shell sort?

0 votes
asked Mar 29, 2010 by sawyer

According to Marcin Ciura's Optimal (best known) sequence of increments for shell sort algorithm, the best sequence for shellsort is 1, 4, 10, 23, 57, 132, 301, 701..., but how can I generate such a sequence? In Marcin Ciura's paper, he said:

Both Knuth’s and Hibbard’s sequences are relatively bad, because they are defined by simple linear recurrences.

but most algorithm books I found tend to use Knuth’s sequence: k = 3k + 1, because it's easy to generate. What's your way of generating a shellsort sequence?

5 Answers

0 votes
answered Mar 3, 2010 by chromatix

If your data set has a definite upper bound in size, then you can hardcode the step sequence. You should probably only worry about generality if your data set is likely to grow without an upper bound.

The sequence shown seems to grow roughly as an exponential series, albeit with quirks. There seems to be a majority of prime numbers, but with non-primes in the mix as well. I don't see an obvious generation formula.

A valid question, assuming you must deal with arbitrarily large sets, is whether you need to emphasise worst-case performance, average-case performance, or almost-sorted performance. If the latter, you may find that a plain insertion sort using a binary search for the insertion step might be better than a shellsort. If you need good worst-case performance, then Sedgewick's sequence appears to be favoured. The sequence you mention is optimised for average-case performance, where the number of comparisons outweighs the number of moves.

0 votes
answered Mar 29, 2010 by john-feminella

Ciura's paper generates the sequence empirically -- that is, he tried a bunch of combinations and this was the one that worked the best. Generating an optimal shellsort sequence has proven to be tricky, and the problem has so far been resistant to analysis.

The best known increment is Sedgewick's, which you can read about here (see p. 7).

0 votes
answered Mar 27, 2011 by u0b34a0f6ae

I would not be ashamed to take the advice given in Wikipedia's Shellsort article,

With respect to the average number of comparisons, the best known gap sequences are 1, 4, 10, 23, 57, 132, 301, 701 and similar, with gaps found experimentally. Optimal gaps beyond 701 remain unknown, but good results can be obtained by extending the above sequence according to the recursive formula h_k = \lfloor 2.25 h_{k-1} \rfloor.

Tokuda's sequence [1, 4, 9, 20, 46, 103, ...], defined by the simple formula h_k = \lceil h'_k \rceil, where h'k = 2.25h'k − 1 + 1, h'1 = 1, can be recommended for practical applications.

guessing from the pseudonym, it seems Marcin Ciura edited the WP article himself.

0 votes
answered Mar 24, 2015 by piotr-grochowski

The sequence is 1, 4, 10, 23, 57, 132, 301, 701, 1750. For every next number after 1750 multiply previous number by 2.25 and round down.

0 votes
answered Sep 15, 2017 by foo-bar

I've found this sequence similar to Marcin Ciura's sequence:

1, 4, 9, 23, 57, 138, 326, 749, 1695, 3785, 8359, 18298, 39744, etc.

For example, Ciura's sequence is:

1, 4, 10, 23, 57, 132, 301, 701, 1750

This is a mean of prime numbers. Python code to find mean of prime numbers is here:

import numpy as np

def isprime(n):
    ''' Check if integer n is a prime '''
    n = abs(int(n))  # n is a positive integer
    if n < 2:  # 0 and 1 are not primes
        return False
    if n == 2:  # 2 is the only even prime number
        return True
    if not n & 1:  # all other even numbers are not primes
        return False
    # Range starts with 3 and only needs to go up the square root
    # of n for all odd numbers
    for x in range(3, int(n**0.5)+1, 2):
        if n % x == 0:
            return False
    return True

# To apply a function to a numpy array, one have to vectorize the function
vectorized_isprime = np.vectorize(isprime)

a = np.arange(10000000)
primes = a[vectorized_isprime(a)]
#print(primes)
for i in range(2,20):
    print(primes[0:2**i].mean())

The output is:

4.25
9.625
23.8125
57.84375
138.953125
326.1015625
749.04296875
1695.60742188
3785.09082031
8359.52587891
18298.4733887
39744.887085
85764.6216431
184011.130096
392925.738174
835387.635033
1769455.40302
3735498.24225

The gap in the sequence is slowly decreasing from 2.5 to 2. Maybe this association could improve the Shellsort in the future.

Welcome to Q&A, where you can ask questions and receive answers from other members of the community.
Website Online Counter

...