Shell Sort
Shell Sort is a sorting algorithm that takes all elements that are n spaces apart, sorts them and places them back into the array, decreasing the gap each time.
How Shell Sort works
Shell Sort starts by taking the nearest log2 of the array’s length (eg length 10 would have a log2 of 8) to determine the initial gap size. This is going to be the initial gap n that we work with.
Then we get every nth item in the list and move it to a separate array and perform Insertion Sort on it. They are placed back into the main array in the same positions they were taken out) and the process is repeated using the same gap but shifted along one unit.
Once this gap size has been exhausted, the gap size is reduced by half until the size becomes 1. At this point we’re doing an insertion sort like normal, but the items have been moved so close to their final destination that they only need to move a space or two over.
Example
Let’s consider the following array of 10 items:
[9, 3, 2, 6, 4, 1, 0, 5, 8, 7]
Starting with a gap of 8, so we’ll start by isolating the 0th and 8th elements. They are then sorted using Insertion Sort.
[9, 8]
[8, 9]
Once that’s done, they are placed back in their original spaces at index 0 and 8.
[8, 3, 2, 6, 4, 1, 0, 5, 9, 7]
Now let’s do the same thing for the elements at position 1 and 9.
[3, 7]
[3, 7]
[8, 3, 2, 6, 4, 1, 0, 5, 9, 7]
Gap size 4
Now we reduce the gap to 4. Let’s sort the elements at position 0, 4, 8.
[8, 4, 9]
[4, 8, 9]
[4, 3, 2, 6, 8, 1, 0, 5, 9, 7]
Again for 1, 5, 9.
[3, 1, 7]
[1, 3, 7]
[4, 1, 2, 6, 8, 3, 0, 5, 9, 7]
Indexes 2, 6
[2, 0]
[0, 2]
[4, 1, 0, 6, 8, 3, 2, 5, 9, 7]
Indexes 3, 7
[6, 5]
[5, 6]
[4, 1, 0, 5, 8, 3, 2, 6, 9, 7]
Gap size 2
Now we reduce the size of the gap to 2. Start with indices 0, 2, 4, 6, 8:
[4, 0, 8, 2, 9]
[0, 2, 4, 8, 9]
[0, 1, 2, 5, 4, 3, 8, 6, 9, 7]
Odd indices:
[1, 5, 3, 6, 7][1, 3, 5, 6, 7]
[0, 1, 2, 3, 4, 5, 8, 6, 9, 7]
Gap size 1
The final step is to perform the sort with a gap size of 1. It’s essentially just a normal insertion sort on the whole list, but the previous steps prepped the array so that each element is close to its target position. The first half od the array is already sorted by this point, and the rest of the elements are only up to 2 spaces away from where they should be.
[0, 1, 2, 3, 4, 5, 8, 6, 9, 7]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
And it’s sorted!
Visualisation
TBD
Time complexity
Shell Sort uses comparisons and swaps to do the Insertion Sorts, and then inserts to the array.
Comparisons and insertions
Comparisons and insertion for Shell Sort have the same time complexity of n2/8
, so I’ll group them here together. It makes sense because comparisons are all done on the extracted sub-array which is then reinserted back into the main array.
This version of Shell Sort in particular is quite noteworthy because even though it’s completed in O(n2) time, it’s the fastest algorithm that falls under that category. It is however limited to that speed because it relies Insertion Sort as the method for sorting the sub-arrays.
Swaps
Where Shell Sort differs from Insertion Sort is in thenumber of swaps. Since each element is no longer restricted to moving one space at a time, the amount of swaps have now been reduced to just O(nlog(n)).
Distribution
This is how the comparisons, insertions and swaps are distributed for this sorting algorithm of size 100 for 100,000 sorts.
These are all fairly standard normal distributions, with the exception of insertions which have no random elements to them.
C code
void shell_sort(int *p, int len) {
int exp = (int)log2((double)len); // Exponent
int i, j, sub_list_size, loops;
int *sub_list;
int gap;
// Loop through all positive exponents
while (exp > 0) {
// Calculate the gap as 2^i
gap = (int)pow(2,(double)exp--)-1;
if (gap == 1) {
insertion_sort(p, len);
break;
}
// Allocate memory for a sublist
sub_list_size = len % gap ? len / gap +1 : len/gap;
sub_list = malloc(sub_list_size * sizeof(int));
// Loop through gap shifts
loops = len/gap + len%gap;
for (i=0; i<loops && sub_list_size>1; i++) {
// Test for sublists that may go over len
if (gap * (sub_list_size-1) + i >= len && gap != 1) {
sub_list_size -= 1;
if (sub_list_size == 1){
break;
}
}
// Generate sublist
for (j=0; j<sub_list_size; j++) {
sub_list[j] = p[i+j*gap];
}
// Sort the sublist
insertion_sort(sub_list, sub_list_size);
// Reinsert the sublist
for (j=0; j<sub_list_size; j++) {
insert(&p[i+j*gap], sub_list[j]);
}
}
// Free sublist memory
free(sub_list);
}
}