Institutt for informatikk seminar: Thursday 18 April, kl. 14:15

Two 'new' sorting algorithms - and the effect of multi level caching on performance

Arne Maus

Ifi, Oslo

Sorting algorithms are often in textbooks divided into two categories: Comparison based methods (such as Quicksort and Bubblesort) and distribution based methods (such as Radix and Bucket sort). This talk will present two 'new' algorithms, ALR (Adaptive Left Radix) and Permutation-sort, both might be said to fall into the second category, but ALR utilizes more of the available information than an ordinary Radix-sort to achieve a better performance, and Permutation sort asks a different question than the ordinary ("rearrage the elements in array A in a sorted order").

Central to the performance issue seems to be the non-valid assumption made in almost all theoretical performance analysis, that all accesses to memory have the same cost. Modern computers are cache based in 2 or more levels, and the access time to an element in the closest cache to the CPU may be 50-100 times as fast as to data that is not cached. The second part of the presentation therefore will give some results from tests on the effect of caches on such algorithms (like sorting) that typically uses array indexed by other arrays (possibly again indexed by still other arrays...). Designing cache friendly algorthms seems essential to achieve good performance."

Norsk:
Oftest deles sorteringsalgoritmer inn i de som bare tar hensyn til verdiene i hvert elemnet i sorteringsarrayen(distribusjonsbaserte) som radix og bøttesortering - og de som sorterer ved sammenligning av elementene med hverandre (sammenligningsbaserte) som Quicksort og boblesortering. Her introduseres to 'nye' algoritmer som ikke entydig faller i disse to kategoriene: Adaptiv Radix som justerer sifferbredden etter den aktuelle distribusjonen av tallene, og permutasjonssortering som finner den permutasjonsvektoren som vil presentere den originale arrayen i sortert rekkefølge.
Effekten av flere lag med cache i en modrene datamaskin undersøkles og nyttes til å forklare kjøretidene når disse to algoritmene sammenlignes med andre, kjente algoritmer.

Back to seminar homepage