Skip to main content

Sorting guide in C

 

Sorting

 Written by: Manjula Sharma

Sorting means arranging a set of data in the same order. In our daily life,we can see so many 

applications using the data in sorted order as telephone directory,merit,list,roll number etc.So ,

sorting is nothing but storage of data in sorted order(it can be in ascending or descending order).


Some Sorting techniques:

  1. Bubble sort

  2. Selection sort

  3. Insertion sort

  4. Merge sort

  5. Quick sort

  6. Heap sort

  7. Radix sort

  8. Shell sort

  9. Bucket Sort

  10. Counting sort


The main component in any sorting is the key comparison because most of the time we need to 

compare the key with elements in a list.The number of comparisons are more,then time for executing 

the program is also more.


Let ‘n’ be the input size of an array. If n increases,then time for execution also increases,So, execution 

time will be varying with different volumes of data.

The efficiency of any sorting technique can be represented using mathematical notations.

All the sorting technique efficiencies are in-between 0(n log n) to 0(n2).

Bubble sort:

The bubble sort technique for sorting a list of data in ascending order is as follows: 

In the first iteration, the first element of the array is compared with the second element. If the first element is 

found to be greater than the second element, they are interchanged. Now, the second element is compared 

with the third and interchanged if required. In the same way, comparison is done till the last element.At the 

end of first iteration, the largest element will be stored at the last position. 

In the second iteration, again the comparison is done from the first element to last-but-one element. At the 

end of this iteration, the second largest element will be placed in its proper position.


 If there are ‘n’ elements in the given list, then after (n-1) iterations, the array gets sorted.

 Consider the following list of integers to be sorted: 



A  =  

28

20

30

15

5

        0      1        2        3      4


1st iteration:

28

20

30

15

5

                    Compare A[0] and A[1]  28>20 interchange


20

28

30

15

5

                    Compare A[1] and A[2]  28<30 no change


20

28

30

15

5

                    Compare A[2] and A[3]  30>15 interchange


20

28

15

30

5

                    Compare A[3] and A[4]  30>5 interchange


20

28

15

5

30

                    30 largest element in the array



2nd iteration:



20

28

15

5

30

                    Compare A[0] and A[1]  20<28  no change


20

28

15

5

30

                    Compare A[1] and A[2]  28>15 interchange


20

15

28

5

30

                    Compare A[2] and A[3]  28>5 interchange


20

15

5

28

30

                    28 second largest element



3rd iteration:



20

15

5

28

30

                    Compare A[0] and A[1]  20>15 interchange


15

20

5

28

30

                    Compare A[1] and A[2]  20>5 interchange


15

5

20

28

30

                    20 Third largest element



4th iteration:



15

5

20

28

30

                    Compare A[0] and A[1]  15>5 interchange


5

15

20

28

30

                    Sorted array



Algorithm:Bubblesort(A[0…..n-1])

A is an array of n elements to be sorted.

for pass =1 to n-1

    for j=0 to n-pass-1

        if(a[j]>a[j+1] )then

temp=a[j]

a[j]=a[j+1]

a[j+1]=temp

End for

End for


Program:


#include<stdio.h>

#include<conio.h>

void main()

{

int a[10],n,i,temp,j;

clrscr();

printf("Enter the size of the array:");

scanf("%d",&n);

printf("\nEnter array elements:\n");

for(i=0;i<n;i++)

scanf("%d",&a[i]);

for(i=0;i<n;i++)

{

for(j=0;j<n-i-1;j++)

{

if(a[ j]>a[j+1])

{

temp=a[j];

a[j]=a[j+1];

a[j+1]=temp;

}

}

}

printf("\nSorted list is:\n");

for(i=0;i<n;i++)

printf("%d\t",a[i]);

getch();

}


Time Complexities:

  • Worst Case Complexity:O(n2)
    If we want to sort in ascending order and the array is in descending order then, the worst case

    occurs.

  • Best Case Complexity:O(n)
    If the array is already sorted, then there is no need for sorting.

  • Average Case Complexity:O(n2)
    It occurs when the elements of the array are in jumbled order (neither ascending nor descending).

  • Space Complexity:O(1)

Space complexity is O(1) because an extra variable temp is used for swapping.

In the optimized algorithm, the variable swapped adds to the space complexity thus, making it O(2).

  • Auxiliary Space: O(1)

Boundary Cases: Bubble sort takes minimum time (Order of n) when elements are already sorted.

Bubble Sort Applications

Bubble sort is used in the following cases where

  • The complexity of the code does not matter.

  • A short code is preferred.


Selection sort 

Selection sort is a simplest method of sorting technique. To sort the given list in ascending order, we will
compare the first element with all the other elements. If the first element is found to be greater then the
compared element, then they are interchanged. Thus at the end of first interaction, the smallest element
will be stored in first position, which is its proper position. Then in the second interaction, we will repeat
the procedure from second element to last element. The algorithm is continued till we get sorted list. If
there are n elements, we require (n-1) iterations, in general.


A =

38

47

24

42

17

        0      1        2        3      4

   


1st iteration:

38

47

24

42

17

                    Compare A[0] and A[1]  38<47 no change


38

47

24

42

17

                    Compare A[0] and A[2]  38>24 interchange


24

47

38

42

17

                    Compare A[0] and A[3]  24<42 no change


24

47

38

42

17

                    Compare A[o] and A[4]  24>17 interchange


17

47

38

42

24

                    17 smallest element in the array



2nd iteration:



17

47

38

42

24

                    Compare A[1] and A[2] 47>38  interchange


17

38

47

42

24

                    Compare A[1] and A[3]  38<42  no change


17

38

47

42

24

                    Compare A[1] and A[4]  38>24  interchange


17

24

47

42

38

                    24 second smallest element



3rd iteration:



17

24

47

42

38

                    Compare A[2] and A[3]  47>42 interchange


17

24

42

47

38

                    Compare A[2] and A[4]  42>38 interchange


17

24

38

47

42

                    38 Third smallest element



4th iteration:



17

24

38

47

42

                    Compare A[3] and A[4]  47>42 interchange


17

24

38

42

47

                    Sorted array



Algorithm : Selection sort(A[0…..n-1])

for(i=0;i<n-1;i++)

{

    for(j=i+1;j<n;j++)

    {

if(a[ i]>a[j])

{

    temp=a[i];

                a[i]=a[j];

               a[j]=temp;

            }

    }

}

Program:

#include<stdio.h>

#include<conio.h>

void main()

{

int a[10],n,i,temp,j;

clrscr();

printf("Enter the size of the array:");

scanf("%d",&n);

printf("\nEnter array elements:\n");

for(i=0;i<n;i++)

scanf("%d",&a[i]);

for(i=0;i<n-1;i++)

{

for(j=i+1;j<n;j++)

{

if(a[ i]>a[j])

{

temp=a[i];

a[i]=a[j];

a[j]=temp;

}

}

}

printf("\nSorted list is:\n");

for(i=0;i<n;i++)

printf("%d\t",a[i]);

getch();

}


Time Complexities:

  • Worst Case Complexity: O(n2)
    If we want to sort in ascending order and the array is in descending order then, the worst case 

    occurs.

  • Best Case Complexity: O(n2)
    It occurs when the array is already sorted

  • Average Case Complexity: O(n2)
    It occurs when the elements of the array are in jumbled order (neither ascending nor descending).

The time complexity of the selection sort is the same in all cases. At every step, you have to find the 

minimum element and put it in the right place. The minimum element is not known until the end of the 

array is not reached.

  • Space Complexity:

Space complexity is O(1) because an extra variable temp is used.


Selection Sort Applications

The selection sort is used when:

  • a small list is to be sorted

  • cost of swapping does not matter

  • checking of all the elements is compulsory

  • cost of writing to a memory matters like in flash memory (number of writes/swaps is O(n) as compared to O(n2) of bubble sort)


Insertion sort

This sorting technique involves inserting a particular element in proper position. In the first iteration, 

the second element is compared with the first. In second iteration, the third element is compared with 

second and then the first. Thus in every iteration, the element is compared with all the elements before it. 

If the element is found to be greater than any of its previous elements, then it is inserted at that position and

 all other elements are moved to one position towards right, to create the space for inserting element. The 

procedure is repeated till we get the sorted list.



A =

38

47

24

42

89

        0      1        2        3      4



1st iteration:


38

47

24

42

17

                    Compare A[0] and A[1]  38<47, 






                    so consider 38 as  First position


38

47

24

42

17

                    First 2 elements are sorted order 


2nd iteration:


38

47

24

32

89

                    Compare A[2] with A[1] and A[2]  24<38, 






                    so insert 24 at 0th position                   



24

38

47

32

17

                    First 3 elements are sorted order 




3rd iteration:



24

38

47

32

89

                    Compare A[3] with A[1] and A[2] and A[3]  






                    32>24, nothing done

                    32<38 insert 32 at 1st position 

                  


24

32

38

47

89

                    First 4 elements are in sorted order 



4th iteration:


24

38

47

32

89

                    Compare A[4] with A[0….3] element  






                    89>24, nothing done

                    89>38, nothing done

                    89>47, nothing done

                    89>32, nothing done


                  



24

32

38

47

89

                    First 5 elements are in sorted order 



Algorithm:insertion sort(A[0…..n-1])

for(i=1;i<n;i++)

{

item=a[i];

for(j=i-1;j>=0 && item<a[j];j--)

a[j+1]=a[j];

a[j+1]=item;

}

Program:

#include<stdio.h>

#include<conio.h>

void main()

{

int a[10],n,i,item,j;

clrscr();

printf("Enter the size of the array:");

scanf("%d",&n);

printf("\nEnter array elements:\n");

for(i=0;i<n;i++)

scanf("%d",&a[i]);

for(i=1;i<n;i++)

{

item=a[i];

for(j=i-1;j>=0 && item<a[j];j--)

a[j+1]=a[j];

a[j+1]=item;

}

printf("\nSorted list is:\n");

for(i=0;i<n;i++)

printf("%d\t",a[i]);

getch();

}

Time Complexities:

  • Worst Case Complexity: O(n2)
    Suppose, an array is in ascending order, and you want to sort it in descending order. In this case, 

    worst case complexity occurs.
    Each element has to be compared with each of the other elements so, for every nth element, (n-1) 

    number of comparisons are made.
    Thus, the total number of comparisons = n*(n-1) ~ n2

  • Best Case Complexity: O(n)
    When the array is already sorted, the outer loop runs for n number of times whereas the inner loop 

    does not run at all. So, there are only n number of comparisons. Thus, complexity is linear.

  • Average Case Complexity: O(n2)
    It occurs when the elements of an array are in jumbled order (neither ascending nor descending).

  • Space Complexity

Space complexity is O(1) because an extra variable key is used.

Insertion Sort Applications

The insertion sort is used when:

  • the array is has a small number of elements

  • there are only a few elements left to be


Merge sort

The procedure for merge sort contains two main parts viz. divide and merge.

 Divide: The original array is divided into two equal parts. Then each sub-array is divided into two equal 

parts. This method is continued till each sub array contains only one element.

 Merge: The first element of first sub-array is compared with first element of the second sub-array. The 

lesser among these is put into result-array. The remaining element is compared with the second element 

of the other array. The procedure is continued till both the arrays get exhausted.

 The divide and merge parts are done recursively on given array to get sorted list.


Consider an array A[0...7] element:



0

1

2

3

4

5

6

  A=

35

10

15

45

25

20

40


Low = 0 , high = 6 , n = 7, mid = (low+high)/2 = (0+6)/2 = 3

Algorithm:

Algorithm:merge_sort(A,low,high)


//sort the elements in ascending order.

//low and high=first and last elements 

if(low<high)

mid=(low+high)/2

Call merge_sort(A,low,mid)

Call merge_sort(A,mid+1,high)

Call merge(A,low,mid,high)

Exit


Algorithm:merge(a,low,mid,high)

Set i=low

      j=mid+1

      k=low

while((i<=mid)&&(j<=high)) do

if(a[i]<a[j])

   Res[k]=a[i]

   k=k+1

   k=k+1

   i=i+1

else

              Res[k]=a[j]

  K=K+1

  J=J+1

End while

while(i<=mid)

Res[k]=a[i]

k=k+1

i=i+1

End while

while(j<=high)

Res[k]=a[j]

k=k+1

j=j+1

End while

For i=low to k-1

a[i]=Res[i]

End for

Return



Program:

#include<stdio.h>

#include<conio.h>

void merge(int a[10], int low,int mid,int high)

{

Int i,j,Res[30];

 i=low;

 j=mid+1; 

 k=low;

while((i<=mid)&&(j<=high)) 

{

if(a[i]<a[j])

{

   Res[k]=a[i]

   k++;

   i++;

}

else

{

              Res[k]=a[j]

  k++;

  j++;

}

}

while(i<=mid)

{

Res[k]=a[i]

  k++;

  i++;

}

while(j<=high)

{

 Res[k]=a[j]

  k++;

  j++;

}

for(i=0;i<=k-1;i++)

a[i]=Res[i];

}





void merge_sort(int a[],int low,int high)

{

int mid;

if(low<high)

{

mid=(low+high)/2

merge_sort(A,low,mid)

merge_sort(A,mid+1,high)

merge(A,low,mid,high)

}

}



void main()

{

int a[20],n,i;

clrscr();

printf("Enter the size of array:");

scanf("%d",&n);

printf("\nEnter elements:\n");

for(i=0;i<n;i++)

scanf("%d",&a[i]);

merge_sort(a,0,n-1);

printf("\nSorted list is:\n");

for(i=0;i<n;i++)

printf("%d\t",a[i]);

getch();

}

Time Complexity:

  • Best Case Complexity: O(n*log n)

  • Worst Case Complexity: O(n*log n)

  • Average Case Complexity: O(n*log n)

  • Space Complexity:O(n)

Merge Sort Applications

  • Inversion count problem

  • External sorting

  • E-commerce applications


QUICK SORT

Quick Sort is a technique that will sort a list of data significantly faster than any other sorting techniques. 

This algorithm is based on the fact that – it is always easier and faster to sort two small arrays than one 

single big array.


 Here, the given array is divided into two sub-arrays such that the elements at the left-side of some key 

element are less than the key element and the elements at the right-side of the key element are greater 

than the key element. 


The dividing procedure is done with the help of two index variables and one key element as explained below –

 Usually the first element of the array is treated as key. The position of the second element is taken as the first 

index variable left and the position of the last element will be the index variable right.

 Now the index variable left is incremented by one till the value stored at the position left is greater than the key.

Similarly right is decremented by one till the value stored at the position right is smaller than the key.


Now, these two elements are interchanged. Again from the current position, left and right are incremented and 

decremented respectively and exchanges are made appropriately, if required.

This process is continued till the index variables crossover. Now, exchange key with the element at the position 

right.


 Now, the whole array is divided into two parts such that one part is containing the elements less than 

the key element and the other part is containing the elements greater than the key. And, the position of 

key is fixed now.


The above procedure (from step i to step vi) is applied on both the sub-arrays. After some iteration we 

will end-up with sub-arrays containing single element. By that time, the array will be stored.


 Let us illustrate this algorithm using an example. Consider an array 



0

1

2

3

4

5

  A=

45

36

15

92

35

71




n = 6, high = 5, key = a[low] = a[0] = 45

i = low+1 = 1, i = high = 5



0

1

2

3

4

5

  1)

45

36

15

92

35

71


low

i




high,j


Check→  key>a[i]

45>36 → true  i++




0

1

2

3

4

5

  2)

45

36

15

92

35

71




i



j


Check→  key>a[i]

45>15 → true  i++




0

1

2

3

4

5

  3)

45

36

15

92

35

71





i


j


Check→  key>a[i]

45>92 → False

Compare key with a[j]

45<71→ True   j--




0

1

2

3

4

5

  4)

45

36

15

92

35

71





i

j

 


Check→  key<a[j]

45<35 → False

If (i<j) exchange a[i] and a[j] and 

   Repeat same process

(3<4) exchange a[3] and a[4]




0

1

2

3

4

5

  5)

45

36

15

35

92

71





i

j

 


Check→  key<a[j]

45>35 → True  i++




0

1

2

3

4

5

  6)

45

36

15

35

92

71






i  j

 


Check→  key<a[j]

45>92 → False

                        Compare key with a[j]

45<92→ True   j--

 



0

1

2

3

4

5

  7)

45

36

15

35

92

71





j

i

 


Check→  key<a[j]

45<32 → False

If (i>j) exchange a[j] and key  

  



0

1

2

3

4

5

  8)

35

36

15

45

92

71







 


45 reached to the correct position in an array

A[0:2] < 45 > A[4:5]


Thus, all the elements at the left-side of key(i.e. 45) are less than key and all the elements at the 

right-side of key are greater than key. Hence, we have got two sub-arrays as – 


                                Left side                                                      Right side


0

1

2


3


4

5

   

35

36

15

  <

45

  >

92

71



Now, the position of 45 will not get changed. But, we have to sort two sub-arrays separately, by referring 

the above explained steps.



Algorithm:

Algorithm:Quicksort(A[0…..n-1],low,high)

if (low < high) then

 { 

pos = partition(x,low,high);

Quicksort(x,low,pos-1); 

Quicksort(x,pos+1,high);

 }

exit


Algorithm:partition(X[0…..n-1],low,high)

    Set  key = x[low]; 

i = low +1;

j = high; 

while(true)

{

while ((i< high) && (key >= x[i]))

 i++;

 while(key < x[j])

 j--;

 if(i < j) 

temp = x[left];

x[left] = x[j]; 

x[j] = temp; 

} else 

temp = x[low]; 

x[low] = x[j]; 

x[j] = temp; 

return(j);

 }

 } 

return 0;

 

Program: 

#include<stdio.h>

quick_sort(int x[], int low, int high)              //Function to apply quick sort technique 

{

int pos; 

if (low < high)

 { 

pos = partition(x,low,high);

quick_sort(x,low,pos-1); 

quick_sort(x,pos+1,high);

 }

 return;

 }

 int partition(int x[], int low, int high)    //Function for partitioning the array 

{

int key, temp, true = 1;

int left, right; 

key = x[low]; 

i = low +1;

j = high; 

while(true)

{

while ((i< high) && (key >= x[i]))

 i++;

 while(key < x[j])

 j--;

 if(i < j) 

temp = x[left];

x[left] = x[j]; 

x[j] = temp; 

} else 

temp = x[low]; 

x[low] = x[j]; 

x[j] = temp; 

return(j);

 }

 } 

return 0;

 }


void main()

{

int a[10],n,i,low,high;

clrscr();

printf("Enter array size\n");

scanf("%d",&n);

printf("Enter the elements\n");

for(i=0;i<n;i++)

scanf("%d",&a[i]);

low = 0;

high = n-1;

quick_sort(a,low,high);

printf("The sorted list is \n");

for(i=0;i<n;i++)

printf("%d\t",a[i]);

getch();

}


Time Complexities:

  • Worst Case Complexity [Big-O]: O(n2)
    It occurs when the pivot element picked is either the greatest or the smallest element.
    This condition leads to the case in which the pivot element lies in an extreme end of the sorted 

    array. One sub-array is always empty and another sub-array contains n - 1 elements. Thus, 

    quicksort is called only on this sub-array.
    However, the quick sort algorithm has better performance for scattered pivots.

  • Best Case Complexity [Big-omega]: O(n*log n)
    It occurs when the pivot element is always the middle element or near to the middle element.

  • Average Case Complexity [Big-theta]: O(n*log n)
    It occurs when the above conditions do not occur.

  • Space Complexity

The space complexity for quicksort is O(log n).


Quicksort Applications

Quicksort is implemented when

  • the programming language is good for recursion

  • time complexity matters

  • space complexity matters


Heap Sort

Heap sort works by visualizing the elements of the array as a special kind of complete binary tree called 

a heap. Heap is a special tree-based data structure. A binary tree is said to follow a heap data structure 

if it is a complete binary tree. All nodes in the tree follow the property that they are greater than their 

children i.e. the largest element is at the root and both its children and smaller than the root and so on. 

Such a heap is called a max-heap. If instead, all nodes are smaller than their children, it is called a 

min-heap.


Max Heap

Min Heap


Heap sort is basically improvement over the binary tree sort. There are two phases involved in sorting 

the elements using heap sort

  1. Construct a heap by adjusting the array elements.

  2. Repeatedly eliminate the root element of the heap by shifting it to the end of the array and then 

    restore the heap structure with remaining elements.


Example:

Consider a max heap in construction and the root element of max-heap is always the largest 

element. The sorting ends when all the root elements in the heap has been moved to the end of 

the array. The resulting array now contains a sorted list. Now let us see how to create a max-heap 

for the given array elements.

 

Creating heap tree:


A  =

13

86

43

38

54

23

08

63

        0      1        2        3      4      5         6           7



Step 1: Insert 13 

         

  

Step 2 : Insert 86 on LHS (13<86 insert in root part)

Step 3: insert 43 on RHS. (43<86)


Step 4: insert 38 on LHS (38>13<86 insert in root part)

 


Step 5: insert 54 on RHS (38<54 insert in root part)


Step 6: insert 23 on LHS (43>23)


Step 7: insert 08 on RHS (43>08)

Step 8: insert 63 on LHS (63>54>13 insert it on root part)




Now let us consider heap tree and apply heap sort technique.



Step 1:

Root element 86 is moved to the last location by exchanging it with 13.

86->eliminated 


        0      1        2        3      4      5         6           7

A  =








86



Adjust Above Heap tree (Constructing heap tree)



Step 2: Root element 63 is moved to the last but one location by exchanging 08 (last element)



      0      1        2        3      4      5         6           7

A  =







63

86



3: Root element 54 is moved to A[5]th location



      0      1        2        3      4      5         6           7

A  =






54

63

86



Step 4: Root 43 is moved to A[4]th location




       0      1        2        3      4      5         6           7

A  =





43

54

63

86



Step 5: Root 38 is moved to A[3]rd location



       0      1        2        3      4      5         6           7

A  =




38

43

54

63

86



Step 6: Root 23 is moved to A[2]rd location



       0      1        2        3      4      5         6           7

A  =



23

38

43

54

63

86



Step 7: Root 13 is moved to A[1]st location



       0      1        2        3      4      5         6           7

A  =


13

23

38

43

54

63

86



Step 8: Root 08 is moved to A[0]th location


       0      1        2        3      4      5         6           7

A  =

08

13

23

38

43

54

63

86


Sorted array.


Algorithm:

//Constructing max heap

// Find largest among root, left child and right child

 void heapify(int arr[], int n, int i)       

 {

  

    int largest = i;

    int left = 2 * i + 1;

    int right = 2 * i + 2;

  

    if (left < n && arr[left] > arr[largest])

      largest = left;

  

    if (right < n && arr[right] > arr[largest])

      largest = right;

  

       if (largest != i)            // Swap and continue heapifying if root is not largest

     {

      swap(&arr[i], &arr[largest]);

      heapify(arr, n, largest);

    }

  }


  // Main function to do heap sort

   void heapSort(int arr[], int n)      {

      for (int i = n / 2 - 1; i >= 0; i--)     // Build max heap

      heapify(arr, n, i);

  

       for (int i = n - 1; i >= 0; i--) {         // Heap sort

      swap(&arr[0], &arr[i]);

  

        heapify(arr, i, 0);                         // Heapify root element to get highest element at root again

    }

  }



Program:


#include <stdio.h>

 void swap(int *a, int *b)  // Function to swap the the position of two elements

 {

    int temp = *a;

    *a = *b;

    *b = temp;

  }

  

  void heapify(int arr[], int n, int i)       // Find largest among root, left child and right child

 {

  

    int largest = i;

    int left = 2 * i + 1;

    int right = 2 * i + 2;

  

    if (left < n && arr[left] > arr[largest])

      largest = left;

  

    if (right < n && arr[right] > arr[largest])

      largest = right;

  

       if (largest != i)            // Swap and continue heapifying if root is not largest

     {

      swap(&arr[i], &arr[largest]);

      heapify(arr, n, largest);

    }

  }

  

   void heapSort(int arr[], int n)     // Main function to do heap sort

 {

      for (int i = n / 2 - 1; i >= 0; i--)     // Build max heap

      heapify(arr, n, i);

  

       for (int i = n - 1; i >= 0; i--) {         // Heap sort

      swap(&arr[0], &arr[i]);

  

        heapify(arr, i, 0);                         // Heapify root element to get highest element at root again

    }

  }

  void printArray(int arr[], int n) {

    for (int i = 0; i < n; ++i)

      printf("%d ", arr[i]);

    printf("\n");

  }

  int main() {

    int arr[] = {1, 12, 9, 5, 6, 10};

    int n = sizeof(arr) / sizeof(arr[0]);

  

    heapSort(arr, n);

  

    printf("Sorted array is \n");

    printArray(arr, n);

}


Heap Sort Complexity:

Heap Sort has O(nlog n) time complexities for all the cases ( best case, average case, and worst case).

Let us understand the reason why. The height of a complete binary tree containing n elements is log n

As we have seen earlier, to fully heapify an element whose subtrees are already max-heaps, we need to 

keep comparing the element with its left and right children and pushing it downwards until it reaches a point 

where both its children are smaller than it.

In the worst case scenario, we will need to move an element from the root to the leaf node making a 

multiple of log(n) comparisons and swaps. During the build_max_heap stage, we do that for n/2 elements 

so the worst case complexity of the build_heap step is n/2*log n ~ nlog n.

During the sorting step, we exchange the root element with the last element and heapify the root element. For 

each element, this again takes log n worst time because we might have to bring the element all the way 

from the root to the leaf. Since we repeat this n times, the heap_sort step is also nlog n.

Also since the build_max_heap and heap_sort steps are executed one after another, the algorithmic 

complexity is not multiplied and it remains in the order of nlog n. Also it performs sorting in O(1) space 

complexity. Compared with Quick Sort, it has a better worst case ( O(nlog n) ). Quick Sort has complexity

 O(n^2) for worst case. But in other cases, Quick Sort is fast. Introsort is an alternative to heapsort that 

combines quicksort and heapsort to retain advantages of both: worst case speed of heapsort and average 

speed of quicksort.



Heap Sort Applications

Systems concerned with security and embedded systems such as Linux Kernel use Heap Sort because 

of the O(n log n) upper bound on Heapsort's running time and constant O(1) upper bound on its auxiliary 

storage.

Although Heap Sort has O(n log n) time complexity even for the worst case, it doesn't have more applications 

( compared to other sorting algorithms like Quick Sort, Merge Sort ). However, its underlying data structure, 

heap, can be efficiently used if we want to extract the smallest (or largest) from the list of items without the 

overhead of keeping the remaining items in the sorted order. For e.g Priority Queues.


Radix Sort 

Radix sort is a sorting technique that sorts the elements by first grouping the individual digits of the 

same place value. Then, sort the elements according to their increasing/decreasing order.


Suppose, we have an array of 8 elements. First, we will sort elements based on the value of the unit 

place. Then, we will sort elements based on the value of the tenth place. This process goes on until 

the last significant place.


 Let us illustrate this algorithm using an example. Consider an array 


  A=

121

432

564

23

1

45

788


Max = largest element in the array = 788

X= number of digits in max = 3 

loop= 3times(up to hundreds place)


Starting from the rightmost(last) digit,sort the number based on that digit:




  A=

121

432

564

023

001

045

788


               Sorted array


  A=

121

001

432

023

564

045

788





 Now, sort the elements based on digits at tens place.


  A=

121

001

432

023

564

045

788


           Sorted array


  A=

001

121

023

432

045

564

788



Finally sort the element by the leftmost digit.


  A=

001

121

023

432

045

564

788


Sorted array


  A=

001

023

045

121

432

564

788


Radix Sort Algorithm

radixSort(array)

  d <- maximum number of digits in the largest element

  create d buckets of size 0-9

  for i <- 0 to d

    sort the elements according to ith place digits using countingSort


countingSort(array, d)

  max <- find largest element among dth place elements

  initialize count array with all zeros

  for j <- 0 to size

    find the total count of each unique digit in dth place of elements and

    store the count at jth index in count array

  for i <- 1 to max

    find the cumulative sum and store it in count array itself

  for j <- size down to 1

    restore the elements to array

    decrease count of each element restored by 1


Radix Sort in C Programming

#include <stdio.h>


// Function to get the largest element from an array

int getMax(int array[], int n) {

  int max = array[0];

  for (int i = 1; i < n; i++)

    if (array[i] > max)

      max = array[i];

  return max;

}


// Using counting sort to sort the elements in the basis of significant places

void countingSort(int array[], int size, int place) {

  int output[size + 1];

  int max = (array[0] / place) % 10;


  for (int i = 1; i < size; i++) {

    if (((array[i] / place) % 10) > max)

      max = array[i];

  }

  int count[max + 1];


  for (int i = 0; i < max; ++i)

    count[i] = 0;


  // Calculate count of elements

  for (int i = 0; i < size; i++)

    count[(array[i] / place) % 10]++;

    

  // Calculate cummulative count

  for (int i = 1; i < 10; i++)

    count[i] += count[i - 1];


  // Place the elements in sorted order

  for (int i = size - 1; i >= 0; i--) {

    output[count[(array[i] / place) % 10] - 1] = array[i];

    count[(array[i] / place) % 10]--;

  }


  for (int i = 0; i < size; i++)

    array[i] = output[i];

}


// Main function to implement radix sort

void radixsort(int array[], int size) {

  // Get maximum element

  int max = getMax(array, size);


  // Apply counting sort to sort elements based on place value.

  for (int place = 1; max / place > 0; place *= 10)

    countingSort(array, size, place);

}


// Print an array

void printArray(int array[], int size) {

  for (int i = 0; i < size; ++i) {

    printf("%d  ", array[i]);

  }

  printf("\n");

}


// Driver code

int main() {

  int array[] = {121, 432, 564, 23, 1, 45, 788};

  int n = sizeof(array) / sizeof(array[0]);

  radixsort(array, n);

  printArray(array, n);

}


Time Complexity:

Since radix sort is a non-comparative algorithm, it has advantages over comparative sorting algorithms.

For the radix sort that uses counting sort as an intermediate stable sort, the time complexity is O(d(n+k)).



Here, d is the number cycle and O(n+k) is the time complexity of counting sort.

Thus, radix sort has linear time complexity which is better than O(nlog n) of comparative sorting algorithms.

If we take very large digit numbers or the number of other bases like 32-bit and 64-bit numbers then it can perform in linear time however the intermediate sort takes large space.

Radix Sort Applications:

  • Radix sort is implemented in DC3 algorithm (Kärkkäinen-Sanders-Burkhardt) while making a suffix array.

  • places where there are numbers in large ranges.


Shell Sort

Shell sort is an algorithm that first sorts the elements far apart from each other and successively reduces 

the interval between the elements to be sorted. It is a generalized version of insertion sort.


In shell sort, elements at a specific interval are sorted. The interval between the elements is gradually 

decreased based on the sequence used. The performance of the shell sort depends on the type of 

 

sequence used for a given input array.


 Let us illustrate this algorithm using an example. Consider an array 



          A =

9

8

3

7

5

6

4

1


Initial array A

We are using the shell's original sequence (N/2, N/4, ...1) as intervals in our algorithm.

N=8 

Interval = N/2 = 4



          A =

9

8

3

7

5

6

4

1


0

  1

2

  3

  4

  5 

  6

7

Compare and swap 

A[0]>A[4] = 9>5 True , swap


Rearrange the elements at n/2 interval


          A =

5

8

3

7

9

6

4

1


0

  1

2

  3

  4

  5 

  6

7


This process goes on for all the remaining elements.



          A =

5

8

3

7

9

6

4

1

8>6 True, swap


0

  1

2

  3

  4

  5 

  6

7



          A =

5

8

3

7

9

6

4

1

3>4 False no change


0

  1

2

  3

  4

  5 

  6

7



          A =

5

8

3

7

9

6

4

1

7>1 True,swap


0

  1

2

  3

  4

  5 

  6

7


Rearrange all the elements at n/2 interval

          A =

5

6

3

1

9

8

4

7


0

  1

2

  3

  4

  5 

  6

7


3. In the second loop

    Interval = N/4 =8/4 = 2



Rearrange the elements at n/4 interval


          A =

5

6

3

1

9

8

4

7

5>3  True,swap


0

  1

2

  3

  4

  5 

  6

7




          A =

3

6

5

1

9

8

4

7

6>1  True,swap




0

  1

2

  3

  4

  5 

  6

7


All the elements in the array lying at current interval are compared.

  • 4th and 2nd position are compared.

  • 2nd and 0th position are compared.


          A =

3

1

5

6

9

8

4

7

5>9 False,no change

3>5 False,no change


0

  1

2

  3

  4

  5 

  6

7




          A =

3

1

5

6

9

8

4

7

6>8 False,no change

6>1 False,no change


0

  1

2

  3

  4

  5 

  6

7




          A =

3

1

5

6

9

8

4

7

9>4 True,swap

5>4 True,swap


0

  1

2

  3

  4

  5 

  6

7



          A =

3

1

4

6

5

8

9

7

8>7 True,swap

6>7 False,swap


0

  1

2

  3

  4

  5 

  6

7




Rearrange the elements at n/4 interval


          A =

3

1

4

6

5

7

9

8


0

  1

2

  3

  4

  5 

  6

7

Finally,when the interval is N/8=8/8=1 then array elements lying at the interval of 1 are sorted.

Array A=

3

1

4

6

5

7

9

8

3>1 True, swap

1

3

4

6

5

7

9

8

  No change

1

3

4

6

5

7

9

8

  No change

1

3

4

6

5

7

9

8

6>5 True, swap 

1

3

4

5

6

7

9

8

No change

1

3

4

5

6

7

9

8

No change

1

3

4

5

6

7

9

8

9>8 True, swap

1

3

4

5

6

7

8

9

Sorted array


Sorted array:

   

          A =

1

3

4

5

6

7

8

9


0

  1

2

  3

  4

  5 

  6

7


Shell Sort Algorithm:

shellSort(array, size)

  for interval i <- size/2n down to 1

    for each interval "i" in array

        sort all the elements at interval "i"

end shellSort


Shell Sort in C programming

#include <stdio.h>

// Shell sort

void shellSort(int array[], int n) {

  // Rearrange elements at each n/2, n/4, n/8, ... intervals

  for (int interval = n / 2; interval > 0; interval /= 2) {

    for (int i = interval; i < n; i += 1) {

      int temp = array[i];

      int j;

      for (j = i; j >= interval && array[j - interval] > temp; j -= interval) {

        array[j] = array[j - interval];

      }

      array[j] = temp;

    }

  }

}


// Print an array

void printArray(int array[], int size) {

  for (int i = 0; i < size; ++i) {

    printf("%d  ", array[i]);

  }

  printf("\n");

}


// Driver code

int main() {

  int data[] = {9, 8, 3, 7, 5, 6, 4, 1};

  int size = sizeof(data) / sizeof(data[0]);

  shellSort(data, size);

  printf("Sorted array: \n");

  printArray(data, size);

}

Time Complexity:

Shell sort is an unstable sorting algorithm because this algorithm does not examine the elements lying 

in between the intervals.


  • Worst Case Complexity: less than or equal to O(n2)

Worst case complexity for shell sort is always less than or equal to O(n2).


According to Poonen Theorem, worst case complexity for shell sort is Θ(Nlog N)2/(log log N)2) 

or Θ(Nlog N)2/log log N) or Θ(N(log N)2) or something in between.

  • Best Case Complexity: O(n*log n)

When the array is already sorted, the total number of comparisons for each interval (or increment) 

is equal to the size of the array.

  • Average Case Complexity: O(n*log n)

It is around O(n1.25).

The complexity depends on the interval chosen. The above complexities differ for different increment 

sequences chosen. Best increment sequence is unknown.

  • Space Complexity:

The space complexity for shell sort is O(1).


Shell Sort Applications

Shell sort is used when:

  • calling a stack is overhead. uClibc library uses this sort.

  • recursion exceeds a limit. bzip2 compressor uses it.

  • Insertion sort does not perform well when the close elements are far apart. Shell sort helps in reducing 

    the distance between the close elements. Thus, there will be less number of swappings to be performed.



Comments

Popular posts from this blog

Mastering SQL for Data Science: Top SQL Interview Questions by Experience Level

Introduction: SQL (Structured Query Language) is a cornerstone of data manipulation and querying in data science. SQL technical rounds are designed to assess a candidate’s ability to work with databases, retrieve, and manipulate data efficiently. This guide provides a comprehensive list of SQL interview questions segmented by experience level—beginner, intermediate, and experienced. For each level, you'll find key questions designed to evaluate the candidate’s proficiency in SQL and their ability to solve data-related problems. The difficulty increases as the experience level rises, and the final section will guide you on how to prepare effectively for these rounds. Beginner (0-2 Years of Experience) At this stage, candidates are expected to know the basics of SQL, common commands, and elementary data manipulation. What is SQL? Explain its importance in data science. Hint: Think about querying, relational databases, and data manipulation. What is the difference between WHERE

What is Bort?

 Introduction: Bort, is the new and more optimized version of BERT; which came out this october from amazon science. I came to know about it today while parsing amazon science's news on facebook about bort. So Bort is the newest addition to the long list of great LM models with extra-ordinary achievements.  Why is Bort important? Bort, is a model of 5.5% effective and 16% total size of the original BERT model; and is 20x faster than BERT, while being able to surpass the BERT model in 20 out of 23 tasks; to quote the abstract of the paper,  ' it obtains performance improvements of between 0 . 3% and 31%, absolute, with respect to BERT-large, on multiple public natural language understanding (NLU) benchmarks. ' So what made this achievement possible? The main idea behind creation of Bort is to go beyond the shallow depth of weight pruning, connection deletion or merely factoring the NN into different matrix factorizations and thus distilling it. While methods like knowle

Spacy errors and their solutions

 Introduction: There are a bunch of errors in spacy, which never makes sense until you get to the depth of it. In this post, we will analyze the attribute error E046 and why it occurs. (1) AttributeError: [E046] Can't retrieve unregistered extension attribute 'tag_name'. Did you forget to call the set_extension method? Let's first understand what the error means on superficial level. There is a tag_name extension in your code. i.e. from a doc object, probably you are calling doc._.tag_name. But spacy suggests to you that probably you forgot to call the set_extension method. So what to do from here? The problem in hand is that your extension is not created where it should have been created. Now in general this means that your pipeline is incorrect at some level.  So how should you solve it? Look into the pipeline of your spacy language object. Chances are that the pipeline component which creates the extension is not included in the pipeline. To check the pipe eleme