Thursday, February 7, 2013

Inversions

The first thing I've learned from Algorithms: Design and Analysis Part I is how to count inversions.

What is an inversion? An inversion is a pair of objects that are out of order. If we are given an array and our goal is to sort from low to high, an inversion would be two items ai and aj such that ai > aj, but  i < j. We can see that they are misplaced.

An array with no inversions is one that is sorted and an array with the maximum number of possible inversions would be sorted in reverse order. Thus, counting inversions is a way of checking how sorted or unsorted something is.

How do we find the number of inversions? Well, the naive and obvious way would be to traverse the array and compare each element to every other element. This algorithm would be O(n^2). Can we do it a better way? Yes. We'll use the divide and conquer method and piggyback on the MergeSort algorithm to do it in O(nlogn). By counting the number of inversions, we will also be sorting it.

By splitting the array in half, we know that:
total inversions = inversions in left + inversions in right + split inversions
A split inversion is one in which we have one element on the left half that will be an inversion with an element on the right half. To find the number of inversions in each half, we'll just recursively call the algorithm on each half. Finding the number of split inversions is the trickier task.

Say we have two halves, [2, 5, 6] and [1, 3, 8]. The number of split inversions will actually be counted when we do the merge subroutine. So the way merge works is that it will keep track of two pointers, one to the front of each half array. It will take the smaller element and increment the pointer. If an element is taken from the second array, we know that it must be smaller than the element in the first subarray, and subsequently every element thereafter. Then we can just sum up the number of elements left in the first array each time the second subarray is chosen from before the first.

Here is my code for this algorithm:
 int countInversions(int[] a) {
  return countInversions(a, 0, a.length, new int[a.length]);
 }

 int countInversions(int[] a, int start, int end, int[] t) {
  if (start == end - 1)
   return 0;
  int mid = (start + end) / 2;
  int x = countInversions(a, start, mid, t);
  int y = countInversions(a, mid, end, t);
  int z = subroutine(a, start, mid, end, t);
  return x + y + z;
 }

 int subroutine(int[] a, int start, int mid, int end, int[] t) {
  int i = start;
  int j = mid;
  int k = i;
  int count = 0;
  while (i < mid && j < end)
   if (a[i] < a[j])
    t[k++] = a[i++];
   else {
    t[k++] = a[j++];
    count += (mid - i);
   }
  System.arraycopy(a, i, t, k, mid - i);
  System.arraycopy(a, j, t, k, end - j);
  System.arraycopy(t, start, a, start, end - start);

  return count;
 }

No comments:

Post a Comment