Bubble sort, also sometimes called sinking sort, is a sorting algorithm that is sometimes used to introduce algorithms. It turns out that it is a pretty inefficient means of sorting, especially if the list is very much unsorted and contains many elements. For that reason, some textbooks now avoid teaching Bubble sort; the idea is that its impracticality undermines its pedagogical use. But I disagree.
When learning other concepts, such as the meaning of “valid” in a critical reasoning or first-order propositional logic course, it is common to present valid arguments that no one would ever use for the purpose of arriving at true beliefs. Those arguments can nevertheless be a good heuristic for grasping what validity is. It is sometimes the case that thought experiments or ideas that have no real-world application, when put in their proper context, can be useful for understanding something else about the world. Bubble sort can still help us get a good first grasp of how to think about algorithms. For that reason, I am going to talk about how I think through writing an algorithm using Bubble sort as a heuristic.
How does Bubble Sort Work?
Bubble sort starts at the beginning of a list and compares the one element with the next element. If the first element is greater than the second element, then it swaps them; otherwise, it does nothing with them. If there are more items following the second element, then it moves to the next two elements, so that what was previously the second element is now the first element and the third (previously not compared element) is now the second element. This process continues until it makes a comparison with the last item in the list. After that point, it will start back again at the beginning, run through the compare and possibly swap strategy, except that we need not check the very last item in the list because our first go-around already put the largest number at the end. So we need only check one item less on each iteration. This whole process continues until we have compared the first two elements in the list, if there are even two to be compared. Let’s see this in action by starting with a simple case and working our way up to more and more complex cases.
A simple case is to sort a list that has only one element. Since there is no second element to which the first element might be compared, we are done. It is trivially sorted. We could immediately return the list. Since a list with only one element exists at index zero, what this suggests at first pass is that the stopping point at which no iterations of compare-and-maybe-swap happen is when right most item to be checked is at the zeroth index, i.e., stoppingPoint == 0. That’s incorrect, in fact. But assume that is true for now and we’ll see why it is wrong later. So to calculate how many iterations we must do, take the count the number of items in the list and subtract it by one. Since a list with only one item equals zero, we do not need to compare and so we just return the list.
A more complicated case is a list that has two elements: [ 5, 2 ]
The first thing we do is determine whether we need to compare and possibly swap. So we take the count of items in the list, subtract one. 2 – 1 = 1, which is the value of our stoppingPoint. 1 does not equal zero. So we must do one comparison. Compare the first two elements, which are 5 and 2. Then we check if the first item is greater than the second. 5 is greater than 2. So we swap their values in the list. So now the list looks like this: [2, 5]. Then we check if the second item in in the list index that matches our stoppingPoint. The second second item is in index 1, and our stoppingPoint equals 1. At this point in the iteration at which we can stop comparing values. Now we subtract one from our stoppingPoint, so stoppingPoint equals zero. Do we need to go through our list again? Only if stoppingPoint isn’t zero. Bu it is equal to zero. So we are done. Return the list.
A more complicated case is to sort [ 7, 6, 5 ] since there are now 3 items. Using the above approach, our list would sort as following.
stoppingPoint = number of items in list – 1 // so stoppingPoint = 2.
2 != 0. //so we are going to do at least one iteration of compare and possibly swap
[ 7, 6, 5 ] // Compare the first two elements in the list.
Is the first item (7) larger than the second item (6)? Yes. So we swap them.
[ 6, 7, 5 ] // Swapped
Is the second item (7) at the stopping point? No, since the second item is at index 1 and our stoppingPoint == 2. So we continue with compare and possibly swap.
[ 6, 7, 5 ] // Compare the next two items.
Is the first item (7) larger than the second item (5)? Yes. So we swap them.
[ 6, 5, 7 ] // Swapped
Is the second item (7) at the stopping point? Yes, since it is at an index value equal to 2, which is the value of stoppingPoint. So we update the stoppingPoint value by subtracting it by 1.
stoppingPoint = stoppingPoint – 1;
stoppingPoint = 1;
Is stoppingPoint equal to 0? No. Next iteration:
[ 6, 5, 7 ] //Compare
[ 5, 6, 7 ] // Swapped
Second item (i.e. 6) is at index 1. Stopping point is 1. So no more compare and possibly swap. So subtract one from stoppingPoint:
stoppingPoint = 0;
Is stoppingPoint equal to 0? Yes. No more iterations. Return list.
Looking at the Code
So here we see that we pass in an array of integers. We initialize the stoppingPoint value to the lenght of the array minus one in order to not bet a out of range exception. And we are going to loop through the for-loop as long as the stoppingPoint is not equal to zero. Each time we reach the end of the loop, we will subtract one from the stoppingPoint.
Inside the for-loop, we initialize two values, val1 and val2, to hold the indexes of the array, which correspond to the items being compared. We set them to 0 and 1 because we want to start at the beginning of the array. Then we check if the right-most value, val2, is less than or equal to the stoppingPoint. If it s less than the stoppingPoint, then there are more items in the array we will need to compare and possibly swap. If they are equal, then we need to check at least once.
So think about this if we passed in the following array: [ 1 ]. The array length, that is the size of the array, is one. So 1 – 1 is 0. Since we will enter the for-loop only if stoppingPoint is not equal to 0, we will never enter the for-loop and we return the same array.
Now consider our example: [ 5, 2 ]. Stopping point equals the size of the array minus one, i.e., 2 – 1, which equals 1. It is true that stoppingPoint is not equal to 0. So we enter the for loop, initialize val1 and val2. Is val2 less than or equal to stoppingPoint? That is, is 1 <= 1? Yes. Enter the while-loop. Compare the values. In this case, we swap them using a temp variable to hold one of the values. Exit the if-block. Then increment the values, so next time we go through the while-loop, if we do go through it, we would be comparing the next two items. After the first iteration of our while loop, we have the an array that looks like this: [ 2, 5 ], with val1 = 1 and val2 = 2. Is val2 (i.e. 2) less than or equal our stoppingPoint (i.e. 1)? No. Exit the while-loop. Before we enter the for-loop again, stoppingPoint is decremented by one. So when we started, stoppingPoint was equal to 1. But after being decremented, it is equal to 0. The for-loop checks if stoppingPoint equals 0 before entering the block again. Is stoppingPoint (i.e. 0) not equal to 0? No, for they are equivalent. So we exit the for loop. Then we return the array.
This looks pretty good. So let’s run some tests:
And the tests pass! But the tests are not adequate. What happens if we pass in an empty array, something we had not considered above? What will the stoppingPoint value be initially? It will be the size of the array, 0, minus 1. So stoppingPoint equals -1, right? Wrong. It will equal 2147483647. (There’s a homework assignment for you to figure out why). So we enter the for-loop. val2 = 1. The while-loop checks: is 1 <= 2147483647, which of course it is. Now notice what happens at the if-block. It looks for values in the array at indexes 0 and 1 (on this first iteration). Given there are no values in this array, we cannot find values at indexes that do not exist. So we get a IndexOutOfRangeException.
Fixing the Code
There is a hack way to fix this code and a better way. The hack fix is before the for-loop, toss in an if-block that checks whether the array is empty or not. If it is empty, return the array immediately, otherwise, continue to the for-loop. But this is unnecessary. Above, I said we were going to assume the incorrect statement that “the stopping point at which no iterations of compare-and-maybe-swap happen is when right most item to be checked is at the zeroth index.” And I told you this was incorrect. What we really need to check is not whether the stoppingPoint equals zero, but that the stoppingPoint is always greater than zero.
If we pass in an array with no elements, the stoppingPoint value will this time equal -1. Wtf?!? Why was it that by simply changing the second part of the for-loop from stoppingPoint != 0 to stoppingPoint > 0 change the value stoppingPoint is intialized to? Consider that part of your homework again. In any case, -1 is not greater than 0. So the for-loop is never entered and the array is immediately returned. Once we rerun all of the test cases, the tests pass.
Here’s the fixed version: