Quick Sort
Background
Sorting is an integral part of computer science. From user applications like Excel to basic system operations like memory management, rearranging data based on some criteria provides easier access to information.
One such algorithm that is frequently used is called the quick sort algorithm. Quick sort performs well for large inputs and is able to integrate simpler algorithms for smaller inputs.
Time and Supplies
This instructable will require about 30 minutes of your time. You will need access to a computer with internet access so that you can download an IDE.
Note about compilation method
Use of an IDE is assumed in this instructable, but the process can be done using command line compilation and execution as well. If you are already have access an IDE or are able to compile and execute your program on the command line, you may begin at step 2.
Setup an IDE
Note: If you already have an IDE with which you are familiar or some other method of compilation and execution, you may skip to step 3. If you already have access to Code::Blocks but are unfamiliar with it, you may skip to step 2.
Background
IDE stands for Integrated Development Environment. An IDE is an extremely useful tool when working on software project because it takes many of the tools that programmers can use in the programming process and puts them in one place with a neatly organized graphical interface.
Steps
1) Visit http://www.codeblocks.org/downloads/binaries
2) Download the package that best fits your operating system and level expertise. Look for helpful notes such as "IF UNSURE, USE "codeblocks-13.12mingw-setup.exe"!"
3) Once you have successfully downloaded the appropriate file, follow the installation process and start the application.
Creating Your Project
Background
We will be writing this program in C++, a very popular programming language. We'll make a single file that contains all of our functions and calling code.
Goal
First let's learn how to make a new project and a new file, then we'll dive into writing the code.
Steps
1) Once you have the Code::Blocks application open, click on "Create a new project" and choose "Empty project".
2) You'll need to name your project and save it somewhere. I'll use the name "QuickSort" for this project.To browse your folders, you can click on the button with three dots next to the "Folder to create project in:" box.
3) After choosing your project title and location, press "Next" and "Finish". We won't worry about the other settings.
4) To make your first file, go to File > New > File. Click on the "C/C++ source" icon and select "C++" on the next page.
5) To name your file, click on the three dots next to the text box and type in your file name. My example will be "main.cpp", so I will type in "main" and hit save. Click the "All" button followed by the "Finish" button.
Review
At this point you should see something similar to the above image. We are now ready to start writing code.
Basic Outline of the Program
Background
Next we'll write the basic outline of the program.
Goal
The above image is what your file should look like once this step is complete. Let's walk through each piece of it.
Steps
1) The lines beginning with "#include" allow us to use function from those libraries. The next line allows us to use the standard namespace, a concept outside the scope of this instructable but necessary for our success. The line beginning with "const" is a variable declaration that we will use later and will represent the size of our test input.
2) Our main function, called main, is the function that will be called when we run the program. It is critical that you name this function main. It returns an "int", which is an integer, and it takes no parameters, as indicated by the empty parentheses. In reality, main does accept parameters, but that is outside the scope of this instructable.
3) Now notice the two lines that come before main. These are the function prototypes, and they are placed before main so that we can use those functions in main. The implementations of each will go below main.
Review
Again, some of the code you see you may not recognize. If you'd like more information on any part of the program, it is very easily accessible online.
Next we will take a look at the implementation of the functions we outlined.
What Are Arrays?
Let's take a quick detour to talk about arrays before we continue.
Arrays are nothing more than multiple values stored sequentially in memory. You can have arrays of anything, but in our example we use an array of integers.
When describing the above array, one would say, for example, that at index 4 is the value 15. If we were to sort this array, the value 1 would be at index 0, and the value 20 would be at index 9.
Array indices begin at at the number 0, not the number 1. So an array of size n contains values at indices 0 through n - 1.
Many functions, including the two that we will write, take advantage of arrays.
Implementing Qs_partition
Background
Before getting around to implementing the function quick_sort, we must understand that quick_sort doesn't do much work by itself. Our qs_partition function does most of the work. qs_partition takes as parameters an array of integers, a left boundary in that array, and a right boundary in that array, and returns an integer, just like main.
* Note: because a library we are using (algorithm.h) already has a function called partition, we will call our partition function qs_partition.
Goal
The job of qs_partition is to pick a number in the array as a pivot and separate all the numbers in the array into two groups: those that are less than or equal to the pivot and those that are greater. It places all the lesser or equal numbers before the pivot and all the greater numbers after. Here's one way that can be accomplished.
Steps
1) qs_partition first takes the first thing it is allowed to touch (the item at the left boundary) and calls it a pivot. There is nothing special going on here. All the function does first is declare that a value called pivot is equal to the first item that the function is allowed to access.
2) In the same way, we'll call a variable toSwap and assign it the index of the pivot.
3) We then run through the array, looking at each item from the second value to the right boundary. We don't compare the first value to the pivot value because the first value is the pivot value.
4) At each spot, we compare the value at that spot to our pivot. If that value is less than or equal to the pivot, we want to move that value as far toward the beginning of the array as possible. That's where toSwap comes in. We'll swap the value at toSwap (because toSwap is an index) and the current value we are seeing, then we'll increase toSwap by one so we aren't swapping out those values back out into the middle of the array.
5) Once we have run through all the numbers, our toSwap index will represent the spot where we want to place our pivot. So we'll swap the value at toSwap and the pivot value. Lastly, we'll return, or pass back to the function that called qs_partition, the index of where we put the pivot. You'll see why soon.
Review
Hopefully you can see that by sequentially swapping numbers less than the pivot into the first spot after our last swap, we have created the desired effect: all numbers before the pivot are less than or equal to the pivot and all numbers after it are greater.
Implementing Quick_sort
Background
Now that we have written the function that does the bulk of the work, we can get to the easy part. Our quick_sort function is very easy to understand and is based on the idea that we want to do as little work as possible ourselves.
Our first question is this: what is the easiest array to sort? In my opinion, it is one with a single element. If you give me one number, I can sort it instantly. Working with that notion, let's write quick_sort.
Goal
We want to break up the work using our qs_partition function so that never have to do any real work.
Steps
1) First we'll check if we can avoid work altogether. If our left boundary is less than our right boundary, then we have work to do, unfortunately. If our left boundary is the same as our right boundary, the section we are sorting is only one element. If left is greater than right, we are in an error state.
2) Our next strategy in work avoidance is telling other people to do it. We'll call qs_partition and receive the pivot index that it used. As you know from the last step, anything in the array before this index is less than or equal to the value at the pivot. Everything after is greater than the pivot.
3) What does that mean? It means that if we sort everything before the pivot and everything after the pivot, we will have a sorted array, because the pivot is already in the correct spot. So we'll do just that. Do we know any sorting algorithms? We have quick_sort.
Review
It looks like we can get away with not doing any real work. You'll see that if we call quick_sort, it results in two more quick_sort calls, one on each side of our pivot. This branching behavior will continue until each call runs into an array that is only one element. Since an array of one element is sorted, then our entire array is sorted.
Write Calling Code
Background
You have now implemented the quick sort algorithm. Let's see if you did it correctly. To do this, we will need to write code that will call the quick_sort function.
To compile and run your program, you can go to Build > Build and run, or you can press F9 if you are using Windows.
Goal
Writing in main, make an array of integers and sort it using our quick_sort function. Print out the array before and after sorting.
Steps
1) The first two lines are a bit of setup. We are seeding our random number generator with the current time, to get more variability. Then we are creating an array with ARRAY_SIZE integers in it. ARRAY_SIZE is defined earlier in the file, and for this test we will set it to 30, though you can test larger or smaller sizes as well.
2) Next, fill the array with random integers between 0 and 500 inclusive, printing out the array as you go. You can change range of possible random integers as well.
3) You'll see then that we call quick_sort and pass to it
i) the array,
ii) 0, meaning the first element in the array, and
iii) ARRAY_SIZE - 1, the last spot in the array.
4) To see if it worked, print out the the array again.
Review
Once you have written your calling code, you can build and run your program to see the results. If there are errors, be sure to double check all of your code and compile and run it again.