Feb
15
2011

Big-O (Theta) analysis

Big-O (Theta) analysis

A good understanding of big-O analysis is critical to making a good impression with the interviewer. Big-O analysis is a form of run-time analysis that measures the efficiency of an algorithm in terms of the time it takes for the algorithm to run as a function of the input size. It’s not a formal benchmark, just a simple way to classify algorithms by relative efficiency.

In mathematics, computer science, and related fields, big-O notation (also known as big Oh notation, big Omicron notation, Landau notation, Bachmann–Landau notation, and asymptotic notation) (along with the closely related big-Omega notation, big-Theta notation, and little o notation) describes the limiting behavior of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions. Big O notation characterizes functions according to their growth rates: different functions with the same growth rate may be represented using the same O notation.

How to Do Big-O Analysis

The general procedure for big-O run-time analysis is as follows:

  • Figure out what the input is and what n represents.
  • Express the number of operations the algorithm performs in terms of n.
  • Eliminate all but the highest-order terms.
  • Remove all constant factors.

For the algorithms you’ll be tested on, big-O analysis should be straightforward as long as you correctly identify the operations that are dependent on the input size.

Most coding problem solutions iinclude a run-time analysis to help you solidify your understanding of the algorithms.

Analyzing Two Examples

Let’s start with an example of big-O analysis in action. Consider a simple function that returns the maximum value stored in an array of non-negative numbers. The size of the array is n. There are at least two easy ways to implement the function. In the first alternative, you keep track of the current largest number as the function iterates through the array and return that value when you are done iterating. This implementation, called CompareToMax, looks like this:

/* Returns the largest integer in the array */
int CompareToMax(int array[], int n)
{
int curMax, i;

if (n <= 0)
trhow new ArgumentException("Make sure that there is at least one element in the array.");

/* Set the largest number so far to the first array value. */
curMax = array[0];

/* Compare every number with the largest number so far. */
for (i = 1; i < n; i++) {
if (array[i] > curMax) {
curMax = array[i];
}
}
return curMax;
}

The second alternative compares each value to all the other values. If all other values are less than or equal to a given value, that value must be the maximum value. This implementation, called CompareToAll, looks like this:

/* Returns the largest integer in the array */
int CompareToAll(int array[], int n)
{
int i, j;
bool isMax;

if (n <= 0)
trhow new ArgumentException("Make sure that there is at least one element in the array.");

for (i = n-1; i > 0; i--) {
isMax = true;
for (j = 0; j < n; j++) {
/* See if any value is greater. */
if (array[j] > array[i])
isMax = false; /* array[i] is not the largest value. */
}
/* If isMax is true, no larger value exists; array[i] is max. */
if (isMax) break;
}

return array[i];
}

Both of these functions correctly return the maximum value. Which one is more efficient? You could try benchmarking them, but it’s usually impractical (and inefficient) to implement and benchmark every possible alternative. You need to be able to predict an algorithm’s performance without having to implement it. Big-O analysis enables you to do exactly that: compare the predicted relative performance of different algorithms.

How Big-O Analysis Works

In big-O analysis, input size is assumed to be n. In this case, n simply represents the number of elements in an array. In other problems, n may represent the number of nodes in a linked list, or the number of bits in a data type, or the number of entries in a hash table, and so on. After figuring out what n means in terms of the input, you have to determine how many times the n input items are examined in terms of n. “Examined” is a fuzzy word because algorithms differ greatly. Commonly, an examination might be something like adding an input value to a constant, creating a new input item, or deleting an input value. In big-O analysis, these operations are all considered equivalent. In both CompareToMax and CompareToAll, “examine” means comparing an array value to another value.

In CompareToMax, each array element was compared once to a maximum value. Thus, the n input items are each examined once, resulting in n examinations. This is considered O(n), usually referred to as linear time: The time required to run the algorithm increases linearly with the number of input items.

You may notice that in addition to examining each element once, there is a check to ensure that the array is not empty and a step that initializes the curMax variable. It may seem more accurate to call this an O(n + 2) function to reflect these extra operations. Big-O analysis, however, yields the asymptotic running time, the limit of the running time as n gets very large. As n approaches infinity, the difference between n and n + 2 is insignificant, so the constant term can be ignored. Similarly, for an algorithm running in n + n2 time, the difference between n2 and n + n2 is negligible for very large n. Thus, in big-O analysis you eliminate all but the highest-order term, the term that is largest as n gets very large. In this case, n is the highest-order term. Therefore, the CompareToMax function is O(n).

 
The analysis of CompareToAll is a little more difficult. First, you have to make an assumption about where the largest number occurs in the array. For now, assume that the maximum element is at the end of the array. In this case, this function may compare each of n elements to n other elements. Thus, there are n(n examinations, and this is an O(n2) algorithm.

The analysis so far has shown that CompareToMax is O(n) and CompareToAll is O(n2). This means that as the array grows, the number of comparisons in CompareToAll will become much larger than in CompareToMax. Consider an array with 30,000 elements. CompareToMax compares on the order of 30,000 elements, whereas CompareToAll compares on the order of 900,000,000 elements. You would expect CompareToMax to be much faster because it examines 30,000 times fewer elements. In fact, one benchmark timed CompareToMax at less than .01 seconds, while CompareToAll took 23.99 seconds.

The fastest-possible running time for any run-time analysis is O(1), commonly referred to as constant running time. An algorithm with constant running time always takes the same amount of time to execute, regardless of the input size.

Best, Average, and Worst Cases

You may think this comparison was stacked against CompareToAll because the maximum value was at the end. This is true, and it raises the important issues of best case, average case, and worst case running times. The analysis of CompareToAll was a worst-case scenario: The maximum value was at the end of the array. Consider, however, the average case, where the largest value is in the middle. You end up checking only half the values n times because the maximum value is in the middle. This results in checking n (n / 2) = n2/ 2 times. This would appear to be an O(n2/ 2) running time. Consider, though, what the 1/2 factor means. The actual time to check each value is highly dependent on the machine instructions that the code translates to and then on the speed at which the CPU can execute the instructions. Therefore, the 1/2 doesn’t mean very much. You could even come up with an O(n2) algorithm that was faster than an O(n2/ 2) algorithm. In big-O analysis, you drop all constant factors, so the average case for CompareToAll is no better than the worst case. It is still O(n2).

The best-case running time for CompareToAll is better than O(n2). In this case, the maximum value is at the beginning of the array. The maximum value is compared to all other values only once, so the result is an O(n) running time.

Note that in CompareToMax, the best-case, average-case, and worst-case running times are identical. Regardless of the arrangement of the values in the array, the algorithm is always O(n).

Ask the interviewer which scenario they’re most interested in. Sometimes there are clues to this in the problem itself. Some sorting algorithms with terrible worst cases for unsorted data may nonetheless be well suited for a problem if the input is already sorted.

Optimizations and Big-O Analysis

Algorithm optimizations do not always yield the expected changes in their overall running times. Consider the following optimization to CompareToAll: Instead of comparing each number to every other number, compare each number only with the numbers that follow it in the array. In essence, every number before the current number has already been compared to the current number. Thus, the algorithm is still correct if you compare only to numbers occurring after the current number.

What’s the worst-case running time for this implementation? The first number is compared to n numbers, the second number to n – 1 numbers, the third number to n – 2, resulting in a number of comparisons equal to n + (n – 1) + (n – 2) + (n – 3) + + 1. This is a very common result, a mathematical series whose answer is expressed as n2/2 + n/2. n2 is the highest-order term, so this version of the algorithm still has an O(n2) running time in the worst case. For large input values, the change you make to the algorithm has no real effect on its running time.

Summary

Just remember following steps for tetha or big O analysys: figure out what the input is and what n represents, after that express the number of operations the algorithm performs in terms of n, afterwards eliminate all but the highest-order terms and  later on remove all constant factors, because CPU and memory speed difference is not important in algorithm speed and velocity analyze. How are you calculating speed of your algorithms? Do you think that your code can be done faster?

Big O notation

Best articles: .NET Web frameworks, IOC pattern, IT articles, Visual Studio 2010 Tips.

DotNetShoutout
DotnetKicks

This blog content RSS subscrption.

Add comment

  Country flag

biuquote
  • Comment
  • Preview
Loading

Author - Agafonov Viacheslav

Agafonov Slava site

Hello world! My name is Agafonov Viacheslav. I'm a software engineer at Microsoft located in Bellevue next to Redmond campus and Seattle downtown, state Washington. I was born in Ukraine. My passion for programming is in my ability to create tools that make people's lives easier.

Vyacheslav Agafonov profileAgafonov blog Agafonov Slava on Twitter Counter.com

Month List

Disclaimer

The opinions and information that expressed here do not represent my employer's view in any way. Information in this blog is my own opinion and does not reflect on employer. Content on this site is licensed under a Creative Commons 3.0 license.

Advertise with me!