CS333 - Problem Solving and Object Oriented Programming in C++


 Simple Explanation

 Big O analysis

 Advanced Explanation

  Test your Understanding

  DESCRIPTION
  • What makes one program perform better than another? Obviously the speed of the computer matters (a 600 MHz Pentium vs. a 350 MHz Celeron).
  • However, there are certain programs which will not perform well on any processor, and there are other programs to do the same job that will run surprisingly well even on outdated hardware (like a 66 MHz 486 computer).
  • Understanding the intrinsic differences in performance which are a function of the program's design, and not of the underlying hardware on which it executes, is called algorithm analysis.
  • An algorithm is a particular method or strategy for solving a problem; the program is the expression of an algorithm in a particular computer language.
  • The performance of an algorithm is independent of the machine executing it. Rather it is a function of 1) the nature of the problem being solved, 2) the design of the algorithm and 3) the "size" of the problem.
  • The size of a problem is important in measuring performance. Some algorithms run fine for small problems but become unusable for large ones.
  • Size is typically measured in terms of the number of objects in the problem (number of elements in a list, an array, etc.)
  • An interesting question is, what will happen to performance if I double the size of the problem?
  • The complexity of an algorithm is traditionally measured with Big "O" (O for the order of complexity) notation.
  • Really good performance is the same regardless of size. This is called constant time performance, denoted by O(1) in Big "O" notation.
  • Linear performance increases as a linear function of the problem size. For instance, doubling the size of the problem doubles the amount of time taken to solve it. Linear performance is denoted by O(n), where "n" is the size of the problem.
  • Some algorithms (like bubblesort) take four times as long when the size of the problem doubles. These are of order O(n*n), which is called quadratic performance.
  • Some algorithms only need one more step (one more time around the loop) when the size of the problem doubles. These algorithms have logarithmic performance and are denoted by O(log(n)).
  • This method of measuring the performance of a single algorithm/program as the size of the problem increases has several advantages:
    • It is independent of the particular compiler and hardware used to implement the algorithm.
    • It does not require any other algorithm/program to make the assessment of performance.
  • The difference between constant, logarithmic, linear and quadratic performance is quite dramatic, as the following table illustrates (let "n" be the "size" of the problem and "t" be the unit of time to solve a problem of size "n"):
    Performance Class   Units of time as size doubles   Additional units needed   Comments
    Constant            1                               0                         time is independent of size (e.g. access to an array element)
    Logarithmic         log(2n) = log(n) + 1            1                         pretty good, doubling the size is manageable (e.g. search an ordered list)
    Linear              2n                              n                         twice the size, twice the work (e.g. search an unordered list)
    Quadratic           (2n)*(2n) = 4*n*n               3*n*n                     this could get ugly if the problem gets too big (e.g. simple sorting)
EXAMPLES 
// constant time algorithm - getting "i-th" element from an array
// same time regardless of size of array
	answer = a[ i ];
// linear time algorithm - finding an element in an unordered list (either linked list or array)
// we show the array algorithm here - problem: find the location of the element whose value is "value"
// in general, doubling the size of the list will double the time to find the element
// (assuming it has equal probability of being anywhere in the list)
	for(int i = 0; i < SIZE_ARRAY; i++)
		if(a[i] == value) {
			// found it !! - do something and break out of loop
			break;
		}

/* logarithmic algorithm: searching an ordered list. (for details see the binary search algorithm) You would be considered slightly eccentric if you searched a telephone book for a number by starting on page 1 and reading every name until you found the one you are looking for.
A better strategy (algorithm) is to turn to approximately where you think the name might be; if the page you pick has names after the one you are looking for, you look in the first part, otherwise you look in the second part. This is a modified form of the binary search algorithm.

A classic guessing game that also uses the binary search algorithm is number guessing. You think of a number between, let's say, 1 and 32. I will guess 16 and you will tell me if I am low or high. If you say too low, then I know that the numbers from 1 to 16 are out - in other words I can throw away half the problem. Then I will guess a number about halfway between 17 and 32 - say 25. If you say too high, then I know the answer is between 17 and 24. Now I only have 1/4 of the list to guess in. Every guess cuts the remaining list in half.

32 is 2**5 (2 raised to the 5th power). One guess leaves a list of 16 = 2**4 elements, two guesses leave a list of 8 = 2**3.
If I don't find it before then, I am guaranteed to find the element when I whittle the list down to one element (2**0 = 1). Since that last step isn't a guess, it takes at most 5 guesses.

Now double the range of numbers 1 to 64. How much harder is the problem?

Not much. In one guess I reduce the size of the list to 32 elements and I am back to the same size problem I had before.

64 = 2**6, so at most 6 guesses.

Why is it called logarithmic? Well, the logarithm to the base 2 of 32 is 5 and the logarithm to the base 2 of 64 is 6.

So if the size of the problem is "n", then log(n) is the number of steps.

To guess a number between 1 and 64,000 takes at most 16 guesses, since 2**16 = 65536 */

/* quadratic algorithm - simple sorts are the most common example but you can make a simple quadratic algorithm with nested loops as shown in this example */

for(int i = 0; i < n; i++)
	for(int j = 0; j < n; j++) {
		// statements here run n*n times, so they take 4 times as long when "n" is doubled
	}
TIPS
  • Big "O" analysis takes a big-picture viewpoint. It does not sweat small details, like whether you make a function call to do something or just program it inline. This is good, since it focuses on the intrinsic performance. Some algorithms are so bad that buying a faster machine or tweaking the program will not make the program acceptable.
  • Some problems are intrinsically hard and there are no known "good" algorithms for them. These problems can only be solved for small sizes, or by giving up the quality of the solution and approximating it.

 


Copyright Chris Wild 1999.
For problems or questions regarding this website contact Chris Wild (e-mail: cs333@cs.odu.edu).
Last updated: October 24, 1999.