Notes on Analysis of Algorithms

For:  Students of Algorithms and Data Structures,
      Numerical Analysis, and
      Connoisseurs of Efficient Scientific Computing
ANALYSIS OF ALGORITHMS is the branch of theoretical computer science concerned with characterizing the computer resources required by an algorithm. The resources we are most often concerned with are time and memory requirements.

Abu Abd-Allah ibn Musa al'Khwarizmi, 9th century mathematician whose name gives us the word algorithm and whose writings give us the word algebra.

We wish to characterize these resource requirements as functions of one or more of the "essential parameters" (from the standpoint of the resource to be measured) describing the data operated on by the algorithm. Such parameters might include the size of an array (for example, finding the minimum entry takes time depending primarily on the number of entries in a list), the size of a number (for example, the time required to raise a real number to an integral power depends primarily on the absolute value of the exponent), and the number of processors (if a parallel computer is used).

The interesting questions in analysis of algorithms usually are concerned with values of the "essential parameters" that tend to make an algorithm run slowly -- for example, n tending to infinity, if n is the size of a list to be sorted. If n is small, even an inefficient (correct) algorithm may yield the desired result in acceptable time. However, for large n, there may be, for example, an enormous performance difference between the "Quick Sort" method and the equally correct but slower "Bubble Sort" method. The problem, roughly speaking, is to describe the running time of the algorithm as being approximately proportional to some simple function of the "essential parameters" -- for example, sorting a list of size n on a serial computer takes, on average, time proportional to

The mathematical tools needed for the analysis of running times typically include mathematical induction and related analysis of recursive relations, and tools from advanced calculus such as the relation between the definite integral and area, L'Hopital's Rule, summation of infinite series, and other limit theorems "at infinity."

Omega, Theta, and O are used to mean, roughly, "order at least," "order exactly," and "order at most," respectively. Specifically, let f and g be non-negative functions defined on the non-negative integers. We say

Technically, Omega, Theta, and O are set-valued functions. It would be preferable to speak of f as a member of Omega(g), Theta(g), or O(g). However, the notation introduced above is by now well-established in the literature of the analysis of algorithms.

In using order notation, we generally want the argument of Omega, Theta, or O to be as simple as possible. For example, while both of the following statements are correct, the second is preferred:

The reader should be attempt to prove the following:

Proposition 1: Suppose f(n) and g(n) are positive-valued functions for n > 0. Then

Proposition 2: Let f and g be positive functions defined for n > 0 such that

the limit as n goes to infinity of f(n)/g(n)
  equals a for some real number a > 0.

Note that the converse statements in Proposition 2 are, in general, false. For example, 2 + sin n = Theta(1), although limit as n goes to infinity of (2 + sin n)/1 is undefined.

In analyzing the running time of an algorithm, we generally assume that each of the following operations may be performed in Theta(1) time:

Strictly speaking, these are not all realistic assumptions. For example, one might argue that arithmetic operations take time proportional to the number of bits used for the operands, so that, e.g., addition of integers of small absolute value is faster than addition of integers of huge absolute value. In practice, however, the above are fairly realistic, as the operations cited are both fundamental and fast, and we are usually more concerned with how the running time of an algorithm depends on the quantity of data being processed than in how it depends on the running times of elementary operations.

When it is reasonable to assume the arguments of the functions are suitably restricted, evaluation of mathematical functions such as logarithms and exponentials, trigonometric and inverse-trigonometric functions, etc., are regarded as having Theta(1) running times. When such functions are considered to have unbounded arguments, their running times typically depend in non-constant fashion on their arguments. E.g., consider two different algorithms for computing the n-th power of a real number: one, based on repetitive multiplication by the base of the exponentiation, requires Theta(n) time (and Theta(1) memory); another, based on repetitive squaring, requires Theta(log n) time (and Theta(log n) memory).

Since a program consists of a series of instructions, we need the following rules to analyze the running time of a piece of code:

  1. If Instruction A requires Theta(f(n)) time and Instruction B requires Theta(g(n)) time, then Instruction A followed by Instruction B requires Theta(f(n)+g(n)) time.
    Often, the above simplifies nicely, since
    Theta(f(n)+g(n)) = Theta("long-term max"{f(n), g(n)}) (*)
    when the right-hand side of (*) makes sense. The reader is encouraged to prove each of the following without using (*):
  2. If the body of a loop requires Theta(f(n)) time and the loop body is performed Theta(g(n)) times, then completion of the loop (all its performances of the loop body) requires Theta(f(n)*g(n)) running time.
We may obtain similar rules for Omega and O notation.

REFERENCES

A.V. Aho, J.E. Hopcroft, and J.D. Ullman, The Design and Analysis of Computer Algorithms, Addison-Wesley, Reading, Mass., 1974

D.E. Knuth, The Art of Computer Programming I: Fundamental Algorithms, Addison-Wesley, Reading, Mass., 1968

R. Miller and L. Boxer, Algorithms Sequential and Parallel: A Unified Approach, 2nd ed., Charles River Media, Inc., Hingham, MA, 2005

Back to Boxer's Home Page