Lecture 01 - Algorithm Analysis
1.0 - Course Overview
1.1 - Course Aims
- Expand your ability to analyse, critique, design and implement advanced data structures and algorithms
1.2 - Assumed Background
- Programming experience, including basic data structures and recursive procedures
- Familiarity with a programming language (Java is used for the assignments)
- Mathematical background
- Familiarity with proof by mathematical induction
- Knowledge of calculus, including differentiation, limits, L’Hôpital’s rule, and summations
1.3 - Motivation
- Practical basis for creating more efficient algorithms
- Theoretical basis for justifying your choice of algorithms
- Improve your problem-solving skills
- A prerequisite for looking for a job at Google, Oracle, etc.
2.0 - Recap of Algorithm Analysis
- Design and analyse efficient algorithms
- Analysis: bounded below ($\Omega$), bounded above ($O$), bounded tightly ($\Theta$)
- Constant terms can be disregarded for large enough inputs
- Summations useful for analysing the cost of an algorithm
2.1 - What is an Algorithm
- An algorithm is a well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output [CLRS, Chapter 1]
- Usually defined to solve a specific computational problem
- We would like our algorithms to be correct (with respect to their problem) as well as efficient.
- We would also like algorithms to be readable, so that we can (a) verify that they are correct and (b) maintain them.
2.2 - Sorting Problem
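More broadly, how do we define a problem precisely enough that we can design algorithms to solve it? For sorting:
Input
A sequence of $n$ numbers $\langle a_1, a_2, \ldots, a_n \rangle$
Output
A permutation $\langle a'_1, a'_2, \ldots, a'_n \rangle$ of the input sequence such that $a'_1 \le a'_2 \le \cdots \le a'_n$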
2.2.1 - Sorting Algorithms: Insertion Sort
There are many sorting algorithms; one of them is Insertion Sort.
    // Note that in this example, the algorithm assumes
    // that the starting index of an array is 1
    for j = 2 to A.length
        key = A[j]
        // Insert A[j] into the sorted sequence A[1..j-1]
        i = j - 1
        while i > 0 and A[i] > key
            A[i + 1] = A[i]
            i = i - 1
        A[i + 1] = key
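Run insertion sort on the array A = [5, 4, 2, 6, 3]
Initially, j = 2; that is, we begin the insertion sort on the second element.
Additionally at this point, key = A[2] = 4 and i = 1.
Since A[1] = 5 > key, we copy A[1] into A[2].
We decrement the value of i to 0, so the while loop stops and key is written to A[1], giving A = [4, 5, 2, 6, 3].
We now repeat this for the following elements (j = 3, 4, 5), inserting each one into the sorted section to its left.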
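The walkthrough above can be reproduced with a short Python sketch. It is only an illustration: Python lists are 0-indexed, so the loop bounds are shifted by one relative to the pseudocode, and the function name `insertion_sort_trace` is an assumption of this example, not something from the lecture.

```python
def insertion_sort_trace(values):
    """Sort a copy of `values`, recording the array after each outer iteration."""
    a = list(values)
    snapshots = []
    for j in range(1, len(a)):            # pseudocode: j = 2 .. A.length
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:      # pseudocode: i > 0 and A[i] > key
            a[i + 1] = a[i]               # shift the larger element right
            i -= 1
        a[i + 1] = key                    # drop key into the gap
        snapshots.append(list(a))
    return a, snapshots


sorted_a, snapshots = insertion_sort_trace([5, 4, 2, 6, 3])
for step in snapshots:
    print(step)
```

The four printed lines are `[4, 5, 2, 6, 3]`, `[2, 4, 5, 6, 3]`, `[2, 4, 5, 6, 3]` and `[2, 3, 4, 5, 6]`, one per value of j.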
How do we verify that this algorithm is correct?
🌱 We can use invariants to prove that an algorithm is correct: the invariant serves as the inductive hypothesis in an inductive argument over the loop iterations.
- An invariant for insertion sort could be
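- At the start of each iteration of the for loop, A[1..j-1] contains the original elements from A[1..j-1], but in sorted order
- This invariant is initially true: when j = 2, A[1..1] is a single element, which is inherently in sorted order
- The body of the loop preserves the loop invariant
- Therefore, using an inductive argument, we can verify that the algorithm is correct: when the loop terminates, j = A.length + 1, so the invariant says that A[1..A.length] holds the original elements in sorted order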
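As a complement to the inductive argument, the invariant can also be checked at run time. The sketch below is an assumption-laden illustration (0-based indices, a hypothetical function name `insertion_sort_checked`); passing assertions on a few inputs is evidence, not a proof.

```python
def insertion_sort_checked(values):
    """Insertion sort that asserts the loop invariant on every iteration."""
    a = list(values)
    original = list(values)
    for j in range(1, len(a)):
        # Invariant (0-based form): a[0..j-1] holds original[0..j-1], sorted.
        assert a[:j] == sorted(original[:j])
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    # On termination the invariant covers the whole array.
    assert a == sorted(original)
    return a


result = insertion_sort_checked([5, 4, 2, 6, 3])
```

If the loop body were wrong (for example, comparing with `<` instead of `>`), the assertion would fail on the first iteration that breaks the invariant, which pinpoints where the inductive step goes wrong.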
3.0 - Execution Time
🌱 What does execution time depend on?
- Execution time depends on a variety of factors, including:
- Input size, e.g. sorting 10 vs 1000 elements
- Input value, e.g. sorting an already-sorted list vs a reverse-sorted list
- Computer architecture, e.g. the basic instructions available, computation speed
- Generally, we want an upper bound on execution time
3.1 - Execution Time: Worst, Average and Best Case
Worst Case
Maximum execution time over all inputs of size $n$
Average Case
Average execution time over all inputs of size $n$, weighted by the probability of each input
Best Case
Minimum execution time over all inputs of size $n$
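For a small input size, the three cases can be computed by brute force. The sketch below makes several assumptions that are not in the notes: it uses insertion sort as the algorithm, counts evaluations of the array comparison as the cost, fixes n = 4, and weights every permutation equally for the average case.

```python
from itertools import permutations


def count_comparisons(values):
    """Number of times insertion sort evaluates the array test a[i] > key."""
    a = list(values)
    comparisons = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0:
            comparisons += 1              # one evaluation of a[i] > key
            if a[i] <= key:
                break
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return comparisons


n = 4
costs = [count_comparisons(p) for p in permutations(range(n))]
worst_case = max(costs)
best_case = min(costs)
average_case = sum(costs) / len(costs)    # uniform weighting is an assumption
print(worst_case, average_case, best_case)
```

With a different probability distribution over inputs, only `average_case` would change; the maximum and minimum do not depend on the weighting.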
3.2 - Running Time Analysis
🌱 The running time of an algorithm on a given input can be measured in terms of the number of primitive steps executed.
- Let's compute the running time of insertion sort
- Let $n$ be A.length
- Let's measure the time in terms of the number of array comparisons (the test A[i] > key in the while loop)
- This is a reasonable approximation, as the amount of work the algorithm does (i.e., its execution time) is proportional to the number of times this comparison is executed.
    // Note that in this example, the algorithm assumes
    // that the starting index of an array is 1
    for j = 2 to A.length
        key = A[j]
        // Insert A[j] into the sorted sequence A[1..j-1]
        i = j - 1
        while i > 0 and A[i] > key
            A[i + 1] = A[i]
            i = i - 1
        A[i + 1] = key
- We want to compute the worst-case running time of our algorithm
- When $j = 2$ we can perform the comparison a maximum of 1 time.
- When $j = 3$ we can perform the comparison a maximum of 2 times (if the two preceding elements are greater than it, as we have to shuffle both of these elements)
- ...
- For an arbitrary value of $j$, we can perform the comparison a maximum of $j - 1$ times.
- Therefore, the worst-case running time of the algorithm is:
- $\sum_{j=2}^{n} (j - 1) = \sum_{k=1}^{n-1} k = \frac{n(n-1)}{2}$
- We now want to compute the best-case running time of our algorithm
- The best case occurs when the array that we want to sort is already in sorted order.
- That is, for every iteration of the for loop, we only perform the check once.
- Therefore, the best case is:
- $\sum_{j=2}^{n} 1 = n - 1$
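The worst-case and best-case counts can be sanity-checked by instrumenting insertion sort. This is a minimal sketch under stated assumptions: 0-based indexing, a reverse-sorted array as the worst-case input, a sorted array as the best-case input, and the closed forms n(n-1)/2 and n-1 that follow from summing j-1 and 1 over j = 2..n. The helper name `count_comparisons` is hypothetical.

```python
def count_comparisons(values):
    """Number of times insertion sort evaluates the array test a[i] > key."""
    a = list(values)
    comparisons = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0:
            comparisons += 1              # one evaluation of a[i] > key
            if a[i] <= key:
                break
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return comparisons


results = {}
for n in (1, 2, 5, 10, 100):
    best = count_comparisons(range(n))            # already sorted
    worst = count_comparisons(range(n, 0, -1))    # reverse sorted
    results[n] = (best, worst)
    print(n, best, worst, n - 1, n * (n - 1) // 2)
```

Each printed row shows the measured best and worst counts next to the values predicted by the two formulas.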
3.3 - Running Time Analysis - Recursion
- More complicated to analyse than conditionals and loops
- Running time can often be described by a recurrence
- Overall running time $T(n)$ on a problem of size $n$ is described in terms of the running time(s) on smaller inputs and functions of $n$.
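To make the idea of a recurrence concrete, the sketch below evaluates one directly. The specific recurrence, T(1) = 0 and T(n) = 2T(n/2) + n for n a power of two, is an assumption chosen for illustration (it has the shape usually associated with merge sort); the notes do not state a particular recurrence here.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n):
    """Hypothetical recurrence: T(1) = 0, T(n) = 2*T(n/2) + n, n a power of two."""
    if n == 1:
        return 0
    return 2 * T(n // 2) + n              # two half-size subproblems plus n extra work


for k in range(1, 11):
    n = 2 ** k
    print(n, T(n), n * k)                 # n * k is n * lg(n) when n = 2**k
```

For these inputs the second and third columns agree, which suggests the closed form T(n) = n lg n for this particular recurrence.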
3.4 - Asymptotic Analysis: The General Idea
- Groups functions together based on their rate of growth:
Merge Sort: $\Theta(n \log n)$
Insertion Sort: $\Theta(n^2)$ (worst case)
- For large inputs, the difference in order outweighs constant factors:
- For example, merge sort is ultimately better for large enough $n$, no matter what the constant factors are.
- Ignores implementation dependent constants such as machine speed or the compiler.
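A small numeric experiment illustrates why the order of growth eventually dominates. The constants below (50 for merge sort, 2 for insertion sort) are made up purely for illustration, deliberately favouring insertion sort; the cost models `merge_cost` and `insertion_cost` are likewise assumptions, not measurements.

```python
from math import log2

C_MERGE = 50        # hypothetical constant factor for merge sort
C_INSERTION = 2     # hypothetical constant factor for insertion sort


def merge_cost(n):
    return C_MERGE * n * log2(n)


def insertion_cost(n):
    return C_INSERTION * n * n


def first_n_where_merge_wins(limit=10**6):
    """Smallest n >= 2 at which the modelled merge sort cost is lower."""
    for n in range(2, limit):
        if merge_cost(n) < insertion_cost(n):
            return n
    return None


crossover = first_n_where_merge_wins()
print(crossover)
```

Changing the constants moves the crossover point but does not remove it: making `C_MERGE` larger only delays the input size at which merge sort becomes cheaper in this model.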
4.0 - Asymptotic Notation
4.1 - Growth of Functions
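🌱 The following table shows the largest instance size $n$ that can be solved in a given time by an algorithm that takes $T(n)$ steps (the first row implies a machine executing $10^6$ steps per second)
$T(n)$ | 1 second | 1 day | 1 year |
---|---|---|---|
$n$ | 1,000,000 | 86,400,000,000 | 31,536,000,000,000 |
$n \lg n$ | 62,746 | 2,755,147,514 | 798,160,978,500 |
$n^2$ | 1,000 | 293,938 | 5,615,629 |
$n^3$ | 100 | 4,421 | 31,593 |
$2^n$ | 19 | 36 | 44 |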
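Entries of this kind can be recomputed by searching for the largest n whose cost fits within the step budget. The sketch below assumes a rate of 10^6 steps per second and cost functions n, n lg n, n^2, n^3 and 2^n; both are inferences from the table values, and the helper `largest_solvable` is a hypothetical name. The computed values are floors, so they can differ slightly from table entries that were rounded.

```python
from math import log2

STEPS_PER_SECOND = 10**6      # assumption inferred from the table's first row

COST_FUNCTIONS = {
    "n": lambda n: n,
    "n lg n": lambda n: n * log2(n),
    "n^2": lambda n: n * n,
    "n^3": lambda n: n ** 3,
    "2^n": lambda n: 2 ** n,
}


def largest_solvable(cost, budget):
    """Largest integer n >= 1 with cost(n) <= budget, for an increasing cost."""
    if cost(1) > budget:
        return 0
    hi = 1
    while cost(hi) <= budget:             # grow until the budget is exceeded
        hi *= 2
    lo = hi // 2                          # cost(lo) <= budget < cost(hi)
    while hi - lo > 1:                    # binary search between lo and hi
        mid = (lo + hi) // 2
        if cost(mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo


one_second = {
    name: largest_solvable(cost, STEPS_PER_SECOND)
    for name, cost in COST_FUNCTIONS.items()
}
one_day_exponential = largest_solvable(COST_FUNCTIONS["2^n"], 86_400 * STEPS_PER_SECOND)
print(one_second, one_day_exponential)
```

Going from one second to one day multiplies the budget by 86,400, yet the solvable size for 2^n grows only from 19 to 36, which is the point the table makes about exponential growth.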
4.2 - Limitations of Asymptotic Analysis
- Constant factors are relevant for:
- Small input sizes
- Algorithms of the same order
4.3 - Asymptotic Notation
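For functions $f(n)$ and $g(n)$:
- $f(n) \in O(g(n))$: $f(n)$ is asymptotically bounded above by $g(n)$ to within a constant factor.
- $f(n) \in \Omega(g(n))$: $f(n)$ is asymptotically bounded below by $g(n)$ to within a constant factor.
- $f(n) \in \Theta(g(n))$: $f(n)$ is asymptotically bounded above and below by $g(n)$ to within a constant factor.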
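"To within a constant factor" is usually made precise with witness constants: a multiplier c and a threshold n0 beyond which the bound holds. The sketch below checks one hypothetical example numerically, f(n) = 3n^2 + 10n against g(n) = n^2 with lower constant 3, upper constant 4 and n0 = 10. The functions and constants are assumptions chosen for illustration, and a finite check like this supports the claim without proving it.

```python
def f(n):
    return 3 * n * n + 10 * n


def g(n):
    return n * n


C_LOWER, C_UPPER, N0 = 3, 4, 10   # hypothetical witness constants


def sandwiched(n):
    """True when C_LOWER * g(n) <= f(n) <= C_UPPER * g(n)."""
    return C_LOWER * g(n) <= f(n) <= C_UPPER * g(n)


holds_from_n0 = all(sandwiched(n) for n in range(N0, 10_000))
fails_below_n0 = not sandwiched(N0 - 1)
print(holds_from_n0, fails_below_n0)
```

The lower bound alone corresponds to the bounded-below case and the upper bound alone to the bounded-above case; having both at once is the tight bound. Algebraically, the upper bound holds exactly when 10n <= n^2, that is, for n >= 10.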