Greedy Algorithms
Greedy algorithms make locally optimal choices at each step with the hope of finding a global optimum. This module covers the principles of greedy algorithms, common problems, and techniques for proving their correctness.
Module Contents
Introduction to Greedy Algorithms
Greedy algorithms are a class of algorithms that make locally optimal choices at each step with the hope of finding a global optimum. They are typically used for optimization problems where we need to maximize or minimize some value.
What is a Greedy Algorithm?
A greedy algorithm builds up a solution piece by piece, always choosing the next piece that offers the most immediate benefit. This approach means making a locally optimal choice at each step with the hope that these local optimums will lead to a global optimum solution.
When to Use Greedy Algorithms
Greedy algorithms are suitable when:
- The problem has optimal substructure (an optimal solution to the problem contains optimal solutions to the sub-problems)
- The problem has the greedy choice property (a globally optimal solution can be arrived at by making locally optimal choices)
Advantages and Limitations
Advantages:
- Simple and easy to implement
- Generally efficient in terms of time complexity
- Works well for many practical problems
Limitations:
- May not always yield the optimal solution
- Requires careful proof of correctness
- Not applicable to all optimization problems
Principles and Properties
Understanding the key principles and properties of greedy algorithms is essential for determining when they can be applied and how to prove their correctness.
1. Greedy Choice Property
A problem exhibits the greedy choice property if a globally optimal solution can be arrived at by making locally optimal (greedy) choices. In other words, when we make a choice that seems best at the moment, it leads to an optimal solution to the complete problem.
For example, in the coin change problem with denominations like US coins (1, 5, 10, 25 cents), always choosing the largest possible coin leads to the minimum number of coins needed.
2. Optimal Substructure
A problem has optimal substructure if an optimal solution to the problem contains optimal solutions to the sub-problems. This property is shared with dynamic programming.
For example, in the shortest path problem, if the shortest path from A to C goes through B, then the path from A to B and the path from B to C must also be the shortest paths between those respective points.
3. Proving Correctness
Unlike many other algorithm design techniques, greedy algorithms require a proof of correctness to ensure they yield the optimal solution. This is typically done using:
- Greedy stays ahead: Show that after each step, the greedy solution is at least as good as any other solution.
- Exchange argument: Show that any non-greedy solution can be transformed into the greedy solution without worsening the result.
- Mathematical induction: Prove that if the greedy choice works for problems of size n, it also works for problems of size n+1.
4. Greedy vs. Dynamic Programming
Greedy Algorithms | Dynamic Programming |
---|---|
Make locally optimal choices | Consider all possible choices and pick the best one |
Generally more efficient | May be less efficient but more versatile |
May not always find the optimal solution | Always finds the optimal solution if applicable |
Requires proof of correctness | Correctness follows from the principle of optimality |
Classic Greedy Problems
Several classic problems can be efficiently solved using greedy algorithms. Understanding these problems helps in recognizing when a greedy approach might be applicable.
1. Activity Selection Problem
Given a set of activities with start and finish times, select the maximum number of activities that can be performed by a single person, assuming that a person can only work on a single activity at a time.
Greedy Approach:
Sort activities by finish time and select activities that don't overlap with the previously selected activity.
Why it works:
By always choosing the activity that finishes earliest, we maximize the time available for remaining activities.
2. Fractional Knapsack
Given weights and values of n items, put these items in a knapsack of capacity W to get the maximum total value. In the fractional knapsack problem, we can break items for maximizing the total value.
Greedy Approach:
Calculate the value-to-weight ratio for each item, sort items by this ratio, and take items with the highest ratio first. If an item can't be taken completely, take a fraction of it.
Why it works:
By always choosing items with the highest value-to-weight ratio, we maximize the value per unit weight in the knapsack.
3. Huffman Coding
A lossless data compression algorithm that assigns variable-length codes to input characters, with shorter codes assigned to more frequent characters.
Greedy Approach:
Build a binary tree where the two least frequent characters are combined at each step, and their frequencies are added.
Why it works:
By combining the least frequent characters first, we ensure that they are assigned longer codes, which minimizes the overall code length.
4. Minimum Spanning Tree (Prim's and Kruskal's)
Find a subset of edges in a connected, undirected graph that connects all vertices together with the minimum possible total edge weight.
Greedy Approaches:
- Prim's Algorithm: Start with a single vertex and grow the tree by adding the cheapest edge that connects a vertex in the tree to a vertex outside the tree.
- Kruskal's Algorithm: Sort all edges by weight and add them to the tree one by one, skipping edges that would create a cycle.
Why they work:
Both algorithms make locally optimal choices (selecting the cheapest valid edge) that lead to a globally optimal solution (the minimum spanning tree).
5. Dijkstra's Shortest Path Algorithm
Find the shortest path from a source vertex to all other vertices in a weighted graph with non-negative edge weights.
Greedy Approach:
Maintain a set of vertices whose shortest distance from the source is already determined. For each step, select the vertex with the minimum distance from the source and update the distances of its adjacent vertices.
Why it works:
By always selecting the vertex with the minimum distance, we ensure that we find the shortest path to each vertex.
Implementation Examples
Let's look at some code examples of greedy algorithms to better understand their implementation.
Activity Selection Problem
public class ActivitySelection {
// Function to select maximum activities
public static int maxActivities(int[] start, int[] finish) {
int n = start.length;
// Sort activities by finish time
// (Assuming activities are already sorted by finish time for simplicity)
int count = 1; // First activity is always selected
int lastFinish = finish[0];
// Consider rest of the activities
for (int i = 1; i < n; i++) {
// If this activity has start time greater than or equal
// to the finish time of previously selected activity,
// then select it
if (start[i] >= lastFinish) {
count++;
lastFinish = finish[i];
}
}
return count;
}
public static void main(String[] args) {
int[] start = {1, 3, 0, 5, 8, 5};
int[] finish = {2, 4, 6, 7, 9, 9};
System.out.println("Maximum number of activities: " + maxActivities(start, finish));
}
}
Fractional Knapsack
public class FractionalKnapsack {
// Item class
static class Item {
int value;
int weight;
public Item(int value, int weight) {
this.value = value;
this.weight = weight;
}
// Value per unit weight
public double ratio() {
return (double) value / weight;
}
}
// Function to get maximum value
public static double getMaxValue(Item[] items, int capacity) {
// Sort items by value-to-weight ratio in non-increasing order
Arrays.sort(items, (a, b) -> Double.compare(b.ratio(), a.ratio()));
double totalValue = 0;
int currentWeight = 0;
for (Item item : items) {
// If adding the complete item doesn't exceed capacity
if (currentWeight + item.weight <= capacity) {
currentWeight += item.weight;
totalValue += item.value;
} else {
// Add a fraction of the item
int remainingCapacity = capacity - currentWeight;
totalValue += item.value * ((double) remainingCapacity / item.weight);
break;
}
}
return totalValue;
}
public static void main(String[] args) {
Item[] items = {
new Item(60, 10),
new Item(100, 20),
new Item(120, 30)
};
int capacity = 50;
System.out.println("Maximum value: " + getMaxValue(items, capacity));
}
}
Huffman Coding
import heapq
class Node:
def __init__(self, freq, symbol, left=None, right=None):
self.freq = freq
self.symbol = symbol
self.left = left
self.right = right
self.huff = ''
def __lt__(self, nxt):
return self.freq < nxt.freq
def print_nodes(node, val=''):
# Huffman code for current node
new_val = val + str(node.huff)
# If node is not an edge node, traverse inside it
if node.left:
print_nodes(node.left, new_val)
if node.right:
print_nodes(node.right, new_val)
# If node is edge node, print its huffman code
if not node.left and not node.right:
print(f"{node.symbol} -> {new_val}")
def huffman_coding(chars, freq):
nodes = []
# Create a leaf node for each character and add it to the priority queue
for i in range(len(chars)):
heapq.heappush(nodes, Node(freq[i], chars[i]))
# Build Huffman Tree
while len(nodes) > 1:
# Remove two nodes of highest priority (lowest frequency)
left = heapq.heappop(nodes)
right = heapq.heappop(nodes)
# Assign directional value to these nodes
left.huff = 0
right.huff = 1
# Create a new internal node with these two nodes as children
# and with frequency equal to the sum of the two nodes' frequencies
new_node = Node(left.freq + right.freq, left.symbol + right.symbol, left, right)
heapq.heappush(nodes, new_node)
# Print Huffman codes
print_nodes(nodes[0])
# Example usage
chars = ['a', 'b', 'c', 'd', 'e', 'f']
freq = [5, 9, 12, 13, 16, 45]
huffman_coding(chars, freq)
Practice Problems
Apply your knowledge of greedy algorithms by solving these carefully selected problems:
1. Coin Change (Greedy Approach)
Given a set of coin denominations and a target amount, find the minimum number of coins needed to make up that amount.
2. Job Sequencing with Deadlines
Given a set of jobs with deadlines and profits, find the maximum profit that can be earned by executing jobs within their deadlines.
3. Minimum Platforms
Given arrival and departure times of all trains that reach a railway station, find the minimum number of platforms required for the railway station so that no train waits.
4. Gas Station
There are N gas stations along a circular route, where the amount of gas at station i is gas[i]. You have a car with an unlimited gas tank and it costs cost[i] of gas to travel from station i to its next station (i+1). Find the starting gas station's index if you can travel around the circuit once, otherwise return -1.