Greedy Algorithms

Greedy algorithms make locally optimal choices at each step with the hope of finding a global optimum. This module covers the principles of greedy algorithms, common problems, and techniques for proving their correctness.

Introduction to Greedy Algorithms

Greedy algorithms are a class of algorithms that make locally optimal choices at each step with the hope of finding a global optimum. They are typically used for optimization problems where we need to maximize or minimize some value.

What is a Greedy Algorithm?

A greedy algorithm builds up a solution piece by piece, always choosing the next piece that offers the most immediate benefit. In other words, it makes a locally optimal choice at each step in the hope that these local choices will combine into a globally optimal solution.

When to Use Greedy Algorithms

Greedy algorithms are suitable when:

  • The problem has optimal substructure (an optimal solution to the problem contains optimal solutions to the sub-problems)
  • The problem has the greedy choice property (a globally optimal solution can be arrived at by making locally optimal choices)

Advantages and Limitations

Advantages:

  • Simple and easy to implement
  • Generally efficient in terms of time complexity
  • Works well for many practical problems

Limitations:

  • May not always yield the optimal solution
  • Requires careful proof of correctness
  • Not applicable to all optimization problems

Principles and Properties

Understanding the key principles and properties of greedy algorithms is essential for determining when they can be applied and how to prove their correctness.

1. Greedy Choice Property

A problem exhibits the greedy choice property if a globally optimal solution can be arrived at by making locally optimal (greedy) choices. In other words, when we make a choice that seems best at the moment, it leads to an optimal solution to the complete problem.

For example, in the coin change problem with US coin denominations (1, 5, 10, 25 cents), always choosing the largest coin that still fits yields the minimum number of coins. Note that this depends on the denominations: with coins {1, 3, 4} and amount 6, the greedy answer is 4 + 1 + 1 (three coins) while the optimum is 3 + 3 (two coins), so the greedy choice property must be verified for each problem.
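As a minimal sketch of that greedy choice (the class and method names are illustrative, and the denominations are assumed to be sorted in decreasing order and to form a canonical system such as US coins):

public class GreedyCoinChange {
    // Count coins greedily: take as many of the largest denomination as possible,
    // then move on to the next one. Denominations must be sorted in decreasing order.
    public static int minCoins(int[] denominations, int amount) {
        int count = 0;
        for (int coin : denominations) {
            count += amount / coin;   // take as many of this coin as fit
            amount %= coin;           // remaining amount still to be paid
        }
        return count;
    }

    public static void main(String[] args) {
        int[] usCoins = {25, 10, 5, 1};
        // 63 cents = 2 quarters + 1 dime + 3 pennies = 6 coins
        System.out.println(minCoins(usCoins, 63));
    }
}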

2. Optimal Substructure

A problem has optimal substructure if an optimal solution to the problem contains optimal solutions to the sub-problems. This property is shared with dynamic programming.

For example, in the shortest path problem, if the shortest path from A to C goes through B, then the path from A to B and the path from B to C must also be the shortest paths between those respective points.

3. Proving Correctness

Because a locally optimal choice does not automatically lead to a globally optimal solution, every greedy algorithm requires a proof of correctness. This is typically done using one of the following techniques:

  • Greedy stays ahead: Show that after each step, the greedy solution is at least as good as any other solution.
  • Exchange argument: Show that any non-greedy solution can be transformed into the greedy solution without worsening the result.
  • Mathematical induction: Prove that the greedy choice, combined with an optimal solution to the remaining subproblem, yields an optimal solution to the whole problem, and apply the argument inductively on the problem size.

4. Greedy vs. Dynamic Programming

Greedy Algorithms                          | Dynamic Programming
Make locally optimal choices               | Considers all possible choices and picks the best one
Generally more efficient                   | May be less efficient but more versatile
May not always find the optimal solution   | Always finds the optimal solution when applicable
Requires a separate proof of correctness   | Correctness follows from the principle of optimality

Classic Greedy Problems

Several classic problems can be efficiently solved using greedy algorithms. Understanding these problems helps in recognizing when a greedy approach might be applicable.

1. Activity Selection Problem

Given a set of activities with start and finish times, select the maximum number of activities that can be performed by a single person, assuming that a person can only work on a single activity at a time.

Greedy Approach:

Sort activities by finish time and select activities that don't overlap with the previously selected activity.

Why it works:

By always choosing the activity that finishes earliest, we maximize the time available for remaining activities.

2. Fractional Knapsack

Given weights and values of n items, put these items in a knapsack of capacity W to get the maximum total value. In the fractional knapsack problem, we can break items for maximizing the total value.

Greedy Approach:

Calculate the value-to-weight ratio for each item, sort items by this ratio, and take items with the highest ratio first. If an item can't be taken completely, take a fraction of it.

Why it works:

By always choosing items with the highest value-to-weight ratio, we maximize the value per unit weight in the knapsack.

3. Huffman Coding

A lossless data compression algorithm that assigns variable-length codes to input characters, with shorter codes assigned to more frequent characters.

Greedy Approach:

Build a binary tree by repeatedly combining the two lowest-frequency nodes into a new node whose frequency is their sum, until a single tree remains.

Why it works:

By combining the least frequent characters first, we ensure that they end up deepest in the tree and receive the longest codes, which minimizes the total weighted code length.

4. Minimum Spanning Tree (Prim's and Kruskal's)

Find a subset of edges in a connected, undirected graph that connects all vertices together with the minimum possible total edge weight.

Greedy Approaches:

  • Prim's Algorithm: Start with a single vertex and grow the tree by adding the cheapest edge that connects a vertex in the tree to a vertex outside the tree.
  • Kruskal's Algorithm: Sort all edges by weight and add them to the tree one by one, skipping edges that would create a cycle.

Why they work:

Both algorithms make locally optimal choices (selecting the cheapest valid edge) that lead to a globally optimal solution (the minimum spanning tree).
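As a minimal sketch of Kruskal's algorithm (the Edge class, the array-based union-find, and the example graph below are illustrative assumptions, not a fixed API):

import java.util.Arrays;

public class KruskalMST {
    static class Edge {
        int u, v, weight;
        Edge(int u, int v, int weight) { this.u = u; this.v = v; this.weight = weight; }
    }

    // Union-find "find" with path halving, used to detect cycles
    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    static int mstWeight(int vertices, Edge[] edges) {
        // Sort edges by weight: this is the greedy order
        Arrays.sort(edges, (a, b) -> Integer.compare(a.weight, b.weight));

        int[] parent = new int[vertices];
        for (int i = 0; i < vertices; i++) parent[i] = i;

        int total = 0, used = 0;
        for (Edge e : edges) {
            int ru = find(parent, e.u), rv = find(parent, e.v);
            if (ru != rv) {              // skip edges that would create a cycle
                parent[ru] = rv;         // union the two components
                total += e.weight;
                if (++used == vertices - 1) break;  // tree is complete
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Edge[] edges = {
            new Edge(0, 1, 10), new Edge(0, 2, 6),
            new Edge(0, 3, 5),  new Edge(1, 3, 15),
            new Edge(2, 3, 4)
        };
        System.out.println("MST weight: " + mstWeight(4, edges));  // prints 19
    }
}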

5. Dijkstra's Shortest Path Algorithm

Find the shortest path from a source vertex to all other vertices in a weighted graph with non-negative edge weights.

Greedy Approach:

Maintain a set of vertices whose shortest distance from the source has already been determined. At each step, select the unprocessed vertex with the minimum tentative distance from the source and relax (update) the distances of its adjacent vertices.

Why it works:

By always selecting the vertex with the minimum distance, we ensure that we find the shortest path to each vertex.
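A compact sketch of this idea using Java's PriorityQueue over an adjacency list (the graph representation and sample data are illustrative assumptions):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

public class DijkstraSketch {
    // adj.get(u) holds int[]{v, weight} pairs for each edge u -> v
    static int[] shortestPaths(List<List<int[]>> adj, int source) {
        int n = adj.size();
        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[source] = 0;

        // Min-heap of {vertex, tentative distance}, ordered by distance
        PriorityQueue<int[]> pq = new PriorityQueue<>((a, b) -> Integer.compare(a[1], b[1]));
        pq.offer(new int[]{source, 0});

        while (!pq.isEmpty()) {
            int[] top = pq.poll();
            int u = top[0], d = top[1];
            if (d > dist[u]) continue;          // stale entry; u was already settled
            for (int[] edge : adj.get(u)) {
                int v = edge[0], w = edge[1];
                if (dist[u] + w < dist[v]) {    // relax the edge u -> v
                    dist[v] = dist[u] + w;
                    pq.offer(new int[]{v, dist[v]});
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        // Example graph: 0 -> 1 (4), 0 -> 2 (1), 2 -> 1 (2), 1 -> 3 (5)
        List<List<int[]>> adj = new ArrayList<>();
        for (int i = 0; i < 4; i++) adj.add(new ArrayList<>());
        adj.get(0).add(new int[]{1, 4});
        adj.get(0).add(new int[]{2, 1});
        adj.get(2).add(new int[]{1, 2});
        adj.get(1).add(new int[]{3, 5});
        System.out.println(Arrays.toString(shortestPaths(adj, 0)));  // prints [0, 3, 1, 8]
    }
}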

Implementation Examples

Let's look at some code examples of greedy algorithms to better understand their implementation.

Activity Selection Problem

public class ActivitySelection {
    // Function to select maximum activities
    public static int maxActivities(int[] start, int[] finish) {
        int n = start.length;
        
        // Sort activities by finish time
        // (Assuming activities are already sorted by finish time for simplicity)
        
        int count = 1;  // First activity is always selected
        int lastFinish = finish[0];
        
        // Consider rest of the activities
        for (int i = 1; i < n; i++) {
            // If this activity has start time greater than or equal
            // to the finish time of previously selected activity,
            // then select it
            if (start[i] >= lastFinish) {
                count++;
                lastFinish = finish[i];
            }
        }
        
        return count;
    }
    
    public static void main(String[] args) {
        int[] start = {1, 3, 0, 5, 8, 5};
        int[] finish = {2, 4, 6, 7, 9, 9};
        
        System.out.println("Maximum number of activities: " + maxActivities(start, finish));
    }
}
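With the sample input above (already sorted by finish time), the activities finishing at times 2, 4, 7, and 9 are selected, so the program prints Maximum number of activities: 4.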

Fractional Knapsack

import java.util.Arrays;

public class FractionalKnapsack {
    // Item class
    static class Item {
        int value;
        int weight;
        
        public Item(int value, int weight) {
            this.value = value;
            this.weight = weight;
        }
        
        // Value per unit weight
        public double ratio() {
            return (double) value / weight;
        }
    }
    
    // Function to get maximum value
    public static double getMaxValue(Item[] items, int capacity) {
        // Sort items by value-to-weight ratio in non-increasing order
        Arrays.sort(items, (a, b) -> Double.compare(b.ratio(), a.ratio()));
        
        double totalValue = 0;
        int currentWeight = 0;
        
        for (Item item : items) {
            // If adding the complete item doesn't exceed capacity
            if (currentWeight + item.weight <= capacity) {
                currentWeight += item.weight;
                totalValue += item.value;
            } else {
                // Add a fraction of the item
                int remainingCapacity = capacity - currentWeight;
                totalValue += item.value * ((double) remainingCapacity / item.weight);
                break;
            }
        }
        
        return totalValue;
    }
    
    public static void main(String[] args) {
        Item[] items = {
            new Item(60, 10),
            new Item(100, 20),
            new Item(120, 30)
        };
        int capacity = 50;
        
        System.out.println("Maximum value: " + getMaxValue(items, capacity));
    }
}
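For these items (value-to-weight ratios 6, 5, and 4) and capacity 50, the first two items are taken whole and two thirds of the third is added, so the program prints Maximum value: 240.0.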

Huffman Coding

import heapq

class Node:
    def __init__(self, freq, symbol, left=None, right=None):
        self.freq = freq
        self.symbol = symbol
        self.left = left
        self.right = right
        self.huff = ''
        
    def __lt__(self, nxt):
        return self.freq < nxt.freq
        
def print_nodes(node, val=''):
    # Huffman code for current node
    new_val = val + str(node.huff)
    
    # If node is not an edge node, traverse inside it
    if node.left:
        print_nodes(node.left, new_val)
    if node.right:
        print_nodes(node.right, new_val)
        
    # If node is edge node, print its huffman code
    if not node.left and not node.right:
        print(f"{node.symbol} -> {new_val}")

def huffman_coding(chars, freq):
    nodes = []
    
    # Create a leaf node for each character and add it to the priority queue
    for i in range(len(chars)):
        heapq.heappush(nodes, Node(freq[i], chars[i]))
    
    # Build Huffman Tree
    while len(nodes) > 1:
        # Remove two nodes of highest priority (lowest frequency)
        left = heapq.heappop(nodes)
        right = heapq.heappop(nodes)
        
        # Assign directional value to these nodes
        left.huff = 0
        right.huff = 1
        
        # Create a new internal node with these two nodes as children
        # and with frequency equal to the sum of the two nodes' frequencies
        new_node = Node(left.freq + right.freq, left.symbol + right.symbol, left, right)
        
        heapq.heappush(nodes, new_node)
    
    # Print Huffman codes
    print_nodes(nodes[0])

# Example usage
chars = ['a', 'b', 'c', 'd', 'e', 'f']
freq = [5, 9, 12, 13, 16, 45]
huffman_coding(chars, freq)
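For this frequency table the program prints f -> 0, c -> 100, d -> 101, a -> 1100, b -> 1101, e -> 111; the most frequent character (f) receives the shortest code.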

Practice Problems

Apply your knowledge of greedy algorithms by solving these carefully selected problems:

1. Coin Change (Greedy Approach)

Given a set of coin denominations and a target amount, find the minimum number of coins needed to make up that amount.

Difficulty: Easy · Topics: Greedy, Array

2. Job Sequencing with Deadlines

Given a set of jobs with deadlines and profits, find the maximum profit that can be earned by executing jobs within their deadlines.

Difficulty: Medium · Topics: Greedy, Sorting

3. Minimum Platforms

Given arrival and departure times of all trains that reach a railway station, find the minimum number of platforms required for the railway station so that no train waits.

Difficulty: Medium · Topics: Greedy, Sorting

4. Gas Station

There are N gas stations along a circular route, where the amount of gas at station i is gas[i]. You have a car with an unlimited gas tank, and it costs cost[i] units of gas to travel from station i to the next station (i + 1). Return the index of a starting station from which you can travel around the circuit once; if no such station exists, return -1.

Difficulty: Medium · Topics: Greedy, Array