Greedy Algorithm
\n\nGreedy algorithm (English: greedy algorithm), also known as the greedy algorithm, is an algorithmic strategy that, at each step, selects the best or optimal (i.e., most advantageous) choice under the current state, hoping to lead to a globally optimal result.
\n\nGreedy algorithm is like what we often say: focusing only on immediate benefits, without considering overall optimality; the choices made are merely locally optimal solutions in some sense.
\n\n\n\n\nImagine you have a pile of coins with different denominations (e.g., Β₯1, Β₯0.5, Β₯0.1), and you need to make exactly Β₯1.8 using the minimum number of coins.
\n \nA natural idea is: always take the coin with the largest denomination first.
\n \nFirst take Β₯1, leaving Β₯0.8; then take Β₯0.5, leaving Β₯0.3; finally take three Β₯0.1 coins.
\n \nThroughout this process, at every step we made the best choice available at that momentβthis is the core idea of the greedy algorithm.\n
\n
Greedy algorithm is like a short-sighted but decisive person:
\n\n- \n
- Each step selection: Make the seemingly optimal choice under the current state \n
- No global consideration: Do not consider how this choice affects future steps \n
- Expected outcome: Hope that local optimal choices will lead to a globally optimal solution \n
Real-world examples:
\n\n- \n
- Change-making: Always use the largest denomination coin first \n
- Hill climbing: Always move in the steepest direction \n
- Shopping: Always buy the currently most discounted item \n
- Route selection: Always choose the currently shortest path \n
Characteristics of Greedy Algorithms
\n\n- \n
- Greedy Choice Property: The optimal solution at each step can be derived from the previous locally optimal solutions. Later choices do not affect earlier ones. \n
- Optimal Substructure: The optimal solution to a problem contains optimal solutions to its subproblems. This property is also shared by dynamic programming, but greedy algorithms "greedily" select the optimal solution for each subproblem at every step. \n
| Feature | \nDescription | \nAdvantages | \nDisadvantages | \n
|---|---|---|---|
| Local Optimality | \nChoose the current optimal solution at each step | \nSimple and intuitive | \nMay not yield the global optimum | \n
| Irrevocability | \nOnce a choice is made, it cannot be undone | \nFast decision-making | \nEasily trapped in local optima | \n
| Efficiency | \nUsually low time complexity | \nHigh practicality | \nLimited applicability | \n
| Simple Implementation | \nClear and easy-to-understand logic | \nEasy to code | \nRequires correctness proof | \n
Difference from Dynamic Programming
\n\nThis is a common point of confusion for beginners. Letβs compare them using a table:
\n\n| Feature | \nGreedy Algorithm | \nDynamic Programming | \n
|---|---|---|
| Decision Basis | \nMakes the current optimal choice at each step, no backtracking. | \nEach step's choice depends on subproblem solutions; stores intermediate results and compares multiple options. | \n
| Guarantee of Optimal Solution | \nDoes not necessarily yield a globally optimal solution; requires the problem itself to possess the greedy choice property. | \nTypically yields a globally optimal solution. | \n
| Efficiency | \nHighly efficient, usually with low time complexity. | \nRelatively lower efficiency due to solving overlapping subproblems. | \n
| Applicable Scenarios | \nProblems with greedy choice property and optimal substructure, e.g., Huffman coding, Minimum Spanning Tree. | \nProblems with optimal substructure and overlapping subproblems, e.g., Knapsack Problem, Shortest Path. | \n
In simple terms, dynamic programming is βplan thoroughly before actingβ, sacrificing immediate gains for long-term benefit; whereas greedy algorithm is βenjoy the present momentβ, pursuing only immediate benefits.
\n\n\n\n
Basic Framework of Greedy Algorithm
\n\nThe greedy algorithm does not have a fixed code template, but its idea can be summarized into the following steps:
\n\n- \n
- Establish a mathematical model: Abstract the real-world problem into a mathematical one. \n
- Define the greedy strategy: Determine the criterion for selecting the βoptimal solutionβ at each step. This is the core and difficulty of the algorithm. \n
- Prove the greedy strategy: Prove that the defined greedy strategy leads to a globally optimal solution (optional but important; for algorithm problems, understanding correctness is usually required). \n
- Iterative solution: Starting from the initial state, make greedy choices step by step according to the strategy until the problem is solved. \n
A general pseudocode framework is as follows:
\n\nExample
\n\ndef greedy_algorithm(problem):\n\n# 1. Initialization: may include sorting, creating result containers, etc.\n\n solution =[]\n\n# 2. Iterate through all options, usually requiring sorting by a certain rule first\n\nfor item in sorted(problem.items, key=Greedy Strategy Rules):\n\n# 3. Greedy selection: if the current choice satisfies the condition, adopt it\n\nif is_feasible(solution, item):\n\n solution.append(item)\n\n# 4. Return the final constructed solution\n\nreturn solution\n\n\n\n\n
Classic Problems and Code Examples
\n\nLetβs experience the application of greedy algorithms through several classic problems. For each problem, we will: 1) analyze the problem and greedy strategy; 2) provide complete code; 3) explain line by line.
\n\nExample 1: Coin Change Problem
\n\nProblem Description: Assume the coin system is [100, 50, 20, 10, 5, 1] (unit: cents), and you need to make change for amount cents. Find the minimum number of coins required.
Greedy Strategy: At each step, select the largest coin whose denomination does not exceed the remaining amount. For the Chinese RMB system, this strategy is effective.
\n\nCode Implementation:
\n\nExample
\n\ndef coin_change_greedy(coins, amount):\n\n"""\n\n Use greedy algorithm to solve the coin change problem (for specific coin systems)\n\n :param coins: list, list of coin denominations, should be sorted in descending order\n\n :param amount: int, total amount to make change for\n\n :return: list, list of coin denominations used\n\n """\n\n coins.sort(reverse=True)# Ensure coins are sorted from largest to smallest denomination\n\n result =[]# Container for coins used in change\n\n remaining = amount # Remaining amount to make change for\n\nfor coin in coins:\n\n# When the current coin denomination is less than or equal to the remaining amount, use it as much as possible\n\nwhile remaining >= coin:\n\n result.append(coin)\n\n remaining -= coin # Subtract the current coin denomination from remaining amount\n\n# Check if exact change is made\n\nif remaining ==0:\n\nreturn result\n\nelse:\n\n# For cases where exact change cannot be made (though wonβt happen in this coin system)\n\nreturn None\n\n# Test data\n\n test_coins =[100,50,20,10,5,1]\n\n test_amount =186# Β₯1.86\n\nchange = coin_change_greedy(test_coins, test_amount)\n\nprint(f"Making change for {test_amount} cents requires coins: {change}")\n\nprint(f"Number of coins: {len(change)}")\n\n# Output: Making change for 186 cents requires coins: [100, 50, 20, 10, 5, 1]\n\n# Number of coins: 6\n\n\nCode Explanation:
\n\n- \n
- Line 8:
coins.sort(reverse=True)is the key preparation for the greedy strategy, ensuring we always consider larger denomination coins first. \n - Lines 12β15: The
whileloop ensures that as long as the current denomination can still be used, we keep using itβimplementing the βuse as many as possibleβ greedy choice. \n - Lines 18β22: Check the final result to ensure the amount matches exactly. \n
Important Note: The greedy algorithm for the coin change problem does not always yield the optimal solution**! For example, if the coin system is [25, 20, 10, 5, 1] and you need to make 40 cents, the greedy method chooses 25+10+5 (three coins), but the optimal solution is 20+20 (two coins). Only for specific currency systems (like RMB) is the greedy strategy valid.
\n\nExample 2: Interval Scheduling Problem
\n\nProblem Description: Given many meetings (each with a start and end time), how should you schedule them to maximize the number of meetings held in a single room?
\n\nGreedy Strategy: Always select the meeting that ends earliest**. This leaves more time for subsequent meetings.
\n\nCode Implementation:
\n\nExample
\n\ndef interval_scheduling(intervals):\n\n"""\n\n Solve the maximum non-overlapping intervals problem (interval scheduling)\n\n :param intervals: list of [start, end], list of intervals\n\n :return: list of [start, end], list of selected intervals\n\n """\n\n# 1. Sort by end time in ascending order\n\n intervals_sorted =sorted(intervals, key=lambda x: x)\n\nselected =[]# Store selected intervals\n\n last_end_time = -float('inf')# Initialize the end time of the last selected interval\n\nfor interval in intervals_sorted:\n\n start, end = interval\n\n# 2. Greedy selection: if the current interval's start time is not earlier than the last selected interval's end time, select it\n\nif start >= last_end_time:\n\n selected.append(interval)\n\n last_end_time = end # Update the latest end time\n\nreturn selected\n\n# Test data: each sublist represents a meeting [start time, end time]\n\n test_intervals =[\n\n[1,3],[2,4],[3,5],[4,6],[5,7],\n\n[1,2],[2,3],[1,4]\n\n]\n\nresult = interval_scheduling(test_intervals)\n\nprint("Maximum number of meetings that can be scheduled (after sorting by end time):")\n\nfor meeting in result:\n\nprint(f" Meeting [{meeting}, {meeting}]")\n\nprint(f"Total: {len(result)} meetings")\n\n# One possible output (order may vary after sorting, but count is optimal):\n\n# Meeting [1, 2]\n\n# Meeting [2, 3]\n\n# Meeting [3, 5] or [4, 6], etc.\n\n# Total: 4 meetings\n\n\nCode Explanation:
\n\n- \n
- Line 10:
sorted(intervals, key=lambda x: x)is the core to implementing the greedy strategyβsorting by end time. \n - Lines 16β19:
if start >= last_end_timechecks feasibility, ensuring the current meeting does not overlap with already selected meetings. \n - The algorithm has a time complexity of O(n log n), mainly due to sorting. \n
Example 3: Huffman Coding β Concept and Simplified Implementation
\n\nHuffman coding is a greedy algorithm used for lossless data compression. Its core idea is: assign shorter codes to characters that appear more frequently, and longer codes to those appearing less frequently.
\n\nGreedy Strategy**: Repeatedly merge the two nodes with the smallest frequencies to build a binary tree.
\n\nSince the full implementation is lengthy, here we show the core greedy merging process:
\n\nExample
\n\nimport heapq\n\nclass Node:\n\n"""Node class for Huffman tree"""\n\ndef __init__ (self, char, freq):\n\nself.char= char # Character (only leaf nodes have this)\n\nself.freq= freq # Frequency\n\nself.left=None\n\nself.right=None\n\n# Define comparison method to allow comparison in heap\n\ndef __lt__ (self, other):\n\nreturn self.freq< other.freq\n\ndef build_huffman_tree(char_freq):\n\n"""\n\n Build Huffman tree\n\n :param char_freq: dict, characters and their frequencies, e.g., {'a': 5, 'b': 9, 'c': 12, 'd': 13, 'e': 16, 'f': 45}\n\n :return: Node, root node of the Huffman tree\n\n """\n\n# 1. Initialization: create a leaf node for each character and put them into a min-heap (priority queue)\n\n heap =[Node(char, freq)for char, freq in char_freq.items()]\n\nheapq.heapify(heap)# Convert list into a min-heap\n\n# 2. Greedy merging: while more than one node remains in the heap\n\nwhile len(heap)>1:\n\n# Pop two nodes with smallest frequencies\n\n left =heapq.heappop(heap)\n\n right =heapq.heappop(heap)\n\n# Create a new internal node whose frequency is the sum of its children's frequencies\n\n merged = Node(None, left.freq + right.freq)\n\n merged.left= left\n\n merged.right= right\n\n# Push the new node back into the heap\n\nheapq.heappush(heap, merged)\n\n# The last remaining node in the heap is the root of the Huffman tree\n\nreturn heapif heap else None\n\n# Test data\n\n test_freq ={'a': 5,'b': 9,'c': 12,'d': 13,'e': 16,'f': 45}\n\n root = build_huffman_tree(test_freq)\n\nprint("Huffman tree built successfully! Root node frequency =", root.freq if root else 0)\n\n# Output: Huffman tree built successfully! Root node frequency = 100\n\n\nAlgorithm Visualization: The following figure shows the first few merging steps of building the Huffman tree using the above test data:
\n\n
Illustration: The algorithm always selects the two nodes with the smallest frequencies (initially leaf nodes, later including internal nodes) for merging, generating a new internal node. This process repeats until a complete binary tree is formed. The leaf nodes correspond to original characters, and the path from root to leaf (left branch = 0, right branch = 1) gives the Huffman code for that character.
\n\n\n\n
Correctness Proof of Greedy Algorithms
\n\nHow to determine whether a problem can be solved by a greedy algorithm? Typically, two points must be proven:
\n\n- \n
- Greedy Choice Property**: Prove via mathematical induction or contradiction: the optimal choice at the first step is included in some globally optimal solution. \n
- Optimal Substructure**: Prove that after making the first greedy choice, the original problem reduces to a smaller-scale subproblem of the same type. \n
Take the Interval Scheduling Problem as an example for the proof outline:
\n\n- \n
- Greedy Choice**: Let interval
ibe the one ending earliest among all intervals. There exists an optimal solution containing intervali.\n- \n
- Proof: Suppose some optimal solution does not contain
i. Letjbe the first interval in that optimal solution. Sinceiends earliest, replacingjwithicauses no conflict with other intervals and keeps the solution size unchanged; thus, a solution containingiis also optimal. \n
\n - Proof: Suppose some optimal solution does not contain
- Optimal Substructure**: After selecting interval
i, the
YouTip