YouTip LogoYouTip

Sorting Algorithm Derivative Problem

Sorting Algorithm Derivative Problems

\\n

When we have mastered basic sorting algorithms like bubble sort and quicksort, we find that sorting is not just about arranging data in order.

\\n

In practical programming and interviews, many problems are variants or derivative problems of sorting algorithms.

\\n

Understanding these derivative problems can help us grasp the essence of sorting algorithms more deeply and improve our ability to solve complex problems.

\\n

This article will explore several classic sorting algorithm derivative problems, analyzing their core ideas, solutions, and practical applications to help you build a more systematic algorithmic thinking.

\\n
\\n

What are Sorting Algorithm Derivative Problems?

\\n

Sorting algorithm derivative problems refer to those that do not directly require sorting, but whose core ideas, algorithmic processes, or data structures are highly related to classic sorting algorithms.

\\n

These problems typically have the following characteristics:

\\n
    \\n
  • Different Goals: The final objective is not to output a sorted sequence.
  • \\n
  • Shared Concepts: They use core ideas from sorting, such as comparison, swapping, and divide-and-conquer.
  • \\n
  • Related Complexity: Their time complexity is often in the same order of magnitude as sorting algorithms.
  • \\n
  • Widespread Application: They are frequently encountered in practical development.
  • \\n
\\n

Let's delve deeper through several specific problems.

\\n
\\n

Classic Derivative Problem 1: Top K Problem

\\n

Problem Definition

\\n

Find the largest (or smallest) K elements from n elements.

\\n

Practical Application Scenarios:

\\n
    \\n
  • E-commerce websites: Finding the top 10 best-selling products.
  • \\n
  • Social networks: Finding the 100 users with the most followers.
  • \\n
  • Data analysis: Finding the 5 most visited pages.
  • \\n
\\n

Solution Comparison

\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n
MethodTime ComplexitySpace ComplexityApplicable Scenario
Sort directly and take the first KO(n log n)O(1) or O(n)When K is close to n
Bubble sort K timesO(n Γ— K)O(1)When K is very small (K < 10)
Min-heap / Max-heapO(n log K)O(K)Most common, when K is much smaller than n
Quickselect algorithmO(n) average, O(nΒ²) worstO(1)When complete order is not needed, only Top K
\\n

Method 1: Quickselect Based on Quicksort Idea

\\n

The quickselect algorithm is a variant of quicksort. It does not need to completely sort the entire array, only to find the position of the K-th largest element.

\\n

Example

\\n
def quick_select(nums, k):\\n    """\\n    Use quickselect algorithm to find the k-th smallest element\\n    nums: Input Array\\n    k: The k-th smallest element to find (1-based)\\n    Return: The value of the k-th smallest element\\n    """\\n    def partition(left, right, pivot_idx):\\n        """Partition function, divides Array into two parts: less than and greater than pivot"""\\n        pivot = nums\\n        # Move pivot to the end\\n        nums, nums = nums, nums\\n        store_idx = left\\n        for i in range(left, right):\\n            if nums < pivot:\\n                nums, nums = nums, nums\\n                store_idx += 1\\n        # Move pivot to the correct position\\n        nums, nums = nums, nums\\n        return store_idx\\n\\n    def select(left, right, k_smallest):\\n        """Main selection function"""\\n        if left == right:\\n            return nums\\n        # Randomly select pivot\\n        pivot_idx = left + (right - left) // 2\\n        # Partition and get the final position of pivot\\n        pivot_idx = partition(left, right, pivot_idx)\\n        # Pivot is exactly the k-th smallest element\\n        if k_smallest == pivot_idx:\\n            return nums\\n        # The k-th smallest element is in the left half\\n        elif k_smallest = n:\\n        return sorted(nums, reverse=largest)\\n    if largest:\\n        # Find the k largest = Find the(n-k+1)smallest element\\n        kth_value = quick_select(nums, n - k + 1)\\n        # Collect all elements greater than or equal to kth_value\\n        result = [x for x in nums if x > kth_value]\\n        # If the count is insufficient, supplement with equal elements\\n        while len(result) < k:\\n            result.append(kth_value)\\n        return result\\n    else:\\n        # Find the k smallest = Find theksmallest element\\n        kth_value = quick_select(nums, k)\\n        result = [x for x in nums if x < kth_value]\\n        while len(result) < k:\\n            result.append(kth_value)\\n        return result\\n\\n# TestData\\ntest_data = [3, 2, 1, 5, 6, 4, 9, 8, 7, 0]\\nprint("Original Data:", test_data)\\nprint("The 3 largest elements:", find_top_k(test_data.copy(), 3, largest=True))\\nprint("The 3 smallest elements:", find_top_k(test_data.copy(), 3, largest=False))
\\n

Method 2: Solution Based on Heap Sort

\\n

The heap is a classic data structure for solving the Top K problem, especially suitable for handling data stream scenarios.

\\n

Example

\\n
import heapq\\n\\ndef find_top_k_with_heap(nums, k, largest=True):\\n    """\\n    Use heap to find the k largest elements\\n    nums: Input Array\\n    k: Number of elements to find\\n    largest: TrueFind the k largest,FalseFind the k smallest\\n    Return: List of the k largest elements\\n    """\\n    if largest:\\n        # Find the k largest:Using Min Heap\\n        heap = []\\n        for num in nums:\\n            if len(heap)  heap: # Current element is larger than heap top\\n                heapq.heapreplace(heap, num)\\n        return sorted(heap, reverse=True)\\n    else:\\n        # Find the k smallest:Using Max Heap (Implemented via Negation)\\n        heap = []\\n        for num in nums:\\n            if len(heap) < k:\\n                heapq.heappush(heap, -num) # Store negative numbers\\n            elif num < -heap: # Current element is smaller than heap top\\n                heapq.heapreplace(heap, -num)\\n        return sorted()\\n\\n# TestHeap method\\nprint("nUse Heap method:")\\nprint("The 3 largest elements:", find_top_k_with_heap(test_data, 3, largest=True))\\nprint("The 3 smallest elements:", find_top_k_with_heap(test_data, 3, largest=False))
\\n
\\n

Classic Derivative Problem 2: Inversion Counting

\\n

Problem Definition

\\n

In an array, if a preceding element is greater than a succeeding element, these two elements form an inversion. Calculate the total number of inversions in the array.

\\n

Practical Application Scenarios:

\\n
    \\n
  • Measuring the degree of order in data.
  • \\n
  • User preference analysis in recommendation systems.
  • \\n
  • Modification conflict detection in version control systems.
  • \\n
\\n

Solution: Algorithm Based on Merge Sort

\\n

The process of merge sort naturally allows for counting the number of inversions, which is a typical application of the divide-and-conquer idea.

\\n

Example

\\n
def count_inversions(nums):\\n    """\\n    Count the number of inversions in Array\\n    Use merge sort-based method\\n    """\\n    def merge_sort_count(left, right):\\n        """Merge sort and count inversions"""\\n        if left >= right:\\n            return 0, [nums] if left == right else []\\n        mid = (left + right) // 2\\n        # Recursively count inversions in left and right halves\\n        left_count, left_arr = merge_sort_count(left, mid)\\n        right_count, right_arr = merge_sort_count(mid + 1, right)\\n        # Count inversions crossing the midpoint\\n        cross_count = 0\\n        merged = []\\n        i = j = 0\\n        while i < len(left_arr) and j < len(right_arr):\\n            if left_arr  right_arr,Form an inversion pair\\n                merged.append(right_arr)\\n                cross_count += len(left_arr) - i # Key: all remaining elements on the left with right_arrForm an inversion pair\\n                j += 1\\n        # Add remaining elements\\n        merged.extend(left_arr[i:])\\n        merged.extend(right_arr[j:])\\n        total_count = left_count + right_count + cross_count\\n        return total_count, merged\\n\\n    if not nums:\\n        return 0\\n    count, _ = merge_sort_count(0, len(nums) - 1)\\n    return count\\n\\n# TestInversion count\\ntest_cases = [\\n    ([1, 2, 3, 4, 5], 0), # Completely sorted, inversions = 0\\n    ([5, 4, 3, 2, 1], 10), # Completely reversed, inversions = C(5,2)=10\\n    ([2, 4, 1, 3, 5], 3), # Example\\n    ([7, 5, 6, 4], 5), # Complex Example\\n]\\nprint("nInversion countTest:")\\nfor nums, expected in test_cases:\\n    result = count_inversions(nums.copy())\\n    print(f"Array {nums}: Computed Value={result}, Expected value={expected}, {'Correct' if result == expected else 'Errors'}")
\\n

The following diagram illustrates how inversions are counted during the merge sort process. The key point is: when an element from the right half is smaller than an element from the left half, all remaining elements in the left half form inversions with that right-half element.

\\n

Image 1

\\n
\\n

Classic Derivative Problem 3: K-th Largest/Smallest Element

\\n

This problem is a special case of the Top K problem, but with more diverse solutions.

\\n

Comparison of Multiple Solutions

\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n
MethodTime ComplexitySpace ComplexityCharacteristics
Sort and take directlyO(n log n)O(1) or O(n)Simplest and most intuitive
QuickselectO(n) averageO(1)Optimal average complexity
Heap methodO(n log k)O(k)Suitable for data streams
Counting sortO(n + m)O(m)Suitable when data range is small
\\n

Complete Implementation Example

\\n

Example

\\n
def find_kth_smallest(nums, k):\\n    """Find the k-th smallest element (1-based)"""\\n    # Method 1: Direct sorting\\n    def method_sort():\\n        return sorted(nums)\\n\\n    # Method 2: Quick Select(Implemented above)\\n    def method_quick_select():\\n        return quick_select(nums.copy(), k)\\n\\n    # Method 3: Heap method\\n    def method_heap():\\n        # Using Max Heap to Find the k-th Smallest\\n        heap = []\\n        for num in nums:\\n            if len(heap) < k:\\n                heapq.heappush(heap, -num) # Max heap implemented via negative numbers\\n            elif num = k:\\n                return i + min_val\\n        return None\\n\\n    # Verify all methods produce consistent results\\n    result1 = method_sort()\\n    result2 = method_quick_select()\\n    result3 = method_heap()\\n    result4 = method_counting()\\n    print(f"nLine{k}SmallElement Search Test (Array: {nums}οΌ‰:")\\n    print(f"Sorting method: {result1}")\\n    print(f"Quick Select: {result2}")\\n    print(f"Heap method: {result3}")\\n    print(f"Counting Method: {result4}")\\n    # Verify Consistency\\n    assert result1 == result2 == result3 == result4, "Inconsistent Results Across Different Methods!"\\n    return result1\\n\\n# TestData\\ntest_nums = [3, 2, 3, 1, 2, 4, 5, 5, 6]\\nfor k in [1, 3, 5, 7]:\\n    find_kth_smallest(test_nums.copy(), k)
\\n
\\n

Classic Derivative Problem 4: Interval Merging

\\n

Problem Definition

\\n

Given a set of intervals, merge all overlapping intervals.

\\n

Practical Application Scenarios:

\\n
    \\n
  • Merging time slots in calendar applications.
  • \\n
  • Task scheduling in project management.
  • \\n
  • Merging network IP address ranges.
  • \\n
\\n

Solution: Sorting + Linear Scan

\\n

Example

\\n
def merge_intervals(intervals):\\n    """\\n    Merge Overlapping Intervals\\n    intervals: List of Lists, Each Sublist Represents an Interval [start, end]\\n    Return: Merged Interval List\\n    """\\n    if not intervals:\\n        return []\\n    # Key Step 1: Sort by Interval Start Point\\n    intervals.sort(key=lambda x: x)\\n    merged = []\\n    # Key Step 2: Linear scan merge\\n    current_start, current_end = intervals\\n    for interval in intervals[1:]:\\n        start, end = interval\\n        if start <= current_end: # Has Overlap\\n            # Merge Intervals: Take the Maximum End Point\\n            current_end = max(current_end, end)\\n        else: # No Overlap\\n            # Save Current Interval\\n            merged.append([current_start, current_end])\\n            # Start a new interval\\n            current_start, current_end = start, end\\n    # Add the last interval\\n    merged.append([current_start, current_end])\\n    return merged\\n\\n# TestInterval Merging\\ntest_intervals = [\\n    [[1,3],[2,6],[8,10],[15,18]],\\n    [[1,4],[4,5]],\\n    [[1,4],[0,4]],\\n    [[1,4],[2,3]], # Fully Contained\\n    [[1,4],[0,2],[3,5]], # Complex Cases\\n]\\nprint("nInterval MergingTest:")\\nfor i, intervals in enumerate(test_intervals, 1):\\n    result = merge_intervals(intervals.copy())\\n    print(f"TestUse Case {i}:")\\n    print(f"  Input: {intervals}")\\n    print(f"  Output: {result}")\\n    print()
\\n

Algorithm Process Analysis

\\n

Image 2

\\n
\\n

Practice Exercises

\\n

Now, let's consolidate what we've learned through a few exercises:

\\n

Exercise 1: Sort Colors (Dutch National Flag Problem)

\\n

Given an array with n objects colored red, white, or blue, sort them in-place so that objects of the same color are adjacent, with the colors in the order red, white, and blue.

\\n

Example

\\n
def sort_colors(nums):\\n    """\\n    In-place Sort Colors Array\\n    0: Red, 1: White, 2: Blue\\n    Use three-pointer method (similar to Quick Sort's Partition idea)\\n    """\\n    # Initialize three pointers\\n    left = 0 # RedRight boundary of the region\\n    current = 0 # Current element being checked\\n    right = len(nums) - 1 # BlueLeft Boundary of the Region\\n    while current <= right:\\n        if nums == 0: # Red\\n            nums, nums = nums, nums\\n            left += 1\\n            current += 1\\n        elif nums == 1: # White\\n            current += 1\\n        else: # Blue\\n            nums, nums = nums, nums\\n            right -= 1\\n            # Note: Here, current is not incremented because the swapped element hasn't been checked yet.\\n    return nums\\n\\n# Test\\ncolors = [2, 0, 2, 1, 1, 0]\\nprint("Before color sorting:", colors)\\nprint("After color sorting:", sort_colors(colors.copy()))
\\n

Exercise 2: K-th Largest Element in an Array

\\n

Find the k-th largest element in an unsorted array.

\\n

Example

\\n
def find_kth_largest(nums, k):\\n    """\\n    Find the k-th largest element\\n    Use the Quickselect algorithm\\n    """\\n    # k-th largest = Line(n-k+1)Small\\n    k_smallest = len(nums) - k + 1\\n    def quick_select(arr, left, right, k_small):\\n        if left == right:\\n            return arr\\n        # Select pivot\\n        pivot_idx = (left + right) // 2\\n        pivot = arr\\n        # Partition\\n        arr, arr = arr, arr\\n        store_idx = left\\n        for i in range(left, right):\\n            if arr < pivot:\\n                arr, arr = arr, arr\\n                store_idx += 1\\n        arr, arr = arr, arr\\n        # Recursive selection\\n        if k_small == store_idx:\\n            return arr\\n        elif k_small < store_idx:\\n            return quick_select(arr, left, store_idx - 1, k_small)\\n        else:\\n            return quick_select(arr, store_idx + 1, right, k_small)\\n    return quick_select(nums.copy(), 0, len(nums) - 1, k_smallest - 1)\\n\\n# Test\\ntest_array = [3, 2, 1, 5, 6, 4]\\nk = 2\\nprint(f"nArray {test_array} In the k-th {k} The large element is: {find_kth_largest(test_array, k)}")
← Python3 Os OpenPython3 Os Minor β†’