Find the kth smallest element in an array.

Input: nums = [3,2,1,5,6,4], k = 3
Output: 3

Leetcode

Reframing 1.0

  1. This question asks us to find the kth smallest element in an array. For example, given the array [3,2,1,5,6,4] and k = 3, we want to find the 3rd smallest number (1st = 1, 2nd = 2, 3rd = 3).

  2. Let's consider a simpler value for k = 1, in order to build intuition for this question.

    Input: nums = [3,2,1,5,6,4], k = 1
    Output: 3
    
  3. In this case, the kth smallest element is equivalent to the 1st smallest element. Also known as the minimum.

  4. To find the minimum value in an array, we can sort the entire array and return the first element. However, this approach has a time complexity of O(nlog(n)) at best.

    def find_minimum(nums):
        nums.sort()
        return nums[0]
    
    
  5. Is there a better way to find the minimum value in an array?

    1. We are doing a-lot of extra work when sorting the whole array.

    2. One technique you might be familiar with is to traverse the entire array and use a variable to store the minimum value. Simply update this variable as you find smaller values in the array. The time complexity of that is O(N). Way better than O(nlog(n)).

      1. Imagine the minimum value as a bucket. We only remove from the bucket when we encounter a smaller number, updating the value in the bucket.
      myList = [3, 2, 1, 5, 6, 4]
      
      minimum = myList[0] # start with the first element as the minimum
      
      for num in myList:
          if num < minimum:
              minimum = num # update the minimum if a smaller value is found
      
      print(minimum) # output the minimum value
      
      

A step further

  1. Let's increase the value of k. When k equals 2, we are looking for the second smallest value. Building on the intuitions from previous section. To find the second smallest value in an array, you can sort the array and return the element at index 1.

    Input: nums = [3,2,1,5,6,4], k = 2
    Output: 2
    
    def find_minimum(nums):
        nums.sort()
        return nums[1] # k = 2
    
    
  2. Is it necessary to sort the array? Sorting takes NlogN time. In the previous section, we optimized the process using the minimum variable. Can we apply a similar optimization here?

  3. We cannot store two items in a single variable, What do we do?

    1. List is a variable with multiple values, we can use a list to store two smallest items(or minimums).

    2. We can store the two smallest items in a list and keep adding elements from the array. If incoming elements are smaller than those in the list, we can simply replace the items in the list with the incoming items.

    3. We will try to maintain this condition.

      The list always contains the k (2 in this case) smallest items from the array that has been traversed (looked at).

    4. For naming sake we will call this list a bucket—A bucket of minimums.

  4. Let's examine some cases where we have an initial bucket and an incoming value.

    1. The incoming item(value of 1) is smaller than all elements in the bucket.

      bucket = [4,2] 
      	number to add = 1
      bucket = [2,1]
      	number 4 is kicked out
      
    2. The incoming item(value of 1 ) is than the maximum(value of 4 ) in the bucket

      bucket = [2,4] 
      	number to add = 3
      bucket = [2,3]
      
      number 4 is kicked out
      
    3. The incoming item(value of 5 )Smaller than none of the items in the array

      bucket = [2,4] 
      	number to add = 5
      bucket = [2,4] 
      
      No change...
      
  5. Let's see if the invariance holds for all values of k.

    The list always contains the k smallest items from the array that has been traversed(looked at)

  6. A casual Proof: By loop invariance

    1. We need to demonstrate that, at every iteration of the loop, the bucket always contains the k smallest elements from the array that has been traversed. Thus, at the end of the array traversal, the bucket will contain the k smallest elements in the array.
      1. Initialization
        1. To begin, we create a bucket of size k and add the first k items from the array to it. Since both the array and the bucket are of size k, we can conclude that the bucket contains the k smallest items from the array that have been traversed.
      2. Maintenance
        1. We need to demonstrate that at the end of each iteration, the bucket still holds the k smallest items from the traversed array.

        2. A test of the element at k + 1.

          1. Players
            1. e element to add
            2. r random element in the bucket
            3. m maximum item in the bucket
        3. If element e is greater than all elements in the bucket.

          1. No change, invariance holds
          bucket = [2,4] 
          	number to add = 5
          bucket = [2,4] 
          
          No change...
          
        4. If the element e is less than any random item in the bucket r

          1. To maintain the size of the bucket at k and have the smallest k elements in the bucket, we need to remove y and make room for e.
          2. Proof by contradiction .
            1. If e is smaller than y, it will also be smaller than the maximum (assuming that the maximum is not y).

              bucket = [4,2] 
              	number to add(e) = 1
              smaller than 2(random) also smaller than 4(maximum)
              
              bucket = [4,1] # invarience doesn't hold
              
              
            2. Doing so will result in the maximum item still being in the bucket. Why is that a problem. If y is smaller than the maximum. We have kicked a smaller item out of the bucket which might be have been our k smallest elements, the variance will not hold.

        5. So can we remove the maximum every-time.

          1. If you have to remove an element to make room for a smaller element, it's best to remove the maximum. This is because the maximum is always the largest element in the bucket and will always be larger than all other elements in the bucket, including the incoming element. By removing the maximum, we ensure that the size of the bucket remains k and that the invariance holds.

            The list always contains the k smallest items from the array that has been traversed.

      3. Termination
        1. At the end of the array traversal, the bucket contains the k smallest elements in the array. Therefore, the bucket always contains the k smallest items from the array that has been traversed. The invariance holds for all iterations of the loop.