Find the kth smallest element in an array.
Input: nums = [3,2,1,5,6,4], k = 3
Output: 3
This question asks us to find the kth smallest element in an array. For example, given the array [3,2,1,5,6,4] and k = 3, we want to find the 3rd smallest number (1st = 1, 2nd = 2, 3rd = 3).
Let's start with a simpler value, k = 1, to build intuition for this question.
Input: nums = [3,2,1,5,6,4], k = 1
Output: 1
In this case, the kth smallest element is the 1st smallest element, also known as the minimum.
To find the minimum value in an array, we can sort the entire array and return the first element. However, sorting takes O(n log n) time.
def find_minimum(nums):
    nums.sort()
    return nums[0]
Is there a better way to find the minimum value in an array?
We are doing a lot of extra work when sorting the whole array.
One technique you might be familiar with is to traverse the entire array and use a variable to store the minimum value, updating this variable whenever you find a smaller value. The time complexity of that is O(n), which is better than O(n log n).
myList = [3, 2, 1, 5, 6, 4]
minimum = myList[0]  # start with the first element as the minimum
for num in myList:
    if num < minimum:
        minimum = num  # update the minimum if a smaller value is found
print(minimum)  # output the minimum value
Let's increase the value of k. When k equals 2, we are looking for the second smallest value. Building on the intuition from the previous section, we can find the second smallest value in an array by sorting the array and returning the element at index 1.
Input: nums = [3,2,1,5,6,4], k = 2
Output: 2
def find_second_smallest(nums):
    nums.sort()
    return nums[1]  # k = 2, so the answer is at index k - 1
Is it necessary to sort the array? Sorting takes O(n log n) time. In the previous section, we optimized the process using the minimum variable. Can we apply a similar optimization here?
We cannot store two items in a single variable. What do we do?
A list is a variable that holds multiple values, so we can use a list to store the two smallest items (or minimums).
We can store the two smallest items in a list and keep adding elements from the array. If an incoming element is smaller than an item in the list, we replace an item in the list with the incoming element.
We will try to maintain this condition:
The list always contains the k (2 in this case) smallest items from the array that has been traversed (looked at).
For the sake of naming, we will call this list a bucket: a bucket of minimums.
Let's examine some cases where we have an initial bucket and an incoming value.
The incoming item (value of 1) is smaller than all elements in the bucket.
bucket = [4,2]
number to add = 1
bucket = [2,1]
number 4 is kicked out
The incoming item (value of 3) is smaller than the maximum (value of 4) in the bucket, but not smaller than every item.
bucket = [2,4]
number to add = 3
bucket = [2,3]
number 4 is kicked out
The incoming item (value of 5) is smaller than none of the items in the bucket.
bucket = [2,4]
number to add = 5
bucket = [2,4]
No change...
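The three cases above can be sketched as a single update rule. This is a minimal illustration, not a definitive implementation: the function name `add_to_bucket` is hypothetical, and it assumes the bucket is already full (it holds exactly k items) and that the maximum is the item that gets kicked out, as in the examples.

```python
def add_to_bucket(bucket, incoming):
    """Return the bucket after offering it one incoming value.

    Assumes the bucket is non-empty and already holds k items.
    """
    largest = max(bucket)
    if incoming >= largest:
        # The incoming value is smaller than none of the items: no change.
        return list(bucket)
    updated = list(bucket)
    updated.remove(largest)   # kick out the current maximum
    updated.append(incoming)  # make room for the smaller value
    return updated


print(add_to_bucket([4, 2], 1))  # [2, 1]
print(add_to_bucket([2, 4], 3))  # [2, 3]
print(add_to_bucket([2, 4], 5))  # [2, 4]
```

The function returns a new list rather than modifying its argument, so each case can be tried against the same starting bucket.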
Let's see if the invariant holds for all values of k.
The list always contains the k smallest items from the array that has been traversed (looked at).
A casual proof, by loop invariant:
We need to demonstrate that at the end of each iteration, the bucket still holds the k smallest items from the traversed array.
Let's test what happens when the element at position k + 1 arrives. We use the following names:
e: element to add
r: random element in the bucket
m: maximum item in the bucket
Case 1: the element e is greater than all elements in the bucket.
bucket = [2,4]
number to add = 5
bucket = [2,4]
No change...
Case 2: the element e is less than some random item r in the bucket. If e is smaller than r, it will also be smaller than the maximum m (assuming that the maximum is not r). Suppose we kick out r to make room for e.
bucket = [4,2]
number to add (e) = 1
smaller than 2 (random), also smaller than 4 (maximum)
bucket = [4,1] # invariant doesn't hold
Replacing r with e leaves the maximum in the bucket. Why is that a problem? If r is smaller than the maximum, we have kicked out a smaller item that might have been one of the k smallest elements, so the invariant will not hold.
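To make the problem concrete, here is a small sketch that replays the example above with bucket [4, 2] and incoming value 1, once kicking out a non-maximum item and once kicking out the maximum. The helper name `replace_item` is hypothetical and exists only for this illustration.

```python
def replace_item(bucket, victim, incoming):
    """Return a copy of bucket with `victim` swapped out for `incoming`."""
    result = list(bucket)
    result[result.index(victim)] = incoming
    return result


bucket = [4, 2]
incoming = 1

# The k smallest items among everything seen so far (k = 2 here).
seen = bucket + [incoming]
expected = sorted(seen)[:len(bucket)]

evict_other = replace_item(bucket, 2, incoming)          # kick out a non-maximum item
evict_max = replace_item(bucket, max(bucket), incoming)  # kick out the maximum

print(evict_other, sorted(evict_other) == expected)  # [4, 1] False
print(evict_max, sorted(evict_max) == expected)      # [1, 2] True
```

Only the second choice leaves the bucket holding the two smallest values seen so far.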
So, can we remove the maximum every time?
If you have to remove an element to make room for a smaller element, it's best to remove the maximum. The maximum is at least as large as every other element in the bucket, and because we only make room when the incoming element is smaller than the maximum, it is also larger than the incoming element. The remaining k - 1 items plus the incoming element are therefore the k smallest items seen so far. By removing the maximum, we ensure that the size of the bucket remains k and that the invariant holds.
The list always contains the k smallest items from the array that has been traversed.
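Putting the pieces together, the whole approach might look like the sketch below. It is one possible way to write it, under a few assumptions that are not in the text above: the function name `kth_smallest` is illustrative, the bucket is first filled with the first k elements, and an out-of-range k raises a `ValueError`. Finding the maximum with `max` costs O(k) per step, so this version runs in O(n * k) time.

```python
def kth_smallest(nums, k):
    """Return the kth smallest value in nums using a bucket of k minimums."""
    if not 1 <= k <= len(nums):
        raise ValueError("k must be between 1 and len(nums)")

    bucket = []
    for num in nums:
        if len(bucket) < k:
            # The bucket is not full yet, so every value goes in.
            bucket.append(num)
        elif num < max(bucket):
            # Kick out the maximum to make room for the smaller value.
            bucket.remove(max(bucket))
            bucket.append(num)
        # Otherwise num is not among the k smallest seen so far: no change.

    # The bucket holds the k smallest values; the largest of them is the answer.
    return max(bucket)


print(kth_smallest([3, 2, 1, 5, 6, 4], 3))  # 3
print(kth_smallest([3, 2, 1, 5, 6, 4], 1))  # 1
```

Once the traversal ends, the invariant says the bucket holds the k smallest items of the whole array, which is why its maximum is the kth smallest element.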