BlogsDope image BlogsDope

Find the smallest positive missing number in unsorted array

Feb. 13, 2021 PYTHON ARRAY ALGORITHM DATA STRUCTURE 16532

Let's see how to find the smallest positive number missing from an unsorted array with some examples:

We have to find the smallest positive number, so we may ignore all the negative numbers and start searching from the smallest positive integer that is 1 and so on. 

Note that we are not considering the integer 0 while finding the smallest positive number. We will start searching from integer 1. 
If the input array is:

Input: arr1 = [2,9,-10,5,3,1,11,-1,7]

Output: 4

Here, we see 1 is present, 2 and 3 are present but, 4 is not present in the array hence, 4 is the smallest positive number missing in the array.

Input: arr2 = [-10,4,-4,2,15,13,6,8,-1]

Output: 1

Here, the first positive integer i.e., 1 is missing.

Input: arr3 = [1,1,0,18,1,-12,-6,18,21] 

Output: 2

In this example, the first positive integer i.e., 1 is present but the second i.e., 2 is not present.

Here, we are going to see a total of five approaches to solve this problem:

  1. Naive approach having O(n2) time complexity
  2. Hashing based approach (O(n) time complexity and O(n) extra space)
  3. Sorting based approach having O(nLogn) time complexity.
  4. Boolean Hashing with O(n) time complexity and O(n) extra space
  5. Inline Swapping approach with O(n) time complexity and O(1) extra space solution

Naive approach having O(n2) time complexity

This is a Brute Force solution to the problem. We will search all the positive integers starting from 1. We know the smallest missing integer in the array would be between value 1 and size(array)+1. For the worst case, we will have to search for n+1 numbers. For each integer value from 1 to size(array)+1, we will have to traverse the entire array. Let's see the implementation:
def findNum(arr, n):
    for i in range(1, n+1):  # search for 1 to n+1 elements
        flag = False

        for j in range(n-1):  # iterating through the array
            if arr[j] == i:
                flag = True
                break
        if flag == False:
            return i


arr = [-2, 2, 1, -3, 2, 10, 0]
length = len(arr)
res = findNum(arr, length)
print("The smallest positive missing number in the array is:",res)

The smallest positive missing number in the array is: 3

We are passing the input unsorted array to the findNum() function.

  1. In findNum(), we will run two for loops.
  2. The outer for loop will run from 1 to size(array)+1
  3. For each value of outer for loop, we will first set the flag to False
  4. The inner for loop will traverse through the entire array from 0 to n-1 for each value of the outer loop.
  5. If we find the current value of the outer for loop in the array, we set the flag to True and do not continue to traverse further by breaking the inner for loop.
  6. If the entire inner for loop gets traversed, the flag will never change to True indicating that the element was not found in the array. That is the flag will remain False.
  7. In such a case, the control goes to the if condition flag == False which becomes True and hence the missing element is returned.

Hashing based approach (O(n) time complexity and O(n) extra space)

One way to solve this algorithm is through Hashing. Hashing means using some function or algorithm to map object data to some representative integer value. We are going to use Dictionary in Python for this approach. Dictionaries are used to store data values in key:value pairs. It is a collection that is ordered, changeable, and does not allow duplicates.

Let's look at the approach of the algorithm:

Our unsorted input array can contain repeating, positive, and negative numbers including integer 0. We are going to declare an empty dictionary and store all the positive array elements of the input array as the key of the dictionary and the count that is the number of times that element repeats in the input array will be the value of that particular key element. When all the positive elements of the input array get stored in the dictionary, we then traverse through the keys of the dictionary from key integer 1 (as it is ordered) and as soon as we find any positive number missing, we return that number. If my input array is: arr1 = [1, 3, 2, 0, -1, 11,1], our dictionary will look like: 

Let's see the actual implementation of the algorithm:

def small_missNum(arr, size):
    my_dict = dict()

    for i in range(size):
        if arr[i] > 0:
            if arr[i] not in my_dict.keys():
                my_dict[arr[i]] = 0
            else:
                my_dict[arr[i]] += 1

    index_val = 1

    while True:
        if index_val not in my_dict.keys():
            return index_val

        index_val += 1


# driver code
if __name__ == "__main__":
    arr = [1, 3, 2, 0, -1, 11,1]
    size = len(arr)

    print("The smallest positive missing number in the array is :", small_missNum(arr, size))

The smallest positive missing number in the array is : 4

  1. Here, we are passing the input array arr and the size of the array as arguments to small_missNum(). 
  2. We are creating an empty dictionary my_dict.
  3. With the for loop we are traversing through the input array arr and storing the values in the my_dict 
  4. The if condition checks whether the element already exists in my_dict. If yes, increment the value by 1, if no, create a new key of that element and initialise its value to 0.
  5.  After the creation of Dictionary my_dict is done, we are creating a variable index_val and initialising it to 1. 
  6.  We are then traversing through the keys in the my_dict with the help of while loop.
  7.  If we got the missing value, return the index_value, if not, then keep incrementing the value of index_value by 1.

Sorting based approach having O(nLogn) time complexity

This is another approach for this algorithm using the sorting method. We can sort the given input array and then start looking for the smallest positive number missing. For sorting, we can either use Quicksort or Mergesort as they are better sorting methods considering the time complexity. They both have time complexity as O(nLogn). But, the worst time complexity of Quick sort is n2 whereas the worst time complexity of Merge sort remains O(nLogn). So we are going to see the algorithm using Merge sort. Let us learn in brief how merge sort works:

Merge Sort:

Merge sort is a sorting technique based on the Divide and Conquer technique with worst-case time complexity being Ο(nlogn). Merge sort first divides the array into equal halves and then combines them in a sorted manner.
Step 1 − if the input array has only one element that is the length of the array is 1, it's already sorted.
Step 2 − recursively keep dividing the array into two halves until it can no more be divided. 
Step 3 − compare and merge the smaller lists in sorted order.

Have a look at the image below:

Let's look at the implementation of the algorithm using the above-discussed approach:

def merge_sort(arr):                   # function to recursively divide the array
    if len(arr) <= 1:
        return

    mid = len(arr)//2

    left = arr[:mid]
    right = arr[mid:]

    merge_sort(left)
    merge_sort(right)

    merge_two_sorted_lists(left, right, arr)


def merge_two_sorted_lists(a, b, arr):       # function to compare and merge the elements in sorted order
    len_a = len(a)
    len_b = len(b)

    i = j = k = 0

    while i < len_a and j < len_b:
        if a[i] <= b[j]:
            arr[k] = a[i]
            i += 1
        else:
            arr[k] = b[j]
            j += 1
        k += 1

    while i < len_a:
        arr[k] = a[i]
        i += 1
        k += 1

    while j < len_b:
        arr[k] = b[j]
        j += 1
        k += 1


def findNum(arr):         # function to find the smallest missing element in the sorted array
    merge_sort(arr)

    index_val = 1
    while True:
        if index_val not in arr:
            return index_val

        index_val += 1


if __name__ == '__main__':
    arr=[-2, 2, 1, -3, 2, 10, 0]
    result = findNum(arr)
    print("The smallest positive missing number in the array is :",result)

The smallest positive missing number in the array is 3

In the above code, we are using two functions for the Merge sort which sorts the array inline without making any extra array, and one function to find the smallest missing element.

  1. We are passing the unsorted input array to the function findNum(). Within this function we call the merge_sort() function.
  2. mergesort() recursively keeps dividing the array into two halves until it can no more be divided.
  3.  Then it makes a call to the merge_two_sorted_lists() function which compares and merges the elements in sorted order.
  4. Now, we have our sorted array ready. 
  5. We are creating a variable index_val and initializing it to 1. 
  6.  We are then traversing the sorted array with the help of a while loop. 
  7.  If we got the missing value, return the index_value, if not, then keep incrementing the value of index_value by 1.

Boolean Hashing Approach having O(n) time complexity and O(n) extra space

In this approach, we are going to create an array other than the input array of the size equivalent to the maximum value --max(input_array)-- of the input array. We will initially set all the elements of this newly created array as False. Then we traverse through the input array and for each positive value of the input array, we will set the value of the new array as True at the index value equal to the traversed element. We, then start looking through the new array from left to right and return the index of the first False value. Note that at the end, to find the missing value, we will start traversing the new array from index 1. Look at the image below to understand better:

Now let's look at the code:

def findNum(arr):

    size_new_arr = max(arr)
    if size_new_arr < 1:        # a case to handle where all elements are negative
        return 1

    if len(arr) == 1:          # a case to handle where array has only one element
        if arr[0] == 1:
            return 2
        else:
            return 1

    new_arr = [False] * (size_new_arr+1)  # new array created with all values False
    for i in range(len(arr)):
        if arr[i] > 0:
            if new_arr[arr[i]] != True:
                new_arr[arr[i]] = True

    for i in range(1,len(new_arr)):
        if new_arr[i] == False:
            return i
    return i + 1


# Driver Code
arr = [1, 3, 2, 0, -1, 5,1]
print("The smallest positive missing number in the array is :",findNum(arr))

​The smallest positive missing number in the array is : 4

In the above approach, we are passing the input array arr to the findNum() function.

  1. As discussed, we will store the max(arr) that is in the above example its 11 to make the new array new_arr of size 11.
  2. We have two if conditions in the start before we start traversing the array to handle cases where:
    1. if all the input array elements are negative then simply return 1.
    2. if length of the input array is 1, if the element is 1 then return 2 and for all other cases, simply return 1 as it's the smallest positive number in all cases other than 1.
  3. Now, we have created the new_arr[] of size size_new_arr + 1 discussed in step 1. We are giving +1 as the index starts from 0.
  4. The approach is in the new_arr[],we will change the value from False to True where index value equal to the current traversed element of input array arr[]
  5. We traverse the input array from 0 to len(arr) with the for loop and if we come through a positive number in the array, then we first check if it's already been visited which happens when the array has repeating numbers. In that case it would already be set to True.
  6. That is why, we first check, new_arr[arr[i]] != True, if this condition is True then we set the value to True.
  7. The final new_arr[] would be [True, True, True, True, False, True]
  8. After the creation of new_arr[], we start traversing the new_arr[] from index 1 as we do not have to bother about value 0.
  9. We return the index as soon as we find the first False value in the new_arr[]. 

Approach with O(n) time complexity and O(1) extra space solution

Consider out input array is arr = [4,-1,2,1,-5] and somehow we arrange this array as [1, 2, -1, 4, -5]. Look carefully, we can say that all the positive numbers greater than 0 are kind of arranged in a sorted manner. The missing values are replaced by any number in the list which is less than 1 but the positive numbers are at their proper positions. Also, note that if my index value is i then my element at that index is i+1. For example, look at the image below:

If my index value is 0, the value at the index is 0+1=1

If my index value is 1, the value at the index is 1+1=2

If my index value is 2, the value at the index is 2+1=3, but this does not satisfy as seen in the image hence we return index+1 as the missing value that is 3.

Let's see how we can convert our array from [4,-1,2,1,-5] to [1, 2, -1, 4, -5]. We will traverse the given array from index value 0 to size of array-1. We can ignore the numbers that are less than 1 that is 0 and negative numbers. Also note that if my array size is n, then the valid numbers are from 1 to n. For example, if the size of input array is 5, the valid numbers can be any in range 1,2,3,4,5. So in our case, the size is 5 and if my array had numbers like 6,10,11, etc. we can safely ignore them. Let's traverse the input array and see the result after every traversal:

arr = [4,-1,2,1,-5]

now i=0 and a[i]!=a[i+1] is True, hence we swap(arr[i], arr[arr[i] - 1]) that is swap(4,1). The array becomes:

arr = [1,-1,2,4,-5]

Note that we have not done i+=1 yet, so i is still pointing at 0 index. Now i=0 and a[i]!=a[i+1] is False indicating the value 1 is at correct position. Now, do i+=1

arr = [1,-1,2,4,-5]

Now i =1, but value at index 1 is negative(-1) so as discussed earlier we ignore it. Do i+=1

arr = [1,-1,2,4,-5]

Now, i= 2, and a[i]!=a[i+1] is True, so we swap (arr[i], arr[arr[i] - 1]) that is swap(2,-1). The array becomes:

arr = [1,2,-1,4,-5]

i is still pointing at index 2 but value at index 2 is negative so we ignore and increment i+=1

arr = [1,2,-1,4,-5]

Now i=3 and a[i]!=a[i+1] is False indicating the value 4 is at correct position. Now, do i+=1

arr = [1,2,-1,4,-5]

Now i =4, but value at index 4 is negative so as discussed earlier we ignore it and we come out of the loop as the array reached its end.

So, we can see the array got arranged as we wanted and now we simply traverse and check where the condition a[i]!=a[i+1] becomes True, we return i+1.

In a nutshell we are checking three conditions: 

  1. arr[i] >= 1
  2. arr[i] <= length(array)
  3. arr[i] != arr[arr[i] - 1]) 

If all these three conditions are true, we swap the number, else, we just go to the next number in the array by incrementing i+=1. Check the implementation of the algorithm below:

def findNum(arr, n):
    if n == 0:
        return 1

    i = 0
    while i < n:
        if arr[i] >= 1 and arr[i] <= n and arr[i] != arr[arr[i] - 1]:

            temp = arr[i] - 1
            temp_2 = arr[i]
            arr[i] = arr[temp]
            arr[temp] = temp_2
        else:
            i += 1

    for i in range(n):
        if arr[i] != i + 1:
            return i + 1

    return n + 1    # if the array is [1,2,3,4] then return n+1 that is 5


arr = [4,-1,2,1,-5]
n = len(arr)

print(findNum(arr, n))

​3

In the above code, we are traversing the array with the while loop and if all three conditions of if statement is satisfied, the numbers get swapped, if not, else condition gets executed.

After we get the modified array, we simply run the for loop and return the i+1 where this condition arr[i] != i + 1 returns True.


Liked the post?
Rarely seen, always noticed.
Editor's Picks
0 COMMENT

Please login to view or add comment(s).