In this problem, we are given an array of integers but this array has duplicate integers too.
Also, one condition is always applied that if my input array is of size n
, then, the integers in the array can range between 0 to n-1
. Here, we have an array with repeated values and our task is to find the values that are repeated more than once. Let's see the input and output of this problem:
Input: [2, 7, 1, 4, 1, 7, 8, 2, 8]
Output: 2, 7, 1, 8
The Brute Force or naive approach for this algorithm is, in every iteration we can compare a single element with the rest of the elements and find out the repeated numbers. This method will take O(1) constant space but O(n2) time complexity.
Another approach is we can build a Hashmap where the keys are the elements of the array and the value will be the number of their occurrences. For this approach, the time complexity will be O(n) but the Hashmap will require extra space O(n). We want a solution with time complexity of O(n) and O(1) space.
Let me give you a little hint. We are sure with one thing that if the input array size is n then the elements in the array will be in the range 0 to n-1. So, we can use the input array as a Hashmap itself. Think about this. Try finding a solution using this hint and then read ahead.
Let's see how we are going to achieve this. Consider the input array is arr = [2, 3, 2, 3, 1]
. For each element, we need to store the number of occurrences of that number at its index itself. What does this mean?
Example: The first element in the array is 2
, so, its number of occurrences should be stored at the position arr[2]
. Similarly, the next element is 3
, then the number of occurrences of element 3
should be stored at arr[3]
. But these positions are not empty, they already have some values. That is arr[2]
, arr[3]
already have some values stored. So, we have to come up with something, where we can store the number of occurrences and also know the original value at that position. That is at the position arr[2]
, we can store the number of occurrences of element 2
and also know the original value that is 2
.
This all sounds so difficult but is achieved with a simple formula. At each iteration, we are going to use this:
arr[arr[i] % n] += n
Let's see how this formula works for each iteration:
The input array is: [2, 3, 2, 3, 1] and size, n=5
For i = 0, arr[arr[i] % n] += n
arr[arr[0] % 5] += 5
arr[2%5] += 5
arr[2] += 5
arr[2] = arr[2] + 5
arr[2] = 7
So, the array is arr = [2, 3, 7, 3, 1]
For i = 1,
arr[arr[1] % 5] += 5
arr[3%5] += 5
arr[3] += 5
arr[3] = arr[3] + 5
arr[3] = 8
So, the array is arr = [2, 3, 7, 8, 1]
For i = 2,
arr[arr[2] % 5] += 5
arr[7%5] += 5
arr[2] += 5
arr[2] = arr[2] + 5
arr[2] = 12
So, the array is arr = [2, 3, 12, 8, 1]
For i = 3,
arr[arr[3] % 5] += 5
arr[8%5] += 5
arr[3] += 5
arr[3] = arr[3] + 5
arr[3] = 13
So, the array is arr = [2, 3, 12, 13, 1]
For i = 4,
arr[arr[4] % 5] += 5
arr[1%5] += 5
arr[1] += 5
arr[1] = arr[1] + 5
arr[2] = 7
So, the array is arr = [2, 8, 12, 13, 1]
Now, this is the modified array we have got after using the formula. Now, to find out the occurrence of each element, we use
Occurence of i = arr[i]/5
Occurence of 0 = arr[0]/5 = 2/5 = 0
Occurence of 1 = arr[1]/5 = 8/5 = 1
Occurence of 2 = arr[2]/5 = 12/5 = 2
Occurence of 3 = arr[3]/5 = 13/5 = 2
Occurence of 4 = arr[4]/5 = 1/5 = 0
That's it! We got the occurrences of each element from 0 to n-1 in O(n) time complexity and constant space. Let's see the implementation of the same in Python3:
def repeat_num(arr, n):
for i in range(0, n):
index = arr[i] % n
arr[index] += n
for i in range(0, n):
if (arr[i] / n) >= 2:
print(i, end=", ")
# Driver code
arr = [2, 7, 1, 4, 1, 7, 8, 2, 8]
n = len(arr)
repeat_num(arr, n)
1, 2, 7, 8,