So I was looking at the Codility sample problems and came across the following problem.
Given an array A of N integers we draw N discs in a 2D plane, such that i-th disc has center in (0,i) and a radius A[i]. We say that k-th disc and j-th disc intersect, if
and k-th and j-th discs have at least one common point.
Write a function
int number_of_disc_intersections(int[] A);
which given an array A describing N discs as explained above, returns the number of pairs of intersecting discs. For example, given N=6 and
A[0] = 1
A[1] = 5
A[2] = 2
A[3] = 1
A[4] = 4
A[5] = 0
there are 11 pairs of intersecting discs: 0th and 1st 0th and 2nd 0th and 4th 1st and 2nd 1st and 3rd 1st and 4th 1st and 5th 2nd and 3rd 2nd and 4th 3rd and 4th 4th and 5th
so the function should return 11. The function should return -1 if the number of intersecting pairs exceeds 10,000,000. The function may assume that N does not exceed 10,000,000.
The trick here is to do a solution that works in O(N log(N)) time and O(N) space.
def number_of_disc_intersections(a)
event = Hash.new{|h,k| h[k] = {:start => 0, :stop => 0}}
a.each_index {|i|
event[i - a[i]][:start] += 1
event[i + a[i]][:stop ] += 1
}
sorted_events = (event.sort_by {|index, value| index}).map! {|n| n[1]}
past_start = 0
intersect = 0
sorted_events.each {|e|
intersect += e[:start] * (e[:start]-1) / 2 +
e[:start] * past_start
past_start += e[:start]
past_start -= e[:stop]
}
return intersect
end
Which is a beautiful solution and runs in O(N log(N)) time and O(N) space, as we wanted. The only problem is it is very difficult to understand what it is doing and why. So in this blog post, I hope to explain the above code.
The algorithm is as follows. Imagine the initial array that we are passed in spans out forever on each side. Instead of the individual numbers we are passed in, we want to know how many discs start and stop at each position. Once we can say, for example, at index 4, 2 discs start and 1 disc ends - and do that for every index - we are able to calculate how many intersections there are.
You can see in the first 5 lines of the function it creates a new hash where the value is defaulted to another hash that set start and stop to zero. This allows us to increment using "+= 1" rather than having to check for nils. It then populates the event hash by iterating over the passed in array (a) and incrementing the start value at the lower bounds of the disc, and the stop at the upper bounds of the disc.
sorted_events = (event.sort_by {|index, value| index}).map! {|n| n[1]}
The 6th line of the function looks complicated but all it is doing is sorting by the key of the event hash (the index) and then discarding the keys and just using the values of the hash. But we can't use ".values" on the result of the sort_by, because it returns an array (hashes can't be sorted)
past_start = 0
intersect = 0
sorted_events.each {|e|
intersect += e[:start] * (e[:start]-1) / 2 +
e[:start] * past_start
past_start += e[:start]
past_start -= e[:stop]
}
Now for the really tricky part. We iterate over our sorted array of hashes which tell us how many discs start and stop at each position. There are two cases we need to worry about:
1) how many discs are already going when the new disc starts (e[:start] * past_start)
2) if multiple discs start at the same position (e[:start] * (e[:start]-1) / 2)
The first condition is fairly simple. If there are 3 discs that we're in the middle of and a new disc starts, it intersects with all 3 of them. If there are 3 discs that we're in the middle of, and 2 new discs start, each of them will intersect with the 3 existing ones, making 6 intersections.
The two lines at the bottom keep track of how many discs are currently open at each index.
The second condition is a little trickier. If multiple discs start at the same location we need to use some maths. It's the same problem you probably did in school: If 5 people all shake hands with each other, how many handshakes have there been?
To do this, because it is groups of 2, you have to pick 2. So on the first pick, there are 5 options, then once you've picked that first one, there are only 4 options. So 5 x 4 = 20. But, because when person A shakes person B's hand, it's the same as when person B shakes person A's hand, we have to divide our result by 2, so as not to double count.
This is the same with the intersections. If there are 5 new discs starting at a position, the number of intersections between them is equal to 5x4/2 = 10.
So that's it. The number of intersections at each index is n*(n-1)/2 - for if multiple discs start at that index - plus n*past_starts for when discs are already in progress when we start a new one. Add them all together and you have your solution.
I hope this has made sense to people. If not, please leave a comment and I'll try to address it asap.