Facebook Interview Question
Software Engineer / DevelopersCountry: United States
Interview Type: In-Person
"standard top-n algorithm"? Well seems to be not so standard to Google, but anyway, just look for "C++ partial_sort implementation", it does the very same thing. And in more detail.
1) Add elements to the heap until it has size K
2) Then you always remove the min element which is log(K) in exchange for adding a LARGER element. This way you get all the (n-k) smallest elements out of there...
3) Return the head of the min-heap.
So you get O(log(K) * (n-k) + k) for a pairing/fibbo-heap. Not all heaps will give you this worst-case performance... So a binary heap would end up with O(log(K) * (n-k) + k*log(k)) = O(n*log(k))...
Alternatively you could use randomized Quickselect, which to me sounds nicer... and gives you a more stable complexity of O(n) (randomized).
This may help you..
int partition(int* a, int low, int high)
{
/*Start index for scanning the array*/
int left = low;
/*Select middle element of the array as pivot*/
int pivotIdx = low + (high - low)/2;
/*Swap pivot with right most element of the array*/
int pivot = a[pivotIdx];
a[pivotIdx] = a[high];
a[high] = pivot;
pivotIdx = high;
/*Pivot will be placed at this index such that all elements
to right of it are greater than pivot and all elements to
the left of it are smaller than pivot*/
int partitionIdx = low;
while (left < high)
{
/*Initially our partition index is set to leftmost element
index in the input array. We keep traversing towards right
of the input array and if we find an element lesser than
pivot, we swap it with element at partiotion index and
increment partition index else just keep moving right*/
if (a[left] < pivot)
{
int tmp = a[left];
a[left] = a[partitionIdx];
a[partitionIdx] = tmp;
++partitionIdx;
}
++left;
}
/*Place the pivot value at the partition index*/
a[pivotIdx] = a[partitionIdx];
a[partitionIdx] = pivot;
return partitionIdx;
}
/*K is the kth largest element to find. Initial call
to this function should be called with 0 and size-1 for
low and high respectively.*/
int quickSelect(int* a, int low, int high, int k)
{
assert(a);
assert(k <= high+1);
if (low == high)
return a[low];
int pivotIdx = partition(a, low, high);
int sizeOfLeftSubArray = pivotIdx - low + 1;
/*If'pivotIdx' is greater than 'k', keep looking towards
left part*/
if (sizeOfLeftSubArray > k)
{
return quickSelect(a, low, pivotIdx-1, k);
}
/*If'pivotIdx' is less than 'k', keep looking towards
right part*/
else if (sizeOfLeftSubArray < k)
{
return quickSelect(a, pivotIdx+1, high, k-sizeOfLeftSubArray);
}
/*We just found our kth index, return it*/
else
{
return a[pivotIdx];
}
}
For this problem, we could try to use quicksort multiple times.
First choose a pivot A[i], then put the doubles that is less than A[i] on its left, the doubles bigger than A[i] on its right. Then check the number of doubles are bigger than A[i], name it m, if m>K, we throw away the part that is less or equal to A[i], K is unchanged;
if m==K, we return A[i]
if m<K, we throw away the part that is bigger than A[i], update K to K-m. Then we do the same thing to the remaining array :choose pivot and sort around that pivot
The time complexity is O(n)+O(n/2)+O(n/4)+.....=O(n) in the average case. Worst case senario would be O(n^2)though.
Space complexity is O(n)
it can be done in two way:-
1)either sort the array then take the kth element from last.space complexity 0(1) time complexity o(nlogn).
2)take first k element of array and make min heap from that,for the rest element check the value of root of heap with array value if array value > root value,replace the root value with array value.space complexity is o(k) which is constant time is o(klogk)+(n-k)log(k).
No one mentioned quickselect ? Expected complexity O(n)
Here's is a java solution :
import java.util.Random;
public class KLargest {
static double [] tab;
static Random rand = new Random();
public static double largest(double [] intab,int k){
tab=intab;
return largestAux(k, 0, tab.length-1);
}
private static double largestAux(int k, int start, int end)
{
int pivot = rand.nextInt(end-start+1)+start;
int res = partition(start,end,pivot);
if (res-start == k)
return tab[res];
else if(res-start <k)
return largestAux(k-(res-start+1),res+1,end);
else
return largestAux(k,start,res-1);
}
private static int partition(int start, int end, int pivot){
double val = tab[pivot];
int insert = start;
swap(pivot,end);
for(int i=start;i<=end;i++){
if(tab[i]>=val){
swap(i,insert);
insert++;
}
}
return insert-1;
}
private static void swap(int i, int j){
double temp = tab[i];
tab[i] = tab[j];
tab[j]=temp;
}
public static void main(String [] args){
System.out.println(largest(new double[]{1,2,3,4,5},2));
}
}
This question is a good test for the underlying understanding of tradeoff.
1> For Memory Compromise, I can use a Hash of k buckets to hold k different values which has a time advantage of O(k) but memory O(k)
2> For Time Compromise, I will use any sorting perhaps the merge sort which can provide the best of all giving O(nlogn) . (I m talking about Merge function without using extra additional array ).
Wow What is this ?? Am I reading it correct ?? Can't we find kth largest using hash ?? if so why the hash is used for ??
Sorry man. if I disappointed ya..coz its not abt what u think its abt what is right.. So if U hv any trouble in understanding or hving better idea post it.. Don't take it personal..
Anyway.. lemme know if U donno how to find kth largest using Hash.. Coz Hash basically is used to find values with condition and here the condition is the largest value..
k^th largest using hash? I call BS.
Given a random set of objects which support a Compare function, give an algo using hash, to find the k^th largest.
Go on...
There are certain psyco person like this "Anonymous" who does not even have enough gud to express his identity, ruining the technical discussion and moving down the correct idea at the bottom of the discussion.
I urge the users to just ignore the non-technical views and provide your own analysis.. I am going to report this mis conduct
LOL @hprem991.
First, anonymous users cannot vote down. So, your answer was voted down by someone who has registered in this site, a brave person, gasp.
Second, all I see in a comment by Anonymous is a technical question, which you don't seem to have understood (or are trying to dodge). It is a very valid question to your hash claim.
Perhaps you consider the BS comment rude? Welcome to the internet.
Interesting. You claim you can do this using a hash but give no example :P. Great. Must be a cool hash function that orders your values. Even if it would exist, it would still create a log(n)n algorithm... Because either the isnert or retrival of your hashmap can't be O(1).
And mergesort is by far not the best sorting algorithm and especially not the inplace version which is a huge mess in itself.
the naive way to sort the array in descending order, the best complexity would be O(n*log(n)), quick sort would be fine for that and then just return Kth element.
- S.Abakumoff February 15, 2013The second thought is using the standard top-N algorithm to build the the min-heap of size K+1 from the input array, the heap will contain K+1 largest elements from the input. Then return the heap top which is Kth largest.
The complexity is O(n*log(K) ), the space is O(K+1)