Facebook Interview Question
SDE1sCountry: United States
@avv: you're wrong: stackoverflow.com/questions/10657503/find-running-median-from-a-stream-of-integers.
it's obvious that a single heap won't help, but it's a standard algorithm to calculate a running mean with two heaps. One has the upper part and the other the lower part of the values received from the streams. Please at least use google before posting such a nonsense.
std::vector<double> get_medians(int** mat, int m, int n, int k) {
std::vector<double> medians;
std::multiset<int> window;
for (int i = 0; i <= m - k; ++i) {
window.clear();
for (int l = 0; l < k; ++l) {
for (int j = 0; j < k; ++j) {
window.insert(mat[l + i][j]);
}
}
medians.push_back(get_median(window));
for (int j = 1; j <= n - k; ++j) {
for (int l = 0; l < k; ++l) {
int val = mat[l + i][j - 1];
window.erase(window.lower_bound(val)); // only delete one instace
window.insert(mat[l + i][j + k - 1]);
}
medians.push_back(get_median(window));
}
}
return medians;
}
Treat it as a stream: when moving the window smartly k elemts leave the window and k enter. Let's suppose we do partition the first k*k window, to find the median, this takes O(k*k). Then I move right, I will know of the k elements I remove and I get k elements to add. If I only added k elements I could simply use two heaps and kind of calculate a "running" mean. But now, I have to remove k elements as well. So I'd need a heap where I can find and remove a specific element. One can do this with a custom bin heap implementation and use a HT as index.
- Chris September 19, 2017Let's see, the heaps are O(k^2) in size. Initial creation is O(k*k*lg(k)). Then for each of the (n-k)^2 moves it takes 2*lg(k). So O(k^2*lg(k^2)+n^2*lg(k)). That is O(n^2*lg(k)). Instead of O(n^2*n^2) or O(n^4*lg(n^2)) depending on brute force approach.
I think decent, improvement possible by Fibonacci heaps, but either way a bit ugly to code due to the special requirement on having a trackable heap element that can be referenced (indexes) by hash table.