Facebook Interview Question
Software Engineer / DevelopersCountry: United States
Interview Type: In-Person
A simple way to do this is to maintain a hashMap of Map<String,Set<String>> where the key is the band name and the set of strings is basically the number of employees who have requested this band. This way we can easily invalidate those bands whose value count is lesser than 2 as nobody else likes this band.
Now with this reduced subset, we can convert it to a graph problem where the verticies are the music bands who have more than 1 recommendations and the edges are the employees. Now this is reduced to a spanning tree problem which can be efficiently solved
Space wise we have just a hashmap with the number of entries equal to the number of bands in this case the worst case scenario would be the case where all the 10000 bands requested by each employee will be different in which case we'd have n * 10000 entries where n is the number of employees. On the flip side there would be no processing required here as the reference count for each value would just be one.
There are many efficient ways to solve the graph problem when we need to process the data.
This is interesting to use a graph.
Can you explain why the vertices are bands and the edges are people? To me, it seems like the vertices should be people and the edges be bands, since we're interested in the connectivity between people.
But maybe this is just because I don't know much about spanning trees.
using Hashing method , store all the band(key) of each colleuges and initialize the value with 0.
Whnever you are getting the given band name ,increment the value of the band name.
That band name whihc is getting the more number of value,that bands will be selected .
struct HashIntPair
{
size_t operator()(const pair<int,int>& intPair) const
{
std::hash<int> hasher;
return hasher(intPair.first) + 13*hasher(intPair.second);
}
};
int _tmain(int argc, _TCHAR* argv[])
{
ifstream ifs ("c:\\src\\example.txt");
if (!ifs.is_open())
{
return -1;
}
vector<string> UserNames;
unordered_map<string, vector<int>> AlbumToUsersMap;
unordered_map<pair<int, int>, int, HashIntPair> RelatedScoreMap;
unordered_map<int, vector<int>> RelatedUsers;
// process input
string temp;
getline(ifs, temp);
int numUsers = 0;
stringstream(temp)>> numUsers;
for (int i = 0; i < numUsers; ++i)
{
getline(ifs, temp);
stringstream liness(temp);
string name;
getline(liness, name, ':');
UserNames.push_back(name);
// read music
string album;
while(getline(liness, album, ','))
{
AlbumToUsersMap[album].push_back(i);
}
}
ifs.close();
// updated related scores
for (unordered_map<string, vector<int>>::iterator at = AlbumToUsersMap.begin(); at != AlbumToUsersMap.end(); ++at)
{
for (vector<int>::iterator it= at->second.begin(); it != at->second.end(); ++it)
{
for (vector<int>::iterator jt = it+1; jt != at->second.end(); ++jt)
{
pair<int,int> related = *it < *jt ? make_pair<int, int>(*it, *jt) : make_pair<int, int>(*jt, *it);
unordered_map<pair<int, int>, int>::iterator rtscore = RelatedScoreMap.find(related);
if (rtscore == RelatedScoreMap.end())
{
RelatedScoreMap[related] = 1;
}
else
{
++ (rtscore->second);
RelatedUsers[*it].push_back(*jt);
RelatedUsers[*jt].push_back(*it);
}
}
}
}
// output
ofstream ofs ("c:\\src\\exampleout.txt");
for (size_t i = 0; i < UserNames.size(); ++i)
{
ofs << UserNames[i] << " : ";
for (vector<int>::iterator it = RelatedUsers[i].begin(); it != RelatedUsers[i].end(); ++it)
{
if (it != RelatedUsers[i].begin()) { ofs << " , " ; }
ofs << UserNames[*it];
}
ofs << endl;
}
ofs.close();
return 0;
}
this problem could be done in O(length of file) as follows:
maintain a trie and put the name of the bands into it. when finished putting a band into the trie, add the corresponding college name into the last trie node's college list.
each time you put a new band into the trie, you either found that this is a new band, or this band also belongs to some other college. so if you maintain a hash table for each college, you can find the required pairs in O(1).
Go through the input once, build a hash table with the keys being the band name, values being a list of colleagues
Now go through the input again, for each band mentioned by each colleague, create a new hash table with key being other colleagues, value being the number of times they appear in the bands that the current colleague likes.
Now for each colleague, we just need to go through the new hash table and find the entry with the maximum value
Code can be tested here: ideone.com/4bgU7
One caveat is that the order of the compatibility list is not preserved (although it wasn't mentioned in the problem statement, but I can see that if there's a tie, output in the order same as the input colleague list).
import sys
def parseLine(line):
parts = line.split(':')
name = parts[0]
bands = parts[1].rstrip().split(',')
return name, bands
num_lines = int(sys.stdin.readline())
names = [] # to preserve the original order
colleague_likes = {}for line in sys.stdin:
name, bands = parseLine(line)
colleague_likes[name] = bands
names.append(name)
band_likes = {}
for item in colleague_likes.items():
name = item[0]
bands = item[1]
for band in bands:
if band_likes.has_key(band):
band_likes[band].append(name)
else:
band_likes[band] = [name]
for name in names:
bands = colleague_likes[name]
compatibility = {}
for band in bands:
for colleague in band_likes[band]:
if colleague == name:
continue
else:
if compatibility.has_key(colleague):
compatibility[colleague] = compatibility[colleague] + 1
else:
compatibility[colleague] = 1
max_compatibility = 0
compatibility_list = []
for item in compatibility.items():
if item[1] < 2:
continue
if item[1] > max_compatibility:
compatibility_list = [item[0]]
max_compatibility = item[1]
elif item[1] == max_compatibility:
compatibility_list.append(item[0])
print '%s: %s' % (name, ', '.join(compatibility_list))
- swamy540@yahoo.co.uk July 04, 2012