Facebook Interview Question
Software Engineer / DevelopersCountry: United States
Interview Type: Phone Interview
good point for word truncating problem. When processing a chunk, see whether this chunk ends with a delimiter(e.g. space, common, period...), if so, handle all words in this chunk; otherwise, handle all words before the last one since it may not be complete and update the file pointer accordingly. Then read the next chunk.
Read the file in chunk, big enough to accommodate in memory. Process the chunk and reverser it and put into another file, Repeat this operation for the next chunk
I think this is the way to do this. @Annonymous: Pleasae explain how complexity is more? and suggest your solution of less complexity.
Let us take the extreme case that one word of the file is too big that it can't fit into the Memory, then we can take pointer(PTR1) to the first character of that word(stored in some block 'x' of hard disk ) and pointer(PTR2) to the last character of that word(stored in some block 'y' of hard disk ) and keep on swapping and incrementing PTR1 and decrementing PTR2 till the end of the blocks.
Then again we will do this for the rest part of the word.
Until and unless PTR1 becomes equal to the PTR2.
If (PTR1==PTR2)
Means the whole word is reversed.
Like this can be done to reverse all the words in the first file.
After that we can read the maximum part of the first file which can fit into the memory and then copy it into the second file.
I am posting similar question asked to me for other company.
How do we sort a file that doesn't fit into Main Memory?
One way could be by dividing the file into say K chunks, each of which could fit in memory.
And now, sort each of the K chunks.
Finally apply K - way merge operation on them.
Seems like there is some confusion here. Is this to be done in place? If so then just reverse the file by reading and swapping first n and last n bytes until you reach the middle. Then from the beginning read each word and reverse. (Lets assume that each word is small enough to fit in the buffer.)
If a new file is to be created from the existing file, a lay approach would be to read file backwards and reverse each word before writing to the new file.
divide the file into n/k chunks. distribute n/k chunks to a set of computers, they reverse the words and give it back to the distributor. distributor on receiving prepares the output file. The distributor may receive some chunk first some second, so it either waits or writes the chunk to target position based on the worker's id to the desired destination
Do like this...
1. Break the huge data into some number of pieces so that each of the piece will fit into the memory.
2. Use multithreading to reverse each of the pieces one by one after bringing into the memory. Reverse using the technique such that on one piece of chunk 1st character is swaped with last, 2nd character is swaped with the second last. use one thread on one character.
3. After reversing all pieces of chunks use the same swaping technique using multithreading to reverse first piece of chunk with last piece of chunk for all.
You are done.....Best of Luck
(1) take *start and *end pointer at d beginning and end of 1st word respectively.
(2) Let *tmp is another pointer at *end.
(3) copy character at *tmp into another file and decrement tmp. Do this step untill *tmp reaches *start. This ll copy 1st word in reverse order.
(4) Now put *start at *end+1 (i.e at d beginning of next word). again find next *end position for d next word and repeat above step.
(5) do all above process untill *start reaches EOF
Complexity: O(2n) i.e O(n) (since every word is read twice)
take a file of same size,divide the original into chunks,and start copying from the original file 'character by character' to the end of the new file decrementing towards the start of this new file, i.e, 1st char at last pos,2nd char at 2nd last pos.......
File contains billions of instruction that cannot be fit into memory means we can't take complete file as input, we need to divide file into parts and only parts we need to read. we read every word one by one and reverse of that word then we store into another same file.and rejoin them into another file.
How about.. reverse whole file first. We can swap two characters at a time. First with last and so on. So buff_size /2 characters from the beginning and buff_size/2 from the end and then read each word and reverse it.
For example abc de fg -> gf ed cba -> fg de abc.
We would need a buffer that can accommodate the longest word though.
hi guys,
- aliciabondie23 July 10, 2012I've read every proposed answer, and beside pointer thing, I don't thing they'll work.
I've written a pseudocode something line below answer.
"Read the file in chunk, big enough to accommodate in memory. Process the chunk and reverser it and put into another file, Repeat this operation for the next chunk"
I tried to get current available memory size, and read that bytes from original file, parse words by space, reverse and store them in the second file.
But the thing is when you read a chunk of byes from original file, it may be a part of only 1 word. So in this case, word cannot fir into memory, so that was a problem.
BTW, they didn't look for pseudocode or just explanations, they looked for a working solution.