Let's say there is a file cons

Facebook Interview Question for Software Engineer / Developers

1

of 1 vote

35
Answers
Let's say there is a file consists of billions of records data.

The file cannot fit into memory, and you need to reverse each word in that huge file and then save the reversed words to another file.

How would you implement this?
- aliciabondie23 July 05, 2012 in United States | Report Duplicate | Flag | PURGE
Facebook Software Engineer / Developer Algorithm

Email me when people comment.

An error occurred in subscribing you.

Country: United States
Interview Type: Phone Interview

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 2 vote

hi guys,
I've read every proposed answer, and beside pointer thing, I don't thing they'll work.
I've written a pseudocode something line below answer.

"Read the file in chunk, big enough to accommodate in memory. Process the chunk and reverser it and put into another file, Repeat this operation for the next chunk"

I tried to get current available memory size, and read that bytes from original file, parse words by space, reverse and store them in the second file.

But the thing is when you read a chunk of byes from original file, it may be a part of only 1 word. So in this case, word cannot fir into memory, so that was a problem.

BTW, they didn't look for pseudocode or just explanations, they looked for a working solution.

- aliciabondie23 July 10, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

good point for word truncating problem. When processing a chunk, see whether this chunk ends with a delimiter(e.g. space, common, period...), if so, handle all words in this chunk; otherwise, handle all words before the last one since it may not be complete and update the file pointer accordingly. Then read the next chunk.

- jiangok2006 August 16, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Hi, whats u mean that they looked for a working solution? Does that mean they want a code to solve it?

- Anonymous February 16, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

Read the file in chunk, big enough to accommodate in memory. Process the chunk and reverser it and put into another file, Repeat this operation for the next chunk

- Anonymous July 05, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Complexity is more

- Anonymous July 05, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

I think this is the way to do this. @Annonymous: Pleasae explain how complexity is more? and suggest your solution of less complexity.

- Akash Jain July 05, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

This would be my preferred solution.

- Anonymous July 06, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

Let us take the extreme case that one word of the file is too big that it can't fit into the Memory, then we can take pointer(PTR1) to the first character of that word(stored in some block 'x' of hard disk ) and pointer(PTR2) to the last character of that word(stored in some block 'y' of hard disk ) and keep on swapping and incrementing PTR1 and decrementing PTR2 till the end of the blocks.
Then again we will do this for the rest part of the word.
Until and unless PTR1 becomes equal to the PTR2.
If (PTR1==PTR2)
Means the whole word is reversed.

Like this can be done to reverse all the words in the first file.

After that we can read the maximum part of the first file which can fit into the memory and then copy it into the second file.

- alexander July 05, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Sounds like you're altering the original input.

- Anonymous July 06, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

We have to first reverse in the original file and then store in another file

- alexander July 06, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Why do you need to swap ? You only read chunks at word boundaries. If a single word is too big for memory, you advance the pointer to the end of the word and just copy until it reaches your current pointer. You then move on to the next sequence of word.

- Anonymous November 16, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

I don't quite get the point... You mean this file contains very huge string and *reverse* means reverse strings per line or entire file??

- TS July 05, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

file has billions of data. can be 1 line of string and it may have 1+ words. the requirement is reverse every word and write to another file

- aliciabondie23 July 05, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

@pawan We are asked to reverse individual word not the whole file.
So I still still to my answer above

Read the file in chunk, big enough to accommodate in memory. Process the chunk and reverser it and put into another file, Repeat this operation for the next chunk

- -|- Pramod Chandoria July 05, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I am posting similar question asked to me for other company.

How do we sort a file that doesn't fit into Main Memory?

- Arulmozhi July 05, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

One way could be by dividing the file into say K chunks, each of which could fit in memory.
And now, sort each of the K chunks.
Finally apply K - way merge operation on them.

- Pavan Dittakavi July 05, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

This is not a similar question.

- Anonymous July 06, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Isn't the question about be implementation of External merge sort?

- nerd July 05, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

No.

- Anonymous July 06, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Use an LRU having a buffer size equal to the permissible memory size.

- Anonymous July 05, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 2 vote

Seems like there is some confusion here. Is this to be done in place? If so then just reverse the file by reading and swapping first n and last n bytes until you reach the middle. Then from the beginning read each word and reverse. (Lets assume that each word is small enough to fit in the buffer.)

If a new file is to be created from the existing file, a lay approach would be to read file backwards and reverse each word before writing to the new file.

- Pranav July 06, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 2 vote

divide the file into n/k chunks. distribute n/k chunks to a set of computers, they reverse the words and give it back to the distributor. distributor on receiving prepares the output file. The distributor may receive some chunk first some second, so it either waits or writes the chunk to target position based on the worker's id to the desired destination

- nutcracker July 06, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

So basically you're suggesting we use Hadoop to do this?

- Bonzo July 08, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Do like this...

1. Break the huge data into some number of pieces so that each of the piece will fit into the memory.
2. Use multithreading to reverse each of the pieces one by one after bringing into the memory. Reverse using the technique such that on one piece of chunk 1st character is swaped with last, 2nd character is swaped with the second last. use one thread on one character.
3. After reversing all pieces of chunks use the same swaping technique using multithreading to reverse first piece of chunk with last piece of chunk for all.

You are done.....Best of Luck

- Anonymous July 09, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

(1) take *start and *end pointer at d beginning and end of 1st word respectively.
(2) Let *tmp is another pointer at *end.
(3) copy character at *tmp into another file and decrement tmp. Do this step untill *tmp reaches *start. This ll copy 1st word in reverse order.
(4) Now put *start at *end+1 (i.e at d beginning of next word). again find next *end position for d next word and repeat above step.
(5) do all above process untill *start reaches EOF

Complexity: O(2n) i.e O(n) (since every word is read twice)

- SHAN July 10, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

In this way, we dont need to store word into memory. We are just copying character by character.

- SHAN July 10, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

take a file of same size,divide the original into chunks,and start copying from the original file 'character by character' to the end of the new file decrementing towards the start of this new file, i.e, 1st char at last pos,2nd char at 2nd last pos.......

- rudraksh July 16, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Reading char by char takes too much time we need to use divide and conquer strategy. Whole file needs to divide for parallel processing of input.

- Nahush July 17, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

File contains billions of instruction that cannot be fit into memory means we can't take complete file as input, we need to divide file into parts and only parts we need to read. we read every word one by one and reverse of that word then we store into another same file.and rejoin them into another file.

- Nahush July 17, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

How about.. reverse whole file first. We can swap two characters at a time. First with last and so on. So buff_size /2 characters from the beginning and buff_size/2 from the end and then read each word and reverse it.
For example abc de fg -> gf ed cba -> fg de abc.
We would need a buffer that can accommodate the longest word though.

- Pranav July 05, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

This works in place.

- Pranav July 05, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Incase if this is problem then better approach is
-> Read word from first file
--> seek to start always in second file & write the read word.

- Good July 05, 2012 | Flag

Comment hidden because of low score. Click to expand.

-1

of 1 vote

Read one word at a time reverse it and then store it into the destination file, since the file is very big.

- alexander July 05, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

What if one word doesn't fit into memory at a time?

- Aashish July 05, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

I have updated my solution

- alexander July 08, 2012 | Flag

CareerCup

Facebook Interview Question for Software Engineer / Developers

Books

Videos

Resume Review

Mock Interviews