Apple Interview Question
Software Engineer / Developers
Country: United States
How about creating an iterator with a resource-pull mechanism? The iterator holds the file, and each worker calls next() to get a chunk of data to process. Some threads might finish their task quicker than others; in that case they simply ask for more work. This is an ideal mechanism, as no thread sits idle.
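A minimal sketch of that pull idea, assuming the file is line-oriented and a chunk is a fixed number of lines. The names `ChunkSource`, `next()` and `processWithWorkers` are made up for illustration, and the "work" is just summing line lengths as a placeholder.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical shared source: workers pull the next chunk of lines whenever they are free. */
class ChunkSource {
    private final BufferedReader reader;
    private final int chunkSize;

    ChunkSource(BufferedReader reader, int chunkSize) {
        this.reader = reader;
        this.chunkSize = chunkSize;
    }

    /** Returns up to chunkSize lines, or an empty list once the input is exhausted. */
    synchronized List<String> next() {
        List<String> chunk = new ArrayList<>(chunkSize);
        try {
            String line;
            // Check the size first so a line is never read and then dropped.
            while (chunk.size() < chunkSize && (line = reader.readLine()) != null) {
                chunk.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return chunk;
    }

    /** Starts the workers; each keeps pulling chunks until the source is empty. */
    static long processWithWorkers(ChunkSource source, int workers) {
        AtomicLong total = new AtomicLong();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                List<String> chunk;
                while (!(chunk = source.next()).isEmpty()) {
                    for (String s : chunk) {
                        total.addAndGet(s.length()); // placeholder for the real work
                    }
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return total.get();
    }
}
```

Only the read is serialized by the `synchronized` method; the processing of each chunk runs in parallel, so a fast worker just comes back for the next chunk sooner.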
I think it depends on whether the threads will edit the file or not. If the threads only read the file, it should be easy. One can use the seek(long pos) method (or skipBytes(int n)) of the RandomAccessFile class to advance the file pointer. If the file size is 5k and we want 5 threads to read it, we can assign 1k to each, and when spawning a thread we can tell it from which location it should start reading.
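Roughly, the split could look like the sketch below. `ByteRangeReader`, `split` and `readRange` are names I made up; the only real API used is `RandomAccessFile.seek` and `readFully`. It assumes read-only access and that each range fits in an `int`-sized buffer.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

class ByteRangeReader {
    /** Splits [0, fileSize) into `parts` contiguous ranges; each entry is {start, length}. */
    static long[][] split(long fileSize, int parts) {
        long[][] ranges = new long[parts][2];
        long base = fileSize / parts;
        long extra = fileSize % parts; // the first `extra` ranges get one more byte
        long start = 0;
        for (int i = 0; i < parts; i++) {
            long len = base + (i < extra ? 1 : 0);
            ranges[i][0] = start;
            ranges[i][1] = len;
            start += len;
        }
        return ranges;
    }

    /** Reads one range through its own RandomAccessFile, so threads never share a file pointer. */
    static byte[] readRange(String path, long start, int length) {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            byte[] buf = new byte[length];
            raf.seek(start);    // jump straight to this thread's portion
            raf.readFully(buf); // read exactly `length` bytes or fail
            return buf;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Each thread would be handed one `{start, length}` pair and call `readRange` on it. Note that a plain byte split can cut an entry in half, so this only works as-is when the entries are fixed-size or the boundaries do not matter.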
This depends on the type of work the threads will do over the file. If processing an entry in the file is significantly more expensive than reading the file, have one thread reading the file and producing entries while all other threads consume these entries and process them.
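A rough sketch of that single-reader setup, assuming one line per entry and a bounded `BlockingQueue` between the reader and the workers. `ReaderAndWorkers`, `run` and `expensiveWork` are illustrative names, and the queue capacity of 1024 is an arbitrary choice.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

class ReaderAndWorkers {
    // Unique sentinel object, compared by identity, that tells a consumer to stop.
    private static final String POISON = new String("<<end>>");

    /** Placeholder for the expensive per-entry processing. */
    static long expensiveWork(String entry) {
        return entry.length();
    }

    /** The calling thread produces entries; `consumers` threads process them. */
    static long run(BufferedReader in, int consumers) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);
        AtomicLong processed = new AtomicLong();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < consumers; i++) {
            Thread t = new Thread(() -> {
                try {
                    String entry;
                    while ((entry = queue.take()) != POISON) {
                        processed.addAndGet(expensiveWork(entry));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads.add(t);
            t.start();
        }
        try {
            try {
                String line;
                while ((line = in.readLine()) != null) {
                    queue.put(line); // blocks when the consumers fall behind
                }
            } finally {
                // Always release the consumers, even if reading failed.
                for (int i = 0; i < consumers; i++) {
                    queue.put(POISON);
                }
            }
            for (Thread t : threads) {
                t.join();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return processed.get();
    }
}
```

The bounded queue keeps memory in check: if the workers are slow, the reader simply blocks on `put` instead of loading the whole file.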
Now suppose that processing an entry takes about as long as reading it from the file. The above scheme would not work well, since the single producer wouldn't keep up with the consumers. Let's try something different.
Let's suppose the file has F lines and each line is an entry to be processed. Scan every character in the file and index all line breaks. Now create N threads; each one opens its own FileInputStream, skips to the first line of its portion using the index, and from there on uses BufferedReader.readLine(). Each thread should stop after it has completed its own portion of the job (F/N lines).
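Something along these lines, as a sketch only. `LineIndex` and its methods are hypothetical names; it assumes lines end in `\n` (or `\r\n`) and UTF-8 text, and it positions the stream through `FileInputStream.getChannel().position(...)` rather than `skip`, since `skip` may skip fewer bytes than asked.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class LineIndex {
    /** One pass over the input: records the byte offset at which each line starts. */
    static long[] indexLineStarts(InputStream in) {
        List<Long> starts = new ArrayList<>();
        try (InputStream s = new BufferedInputStream(in)) {
            long pos = 0;
            boolean atLineStart = true;
            int b;
            while ((b = s.read()) != -1) {
                if (atLineStart) {
                    starts.add(pos);
                    atLineStart = false;
                }
                if (b == '\n') {
                    atLineStart = true;
                }
                pos++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return starts.stream().mapToLong(Long::longValue).toArray();
    }

    /** Lines [first, end) assigned to worker k out of `workers`, about F/N lines each. */
    static int[] lineRange(int totalLines, int workers, int k) {
        int first = (int) ((long) k * totalLines / workers);
        int end = (int) ((long) (k + 1) * totalLines / workers);
        return new int[] {first, end};
    }

    /** Worker side: jump to the byte offset of the first line, then read `count` lines. */
    static List<String> readLines(String path, long offset, int count) {
        try (FileInputStream fis = new FileInputStream(path)) {
            fis.getChannel().position(offset); // also moves the stream's read position
            BufferedReader r = new BufferedReader(new InputStreamReader(fis, StandardCharsets.UTF_8));
            List<String> out = new ArrayList<>();
            String line;
            while (out.size() < count && (line = r.readLine()) != null) {
                out.add(line);
            }
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Worker k would call `lineRange(F, N, k)`, look up `starts[first]` in the index, and pass that offset plus `end - first` to `readLines`. The indexing pass is sequential, so this only pays off when the per-line processing dominates.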
You can remove the line indexing if, instead of splitting the file by lines, you split it by bytes. In this case you have to be very careful if you use BufferedReader: it reads ahead into its buffer, so knowing when to stop is not trivial. You might have to count the number of characters in each entry so you know how many characters a thread has processed.
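To show the boundary bookkeeping, here is a sketch that works on a byte array already in memory instead of a BufferedReader, which keeps the counting explicit. It uses one common convention that is my assumption, not something stated above: a line belongs to the range that contains its first byte, so a worker skips a partial first line and finishes the line it is in when it crosses its end. `ByteSplit` and `linesInRange` are illustrative names.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ByteSplit {
    /**
     * Returns the lines whose first byte lies in [start, end).
     * Assumes 0 <= start <= end <= data.length and '\n' line endings.
     */
    static List<String> linesInRange(byte[] data, int start, int end) {
        int pos = start;
        // If we landed in the middle of a line, it belongs to the previous range: skip it.
        if (start > 0 && data[start - 1] != '\n') {
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // step past the newline
        }
        List<String> out = new ArrayList<>();
        while (pos < end && pos < data.length) {
            int eol = pos;
            while (eol < data.length && data[eol] != '\n') {
                eol++;
            }
            // This line may run past `end`; it is still ours because it starts before `end`.
            out.add(new String(data, pos, eol - pos, StandardCharsets.UTF_8));
            pos = eol + 1;
        }
        return out;
    }
}
```

With this rule every line is processed by exactly one worker, no matter where the byte boundaries fall.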
Use fork/join: keep dividing the file into smaller line ranges, forking a subtask for each part.
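A small sketch of the fork/join idea, assuming for brevity that the lines are already available as a `List<String>` (for a real file each task would carry a line or byte range instead). `LineRangeTask`, the threshold of 1,000 lines and the placeholder work of summing line lengths are all my own choices.

```java
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class LineRangeTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // assumed cut-off below which we stop splitting
    private final List<String> lines;
    private final int from;
    private final int to;

    LineRangeTask(List<String> lines, int from, int to) {
        this.lines = lines;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += lines.get(i).length(); // placeholder for the real per-line work
            }
            return sum;
        }
        int mid = (from + to) >>> 1;
        LineRangeTask left = new LineRangeTask(lines, from, mid);
        LineRangeTask right = new LineRangeTask(lines, mid, to);
        left.fork();                        // run the left half asynchronously
        long rightResult = right.compute(); // do the right half in this thread
        return left.join() + rightResult;
    }

    /** Convenience entry point using the common pool. */
    static long process(List<String> lines) {
        return ForkJoinPool.commonPool().invoke(new LineRangeTask(lines, 0, lines.size()));
    }
}
```

Work stealing in the pool gives the same effect as the pull mechanism discussed earlier: threads that finish early pick up the ranges others have not started yet.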
- bot25 October 05, 2014