Booking.com Interview Question
Software Developer. Country: India
Interview Type: Phone Interview
We can solve it in the simplest way:

static final String[] words = {"aaa", "bbb", "ccc", "booking", "alpha", "beta", "gamma"};
private static Map<String, Integer> wordList = new HashMap<>();

public static void main(String[] args) {
    String w = "booking";
    int k = 2;
    // build a dictionary mapping each word to its position
    for (int i = 0; i < words.length; i++) {
        wordList.put(words[i], i);
    }
    grep(w, k);
}

static void grep(String w, int k) {
    Integer location = wordList.get(w);
    if (location == null)
        return;
    // print at most k preceding words, like grep -B k; clamp at the start of the array
    int start = Math.max(0, location - k);
    for (int i = start; i <= location; i++) {
        System.out.print(words[i] + " ");
    }
}
Create a map<word, index> and keep the original sequence of words (typically an array). The map can be backed by hashing or a BST. Given a word, look up its index; if it exists, read the expected output from the original word sequence.
This assumes the whole operation fits in memory; if not, external indexing may be needed.
I think your solution idea is good. But it's a simple question, so what was the interviewer interested in?
- The algorithm runs in O(n), and most of the time is consumed by fetching data from disk into memory.
- The queue contributes to memory usage and to constant time factors. A ring buffer could reduce the constant factors in time and space (less copying around), but this optimization comes at the price of code complexity, so it needs to be weighed carefully.
Other than that, I can only see that he was thinking of a pattern instead of a word to look for, which would explain your O(n*m) time. In that case, preprocess the pattern so matching is O(n) again.
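The ring buffer mentioned above could be sketched like this; the `RingBuffer` class and its method names are illustrative, not from any post in this thread. It keeps the last k lines without ever shifting elements, unlike `remove(0)` on an `ArrayList`:

```java
// Illustrative ring buffer holding the last k lines without shifting elements.
class RingBuffer {
    private final String[] buf;
    private int next = 0;   // index where the next line will be written
    private int count = 0;  // number of lines stored so far (<= capacity)

    RingBuffer(int capacity) {
        buf = new String[capacity];
    }

    void add(String line) {
        buf[next] = line;               // overwrite the oldest slot when full
        next = (next + 1) % buf.length; // wrap around
        if (count < buf.length) count++;
    }

    // Return the stored lines in insertion order, oldest first.
    String[] contents() {
        String[] out = new String[count];
        for (int i = 0; i < count; i++) {
            out[i] = buf[(next - count + i + buf.length) % buf.length];
        }
        return out;
    }
}
```

Each add is O(1) with no element copying, which is the constant-factor saving described above.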
package com.home.careercup;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/*
This is NOT a solution with space complexity lower than O(n).
I am watching to see if anyone posts a solution that is better than O(n) space.
All solutions that use a map<word, index> already take up O(n) space inside
the index. Correct me if I am wrong.
*/
public class KGrepp {
    final static String[] words = {"aaa", "bbb", "ccc", "booking", "alpha", "beta", "gamma"};
    private static Map<String, List<Long>> wordLocator = new HashMap<>();

    static {
        build(words);
    }

    public static void main(String[] args) {
        grep("beta", 2);
        System.out.println("===");
        grep("booking", 3);
    }

    // map each word to every position where it occurs
    static void build(String[] words) {
        for (int i = 0; i < words.length; i++) {
            wordLocator.putIfAbsent(words[i], new ArrayList<>(2));
            wordLocator.get(words[i]).add((long) i);
        }
    }

    static void grep(String w, int k) {
        List<Long> locations = wordLocator.get(w);
        if (locations == null || locations.isEmpty()) return;
        for (long loc : locations) {
            // Math.max clamps at the start of the array, so a large k is safe
            for (long i = Math.max(0, loc - k); i <= loc; i++) {
                System.out.println(words[(int) i]);
            }
        }
    }
}
I assume it can be achieved in O(n) complexity. We don't need to buffer all the data and search for the key string again within the buffer. Instead, keep only k lines in memory, and when a new line equals the key string, print the buffer followed by the key.
This assumes up to k lines have already been read before the key is reached.
package careercup;

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class LimitedLengthQueue {
    private List<String> innerStore;
    private int size;
    private static final String FILENAME = "C:/filepath.txt.txt";

    public LimitedLengthQueue(int size) {
        this.innerStore = new ArrayList<>();
        this.size = size;
    }

    boolean add(String s) {
        if (innerStore.size() == size) {
            innerStore.remove(0);  // drop the oldest line once the buffer is full
        }
        return innerStore.add(s);
    }

    public static void main(String[] args)
            throws FileNotFoundException, IOException {
        LimitedLengthQueue inputQueue =
                new LimitedLengthQueue(Integer.parseInt(args[0]));
        String key = args[1];
        try (FileReader fr = new FileReader(FILENAME);
             BufferedReader br = new BufferedReader(fr)) {
            String input = null;
            while ((input = br.readLine()) != null) {
                if (key.equals(input.trim())) {
                    inputQueue.innerStore.forEach(System.out::println);
                    System.out.println(input);
                    break;
                }
                inputQueue.add(input);
            }
        }
    }
}
Hi Chris,
Wouldn't the time complexity be O(n*m), since for each iteration there will be m comparisons to check whether the iterated word matches the given word w?
Also, can you explain the preprocessing in a little more detail, or give me a reference where I can read up on it? Can it help bring down the space/time complexity?
@prashant.tah
Maybe imagine it this way:
- If you extract the words from the input buffer and then compare each extracted word against w, you:
- extract approximately n/m words in O(n)
- after (or while) doing that, perform at most m comparisons per word, which is again O(n)
- total of O(n)
Or another way:
- While finding word separators in the input buffer, you could compare against w, knowing whether w is a match as you extract. This is O(n), too, and slightly more obvious.
But if you try to search for a pattern within each extracted word with a naive approach, you would:
- extract n/m words in O(n)
- for each word of length m and a pattern of length p, perform p*(m-p+1) comparisons.
For pattern matching in linear time, Knuth-Morris-Pratt is especially interesting because it covers the whole prefix and suffix discussion, which I think gives some insight (a partial match can be reused when the pattern has a prefix that is also a suffix; it sounds strange until it clicks).
If we use a balanced binary tree to represent the document, where each node contains a word and the line or offset where it was found, lookup takes O(log n).
Then we have to find the k preceding words.
We keep the original ArrayList and get the k words by index, O(1) per access, thus O(k) total.
Time complexity is O(log(n) + k) per query.
Space complexity is O(n) (the tree plus the original list).
Pros: very fast
Cons: maintenance, code complexity
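A minimal sketch of this idea, using the JDK's `TreeMap` (a red-black tree, i.e. a balanced BST) as the index; the `TreeGrep` name and returning a list instead of printing are illustrative choices:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch of the balanced-BST idea: a TreeMap maps each word to the positions
// where it occurs; the original array answers the "k previous words" part by index.
class TreeGrep {
    static List<String> grep(String[] words, String word, int k) {
        NavigableMap<String, List<Integer>> index = new TreeMap<>();
        for (int i = 0; i < words.length; i++) {
            index.computeIfAbsent(words[i], w -> new ArrayList<>()).add(i);
        }
        List<String> out = new ArrayList<>();
        List<Integer> hits = index.get(word);  // O(log n) lookup
        if (hits == null) return out;
        for (int loc : hits) {
            // O(k) per occurrence, clamped at the start of the array
            for (int i = Math.max(0, loc - k); i <= loc; i++) {
                out.add(words[i]);
            }
        }
        return out;
    }
}
```

A `HashMap` would make the lookup O(1) on average; the tree buys ordered iteration (e.g. prefix queries) at the cost of the log factor.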
public void grep(String[] input, String word, int k) {
    // capped list: keeps at most the last k lines plus the matching word
    List<String> cappedList = new ArrayList<String>(k) {
        @Override
        public boolean add(String s) {
            if (size() > k) {
                super.remove(0);
            }
            return super.add(s);
        }
    };
    for (String s : input) {
        cappedList.add(s);
        if (s.equals(word)) {
            break;
        }
    }
    cappedList.forEach(System.out::println);
}
To find the word, you have to read through the file at least once, until the word is found.
So let's read the file line by line and maintain a queue of size k: keep inserting into the queue, popping the oldest entry when it is full, until the word is reached.
Once the word is found, just print the contents of the queue followed by the word.
import java.util.*;

public class Main {
    public static void main(String[] args) {
        String[] input = {"aaa", "bbb", "ccc", "booking", "alpha", "beta", "gamma"};
        int K = 2;
        String word = "beta";
        List<String> cappedList = new ArrayList<String>() {
            @Override
            public boolean add(String s) {
                if (size() > K) {
                    super.remove(0);
                }
                return super.add(s);
            }
        };
        for (String string : input) {
            if (string.equals(word)) {
                cappedList.add(string);
                break;
            }
            cappedList.add(string);
        }
        cappedList.forEach(System.out::println);
    }
}
public class HelloWorld {
    public static void main(String[] args) {
        String[] input = {"aaa", "bbb", "ccc", "booking", "alpha", "beta", "gamma"};
        int K = 2;
        String word = "beta";
        grep(input, K, word);
    }

    // Looks ahead k positions: printing starts k words before the match.
    // Note: this assumes the word occurs at index >= k; earlier matches print nothing.
    static void grep(String[] input, int k, String word) {
        boolean printEnabled = false;
        for (int i = 0; i < input.length; i++) {
            if (i + k < input.length && input[i + k].equalsIgnoreCase(word)) {
                printEnabled = true;
            }
            if (printEnabled)
                System.out.println(input[i]);
            if (word.equalsIgnoreCase(input[i]))
                break;
        }
    }
}
Since LinkedList is fast at adding and removing elements, I would use the following:
import java.util.LinkedList;

public class GrepCommand {

    public static void grep(String[] words, String word, int k) {
        if (k > words.length - 1) {
            return;
        }
        // LinkedList subclass that keeps only the last k lines before the match
        LinkedList<String> linesBuffer = new LinkedList<String>() {
            public boolean add(String object) {
                if (this.size() == k) {
                    this.removeFirst();
                }
                return super.add(object);
            }
        };
        for (int i = 0; i < words.length; i++) {
            if (word.equals(words[i])) {
                break;
            }
            linesBuffer.add(words[i]);
        }
        linesBuffer.forEach(System.out::println);
        System.out.println(word);
    }

    public static void main(String[] args) {
        grep(new String[]{"aaa", "bbb", "ccc", "booking", "alpha", "beta", "gamma"}, "booking", 3);
    }
}
- capped list January 12, 2019