Amazon Interview Question
Principal Software EngineersCountry: India
Interview Type: Phone Interview
First pass : - Create an ordered trie for the set of words as the look up
Second pass :- starting with the first word look up the tie for last char available on the current word. If present concatenate to the result else go through with the next word in the source.
Construct a graph (connect two words accoding to the condition) and run a topological sort
{let s be the no. of words;
let n=1;
let l=0;
while(n<s OR (s-n-l-1)>0)
{
take nth word;
look the last letter;
loop:
search for the word starting with that letter in the remaining s-n words;
if found
{
interchange (n+1)th and the newly found word;
n=n+1;
}
else
{
interchange the nth and (l+1)th last word;
l++;
} //this else condition is to place the words for which there won't exist any words start with its last letter
}}
One thing to note is that it may not always be possible to sort the numbers as asked in the question.
Suppose you have the strings:
abc
cba
bbb
Where will you place the string 'bbb' w.r.t the other strings?
Supposing that the input strings can be arranged as required in the question, first sort the array 'strings[]' using the natural order of the strings. Then we can maintain an auxiliary array that holds the locations of strings starting with a particular character. For ex: Lets have the array index_manager[] such that:
index_manager['A'] = 0
index_manager['B'] = 12
index_manager['C'] = -1
...
index_manager['Z'] = 2014
The following algorithm should work:
Starting with the first string in the sorted array, do the following
0. Let the current string be 'current_string' and print it.
1. Let the first character of current_string be 'F' and increment index_manager['F'].
If first_char_of(string[index_manager['F']]) != first_char_of(current_string) then index_manager['F'] = -1
2. Look at the last character of that string. Let it be some character 'L'. Look up the character in the index_manager[] array.
3. if (index_manager['L'] == -1) print 'Sorting not possible' otherwise go to step 4
4. Set current_string = string[index_manager['L']] and go to step 4.
We may need to take care of some other boundary conditions, but the schema in general should work.
import java.util.*;
/**
* Given a array of names, this program would sort all names such that last character of preceding string is equivalent to first character of current name
**/
class NameSort{
/**
* @return String[] array of sorted names as per requirement.
* If not possible to sort as per given conditions than return null;
*/
String[] sortNames(String[] names) {
Arrays.sort(names); //Alphabetical order
String[] orderedNames = null;
for(int index = 0; index < names.length; index++) {
Set<String> orderedNamesSet = new LinkedHashSet<String>();
if(visitNames(orderedNamesSet, names[index], names)) {
orderedNames = orderedNamesSet.toArray(new String[orderedNamesSet.size()]);
break;
}
}
return orderedNames;
}
private boolean visitNames(Set<String> orderedNames, String candidateString, String[] names) {
orderedNames.add(candidateString);
for(String nextCandidate : findNamesStartingWithChar(names, candidateString.charAt(candidateString.length()-1))) {
if(!orderedNames.contains(nextCandidate)) {
Set<String> cloneSet = new LinkedHashSet<String>(orderedNames);
if(visitNames(cloneSet, nextCandidate, names)) {
orderedNames.clear();
orderedNames.addAll(cloneSet);
break; //I found my candidate
}
}
}
return orderedNames.size() == names.length;
}
/**
* Iterate through each name, and do a binary search in endCharSortNames string
* Else i will have to try with all possible String sequence
*/
private List<String> findNamesStartingWithChar(String[] alphabetSortNames, char startChar) {
List<String> names = new ArrayList<String>();
int low = 0;
int high = alphabetSortNames.length - 1;
while(low <= high) {
int middle = (low + high) >>> 1;
if(getFirstChar(alphabetSortNames[middle]) == startChar) {
for(int index = middle; index >=0 && getFirstChar(alphabetSortNames[index]) == startChar; index--) {
names.add(alphabetSortNames[index]);
}
for(int index = middle+1; index <alphabetSortNames.length && getFirstChar(alphabetSortNames[index]) == startChar; index++) {
names.add(alphabetSortNames[index]);
}
break;
} else if(getFirstChar(alphabetSortNames[middle]) < startChar) {
low = middle+1;
} else {
high = middle-1;
}
}
return names;
}
private char getFirstChar(String name) {
return (name == null || name.isEmpty()) ? ' ' : name.charAt(0);
}
public static void main(String[] args) {
String[] names = new String[]{"vikash", "arnabv", "hritu"};
System.out.println("Original Names: " + Arrays.toString(names));
String[] sortedNames = new NameSort().sortNames(names);
System.out.println("Sorted Names: " + (sortedNames == null ? "No possible order satisfying given condition" : Arrays.toString(sortedNames)));
}
}
solution takes O(N^2) but very straight foreword -
/// <summary>
/// builds the word link by connecting the head and end of each of the word in the list.
/// </summary>
/// <param name="leadingChar">expected Head Char for the string at the top of the returning list. </param>
/// <param name="candidateList">the list of the strings</param>
/// <returns>sorted string list, null if such list can't be built</returns>
public static List<string> BuildWordLink(char leadingChar, List<string> candidateList)
{
for (int i = 0; i < candidateList.Count; i++)
{
var currentelement = candidateList[i];
if (leadingChar == ' ' || currentelement[0] == leadingChar)
{
// currentelement is the only element left, we should return here.
if (candidateList.Count == 1)
{
var returnList = new List<string>();
returnList.Add(currentelement);
return returnList;
}
// remove the currentelement from the candidatelist, build a new wordlink using the remaining words that can link to the end character of the current element
candidateList.RemoveAt(i);
var result = BuildWordLink(currentelement[currentelement.Length - 1], candidateList);
if (result != null)
{
result.Insert(0, currentelement);
return result;
}
candidateList.Insert(i,currentelement);
}
}
return null;
}
To test -
var test = new List<string> {"luis", "hector", "selena", "emmanuel", "amish", "anna", "andrea", "rawle"};
var result = BuildWordLink(' ', test);
public class OrderWordsNextStartsWithLastLetterOfPrevious {
public static void orderWords(List<String> words) {
TreeSet<String>[] starts = (TreeSet<String>[]) new TreeSet<?>[26];
TreeSet<String>[] ends = (TreeSet<String>[]) new TreeSet<?>[26];
for(String word:words) {
int startIndex = word.charAt(0) - 'a';
int endIndex = word.charAt(word.length()-1) - 'a';
if(starts[startIndex] == null) {
starts[startIndex] = new TreeSet<String>();
}
if(ends[endIndex] == null) {
ends[endIndex] = new TreeSet<String>();
}
starts[startIndex].add(word);
ends[endIndex].add(word);
}
//find first and last
int startIndex = -1;
int endIndex = -1;
for(int i = 0; i < starts.length; i++) {
if(starts[i]==null && ends[i]==null) continue;
else if(starts[i]!=null && ends[i]==null) {startIndex=i; continue;}
else if(starts[i]==null && ends[i]!=null) {endIndex=i; continue;}
else {
if(startIndex<0) startIndex = i;
if(endIndex<0) endIndex = i;
if(starts[i].size() > ends[i].size()) startIndex = i;
else if(ends[i].size() > starts[i].size()) endIndex = i;
}
}
String word = starts[startIndex].pollFirst();
while(word!=null) {
System.out.println(word);
int end = word.charAt(word.length()-1)-'a';
ends[end].remove(word);
if(starts[end]==null) {word=null;continue;}
word = starts[end].pollFirst();
int endOfNewWord = word.charAt(word.length()-1) - 'a';
if((starts[endOfNewWord]==null || starts[endOfNewWord].size()==0) && starts[end].size()>0) {String tmp = word; word=starts[end].pollFirst(); starts[end].add(tmp);}
}
}
public static void main(String[] args) {
String[] words = {
"luis",
"hector",
"selena",
"emmanuel",
"amish",
"heoj",
"jeoh"
};
orderWords(Arrays.asList(words));
}
}
Interesting question.
- EulGam05 January 08, 2014Someone suggested reducing the problem to finding a Hamiltonian Path in a graph. I might be wrong but I think finding a Hamiltonian Path in a general graph is an NP-Complete problem. However, finding an Euler path in a graph is not, so instead of reducing the problem to find a Hamiltonian path we might try reducing it to finding an Euler path.
Instead of looking at the names as vertices, we'll look at them as edges between two characters (the starting character and the ending character). Thus, our graph G will be defined as follows:
The vertices are all the starting and ending characters from the strings (no repetitions).
For every two vertices c1 and c2, there is an edge (c1,c2) for every name which begins with c1 and ends with c2 (with parallel edges as well, meaning that if there are two names starting with c1 and ending with c2 then there would be two edges from c1 to c2 in the graph as well).
For example:
Input = {"luis","hector","selena","emmanuel","amish","anna","andrea","rawle"}
The graph G would be as follows:
Vertices = {'l','s','h','r','a','e'}
Edges (the format is (from,to,name)) = {('l','s',"luis"), ('h','r',"hector"),('s','a',"Selena"),('e','l',"emmanuel"),('a','h',"amish"),('a','a',"anna"),('a','a',"andrea"),('r','e',"rawle")}
Notice that there are two edges from 'a' to 'a' (one for "anna" and another for "andrea").
Now, notice that each path (with more than one vertex) in the graph we just defined corresponds to some names in the right order. For instance:
Path: 's' --("selena")---> 'a' --("andrea") ---> 'a' --("amish")---> 'h'
Thus, in order to sort all the names we must find a path in G which goes through all the edges exactly once, in other words we must find an Euler path in the graph G.
Algorithm:
1. Build the graph G from the set of strings as described above.
2. Find an Euler path in the graph G (for instance: using Hierholzer's algorithm).
3. If an Euler path was found in (2) then Output the strings in the order induced by the path found in (2).
4. else, output "Sorting not possible".
Complexity:
1. Building the graph can be done in linear time.
2. Hierholzer's algorithm can be implemented in linear complexity as well.
3. Printing the path is also done in linear time (the path contains at most n edges).
Overall complexity: O(n) where n is the number of names.
Here is my attempt at a non-optimized (O(n^2)) implementation of this idea:
snipt.org/BKic4
With some additional data structures - path merging and next character selection can be reduced to O(1) instead of O(n) which would make this solution linear. It worked well for the tests I ran but I haven't tested it thoroughly.