Given a list of words, return

Jane Street Interview Question for Interns

0

of 0 votes

15
Answers
Given a list of words, return a list of lists, where each list contains a group of anagrams.
Sample Input: cat bat act tab
Sample Output: [ [ cat, act ] , [ bat, tab ] ]
- panoptic.biopower December 12, 2012 in United States | Report Duplicate | Flag | PURGE
Jane Street Intern

Email me when people comment.

An error occurred in subscribing you.

Country: United States
Interview Type: Phone Interview

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 2 vote

Hi!

I would use a HashMap , where the key is the word sorted  and as value I would have a ArrayList of words for which if we sort the letters we would obtain the key of their entry. 

After going over each word of the list and putting it into a category , I would create  an ArrayList<ArrayList<String>> . Going just once over the HashMap I will put into the ArrayList all the values from the HashMap, and return it.

Example :
Input : cat, bat, act, tab

HashMap<String, ArrayList<String>> map  = new HashMap<String, ArrayList<String>>();
--- after going once over each word of the input, map will contain: 
{ act = {cat, act}; abt = {bat, tab} }

ArrayList<ArrayList<String>> out = new ArrayList<ArrayList<String>>();
foreach key in map {
   out.add(map.get(key));
}

return out;

Complexity time: O(n* the max length of a word from the input list)

- Kamy December 13, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

but the sorting itself would take xlgx comparisons (x is the length of a word).

- hero December 13, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

you're right! sry .. so the correct complexity for this is n * x log x

- Kamy December 13, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

we don't even need to sort the words, we can create a hash code with the alphabets count in a word then use that as the key...it will reduce the complexity to O(n x) rather than O(n x logx). A simple example of such a hash code can be for cat = a1c1t1 bat = a1b1t1 act = a1c1t1 tac = a1c1t1 and so on

- The Artist December 13, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

well the hash code that i gave as an example is actually count sort and on second though i think you can use count sort (an O(n) algorithm if you know that the letters are only english alphabets

- The Artist December 13, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

<<<
Create a list of lists using below data struct:
struct ListList
{
char * string;
char * revstring;
struct ListList * next;
} * return_list;

Start iterating over given list of words, for each word do the following:
1) search the word in hash_table, if not found then
1.1) Create a node of ListList, add it the return_list and point the node.string to that word.
1.2) Mark node.revstring = NULL
1.3) Reverse the word and put into hash_table (considering key) and value as the address of node.

2) If word found in hash table then
2.1) from hash table entry get the address of node and point node.revstring to that word.

Once the iteration of given list is over, start iterating ‘return_list’ and remove & delete the nodes whose ‘node.revstring’ value is NULL.

Total complexity O(n).

>>>

- Ashutosh December 14, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

no energy to try to understand your logic but since you are doing something with the reverse of a word i seriously think that you need to first find out the meaning of Anagrams

- The Artist December 14, 2012 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

The core of this algorithm is calculating two numbers of each string. In the first round, It will calculate the sum of each characters minus 'a', meanwhile the min character in the string will be found. In the second round, the sum of the squre differrence between the min character get in the first round and each character in the string. With these two sum values, it can be easliy told that if two strings are in the same group of anagrams

struct Pair
{
string name;
int sum;
int SquSum;
};
node* createNode(Pair* p)
{
node* n = (node*) malloc(sizeof(node));
n->pair = p;
return n;
}

vector<std::list<Pair*> > results;
vector<string> inputs;
void insertResult(node* n)
{

for(vector<std::list<Pair*> >::iterator it = results.begin(); it != results.end(); it++)
{
if(((*it->begin())->sum == n->pair->sum)&&((*it->begin())->SquSum==n->pair->SquSum))
{
it->push_back(n->pair);
return;
}
}

{
list<Pair*> list;
list.push_back(n->pair);
results.push_back(list);
}
}
void calculateString(string input, Pair*& pair)
{
int length = input.size();
int sum = 0, squSum = 0, min = 0;
for (int i = 0; i < length; i++)
{
min = input[min] > input[i] ? i : min;
sum += input[i] - 'a';
}
for (int i = 0; i < length; i++)
{
squSum += (input[min] - input[i]) * (input[min] - input[i]);
}
pair->sum = sum;
pair->SquSum = squSum;
pair->name = input;
cout << "The string " << input << " pair result: " << sum << " " << squSum << endl;
}
int main()
{
inputs.push_back("abc");
inputs.push_back("bcd");
inputs.push_back("cbd");
inputs.push_back("bac");

for (vector<string>::iterator it = inputs.begin(); it != inputs.end(); it++)
{
Pair* pair = (Pair*) malloc(sizeof(Pair));
calculateString(*it, pair);
insertResult(createNode(pair));
}
for(vector<std::list<Pair*> >::iterator it = results.begin(); it != results.end(); it++)
{
for(list<Pair*>::iterator iter = it->begin();iter != it->end(); iter++)
{
cout<<(*iter)->name<<" "<<(*iter)->sum<<" "<<(*iter)->SquSum<<endl;
}
cout<<endl;
}
}

- lwh20053276@163.com December 14, 2012 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Store in a HashMap.

Now override the HashCode method for this String Object. HashCode should work on the principle of the sorted value of the Anagram hence all Anagrams will give you the same HashCode value.

Now override the Equals() method and let it work on the basis of the normal String comparison.

Hence all Anagrams are stored at the same Map location in buckets.

- Abhi January 29, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

MultiMaps are also a useful tool

- Abhi January 29, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

code in haskell

f xs=elems (fromListWith(++)[(sort v,[v])|v<-xs])

- lispc October 08, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Code in OCaml using Core.Std.Map (Tree map)

open Core.Std

let group_anagrams anagrams =
  let map = List.fold anagrams
      ~init:(Map.empty String.comparator)
      ~f:(fun map word ->
          let word_sorted =
            (String.to_list word)
            |> (List.sort ~cmp:Char.compare)
            |> String.of_char_list
          in
          let words = match Map.find map word_sorted with
            | None -> [word]
            | Some words -> word :: words
          in
          Map.add map ~key:word_sorted ~data:words
        )
  in
  Map.fold map ~init:[] ~f:(fun ~key ~data acc -> data :: acc)

let () =
  assert (
    (group_anagrams ["cat"; "act"; "bat"; "tab"; "b"])
    = [["b"]; ["act"; "cat"]; ["tab"; "bat"]]
  )

- byvoid October 20, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Perl solution, because why not? Added a cache so that sorting only happens a single time for duplicate words. I also did some other sanitizing e.g removed all whitespace and punctuation:

#!/usr/bin/perl

use strict; 
use warnings; 

use Data::Dumper; 

sub group {
   my @words = @_;

   my %anagram_map; 
   my %sort_cache; 
   foreach my $word ( @words ) { 
      # strip all whitespace 
      $word =~ s/[^0-9A-Za-z]+//g;
   
      # force to lowercase
      $word = lc $word; 

      # sort word unless already sorted 
      my $sorted_word = $sort_cache{$word} || join '', sort +split //, $word; 

      # add to cache unless it already has an entry
      $sort_cache{$word} = $sorted_word 
         unless $sort_cache{$word};

      # group by anagram (sorted string)
      push @{$anagram_map{$sorted_word}},
         $word; 
   }

   # values of the anagram map are the grouped solutions
   return [values %anagram_map];
}

print Dumper group(qw(cat bat act tab));
print Dumper group('the END of the world is nigh', 'down this hole frightened', 'b', 'rome wasn\'t built in a day', 'but laid in two years man', 'a man a plan a canal panama');

__DATA__
# results        
        [
          [
            'cat',
            'act'
          ],
          [
            'bat',
            'tab'
          ]
        ];
        [
          [
            'theendoftheworldisnigh',
            'downthisholefrightened'
          ],
          [
            'romewasntbuiltinaday',
            'butlaidintwoyearsman'
          ],
          [
            'b'
          ],
          [
            'amanaplanacanalpanama'
          ]
        ];

- mcmillhj April 14, 2014 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Solution in python

words = ['cat','act','bat','tab']

def anagramList(words):
    grouplist = []
    wordDict = {}
    for w in words:
        wordDict[''.join(sorted(w))] = []
        
    
    for w in words:
        wordDict[''.join(sorted(w))].append(w)
        
    for sl in wordDict:
        grouplist.append(wordDict[sl])

    return grouplist
        
print(anagramList(words))

- Anonymous August 16, 2018 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

class Solution {
Map<String, List<String>> map = new HashMap<>();
public List<List<String>> groupAnagrams(String[] strs) {
for (String s : strs) {
char[] temp = s.toCharArray();
Arrays.sort(temp);
String key = new String(temp);
if(!map.containsKey(key)) {
map.put(key, new ArrayList<>());
}
map.get(key).add(s);
}
return new ArrayList<List<String>>(map.values());
}
}

- AlexIonescu January 09, 2020 | Flag Reply

CareerCup

Jane Street Interview Question for Interns

Books

Videos

Resume Review

Mock Interviews