Amazon Interview Question
Interns
Country: India
Interview Type: In-Person
Why can't you use something like this?
Hashtable<CompanyID, Hashtable<TimePeriod, StockValue>>
OR
HashMap<CompanyID, HashMap<TimePeriod, stockValue>>
The time period is arbitrary. Will you store an arbitrary number of time periods? Such a data structure has no bound on its size.
This is a good solution, but I would prefer a ConcurrentHashMap to a Hashtable.
HashMap<CompanyID, MIN-HEAP-BY-Stock-Value<Time>>
Searching: O(1)
HashMap gets the records of a company by company ID in O(1).
Getting the min-heap is O(1), so looking up the min stock value takes O(1).
Insertion: O(log n)
HashMap gets the record of a company by company ID in O(1).
Insertion into the min-heap takes O(log n).
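A minimal Java sketch of this heap-based idea (the class and method names are mine): the heap root gives the overall min price per company in O(1), though note it cannot answer a min over an arbitrary time period.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

public class StockStore {
    // companyId -> min-heap of {price, time}, ordered by price
    private final Map<String, PriorityQueue<double[]>> byCompany = new HashMap<>();

    // O(log n): hash lookup is O(1), heap insert is O(log n)
    public void record(String companyId, double price, long time) {
        byCompany.computeIfAbsent(companyId,
                k -> new PriorityQueue<>((a, b) -> Double.compare(a[0], b[0])))
                .add(new double[]{price, time});
    }

    // O(1): peek at the heap root
    public double minPrice(String companyId) {
        PriorityQueue<double[]> heap = byCompany.get(companyId);
        return heap == null ? Double.NaN : heap.peek()[0];
    }
}
```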
Map company ID to array of stock values. The arrays are naturally sorted by time, as each write would simply append.
To determine the min stock value for an arbitrary time period, perform a linear scan over the array. This is O(n), where n is the number of stock values for a given company. The use of a heap would increase this to O(n lg n), as such a heap could not be constructed until the input time period is known.
Because the number of stock prices for a reasonable time period (i.e. a day) may be quite large, we may wish to reduce the breadth of our linear scan by segmenting the per-company-arrays (i.e. one per hour, or day, or whatever is appropriate for the situation).
We may further reduce the breadth of the linear scan by maintaining min values for certain time periods. If the input distribution is heavily skewed towards a particular type of time period (e.g. delimited by day), then these cached mins may dramatically reduce the linear-scan breadth -- potentially to O(1), depending on the input.
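A Java sketch of the segmented per-company array with cached per-bucket mins (bucket size and names are illustrative): a query over whole buckets uses the cached min, and only the partial edges are scanned element by element.

```java
import java.util.ArrayList;
import java.util.List;

public class SegmentedSeries {
    private final int bucketSize;                  // e.g. samples per hour
    private final List<double[]> buckets = new ArrayList<>();
    private final List<Double> bucketMins = new ArrayList<>();
    private int countInLast = 0;

    public SegmentedSeries(int bucketSize) { this.bucketSize = bucketSize; }

    // Appends in time order, keeping the current bucket's min up to date.
    public void append(double price) {
        if (buckets.isEmpty() || countInLast == bucketSize) {
            buckets.add(new double[bucketSize]);
            bucketMins.add(Double.POSITIVE_INFINITY);
            countInLast = 0;
        }
        int last = buckets.size() - 1;
        buckets.get(last)[countInLast++] = price;
        bucketMins.set(last, Math.min(bucketMins.get(last), price));
    }

    // Min over sample indices [from, to): cached mins for full buckets,
    // element-by-element scan only at the two partial edges.
    public double min(int from, int to) {
        double best = Double.POSITIVE_INFINITY;
        for (int i = from; i < to; ) {
            int b = i / bucketSize, off = i % bucketSize;
            if (off == 0 && i + bucketSize <= to) {    // whole bucket
                best = Math.min(best, bucketMins.get(b));
                i += bucketSize;
            } else {                                   // partial edge
                best = Math.min(best, buckets.get(b)[off]);
                i++;
            }
        }
        return best;
    }
}
```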
Need more info, but this seems like a dynamic problem (no preprocessing of an offline static array), which can be solved with segment trees and range-min queries.
Or buffer the intraday stock variations if the time slices are smaller than a day, and insert into the segment tree at appropriate intervals.
Or mix and match even more.
It depends on more info.
Every day or week (or whatever) we can move historic data offline and preprocess that part. Then we can run range-min queries across the offline and dynamic parts if need be and combine the results.
Answer changed to depends.
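For reference, a fixed-capacity range-min segment tree in Java (a sketch under the assumption that prices arrive in time order, so a "dynamic" insert is just a point update at the next free leaf; capacity and names are mine). Both insert and query are O(log n).

```java
import java.util.Arrays;

public class MinSegmentTree {
    private final double[] tree;  // 1-based heap layout; leaves at [n, 2n)
    private final int n;
    private int size = 0;

    public MinSegmentTree(int capacity) {
        n = capacity;
        tree = new double[2 * n];
        Arrays.fill(tree, Double.POSITIVE_INFINITY);
    }

    // O(log n): write the leaf, then refresh ancestors up to the root.
    public void append(double price) {
        int i = size++ + n;
        tree[i] = price;
        for (i /= 2; i >= 1; i /= 2)
            tree[i] = Math.min(tree[2 * i], tree[2 * i + 1]);
    }

    // O(log n): min over sample indices [from, to).
    public double min(int from, int to) {
        double best = Double.POSITIVE_INFINITY;
        for (int l = from + n, r = to + n; l < r; l /= 2, r /= 2) {
            if ((l & 1) == 1) best = Math.min(best, tree[l++]);
            if ((r & 1) == 1) best = Math.min(best, tree[--r]);
        }
        return best;
    }
}
```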
From the question I gather:
> maintaining stock values ...
Sounds like a timeseries. I would use two arrays for this: the first stores the number of seconds/milliseconds elapsed since the beginning of capture, and the second stores the value. Arrays are much faster and have less overhead; the only issue is managing them. Besides, you can binary-search on the time if required.
Follow up question:
Is modification to the data structure multi-threaded? If yes, use a Hashtable (or a ConcurrentHashMap) instead of a HashMap.
Now about application of the data structure:
HashMap<String, Object[]>
where the value is:
Object[] timeseries = new Object[] {
    new int[MAX_TIME_TO_STORE_IN_SECONDS],   // elapsed seconds
    new float[MAX_TIME_TO_STORE_IN_SECONDS]  // stock values
};
// assuming that the precision of the data structure is seconds
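A small sketch of the two-parallel-array idea with the binary search mentioned above (class and method names are mine; it assumes times are appended in increasing order):

```java
import java.util.Arrays;

public class ArraySeries {
    private final int[] times;      // seconds elapsed since start of capture
    private final float[] values;   // stock value at the matching index
    private int count = 0;

    public ArraySeries(int capacity) {
        times = new int[capacity];
        values = new float[capacity];
    }

    // Times must be appended in increasing order for binary search to work.
    public void append(int time, float value) {
        times[count] = time;
        values[count] = value;
        count++;
    }

    // Value recorded at the given time, or NaN if absent; O(log n).
    public float at(int time) {
        int i = Arrays.binarySearch(times, 0, count, time);
        return i >= 0 ? values[i] : Float.NaN;
    }
}
```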
A nice open ended question
Let's assume
1) Data is coming in continually.
2) The stock price varies frequently; I hear high-frequency traders count feet of fiber optic in some co-located data centers.
3) These same stock traders are going to want things to happen fast.
4) We might assume that the frequency of trades is also low relative to the granularity of the timeline.
So we end up with a sparse timeline.
This suggests some kind of binary tree; you could speed up query time by turning it into a range tree or something similar. You could put range information on each node to reduce search time, and additionally store the minimum value on these nodes.
However, binary trees don't do so well on machines with caches. Additionally, we will be continually adding data, and search trees start requiring rebalancing when you do that. If you assume 10 to 100 fluctuations in the price per second and you want to keep years of data around, you start getting beyond what you can reasonably keep in RAM. It is probably a good time to go back to the people who wanted this program and get some more details on how it will be used.
One good option might be to use trees with really big nodes, each of which contains an implicit tree structure. You might also consider moving all the actual data into the leaves, though I have not thought this through. The memory bottleneck also brings even the notion of cached ranges and minimum values into question when you get close to the leaves. Omitting this information will let you cram more data into a big leaf, and this may offset the time needed to compute the values by looking at all the data once you have it on the processor. So each big leaf would then be just a sorted list with times or time deltas and the price and/or price deltas.
Caching and restructuring of the tree based on expected data usage might also be worth considering. You could keep some extra pointers around based on this as well, so you don't need to grab multiple disparately stored nodes off the disk to make the queries that are of most interest. While the millisecond-to-millisecond fluctuations might matter over the last hour, such granularity in the stock's value 10 years ago may not be so important, so we could perhaps reduce the granularity of older data or just dump it after a certain point. This should not be too time-consuming a task, but in the fast-paced world of high-frequency trading it might be worth using a different core to do this and other housekeeping tasks.
Hashtable<CompanyID, HashMap<TimePeriod, StockValue>>
- Anonymous November 07, 2013