m*r
1 楼
Given a sequence of data (with duplicates), move a fix-sized window along
the data sequence and find mode in the window at each iteraion, where the
oldest data is removed and a new data is inserted to the window.
I cannot find better solutions here.
My idea: Use a hashtable, key is the data, key's data is the frequency of
the data occuring in the window.
At the first iteration, iterate each data in the window and put it to the
hashtable, meanwhile cout the frequency of each data. After that, traverse
the hashtable and return the data with the highest frequency. For each
following iteration, search the oldest data in the hashtable and reduce its
frequency by 1 if it is in the hahstable if it becoms 0 use new data to
replace the old one. Otherwise, just insert the new data into the hahstable.
Traverse the table and return the mode.
It is O(n * m) where n is data seq size and m is window size. The drawback
is : The hashtable size is not fixed, it may have resize overhead. Each
iteration, the table has to be traversed, it is not effcient.
Is it possble to do it with O(n lg m) or O(n) ?
Any help is appreciated.
thanks
the data sequence and find mode in the window at each iteraion, where the
oldest data is removed and a new data is inserted to the window.
I cannot find better solutions here.
My idea: Use a hashtable, key is the data, key's data is the frequency of
the data occuring in the window.
At the first iteration, iterate each data in the window and put it to the
hashtable, meanwhile cout the frequency of each data. After that, traverse
the hashtable and return the data with the highest frequency. For each
following iteration, search the oldest data in the hashtable and reduce its
frequency by 1 if it is in the hahstable if it becoms 0 use new data to
replace the old one. Otherwise, just insert the new data into the hahstable.
Traverse the table and return the mode.
It is O(n * m) where n is data seq size and m is window size. The drawback
is : The hashtable size is not fixed, it may have resize overhead. Each
iteration, the table has to be traversed, it is not effcient.
Is it possble to do it with O(n lg m) or O(n) ?
Any help is appreciated.
thanks