avatar
F*r
1
版上的题,但是考古找不到讨论了。。。大家看看怎么做好?只能想到brute force的
。。。。
Given a document and a query of K words, how do u find the smallest window
that covers all the words at least once in that document? (given you know
the inverted lists of all K words, that is, for each word, you have a list
of all its occurrrences).
avatar
g*y
2
inverted list:
A: 3, 5, 99, 21
B: 4, 23
C: 9, 20, 35
D: 1, 6, 24
Merge sort all list to produce to below sorted array:
1D 3A 4B 5A 6D 9C 20C 21A 23B 24D 35C 99A
the problem is find the smallest sub array which contains all four letters (
A, B, C, D)
It can be solved by O(n) where n is the length of the array by idea of
dynamiac progeamming and also by looking up the original inverted list:
Starting from index 0, end index 3 (we know the min lenght of the subarry is
4),
scan the four items, bookkeeping the occruences:
for eaxmple: when start index is 2, window now is 4B 5A 6D, 9C, next step is
start index = start index + 1= 3, and find next available B which is 23,
and length is 23-5, bookkeep the 5A 6D, 9C, 20C, 21A, 23B, and then slide
windows to right (by start index = start index + 1)

【在 F**********r 的大作中提到】
: 版上的题,但是考古找不到讨论了。。。大家看看怎么做好?只能想到brute force的
: 。。。。
: Given a document and a query of K words, how do u find the smallest window
: that covers all the words at least once in that document? (given you know
: the inverted lists of all K words, that is, for each word, you have a list
: of all its occurrrences).

avatar
F*r
3
不是很明白startindex从0到2的过程要做些什么?这个slide window要怎么用呢?谢谢!

(

【在 g***y 的大作中提到】
: inverted list:
: A: 3, 5, 99, 21
: B: 4, 23
: C: 9, 20, 35
: D: 1, 6, 24
: Merge sort all list to produce to below sorted array:
: 1D 3A 4B 5A 6D 9C 20C 21A 23B 24D 35C 99A
: the problem is find the smallest sub array which contains all four letters (
: A, B, C, D)
: It can be solved by O(n) where n is the length of the array by idea of

avatar
g*y
4
每次start index +1,然后根据前一步的bookkeeping觉得end index move到哪儿
然后继续bookkeeping,记录上新的start index, end index直接的A, B, C, D的个数以
供下一步用。
key observation是 :这个sliding windows肯定是那个sorted arry的sub array

谢!

【在 F**********r 的大作中提到】
: 不是很明白startindex从0到2的过程要做些什么?这个slide window要怎么用呢?谢谢!
:
: (

avatar
F*r
5
能给个pesudo code或是个链接么? 每一步收集的sliding window对下一步有什么用呢?

【在 g***y 的大作中提到】
: 每次start index +1,然后根据前一步的bookkeeping觉得end index move到哪儿
: 然后继续bookkeeping,记录上新的start index, end index直接的A, B, C, D的个数以
: 供下一步用。
: key observation是 :这个sliding windows肯定是那个sorted arry的sub array
:
: 谢!

avatar
g*r
6
An ugly pesudo code
MinSlidingWindow(document){
number_of_words=0;
count_word={};
for(word in document){
if(word not in count_word){
count_word[word]=1;
number_of_words++;
}
}
best_interval=[1,n];
min_index=max_index=0;
count_word={};
current_words=1;
for(word in document){
if(current_words==number_of_words){
//all words are in count_word, find one candinate
if(better([min_index,max_index],best_interval){
best_interval = [min_index, max_index];
}
}
if(count_word[word]==0){
//the word never appear, add it to count_word
current_words++;
count_word[word]++;
}else{
//the word appeared before, try to delete it from front
while(count_word[document[min_index]]>1){
count_word[document[min_index]]--;
min_index++;
}
}
max_index++;
}
}
avatar
F*r
7
这两天光顾着关注地震形势了。。。写得很明白,多谢多谢啊!

【在 g**********r 的大作中提到】
: An ugly pesudo code
: MinSlidingWindow(document){
: number_of_words=0;
: count_word={};
: for(word in document){
: if(word not in count_word){
: count_word[word]=1;
: number_of_words++;
: }
: }

相关阅读
logo
联系我们隐私协议©2024 redian.news
Redian新闻
Redian.news刊载任何文章,不代表同意其说法或描述,仅为提供更多信息,也不构成任何建议。文章信息的合法性及真实性由其作者负责,与Redian.news及其运营公司无关。欢迎投稿,如发现稿件侵权,或作者不愿在本网发表文章,请版权拥有者通知本网处理。