Redian新闻
>
装了CM9 alpha 2以后,touchpad就不认SD卡了
avatar
装了CM9 alpha 2以后,touchpad就不认SD卡了# PDA - 掌中宝
b*x
1
1. what have you done recently at work? and which was the most proud thing
you have done
2. implement a hashtabel in pseudocode
3. how to delete duplicate lines (both verbal and pseudocode)
4. after we enter a url in the browser and before we get the page, what
happened?
5. if we have 200 million GB of lines, but we have only 1 GB of memory, how to remove the duplicate lines? no sorting allowed.(这个题我没答上来....大家谁可以说说怎么答)
明天或者周一就知道结果。
avatar
m*g
3
有没有人跟我一样的遭遇?
一直显示“preparing SD card”
打开setting的storage,就会死掉,然后重启。重启以后就没有下面的bar了。
然后怎么进入以后都看不到SD里的东西。但是在Mod里能看见。
avatar
d*e
4
4. 怎么回答?
5. 是用外部排序吗?

how

【在 b**********x 的大作中提到】
: 1. what have you done recently at work? and which was the most proud thing
: you have done
: 2. implement a hashtabel in pseudocode
: 3. how to delete duplicate lines (both verbal and pseudocode)
: 4. after we enter a url in the browser and before we get the page, what
: happened?
: 5. if we have 200 million GB of lines, but we have only 1 GB of memory, how to remove the duplicate lines? no sorting allowed.(这个题我没答上来....大家谁可以说说怎么答)
: 明天或者周一就知道结果。

avatar
b*x
6
5. 忘了说了,不让排序

【在 d**e 的大作中提到】
: 4. 怎么回答?
: 5. 是用外部排序吗?
:
: how

avatar
c*7
7
5.最直接的一条跟剩余的比,时间n+(n-1)+...1=O(n^2).不知道能否用hash。
avatar
d*e
8
刚才google了一下,好像只看到三个方法
1. sort
2. hash
3. 就是一条条比较
不知还有什么好的方法我漏了。

【在 c*********7 的大作中提到】
: 5.最直接的一条跟剩余的比,时间n+(n-1)+...1=O(n^2).不知道能否用hash。
avatar
P*b
9
hash的空间也不够吧。

【在 d**e 的大作中提到】
: 刚才google了一下,好像只看到三个方法
: 1. sort
: 2. hash
: 3. 就是一条条比较
: 不知还有什么好的方法我漏了。

avatar
d*e
10
对这道题来说是不够。
我说的是类似的题,我只搜索到这三种方法。
所以空间不够的情况,是否就只能用上面同学提到的一行行比较,O(n^2)

【在 P*******b 的大作中提到】
: hash的空间也不够吧。
avatar
h*6
11
可以试试多次hash,先把hash结果存在硬盘上,然后定义与某记录hash1值相同的输入
为集合S1,定义与该记录hash2值相同的输入为集合S2,......
然后找出S1、S2……的交集,如果不为空,再一条一条与该记录进行比较。
avatar
b*n
12
我的理解是考mapreduce之类的东东.
可以将hash values按range分成1GB的chunks.多次load/save I/O operations.

【在 h**6 的大作中提到】
: 可以试试多次hash,先把hash结果存在硬盘上,然后定义与某记录hash1值相同的输入
: 为集合S1,定义与该记录hash2值相同的输入为集合S2,......
: 然后找出S1、S2……的交集,如果不为空,再一条一条与该记录进行比较。

avatar
A*H
13
agree, typical map/reduce work

【在 b*****n 的大作中提到】
: 我的理解是考mapreduce之类的东东.
: 可以将hash values按range分成1GB的chunks.多次load/save I/O operations.

avatar
K*g
14
第5题,han6的方法不错。我觉得也可以用bloom filter来做。

how to remove the duplicate lines? no sorting allowed.(这个题我没答上来....
大家谁可以说说怎么答)

【在 b**********x 的大作中提到】
: 1. what have you done recently at work? and which was the most proud thing
: you have done
: 2. implement a hashtabel in pseudocode
: 3. how to delete duplicate lines (both verbal and pseudocode)
: 4. after we enter a url in the browser and before we get the page, what
: happened?
: 5. if we have 200 million GB of lines, but we have only 1 GB of memory, how to remove the duplicate lines? no sorting allowed.(这个题我没答上来....大家谁可以说说怎么答)
: 明天或者周一就知道结果。

avatar
s*5
15

how to remove the duplicate lines? no sorting allowed.(这个题我没答上来....
大家谁可以说说怎么答)
"in pseudocode" means you will write code on google docs during the
interview?
Thanks!

【在 b**********x 的大作中提到】
: 1. what have you done recently at work? and which was the most proud thing
: you have done
: 2. implement a hashtabel in pseudocode
: 3. how to delete duplicate lines (both verbal and pseudocode)
: 4. after we enter a url in the browser and before we get the page, what
: happened?
: 5. if we have 200 million GB of lines, but we have only 1 GB of memory, how to remove the duplicate lines? no sorting allowed.(这个题我没答上来....大家谁可以说说怎么答)
: 明天或者周一就知道结果。

avatar
K*g
16
请问谁能给个实现hashtable的pseudocode啊?
另外第三题是怎么做呢,就是把每条line放到hash table里去吗?或者还有其他的做法?

how to remove the duplicate lines? no sorting allowed.(这个题我没答上来....
大家谁可以说说怎么答)

【在 b**********x 的大作中提到】
: 1. what have you done recently at work? and which was the most proud thing
: you have done
: 2. implement a hashtabel in pseudocode
: 3. how to delete duplicate lines (both verbal and pseudocode)
: 4. after we enter a url in the browser and before we get the page, what
: happened?
: 5. if we have 200 million GB of lines, but we have only 1 GB of memory, how to remove the duplicate lines? no sorting allowed.(这个题我没答上来....大家谁可以说说怎么答)
: 明天或者周一就知道结果。

avatar
d*e
17
CLRS里面hashtable一章的pseudo code应该可以吧?

法?
..

【在 K******g 的大作中提到】
: 请问谁能给个实现hashtable的pseudocode啊?
: 另外第三题是怎么做呢,就是把每条line放到hash table里去吗?或者还有其他的做法?
:
: how to remove the duplicate lines? no sorting allowed.(这个题我没答上来....
: 大家谁可以说说怎么答)

avatar
s*n
19
200M还是200kTB?

how to remove the duplicate lines? no sorting allowed.(这个题我没答上来....
大家谁可以说说怎么答)

【在 b**********x 的大作中提到】
: 1. what have you done recently at work? and which was the most proud thing
: you have done
: 2. implement a hashtabel in pseudocode
: 3. how to delete duplicate lines (both verbal and pseudocode)
: 4. after we enter a url in the browser and before we get the page, what
: happened?
: 5. if we have 200 million GB of lines, but we have only 1 GB of memory, how to remove the duplicate lines? no sorting allowed.(这个题我没答上来....大家谁可以说说怎么答)
: 明天或者周一就知道结果。

avatar
s*5
20

法?
..
每条line放到hash table
====
不用每条吧,只是要loop through every line. 每次check 这个line在不在那个
Hashtable里面。
另外就是用linux里面的grep或者awk搞定。。。

【在 K******g 的大作中提到】
: 请问谁能给个实现hashtable的pseudocode啊?
: 另外第三题是怎么做呢,就是把每条line放到hash table里去吗?或者还有其他的做法?
:
: how to remove the duplicate lines? no sorting allowed.(这个题我没答上来....
: 大家谁可以说说怎么答)

avatar
c*2
21
how about doing this for question#5:
design a function to hash each line to an integer between 0 to 1G
create a bit map of 1G
read a line and hash it to an idx, set bit idx to 1
read next line , hash it, if that bit is 1, discard
...
avatar
m*l
22
define duplicated lines.
identical at bitwise?
I think it's similar to how anti-virus find virus signatures?

thing
how to remove the duplicate lines? no sorting allowed.(这个题我没答上来....大
家谁可以说说怎么答)

【在 b**********x 的大作中提到】
: 1. what have you done recently at work? and which was the most proud thing
: you have done
: 2. implement a hashtabel in pseudocode
: 3. how to delete duplicate lines (both verbal and pseudocode)
: 4. after we enter a url in the browser and before we get the page, what
: happened?
: 5. if we have 200 million GB of lines, but we have only 1 GB of memory, how to remove the duplicate lines? no sorting allowed.(这个题我没答上来....大家谁可以说说怎么答)
: 明天或者周一就知道结果。

avatar
m*l
23
since you are allowed to use space, why not build a tree as you go?
e.g. build a tree by first character, then its child is the second
character, etc. the leave is where you keep track of # of duplicate
counts and locations.
that way, you don't even need 1 MB of memory and run thru the source
only one. (of course, you need a lot of space, but way less than 200
million GB. )
相关阅读
logo
联系我们隐私协议©2024 redian.news
Redian新闻
Redian.news刊载任何文章,不代表同意其说法或描述,仅为提供更多信息,也不构成任何建议。文章信息的合法性及真实性由其作者负责,与Redian.news及其运营公司无关。欢迎投稿,如发现稿件侵权,或作者不愿在本网发表文章,请版权拥有者通知本网处理。