Mars Saturn aspect in natal chart, transit and synastry
# astrology - 星座物语
f*r
1
For example, suppose you have ~1 million text files, each containing ~1 million positive integers, one integer per line. How would you find the top 3 integers that appear most frequently?
Discuss how many machines you would use, which algorithms you would run on them, how they communicate with each other, the running time and space complexity of your approach, etc.
If you're interested, take a look and see how you would solve it :)
A*y
2
i*g
3
Use HDFS for storage. Per the report from Yahoo, they run 10,000 Linux PCs, each with terabytes of disk, so this is a minor case :) Anyway, use as many hard disks as you can to improve I/O performance and keep the CPUs busy; there could be a trade-off.
Use MapReduce to calculate the frequencies: the map step just hashes each integer as the key (there is nothing else to do), and the reduce step takes the intermediate map results for each integer and simply accumulates them into a frequency.
It is also worth practicing on different OSes and CPU architectures (x86 is generally fine).
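
The map/reduce counting described above can be sketched in a single process. This is only an illustration under assumptions, not the Hadoop API: the function names (`map_file`, `shuffle`, `reduce_bucket`, `top_k`), the `num_reducers` parameter, and the tie-breaking rule (higher count first, then smaller integer) are all hypothetical. Each "file" is counted locally (map plus combine), partial counts are routed to a reducer by key, each reducer sums its keys and keeps its own top 3, and the final answer is picked from those candidates.

```python
import heapq
from collections import Counter
from typing import Iterable, List, Tuple


def map_file(lines: Iterable[str]) -> Counter:
    """Map + combine: count each integer within one file."""
    return Counter(int(line) for line in lines if line.strip())


def shuffle(partials: Iterable[Counter], num_reducers: int) -> List[List[Tuple[int, int]]]:
    """Route every (integer, partial_count) pair to the reducer that owns that integer."""
    buckets: List[List[Tuple[int, int]]] = [[] for _ in range(num_reducers)]
    for partial in partials:
        for value, count in partial.items():
            # Assumed partitioner: integer modulo the number of reducers.
            buckets[value % num_reducers].append((value, count))
    return buckets


def _rank(item: Tuple[int, int]) -> Tuple[int, int]:
    """Sort key: higher count wins; ties go to the smaller integer (an assumption)."""
    value, count = item
    return (count, -value)


def reduce_bucket(pairs: Iterable[Tuple[int, int]], k: int = 3) -> List[Tuple[int, int]]:
    """Reduce: accumulate partial counts per integer, then keep this reducer's top k."""
    totals: Counter = Counter()
    for value, count in pairs:
        totals[value] += count
    return heapq.nlargest(k, totals.items(), key=_rank)


def top_k(files: Iterable[Iterable[str]], k: int = 3, num_reducers: int = 4) -> List[Tuple[int, int]]:
    """Run map, shuffle, and reduce, then merge the per-reducer candidates."""
    partials = [map_file(f) for f in files]
    buckets = shuffle(partials, num_reducers)
    candidates = [item for bucket in buckets for item in reduce_bucket(bucket, k)]
    return heapq.nlargest(k, candidates, key=_rank)


if __name__ == "__main__":
    # Three tiny in-memory "files", one integer per line.
    demo = [["1", "2", "2", "3"], ["2", "3", "3", "7"], ["3", "7", "7", "7", "5"]]
    print(top_k(demo))
```

Because each integer is owned by exactly one reducer, its total count is complete there, so the global top 3 must be among the per-reducer top 3 lists and the final merge only has to look at 3 candidates per reducer.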
A*y
4
Done!
f*r
5
Thank you very much for such an in-depth reply :)

[Quoting i*******g's post:]
: Use HDFS for storage. Per the report from Yahoo, they run 10,000 Linux PCs, each with terabytes of disk, so this is a minor case :) Anyway, use as many hard disks as you can to improve I/O performance and keep the CPUs busy; there could be a trade-off.
: Use MapReduce to calculate the frequencies: the map step just hashes each integer as the key (there is nothing else to do), and the reduce step takes the intermediate map results for each integer and simply accumulates them into a frequency.
: It is also worth practicing on different OSes and CPU architectures (x86 is generally fine).
