w*p
Post #3
j*y
Post #4
Why is it that I've heard people say what they really care about is how much data you've actually processed? If the data isn't big, a simple program isn't hard to write.
The hard part is handling very large scale. So is there any big data set to work with here?
[Quoting w******p's post]
: http://jsmapreduce.com/
j*y
Post #5
That said, the site is actually quite nice; simple jobs can be run right on it.
[Quoting w******p's post]
: http://jsmapreduce.com/
s*r
Post #6
You could install Hadoop yourself.
But if you just want to test whether some simple Python/Perl mapper/reducer scripts work,
you don't need to install anything: on Linux you can test them through pipes.
See the Hadoop Streaming chapter of the elephant book (Hadoop: The Definitive Guide) for details.
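The pipe test described above can be sketched as a one-liner. This is a local stand-in for what Hadoop Streaming does (map, then shuffle/sort, then reduce); here tr/sed play the role of your mapper script and awk plays the reducer (assumption: word-count semantics with tab-separated key/value records):

```shell
# Simulate the Hadoop Streaming pipeline locally: mapper | sort | reducer.
printf 'hello world\nhello again\n' |
  tr ' ' '\n' |       # "mapper": emit one word per line
  sed 's/$/\t1/' |    # tag each word with a count of 1 -> "word<TAB>1"
  sort |              # simulates the shuffle/sort phase between map and reduce
  awk -F'\t' '{c[$1]+=$2} END {for (w in c) print w "\t" c[w]}' |  # "reducer"
  sort
# prints:
# again   1
# hello   2
# world   1
```

To test your own scripts the same way, replace the tr/sed stage with `./mapper.py` and the awk stage with `./reducer.py`.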
y*u
Post #7
If you want to practice the MapReduce algorithm, the Python script below can simulate it.
MapReduce.py

import json

class MapReduce:
    def __init__(self):
        self.intermediate = {}
        self.result = []

    def emit_intermediate(self, key, value):
        self.intermediate.setdefault(key, [])
        self.intermediate[key].append(value)

    def emit(self, value):
        self.result.append(value)

    def execute(self, data, mapper, reducer):
        for line in data:
            record = json.loads(line)
            mapper(record)
        for key in self.intermediate:
            reducer(key, self.intermediate[key])
        # jenc = json.JSONEncoder(encoding='latin-1')  # Python 2 only
        jenc = json.JSONEncoder()
        for item in self.result:
            print(jenc.encode(item))
wordcount.py

import MapReduce
import sys

"""
Word Count Example in the Simple Python MapReduce Framework
"""

mr = MapReduce.MapReduce()

# =============================
# Do not modify above this line

def mapper(record):
    # key: document identifier
    # value: document contents
    key = record[0]
    value = record[1]
    words = value.split()
    for w in words:
        mr.emit_intermediate(w, 1)

def reducer(key, list_of_values):
    # key: word
    # value: list of occurrence counts
    total = 0
    for v in list_of_values:
        total += v
    mr.emit((key, total))

# Do not modify below this line
# =============================

if __name__ == '__main__':
    inputdata = open(sys.argv[1])
    mr.execute(inputdata, mapper, reducer)
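To try the script above, the input file would hold one JSON array per line, each a [doc_id, contents] pair (an assumption based on how mapper() indexes the record). The same map/shuffle/reduce flow can be sketched self-contained:

```python
import json

# Hypothetical input: each line is a JSON [doc_id, contents] pair,
# matching what mapper() above expects.
lines = ['["doc1", "hello world"]', '["doc2", "hello mapreduce"]']

# Map phase: emit (word, 1) for every word, grouped by key.
intermediate = {}
for line in lines:
    doc_id, text = json.loads(line)
    for w in text.split():
        intermediate.setdefault(w, []).append(1)

# Reduce phase: sum the counts for each word.
results = [(word, sum(counts)) for word, counts in intermediate.items()]

print(sorted(results))
# [('hello', 2), ('mapreduce', 1), ('world', 1)]
```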