B*n
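Post #2
Is it because all the computation happens in memory? I watched the Databricks demo, and every cluster
there has over a thousand GB of memory.
But with that much memory the computation is obviously going to be fast. Isn't that a pretty simple idea?
I'm new to this and would appreciate an explanation. Thanks.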
z*g
Post #8
RDDs provide fault tolerance for in-memory intermediate results while
storing only a very small amount of data on persistent storage. This is
particularly useful for iterative algorithms, since every iteration produces
intermediate results. That said, when there is not enough memory, Spark
performs much like Hadoop.
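To make the point about iterative algorithms concrete, here is a minimal pure-Python sketch, not Spark code. `CountingSource`, `update`, and the two `iterate_*` functions are hypothetical names made up for this illustration. It assumes the baseline re-reads its input from storage on every iteration, roughly the way a chain of independent MapReduce-style jobs would, and compares that with reading once and keeping the data in memory.

```python
class CountingSource:
    """Stand-in for a dataset on persistent storage; counts full reads."""

    def __init__(self, data):
        self._data = list(data)
        self.reads = 0

    def read(self):
        # Every call models one full scan of the stored dataset.
        self.reads += 1
        return list(self._data)


def update(estimate, points):
    """One iteration step: move the estimate halfway toward the mean."""
    mean = sum(points) / len(points)
    return estimate + 0.5 * (mean - estimate)


def iterate_rereading(source, steps):
    # Assumed baseline: each iteration loads its input from storage again.
    estimate = 0.0
    for _ in range(steps):
        points = source.read()
        estimate = update(estimate, points)
    return estimate


def iterate_in_memory(source, steps):
    # Read once, then reuse the in-memory copy in every iteration.
    points = source.read()
    estimate = 0.0
    for _ in range(steps):
        estimate = update(estimate, points)
    return estimate


disk_a = CountingSource([1.0, 2.0, 3.0, 4.0])
disk_b = CountingSource([1.0, 2.0, 3.0, 4.0])
result_a = iterate_rereading(disk_a, steps=10)
result_b = iterate_in_memory(disk_b, steps=10)
```

Both loops compute the same estimate, but the first one scans the stored data once per iteration and the second scans it once in total. How much that matters in practice depends on the workload and on whether the data actually fits in memory.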
[Quoting s********k's post]
: I've never really understood RDDs. What exactly is so great about them?
z*g
Post #9
There is nothing new about in-memory computing. The key point is that RDDs can
achieve fault tolerance for intermediate computation results without having to
write the whole data set back to disk.
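One way to picture this: as I understand it, an RDD records how it was derived from its parents (its lineage), so a lost in-memory result can be recomputed from durable input instead of being restored from a full copy on disk. The following is a toy sketch in plain Python, not the Spark API. `ToyRDD` and its methods `map`, `collect`, and `lose_memory` are hypothetical and exist only for this illustration.

```python
class ToyRDD:
    """Toy stand-in for an RDD: remembers how it was derived, not only its data."""

    def __init__(self, parent=None, fn=None, source=None):
        self.parent = parent  # lineage: the dataset this one was derived from
        self.fn = fn          # lineage: the per-element transformation applied
        self.source = source  # durable input; only set on the root dataset
        self.cache = None     # in-memory result; may be lost at any time

    def map(self, fn):
        # Record the transformation only; nothing is computed yet.
        return ToyRDD(parent=self, fn=fn)

    def collect(self):
        # Recompute from lineage whenever the in-memory copy is missing.
        if self.cache is None:
            if self.parent is None:
                self.cache = list(self.source)
            else:
                self.cache = [self.fn(x) for x in self.parent.collect()]
        return self.cache

    def lose_memory(self):
        """Simulate a failure by dropping the in-memory copy."""
        self.cache = None


base = ToyRDD(source=[1, 2, 3, 4])
squared = base.map(lambda x: x * x)
shifted = squared.map(lambda x: x + 1)

first = shifted.collect()

# Drop both intermediate results, then ask for the final result again.
shifted.lose_memory()
squared.lose_memory()
recovered = shifted.collect()
```

The only thing that has to survive the simulated failure is the original input plus the two small transformation records. Neither intermediate list was ever written out in full.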