B*n
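Post #2
Is it because all the computation is done in memory? I watched the Databricks
demo, and every cluster there has over a thousand GB of memory.
But more memory obviously means faster computation, so isn't this idea pretty simple?
I'm new to this and would appreciate an explanation. Thanks.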
z*g
Post #8
RDDs can provide fault tolerance for in-memory intermediate results while
storing only a very small amount of data on persistent storage. This is
particularly useful for iterative algorithms, since every iteration produces
intermediate results. However, when there is not enough memory, Spark performs
exactly like Hadoop.
[s********k wrote:]
: I've never really understood this RDD thing. What exactly is so great about it?
z*g
Post #9
There is nothing new about in-memory computing. The key point is that RDDs can
achieve fault tolerance for intermediate computation results without having to
write the whole dataset back to disk.
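
To make the point above concrete, here is a toy sketch in plain Python. It is not Spark's API: `ToyRDD`, `lose_memory`, and the `computations` counter are hypothetical names invented for illustration. The assumption is that each dataset keeps only a small "recipe" (its parent plus the function applied to it), so a lost in-memory result can be rebuilt from the durable input instead of being read back from a saved copy.

```python
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not Spark's API).

    It remembers how it was derived, not only its data.
    """

    def __init__(self, source=None, parent=None, fn=None):
        self.source = source    # durable input, set only on the root dataset
        self.parent = parent    # lineage: the dataset this one was derived from
        self.fn = fn            # lineage: the function applied to each parent element
        self._cache = None      # in-memory result; may be lost at any time
        self.computations = 0   # how many times this dataset was (re)built

    def map(self, fn):
        # Record the transformation only; nothing is computed yet.
        return ToyRDD(parent=self, fn=fn)

    def collect(self):
        # Rebuild from lineage if the in-memory copy is missing.
        if self._cache is None:
            if self.parent is None:
                self._cache = list(self.source)
            else:
                self._cache = [self.fn(x) for x in self.parent.collect()]
            self.computations += 1
        return self._cache

    def lose_memory(self):
        # Simulate a failure that wipes the in-memory result.
        self._cache = None


base = ToyRDD(source=[1, 2, 3])
step1 = base.map(lambda x: x * 2)
step2 = step1.map(lambda x: x + 1)

first = step2.collect()

# Simulate losing both intermediate results; neither was ever written to disk.
step1.lose_memory()
step2.lose_memory()

second = step2.collect()  # rebuilt from the recorded lineage
print(first, second, step2.computations, base.computations)
```

In this sketch the only things that must survive a failure are the original input and the chain of functions, which is why the recovery cost is recomputation rather than a full write of every intermediate result.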