Google Go is pretty good - MITBBS (未名空间) historical archive

Board: Linux - Linux operating systems

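S*A 2011-01-03 08:01

#4

I've recently been learning Google Go and I quite like it. My bold prediction is that it will become popular.
Overall, it fills the gap between C and Python.
Like C, it compiles straight to machine code, which is great and, to my mind, puts it ahead of Java, C# and the like. Once it matures I expect it to run about as fast as C++. I have long wanted a C-like language in which dictionaries and arrays are directly usable abstract data types. The closest thing I had learned before was Objective-C, which is actually quite good, just a bit verbose. Its biggest problem is that it has nowhere to run outside OS X.
Python, Lua, Java and the like are all too slow: everything is a boxed type, so they cannot be made fast.
The C++ template machinery is too complicated and generates a lot of unnecessary code (one copy per basic type), and the OO multiple-inheritance side has gone off the rails.
Lately I have been playing with a graph of 8M nodes and 200M edges. Even the most heavily optimized Python program takes several minutes to load it. I was close to hand-writing a C module for Python just to load the data.
One annoying thing about Python is that a slice of an array, array[a:b], is a copy. That is fine for ordinary programs, but at very large scale the copy means an mmap buffer cannot be used in place, which wastes a lot of memory.
In C, the hassle is that even a simple dictionary has to be built by hand.
So my first Go program was loading this large data set. It turned out that using mmap needs a lot of unsafe pointer conversions; some of the code exists only to get around the type system and use the contents of the mmap buffer directly. Overall it works, though, and does what I wanted to do in C. Apart from the fiddly mmap buffer handling, the rest feels very good.
The map type is very convenient, and the language feels much cleaner and simpler than C++. It is not as concise as Python; I would put it about 80% of the way between Python and C, leaning toward C.
It should be well suited to writing web servers and the like.
Compared with C, it adds memory management,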
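
The copy-on-slice behaviour described in this post can be checked with a short Python 3 sketch. It is an illustration only, not the poster's code: the ten-element array is a made-up stand-in for a large buffer, and memoryview is used here as one way to obtain a slice that shares memory instead of copying.

```python
import array

# A small stand-in for a large buffer of native ints.
data = array.array('i', range(10))

# Slicing an array.array copies the selected items into a new array.
copied = data[2:5]
copied[0] = 99            # changes only the copy

# A memoryview slice refers to the same underlying memory.
view = memoryview(data)[2:5]
view[0] = -1              # writes through to data[2]

print(data[2], copied[0], view.tolist())
```

The array slice keeps its own storage, while the memoryview slice stays tied to the original buffer, which is the property needed to use an mmap buffer in place.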

b*s 2011-01-03 08:01

#8

thanks for sharing

F*i 2011-01-03 08:01

#11

S*A:
Have you tried a numpy array for the slicing? It is supposed to return a *view* instead of a copy.
BTW, do you have any tricks for bypassing Python's GIL?
Thanks,

[In reply to S*A]

f*Q 2011-01-03 08:01

#13

If you don't need a GUI, Objective-C is available on Linux.

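S*A 2011-01-03 08:01

#15

You are right, a numpy slice is indeed a view. I tried that as well, and it does not work for my application. My problem is that the data set is very large, so even mmap has to be used with great care.
The trouble is that a numpy slice object costs far more memory than a Python array.array. I estimate a numpy slice at around 200 bytes, so once the number of slices grows the total memory is no smaller.

    import numpy
    fp = numpy.memmap("big-file-over-1G-byte", dtype='uint32')
    # Pull every 50th value out as a scalar.
    x = [ fp[i] for i in xrange(0, len(fp), 50)]
    # Everything ends up in memory; Python should take 1.x GB here.
    # Same positions, but kept as one-element slices (views).
    x = [ fp[i:i+1] for i in xrange(0, len(fp), 50)]
    # Python takes over 2 GB here.

I'm not sure what you mean by bypassing.
I do write C modules for Python. It looks like a lot of code, but much of it is repetitive: once you have written one you can copy/paste most of it, and very little really has to change. Then you load the module and hand the data to C.
My latest strategy is to store intermediate results in a file in a binary format and read them into Go for processing. For things like NxN loops Go is basically close to C and many times faster than Python. Of course writing Go is much slower than writing Python, so I only use it for the heavy jobs.
Python is still great for exploratory work.

[In reply to F*******i]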

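S*A 2011-01-03 08:01

#17

I tried it; it is basically unusable, and hardly anyone maintains it.
The problem is ObjC's Foundation: without the NS* classes, ObjC can't do anything.
The Foundation clones on Linux that imitate Apple's are all poor. I tried several versions, and even the simplest dictionary program failed to link.
Apple's Foundation is open source in theory, but only in an ancient version, and it does not compile on Linux without many changes.
I eventually gave up completely.
Google Go is better than Objective-C and quicker to write. The one place it falls short is that Objective-C can call C code directly, while Go needs bindings.
In memory management, conciseness, and readability Go is better. Go also borrowed ObjC's interface idea wholesale. ObjC memory management is tedious, and adding GC has not turned that around.

[In reply to f*****Q]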

bz 2011-01-03 08:01

#19

Be careful with basically everything from Google: not only are there many bugs, the terms of service change at will. Fine to play with, but be cautious for serious work.

[In reply to S*A]

r*n 2011-01-03 08:01

#21

Thanks for sharing.

[In reply to S*A]

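S*A 2011-01-03 08:01

#23

I'm not at Google, but I can say you can be completely at ease about this one. It is not like using an online service: once something is open source it cannot be taken back. GCC has already accepted Go, so upcoming GCC releases will include it, which means Go can survive on its own even without Google.
As with all open source, the key is having a community. I looked at the go-nuts mailing list; the community seems decent and bugs get fixed quickly, certainly much faster than in gcc. The key point is that a few veterans from the old AT&T labs are in there making sensible trade-offs. At this point I think it is safe.
The gap between C and Python was always going to be filled by something. Good things get used. Go is pleasant to use and fast, and I find it rather addictive.
I believe that, looking back, Go will turn out to be quite successful.

[In reply to bz]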

r*n 2011-01-03 08:01

#24

Has the OP looked at D? How do you think it compares with Go? Their goals are basically the same: to strike a balance between C and Python.
Compared with D, Go's syntax does not look concise to me. I haven't studied either seriously, so this is only an impression.

[In reply to S*A]

f*Q 2011-01-03 08:01

#25

I have written a little ObjC on Linux. With GNUstep installed properly it is OK. The biggest problem is path configuration; I suspect your link failure was a path issue too, in which case I can't be of much help.
The Foundation pieces you mention are all in the GNUstep Base package. I just checked: NSObject.m in GNUstep Base is dated June 2010, so it can hardly be called ancient.

[In reply to S*A]

S*A 2011-01-03 08:01

#26

It was not a path problem: some mysterious memory-related function stub symbols were missing. Some versions did link, but segfaulted as soon as they ran. Perhaps my gcc version at the time did not match the ObjC runtime. I remember sorting out the paths, so that should not be it. But it is a fact that the community is small.
I was talking about the CF-lite that Apple released, not GNUstep. I never got GNUstep working; it may have been some silly mistake. A long, long time ago I modified ObjC code on Linux and it worked out of the box. Later, around the time I was playing with the iPhone, I tried ObjC on Linux again and never got it to work.
Anyway, it is fine inside Apple's world and unpleasant outside it. Which gcc version are you using?

[In reply to f*****Q]

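S*A 2011-01-03 08:01

#27

I had not followed D. I just took a look, and my five-minute impression is that D is basically C++ with syntactic sugar. With C++-style templates, it basically has no chance. One big difference is that D does not look memory safe: you can play with pointers and get a segfault.
In Go, as long as you stay inside the type system you are basically safe. You have to use the unsafe package explicitly to manipulate raw pointers.
The only built-in type that comes close to a pointer is the slice, and slices are bounds-checked, so they cannot run out of range. That is nice for lazy people: there are few chances to make mistakes.
It also means Go is not suited to writing things like kernels.
I feel D lacks the clear vision Go has. They are on completely different levels.

[In reply to r*******n]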

F*i 2011-01-03 08:01

#28

Thanks! I didn't realize numpy's view/slice has such overhead :)
The GIL I mean is
http://wiki.python.org/moin/GlobalInterpreterLock
which looks like a pain in the ass to me.
Have you tried cython/ctypes?
You should be able to write very fast loops with cython/ctypes.

[In reply to S*A]

S*A 2011-01-03 08:01

#29

Oh, my program is single-threaded, so I don't care much about that. Besides, I never expected Python to run fast; being quick to write matters most.
I don't like ctypes. cpython seems to be just regular Python, unless you mean something else. If I am going to use C, I would rather write a C module directly, with one less layer in between. Compiling and linking it into Python is easy too: no dynamic library loading to do yourself, just import it. Best of all, you can reload() it.

[In reply to F*******i]

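S*A 2011-01-03 08:01

#30

I just discovered that Go even allows Chinese variable names:

    package main

    import (
        "fmt"
    )

    func main() {
        // 数字 ("number") is an ordinary Go identifier written in Chinese.
        数字 := 1
        fmt.Printf("变态阿: %d\n", 数字)
    }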

F*i 2011-01-03 08:01

#31

That's true :)
I meant cython, not cpython. It can be used for fast prototyping while maintaining OK speed.

[In reply to S*A]

N*n 2011-01-03 08:01

#32

The programming-language version of Google Wave.

[In reply to S*A]

N*w 2011-01-03 08:01

#33

I find Java-style syntax simple and pleasant to use.
Compiling Java to native code is surely not hard.
Too bad... everyone insists on bytecode plus JIT.

[In reply to S*A]

S*A 2011-01-03 08:01

#34

Oops, my mistake, I read it as cpython. Thanks for sharing; bookmarked.
It looks decent and should be useful in some places.
It is writing C through a layer of Python, like scratching an itch through your boot, haha.

[In reply to F*******i]

r*t 2011-01-03 08:01

#35

cython != cpython

[In reply to S*A]

F*i 2011-01-03 08:01

#36

hehe.

[In reply to S*A]

S*A 2011-01-03 08:01

#37

I think it is different. No earlier language like Go has been implemented to this level.
I have been looking all along; ObjC is probably the closest.
I firmly believe Go will become a classic language.

[In reply to N********n]

S*A 2011-01-03 08:01

#38

The VM and bytecode are not the problem.
LLVM has its own bytecode too, and that works well.
The key issue is that the Java VM's bytecode and types are hard to map directly onto today's popular machines; its bytecode is a bit too high-level.

[In reply to N****w]

r*t 2011-01-03 08:01

#39

    from itertools import islice

This is the most basic tool.
Or use memoryview if you are on 2.7 or later; at the very least use buffer, which I think works directly with mmap.
Your problem has been other people's problem before, and Python is better than you think.

[In reply to S*A]

N*w 2011-01-03 08:01

#40

That is why compiling straight to native code would be best.

[In reply to S*A]

S*A 2011-01-03 08:01

#41

Wow, I have clearly been out of the loop for too long; I did not know this existed.
But islice cannot solve my memory problem. My graph has 200M edges, and each edge needs three Python objects: (a, b) is one tuple plus two ints. On top of that I need a mapping over the N nodes that points to these (a, b) entries.
With every Python object boxed, that is over 4 GB, which my machine cannot handle.

[In reply to r****t]
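
The per-edge cost argument in this post can be made concrete with a small Python 3 sketch. It is an estimate under stated assumptions, not a measurement from the thread: sys.getsizeof reports shallow sizes that differ between Python versions and builds, the node ids are made-up values, and allocator overhead and the node mapping are ignored.

```python
import sys
from array import array

def boxed_edge_bytes(a, b):
    """Approximate bytes for one edge kept as a tuple of two int objects."""
    edge = (a, b)
    return sys.getsizeof(edge) + sys.getsizeof(a) + sys.getsizeof(b)

def packed_edge_bytes():
    """Bytes for one edge kept as two unsigned ints inside an array.array."""
    return 2 * array('I').itemsize

EDGES = 200 * 10**6   # edge count quoted in the thread

# Hypothetical node ids, chosen large enough to avoid CPython's small-int cache.
boxed = boxed_edge_bytes(1000001, 2000002)
packed = packed_edge_bytes()

print("boxed : %d bytes/edge, about %.1f GB in total" % (boxed, boxed * EDGES / 1e9))
print("packed: %d bytes/edge, about %.1f GB in total" % (packed, packed * EDGES / 1e9))
```

Whatever the exact figures on a given build, the boxed representation costs several times more than the packed one, which is the gap the poster is fighting.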

S*A 2011-01-03 08:01

#42

Nope, array.array doesn't accept mmap buffer views.
If you know how to do that, please show me. I do want to learn.
I want to write something like this:

    import array, mmap
    f = open('some_1G_file', 'rb')
    bigbuffer = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    node = {}
    # edges yields (node id, byte offset, byte length) triples
    for n, offset, length in edges:
        node[n] = array.array('i', bigbuffer[offset:offset+length])

But here array cannot take a memory view of bigbuffer directly; it makes a copy. Nothing else I tried works either. If you know a way, please tell me.

[In reply to r****t]
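
The copy described in this post can be observed directly. The following Python 3 sketch uses a small temporary file as a hypothetical stand-in for the 1 GB file; it shows that an array.array built from an mmap slice no longer follows changes made through the mapping.

```python
import array
import mmap
import os
import tempfile

size = array.array('i').itemsize   # bytes per native int on this platform

# Hypothetical stand-in for the big file: eight native ints, 0..7.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(array.array('i', range(8)).tobytes())

with open(path, 'r+b') as f:
    bigbuffer = mmap.mmap(f.fileno(), 0)
    # Slicing an mmap yields a bytes object (one copy);
    # array.array then copies those bytes into its own storage.
    ints = array.array('i', bigbuffer[2 * size:4 * size])
    # Overwrite item 2 through the mapping; the array is unaffected.
    bigbuffer[2 * size:3 * size] = array.array('i', [77]).tobytes()
    changed = array.array('i', bigbuffer[2 * size:3 * size])[0]
    bigbuffer.close()
os.remove(path)

print(ints.tolist(), changed)
```

Because the array holds its own storage, every per-node array built this way duplicates the part of the file it covers.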

r*t 2011-01-03 08:01

#43

Why do all the edges have to be in memory at the same time? Is that really necessary?
At worst, a memoryview uses about as much memory as C would.
You cannot expect to solve big problems entirely in memory. Your problem most likely calls for matrix operations; move to h5py/pytables sooner rather than later.

[In reply to S*A]

S*A 2011-01-03 08:01

#44

200M edges, 8M nodes: do the arithmetic yourself.
Fitting that into a 4 GB machine is not easy.
I have already tried numpy; you did not read back through the thread. numpy takes too much memory.
I don't need matrix operations, just some graph computations.

[In reply to r****t]

N*w 2011-01-03 08:01

#45

Use a 64-bit system and you are no longer bound by the 4 GB limit.
Make the swap bigger...
It will certainly be slow, though.

[In reply to S*A]

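S*A 2011-01-03 08:01

#46

I am already on a 64-bit system, with 4 GB of memory.
You underestimate me. Swap, really? I have a way to load it all into Python; loading is just a bit slow. No swap needed.
Never mind, it seems you have no feel for computation at this scale. I'll leave it there.

[In reply to N****w]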

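r*t 2011-01-03 08:01

#47

The edges alone are already 4 GB? Does the algorithm need to keep every edge in memory?
If so, there is nothing to be done.
I can only suggest lazy evaluation for all node values: go to the mmap and read only when a value is needed. Even the smallest view cannot cope with 200M edges.
This needs a small class; if your graph is not very sparse it should be fine.
You can only trade CPU for memory. This kind of problem should have been distributed and written with MPI long ago, with each node loading a part of the graph. Loading everything into memory will not be much better even in C. If h5py is out, move to a database soon.

[In reply to S*A]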

r*n 2011-01-03 08:01

#48

Thanks for the reply.
This is the best thread I have seen on this board lately.

[In reply to S*A]

S*A 2011-01-03 08:01

#49

No, I did not explain clearly. The binary file describing the edges is 1.5 GB. There are about 200M edges and 8M nodes. Each edge has two nodes, so 200M x 8 bytes comes to roughly 1.5 GB.
Between a scheme that keeps everything in memory and the ones you suggest (swap, MPI, a database), which would you pick?
There is a way. Fine, I will tell you my secret. The simplest observation is that the fewer Python objects, the better. Each PyObject starts at roughly 50 bytes. So how do you represent that many items? The best choice is array.array: you can pack many 32-bit integers into it and pay for only one PyObject.
With 8M nodes and one array object per node, that is 8M x 50 = 400 MB, still far smaller than the 200M x 50 x 3 = 30 GB needed with three Python objects per edge.
This makes it fit, but there is a drawback: Python uses about 3.x GB, which pushes the 1.5 GB file out of the file cache. As a result every startup has to re-read the big file from disk. That is what I want to optimize.
Your suggestions are worse than my approach of re-reading 1 GB every time. If you tried a project like this yourself you would see how much time they would waste.
Having each node load part of the graph is hard to coordinate. The key point is that the algorithm cannot know in advance which parts of the graph will be needed.
I actually did write a Go version that uses mmap to store the data. The algorithm itself does not need much memory, so my 1.5 GB file cache does not get evicted. Each startup takes only 5 seconds to load the whole graph, without touching the disk. Memory use is mainly the size of the binary file, that is, the file cache.
The best Python version takes 30+ seconds with heavy disk reads, but it is fine once started.
With your database approach every node access needs a DB query and a process switch. You would be looking at hours or even days.

[In reply to r****t]
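
The "fewer Python objects" idea can be pushed one step further than one array per node. The sketch below is a swapped-in technique (a compressed sparse row layout), not the layout the poster describes: it stores the whole graph in two flat array.array objects. The function names and the five-edge toy graph are hypothetical.

```python
from array import array

def build_csr(num_nodes, edges):
    """Pack a directed edge list into two flat arrays: offsets and targets."""
    # offsets[n + 1] first counts the outgoing edges of node n.
    offsets = array('I', [0] * (num_nodes + 1))
    for src, _ in edges:
        offsets[src + 1] += 1
    # A prefix sum turns the counts into start positions.
    for n in range(num_nodes):
        offsets[n + 1] += offsets[n]
    # Fill targets, keeping one moving write cursor per node.
    targets = array('I', [0] * len(edges))
    cursor = offsets[:num_nodes]          # slicing copies, so offsets is untouched
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets

def neighbours(offsets, targets, n):
    """Return the adjacency list of node n as a small array."""
    return targets[offsets[n]:offsets[n + 1]]

# Toy graph with 4 nodes (made-up data).
edges = [(0, 1), (0, 2), (1, 2), (3, 0), (0, 3)]
offsets, targets = build_csr(4, edges)
print(offsets.tolist(), targets.tolist(), neighbours(offsets, targets, 0).tolist())
```

With this layout the per-object overhead is paid twice for the whole graph rather than once per node, at the price of building the index up front.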

E*V 2011-01-03 08:01

#50

Doesn't the C++ STL have map and so on?

[In reply to S*A]

j*a 2011-01-03 08:01

#51

I don't understand the algorithm, but just get 8 GB of RAM; that is hardly out of reach.

[In reply to S*A]

S*A 2011-01-03 08:01

#52

A 13-inch white MacBook; 4 GB is the maximum.
Another reason is that I take some pride in solving the problem frugally with what I have.
To me that is a good part of the fun.

[In reply to j*a]

S*A 2011-01-03 08:01

#53

I just find the C++ STL too evil, so I boycott it.
In truth I also don't know how to use it and don't want to learn it: too complicated, no fun.

[In reply to E*V]

E*V 2011-01-03 08:01

#54

The STL is OK to use; writing your own templates is what is really annoying.

[In reply to S*A]

j*a 2011-01-03 08:01

#55

Running a computation this big on a laptop is self-torture. Don't your company or school have machines you can use?

[In reply to S*A]

E*V 2011-01-03 08:01

#56

Bingo.
Computation has to go on a cluster; doing it on a desktop or laptop is a waste.

[In reply to j*a]

S*A 2011-01-03 08:01

#57

Relax.
Another reason is that I take some pride in solving the problem frugally with what I have.
To me that is a good part of the fun.
These are personal, just-for-fun projects that have nothing to do with my company, so using company machines would not be appropriate.

[In reply to j*a]

a*9 2011-01-03 08:01

#58

I think C++ has gone down the wrong path.
That it offers so many things to quiz people on in interviews shows that something is wrong with the language.

[In reply to S*A]

j*a 2011-01-03 08:01

#59

For fun? That is a different matter :) Carry on exploring.

[In reply to S*A]

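S*A 2011-01-03 08:01

#60

But the code the STL generates is large and the rules are too complex; the effort of understanding it fully does not pay off. I think the STL is, overall, an overly complicated experiment. Try debugging a memory leak inside STL code. I have done it. Oh my god.
So I think the STL is a trap: it looks great, but really absorbing, understanding, and debugging it is hard. I hate it every time I read a C++ backtrace.

[In reply to E*V]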

S*d 2011-01-03 08:01

#61

Have you not tried C#/Mono?

S*A 2011-01-03 08:01

#62

I am not keen on JIT languages like Java/C#. Even a standalone program needs the JDK/.NET to run, which is cumbersome.

[In reply to S***d]

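r*t 2011-01-03 08:01

#63

Unless you never scale up, this approach will cause trouble sooner or later. If you insist on modifying your own code, try this and see how much time and memory it takes:

    f = open('some_1G_file', 'rb')
    bigbuffer = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    node = dict((n, slice(offset, offset+length)) for n, offset, length in edges)

Each slice object is 20 bytes: more wasteful than a pointer, but << 50 bytes.
And this is far from the best approach.
To access node i, use

    bigbuffer[node[i]]

Doing an mmap and then making a copy of the whole file is redundant work, and the copy is not even packed as tightly as the original.

[In reply to S*A]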

N*n 2011-01-03 08:01

#64

It's not as fast as C/C++ (no garbage-collected language can be) and has no J2EE/.NET kind of platform behind it. So it's just another language.

[In reply to S*A]

wy 2011-01-03 08:01

#65

A platform could be built around it; Google is rich.

[In reply to N********n]

S*A 2011-01-03 08:01

#66

Every project is different, and this project's data is what it is.
That is why I use Python: whatever gets it done fastest.
There is no need to measure mmap here, because mmap was never used.
Please get your facts straight.
How did you arrive at 20 bytes for a slice object? Never mind 64-bit systems; even on 32-bit Python that must be wrong. You need sizeof(PySliceObject), not sizeof(PyObject).
My system is 64-bit. Eyeballing it, sizeof(PySliceObject) is 48 bytes. Was my 50-byte estimate so far off? If you insist that 48 << 50, there is nothing I can do.
bigbuffer is a byte array; node needs to be an int array. It would have to be converted on every use.
That is why I say this improvement does not work. If array.array supported mmap memory views it would. After all this we are back to the same issue, namely your claim that Python should not be that weak. I don't even know what you are arguing with me about.

[In reply to r****t]

r*t 2011-01-03 08:01

#67

Anyone arguing about which language is faster should just go to bed.

[In reply to N********n]

r*t 2011-01-03 08:01

#68

This is just to give you an idea of how fast lazy evaluation can be for this problem.
I used the method built into sys. How did you work it out, by counting from the source?
Then just use struct to unpack... You describe yourself as memory bound at the moment, so how much overhead does one conversion really add? Care to estimate?
I don't know what you think I am arguing about. Weren't you unhappy about copying from mmap into array.array? I keep telling you that all you need is to keep the slices and evaluate them late.

[In reply to S*A]

S*A 2011-01-03 08:01

#69

It is easier to develop in than C/C++ and gets pretty close to C/C++ in run-time performance. It is a good design trade-off and it will have its place.
Java is too slow. Not in the same league.
No J2EE/.NET: isn't that a good thing? There are too many bloated packages around already.

[In reply to N********n]

N*w 2011-01-03 08:01

#70

Go has been bent too far to suit Google's applications.
Multithreading works perfectly well as a library, yet they insisted on putting several keywords into the language.
It inevitably feels like neither fish nor fowl.

[In reply to S*A]

S*A 2011-01-03 08:01

#71

Right, look at /usr/include/python2.7/object.h and sliceobject.h.
I ran an experiment that basically confirms the theory.

    x = [ slice(i,i+2) for i in xrange(1024*1024)]

It takes about 1G on my machine.
Each slice(i,i+2) is roughly 90+ bytes.
Don't forget that the 3 int objects a slice points to also count toward memory; each of them has a PyObject header.
So it is no better than reading from the file directly into array.array, which uses less memory and needs no struct unpack. That is currently my working pure-Python solution.
It does have the advantage of delaying evaluation of the array: a page is only read if it is needed. So maybe it works that way.
1G + 1.5G = 2.x G, plus the dictionary mapping, about 3G.

[In reply to r****t]

r*t 2011-01-03 08:01

#72

I was not talking only about numpy; h5py and pytables are both memory-friendly. If you say they take too much memory, either you misread or you have had too much to drink.

[In reply to S*A]

S*A 2011-01-03 08:01

#73

You can still use your own library functions if you prefer.
go and chan are there to simplify writing multithreaded programs. Keeping data consistent in multithreaded C is tricky. I don't really think the go keyword is Google product placement; it is there to simplify the workflow. The pun is obvious, though.

[In reply to N****w]

r*t 2011-01-03 08:01

#74

OK, mine was on 32-bit. With the two integers added it is about 30 bytes; the third one is None and does not count.
Also, your graph is quite sparse, so going straight to a sparse matrix is another option; the bonus is the many sparse-matrix algorithms you get. But if you refuse to use matrix operations no matter what, never mind.

[In reply to S*A]

S*A 2011-01-03 08:01

#75

And how did you reach that conclusion?
Wait, was it by the same method that gave you 20 bytes for PySliceObject?
Go and look at /usr/include/numpy/ndarrayobject.h yourself.

[In reply to r****t]

S*A 2011-01-03 08:01

#76

I don't know whether a sparse matrix can handle 8M x 8M.
scipy does have a sparse package, but I doubt it would cope.
A linked-list sparse matrix costs memory too.
On 64-bit a pointer is 8 bytes, which is less economical than a 4-byte int.

[In reply to r****t]

r*t 2011-01-03 08:01

#77

How much memory h5py and the like use has little to do with the size of the ndarrayobject struct.

[In reply to S*A]

S*A 2011-01-03 08:01

#78

That I have not tried; I will install it and have a play.
Thanks for the tip.

[In reply to r****t]

r*t 2011-01-03 08:01

#79

As I said, take 20 plus 2 ints. How many bytes is that?

[In reply to S*A]

S*A 2011-01-03 08:01

#80

Why 2 ints?

[In reply to r****t]

r*t 2011-01-03 08:01

#81

The third one is None.

[In reply to S*A]

S*A 2011-01-03 08:01

#82

Oh, I misunderstood. Your 20 bytes is still wrong, though.
Every Python object header has {
8 byte reference counter,
4 or 8 byte pointer to object type
8 byte payload size.
}
That is already 20-24 bytes.
And that is before the 3 pointers to start, stop, and step.
So each slice object is 32-48 bytes.
Each int should be around 24-28 bytes.

[In reply to r****t]

r*t 2011-01-03 08:01

#83

8M x 8M is no longer the issue; we are only discussing the 200M edges now.
Or another route for you is to use Boost Graph from Python.

[In reply to S*A]

S*A 2011-01-03 08:01

#84

No need to go to that much trouble. I tried Go and implemented my mmap idea in it. It works well, and it is unlikely to get more compact than this.
Googling Boost Graph now.

[In reply to r****t]

r*t 2011-01-03 08:01

#85

    >>> sys.getsizeof(1)
    12
    >>> sys.getsizeof(slice(1,2))
    20

20 + 24 = 48, right?

[In reply to S*A]

S*A 2011-01-03 08:01

#86

Take it easy. 20 + 24 = 44.

[In reply to r****t]

r*t 2011-01-03 08:01

#87

Is 44 right, then?

[In reply to S*A]

S*A 2011-01-03 08:01

#88

Yes.
Actually my own calculation was wrong too.

    In [4]: sys.getsizeof(slice(1,2))
    Out[4]: 40

On a 64-bit system it is 40 bytes. It looks as if Py_ssize_t is 4 bytes.
That is the downside of reading the source code.

[In reply to r****t]
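
The back-and-forth above turns on the fact that sys.getsizeof is shallow: it reports the slice object itself and leaves out the int objects its start and stop refer to. A minimal Python 3 sketch of the bookkeeping follows. The absolute numbers depend on the Python version and on 32-bit versus 64-bit builds, so none are asserted here, and the helper name is hypothetical.

```python
import sys

def slice_footprint(start, stop):
    """Return (shallow, deep) byte counts for slice(start, stop).

    deep adds the two int bounds; step is None, a shared singleton,
    so it is left out, as argued in the thread.
    """
    s = slice(start, stop)
    shallow = sys.getsizeof(s)
    deep = shallow + sys.getsizeof(s.start) + sys.getsizeof(s.stop)
    return shallow, deep

# Bounds chosen outside CPython's small-int cache so they are real separate objects.
shallow, deep = slice_footprint(10**6, 10**6 + 2)
print("slice object alone       :", shallow, "bytes")
print("slice plus its two bounds:", deep, "bytes")
```

Multiplying the deep figure by the number of slices gives a more realistic estimate of what a dictionary of slices costs than the shallow figure alone.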

r*t 2011-01-03 08:01

#89

Also, this is what I suggested at the very start today:

    f = open('some_1G_file', 'rb')
    bigbuffer = memoryview(mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ))
    node = dict((n, bigbuffer[offset:offset+length]) for n, offset, length in edges)

This should work on 2.7 and later; on 2.6 and earlier something similar can be done with buffer(). These work directly on the underlying memory rather than returning copies. bytearray and the like also look promising.

[In reply to S*A]

S*A 2011-01-03 08:01

#90

That is a point: convert to an int array with array.array('i', node[n]) only at the moment of use.
That should work, and those array.array objects should be freed quickly. Good idea.
Thanks.

[In reply to r****t]
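
The late-evaluation idea agreed on here can be sketched in Python 3, where memoryview.cast gives a typed int view over an mmap without copying. This is an illustration under assumptions, not code from the thread: the file contents, the node index, and the helper name are made up, and the index is expressed in units of ints rather than bytes.

```python
import array
import mmap
import os
import tempfile

# Hypothetical data: the adjacency lists of three nodes packed back to back.
path = os.path.join(tempfile.mkdtemp(), 'edges.bin')
with open(path, 'wb') as f:
    f.write(array.array('i', [10, 11, 12, 20, 21, 30]).tobytes())

# Hypothetical index: node id -> slice into the int view (one small object per node).
node = {1: slice(0, 3), 2: slice(3, 5), 3: slice(5, 6)}

def neighbours(ints, n):
    """Materialize one adjacency list on demand; nothing is copied before this call."""
    with ints[node[n]] as sub:        # zero-copy sub-view, released on exit
        return sub.tolist()

with open(path, 'rb') as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    with memoryview(mm) as raw, raw.cast('i') as ints:
        result = {n: neighbours(ints, n) for n in node}
        total_ints = len(ints)
    mm.close()    # allowed only because every view over the mapping was released

print(result, total_ints)
```

Only the adjacency list being asked for is turned into Python objects, so the resident cost is the index plus whatever pages of the file the algorithm actually touches.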

d*q 2011-01-03 08:01

#91

The only way to bypass the GIL is to use multiple processes... you may try Parallel Python or the multiprocessing module (the latter is part of the standard library).
Or you can try other implementations like Jython or IronPython. However, those third-party implementations may suffer a lot on performance.

[In reply to F*******i]

r*t 2011-01-03 08:01

#92

Actually, that is not necessarily true. C modules can release the GIL during heavy CPU tasks if the author can be sure he is not doing bad things to the whole runtime. Numpy is an example. I did not try it, but it sounds possible to use a parallel version of ATLAS with it, because it releases the GIL when feasible.

[In reply to d***q]

f*Q 2011-01-03 08:01

#93

4.4
In my humble opinion the ObjC runtime is best compiled from source yourself; otherwise the implementations can differ a lot.

[In reply to S*A]

F*i 2011-01-03 08:01

#94

That is true :)
The thing is, sometimes I want to do some computing with hybrid code (Python + C++/C) instead of in a standalone C extension (module) :(

[In reply to r****t]

l*s 2011-01-03 08:01

#95

Seconded.

[In reply to r*******n]

a9 2011-01-03 08:01

#96

So Go is really aimed at mobile devices, right? Java on Android is just too slow.

[In reply to S*A]

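S*A 2011-01-03 08:01

#97

I don't think so. Go seems to have grown out of the Inferno OS.
http://www.vitanuova.com/inferno/
The AT&T Labs people proposed very early on the idea of a virtual machine that can run on any platform. That was Inferno with Limbo/Dis, but it was never commercialized well.
Sun invested in the same idea and commercialized it very successfully: that is Java.
But Java has a fatal weakness: the Java VM is a stack-based VM. Dis and the later LLVM both use register-based VMs. A stack-based VM is hard to map efficiently onto the instructions of current CPUs. Anyone who has played with assembly knows that register-based CPUs are everywhere.
So having veterans with good taste keep an eye on the technical side matters a lot.
After AT&T/Lucent broke up, that group was hired by Google. The same team took the Inferno code, changed the front end, and that is Go. BTW, these are the veterans who invented C.
So I have a lot of respect for them.

[In reply to a9]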
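
The stack-versus-register distinction drawn in this post can be illustrated with two toy interpreters in Python 3. Both instruction sets are invented for the illustration and are not the JVM, Dis, or LLVM encodings; the sketch only shows how the same expression, a + b * c, is expressed in each style and says nothing about real-world performance.

```python
def run_stack(program, env):
    """Run a toy stack machine; operands are implicit on the stack."""
    stack = []
    for op, arg in program:
        if op == "push":
            stack.append(env[arg])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack.pop()

def run_register(program, env):
    """Run a toy register machine; every instruction names its operands."""
    regs = dict(env)
    for op, dst, lhs, rhs in program:
        if op == "add":
            regs[dst] = regs[lhs] + regs[rhs]
        elif op == "mul":
            regs[dst] = regs[lhs] * regs[rhs]
    return regs

env = {"a": 2, "b": 3, "c": 4}

# a + b * c as stack code: five instructions.
stack_prog = [("push", "a"), ("push", "b"), ("push", "c"), ("mul", None), ("add", None)]
# The same expression as three-address register code: two instructions.
reg_prog = [("mul", "t", "b", "c"), ("add", "r", "a", "t")]

stack_result = run_stack(stack_prog, env)
reg_result = run_register(reg_prog, env)["r"]
print(stack_result, reg_result, len(stack_prog), len(reg_prog))
```

The register form names its operands explicitly, which is closer to how hardware instructions are written; a stack form has to be translated into that shape before it can run natively.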

F*i2011-01-03 08:01

98 楼

zan!

【在 S*A 的大作中提到】

: 我觉得不是。这个 go 应该和那个 inferno OS 发展出来的。
: http://www.vitanuova.com/inferno/
: ATT Lab 那帮人很早就提出了用虚拟机可以跑在任何平台这样
: 的概念。这个就是 inferno 的前身 Limbo/Dis，但是一直商业化不好。
: Sun 同样投入这个概念，商业化很成功，那就是 Java。
: 但是 Java 有个致命弱点就是 Java VM 使用 stack based VM.
: Dis 和后来的LLVM 都是使用 register based VM. 这个 stack based
: VM 很难高效率 map 到现在的 CPU 的指令。玩过汇编的都知道，
: 到处都是 register based CPU.
: 所以有些品味的大牛技术上把把关还是挺重要的。

d*q2011-01-03 08:01

99 楼

it still depends...the code running in c module definitely can bypass the
gil as long as the code in c module doesn't need to work with python code...
if it does, the gil can still impact...

【在 r****t 的大作中提到】

:
: actually not necessarily true. C modules can release gil during heavily
: CPU tasks if the author can be sure that he is not doing bad things to the
: whole runtime. Numpy is an example. I did not try, but it sounds possible
: to use a parallel version of atlas with it because it releases gil when
: feasible.
: standard

v*s2011-01-03 08:01

100 楼

I quite like goroutines. I find select and chan handy to use; they feel a lot like Erlang.
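To make the goroutine, chan, and select combination mentioned above concrete, here is a minimal sketch. The names `worker` and `sumSquares`, the worker count of 3, and the one-second timeout are all made up for illustration; this is one possible arrangement, not a recommended pattern from the thread.

```go
package main

import (
	"fmt"
	"time"
)

// worker squares each job it receives and sends the result back.
func worker(jobs <-chan int, results chan<- int) {
	for n := range jobs {
		results <- n * n
	}
}

// sumSquares fans the inputs out to a few goroutines and collects the
// results. select lets the collector wait on either a result or a
// timeout, whichever is ready first. The bool reports whether all
// results arrived before the (arbitrary) one-second deadline.
func sumSquares(inputs []int) (int, bool) {
	jobs := make(chan int)
	results := make(chan int, len(inputs)) // buffered so workers never block
	for i := 0; i < 3; i++ {
		go worker(jobs, results)
	}
	go func() {
		for _, n := range inputs {
			jobs <- n
		}
		close(jobs) // lets the workers' range loops finish
	}()

	deadline := time.After(time.Second)
	sum := 0
	for range inputs {
		select {
		case r := <-results:
			sum += r
		case <-deadline:
			return sum, false
		}
	}
	return sum, true
}

func main() {
	sum, ok := sumSquares([]int{1, 2, 3, 4})
	fmt.Println(sum, ok)
}
```

The order in which the workers finish is not deterministic, but the sum does not depend on it.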

【在 N****w 的大作中提到】

: go 为了迎合 google 的应用改得太狠了
: 本来多线程这种用库函数实现就很好，结果非要搞几个关键字弄到语言里面
: 难免让人觉得四不像

S*A2011-01-03 08:01

101 楼

I assume "bypass" means releasing the GIL.
The C code just needs to reacquire the GIL before returning to Python.
If you get to the point where the GIL matters, it may be better to do
the multithreaded part in C. Only one thread runs the Python
interpreter at any given time.
I never worry much about the GIL because that is a micro-optimization.
Python is too slow to begin with.
I would switch to C code, or write the whole thing in C,
well before I hit the GIL limit.

【在 d***q 的大作中提到】

: it still depends...the code running in c module definitely can bypass the
: gil as long as the code in c module doesn't need to work with python code...
: if it does, the gil can still impact...

v*s2011-01-03 08:01

102 楼

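Here are some other things I like in Go:
A strong type system, yet the compiler infers types for you. For example:
xx := abc.efg(123)
Functions can return multiple values: a, b := abc.f(123)
defer. Using it for RAII-style cleanup feels much more reassuring than the C++ approach of relying on destructors of stack objects.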
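The three features listed above (type inference with `:=`, multiple return values, and `defer`) can be seen together in a small sketch. `divide`, `resource`, and `use` are hypothetical names invented for this example; `resource` stands in for a file, lock, or connection.

```go
package main

import (
	"errors"
	"fmt"
)

// divide returns two values: the quotient and an error.
func divide(a, b int) (int, error) {
	if b == 0 {
		return 0, errors.New("division by zero")
	}
	return a / b, nil
}

// resource is a hypothetical stand-in for a file, lock, or connection.
type resource struct {
	events *[]string
}

// Close records that cleanup happened.
func (r resource) Close() { *r.events = append(*r.events, "close") }

// use appends to events so the caller can see that the deferred Close
// runs when use returns, including on the early error return.
func use(a, b int, events *[]string) (int, error) {
	r := resource{events: events} // type inferred as resource
	defer r.Close()

	*events = append(*events, "work")
	q, err := divide(a, b) // q is int, err is error; both inferred
	if err != nil {
		return 0, err
	}
	return q, nil
}

func main() {
	var events []string
	q, err := use(10, 2, &events)
	fmt.Println(q, err, events)
}
```

The cleanup sits right next to the acquisition, which is the point the post makes about `defer` compared with relying on destructors.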

【在 v*s 的大作中提到】

: 俺还挺喜欢那个goroutine的。他的select ／ chan 我觉得好用，和Erlang很像。

S*A2011-01-03 08:01

103 楼

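Agreed.
C99 can return a struct (not a struct pointer), which gives the same effect.
The Go version looks clean and pleasant.
That one was added by popular demand. recover is nice too.
Go uses ObjC-style interfaces rather than C++-style multiple inheritance.
I find interfaces simpler and more effective than inheritance.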

【在 v*s 的大作中提到】

: 另外俺还喜欢go中的下面这些东西：
: 强类型系统。但是编译器又会自动推断你的类型。例如
: xx := abc.efg(123)
: 可以返回多个数值。 a,b := abc.f(123)
: defer. 用它来实现 RAII 比 C++ 利用栈上对象的析构函数的方案让人塌实多了

z*y2011-01-03 08:01

104 楼

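Thanks for sharing.
I have been learning Go recently too. Go's syntax is actually quite simple; I got
through the language specification in about a day. The syntax is close to C, but
you don't have to manage memory yourself, which is wonderful for a systems
programmer. I think a programming language, especially a systems language, should
offer the features programmers need most through a very concise syntax, to
maximize productivity. Go adds the best parts of many other languages on top of
C. I feel I have what I need, without having to change the way I think, as
Erlang, Clojure, and similar languages require.
For a systems programmer like me, the best features of Go are:
1 Static typing. Dynamic typing is really not a good way to build large-scale systems software.
2 Interfaces. There is no imposed hierarchy as in object-oriented languages; it is close to duck typing, yet static.
3 Functions as a type.
4 GC. Enough said.
5 Multiple return values. Enough said as well.
6 Native collection types (arrays, slices, maps), plus array slicing.
7 Goroutines and channels. These matter most when writing servers: you no longer
have to choose between threading and event-driven designs. They also avoid many
problems inherent in threads; see the paper "The Problem with Threads". A good
approach to the C10K problem.
8 An increasingly rich standard library.
9 Native compilation.
10 Extremely fast builds. Very important for large projects.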
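As a small illustration of item 6 in the list above, the sketch below uses the built-in map and shows that slicing produces a view onto the same backing array rather than a copy. The `degree` function and the tiny edge list are invented for this example.

```go
package main

import "fmt"

// degree counts how many edges touch each node, using the built-in map.
func degree(edges [][2]int) map[int]int {
	d := make(map[int]int)
	for _, e := range edges {
		d[e[0]]++ // missing keys start at the zero value
		d[e[1]]++
	}
	return d
}

func main() {
	edges := [][2]int{{1, 2}, {1, 3}, {2, 3}, {3, 4}}
	fmt.Println(degree(edges)[3])

	// A slice expression shares the backing array; nothing is copied.
	data := []int{10, 20, 30, 40, 50}
	window := data[1:3]
	window[0] = 99
	fmt.Println(data[1]) // the write through window is visible in data
}
```

Because the slice aliases the original storage, taking many sub-ranges of a large buffer costs only a small header per slice.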

z*y2011-01-03 08:01

105 楼

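You only need bindings when using Go's own fast compilers, the 6* series. The
reason is multiple return values, which make the calling convention different.
With gccgo you can link against C directly yourself.
However, gccgo does not support some features yet. In particular, goroutines are
still mapped one-to-one onto system threads. On the other hand, code compiled by
gccgo performs somewhat better, though compilation is of course a bit slower. It
depends on what you need. In your case you could use gccgo directly.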

【在 S*A 的大作中提到】

: 我试过，根本没法用。没有什么人维护。
: 问题在于 ObjC 的 Foundary ，如果没有 NS*
: 那套的话，ObjC 什么都干不了。
: Linux 里面的 Foundary 模仿 Apple 都很烂，
: 我试过好几个版本，最简单的 Dictionary 链接都通不过。
: Apple 的 Foundary 理论上 Open Source, 但是只有老调牙的版本，
: Linux上编译还不通过，要改很多地方。
: 我后来就彻底放弃了。
: Google Go 要比 Objective C 强，写起来快。唯不如的地方
: 是 Objective C 可以直接调用 C code. Go 要写 binding.

z*y2011-01-03 08:01

106 楼

Go source code is Unicode to begin with, which is nice.

【在 S*A 的大作中提到】

: 刚刚发现 go 居然可以用中文变量名：
: package main
: import (
: "fmt"
: )
: func main() {
: 数字:= 1
: fmt.Printf("变态阿: %d\n", 数字)
: }

w*l2011-01-03 08:01

107 楼

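In that case you might as well write it in C. Writing a usable map is not hard,
and even if it is not very efficient, it is worth it if it offsets the cost of
not being able to use mmap.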

【在 S*A 的大作中提到】

: 你说的对，numpy slice 的确是使用 view. 我也试过了，对于我的应用
: 不行。我的问题是数据量很大，就算用 mmap 也要非常有技巧。
: 问题是 numpy 的 slice object 比 python array.array
: 费很多内存。我估计 numpy slice 在 200 byte 左右。这样 slice 数目
: 多上去以后总的内存还是没有省。
: import numpy
: fp = numpy.memmap("big-file-over-1G-byte", dtype='uint32')
: x = [ fp[i] for i in xrange(0, len(fp), 50)]
: # load all the thing in memory. Python should take 1.x G here
: x = [ fp[i:i+1] for i in xrange(0, len(fp), 50)]

S*A2011-01-03 08:01

108 楼

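I did end up writing a C module for this, and along the way I got sidetracked
into writing a Go version to get familiar with Go. You don't even need to write
your own map in C these days: besides glib, libc now ships a hash table and a
binary search tree. See man hsearch and tsearch.
BTW, pure Python can approximate this too, as someone suggested earlier.
The key is not to convert the type of the mmap buffer when building the
dictionary, and to defer the conversion until access time.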
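The idea of keeping the raw buffer as-is and converting only on access can be sketched in Go as well. Everything here is an assumption for illustration: the `u32View` type is hypothetical, a plain byte slice stands in for the mmap'ed region, and the little-endian layout is a guess about the file format. It is not the code the poster wrote.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// u32View wraps a raw byte buffer (for example one obtained from mmap)
// and decodes 32-bit values only when they are read, so no per-element
// objects are ever created.
type u32View struct {
	buf []byte
}

// Len reports how many complete uint32 values the view holds.
func (v u32View) Len() int { return len(v.buf) / 4 }

// At decodes element i at access time. Little-endian is an assumption.
func (v u32View) At(i int) uint32 {
	return binary.LittleEndian.Uint32(v.buf[4*i:])
}

// Slice returns a sub-view that shares the same underlying bytes.
func (v u32View) Slice(lo, hi int) u32View {
	return u32View{buf: v.buf[4*lo : 4*hi]}
}

func main() {
	raw := make([]byte, 16) // stand-in for an mmap'ed file
	for i := 0; i < 4; i++ {
		binary.LittleEndian.PutUint32(raw[4*i:], uint32(100+i))
	}
	v := u32View{buf: raw}

	// The dictionary stores cheap views, not converted copies.
	index := map[string]u32View{"tail": v.Slice(2, 4)}
	fmt.Println(v.Len(), index["tail"].At(0))
}
```

Each view is just a slice header, so a large number of entries in the map stays cheap compared with materializing every range.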

【在 w*********l 的大作中提到】

: 你这样还不如直接写C了。稍微写一个能用的map也不是什么难事儿，就算效率不高如果
: 能抵消你不能
: mmap产生的花销也值了。

N*n2011-01-03 08:01

109 楼

Stack-based VMs all have JIT to handle optimization. Go's advantage over
there is not that big. The main performance gain is probably from that it
doesn't have as much exception handling as Java, which is always a pain
in the butt for any serious optimization attempt.
Go gives up a lot for performance gain I think. There's no exception, no
inheritance and no generics. You cannot write large-scale apps in it.

【在 S*A 的大作中提到】

: 我觉得不是。这个 go 应该和那个 inferno OS 发展出来的。
: http://www.vitanuova.com/inferno/
: ATT Lab 那帮人很早就提出了用虚拟机可以跑在任何平台这样
: 的概念。这个就是 inferno 的前身 Limbo/Dis，但是一直商业化不好。
: Sun 同样投入这个概念，商业化很成功，那就是 Java。
: 但是 Java 有个致命弱点就是 Java VM 使用 stack based VM.
: Dis 和后来的LLVM 都是使用 register based VM. 这个 stack based
: VM 很难高效率 map 到现在的 CPU 的指令。玩过汇编的都知道，
: 到处都是 register based CPU.
: 所以有些品味的大牛技术上把把关还是挺重要的。

z*y2011-01-03 08:01

110 楼

I don't think so. Go was developed for writing large-scale applications.
Exceptions: with multiple return values and defer, Go handles errors
pretty well. In fact, exceptions are often discouraged in C++, and in Java
they are abused. If you really need them, panic/recover is there for that.
Inheritance: this is really a model that OO forces on programmers. Go's
interfaces and struct composition are more flexible. As with duck typing,
if something quacks like a duck, swims like a duck, and walks
like a duck, it is a duck. Why should I have to define a base duck and then
inherit from it?
Generics: it depends on your needs. Generics are not that important in Go,
though people are debating whether to support them. You just have to
change your mindset.
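The duck analogy above maps directly onto Go's implicitly satisfied interfaces and struct embedding. The sketch below is only an illustration; `Quacker`, `Duck`, `Robot`, `LoudDuck`, and `chorus` are invented names.

```go
package main

import "fmt"

// Quacker is satisfied by any type that has a Quack method.
// No type has to declare that it implements it.
type Quacker interface {
	Quack() string
}

type Duck struct{ Name string }

func (d Duck) Quack() string { return d.Name + " says quack" }

// Robot never mentions Duck or Quacker, yet it satisfies the interface.
type Robot struct{}

func (Robot) Quack() string { return "beep-quack" }

// LoudDuck reuses Duck's fields and methods by embedding, not inheriting.
type LoudDuck struct {
	Duck
	Volume int
}

// chorus works with anything that quacks.
func chorus(qs ...Quacker) []string {
	out := make([]string, 0, len(qs))
	for _, q := range qs {
		out = append(out, q.Quack())
	}
	return out
}

func main() {
	loud := LoudDuck{Duck: Duck{Name: "Donald"}, Volume: 11}
	for _, line := range chorus(Duck{Name: "Daisy"}, Robot{}, loud) {
		fmt.Println(line)
	}
}
```

`LoudDuck` gets `Quack` through the embedded `Duck`, which is the kind of code reuse by composition that the post contrasts with a base-class hierarchy.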

【在 N********n 的大作中提到】

:
: Stack-based VMs all have JIT to handle optimization. Go's advantage over
: there is not that big. The main performance gain is probably from that it
: doesn't have as much exception handling as Java, which is always a pain
: in the butt for any serious optimization attempt.
: Go gives up a lot for performance gain I think. There's no exception, no
: inheritance and no generics. You cannot write large-scale apps in it.

N*n2011-01-03 08:01

111 楼

Interfaces are good for interaction between objects. Inheritance is good
for code reuse. W/o it and generics that helps avoiding code duplicate I
don't see how it helps in writing a robust and organized platform.

【在 z***y 的大作中提到】

:
: over
: it
: pain
: no
: Don't think so. Go is developed for writing large scale application:
: Exception: with multiple return values and defer, Go can handle errors
: pretty well. In fact, exception is often discouraged in C++. In Java, it
: is abused. If you really need it, the panic/recover is just for that.
: Inheritance: this is actually a forced model on programmers by OO. Go's

z*y2011-01-03 08:01

112 楼

Take a look at how Go does struct composition and interface
composition. Even in languages like Java and C++, composition
is often encouraged over inheritance.
As for generics, in Go you can use the empty interface to build data
types and unbox explicitly. It is not as easy to use as generics, but it
is OK most of the time. For people who use generics heavily it may be a
pain, but I have never found generics critical for me.
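Here is a rough sketch of the "empty interface plus explicit unboxing" approach described above. The `Stack` type and its methods are hypothetical and written only for this example; the unboxing step is the type assertion `v.(int)`.

```go
package main

import "fmt"

// Stack holds values of any type by storing them as the empty interface.
type Stack struct {
	items []interface{}
}

// Push boxes v into an interface{} value.
func (s *Stack) Push(v interface{}) { s.items = append(s.items, v) }

// Pop returns the top value, or false if the stack is empty.
func (s *Stack) Pop() (interface{}, bool) {
	if len(s.items) == 0 {
		return nil, false
	}
	v := s.items[len(s.items)-1]
	s.items = s.items[:len(s.items)-1]
	return v, true
}

// PopInt pops the top value and unboxes it with a type assertion.
// The value is consumed even when it turns out not to be an int.
func (s *Stack) PopInt() (int, bool) {
	v, ok := s.Pop()
	if !ok {
		return 0, false
	}
	n, ok := v.(int) // explicit unboxing; ok is false on a type mismatch
	return n, ok
}

func main() {
	s := &Stack{}
	s.Push(42)
	s.Push("hello")
	if _, ok := s.PopInt(); !ok {
		fmt.Println("top was not an int")
	}
	n, ok := s.PopInt()
	fmt.Println(n, ok)
}
```

The type check moves from compile time to run time, which is the trade-off being debated in this part of the thread. Go did later gain type parameters, in Go 1.18.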

【在 N********n 的大作中提到】

:
: Interfaces are good for interaction between objects. Inheritance is good
: for code reuse. W/o it and generics that helps avoiding code duplicate I
: don't see how it helps in writing a robust and organized platform.

N*w2011-01-03 08:01

113 楼

With GC it cannot serve as a systems language.
Better to stick to app development.

【在 z***y 的大作中提到】

: 谢谢分享。
: 我最近也在学Go，Go的语法其实比较简单，Language specification基本上一天就看完
: 了。语法很接近C，但又不用自己管理内存，对于系统程序员简直是太好了。我觉得一
: 个编程语言，尤其是系统语言，应该用非常简洁的语法，提供程序员最需要的功能，以
: 最大程度上提高编程的效率。Go在C的基础上添加了很多其他语言中的精华，感觉我有
: 了我所需要的东西，但又不像很多其他语言那样，需要我改变思维方式，比如erlang,
: clojure等等。
: 我觉得Go对于我这样的系统程序员来说，最好的特点有这么几个：
: 1 static type。dynamic type实在不是用来编大规模系统程序的好方法。
: 2 interface。没有了Object oriented那样的强加的体系，接近于duck typing，但又

z*y2011-01-03 08:01

114 楼

Go is developed mostly for networking, like web server. If you are
talking about writing OS, the trend is actually towards using languages
with GC. So Go still has a chance. But I love Go as a language writing
large distributed system.

【在 N****w 的大作中提到】

: GC 就没法当系统语言了
: 还是 app develop 吧
:
: ,

S*A2011-01-03 08:01

115 楼

It is hard to do ordinary function-level optimizations like CSE on a
stack-based VM. In theory you can undo the stack and convert the code back
to SSA form. A lot of modern compiler transformations depend on SSA.
Converting stack code to SSA takes memory and CPU time. In modern C
compilers, most of the time is spent on IR transformations.
The C front end is very fast.
That is why Java is better off with a JIT: it works only on the hot path,
and most of the scalars are already on the stack. It cannot afford
full optimization of the function. A JIT is not magic; it
embeds a small compiler into your runtime. It just can't run
memory-intensive or CPU-intensive optimization passes.
That is just your speculation.
Don't be silly. Inheritance and generics are not the only way to
write large-scale code, let alone whether they are the best way to
do it.
Go has the unique advantage that it was invented after all
those languages. People learn from the mistakes of existing languages:
what works and what doesn't. Go picks the parts that work very well.
Generics sound good, but they cause a lot of complexity in the actual
implementation. Go did not say no to generics; it is just an open
issue that is not decided yet. Make the simple things work first, then
decide whether more is needed. That is a good way to approach it.
Don't forget that the Go team has many of the great minds who invented UNIX
and C, which have stood the test of time. Go is closer to C than to Java.
It compiles more like C than like Java. It is simpler and cleaner than C++.
That is unique to me, and that is why I believe it will earn its place
in history.
I have been looking for something to fill the gap between C
and Python for years. Go does a great job of that. Java is not it,
period.

【在 N********n 的大作中提到】

:
: Interfaces are good for interaction between objects. Inheritance is good
: for code reuse. W/o it and generics that helps avoiding code duplicate I
: don't see how it helps in writing a robust and organized platform.

S*A2011-01-03 08:01

116 楼

I have to politely disagree about GC for writing an OS.
GC has many problems that make it very hard to write an OS with.
First of all, the GC needs the memory mapped in the
kernel in order to scan objects. You cannot GC objects that are
swapped out to disk. You are likely to invoke the GC when
you are low on free pages, but you then need to find more pages
to map those objects back in before you can clean anything out. You can't
get a correct usage count if some pages are missing.
That is a chicken-and-egg problem, and it invites swap-storm
behavior.
That alone is a deal breaker for GC in an OS.

【在 z***y 的大作中提到】

:
: Go is developed mostly for networking, like web server. If you are
: talking about writing OS, the trend is actually towards using languages
: with GC. So Go still has a chance. But I love Go as a language writing
: large distributed system.

N*n2011-01-03 08:01

117 楼

Nah, you underestimated the impact of exception handling on optimization.
With exceptions around you have #1 many abnormal edges on a control flow
graph to ruin optimization opportunities; #2 limited ways to reschedule
code execution order if you have to maintain the scene when an exception
is thrown. Go gets off easy by not offering serious exception support.
Also Java is for writing apps not system software. It should not be used
for computation intensive tasks. I'm sure Go can write large-scale code,
unnecessarily large probably b/c of short on code reuse and consistency
support that come natural from OO languages.

【在 S*A 的大作中提到】

: I have to politely decline GC in writing OS.
: GC has a lot of problem make it very hard to write OS.
: First of all, GC need to have the memory mmaped in the
: kernel in order to do object scans. You can not GC object
: swap out to the disk. You are likely to call GC when
: you are low on free pages. But you need to find more pages
: to mmap those in order to clean some out. You can't get
: a correct usage count if there are some page is missing.
: That is a chicken and egg problem and asking for swap
: storm behavior.

S*A2011-01-03 08:01

118 楼

You have no clue how modern compilers generate machine
code for exceptions. Your mindset is still the Python way
of checking for exceptions. A segfault can be generated at every
load/store instruction. Do you really think every possible
exception point has an exception edge to a handler in the control flow
graph? I am speechless.
I don't know much about Java, but that is not how things are done
in LLVM and C. I cannot imagine Java being stupid enough
to do it your insane way when there is a much faster and easier way.
Code reuse:
Take a look at how webgo and other Go network packages reuse the
http package. Just because you can't do it doesn't mean other people have
to do it your silly way.
This is not constructive. If you like Java and Java is perfect for
you, keep using Java. Nobody is forcing you to write Go code.
At the same time, stop the FUD that Go can't reuse code. That is
simply not true. The fact is it can, and there is code out there to prove
it. It is different from the way you think it should be done,
but it is there. You can't deny it.

【在 N********n 的大作中提到】

:
: Nah, you underestimated the impact of exception handling on optimization.
: With exceptions around you have #1 many abnormal edges on a control flow
: graph to ruin optimization opportunities; #2 limited ways to reschedule
: code execution order if you have to maintain the scene when an exception
: is thrown. Go gets off easy by not offering serious exception support.
: Also Java is for writing apps not system software. It should not be used
: for computation intensive tasks. I'm sure Go can write large-scale code,
: unnecessarily large probably b/c of short on code reuse and consistency
: support that come natural from OO languages.

N*n2011-01-03 08:01

119 楼

You lay off segfaults b/c it's system's duty to handle hardware issues
not your app code.
But if it's software exceptions defined and thrown by your code then you
better recognize the need of an edge or you could optimize it wrong to
break language specification. Segfault. Gimme a break.

【在 S*A 的大作中提到】

: You have no clue on how modern compilers generate machine
: code for exception. Your mind set is still the python way
: of doing exception check. Segfault can be generate at every
: load/store instruction. You really think that every possible
: exception point has en exception edge to handler in the control
: graph. I am speachless.
: I don't know much about Java, but that is not how things was done
: in LLVM and C. I can not image Java will be stupid enough
: to do your insane way, while there is much faster and easier way.
: Code reuse:

I*e2011-01-03 08:01

120 楼

I just started to use cython recently.
As far as I know, any serious python project has to use cython more or
less.
Though there are 2 problems for me:
I could not find any good editor for it and I could not find any doc for
cross-python-version compiling.
But I like everything else.

【在 S*A 的大作中提到】

: 最近学习了一下 google go. 觉得还挺好的。我大胆预测以后一定会火。
: 整体感觉，填补了 C 和 Python 中间的空白。
: 和 C 一样，直接生成机器代码这个非常好。这一点就把什么 Java, C#
: 都比下去了。估计以后成熟写应该能和 C++ 的速度差不多。我一直想找
: 个类似 C 的但是可以直接用 dictionary & array 的抽象数据类型的
: 语言。以前我学过的最接近的是 Objective C, Objective C 其实挺不错
: 的，就是写起来比较长一点。最大的问题是出了 OSX 没有地方可以用。
: Python, Lua， Java 子类都太慢，什么都是 Box Type. 没法快起来。
: 这个 C++ template 那一套太复杂，而且生成很多不必要的代码（各种
: basic type 生成一套）而且 OO 多重继承那一套就是走火如魔了。

S*A2011-01-03 08:01

121 楼

You did not realize that hardware-raised and software-raised exceptions
go through exactly the same mechanism for exception recovery.
The number of edges you add equals the number of software raises
you use. If you raise an exception and catch it in an upper function,
the functions in between are unaffected. They don't need to know
about the exception at all. If you raise and catch in the same
function, it is not much different from the goto-on-error pattern
used in the Linux kernel. Are you trying to tell me compilers
can't optimize C code with goto-based error handling?
As I told you before, with a stack-based VM it is
harder to do data-flow analysis without SSA form.
With control flow alone there is not much you can do at all.
Even dead code elimination needs data-flow analysis to find
the dead edges. Tell me which control-flow
optimization you have in mind that doesn't need help from SSA.
That is why Java picked the JIT approach. It can optimize
the hot path easily, since it is all on the stack. Function-scope
optimization isn't really the strong point of a stack-based
VM; it needs to undo the stack and convert to SSA form.
On the other hand, static compilation has a hard time doing
cross-function optimization. That is why static and inline are
useful in C.
Another way to look at the exception feature: I can see how
too many exceptions cause problems. By your own argument, isn't
removing Java-style exceptions a good thing?
Go still has panic and recover. Doesn't that cover the useful
part of Java's exceptions?
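For readers who have not seen panic and recover, the following is a minimal sketch of the usual shape: a panic raised deep in a call chain is turned back into an ordinary error at one boundary, and the functions in between need no error plumbing. `parseDigit`, `sumDigits`, and `SafeSumDigits` are made-up names for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// parseDigit panics on bad input instead of returning an error.
func parseDigit(c byte) int {
	if c < '0' || c > '9' {
		panic(errors.New("not a digit: " + string(c)))
	}
	return int(c - '0')
}

// sumDigits sits between the raise and the catch and knows nothing
// about error handling.
func sumDigits(s string) int {
	total := 0
	for i := 0; i < len(s); i++ {
		total += parseDigit(s[i])
	}
	return total
}

// SafeSumDigits converts a panic carrying an error into a normal
// return value. Panics with any other payload are passed on.
func SafeSumDigits(s string) (total int, err error) {
	defer func() {
		if r := recover(); r != nil {
			e, ok := r.(error)
			if !ok {
				panic(r) // not an error value; re-panic
			}
			err = e
		}
	}()
	return sumDigits(s), nil
}

func main() {
	fmt.Println(SafeSumDigits("1234"))
	fmt.Println(SafeSumDigits("12x4"))
}
```

Only the boundary function contains recovery code, which mirrors the point above that intermediate functions are unaffected when the raise and the catch live in different frames.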

【在 N********n 的大作中提到】

: You lay off segfaults b/c it's system's duty to handle hardware issues
: not your app code.
: But if it's software exceptions defined and thrown by your code then you
: better recognize the need of an edge or you could optimize it wrong to
: break language specification. Segfault. Gimme a break.

S*A2011-01-03 08:01

122 楼

shame on me. As a long time python lover and hacker,
I only heard about it here a few days ago.
Here is vim syntax highlighting for cython:
http://www.vim.org/scripts/script.php?script_id=2209
If you don't use vim, well, learn it. :-)
Can you give more detail on the python version
problem you have?

【在 I**********e 的大作中提到】

: I just started to use cython recently.
: As far as I know, any serious python project has to use cython more or
: less.
: Though there are 2 problems for me:
: I could not find any good editor for it and I could not find any doc for
: cross-python-version compiling.
: But I like everything else.

N*n2011-01-03 08:01

123 楼

You ain't listening. Lots of transformation needs to move code onto the
edges. You can do it on regular edges where source and target are clear.
That is the goto case. If you happen to need to move code onto abnormal
edges you are in trouble as the throw target is floating, so compilers
hate to see lots of abnormal edges in a program. That's just one of the
headaches caused by exceptions.

【在 S*A 的大作中提到】

: You did not realized that hardware or software raised exception
: go through the same exact mechanism for exception recovery.
: The edge you add is the same as the number of software raise
: you use. If you raise exception and catch in upper functions.
: The function in between is unaffected. They don't need to know
: about exception at all. If you raise and catch in the same
: function. There is not much different from goto on error
: used by linux kernel. You are trying to tell me compiler
: can't optimize C code with goto and error handling?
: As I told you before. Because the stack based VM, it is

S*A2011-01-03 08:01

124 楼

I am listening, but I cannot understand what kind
of transformation you are talking about.
So educate me.

What kind of transformation are you talking about here?
Can you give me a concrete example of the IR (instructions at the
basic-block level) before the transformation and after the
transformation?
"Headaches" is not a compiler term. Help me figure out what
kind of transformation you can do without exceptions
but can no longer do once you have exception edges,
so I can better understand the headache you are
dealing with.
And do you know how a "floating target" gets caught at the machine
instruction level?

【在 N********n 的大作中提到】

:
: You ain't listening. Lots of transformation needs to move code onto the
: edges. You can do it on regular edges where source and target are clear.
: That is the goto case. If you happen to need to move code onto abnormal
: edges you are in trouble as the throw target is floating, so compilers
: hate to see lots of abnormal edges in a program. That's just one of the
: headaches caused by exceptions.

N*n2011-01-03 08:01

125 楼

Get yourself a copy of Robert Morgan's "Building an Optimizing Compiler"
and read the abnormal edge part. I don't have time to type it down here.

【在 S*A 的大作中提到】

: I am listening but I can not understand what kind
: of transformation you are talking about.
: So educate me.
:
: What kind of the transformation are you talking about here?
: Can you give me a concrete example the IR (instruction in a
: basic block level) before the transformation and after the
: transformation?
: "Headaches" is not a compiler term. Help me figure it out, what
: kind of the transformation you can do without exceptions.

S*A2011-01-03 08:01

126 楼

Is that a backhanded way of saying Robert Morgan knows the details but
you don't?
If you do, can you give me a one-minute executive summary?
You don't have time, yet you are replying on mitbbs at 1:50am?

【在 N********n 的大作中提到】

: Get yourself a copy of Robert Morgan's "Building an Optimizing Compiler"
: and read the abnormal edge part. I don't have time to type it down here.

S*d2011-01-03 08:01

127 楼

Typical BBS argument. Always ends up in
A: you are stupid
B: no, you are stupid
Remember, arguing on internet is like attending special olympics...

F*i2011-01-03 08:01

128 楼

I am happy with vim's syntax highlighting/indent for cython

【在 I**********e 的大作中提到】

: I just started to use cython recently.
: As far as I know, any serious python project has to use cython more or
: less.
: Though there are 2 problems for me:
: I could not find any good editor for it and I could not find any doc for
: cross-python-version compiling.
: But I like everything else.

F*i2011-01-03 08:01

129 楼

the default one from vim7.3 is not bad :)

【在 S*A 的大作中提到】

: shame on me. As a long time python lover and hacker,
: I only heard about it here a few days ago.
: Here is vim syntax highlighting for cython:
: http://www.vim.org/scripts/script.php?script_id=2209
: If you don't use vim, well, learn it. :-)
: Can you give more detail on the python version
: problem you have?

S*A2011-01-03 08:01

130 楼

That is not what happened here.
I tried to keep the discussion technical. I asked a very
detailed technical question about which optimization NeverLearn
was referring to. NeverLearn could not answer.
So far NeverLearn has conceded by default that:
- You can reuse Go code to write large-scale code.
- Go's panic() and recover() are not that bad compared to Java's exceptions.
The only remaining question is how much Java's exceptions affect
Java's performance.
Shred, joking aside, I find it a little disturbing that you discriminate
against Special Olympics participants. You did not spell it out, but
the way you said it implies it.

【在 S***d 的大作中提到】

: Typical BBS argument. Always ends up in
: A: you are stupid
: B: no, you are stupid
: Remember, arguing on internet is like attending special olympics...

S*A2011-01-03 08:01

131 楼

I owe you an apology.
I am sorry that I teased you. That is my bad.
The question I asked you was a trick question. It cannot be
answered by copying a chapter from the book.
I asked *what* optimization passes are affected by abnormal edges.
You tried to reply with *how* abnormal edges affect optimization.
You see the difference?
To answer my question, you really need to absorb some other
chapters as well and then draw the conclusion. I was trying to see
whether you can explain things from your own understanding instead
of reciting the book. That is a bad habit I picked up
from the interview process.
Regarding my question: the "how" part relates to the phi nodes
in SSA form, which need to know which edge a value comes in from. It is
not too big a deal if you have only one level of try/catch. If you
have more than one level, the number of edges multiplies. It also
doesn't matter much if the catch is in a different function,
because that function does not see the inner function's local scope.
The "what" part: for example, promoting memory variables to scalars
is affected by the phi nodes.
If you are not using SSA form, variables stay in memory, and
it does not matter that much either.
That is my reasoning. I know exceptions play some role in
optimization. However, Java does JIT in a way that is significantly
different from static compilation. The book you are talking
about covers static compilation, not JIT. Java might not even
have function-scope SSA form.
The only sure way to find out is to test. Write some Java program
with try/catch and run it in 3 different cases:
try, never catch
try, catch 50% of the time
try, catch 100% of the time
Look at the real performance numbers. I doubt there is more than a 10%
difference.
Anyway, I am sorry that I teased you to show off.
I hope you learned something from it as well.

【在 N********n 的大作中提到】

: Get yourself a copy of Robert Morgan's "Building an Optimizing Compiler"
: and read the abnormal edge part. I don't have time to type it down here.

r*n2011-01-03 08:01

132 楼

I'm a Python lover too, but not a hacker.
Looking forward to your post about cython after you have played around with it.

【在 S*A 的大作中提到】

: shame on me. As a long time python lover and hacker,
: I only heard about it here a few days ago.
: Here is vim syntax highlighting for cython:
: http://www.vim.org/scripts/script.php?script_id=2209
: If you don't use vim, well, learn it. :-)
: Can you give more detail on the python version
: problem you have?

S*A2011-01-03 08:01

133 楼

Not to beat a dead horse, but I am curious about Java exception handler
performance as well.
Here is the first link when I googled "java exception performance":
http://stackoverflow.com/questions/299068/how-slow-are-java-exc
The poster is asking why, when he tests with and without
exceptions, the code runs at about the same speed, if not faster.
That more or less matches my expectation. Using a JIT, Java can't
do a lot of the traditional static-compilation optimizations. On the
other hand, because it uses a JIT, Java can do a lot of optimizations
not available to static compilation. It is a different design choice.

n*w2011-01-03 08:01

134 楼

So the conclusion is that a register VM is better than a stack VM?

S*A2011-01-03 08:01

135 楼

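The two can be converted into each other.
If you are generating statically compiled machine code, the register form is easier to optimize.
For interpretation, a stack is easier to write. Python's bytecode VM is stack based. (Lua, though, has used a register-based VM since 5.0.)
Someone once tried to make Python stackless; that project seems to have fizzled out.
Interpreting registers directly brings many troublesome problems,
such as register reclamation.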

【在 n*w 的大作中提到】

: 那么结论是register vm 好过 stack？

n*w2011-01-03 08:01

136 楼

Java, .NET, and so on are all stack based at the moment, right?
Besides LLVM, what else is in practical use? It seems LLVM is not used much yet?

【在 S*A 的大作中提到】

: 这两个是可以相互转换的。
: 生成 static compile 的机器代码话 register 比较容易做优化处理。
: 解释执行的话 stack 比较容易写。Python Lua 都是 stack based.
: 曾经有人想把 Python 转成 stackless 的，那个项目好像不了了之了。
: 直接解释 register 的话这个 register 回收什么的有很多麻烦的
: 问题。

S*A2011-01-03 08:01

137 楼

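You can think of the compiler back end's internal representation as a VM; the proper term is IR.
Java should be stack based. That is very hard to change now.
Apple has moved its stuff onto LLVM. LLVM is hot now and will be hotter.
gcc is too hard to hack. Much of early gcc was started by RMS. RMS is full of
passion, but he does not come from a formal CS background, so many things were
built without the guidance of good theory. RMS is very good at looking at how a
piece of software works and figuring out how to write one himself.
But compilers have deep theoretical underpinnings, and gcc was in a very backward
state for a long time. Many optimizations happened at the wrong stage. gcc's
internal representation looks like Lisp, because RMS is especially fond of Lisp
and thinks everything in the world should look like Lisp. gcc supports a great
many machine types, so structural changes of this kind are very difficult; the
project is too unwieldy to turn around. gcc 4 moved to a somewhat better
architecture that leans less on the Lisp-like IR.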

【在 n*w 的大作中提到】

: java dotnet什么目前都是stack的吧。
: 除了llvm，还有什么在实用阶段？好像llvm现在用的还很少?

w*g2011-01-03 08:01

138 楼

I don't think it is fair to say RMS lacks a formal CS background. Strictly speaking
he doesn't have one, but after all he worked at the MIT AI Lab.

【在 S*A 的大作中提到】

: 你可以管 compiler 后端的内部表示方式为 VM, 正规叫 IR.
: Java 因该是 stack 的。这个很难改了。
: Apple 的东西都移到 llvm 上了，llvm 现在火，以后会更火。
: gcc 太难 hack 了。gcc 以前很多东西是 RMS 起家的。RMS 满腔
: 热情，但是不是 CS 科班出身的，很多东西搞的没有使用上好的理论
: 指导。RMS 很擅长看看软件如何工作自己琢磨出来如何写一个了。
: 但是 compiler 是理论背景比较深的，gcc 很长时间都处于非常落后
: 的状态。很多 optimization 发生在不正确的阶段。gcc 的内部描述
: 就是 lisp 那样的，因为 RMS 特别喜欢 lisp，世界上所有的东西应该
: 都是长 lisp 那样的。gcc 支持的机器类型很多多，所以这种结构上

n*t2011-01-03 08:01

139 楼

A formal CS education is not much use for writing software...

【在 S*A 的大作中提到】

: 你可以管 compiler 后端的内部表示方式为 VM, 正规叫 IR.
: Java 因该是 stack 的。这个很难改了。
: Apple 的东西都移到 llvm 上了，llvm 现在火，以后会更火。
: gcc 太难 hack 了。gcc 以前很多东西是 RMS 起家的。RMS 满腔
: 热情，但是不是 CS 科班出身的，很多东西搞的没有使用上好的理论
: 指导。RMS 很擅长看看软件如何工作自己琢磨出来如何写一个了。
: 但是 compiler 是理论背景比较深的，gcc 很长时间都处于非常落后
: 的状态。很多 optimization 发生在不正确的阶段。gcc 的内部描述
: 就是 lisp 那样的，因为 RMS 特别喜欢 lisp，世界上所有的东西应该
: 都是长 lisp 那样的。gcc 支持的机器类型很多多，所以这种结构上

wy2011-01-03 08:01

140 楼

That is nonsense.

【在 n******t 的大作中提到】

: 科班在写软件上，没什么太大用。。。

N*w2011-01-03 08:01

141 楼

It really is nonsense,
unless compilers and the like don't count as software and are called the mother of software.

【在 wy 的大作中提到】

: 你这个是鬼扯

M*u2011-01-03 08:01

142 楼

You know it is nonsense and you still reply.

【在 wy 的大作中提到】

: 你这个是鬼扯

S*A2011-01-03 08:01

143 楼

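OK, RMS has a CS background, just not a compiler background.
RMS is better suited to being a spiritual leader; his taste as a technical lead is not great.
Look at the GNU coding style. Very unpleasant.
Hurd dragged on for ages without shipping, and Linux ended up taking its place.
gcc's internals were in bad shape for a long time, with too much politics. LLVM is
catching up and overtaking it fast; at this pace gcc's position will not hold.
I think the rarest and hardest-to-learn technical quality in an open source
maintainer is taste. The upper limit of one's taste is basically innate.
I have found that the better maintainers have a fairly inclusive attitude. The
more extreme ones don't do well, and neither do the ones who are too soft.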

【在 w***g 的大作中提到】

: 说RMS不是cs科班出身的我觉得不合适。严格说来不是，但毕竟人家是在MIT AI lab混
: 的。

T*x2011-01-03 08:01

144 楼

What an expert.

【在 S*A 的大作中提到】

: I own you an apology.
: I am sorry that I teased you. That is a my bad.
: The question I ask you is a trick question. It can not be
: answered by copying some chapter in the book.
: I ask *what* optimization pass get affected by abnormal edges.
: You try to reply *how* abnormal edges affect optimization.
: You see the difference?
: To answer my question, you really need to absorb some other
: chapter as well. Then draw the conclusion. I am trying to see
: if you can explain things using your own understanding instead

r*z2011-01-03 08:01

145 楼

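I like Go a lot too and have written some code in it.
For me the main problem right now is the lack of essential scientific computing
libraries. When I used C/C++ I relied heavily on gsl, so after switching to Go I
had to go through Go's C library interface, which is too much hassle.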

【在 S*A 的大作中提到】

: 最近学习了一下 google go. 觉得还挺好的。我大胆预测以后一定会火。
: 整体感觉，填补了 C 和 Python 中间的空白。
: 和 C 一样，直接生成机器代码这个非常好。这一点就把什么 Java, C#
: 都比下去了。估计以后成熟写应该能和 C++ 的速度差不多。我一直想找
: 个类似 C 的但是可以直接用 dictionary & array 的抽象数据类型的
: 语言。以前我学过的最接近的是 Objective C, Objective C 其实挺不错
: 的，就是写起来比较长一点。最大的问题是出了 OSX 没有地方可以用。
: Python, Lua， Java 子类都太慢，什么都是 Box Type. 没法快起来。
: 这个 C++ template 那一套太复杂，而且生成很多不必要的代码（各种
: basic type 生成一套）而且 OO 多重继承那一套就是走火如魔了。

S*A2011-01-03 08:01

146 楼

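Those will take some time to develop. Go's design and model are both quite good and give it some advantages.
Once more people use it, the libraries will naturally follow.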

【在 r*****z 的大作中提到】

: 我也很喜欢go，用它写了一些代码
: 目前对我来说，主要的问题是缺少必要的科学计算库。用C/C++的时候就
: 比较依赖于gsl，所以改用go的时候不得不用go的C库接口，太麻烦了

L*n2011-01-03 08:01

147 楼

I think Java can do scientific computation. There were some benchmarks that
I can't find now, but I remember Java performed decently: comparable to, and
sometimes even better than, C++ and Fortran (albeit rarely). I'm playing with a
language called Scala that runs on the JVM. I'm quite satisfied with its
performance (my work mainly involves manipulating small matrices), but I
haven't done any benchmarks, so I can't compare it with C++.

【在 N********n 的大作中提到】

: Get yourself a copy of Robert Morgan's "Building an Optimizing Compiler"
: and read the abnormal edge part. I don't have time to type it down here.

S*A2011-01-03 08:01

148 楼

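Java does do fairly well in benchmarks. It depends entirely on the kind of workload.
If the hot path is very concentrated, as in matrix operations where most of the
time goes into a small piece of code, the Java JIT can optimize that code
dynamically based on what actually happens at run time. These JIT optimizations
include some that C++ cannot do. For example, if it observes that through several
layers of calls the type is always A, it can generate code specialized for A and
replace the function table lookups with direct calls to A's methods. Before
entering the hot path it checks the type first: if it is A, the optimized code
handles it; if not, the generic slow version does.
But if the hot path is not concentrated, or the run is very short, as with a C++
compiler that compiles a program in one second, deep JIT work actually slows down
the total run time. And compiling the next file starts the JIT optimization all
over again, repeating the work and wasting electricity.
That is one reason I don't like languages that rely on JIT as their main technique.
Another reason is that if the JIT gets something wrong, it is impossible to
debug. C compilers rarely miscompile, but use them enough and you will run into
it. By reading the generated code you can still work out where the compiler went
wrong. If the JIT is wrong, there is nothing to look at. All you can say is that
writing it this way fails and writing it another way works, without knowing why.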

【在 L***n 的大作中提到】

: I think java can do scientifc computations, there were some benchmarks that
: I can't find now but I remember java did perform decently, comparable
: sometimes
: even better than C++ and Fortran(albelit rarely). I'm playing with a
: language
: called scala that is running on the JVM. I'm quite safisfied with its
: performance(my work mainly invloves manipulating small matrices), but I
: didn't do any benchmark so can't compare it with C++.

S*d2011-01-03 08:01

149 楼

Linus, for example...

【在 S*A 的大作中提到】

: 好吧，RMS 是 CS 科班，但不是 compiler 背景的。
: RMS 比较适合做精神领袖，技术挂帅上品味不是很行。
: 看看 GNU Coding Style. 非常不爽。
: 那个 Hurd 搞了半天难产最后被 Linux 抢走了。
: Gcc 内部结构长期很糟糕，政治斗争太多。被 llvm 严重赶超，
: 照这个速度，以后地位不保。
: 我觉得做 Open source maintainer 技术上最难得，最难学到
: 的是品味。这个品味的上限基本上是天生的。
: 我发现比较好的maintainer 都有比较包容的心态。比较偏激的
: 都做不好。太面的也做不好。

S*A2011-01-03 08:01

150 楼

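Linus is my idol. His taste is superb, and he is very good at managing.
The Linux kernel doesn't have the fierce political infighting the BSDs have, and it develops fast.
Linus has a lot to do with that.

[Quoting S***d:]

:
: Linus, for example...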

N*w2011-01-03 08:01

151 楼

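Does he read all those patches closely?
That is no ordinary brain.

[Quoting S*A:]

: Linus is my idol. His taste is superb, and he is very good at managing.
: The Linux kernel doesn't have the fierce political infighting the BSDs have, and it develops fast.
: Linus has a lot to do with that.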

o*n2011-01-03 08:01

152 楼

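I'm an outsider, so here is a naive question: there are only a few operating systems in common use now, so what is
the point of developing these VM-based JIT languages? What is so hard about compiling the code into an executable on each OS
and shipping that? Even for programs with a UI, cross-platform widget toolkits like Qt work very well. And as for being quick
to write, Java isn't as pleasant as Python and the like. Every time I use a Java program and clicking a menu hangs for ages,
I curse it inwardly. Unless there is really no alternative I basically don't touch Java programs, and I turn Java off in
OpenOffice as well.

[Quoting S*A:]

: Java does do fairly well on benchmarks, but that depends entirely on the kind of workload.
: If the hot path is concentrated, as in matrix computation where most of the time
: is spent in a very small piece of code, the Java JIT can optimize that code
: dynamically based on what actually happens at run time. These JIT optimizations
: include some that C++ cannot do. For example, if it observes that across these
: few call levels the type is always A, it can generate code specialized for A,
: with the function table lookups replaced by direct calls to A's methods. Before
: entering the hot path it checks first: if the type is A, the optimized code
: handles it; if not, the generic slow version does.
: But if the hot path is not concentrated, or the run is short, for example
: a C++ compiler that compiles a program in one second, this deep JIT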

r*z2011-01-03 08:01

153 楼

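For commercial companies it matters a lot.
For example, MathWorks has to maintain MATLAB on Unix, Linux, OS X and Windows at the same time; using Java
can save a lot of cost.

[Quoting o**n:]

: I'm an outsider, so here is a naive question: there are only a few operating systems in common use now, so what is
: the point of developing these VM-based JIT languages? What is so hard about compiling the code into an executable on each OS
: and shipping that? Even for programs with a UI, cross-platform widget toolkits like Qt work very well. And as for being quick
: to write, Java isn't as pleasant as Python and the like. Every time I use a Java program and clicking a menu hangs for ages,
: I curse it inwardly. Unless there is really no alternative I basically don't touch Java programs, and I turn Java off in
: OpenOffice as well.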

S*A2011-01-03 08:01

154 楼

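Actually you are quite right. A lot of this is hype.
Java could make this claim at the time mainly because Java was brand new, whereas C carried
decades of history, from 8-bit machines to 32-bit machines. The language itself is actually easy
to port, and C is easier to port than Java, since many platforms have no Java at all.
What is hard to port is the libraries. For example, Unix has fork and Windows does not.
Java rewrote its own set of libraries to absorb these differences.
But this won't last. When machines get to 128 bits, Java will face the same problem of
32-, 64- and 128-bit integers, just as C once had 8-, 16- and 32-bit integers.
Also, Java's own libraries become the bottleneck. For example, what if I want the kind of
semi-transparent windows that Windows has? Java draws a very hard line between Java and non-Java;
once you step outside Java, calling into C or Python is a hassle.
So overall I think Java is hype.

[Quoting o**n:]

: I'm an outsider, so here is a naive question: there are only a few operating systems in common use now, so what is
: the point of developing these VM-based JIT languages? What is so hard about compiling the code into an executable on each OS
: and shipping that? Even for programs with a UI, cross-platform widget toolkits like Qt work very well. And as for being quick
: to write, Java isn't as pleasant as Python and the like. Every time I use a Java program and clicking a menu hangs for ages,
: I curse it inwardly. Unless there is really no alternative I basically don't touch Java programs, and I turn Java off in
: OpenOffice as well.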

S*A2011-01-03 08:01

155 楼

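Another one who has been taken in. You do know MATLAB isn't written in Java, right?
Your example is quite problematic. If MATLAB were much slower than it is now and used much more memory,
it would no longer be a question of saving cost; it would be a question of wasting money on a rewrite.
You know Corel WordPerfect and its office suite? They were taken in,
developed their next-generation product in Java, it didn't turn out as expected, and the product
simply died. Otherwise they might still be competing with Adobe for a share today.
The example you gave is exactly what Java is not good at. MATLAB itself is an interpreter:
the code is spread widely, it uses lots of small objects, it mallocs and frees all the time, and it
demands speed. The Java JIT is magic. If performance falls short, how do you tune it?
With something written in C you can at least profile it and speed up a spot here and there with
assembly MMX instructions. When a magic system stops working, you are completely stuck.
If you count debugging on different platforms, tuning the memory footprint and tuning execution speed,
it no longer saves anything.
Today Java is used most in business applications, where the demands on memory and performance
are not high. The code is all hidden away in extremely powerful servers. Not getting segfaults
the way C does is what matters there.

[Quoting r*****z:]

: For commercial companies it matters a lot.
: For example, MathWorks has to maintain MATLAB on Unix, Linux, OS X and Windows at the same time; using Java
: can save a lot of cost.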

r*z2011-01-03 08:01

156 楼

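Of course I know MATLAB's core isn't Java.
MATLAB is a very typical commercial case of using Java for the GUI to save cross-platform cost.
For software like MATLAB, whose GUI doesn't need to be particularly pretty, this is a very suitable approach.
In fact, besides MATLAB, quite a few multi-platform scientific computing packages are designed this way.
Surely you know that?

[Quoting S*A:]

: Another one who has been taken in. You do know MATLAB isn't written in Java, right?
: Your example is quite problematic. If MATLAB were much slower than it is now and used much more memory,
: it would no longer be a question of saving cost; it would be a question of wasting money on a rewrite.
: You know Corel WordPerfect and its office suite? They were taken in,
: developed their next-generation product in Java, it didn't turn out as expected, and the product
: simply died. Otherwise they might still be competing with Adobe for a share today.
: The example you gave is exactly what Java is not good at. MATLAB itself is an interpreter:
: the code is spread widely, it uses lots of small objects, it mallocs and frees all the time, and it
: demands speed. The Java JIT is magic. If performance falls short, how do you tune it?
: With something written in C you can at least profile it and use assembly MMX instructions in places to speed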

S*A2011-01-03 08:01

157 楼

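Actually I'm not familiar with scientific computing, and not very familiar with the GUI software there.
I don't even use MATLAB: no license, and no need for it.
I can imagine that, used that way, it is just about good enough.
Nowadays there are wxWindows and Qt; cross-platform GUI toolkits are plentiful.
I almost never deal with GUIs. The only bit I have written, an object
viewer tool, was also written in C.

[Quoting r*****z:]

: Of course I know MATLAB's core isn't Java.
: MATLAB is a very typical commercial case of using Java for the GUI to save cross-platform cost.
: For software like MATLAB, whose GUI doesn't need to be particularly pretty, this is a very suitable approach.
: In fact, besides MATLAB, quite a few multi-platform scientific computing packages are designed this way.
: Surely you know that?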

S*A2011-01-03 08:01

158 楼

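This also shows that the MATLAB people weren't so easily talked into writing the core
in Java. Actually, for what MATLAB needs, wxPython, PyGTK or the like
would be enough, and simpler too.

[Quoting r*****z:]

: Of course I know MATLAB's core isn't Java.
: MATLAB is a very typical commercial case of using Java for the GUI to save cross-platform cost.
: For software like MATLAB, whose GUI doesn't need to be particularly pretty, this is a very suitable approach.
: In fact, besides MATLAB, quite a few multi-platform scientific computing packages are designed this way.
: Surely you know that?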

r*z2011-01-03 08:01

159 楼

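Back when MATLAB switched to Java for its GUI, Java was basically the only viable solution.

[Quoting S*A:]

: This also shows that the MATLAB people weren't so easily talked into writing the core
: in Java. Actually, for what MATLAB needs, wxPython, PyGTK or the like
: would be enough, and simpler too.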

z*y2011-01-03 08:01

160 楼

It is not easy. But check out the Singularity OS from Microsoft. The
kernel runtime is GCed.

[Quoting S*A:]

: I have to politely decline GC in writing OS.
: GC has a lot of problem make it very hard to write OS.
: First of all, GC need to have the memory mmaped in the
: kernel in order to do object scans. You can not GC object
: swap out to the disk. You are likely to call GC when
: you are low on free pages. But you need to find more pages
: to mmap those in order to clean some out. You can't get
: a correct usage count if there are some page is missing.
: That is a chicken and egg problem and asking for swap
: storm behavior.

S*A2011-01-03 08:01

161 楼

There is a Java OS as well. That it can be done doesn't mean it
makes sense.

[Quoting z***y:]

: It is not easy. But check out the Singularity OS from Microsoft. The
: kernel runtime is GCed.

r*n2011-01-03 08:01

162 楼

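I feel the same way. The user experience of GUI software written in Java is very unpleasant.
The JVM's characteristic is that it starts slowly but is fairly fast once running, so running
on back-end servers is no problem. But I think the Java language's syntax is nicely designed,
concise and consistent. Why not drop the JVM and compile directly to executable code? Would that be feasible?
Speaking of Java being cross-platform, Android's choice of Java for its GUI system was presumably not
about portability. Does it degrade the user experience, or is the Dalvik virtual machine
just very powerful?

[Quoting o**n:]

: I'm an outsider, so here is a naive question: there are only a few operating systems in common use now, so what is
: the point of developing these VM-based JIT languages? What is so hard about compiling the code into an executable on each OS
: and shipping that? Even for programs with a UI, cross-platform widget toolkits like Qt work very well. And as for being quick
: to write, Java isn't as pleasant as Python and the like. Every time I use a Java program and clicking a menu hangs for ages,
: I curse it inwardly. Unless there is really no alternative I basically don't touch Java programs, and I turn Java off in
: OpenOffice as well.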

S*A2011-01-03 08:01

163 楼

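GCC's Java front end (gcj) compiles to native executables. But some
Java standard classes are not supported yet.

[Quoting r*******n:]

: I feel the same way. The user experience of GUI software written in Java is very unpleasant.
: The JVM's characteristic is that it starts slowly but is fairly fast once running, so running
: on back-end servers is no problem. But I think the Java language's syntax is nicely designed,
: concise and consistent. Why not drop the JVM and compile directly to executable code? Would that be feasible?
: Speaking of Java being cross-platform, Android's choice of Java for its GUI system was presumably not
: about portability. Does it degrade the user experience, or is the Dalvik virtual machine
: just very powerful?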

j*a2011-01-03 08:01

164 楼

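Java isn't necessarily bad. One program I use is quite fast.
When people say Java is slow, the program probably wasn't written well (that includes user experience: how to make slow things appear fast).

[Quoting r*******n:]

: I feel the same way. The user experience of GUI software written in Java is very unpleasant.
: The JVM's characteristic is that it starts slowly but is fairly fast once running, so running
: on back-end servers is no problem. But I think the Java language's syntax is nicely designed,
: concise and consistent. Why not drop the JVM and compile directly to executable code? Would that be feasible?
: Speaking of Java being cross-platform, Android's choice of Java for its GUI system was presumably not
: about portability. Does it degrade the user experience, or is the Dalvik virtual machine
: just very powerful?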

r*n2011-01-03 08:01

165 楼

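Eclipse is rather slow on my machine (a T61). Slow startup I can live with, but opening a file (8000 lines) also makes me wait.
Note: I didn't write Eclipse, heh.

[Quoting j*a:]

: Java isn't necessarily bad. One program I use is quite fast.
: When people say Java is slow, the program probably wasn't written well (that includes user experience: how to make slow things appear fast).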

S*A2011-01-03 08:01

166 楼

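The system-config-xxx tools that Red Hat wrote don't feel slow to use either,
and they are all written in PyGTK. It depends on what you use it for. The plain GUI
part isn't demanding. The core part is more demanding.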

o*n2011-01-03 08:01

167 楼

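That is exactly why I haven't used MATLAB's GUI for ages and stick to the command line. Luckily MATLAB
plotting doesn't use Java, otherwise I'd have to face Java every day :)

[Quoting S*A:]

: Actually I'm not familiar with scientific computing, and not very familiar with the GUI software there.
: I don't even use MATLAB: no license, and no need for it.
: I can imagine that, used that way, it is just about good enough.
: Nowadays there are wxWindows and Qt; cross-platform GUI toolkits are plentiful.
: I almost never deal with GUIs. The only bit I have written, an object
: viewer tool, was also written in C.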

S*I2011-01-03 08:01

168 楼

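Eclipse is basically the slowest-starting application on the Mac; even Photoshop is faster.

[Quoting r*******n:]

: Eclipse is rather slow on my machine (a T61). Slow startup I can live with, but opening a file (8000 lines) also makes me wait.
: Note: I didn't write Eclipse, heh.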

d*q2011-01-03 08:01

169 楼

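Are you talking about Stackless Python?
It is a good one. Many companies have used it successfully to implement web
servers, online game servers and so on. It is good for non-blocking use.
The downside, however, is that the GIL is still a barrier.

[Quoting S*A:]

: The two can be converted into each other.
: When generating statically compiled machine code, a register form is easier to optimize.
: For interpretation, a stack is easier to write. Python and Lua are both stack based.
: Someone once wanted to convert Python to be stackless; that project seems to have fizzled out.
: Interpreting a register form directly brings a lot of troublesome problems, such as
: register reclamation.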

d*q2011-01-03 08:01

170 楼

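Did computer science departments even exist in RMS's day??
Early programmers mostly seem to have come over from physics and math.

[Quoting w***g:]

: Saying RMS didn't come from a CS background doesn't seem right to me. Strictly speaking he didn't, but after all
: he was at the MIT AI Lab.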

n*w2011-01-03 08:01

171 楼

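To add: the Java VM that Google uses on Android is register based.

[Quoting S*A:]

: You can call a compiler back end's internal representation a VM; the proper name is IR.
: Java should be stack based. That is very hard to change now.
: Apple has moved everything to LLVM. LLVM is hot now and will get hotter.
: GCC is too hard to hack. A lot of early GCC started with RMS. RMS is full of
: enthusiasm but did not come from a CS background, and much of it was built without good theory
: to guide it. RMS is very good at watching how a piece of software works and figuring out how to write one himself.
: But compilers have a fairly deep theoretical background, and for a long time GCC was in a very backward
: state. Many optimizations happened at the wrong stage. GCC's internal representation
: looks like Lisp, because RMS is especially fond of Lisp and thinks everything in the world should
: look like Lisp. GCC supports a great many machine types, so this kind of structural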

c*v2011-01-03 08:01

172 楼

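I've always thought that if Python code could be converted to Go and then compiled to native code, that would be
a good way to speed Python up. I say this mainly because of a certain similarity between Go and Python, and because Go
compiles very quickly.

[Quoting S*A:]

: I've been learning Google Go recently and think it's pretty good. I boldly predict it will take off.
: My overall impression is that it fills the gap between C and Python.
: Like C, it generates machine code directly, which is great. On this point alone it beats Java, C#
: and the like. I expect that once it matures it should be about as fast as C++. I've always wanted
: a C-like language where dictionary and array abstract data types can be used directly.
: The closest I had learned before was Objective-C. Objective-C is actually quite nice,
: just a bit verbose to write. The biggest problem is that outside OS X there is nowhere to use it.
: Python, Lua and the Java family are all too slow; everything is a boxed type, so they can't be fast.
: The C++ template machinery is too complicated and generates a lot of unnecessary code (one copy
: per basic type), and the OO multiple-inheritance business has gone off the deep end.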

S*A2011-01-03 08:01

173 楼

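It won't work, because Python is dynamically typed and Go is strongly typed.
There is no direct conversion here. Because of the dynamic typing, no shortcuts can be taken.

[Quoting c****v:]

: I've always thought that if Python code could be converted to Go and then compiled to native code, that would be
: a good way to speed Python up. I say this mainly because of a certain similarity between Go and Python, and because Go
: compiles very quickly.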

r*t2011-01-03 08:01

174 楼

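Dynamically typed and strongly typed are not contradictory.

[Quoting S*A:]

: It won't work, because Python is dynamically typed and Go is strongly typed.
: There is no direct conversion here. Because of the dynamic typing, no shortcuts can be taken.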

S*A2011-01-03 08:01

175 楼

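Care to elaborate?
Here is my understanding; tell me what is wrong with it.
Take the expression a+b.
In C, with int a and int b, it compiles to:
r1 = load &a
r2 = load &b
r3 = add r1, r2
Not counting the loads, that is a single assembly instruction.
In Python, the types of a and b are not known.
a might be an int, or it might be a custom type with an __add__(self, b)
method.
Leave aside that a and b need a dictionary lookup. Suppose the two PyObject *
for a and b are already in hand. Since you don't know what type each PyObject points to,
you have to check, and interpret it like this (pseudocode):
if (a->type == &intType && b->type == &intType) {
    // Fast path for normal int.
    sum = (int)a->value + (int)b->value;
    c = Py_BuildValue("i", sum);
} else {
    // Slow path: look up __add__ and call it.
    func = PyObject_GetAttrString(a, "__add__") ?:
           PyObject_GetAttrString(b, "__add__");
    args = Py_BuildValue("OO", a, b);
    c = PyObject_CallObject(func, args);
}
This can easily run to thousands of instructions, because functions like getattr
can go very deep.
Otherwise, what clever translation of Python "a + b" could there be?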
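
The same fast-path/slow-path dispatch can be modelled in plain Python. This is a simplified sketch, not CPython's actual implementation: the function `dynamic_add` and the class `Meters` are made up for illustration, and the real rules (for example the priority given to subclasses of the left operand's type) are more involved.

```python
def dynamic_add(a, b):
    # Fast path: both operands are exactly int.
    if type(a) is int and type(b) is int:
        return int.__add__(a, b)

    # Slow path: look up __add__ on the left operand's type ...
    add = getattr(type(a), "__add__", None)
    if add is not None:
        result = add(a, b)
        if result is not NotImplemented:
            return result

    # ... then fall back to __radd__ on the right operand's type.
    radd = getattr(type(b), "__radd__", None)
    if radd is not None:
        result = radd(b, a)
        if result is not NotImplemented:
            return result

    raise TypeError(
        f"unsupported operand types: {type(a).__name__} and {type(b).__name__}"
    )


class Meters:
    """Hypothetical custom type that defines its own __add__."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        return NotImplemented
```

Even in this small model, every addition that misses the fast path pays for attribute lookups and extra calls, which is the cost the post is pointing at.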

[Quoting r****t:]

: Dynamically typed and strongly typed are not contradictory.

r*t2011-01-03 08:01

176 楼

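What you are comparing is static typed vs. duck typed, and showing from the implementation side that CPython's
duck typing needs more instructions. This example does not show that dynamic typed and strong typed cannot both
be true; those are two separate topics.
On the first topic, see RPython.
On the second, my point is that 1+'2' raises TypeError. A variable name in Python and a variable in C are
very different concepts; a Python name is closer to (Type*) void*, so they are hard to compare.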
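
A minimal sketch of the distinction, for illustration only (the helper `try_add` is made up here): a name can be rebound to a value of any type, which is the dynamic part, while mixing incompatible types in an operation is rejected instead of being silently converted, which is the strong part.

```python
x = 1        # the name x is bound to an int
x = "two"    # rebinding the same name to a str is fine: typing is dynamic


def try_add(a, b):
    # Returns a + b, or a short description of the TypeError if the
    # operand types are incompatible.
    try:
        return a + b
    except TypeError as exc:
        return f"TypeError: {exc}"


int_sum = try_add(1, 2)        # both int: ordinary addition
str_sum = try_add("1", "2")    # both str: concatenation
mixed = try_add(1, "2")        # int + str: rejected, no implicit conversion
```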

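[Quoting S*A:]

: Care to elaborate?
: Here is my understanding; tell me what is wrong with it.
: Take the expression a+b.
: In C, with int a and int b, it compiles to:
: r1 = load &a
: r2 = load &b
: r3 = add r1, r2
: Not counting the loads, that is a single assembly instruction.
: In Python, the types of a and b are not known.
: a might be an int, or it might be a custom type with an __add__(self, b)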

S*A2011-01-03 08:01

177 楼

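Right. If you want to illustrate this, the simplest example is a Python C module.
Every Python C module shows that static types can coexist with dynamic types.
A C type is a static type. PyObject is the dynamic type.
What I said in my original post is that Python's dynamic typing is the reason it runs slowly.
Translating to Go code cannot gain anything that C code cannot gain. It might just look a little
nicer.
Exactly: someone asked why not translate Python code to Go code and make it fast.
I was just explaining that such a translation brings no fundamental improvement, unless the
translation restricts the dynamic types to static types.

[Quoting r****t:]

: What you are comparing is static typed vs. duck typed, and showing from the implementation side that CPython's
: duck typing needs more instructions. This example does not show that dynamic typed and strong typed cannot both
: be true; those are two separate topics.
: On the first topic, see RPython.
: On the second, my point is that 1+'2' raises TypeError. A variable name in Python and a variable in C are
: very different concepts; a Python name is closer to (Type*) void*, so they are hard to compare.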

z*y2011-01-03 08:01

178 楼

You confused static typed with strong typed. Python is strongly typed
but dynamic.

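[Quoting S*A:]

: Care to elaborate?
: Here is my understanding; tell me what is wrong with it.
: Take the expression a+b.
: In C, with int a and int b, it compiles to:
: r1 = load &a
: r2 = load &b
: r3 = add r1, r2
: Not counting the loads, that is a single assembly instruction.
: In Python, the types of a and b are not known.
: a might be an int, or it might be a custom type with an __add__(self, b)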

S*A2011-01-03 08:01

179 楼

My reasoning that dynamic typing can't be compiled into efficient code
still applies.
Yes, you are right, I confused strong typing with static typing.
Thanks for pointing it out.

[Quoting z***y:]

: You confused static typed with strong typed. Python is strongly typed
: but dynamic.

S*A2011-01-03 08:01

180 楼

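FT, I did not get what you meant until zllwy pointed out that I had mixed up
strong typing and static typing. My bad.

[Quoting r****t:]

: What you are comparing is static typed vs. duck typed, and showing from the implementation side that CPython's
: duck typing needs more instructions. This example does not show that dynamic typed and strong typed cannot both
: be true; those are two separate topics.
: On the first topic, see RPython.
: On the second, my point is that 1+'2' raises TypeError. A variable name in Python and a variable in C are
: very different concepts; a Python name is closer to (Type*) void*, so they are hard to compare.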

n*t2011-01-03 08:01

181 楼

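I don't find LLVM better than GCC... at least for now...

[Quoting S*A:]

: OK, RMS did come from a CS background, but not a compiler background.
: RMS is better suited to being a spiritual leader; as a technical leader his taste is not that good.
: Look at the GNU Coding Style. Very unpleasant.
: Hurd struggled for ages without shipping, and in the end Linux took its place.
: GCC's internal structure was bad for a long time, with too much politics. LLVM is overtaking it badly,
: and at this rate GCC will not hold its position.
: I think that for an open source maintainer, the hardest thing to come by technically, and the hardest
: to learn, is taste. The upper limit of that taste is basically innate.
: I have found that the better maintainers all have a fairly tolerant attitude. The more extreme ones
: never do well. The overly soft ones don't do well either.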

S*A2011-01-03 08:01

182 楼

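Its internal structure is good. If you want to write something JIT-based, it is easy with LLVM
and nearly impossible with GCC. Writing your own new optimization pass is also
easy in LLVM.

[Quoting n******t:]

: I don't find LLVM better than GCC... at least for now...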