Question: MATLAB performance on different machines
Board: Computation (Scientific Computing)
f*n
1
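Cross-posted from the Linux board, because I hear there are more experts here :-)
Note: I am asking purely out of curiosity, not about which software to use for this
particular problem (though remarks on that are welcome too).
Hi, guys:
I just wrote a very simple script to compare the performance of MATLAB: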
% svd.m
for i=1:1000
    for j=1:1000
        m(i,j)=1/(i+j);
    end
end
t0=cputime;
svd(m); % compute the singular value decomposition (SVD)
cputime-t0
Test results:
1) P4 1.8G A: 6.7 s
2) Barton running at 1.4 GHz: 9.3 s
3) Sun (can anybody tell me how to check the CPU type of a Sun machine?): 21.0 s
(although it is a public machine, at least 60% of its CPU time is idle, so I assume my
measur
n*t
2
The Sun really is slow.
Anyway, you should vectorize your code.
Loops in MATLAB are really slow.
Just try this comparison:
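tic;
% vectorized: build the index grids once, then divide element-wise
i=[1:1000];
j=[1:1000];
[X, Y] = meshgrid(i, j);
n = 1./(X+Y);
toc
tic;
% loop: fill the matrix one element at a time
for i=1:1000
    for j=1:1000
        m(i,j)=1/(i+j);
    end
end
toc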
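
A minimal sketch extending the comparison above, assuming a reasonably recent MATLAB. It adds a third variant (the same loop, but with the matrix preallocated by `zeros`) and checks that all three approaches produce the same matrix. The variable names `tGrow`, `tPre`, `tVec`, `p`, and `maxDiff` are illustrative and do not come from the thread.

```matlab
N = 1000;

% 1) Loop, matrix grown element by element (as in the post above)
clear m
tic;
for i = 1:N
    for j = 1:N
        m(i,j) = 1/(i+j);
    end
end
tGrow = toc;

% 2) Same loop, but with the matrix preallocated
p = zeros(N);
tic;
for i = 1:N
    for j = 1:N
        p(i,j) = 1/(i+j);
    end
end
tPre = toc;

% 3) Vectorized with meshgrid (as in the post above)
tic;
[X, Y] = meshgrid(1:N, 1:N);
n = 1 ./ (X + Y);
tVec = toc;

% All three should describe the same matrix
maxDiff = max([max(abs(m(:) - p(:))), max(abs(p(:) - n(:)))]);
fprintf('grown loop: %.3f s, preallocated loop: %.3f s, vectorized: %.3f s\n', ...
        tGrow, tPre, tVec);
fprintf('largest difference between results: %g\n', maxDiff);
```

How large the gap is depends on the MATLAB version and the machine, so the timings are worth measuring rather than assuming.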

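[Quoting f***n:]
: Cross-posted from the Linux board, because I hear there are more experts here :-)
: Note: I am asking purely out of curiosity, not about which software to use for this
: particular problem (though remarks on that are welcome too).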
: Hi, guys:
: Just write a very simple code to compare perf. of Matlab:
: //svd.m
: for i=1:1000
: for j=1:1000
: m(i,j)=1/(i+j);
: end;

n*t
3
SPARC's floating-point performance is not that good,
so it is quite common to see i386 beat it.
The P4 changed the implementation of a lot of instructions,
so the time needed for certain instructions has changed.
This causes problems for optimization.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
That is tradition. hehe.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Yes, RISC CPUs are always better at this.

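[Quoting f***n:]
: Cross-posted from the Linux board, because I hear there are more experts here :-)
: Note: I am asking purely out of curiosity, not about which software to use for this
: particular problem (though remarks on that are welcome too).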
: Hi, guys:
: Just write a very simple code to compare perf. of Matlab:
: //svd.m
: for i=1:1000
: for j=1:1000
: m(i,j)=1/(i+j);
: end;

a*a
4
The P4 has SSE2, which can perform multiple floating-point calculations at the same
time (2 for double, 4 for float).
If the code is written appropriately, this gives a huge speed-up.
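
One rough way to look at the double versus float difference from inside MATLAB is to time the same SVD in both precisions. This is only a sketch: it assumes the underlying linear algebra library actually uses SIMD instructions, and the measured ratio depends on the library, the CPU, and memory effects, so it is not a direct measurement of SSE2. The variable names are illustrative.

```matlab
N = 1000;
[X, Y] = meshgrid(1:N, 1:N);
mDouble = 1 ./ (X + Y);      % double precision (MATLAB default)
mSingle = single(mDouble);   % same matrix stored in single precision

tic; sDouble = svd(mDouble); tDouble = toc;
tic; sSingle = svd(mSingle); tSingle = toc;

fprintf('double: %.3f s, single: %.3f s\n', tDouble, tSingle);

% The largest singular value should agree to roughly single precision
relDiff = abs(double(sSingle(1)) - sDouble(1)) / sDouble(1);
fprintf('relative difference in largest singular value: %g\n', relDiff);
```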


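[Quoting f***n:]
: Cross-posted from the Linux board, because I hear there are more experts here :-)
: Note: I am asking purely out of curiosity, not about which software to use for this
: particular problem (though remarks on that are welcome too).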
: Hi, guys:
: Just write a very simple code to compare perf. of Matlab:
: //svd.m
: for i=1:1000
: for j=1:1000
: m(i,j)=1/(i+j);
: end;

c*e
5
I just ran the program on my machines.
The results are averages over 5 runs.
P4 2.4 GHz HT: 4.125 s
Athlon XP 1600+: 14.017 s
A huge difference.
One of the reasons is that the P4 machine uses dual-channel DDR400.
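
A minimal sketch of how such a 5-run average might be collected, recording both CPU time (`cputime`, as in the original script) and wall-clock time (`tic`/`toc`). The two can differ, for example on a machine shared with other users. The loop structure and variable names are assumptions for illustration, not the poster's actual script.

```matlab
N = 1000;
[X, Y] = meshgrid(1:N, 1:N);
m = 1 ./ (X + Y);            % same test matrix as in the thread

numRuns = 5;
cpuTimes  = zeros(1, numRuns);
wallTimes = zeros(1, numRuns);

for k = 1:numRuns
    t0 = cputime;            % CPU time used by MATLAB so far
    tStart = tic;            % wall-clock timer
    s = svd(m);              % singular values only
    wallTimes(k) = toc(tStart);
    cpuTimes(k)  = cputime - t0;
end

fprintf('mean CPU time:  %.3f s\n', mean(cpuTimes));
fprintf('mean wall time: %.3f s\n', mean(wallTimes));
```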

[Quoting a*******a:]
: The P4 has SSE2, which can perform multiple floating-point calculations at the same
: time (2 for double, 4 for float).
: If the code is written appropriately, this gives a huge speed-up.

S*y
6

my
~~~~~~~~~
Where? I don't think an Athlon can beat a P4C on large linear algebra problems.
Actually, I have never seen any single CPU beat the P4 on such problems.
Of course, I am referring to the latest CPUs:
Alpha, Itanium, Power4, SGI, SUN ....
Maybe Opteron can, but I have never had a chance to try one. :-)
Probably nothing good... for high-performance computing.

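[Quoting f***n:]
: Cross-posted from the Linux board, because I hear there are more experts here :-)
: Note: I am asking purely out of curiosity, not about which software to use for this
: particular problem (though remarks on that are welcome too).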
: Hi, guys:
: Just write a very simple code to compare perf. of Matlab:
: //svd.m
: for i=1:1000
: for j=1:1000
: m(i,j)=1/(i+j);
: end;

d*d
7
I got an average of 5.1 s for that code on a P4 2.4G with 256M RDRAM.
The same P4 is faster than the Sun in my office too.
The Sun has 1G or 2G of memory, but it is shared by several
users.

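[Quoting f***n:]
: Cross-posted from the Linux board, because I hear there are more experts here :-)
: Note: I am asking purely out of curiosity, not about which software to use for this
: particular problem (though remarks on that are welcome too).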
: Hi, guys:
: Just write a very simple code to compare perf. of Matlab:
: //svd.m
: for i=1:1000
: for j=1:1000
: m(i,j)=1/(i+j);
: end;

d*d
8
This is what I get from running
"bench":
0.6100 1.0150 0.4380 0.5930 0.9070 1.8440

[Quoting d*******d:]
: I got an average of 5.1 s for that code on a P4 2.4G with 256M RDRAM.
: The same P4 is faster than the Sun in my office too.
: The Sun has 1G or 2G of memory, but it is shared by several
: users.

c*e
9
I ran it five times; the averages are:
0.5160 1.0630 0.4690 0.5780 0.9060 0.7030

[Quoting d*******d:]
: This is what I get by
: "bench"
: 0.6100 1.0150 0.4380 0.5930 0.9070 1.8440

d*d
10
Yes, from the second run on, it is faster.
Here are my results from 10 runs:
0.5320 1.1250 0.4380 0.6100 0.9220 1.0000
0.5310 1.0940 0.4530 0.6100 0.9690 0.8750
0.5320 1.3590 0.4840 0.6090 0.8910 0.5160
0.5470 1.1090 0.4690 0.7500 0.8910 0.5000
0.6410 1.0940 0.4690 0.6410 0.9060 0.5470
0.5160 1.0940 0.4530 0.6720 0.9220 0.5620
0.5620 1.1250 0.4530 0.6090 1.8430 0.7040
0.53
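
A small sketch of how the repeated "bench" rows could be summarized. It uses only the seven complete rows quoted above (the remaining rows are cut off in the post) and reports both the mean and the median of each column. The median is an editorial suggestion here, chosen because it is less affected by a single slow run such as the 1.8430 in the fifth column.

```matlab
% Seven complete rows copied from the post above (one row per bench run)
t = [0.5320 1.1250 0.4380 0.6100 0.9220 1.0000
     0.5310 1.0940 0.4530 0.6100 0.9690 0.8750
     0.5320 1.3590 0.4840 0.6090 0.8910 0.5160
     0.5470 1.1090 0.4690 0.7500 0.8910 0.5000
     0.6410 1.0940 0.4690 0.6410 0.9060 0.5470
     0.5160 1.0940 0.4530 0.6720 0.9220 0.5620
     0.5620 1.1250 0.4530 0.6090 1.8430 0.7040];

colMean   = mean(t, 1);     % average of each column over the runs
colMedian = median(t, 1);   % median of each column, robust to outliers

disp('column means:');   disp(colMean);
disp('column medians:'); disp(colMedian);
```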
f*n
11

Cache size matters, I think. At the very least you should compare with a Barton (with
512 kB of L2 cache).
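
For scale, a back-of-the-envelope calculation of how the test matrix compares with a 512 kB L2 cache. It assumes the matrix is stored in double precision (MATLAB's default, 8 bytes per element) and that 1 kB means 1024 bytes. It says nothing about how the SVD routine actually accesses memory.

```matlab
N = 1000;
bytesPerDouble = 8;                     % assumes double precision
matrixBytes = N * N * bytesPerDouble;   % storage for the 1000-by-1000 matrix
l2Bytes = 512 * 1024;                   % assumes 1 kB = 1024 bytes

ratio = matrixBytes / l2Bytes;
fprintf('matrix: %d bytes (%.2f MB), about %.1f times the L2 cache\n', ...
        matrixBytes, matrixBytes / 2^20, ratio);
```

Under these assumptions the matrix is several times larger than the cache, so memory bandwidth may matter as well as cache size.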

[Quoting c*******e:]
: I just ran the program on my machines.
: The results are averages over 5 runs.
: P4 2.4 GHz HT: 4.125 s
: Athlon XP 1600+: 14.017 s
: A huge difference.
: One of the reasons is that the P4 machine uses dual-channel DDR400.
