问个Index的问题 - 未名空间MITBBS历史存档

国际科技财经博客移民网络热点娱乐民生时事公众号

Redian新闻

>未名空间

>Database - 数据库

问个Index的问题

问个Index的问题# Database - 数据库

p*32009-03-13 07:03

1 楼

书法--天苍苍,夜茫茫, 风吹草低现牛羊

s*g2009-03-13 07:03

2 楼

比如说设计一个银行的数据库，
有一个table叫transaction(transactionID, accountID, date, ...)
-------------
这个表显然经常需要更新，主要是一条一条的insert操作
这样看来，为了performance，尽量减少index，那么不应该在accountID上建index
不然，每次insert都需要更新这个index(?)
然而，用户（和银行）也需要查询某账户近期的所有transactions
这样似乎又需要给accountID建一个index
想请教一下实际设计中是怎么做的？谢谢！

K*y2009-03-13 07:03

3 楼

不懂，觉得挺好看的

t*n2009-03-13 07:03

4 楼

Which database are you using? If it is SQL Server, it looks like accountID+
date is a good candidate for the clustered index. In SQL Server, a clustered
index is actually the table itself, so have no fear of inserting.

【在 s******g 的大作中提到】

: 比如说设计一个银行的数据库，
: 有一个table叫transaction(transactionID, accountID, date, ...)
: -------------
: 这个表显然经常需要更新，主要是一条一条的insert操作
: 这样看来，为了performance，尽量减少index，那么不应该在accountID上建index
: 不然，每次insert都需要更新这个index(?)
: 然而，用户（和银行）也需要查询某账户近期的所有transactions
: 这样似乎又需要给accountID建一个index
: 想请教一下实际设计中是怎么做的？谢谢！

i*a2009-03-13 07:03

5 楼

从个人审美看，这幅逸趣横生，契合风吹草低这句里的怡然自得。。。

【在 p*****3 的大作中提到】

: 书法--天苍苍,夜茫茫, 风吹草低现牛羊

z*32009-03-13 07:03

6 楼

这个个人感觉还是一个比较复杂的问题，如果一般的index的话是指b+ tree，并不是每
次insert都会产生balance tree的操作。只有某个node全充满的时候才会，进行
balance tree操作，或许还有left rotation和right rotation的操作。
如果使用mysql的话，可以设定经常插入的数据库的engine是inndo，不要每次insert都
commit，先insert若干条，再commit，然后用master-slave replication把数据同步到
，经常查询的一个数据库，而这个数据库的engine是myisam。
oracle的话，个人感觉可以对date，建立一个bitmap index，这样的话，其实不会有
tree的运算，每次只是没有操作或是更新一下某个gap的up或low bound。然后建一个
job，每隔一段时间dbms_stats.gather_stats,这样的话，可以尽快更新execution
plan,达到更好的效果。

p*32009-03-13 07:03

7 楼

谢谢版大

【在 i***a 的大作中提到】

: 从个人审美看，这幅逸趣横生，契合风吹草低这句里的怡然自得。。。

w*e2009-03-13 07:03

8 楼

不太同意
要是clustered index的话, 经常insert会产生fragmentation的(internal and
external), 对吧?
所以才有rebuild or reorganize index.....
(不是"no fear of insertting")

clustered

【在 t****n 的大作中提到】

: Which database are you using? If it is SQL Server, it looks like accountID+
: date is a good candidate for the clustered index. In SQL Server, a clustered
: index is actually the table itself, so have no fear of inserting.

p*32009-03-13 07:03

9 楼

谢谢KittyGray

【在 K*******y 的大作中提到】

: 不懂，觉得挺好看的

z*32009-03-13 07:03

10 楼

同意，wildhorse的意见，cluster index说穿了还是tree，只要有插入，还是会有
balance tree的操
作，而且如果后update或是insert的话一样是低效的。

【在 w*******e 的大作中提到】

: 不太同意
: 要是clustered index的话, 经常insert会产生fragmentation的(internal and
: external), 对吧?
: 所以才有rebuild or reorganize index.....
: (不是"no fear of insertting")
:
: clustered

a*p2009-03-13 07:03

11 楼

真好！狂赞！

s*e2009-03-13 07:03

12 楼

bitmap index on date field, if date is really timestamp, that is abusing
bitmap index.

q*x2009-03-13 07:03

13 楼

RE，同不懂，同赞，呵呵~

【在 K*******y 的大作中提到】

: 不懂，觉得挺好看的

B*g2009-03-13 07:03

14 楼

minimum index != no index
b-tree index is ok

【在 s******g 的大作中提到】

j*n2009-03-13 07:03

15 楼

that's why there is another thing called fill factor. which can reduce
fragmentation / page split problem.
for me, i'd use combination of transactionID, accountID, and date for the
clustered index. why? most likely transactionID is incremental. it
guarantees new row is appended into the last data page. what about "hot spot
"? it is not a problem for SQL Server 7.0 and later versions if less 1,000
inserts per second.

【在 w*******e 的大作中提到】

w*e2009-03-13 07:03

16 楼

恩, fill factor能"减少"fragmentation, 但是
这是银行的transaction, 你认为多少fill factor合适?

spot

【在 j*****n 的大作中提到】

: that's why there is another thing called fill factor. which can reduce
: fragmentation / page split problem.
: for me, i'd use combination of transactionID, accountID, and date for the
: clustered index. why? most likely transactionID is incremental. it
: guarantees new row is appended into the last data page. what about "hot spot
: "? it is not a problem for SQL Server 7.0 and later versions if less 1,000
: inserts per second.

B*g2009-03-13 07:03

17 楼

不明白。how "guarantees new row is appended into the last data page" will
imporve the search based on accountID?

spot

【在 j*****n 的大作中提到】

j*n2009-03-13 07:03

18 楼

usually start from 80%
i use to worked for a financial firm with maximum transaction more than 1
million per day. 80% fill factor handles well.

【在 w*******e 的大作中提到】

: 恩, fill factor能"减少"fragmentation, 但是
: 这是银行的transaction, 你认为多少fill factor合适?
:
: spot

j*n2009-03-13 07:03

19 楼

less fragmentation will improve performance, right?
accountid is in the clustered index, index seek will perform.

【在 B*****g 的大作中提到】

: 不明白。how "guarantees new row is appended into the last data page" will
: imporve the search based on accountID?
:
: spot

B*g2009-03-13 07:03

20 楼

I thought when create clustered index on TransID, AcctID, Date, the data
will stored based on TransID (I believe it is unique) order, so nothing
ordered based on AcctID.

【在 j*****n 的大作中提到】

: less fragmentation will improve performance, right?
: accountid is in the clustered index, index seek will perform.

j*n2009-03-13 07:03

21 楼

hmm... you right. however, sql serve can still use B tree instead of hash
tree.
if you doubt with that, we can go for non-clustered index with accountid +
date with fill factor = 80%; then reindex nightly. should work.

【在 B*****g 的大作中提到】

: I thought when create clustered index on TransID, AcctID, Date, the data
: will stored based on TransID (I believe it is unique) order, so nothing
: ordered based on AcctID.

gy2009-03-13 07:03

22 楼

我纯属抬杠哈...
即使80% fillfactor, 也仍然会有fragmentation在银行这种high volume transaction
里,
只是时间的早晚而已
所以reindex是不可少的.

【在 j*****n 的大作中提到】

: usually start from 80%
: i use to worked for a financial firm with maximum transaction more than 1
: million per day. 80% fill factor handles well.

gy2009-03-13 07:03

23 楼

恩, 所以建composite index时是很有考虑的

【在 B*****g 的大作中提到】

: I thought when create clustered index on TransID, AcctID, Date, the data
: will stored based on TransID (I believe it is unique) order, so nothing
: ordered based on AcctID.