Algorithm question - 未名空间 MITBBS historical archive

Algorithm question # Programming - 葵花宝典

l*x 2010-01-22 08:01

Post 1

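Algorithm question
A database table has fields a, b, c, d, e, f, and so on. Given one record, I need to find
all records in the table whose fitness score, computed as follows, is greater than or equal to 3.
Fitness score:
1. a matches: +1 point
2. b matches: +1 point
3. c matches: +1 point
4. d matches: +2 points
5. e matches: +0 points; e differs: -2 points
I've thought about this for a long time, and it seems the only option is to compare every
field and compute the score of every record. Does anyone know a better algorithm?
The table holds a lot of data (more than ten thousand rows), so scoring every field of every
record seems inefficient. What would be the fastest, most efficient way to do this?
Thanks!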

i*l 2010-01-22 08:01

Post 2

Write it in SQL?
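
For what it's worth, here is a minimal sketch of how the scoring could be pushed into a single SQL query, run through Python's built-in sqlite3 module so it can be tried without a server. The table name `records`, the `id` column, and the sample rows are all made up for illustration; the thread only names the fields a to f. It still scans every row, but the comparison work happens inside the database.

```python
import sqlite3

# Hypothetical schema: the thread only mentions fields a..f, so the table
# name "records" and the "id" column are assumptions for this sketch.
SCORE_QUERY = """
SELECT id, score FROM (
    SELECT id,
           CASE WHEN a = :a THEN 1 ELSE 0 END
         + CASE WHEN b = :b THEN 1 ELSE 0 END
         + CASE WHEN c = :c THEN 1 ELSE 0 END
         + CASE WHEN d = :d THEN 2 ELSE 0 END
         + CASE WHEN e = :e THEN 0 ELSE -2 END AS score
    FROM records
) AS scored
WHERE score >= 3
ORDER BY score DESC, id
"""


def find_matches(conn, target):
    """Return (id, score) pairs with score >= 3, highest score first."""
    return conn.execute(SCORE_QUERY, target).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, a, b, c, d, e)")
    rows = [
        (1, "x", "y", "z", "w", "v"),  # everything matches: 1+1+1+2+0 = 5
        (2, "x", "y", "z", "w", "Q"),  # e differs: 5-2 = 3
        (3, "x", "-", "-", "w", "v"),  # a and d match: 1+2 = 3
        (4, "x", "y", "-", "-", "v"),  # a and b match: 2, below threshold
        (5, "-", "-", "-", "w", "Q"),  # d matches, e differs: 2-2 = 0
    ]
    conn.executemany("INSERT INTO records VALUES (?, ?, ?, ?, ?, ?)", rows)
    target = {"a": "x", "b": "y", "c": "z", "d": "w", "e": "v"}
    return find_matches(conn, target)


if __name__ == "__main__":
    print(demo())
```

The sketch assumes none of the fields are NULL; with NULLs, the `e = :e` comparison would fall into the ELSE branch and count as a mismatch, which may or may not be what is wanted.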

b*e 2010-01-22 08:01

Post 3

You have a classic inverted index problem. Usually this can be handled
efficiently by building a hash index and using bitmaps to store the results.
Search for "Lucene" or "Solr" and read some of the related topics. It is
very straightforward to build a Lucene-based inverted index repository
of your data. A ranked search will then return not only all the
entries whose score is at least 3, but also order them by score from the
highest to the lowest.
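
To make the idea concrete, here is a rough sketch of a hash-based inverted index in plain Python. It is not Lucene or Solr code, and it uses sets of record ids where a real engine would use bitmaps; the function names, the `records` layout (id mapped to a dict of fields), and the sample data are all assumptions for illustration. The key observation is that a record with no match on a, b, c, or d can never reach 3, so only records found through the index need to be scored.

```python
from collections import defaultdict

# Weights taken from the original question: a, b, c score 1 each, d scores 2.
MATCH_WEIGHTS = {"a": 1, "b": 1, "c": 1, "d": 2}
E_MISMATCH_PENALTY = 2  # subtracted when e differs
THRESHOLD = 3


def build_index(records):
    """Map (field, value) -> set of record ids, for the positively weighted fields."""
    index = defaultdict(set)
    for rec_id, rec in records.items():
        for field in MATCH_WEIGHTS:
            index[(field, rec[field])].add(rec_id)
    return index


def search(index, records, target):
    """Return (id, score) pairs with score >= THRESHOLD, highest score first."""
    scores = defaultdict(int)
    # Only records sharing at least one of a, b, c, d with the target are touched.
    for field, weight in MATCH_WEIGHTS.items():
        for rec_id in index.get((field, target[field]), ()):
            scores[rec_id] += weight
    hits = []
    for rec_id, score in scores.items():
        if records[rec_id]["e"] != target["e"]:
            score -= E_MISMATCH_PENALTY
        if score >= THRESHOLD:
            hits.append((rec_id, score))
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


def demo():
    fields = ("a", "b", "c", "d", "e")
    raw = {
        1: ("x", "y", "z", "w", "v"),  # everything matches: 5
        2: ("x", "y", "z", "w", "Q"),  # e differs: 5-2 = 3
        3: ("x", "-", "-", "w", "v"),  # a and d match: 3
        4: ("x", "y", "-", "-", "v"),  # a and b match: 2
        5: ("-", "-", "-", "-", "v"),  # nothing matches, never scored
    }
    records = {rec_id: dict(zip(fields, values)) for rec_id, values in raw.items()}
    target = dict(zip(fields, ("x", "y", "z", "w", "v")))
    return search(build_index(records), records, target)


if __name__ == "__main__":
    print(demo())
```

The index is built once and reused for every query, so the per-query cost depends on how many records share a field value with the target rather than on the size of the table.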
