Redian新闻
ARRY - a small biotech with solid fundamentals (repost)
s*n
1
Suppose you are developing a search engine. Given a sequence of
numbers as a query string, you need to design an algorithm/data structure
that can tell whether this number sequence belongs to some famous
number sequence. For example, if the query string is “0 1 1 2 3 5”, it
might belong to the Fibonacci sequence. Note that the given sequence may not
always start from the beginning of a known number sequence.
My idea:
For those "famous" number sequences, we may cut them into smaller sequences
(e.g. three numbers each), then save these pieces in a database (NoSQL
key-value store). When a query comes in, cut it into smaller sequences of
the same length (such as 3) and look them up in the database/key-value store.
Any suggestions? Thx!
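The idea above can be sketched roughly as follows. This is only a minimal illustration under assumptions that are not in the post: a plain Python dict stands in for the key-value store, the index uses sliding (overlapping) windows so that a query may start anywhere, and the helper names `fibonacci`, `build_index`, and `lookup` are hypothetical.

```python
from collections import defaultdict

K = 3  # window length; the post suggests 3 as an example


def fibonacci(limit):
    """Return the Fibonacci numbers 0, 1, 1, 2, ... that do not exceed limit."""
    terms, a, b = [], 0, 1
    while a <= limit:
        terms.append(a)
        a, b = b, a + b
    return terms


def build_index(sequences, k=K):
    """Map every window of k consecutive terms to (name, start offset) pairs.

    A plain dict stands in for the key-value store (an assumption).
    """
    index = defaultdict(set)
    for name, terms in sequences.items():
        for i in range(len(terms) - k + 1):
            index[tuple(terms[i:i + k])].add((name, i))
    return index


def lookup(index, query, k=K):
    """Return names of sequences that contain query as a contiguous run."""
    if len(query) < k:
        return set()  # too short for this index; a smaller k would be needed
    # Candidates: every place where the first window of the query occurs.
    candidates = set(index.get(tuple(query[:k]), ()))
    # Each later window must occur at the correspondingly shifted offset.
    for j in range(1, len(query) - k + 1):
        hits = index.get(tuple(query[j:j + k]), ())
        candidates = {(name, start) for (name, start) in candidates
                      if (name, start + j) in hits}
    return {name for name, _ in candidates}


sequences = {
    "fibonacci": fibonacci(1000),
    "squares": [n * n for n in range(40)],
}
index = build_index(sequences)
print(lookup(index, [5, 8, 13, 21]))
```

Checking that consecutive windows line up at consecutive offsets matters: without it, a query stitched together from unrelated parts of a sequence would also be accepted.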
j*7
2
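[The following is reposted from the pennystock club]
From: jhsph07 (银杏), Board: pennystock
Subject: ARRY - a small biotech with solid fundamentals
Posted at: BBS 未名空间站 (Fri Aug 14 15:39:51 2009, US Eastern)
It has promising drug candidates in RA, diabetes, and oncology. The company does both research and development and is not short of cash.
ARRY-162 Phase 2 trial results are due in September.
The stock has been consolidating for a long time; the float is small, with few shares outstanding, so it is easy to push up.
If one drug does not work out, you can AD.
Downsides: most drugs in the pipeline are early/mid-stage, and there is currently no major partnership,
which means no big pharma has endorsed it yet.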
g*y
3
For each "famous" number sequence, construct a suffix tree for it?
d*o
4
For each famous sequence, build a suffix tree.
Then traverse the suffix tree with the query string.
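As a rough sketch of this suggestion, the code below uses an uncompressed suffix trie over integer terms rather than a true (compressed) suffix tree. That swap is made only to keep the example short; it costs O(n^2) space in the worst case. The function names and the truncation of Fibonacci to ten terms are assumptions for illustration.

```python
def build_suffix_trie(terms):
    """Insert every suffix of terms into a nested-dict trie keyed by integer."""
    root = {}
    for start in range(len(terms)):
        node = root
        for value in terms[start:]:
            node = node.setdefault(value, {})
    return root


def contains(trie, query):
    """Walk the trie along the query; success means query is a contiguous run."""
    node = trie
    for value in query:
        if value not in node:
            return False
        node = node[value]
    return True


fib = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]  # truncated for the example
trie = build_suffix_trie(fib)
print(contains(trie, [2, 3, 5, 8]), contains(trie, [3, 5, 9]))
```

Because every contiguous run is a prefix of some suffix, a query can start anywhere in the stored sequence.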

s*n
5
With a suffix tree, the suffixes of a sequence like this do not overlap, for example:
0 1 1 2 3 5 8 13 21 34
Suffixes:
34
21 34
13 21 34
8 13 21 34
5 8 13 21 34
3 5 8 13 21 34
2 3 5 8 13 21 34
1 2 3 5 8 13 21 34
1 1 2 3 5 8 13 21 34
0 1 1 2 3 5 8 13 21 34
g*y
6
Then use a hash table plus a linked list: use the linked list to store the sequence and
the hash table to do the lookup. Key = a number in the sequence; value = a pointer to that
number's node in the linked list.
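A minimal sketch of the hash table plus linked list idea follows. One detail is an assumption beyond the post: since a value can occur more than once (1 appears twice in Fibonacci), each key maps to a list of nodes rather than a single pointer. The names `Node`, `build`, and `contains` are hypothetical.

```python
from collections import defaultdict


class Node:
    """One term of the stored sequence in a singly linked list."""

    def __init__(self, value):
        self.value = value
        self.next = None


def build(terms):
    """Link the terms together and return a dict: value -> nodes holding it."""
    table = defaultdict(list)
    prev = None
    for value in terms:
        node = Node(value)
        table[value].append(node)
        if prev is not None:
            prev.next = node
        prev = node
    return table


def contains(table, query):
    """Jump to each node holding query[0], then follow next pointers to verify."""
    if not query:
        return True
    for node in table.get(query[0], []):
        cur = node
        for value in query:
            if cur is None or cur.value != value:
                break  # mismatch or ran off the end of the stored sequence
            cur = cur.next
        else:
            return True  # every query term matched
    return False


table = build([0, 1, 1, 2, 3, 5, 8, 13, 21, 34])
print(contains(table, [1, 2, 3]), contains(table, [34, 55]))
```

The hash table gives the starting positions in constant time; the verification walk is linear in the query length for each starting position.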

o*o
7
Most "famous" number sequences are infinite; how do you handle that?

l*i
8
Treat the sequence as a string and use KMP to match. You have to cut the
famous sequence off at some point. Usually this is not a problem: most sequences
are increasing, so you can stop once the last integer in your
query is smaller than the last one in the generated prefix of the famous sequence.
OEIS is a website that does this kind of search quite efficiently; I am not sure how they
implemented it.
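The KMP suggestion might look roughly like this when the "characters" are integers instead of letters. The sketch assumes the famous sequence is non-decreasing, so it is generated only up to the largest value in the query; `failure_table`, `kmp_find`, `fibonacci_up_to`, and `matches_fibonacci` are hypothetical names, and Fibonacci is used only as an example.

```python
def failure_table(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_find(text, pattern):
    """Return the first index where pattern occurs in text, or -1."""
    if not pattern:
        return 0
    fail = failure_table(pattern)
    k = 0  # number of pattern terms matched so far
    for i, value in enumerate(text):
        while k > 0 and value != pattern[k]:
            k = fail[k - 1]
        if value == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1


def fibonacci_up_to(limit):
    """Fibonacci numbers not exceeding limit (the cut-off point)."""
    terms, a, b = [], 0, 1
    while a <= limit:
        terms.append(a)
        a, b = b, a + b
    return terms


def matches_fibonacci(query):
    """Terms larger than max(query) cannot match, so generation stops there."""
    if not query:
        return True
    return kmp_find(fibonacci_up_to(max(query)), query) >= 0


print(matches_fibonacci([5, 8, 13, 21]), matches_fibonacci([5, 8, 14]))
```

Generating on demand sidesteps the question of infinite sequences for increasing ones; sequences that are not monotonic would need a different cut-off rule.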
b*d
9
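Besides the suffix tree mentioned above,
here is a rough idea:
Many "famous" sequences grow exponentially, so they overflow fairly quickly (assume, and I
believe this holds, that overflow happens within a few million such numbers, so storing them
takes only a few MB). For very large numbers, you can directly flag whether or not they are
Fibonacci numbers. For shorter runs, build several levels of inverted
index, such as the runs of 3 mentioned above, plus runs of 4, 5, and 6 consecutive numbers.
A run of 6 is already a shingle and can serve as a fingerprint, so lookups can be very fast.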
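A rough sketch of the multi-level shingle index follows, under several assumptions that are not in the post: "overflow" is taken to mean exceeding a signed 64-bit integer, each level is a plain dict, and only the first shingle of the query is used as the fingerprint. The names `LEVELS`, `LIMIT`, `fibonacci_below`, `build_levels`, and `candidates` are hypothetical.

```python
LEVELS = (3, 4, 5, 6)   # shingle lengths, as listed in the post
LIMIT = 2**63 - 1       # assumed overflow bound: signed 64-bit


def fibonacci_below(limit):
    """Fibonacci numbers that fit under the overflow bound."""
    terms, a, b = [], 0, 1
    while a <= limit:
        terms.append(a)
        a, b = b, a + b
    return terms


def build_levels(sequences, levels=LEVELS):
    """For each level k, map every k-term shingle to the sequences containing it."""
    index = {k: {} for k in levels}
    for name, terms in sequences.items():
        for k in levels:
            for i in range(len(terms) - k + 1):
                index[k].setdefault(tuple(terms[i:i + k]), set()).add(name)
    return index


def candidates(index, query, levels=LEVELS):
    """Look up the query's first shingle at the longest level that fits."""
    usable = [k for k in levels if k <= len(query)]
    if not usable:
        return set()
    k = max(usable)
    return set(index[k].get(tuple(query[:k]), ()))


sequences = {
    "fibonacci": fibonacci_below(LIMIT),
    "powers_of_two": [2**n for n in range(63)],
}
index = build_levels(sequences)
print(len(sequences["fibonacci"]), candidates(index, [2, 3, 5, 8, 13, 21, 34]))
```

The result is a candidate set only: a query longer than the longest shingle still has to be verified term by term against each candidate. The count printed first shows how few terms of an exponentially growing sequence fit below the bound.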