boa信用卡能从无回馈的卡转换为有cb的卡吗？ - 未名空间MITBBS历史存档

国际科技财经博客移民网络热点娱乐民生时事公众号

Redian新闻

>未名空间

>Money - 海外理财

boa信用卡能从无回馈的卡转换为有cb的卡吗？

boa信用卡能从无回馈的卡转换为有cb的卡吗？# Money - 海外理财

g*e2012-02-24 08:02

1 楼

没想出太好的答案，答用heap和hash所以ip addr。用heap操作cost很高，hash所以
ipv4 addr很expensive。不知道是open question还是有标准答案的。
大意是某公司gateway记录着所有connection的外部ip addresses，比如保留最多24hr
，不断更新。现在求给定时间过去k sec内出现最多的n个ip address(es)，k可以为60s
，600s，3600s，n=1，10，100，etc。设计高效算法和数据结构。

w*62012-02-24 08:02

2 楼

有张卡没有任何cb等回馈，不时会用来扫货，所以想转换成有cb的卡，不知道可以不可
以。
因为近期申请了好几张卡，再申请估计要被拒
不想再有hard pull了

c*22012-02-24 08:02

3 楼

LRU + Counting Buckets?

s*92012-02-24 08:02

4 楼

可以

【在 w******6 的大作中提到】

: 有张卡没有任何cb等回馈，不时会用来扫货，所以想转换成有cb的卡，不知道可以不可
: 以。
: 因为近期申请了好几张卡，再申请估计要被拒
: 不想再有hard pull了

g*e2012-02-24 08:02

5 楼

LRU主要是来做cache的。用在这里的话就只能求出近似解了。而且维护LRU的数据结构
本来是heap或者sorted tree吧？
主要是维护这个heap得保存k时间里面的所有出现过的ip addr，差不多就是随时间推移
不断地在reconstruct这个heap。否则我觉得用heap已经很不错了。

【在 c***2 的大作中提到】

: LRU + Counting Buckets?

x*i2012-02-24 08:02

6 楼

可以,我转过,不需要重新申请~

j*x2012-02-24 08:02

7 楼

range minimum query

w*62012-02-24 08:02

8 楼

是直接打电话还是网上申请？

【在 x**********i 的大作中提到】

: 可以,我转过,不需要重新申请~

l*32012-02-24 08:02

9 楼

可不可以用一个4G个元素的counter数组，每个数组存一个long，每次access就累加这
个数组。如果query的时候，用min heap做一个linear search，找出n个最多的地址。
如果要优化search的话，用一个什么bitmap记录一段IP range，如果有访问的话，就是
1，否则是0，像个索引，这样可以跳过很多0。
不知道可不可行，大家讨论一下。

f*w2012-02-24 08:02

10 楼

转卡会有hard pull么？

【在 x**********i 的大作中提到】

: 可以,我转过,不需要重新申请~

g*e2012-02-24 08:02

11 楼

你说的4g是把ipv4所有addr记录一个counter，查counter值是O(1)操作不需要优化了。
或者这个search优化我理解错了:P 我觉得这里麻烦的是维护那个heap。
楼上的range min requery怎么理解？想不出转化模型。。。

【在 l**********3 的大作中提到】

: 可不可以用一个4G个元素的counter数组，每个数组存一个long，每次access就累加这
: 个数组。如果query的时候，用min heap做一个linear search，找出n个最多的地址。
: 如果要优化search的话，用一个什么bitmap记录一段IP range，如果有访问的话，就是
: 1，否则是0，像个索引，这样可以跳过很多0。
: 不知道可不可行，大家讨论一下。

a*92012-02-24 08:02

12 楼

可以参考data streaming algorithm中的detect elephant flows
主要idea是靠sampling

24hr
60s

【在 g*****e 的大作中提到】

: 没想出太好的答案，答用heap和hash所以ip addr。用heap操作cost很高，hash所以
: ipv4 addr很expensive。不知道是open question还是有标准答案的。
: 大意是某公司gateway记录着所有connection的外部ip addresses，比如保留最多24hr
: ，不断更新。现在求给定时间过去k sec内出现最多的n个ip address(es)，k可以为60s
: ，600s，3600s，n=1，10，100，etc。设计高效算法和数据结构。

j*x2012-02-24 08:02

13 楼

时间区间任意，求取的最多ip数任意的情况下，问题转化为如何快速获取任意
给定时间区间里，某ip出现的次数；这样看，如果不引入随机误差，是不可能
有良好的性能吧。。。

高，hash所有
的。
比如保留最多24hr
address(es)，k可以为60s

【在 g*****e 的大作中提到】

c*n2012-02-24 08:02

14 楼

这道题有没有什么比较可行的solution啊？

y*g2012-02-24 08:02

15 楼

ipv4 addr有什么好hash的？本身就是一个整数啊

24hr
60s

【在 g*****e 的大作中提到】

z*t2012-02-24 08:02

16 楼

能不能用Trie. 可以每个结点存IP地址中的一个数字，总共也就4或6层。每个叶子
存着一个IP地址的count
顺着读一遍输入，建这个树。然后再在这个树上找出现最多的ip地址。可能优势是找
某个IP的count快，缺点是还得列出所有的count，才能知道哪些是出现最多的IP。

l*a2012-02-24 08:02

17 楼

我的想法根你差不多。
用这个Trie记录每个IP的访问次数及访问时间的list,
###为了提高一点效率，我还希望有这个list的tail pointer so that i can add new
items
directly.
now the Trie will contain all the access information in the past
24Hours.
we need to add a timer function for the Trie to refresh the trie
frequently to remove expired data and update access count for each IP.
each time there is a request come,
I will
1)Make a copy of the whole Trie
2)scan the whole trie,remove the expired access from the head of list
for each IP,adjust the access count
3) Use a max heap (size as request) to store the access count for each
IP

【在 z****t 的大作中提到】

: 能不能用Trie. 可以每个结点存IP地址中的一个数字，总共也就4或6层。每个叶子
: 存着一个IP地址的count
: 顺着读一遍输入，建这个树。然后再在这个树上找出现最多的ip地址。可能优势是找
: 某个IP的count快，缺点是还得列出所有的count，才能知道哪些是出现最多的IP。

c*n2012-02-24 08:02

18 楼

我觉得你是题目没听明白。
如果是要continuously calculate this, 就是online algorithm, 后面来那个IP,
谁
知道? 最简单就是用个2^32 bit 的hash, 记录count, 但是没有那么大内存的话，就是
要用
更小的cache 来近似，就是LRU 之类的概念，但这里谁知道后面的IP incoming
pattern?
LRU mem cache 如果没有hit 无所谓，就是个lower performance, 你这个题如果不
hit,
就错了。比如，如果用keep largest count IP,
you have only 4 mem cells to record the map.
you have incoming sequence:
a a b b c c d d e e e e e e
before "e" comes, all cells are taken, and have count 2, then e has no
count yet, so it will never enter into the map.
如果有2^32 map size, 那就
void OnInComingIP(int IP) {
all_IP_count{IP}++ ;
max_count_heap.insert(key=>all_IP_count{IP}, payload=>IP);
}
where max_count_heap has a size of n
the above is not quite right.
the problem is that the all_IP_count should be a count of the number
of occurances within the last K seconds. so that means , whenever
clock advances one second, we have to scan the entire all_IP_count
map, and purge the counts older than the window. so it can't be a ++,
but should be
all_IP_count{IP}.append(currentTime());
but still, upating the map every second , and updating the heap, is
still costly

24hr
60s

【在 g*****e 的大作中提到】

g*s2012-02-24 08:02

19 楼

为何hash ipv4很贵？比任意长的字符串便宜太多了。
hash + link list吧。

24hr
60s

【在 g*****e 的大作中提到】