bounding box regression多尺寸问题 - 未名空间MITBBS历史存档

国际科技财经博客移民网络热点娱乐民生时事公众号

Redian新闻

>未名空间

>Programming - 葵花宝典

bounding box regression多尺寸问题

bounding box regression多尺寸问题# Programming - 葵花宝典

w*82018-04-19 07:04

1 楼

Given three sorted arrays, A, B, C, each of length n (assume odd), with all
3n elements distinct, find the median of the 3n elements.
O(n) time is easy to do. Is there an O(logn) time algorithm?

k*12018-04-19 07:04

2 楼

我在公司的人力资源小组待了很久一段时间了，公司组织过很多次大大小小的应聘会
。但是经历了这么多是应聘会，我真的总结到一个道理，其实真正能做到守时的应聘者
真的很少。我们都知道在应聘工作的时候，守时是一个多么重要的因素，但是又偏偏有
那么多的应聘人员，是因为守时这个问题而被刷掉，真的很遗憾。其实做到守时真的有
那么难吗？
不管是在生活中是约会还是应聘工作，或者是其他的约定，我们都非常讲究守时这
个原则。因为守时是对对方的一个尊重，换位思考的话，如果是你和别人有一个约会，
你也不希望你等他很久吧。这样的一条守则，在应聘中显得格外重要，也许守时不能说
明什么其他的东西，但是至少它能说明你对待这件事情的态度，如果你真的重视它，那
么你就会担心迟到会给别人造成不好的影响吧！为了这个，你也会尽可能的早一点来到
应聘现场吧！如果在应聘中你没有做到守时的话，哪怕你再有才能，哪怕你有一张能够
说破大天的嘴，恐怕别人也是很难容忍把你拉进这公司的吧？因为再有才能的人，如果
你没有一个好的工作态度，把你招进来也只是多了一个，难题而已，有才能只是一个方
面，这些才能要能为我所用才是关键。所以奉劝所有的人，做好守时这个原则，因为你
的小小的一个表现，真的会给别人一个很好的印象。

b*d2018-04-19 07:04

3 楼

欢迎童鞋们围观。

w*g2018-04-19 07:04

4 楼

我发现bounding box regression能拟合的对象尺寸大小范围很狭隘，
似乎必须通过加multiple prior的方式来提高拟合能力。
版上的同学有遇到过这种情况吗？
我最近在自己实现SSD, faster RCNN和Mask RCNN。代码在这里，
只能看不能跑，依赖关系还没整理出来。
https://github.com/aaalgo/box/blob/master/train-anchors.py
这个只出bounding box
https://github.com/aaalgo/box/blob/master/train.py
这个是类似mask RCNN了。
不过我目前没有做multiple prior。我觉得很奇怪，
像今年kaggle这种细胞图，就一个圆形的东西，大了就拟合不出来。
这儿有一些结果： http://www.wdong.org/~wdong/val.db.out/
细胞一大，bounding box跟不上，再大，干脆就不出东西了。
今年这个kaggle比赛很奇葩，把我至今所有的技术链全都打败了。
最近一个月都在从头开始调所有的流程。

r*c2018-04-19 07:04

5 楼

basic idea, for three sorted arrays of A_1, A_2, A_3 (arbitratry length), (
assume you already know the smallest n_1 numbers and largest n_2 number),
find the longest one and choose the medium, say K in A_1, then find K's
position in all other two. Now you know K's position in all 3n, if it is <
3n/2 then the mediam then update to exclude all numbers smaller than K in
three arrays, and update n_1 = n_1 + all numbers smaller than K. recursion
then
so every time, you reduce the size at least by 5/

【在 w**********8 的大作中提到】

: Given three sorted arrays, A, B, C, each of length n (assume odd), with all
: 3n elements distinct, find the median of the 3n elements.
: O(n) time is easy to do. Is there an O(logn) time algorithm?

t*t2018-04-19 07:04

6 楼

尼玛机器人还特爱好为人师，最后总要来点人生格言，心灵鸡汤之类的玩意。

l*m2018-04-19 07:04

7 楼

对付不同大小的物体，有三种常用的方法：image pyramid, filter pyramid, and
anchor pyramid. 当然filter pyramid不大流行。你这个anchor pyramid应该还行吧？

【在 w***g 的大作中提到】

: 我发现bounding box regression能拟合的对象尺寸大小范围很狭隘，
: 似乎必须通过加multiple prior的方式来提高拟合能力。
: 版上的同学有遇到过这种情况吗？
: 我最近在自己实现SSD, faster RCNN和Mask RCNN。代码在这里，
: 只能看不能跑，依赖关系还没整理出来。
: https://github.com/aaalgo/box/blob/master/train-anchors.py
: 这个只出bounding box
: https://github.com/aaalgo/box/blob/master/train.py
: 这个是类似mask RCNN了。
: 不过我目前没有做multiple prior。我觉得很奇怪，

c*d2018-04-19 07:04

8 楼

T(n) = T(5/6 n) + 2 log(n)
根据Master method, T(n) = [log(n)]^2
不知我的理解对不对？

【在 r****c 的大作中提到】

: basic idea, for three sorted arrays of A_1, A_2, A_3 (arbitratry length), (
: assume you already know the smallest n_1 numbers and largest n_2 number),
: find the longest one and choose the medium, say K in A_1, then find K's
: position in all other two. Now you know K's position in all 3n, if it is <
: 3n/2 then the mediam then update to exclude all numbers smaller than K in
: three arrays, and update n_1 = n_1 + all numbers smaller than K. recursion
: then
: so every time, you reduce the size at least by 5/

w*g2018-04-19 07:04

9 楼

总结得好！我其实比较喜欢image pyramid的。相对anchor pyramid代价小很多。
我其实怀疑一个backbone network接出去多个head，互相影响了导致每个
head的拟合能力下降。但是这个真是说不清楚。
我还有一个怀疑，就是CNN其实只能做pattern matching，而没有多少regression
能力。所以必须把box regression问题分解为多个形状大小相对固定的
prior matching的问题。

【在 l*******m 的大作中提到】

: 对付不同大小的物体，有三种常用的方法：image pyramid, filter pyramid, and
: anchor pyramid. 当然filter pyramid不大流行。你这个anchor pyramid应该还行吧？

x*y2018-04-19 07:04

10 楼

we can reduce it to 2 arrays problem...First compare the median of 3 arrays, if A[n/2]C[0:n/2-1] together, we can find the median of these two in logn, compare
it witht the median of B, eliminate corresponding part of B and A, C
combination. To do it in A, C combination, we have to find the position of
median in both A and C, O(logn) time. This B and A,C combination is just a 2
array problem except to find median and do e

【在 w**********8 的大作中提到】

l*m2018-04-19 07:04

11 楼

anchor pyramid数据流比较简单。不同尺寸的image inference的确快，training不好
说，这三个数据流的融合点有很多选择，还有一个tricky的地方是，不同尺寸出来的
tensors其实是没有calibrated的，平均权重理论上不是最优的。

【在 w***g 的大作中提到】

: 总结得好！我其实比较喜欢image pyramid的。相对anchor pyramid代价小很多。
: 我其实怀疑一个backbone network接出去多个head，互相影响了导致每个
: head的拟合能力下降。但是这个真是说不清楚。
: 我还有一个怀疑，就是CNN其实只能做pattern matching，而没有多少regression
: 能力。所以必须把box regression问题分解为多个形状大小相对固定的
: prior matching的问题。

x*y2018-04-19 07:04

12 楼

Actually, we can do some extension. For m arrays, we can convert to m-1
arrays problem with the time complexity being timed O(logn). Then, convert
into m-2 arrays, m-3 arrays, and in the end 2 arrays, which is O(logn), 1 array
which is O(1), we can conclude that the time efficiency is O((logn)^(m-1)) for m arrays with same length m.
Also, this can be extended into arrays with different lengths as two arrays with different lengths.....

w*g2018-04-19 07:04

13 楼

我感觉ssd那种好多支出来的feature层接同样的prediction head也有这个问题啊，真
是炼金术一般。

：anchor pyramid数据流比较简单。不同尺寸的image inference的确快，training不好
：说，这三个数据流的融合点有很多选择，还有一个tricky的地方是，不同尺寸出来的

w*82018-04-19 07:04

14 楼

so every time, you reduce the size at least by 5/6, log n will do that.
****************
this is not log(n) because log(n) has 2 as the base which means each time
you get rid of half of the rest.

【在 r****c 的大作中提到】

w*r2018-04-19 07:04

15 楼

从多层卷积到pooling到Unet其实都是在实现filter pyramid,应用最广

: 对付不同大小的物体，有三种常用的方法：image pyramid, filter pyramid,
and

: anchor pyramid. 当然filter pyramid不大流行。你这个anchor pyramid应该还
行吧？

【在 l*******m 的大作中提到】

: anchor pyramid数据流比较简单。不同尺寸的image inference的确快，training不好
: 说，这三个数据流的融合点有很多选择，还有一个tricky的地方是，不同尺寸出来的
: tensors其实是没有calibrated的，平均权重理论上不是最优的。

w*82018-04-19 07:04

16 楼

I was trying to reduce the problem to 2 arrays but I did not succeed. In
your solution, even if you find median of A and C. How do you compare it
with B to find the median of A, B,C.
A:1 4 6 10 18
B:22 28 33 45 58
C:7 9 100 180 200
you find the median of A and C is 9 or 10. Tell me how do you use them to
find the median of 3 arrays.

arrays, if A[n/2]A[n/2:n-1] and
2

【在 x***y 的大作中提到】

: we can reduce it to 2 arrays problem...First compare the median of 3 arrays, if A[n/2]: C[0:n/2-1] together, we can find the median of these two in logn, compare
: it witht the median of B, eliminate corresponding part of B and A, C
: combination. To do it in A, C combination, we have to find the position of
: median in both A and C, O(logn) time. This B and A,C combination is just a 2
: array problem except to find median and do e

w*r2018-04-19 07:04

17 楼

我之前有过一个想法，别hard code anchor.把priors catenate到最后的feature map
上，这样bbox就可以用rnn迭代了，效果可能能好不少

w*82018-04-19 07:04

18 楼

I am thinking this way.
let n = 2m +1;
* Let three median are Am, Bm, Cm
* compare them, let Am < Bm < Cm
* get rid of 0 ... m in A, and m ... n-1 in C
* for the rest of arrays, repeat the 3 steps above.
each time, we get rid of 2/6 of the rest numbers, i.e. T(n) = C+T(2n/3). This
is not O(log(n)) as well.
That's the best I can gave. Any other ideas?

【在 r****c 的大作中提到】

w*g2018-04-19 07:04

19 楼

按我的理解，其实完全没必要用 prior + delta的方式来做预测。
比如说有16个prior，最后每个pixel输出16组参数。
prior的作用只需要决定这16组数那几组的weight是0那几组的是1就行。
prior本身的参数是固定的，CNN预测delta或者直接预测(prior+delta)难度一样。
现在有个设计问题没想清楚。就是groundtruth如何分配到所有的prior上。
我试过用k-means clustering来cluster ground truth来定prior。完全没用。
不如16,32,64这么手工定好。后来想想也是，关键是要各种corner case都
罩得住。

map

【在 w*****r 的大作中提到】

: 我之前有过一个想法，别hard code anchor.把priors catenate到最后的feature map
: 上，这样bbox就可以用rnn迭代了，效果可能能好不少

c*d2018-04-19 07:04

20 楼

base 2 or base e doesn't matter
just a difference in the constant coefficient

【在 w**********8 的大作中提到】

: so every time, you reduce the size at least by 5/6, log n will do that.
: ****************
: this is not log(n) because log(n) has 2 as the base which means each time
: you get rid of half of the rest.

x*u2018-04-19 07:04

21 楼

开搞自动驾驶了？

【在 w***g 的大作中提到】

c*d2018-04-19 07:04

22 楼

下一步A, B, C的size分别是m, 2m, m（不同长度），和初始的情况不一样了
This
如果这个recursion是对的，当然是log(n)
google "Master method recursion"

【在 w**********8 的大作中提到】

: I am thinking this way.
: let n = 2m +1;
: * Let three median are Am, Bm, Cm
: * compare them, let Am < Bm < Cm
: * get rid of 0 ... m in A, and m ... n-1 in C
: * for the rest of arrays, repeat the 3 steps above.
: each time, we get rid of 2/6 of the rest numbers, i.e. T(n) = C+T(2n/3). This
: is not O(log(n)) as well.
: That's the best I can gave. Any other ideas?

r*c2018-04-19 07:04

23 楼

it is not absolute right,
since the number of elements are not same for three arrays, your method does
not work well in degraded case

This

【在 w**********8 的大作中提到】

w*82018-04-19 07:04

24 楼

Thanks. You are right. "下一步A, B, C的size分别是m, 2m, m（不同长度），和初
始的情况不一样了"
But master theorem shows T(n/b) is not log(n) if b != 2.
http://en.wikipedia.org/wiki/Master_theorem

【在 c*******d 的大作中提到】

:
: 下一步A, B, C的size分别是m, 2m, m（不同长度），和初始的情况不一样了
: This
: 如果这个recursion是对的，当然是log(n)
: google "Master method recursion"

w*82018-04-19 07:04

25 楼

what is degrade case? could you give one example?

does

【在 r****c 的大作中提到】

: it is not absolute right,
: since the number of elements are not same for three arrays, your method does
: not work well in degraded case
:
: This

x*y2018-04-19 07:04

26 楼

A[n/2]=6, B[n/2]=33, C[n/2]=100, So A':6 10 18 C': 7 9
To find the median M' of A' C', O(log(max(|A'|, |C'|)))=O(logn).
if M' < B[n/2], B is left with B[22 28 33], and find data in A'less than M'
, find data in C' less than M', O(logn), and eleminate them.....
All problems of this kind is essentially 2 array problem, and the time
efficiency is O(logn) * time for finding the median and delete operation in each array. For sorted array, 2nd term is O(1), but for an partially sorted array A'+C',th

【在 w**********8 的大作中提到】

: I was trying to reduce the problem to 2 arrays but I did not succeed. In
: your solution, even if you find median of A and C. How do you compare it
: with B to find the median of A, B,C.
: A:1 4 6 10 18
: B:22 28 33 45 58
: C:7 9 100 180 200
: you find the median of A and C is 9 or 10. Tell me how do you use them to
: find the median of 3 arrays.
:
: arrays, if A[n/2]