Redian新闻
>
bounding box regression多尺寸问题
avatar
bounding box regression多尺寸问题# Programming - 葵花宝典
w*8
1
Given three sorted arrays, A, B, C, each of length n (assume odd), with all
3n elements distinct, find the median of the 3n elements.
O(n) time is easy to do. Is there an O(logn) time algorithm?
avatar
k*1
2
我在公司的人力资源小组待了很久一段时间了,公司组织过很多次大大小小的应聘会
。但是经历了这么多是应聘会,我真的总结到一个道理,其实真正能做到守时的应聘者
真的很少。我们都知道在应聘工作的时候,守时是一个多么重要的因素,但是又偏偏有
那么多的应聘人员,是因为守时这个问题而被刷掉,真的很遗憾。其实做到守时真的有
那么难吗?
不管是在生活中是约会还是应聘工作,或者是其他的约定,我们都非常讲究守时这
个原则。因为守时是对对方的一个尊重,换位思考的话,如果是你和别人有一个约会,
你也不希望你等他很久吧。这样的一条守则,在应聘中显得格外重要,也许守时不能说
明什么其他的东西,但是至少它能说明你对待这件事情的态度,如果你真的重视它,那
么你就会担心迟到会给别人造成不好的影响吧!为了这个,你也会尽可能的早一点来到
应聘现场吧!如果在应聘中你没有做到守时的话,哪怕你再有才能,哪怕你有一张能够
说破大天的嘴,恐怕别人也是很难容忍把你拉进这公司的吧?因为再有才能的人,如果
你没有一个好的工作态度,把你招进来也只是多了一个,难题而已,有才能只是一个方
面,这些才能要能为我所用才是关键。所以奉劝所有的人,做好守时这个原则,因为你
的小小的一个表现,真的会给别人一个很好的印象。
avatar
b*d
3
欢迎童鞋们围观。
avatar
w*g
4
我发现bounding box regression能拟合的对象尺寸大小范围很狭隘,
似乎必须通过加multiple prior的方式来提高拟合能力。
版上的同学有遇到过这种情况吗?
我最近在自己实现SSD, faster RCNN和Mask RCNN。代码在这里,
只能看不能跑,依赖关系还没整理出来。
https://github.com/aaalgo/box/blob/master/train-anchors.py
这个只出bounding box
https://github.com/aaalgo/box/blob/master/train.py
这个是类似mask RCNN了。
不过我目前没有做multiple prior。我觉得很奇怪,
像今年kaggle这种细胞图,就一个圆形的东西, 大了就拟合不出来。
这儿有一些结果: http://www.wdong.org/~wdong/val.db.out/
细胞一大,bounding box跟不上,再大,干脆就不出东西了。
今年这个kaggle比赛很奇葩,把我至今所有的技术链全都打败了。
最近一个月都在从头开始调所有的流程。
avatar
r*c
5
basic idea, for three sorted arrays of A_1, A_2, A_3 (arbitratry length), (
assume you already know the smallest n_1 numbers and largest n_2 number),
find the longest one and choose the medium, say K in A_1, then find K's
position in all other two. Now you know K's position in all 3n, if it is <
3n/2 then the mediam then update to exclude all numbers smaller than K in
three arrays, and update n_1 = n_1 + all numbers smaller than K. recursion
then
so every time, you reduce the size at least by 5/

【在 w**********8 的大作中提到】
: Given three sorted arrays, A, B, C, each of length n (assume odd), with all
: 3n elements distinct, find the median of the 3n elements.
: O(n) time is easy to do. Is there an O(logn) time algorithm?

avatar
t*t
6
尼玛机器人还特爱好为人师,最后总要来点人生格言,心灵鸡汤之类的玩意。
avatar
l*m
7
对付不同大小的物体,有三种常用的方法:image pyramid, filter pyramid, and
anchor pyramid. 当然filter pyramid不大流行。你这个anchor pyramid应该还行吧?

【在 w***g 的大作中提到】
: 我发现bounding box regression能拟合的对象尺寸大小范围很狭隘,
: 似乎必须通过加multiple prior的方式来提高拟合能力。
: 版上的同学有遇到过这种情况吗?
: 我最近在自己实现SSD, faster RCNN和Mask RCNN。代码在这里,
: 只能看不能跑,依赖关系还没整理出来。
: https://github.com/aaalgo/box/blob/master/train-anchors.py
: 这个只出bounding box
: https://github.com/aaalgo/box/blob/master/train.py
: 这个是类似mask RCNN了。
: 不过我目前没有做multiple prior。我觉得很奇怪,

avatar
c*d
8
T(n) = T(5/6 n) + 2 log(n)
根据Master method, T(n) = [log(n)]^2
不知我的理解对不对?

【在 r****c 的大作中提到】
: basic idea, for three sorted arrays of A_1, A_2, A_3 (arbitratry length), (
: assume you already know the smallest n_1 numbers and largest n_2 number),
: find the longest one and choose the medium, say K in A_1, then find K's
: position in all other two. Now you know K's position in all 3n, if it is <
: 3n/2 then the mediam then update to exclude all numbers smaller than K in
: three arrays, and update n_1 = n_1 + all numbers smaller than K. recursion
: then
: so every time, you reduce the size at least by 5/

avatar
w*g
9
总结得好!我其实比较喜欢image pyramid的。相对anchor pyramid代价小很多。
我其实怀疑一个backbone network接出去多个head,互相影响了导致每个
head的拟合能力下降。但是这个真是说不清楚。
我还有一个怀疑,就是CNN其实只能做pattern matching,而没有多少regression
能力。所以必须把box regression问题分解为多个形状大小相对固定的
prior matching的问题。

【在 l*******m 的大作中提到】
: 对付不同大小的物体,有三种常用的方法:image pyramid, filter pyramid, and
: anchor pyramid. 当然filter pyramid不大流行。你这个anchor pyramid应该还行吧?

avatar
x*y
10
we can reduce it to 2 arrays problem...First compare the median of 3 arrays, if A[n/2]C[0:n/2-1] together, we can find the median of these two in logn, compare
it witht the median of B, eliminate corresponding part of B and A, C
combination. To do it in A, C combination, we have to find the position of
median in both A and C, O(logn) time. This B and A,C combination is just a 2
array problem except to find median and do e

【在 w**********8 的大作中提到】
: Given three sorted arrays, A, B, C, each of length n (assume odd), with all
: 3n elements distinct, find the median of the 3n elements.
: O(n) time is easy to do. Is there an O(logn) time algorithm?

avatar
l*m
11
anchor pyramid数据流比较简单。不同尺寸的image inference的确快,training不好
说,这三个数据流的融合点有很多选择,还有一个tricky的地方是,不同尺寸出来的
tensors其实是没有calibrated的,平均权重理论上不是最优的。

【在 w***g 的大作中提到】
: 总结得好!我其实比较喜欢image pyramid的。相对anchor pyramid代价小很多。
: 我其实怀疑一个backbone network接出去多个head,互相影响了导致每个
: head的拟合能力下降。但是这个真是说不清楚。
: 我还有一个怀疑,就是CNN其实只能做pattern matching,而没有多少regression
: 能力。所以必须把box regression问题分解为多个形状大小相对固定的
: prior matching的问题。

avatar
x*y
12
Actually, we can do some extension. For m arrays, we can convert to m-1
arrays problem with the time complexity being timed O(logn). Then, convert
into m-2 arrays, m-3 arrays, and in the end 2 arrays, which is O(logn), 1 array
which is O(1), we can conclude that the time efficiency is O((logn)^(m-1)) for m arrays with same length m.
Also, this can be extended into arrays with different lengths as two arrays with different lengths.....
avatar
w*g
13
我感觉ssd那种好多支出来的feature层接同样的prediction head也有这个问题啊,真
是炼金术一般。

:anchor pyramid数据流比较简单。不同尺寸的image inference的确快,training不好
:说,这三个数据流的融合点有很多选择,还有一个tricky的地方是,不同尺寸出来的
avatar
w*8
14
so every time, you reduce the size at least by 5/6, log n will do that.
****************
this is not log(n) because log(n) has 2 as the base which means each time
you get rid of half of the rest.

【在 r****c 的大作中提到】
: basic idea, for three sorted arrays of A_1, A_2, A_3 (arbitratry length), (
: assume you already know the smallest n_1 numbers and largest n_2 number),
: find the longest one and choose the medium, say K in A_1, then find K's
: position in all other two. Now you know K's position in all 3n, if it is <
: 3n/2 then the mediam then update to exclude all numbers smaller than K in
: three arrays, and update n_1 = n_1 + all numbers smaller than K. recursion
: then
: so every time, you reduce the size at least by 5/

avatar
w*r
15
从多层卷积到pooling到Unet其实都是在实现filter pyramid,应用最广


: 对付不同大小的物体,有三种常用的方法:image pyramid, filter pyramid,
and

: anchor pyramid. 当然filter pyramid不大流行。你这个anchor pyramid应该还
行吧?



【在 l*******m 的大作中提到】
: anchor pyramid数据流比较简单。不同尺寸的image inference的确快,training不好
: 说,这三个数据流的融合点有很多选择,还有一个tricky的地方是,不同尺寸出来的
: tensors其实是没有calibrated的,平均权重理论上不是最优的。

avatar
w*8
16
I was trying to reduce the problem to 2 arrays but I did not succeed. In
your solution, even if you find median of A and C. How do you compare it
with B to find the median of A, B,C.
A:1 4 6 10 18
B:22 28 33 45 58
C:7 9 100 180 200
you find the median of A and C is 9 or 10. Tell me how do you use them to
find the median of 3 arrays.

arrays, if A[n/2]A[n/2:n-1] and
2

【在 x***y 的大作中提到】
: we can reduce it to 2 arrays problem...First compare the median of 3 arrays, if A[n/2]: C[0:n/2-1] together, we can find the median of these two in logn, compare
: it witht the median of B, eliminate corresponding part of B and A, C
: combination. To do it in A, C combination, we have to find the position of
: median in both A and C, O(logn) time. This B and A,C combination is just a 2
: array problem except to find median and do e

avatar
w*r
17
我之前有过一个想法,别hard code anchor.把priors catenate到最后的feature map
上,这样bbox就可以用rnn迭代了,效果可能能好不少
avatar
w*8
18
I am thinking this way.
let n = 2m +1;
* Let three median are Am, Bm, Cm
* compare them, let Am < Bm < Cm
* get rid of 0 ... m in A, and m ... n-1 in C
* for the rest of arrays, repeat the 3 steps above.
each time, we get rid of 2/6 of the rest numbers, i.e. T(n) = C+T(2n/3). This
is not O(log(n)) as well.
That's the best I can gave. Any other ideas?

【在 r****c 的大作中提到】
: basic idea, for three sorted arrays of A_1, A_2, A_3 (arbitratry length), (
: assume you already know the smallest n_1 numbers and largest n_2 number),
: find the longest one and choose the medium, say K in A_1, then find K's
: position in all other two. Now you know K's position in all 3n, if it is <
: 3n/2 then the mediam then update to exclude all numbers smaller than K in
: three arrays, and update n_1 = n_1 + all numbers smaller than K. recursion
: then
: so every time, you reduce the size at least by 5/

avatar
w*g
19
按我的理解,其实完全没必要用 prior + delta的方式来做预测。
比如说有16个prior,最后每个pixel输出16组参数。
prior的作用只需要决定这16组数那几组的weight是0那几组的是1就行。
prior本身的参数是固定的,CNN预测delta或者直接预测(prior+delta)难度一样。
现在有个设计问题没想清楚。就是groundtruth如何分配到所有的prior上。
我试过用k-means clustering来cluster ground truth来定prior。完全没用。
不如16,32,64这么手工定好。后来想想也是,关键是要各种corner case都
罩得住。

map

【在 w*****r 的大作中提到】
: 我之前有过一个想法,别hard code anchor.把priors catenate到最后的feature map
: 上,这样bbox就可以用rnn迭代了,效果可能能好不少

avatar
c*d
20
base 2 or base e doesn't matter
just a difference in the constant coefficient

【在 w**********8 的大作中提到】
: so every time, you reduce the size at least by 5/6, log n will do that.
: ****************
: this is not log(n) because log(n) has 2 as the base which means each time
: you get rid of half of the rest.

avatar
x*u
21
开搞自动驾驶了?

【在 w***g 的大作中提到】
: 我发现bounding box regression能拟合的对象尺寸大小范围很狭隘,
: 似乎必须通过加multiple prior的方式来提高拟合能力。
: 版上的同学有遇到过这种情况吗?
: 我最近在自己实现SSD, faster RCNN和Mask RCNN。代码在这里,
: 只能看不能跑,依赖关系还没整理出来。
: https://github.com/aaalgo/box/blob/master/train-anchors.py
: 这个只出bounding box
: https://github.com/aaalgo/box/blob/master/train.py
: 这个是类似mask RCNN了。
: 不过我目前没有做multiple prior。我觉得很奇怪,

avatar
c*d
22

下一步A, B, C的size分别是m, 2m, m(不同长度),和初始的情况不一样了
This
如果这个recursion是对的,当然是log(n)
google "Master method recursion"

【在 w**********8 的大作中提到】
: I am thinking this way.
: let n = 2m +1;
: * Let three median are Am, Bm, Cm
: * compare them, let Am < Bm < Cm
: * get rid of 0 ... m in A, and m ... n-1 in C
: * for the rest of arrays, repeat the 3 steps above.
: each time, we get rid of 2/6 of the rest numbers, i.e. T(n) = C+T(2n/3). This
: is not O(log(n)) as well.
: That's the best I can gave. Any other ideas?

avatar
r*c
23
it is not absolute right,
since the number of elements are not same for three arrays, your method does
not work well in degraded case

This

【在 w**********8 的大作中提到】
: I am thinking this way.
: let n = 2m +1;
: * Let three median are Am, Bm, Cm
: * compare them, let Am < Bm < Cm
: * get rid of 0 ... m in A, and m ... n-1 in C
: * for the rest of arrays, repeat the 3 steps above.
: each time, we get rid of 2/6 of the rest numbers, i.e. T(n) = C+T(2n/3). This
: is not O(log(n)) as well.
: That's the best I can gave. Any other ideas?

avatar
w*8
24
Thanks. You are right. "下一步A, B, C的size分别是m, 2m, m(不同长度),和初
始的情况不一样了"
But master theorem shows T(n/b) is not log(n) if b != 2.
http://en.wikipedia.org/wiki/Master_theorem

【在 c*******d 的大作中提到】
:
: 下一步A, B, C的size分别是m, 2m, m(不同长度),和初始的情况不一样了
: This
: 如果这个recursion是对的,当然是log(n)
: google "Master method recursion"

avatar
w*8
25
what is degrade case? could you give one example?

does

【在 r****c 的大作中提到】
: it is not absolute right,
: since the number of elements are not same for three arrays, your method does
: not work well in degraded case
:
: This

avatar
x*y
26
A[n/2]=6, B[n/2]=33, C[n/2]=100, So A':6 10 18 C': 7 9
To find the median M' of A' C', O(log(max(|A'|, |C'|)))=O(logn).
if M' < B[n/2], B is left with B[22 28 33], and find data in A'less than M'
, find data in C' less than M', O(logn), and eleminate them.....
All problems of this kind is essentially 2 array problem, and the time
efficiency is O(logn) * time for finding the median and delete operation in each array. For sorted array, 2nd term is O(1), but for an partially sorted array A'+C',th

【在 w**********8 的大作中提到】
: I was trying to reduce the problem to 2 arrays but I did not succeed. In
: your solution, even if you find median of A and C. How do you compare it
: with B to find the median of A, B,C.
: A:1 4 6 10 18
: B:22 28 33 45 58
: C:7 9 100 180 200
: you find the median of A and C is 9 or 10. Tell me how do you use them to
: find the median of 3 arrays.
:
: arrays, if A[n/2]
相关阅读
logo
联系我们隐私协议©2024 redian.news
Redian新闻
Redian.news刊载任何文章,不代表同意其说法或描述,仅为提供更多信息,也不构成任何建议。文章信息的合法性及真实性由其作者负责,与Redian.news及其运营公司无关。欢迎投稿,如发现稿件侵权,或作者不愿在本网发表文章,请版权拥有者通知本网处理。