paper DINO #483

Open

junxnone opened this issue Oct 14, 2024 · 0 comments

DINO

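DINO - DETR with Improved deNoising anchOr boxes
Problems with DETR
- Slow convergence (caused by decoder cross-attention and the instability of bipartite matching)
- The meaning of the queries is unclear
Prior work that DINO builds on
- DAB-DETR - formulates positional queries explicitly as dynamic anchor boxes
- DN-DETR - adds denoising training to stabilize bipartite matching during training
- Deformable DETR - speeds up convergence
End-to-End
Contrastive denoising training - helps the model avoid duplicate outputs for the same object
Mixed query selection for anchor initialization - initializes the queries better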

Arch

Contrastive DeNoising Training
Mixed Query Selection
Look Forward Twice

Contrastive DeNoising Training

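Generates two types of contrastive denoising (CDN) queries: positive queries and negative queries
Points inside the inner square represent positive queries
Points between the inner and outer squares represent negative queries
The two regions are controlled by the noise scales ( $\lambda_{1} < \lambda_{2}$ ): positive queries carry noise below $\lambda_{1}$, negative queries carry noise between $\lambda_{1}$ and $\lambda_{2}$
Suppresses duplicate boxes more effectively
Improves detection of small objects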
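
The positive/negative split above can be sketched in a few lines of plain Python. This is a simplified illustration, not the official DINO implementation: the function names `noised_box` and `make_cdn_queries`, the `(x1, y1, x2, y2)` box format, the default values 0.4 and 0.8, and the choice to measure noise as a fraction of the half width/height are all assumptions made for this example.

```python
import random


def noised_box(box, lo, hi, rng):
    """Shift every corner coordinate of an (x1, y1, x2, y2) box by a random
    fraction in [lo, hi] of the box's half width / half height."""
    x1, y1, x2, y2 = box
    half_w, half_h = (x2 - x1) / 2, (y2 - y1) / 2
    scales = (half_w, half_h, half_w, half_h)
    out = []
    for coord, scale in zip(box, scales):
        magnitude = rng.uniform(lo, hi)   # how far to move, relative to half size
        sign = rng.choice((-1.0, 1.0))    # move inward or outward at random
        out.append(coord + sign * magnitude * scale)
    return tuple(out)


def make_cdn_queries(gt_boxes, lambda1=0.4, lambda2=0.8, seed=0):
    """Build one positive and one negative query per ground-truth box.

    Positive: noise below lambda1, expected to reconstruct the GT box.
    Negative: noise between lambda1 and lambda2, expected to predict "no object".
    """
    if not 0 < lambda1 < lambda2 <= 1:
        # lambda2 <= 1 keeps x1 < x2 and y1 < y2 in this simplified scheme (assumption)
        raise ValueError("expected 0 < lambda1 < lambda2 <= 1")
    rng = random.Random(seed)
    queries = []
    for idx, box in enumerate(gt_boxes):
        queries.append({"gt_index": idx, "target": "reconstruct",
                        "box": noised_box(box, 0.0, lambda1, rng)})
        queries.append({"gt_index": idx, "target": "no_object",
                        "box": noised_box(box, lambda1, lambda2, rng)})
    return queries


if __name__ == "__main__":
    for q in make_cdn_queries([(10.0, 20.0, 50.0, 60.0)]):
        print(q)
```

Because a negative query sits close to a real object but is still trained toward "no object", the model learns to reject near-duplicate anchors, which matches the duplicate-suppression effect noted above.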

Mixed Query Selection

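DETR & DN-DETR use static embeddings as decoder queries (Fig. 5(a) in the paper)
- Positional queries are learned; content queries are set to all-zero vectors
Deformable DETR
- Learns both positional queries and content queries
- two-stage: selects the top-K encoder features to enhance the queries (Fig. 5(b) in the paper)
DINO initializes the anchor boxes only from the position information associated with the selected top-K features and keeps the content queries learnable, so that the selected features do not mislead the decoder.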
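
A minimal sketch of the DINO-style selection step, assuming per-token objectness scores and box proposals are already available from the encoder. The function name `mixed_query_selection` and the plain-list data layout are hypothetical and chosen only for illustration; a real model would operate on tensors.

```python
def mixed_query_selection(scores, boxes, learned_content, k):
    """Pick anchors from the top-k encoder tokens, keep content queries learned.

    scores          : one objectness score per encoder token
    boxes           : one box proposal per encoder token
    learned_content : k learnable content embeddings (independent of the image)
    """
    if len(scores) != len(boxes):
        raise ValueError("scores and boxes must have the same length")
    if len(learned_content) != k:
        raise ValueError("need exactly k learned content embeddings")

    # Indices of the k highest-scoring encoder tokens.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

    # Positional part: anchors come from the selected tokens' boxes.
    anchors = [boxes[i] for i in top]

    # Content part: NOT taken from the selected features, just the learned embeddings.
    content = [list(vec) for vec in learned_content]
    return anchors, content, top


if __name__ == "__main__":
    scores = [0.1, 0.9, 0.3, 0.7]
    boxes = [(0, 0, 1, 1), (1, 1, 2, 2), (2, 2, 3, 3), (3, 3, 4, 4)]
    learned = [[0.5, -0.5], [0.25, 0.75]]
    print(mixed_query_selection(scores, boxes, learned, k=2))
```

In the two-stage Deformable DETR variant, the content queries would also be derived from the selected features; here only the anchors depend on the image.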

Look Forward Twice

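Building on the look-forward-once scheme of Deformable DETR, DINO proposes look forward twice
The parameters of layer i are influenced by the losses of both layer i and layer (i + 1): the output of the next layer supervises the final box of the current layer, which improves the accuracy of the predicted boxes
That is, both the initial box $b_{i - 1}$ and the box offset $\Delta b_{i}$ are optimized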
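
The box update can be written out as follows. This is a reconstruction in the spirit of the paper's formulation, not a verbatim quote; $b'_{i}$ (the undetached box), $\mathrm{Detach}$ (stop-gradient), $\mathrm{Update}$ (box refinement), and $b_{i}^{(\mathrm{pred})}$ (the box that layer $i$'s loss is computed on) are notation assumed here.

$$
\begin{aligned}
\Delta b_{i} &= \mathrm{Layer}_{i}(b_{i-1}), & b'_{i} &= \mathrm{Update}(b_{i-1}, \Delta b_{i}), \\
b_{i} &= \mathrm{Detach}(b'_{i}), & b_{i}^{(\mathrm{pred})} &= \mathrm{Update}(b'_{i-1}, \Delta b_{i}).
\end{aligned}
$$

Under look forward once, the prediction would be built from the detached $b_{i-1}$, so the loss at layer $i$ reaches only $\Delta b_{i}$. Using the undetached $b'_{i-1}$ instead lets the same loss also flow back into layer $i-1$.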
