【Deep Learning MVS Paper Series】CasMVSNet: Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching

doubleZ0108, 2023-01-03 18:43:26, category: MVS

Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching
CVPR 2020

Abstract

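Problem with earlier work: methods that build a 3D cost volume see memory and run time grow cubically as the volume resolution increases.

The paper proposes a memory- and time-efficient cost volume formulation that is complementary to existing methods based on 3D cost volumes.

A feature pyramid is built first; each stage then uses the previous stage's result to narrow the depth hypothesis range.

It is again a coarse-to-fine scheme: gradually higher cost volume resolution and adaptive adjustment of depth intervals.

Accuracy on DTU improves by about 35%, while GPU memory and run time drop by roughly 50%.

The formulation can be integrated into existing methods.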

Introduction

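3D CNNs can capture more geometric structure (photometric consistency, occlusion, distortion caused by perspective change).

【Cascade formulation】

A feature pyramid extracts multi-scale features
Early-stage cost volumes are built on larger-scale semantic features (with sparsely sampled depth hypotheses)
Later-stage cost volumes adaptively adjust the depth hypothesis range using the previously estimated depth map, giving a finer cost volume

This adaptive depth sampling and adjustment of image resolution spend the computation on more meaningful regions, which lowers GPU memory and run time.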

【Two problem types】

multi-view stereo: the mainstream benchmarks, such as DTU
stereo matching: end-point-error (EPE), GwcNet…

Related Work

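Methods based on 3D cost volumes are limited to downsampled cost volumes and rely on a final interpolation step to produce high-resolution disparity.

The cascade formulation can be combined with these earlier methods to raise both resolution and performance.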

Methodology

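Cost volume representation

【Three steps to build a 3D cost volume】

Define a set of discrete depth hypothesis planes
Warp the 2D features extracted from each view onto the hypothesis planes to build feature volumes
Fuse the feature volumes together into a 3D cost volume

A cost built pixel by pixel is unstable and performs poorly in occluded, repetitive-texture, low-texture and reflective regions → 3D CNNs at multiple scales can aggregate contextual information, which makes the regularization more robust.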

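【Construction for the MVS problem】

Fronto-parallel planes of the reference camera serve as the depth hypothesis planes; the depth hypothesis range comes from a sparse reconstruction (COLMAP).

A homography warps each 2D feature map onto the hypothesis planes of the reference view, which builds the feature volumes.

Finally, the per-view feature volumes are aggregated into a single cost volume by taking their variance.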
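The variance aggregation in the last step can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the feature volumes are taken to be already warped to the reference view and stacked in an array of shape `(N, F, D, H, W)`; the function name `variance_cost_volume` and the array layout are hypothetical and not taken from the paper's code.

```python
import numpy as np

def variance_cost_volume(feature_volumes: np.ndarray) -> np.ndarray:
    """Fuse N warped feature volumes into one cost volume via per-element variance.

    feature_volumes: shape (N, F, D, H, W), one volume per view (assumed layout).
    Returns an array of shape (F, D, H, W).
    """
    mean = feature_volumes.mean(axis=0)               # E[V]
    mean_of_squares = (feature_volumes ** 2).mean(axis=0)  # E[V^2]
    return mean_of_squares - mean ** 2                # Var[V] = E[V^2] - E[V]^2

# Toy data: 3 views, 8 channels, 4 depth planes, 5x6 feature map.
rng = np.random.default_rng(0)
volumes = rng.standard_normal((3, 8, 4, 5, 6))
cost = variance_cost_volume(volumes)

# If every view sees identical features, the variance (matching cost) is zero.
identical = np.repeat(volumes[:1], 3, axis=0)
zero_cost = variance_cost_volume(identical)
```

Because the variance is taken across the view axis, the output shape does not depend on the number of input views, which is what lets the same network handle an arbitrary number of views.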

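【Construction for the SM problem】

Disparity levels serve as the hypothesis planes; the range has to be chosen for the specific scene.

Because the left and right views are rectified, the warp is just a translation along the x axis (the counterpart of the projective warp in MVS, only much simpler).

The volumes are then aggregated in a similar way, with several options:

direct concatenation, without reducing the feature dimension
sum of absolute differences
left-right correlation, which produces only a single-channel correlation map for each disparity level
group-wise correlation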
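The x-axis shift and the correlation-style aggregations listed above can be sketched as follows. This is a minimal NumPy illustration, not GwcNet's implementation: the helper names, the `(F, H, W)` feature layout and the zero padding of pixels without a match are all assumptions made for the example. Setting `num_groups=1` gives the single-channel full correlation.

```python
import numpy as np

def shift_right_features(right: np.ndarray, d: int) -> np.ndarray:
    """Shift right-view features d pixels along x, so column x holds right[..., x - d].

    Columns with no counterpart (x < d) are zero-filled (an assumption of this sketch).
    """
    shifted = np.zeros_like(right)
    if d == 0:
        shifted[:] = right
    else:
        shifted[..., d:] = right[..., :-d]
    return shifted

def groupwise_correlation_volume(left, right, max_disp, num_groups):
    """left, right: (F, H, W) feature maps. Returns (num_groups, max_disp, H, W)."""
    num_channels, height, width = left.shape
    if num_channels % num_groups != 0:
        raise ValueError("F must be divisible by num_groups")
    per_group = num_channels // num_groups
    volume = np.zeros((num_groups, max_disp, height, width), dtype=left.dtype)
    for d in range(max_disp):
        product = left * shift_right_features(right, d)          # (F, H, W)
        # Average the channel-wise products inside each group.
        volume[:, d] = product.reshape(num_groups, per_group, height, width).mean(axis=1)
    return volume

# Toy pair whose true disparity is 2 everywhere: right[x] = left[x + 2].
rng = np.random.default_rng(0)
left = rng.standard_normal((16, 4, 10))
right = np.zeros_like(left)
right[..., :-2] = left[..., 2:]

full_corr = groupwise_correlation_volume(left, right, max_disp=4, num_groups=1)
group_corr = groupwise_correlation_volume(left, right, max_disp=4, num_groups=4)
```

At the true disparity the shifted right features coincide with the left features, so the correlation there reduces to the mean squared feature value. Group-wise correlation keeps one such map per channel group instead of collapsing all channels into one.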

Cascade cost volume

A fixed cost volume has size $W \times H \times D \times F$: spatial resolution, number of depth hypotheses, and feature channels. Increasing any of them improves accuracy but hurts efficiency; on a 16 GB P100 the largest setting that runs is $H \times W \times D \times F = 1600 \times 1184 \times 256 \times 32$.
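To make the growth concrete, here is a small back-of-the-envelope calculation. It is a rough sketch only: it assumes `float32` storage, counts just the raw volume (not the 3D CNN activations), and the sizes used are hypothetical rather than the paper's settings.

```python
def cost_volume_bytes(width, height, num_depths, num_channels, bytes_per_value=4):
    """Raw storage of a W x H x D x F cost volume (float32 by default)."""
    return width * height * num_depths * num_channels * bytes_per_value

# Hypothetical coarse volume versus one with W, H and D all doubled.
base = cost_volume_bytes(160, 128, 48, 32)
doubled = cost_volume_bytes(320, 256, 96, 32)
growth = doubled / base  # doubling three dimensions multiplies storage by 2**3
```

Doubling the two spatial dimensions and the number of depth planes together multiplies the storage by eight, which is the cubic growth that the cascade design is meant to avoid.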

【Depth hypothesis range】

The update rule is simple: $R_{k+1} = R_k \cdot w_k$, where $w_k < 1$ is the reduction factor and $R_1$ covers the whole depth range of the scene.

【Depth hypothesis interval】

The distance between two adjacent hypothesis planes, $I_k$, likewise starts out large and coarse; it is updated as $I_{k+1} = I_k \cdot p_k$ with $p_k < 1$, so the hypotheses become progressively denser and more precise.

【Number of depth hypothesis planes】

The previous two steps already give $R_k$ and $I_k$, so the number of planes is $D_k = R_k / I_k$.
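Putting the three quantities together, the per-stage schedule and the per-pixel hypotheses can be sketched like this. It is a minimal illustration under stated assumptions: the starting range, starting interval and reduction factors are hypothetical values rather than the paper's settings, the function names are made up for the example, and centring the planes on the previous depth estimate is one straightforward reading of "narrowing the range using the previous stage's result".

```python
import numpy as np

def cascade_schedule(r1, i1, range_factors, interval_factors):
    """Per-stage (R_k, I_k, D_k) with R_{k+1} = R_k * w_k, I_{k+1} = I_k * p_k, D_k = R_k / I_k."""
    if len(range_factors) != len(interval_factors):
        raise ValueError("need one w_k and one p_k per stage transition")
    stages = []
    r, i = float(r1), float(i1)
    for k in range(len(range_factors) + 1):
        stages.append((r, i, int(round(r / i))))
        if k < len(range_factors):
            r *= range_factors[k]      # shrink the hypothesis range
            i *= interval_factors[k]   # shrink the plane interval
    return stages

def stage_hypotheses(prev_depth, interval, num_planes):
    """Depth hypotheses centred on the previous estimate.

    prev_depth: (H, W) depth map from the previous stage (already upsampled).
    Returns (num_planes, H, W) with adjacent planes exactly `interval` apart.
    """
    offsets = (np.arange(num_planes) - (num_planes - 1) / 2.0) * interval
    return prev_depth[None, :, :] + offsets[:, None, None]

# Hypothetical three-stage setup: range 480, interval 10, w_k = 0.25, p_k = 0.5.
stages = cascade_schedule(480.0, 10.0, [0.25, 0.25], [0.5, 0.5])

prev_depth = np.full((2, 3), 600.0)            # toy depth map from stage 1
_, i2, d2 = stages[1]
hyps = stage_hypotheses(prev_depth, i2, d2)    # stage-2 hypotheses per pixel
```

Since the range shrinks faster than the interval in this toy setup, later stages use fewer planes even though those planes are spaced more finely, which leaves room to raise the spatial resolution.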