Paper: https://cs.nyu.edu/~silberman/papers/indoor_seg_support.pdf

Dataset page: https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html

The dataset consists of the following parts:

  1. Labeled dataset: a subset of the video data, consisting of 1449 pairs of RGB and depth frames with dense multi-class labels; the depth maps have been preprocessed to fill in missing values.
  2. Raw dataset: the raw RGB, depth and accelerometer data as recorded by the Kinect.
  3. Toolbox: useful functions for manipulating the data and labels.
  4. Train/test splits used for evaluation.

Labeled dataset



RGB image (left), preprocessed depth (middle), a set of labels for the image (right)

 

The labeled dataset is a subset of the raw dataset. It consists of synchronized pairs of RGB and depth frames, and every image carries dense multi-class labels. In addition to the label maps (right), it also includes a set of preprocessed depth maps (middle) whose missing values have been filled in with a colorization method. Unlike the raw dataset, the labeled dataset is distributed as a single MATLAB .mat file containing the following variables (a minimal loading sketch follows the list):

  • accelData – Nx4 matrix of accelerometer values recorded when each frame was taken. The columns contain the roll, yaw, pitch and tilt angle of the device.
  • depths – HxWxN matrix of depth maps where H and W are the height and width, respectively and N is the number of images. The values of the depth elements are in meters.
  • images – HxWx3xN matrix of RGB images where H and W are the height and width, respectively, and N is the number of images.
  • labels – HxWxN matrix of label masks where H and W are the height and width, respectively and N is the number of images. The labels range from 1..C where C is the total number of classes. If a pixel’s label value is 0, then that pixel is ‘unlabeled’.
  • names – Cx1 cell array of the English names of each class.
  • namesToIds – map from English label names to class IDs (with C key-value pairs).
  • rawDepths – HxWxN matrix of depth maps where H and W are the height and width, respectively, and N is the number of images. These depth maps are the raw output from the Kinect.
  • scenes – Nx1 cell array of the name of the scene from which each image was taken.
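
As a minimal sketch of how the labeled data can be used (assuming the file has been downloaded as nyu_depth_v2_labeled.mat into the MATLAB working directory; the frame index and pixel coordinates below are arbitrary examples):

 % Minimal sketch: load the labeled dataset and inspect one frame.
 % Assumes nyu_depth_v2_labeled.mat is in the current working directory.
 load('nyu_depth_v2_labeled.mat');   % loads images, depths, rawDepths, labels, names, ...

 k        = 1;                       % arbitrary frame index
 rgb      = images(:, :, :, k);      % HxWx3 RGB image
 depth    = depths(:, :, k);         % HxW in-painted depth map, in meters
 rawDepth = rawDepths(:, :, k);      % HxW raw Kinect depth map, in meters
 label    = labels(:, :, k);         % HxW label mask, 0 = unlabeled

 % Look up the English class name at an arbitrary pixel, if it is labeled.
 id = label(100, 100);
 if id > 0
     disp(names{id});
 end

 % View the RGB frame, the filled depth and the label map side by side.
 figure;
 subplot(1, 3, 1); imshow(rgb);
 subplot(1, 3, 2); imagesc(depth); axis image; title('filled depth');
 subplot(1, 3, 3); imagesc(label); axis image; title('labels');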

Raw dataset



RGB image (left), depth image (right)

 

The raw dataset contains the raw images and accelerometer dumps captured by the Kinect. The RGB and depth cameras are sampled at 20-30 FPS (frames per second), and the rate varies over time. Because the frames are not synchronized, every folder records a timestamp for each RGB image, depth map and accelerometer dump (a rough matching sketch follows the note below). The dataset is split into folders, one per captured scene, named like 'living_room_0012' or 'office_0014'. The file structure looks like this:

/
 ../bedroom_0001/
 ../bedroom_0001/a-1294886363.011060-3164794231.dump
 ../bedroom_0001/a-1294886363.016801-3164794231.dump
 ...
 ../bedroom_0001/d-1294886362.665769-3143255701.pgm
 ../bedroom_0001/d-1294886362.793814-3151264321.pgm
 ...
 ../bedroom_0001/r-1294886362.238178-3118787619.ppm
 ../bedroom_0001/r-1294886362.814111-3152792506.ppm
  • Files prefixed with a- are accelerometer dumps
  • Files prefixed with d- are depth images
  • Files prefixed with r- are raw RGB images
    Note: because the raw dataset has not been preprocessed, the raw depth images must be projected onto the RGB coordinate frame to align them with the RGB images.
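
Since each timestamp is embedded in the file name (the first number after the a-/d-/r- prefix), a rough way to pair frames is to parse those timestamps and, for each RGB image, take the depth image whose timestamp is closest. The toolbox function get_synched_frames.m does this properly; the sketch below only illustrates the idea, and the scene path is a placeholder:

 % Rough sketch of timestamp-based matching for one scene folder (placeholder path).
 % For real use, prefer get_synched_frames.m from the toolbox.
 sceneDir = 'bedroom_0001';

 rgbFiles   = dir(fullfile(sceneDir, 'r-*.ppm'));
 depthFiles = dir(fullfile(sceneDir, 'd-*.pgm'));

 % The timestamp is the first number after the prefix, e.g. r-1294886362.238178-...
 getTime    = @(name) sscanf(name, '%*c-%f', 1);
 rgbTimes   = arrayfun(@(f) getTime(f.name), rgbFiles);
 depthTimes = arrayfun(@(f) getTime(f.name), depthFiles);

 % For each RGB frame, pick the depth frame with the nearest timestamp.
 for i = 1 : numel(rgbFiles)
     [~, j] = min(abs(depthTimes - rgbTimes(i)));
     fprintf('%s  <->  %s\n', rgbFiles(i).name, depthFiles(j).name);
 end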

Toolbox

  • demo_synched_projected_frames.m – Demos synchronization of the raw rgb and depth images as well as alignment of the rgb and raw depth.
  • eval_seg.m – Evaluates the predicted segmentation against the ground truth label map.
  • get_projected_depth.m – Projects the raw depth image onto the rgb image plane. This file contains the calibration parameters that are specific to the kinect used to gather the data.
  • get_accel_data.m – Extracts the accelerometer data from the binary files in the raw dataset.
  • get_projection_mask.m – Gets a mask for the images that crops areas whose depth had to be inferred, rather than directly projected from the Kinect signal.
  • get_synched_frames.m – Returns a list of synchronized rgb and depth frames from a given scene in the raw dataset.
  • get_train_test_split.m – Splits the data in a way that ensures that images from the same scene are not found in both train and test sets.
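
As a hedged end-to-end sketch, the toolbox functions above might be combined for one raw scene roughly as follows; the exact signatures, field names and return values are assumptions inferred from the descriptions, so check the toolbox sources (and demo_synched_projected_frames.m) before relying on them:

 % Hypothetical usage of the toolbox on one raw scene; signatures, field names
 % and return values below are assumptions, not verified against the toolbox.
 sceneDir = 'bedroom_0001';

 % Pair up RGB and depth frames by timestamp.
 frames = get_synched_frames(sceneDir);

 % Read the first synchronized pair (field names are assumed).
 rgb      = imread(fullfile(sceneDir, frames(1).rawRgbFilename));
 rawDepth = imread(fullfile(sceneDir, frames(1).rawDepthFilename));

 % Project the raw depth onto the RGB image plane so the two are aligned.
 depthAligned = get_projected_depth(rawDepth);

 % Mask out border regions whose depth had to be inferred rather than measured.
 mask = get_projection_mask();
 depthAligned(~mask) = 0;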