Summary
The goal is to project LIDAR data onto a 2D image. I wanted to be able to create both a "front view" and a bird's eye view of the LIDAR data; unfortunately, I only got the "front view" working, not the bird's eye view.
"Front View" projection
The idea is to convert the Cartesian coordinates into angular ones: the angle with the x axis becomes the horizontal image coordinate, and the angle with the XOY plane becomes the vertical image coordinate.
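Concretely, for a point (x, y, z), the horizontal angle is theta_h = atan2(y, x) and the vertical angle is theta_v = atan2(z, sqrt(x^2 + y^2)); dividing each angle by the corresponding angular resolution of the sensor turns it into a pixel coordinate.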
To flatten the "front view" of the LIDAR sensor into a 2D image, we must project the points in 3D space onto a cylindrical surface, which can then be unrolled onto a flat surface. The following code is adapted from the Li et al. 2016 paper:
# h_res = horizontal resolution of the lidar sensor
# v_res = vertical resolution of the lidar sensor
x_img = np.arctan2(y_lidar, x_lidar) / h_res
y_img = np.arctan2(z_lidar, np.sqrt(x_lidar**2 + y_lidar**2)) / v_res
Problem: objects that are actually on the right-hand side of the car appear on the left of the image, and objects on the left appear on the right; the positions are mirrored. The following code fixes this by negating y_lidar, which flips the horizontal axis and places the seam of the unrolled cylinder at the back of the car:
# h_res = horizontal resolution of the lidar sensor
# v_res = vertical resolution of the lidar sensor
x_img = np.arctan2(-y_lidar, x_lidar) / h_res # seam in the back
y_img = np.arctan2(z_lidar, np.sqrt(x_lidar**2 + y_lidar**2)) / v_res
Configuring the scale along each axis
The h_res and v_res variables depend heavily on the LIDAR sensor being used. In the KITTI dataset, the sensor used is a Velodyne HDL 64E. According to the Velodyne HDL 64E spec sheet, it has the following important characteristics:
A vertical field of view of 26.9 degrees, at a resolution of 0.4 degrees. The vertical field of view is split into +2 degrees above the sensor and -24.9 degrees below it.
A 360 degree horizontal field of view, at a resolution of 0.08 - 0.35 degrees (depending on the rotation rate).
A rotation rate that can be selected between 5 - 20Hz.
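As a quick sanity check, these specs pin down the theoretical size of the projected image:

h_res = 0.35              # degrees per pixel horizontally (at 20Hz)
v_res = 0.4               # degrees per pixel vertically
print(360.0 / h_res)      # ~1028.6 pixels across the full horizontal FOV
print(26.9 / v_res)       # ~67 pixels across the full vertical FOV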
We can now update the code as follows:
# Resolution and Field of View of LIDAR sensor
h_res = 0.35 # horizontal resolution, assuming rate of 20Hz is used
v_res = 0.4 # vertical res
v_fov = (-24.9, 2.0) # Field of view (-ve, +ve) along vertical axis
v_fov_total = -v_fov[0] + v_fov[1]
# Convert to Radians
v_res_rad = v_res * (np.pi/180)
h_res_rad = h_res * (np.pi/180)
# Distance from origin in the XY plane (needed for the vertical angle)
d_lidar = np.sqrt(x_lidar**2 + y_lidar**2)
# Project into image coordinates
x_img = np.arctan2(-y_lidar, x_lidar) / h_res_rad
y_img = np.arctan2(z_lidar, d_lidar) / v_res_rad
However, this places roughly half of the points at negative x coordinates, and most of them at negative y coordinates. To project into a 2D image, we need the minimum value to be at (0,0). So we need to shift things:
# SHIFT COORDINATES TO MAKE 0,0 THE MINIMUM
x_min = -360.0/h_res/2 # Theoretical min x value based on specs of sensor
x_img = x_img - x_min # Shift
x_max = 360.0/h_res # Theoretical max x value after shifting
y_min = v_fov[0]/v_res # theoretical min y value based on specs of sensor
y_img = y_img - y_min # Shift
y_max = v_fov_total/v_res # Theoretical max y value after shifting
y_max = y_max + 5 # UGLY: Fudge factor because the calculations based on
# spec sheet do not seem to match the range of angles
# collected by sensor in the data.
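One way to see whether a fudge factor is needed is to compare the theoretical range against the data itself; a quick check, assuming y_img has been shifted as above:

# After shifting, y should theoretically span 0 .. v_fov_total / v_res.
# If the actual span is wider, increase the fudge factor accordingly.
print("theoretical y range: 0.0 to", v_fov_total / v_res)
print("actual y range:", y_img.min(), "to", y_img.max())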
Rasterising as a 2D image
Now that the 3D points are projected to 2D coordinates with a minimum of (0,0), we can plot this point data as a 2D image. Note that setting figsize to (x_max/dpi, y_max/dpi) makes the figure roughly one pixel per unit of projected coordinate.
pixel_values = -d_lidar # Use depth data to encode the value for each pixel
cmap = "jet" # Color map to use
dpi = 100 # Image resolution
fig, ax = plt.subplots(figsize=(x_max/dpi, y_max/dpi), dpi=dpi)
ax.scatter(x_img,y_img, s=1, c=pixel_values, linewidths=0, alpha=1, cmap=cmap)
ax.set_facecolor((0, 0, 0)) # Set regions with no points to black (set_axis_bgcolor on matplotlib < 2.0)
ax.axis('scaled') # {equal, scaled}
ax.xaxis.set_visible(False) # Do not draw axis tick marks
ax.yaxis.set_visible(False) # Do not draw axis tick marks
plt.xlim([0, x_max]) # prevent drawing empty space outside of horizontal FOV
plt.ylim([0, y_max]) # prevent drawing empty space outside of vertical FOV
fig.savefig("/tmp/depth.png", dpi=dpi, bbox_inches='tight', pad_inches=0.0)
Putting it all together
I have put all of the above code into a single function:
def lidar_to_2d_front_view(points,
v_res,
h_res,
v_fov,
val="depth",
cmap="jet",
saveto=None,
y_fudge=0.0
):
""" Takes points in 3D space from LIDAR data and projects them to a 2D
"front view" image, and saves that image.
Args:
points: (np array)
The numpy array containing the lidar points.
The shape should be Nx4
- Where N is the number of points, and
- each point is specified by 4 values (x, y, z, reflectance)
v_res: (float)
vertical resolution of the lidar sensor used.
h_res: (float)
horizontal resolution of the lidar sensor used.
v_fov: (tuple of two floats)
(minimum_negative_angle, max_positive_angle)
val: (str)
What value to use to encode the points that get plotted.
One of {"depth", "height", "reflectance"}
cmap: (str)
Color map to use to color code the `val` values.
NOTE: Must be a value accepted by matplotlib's scatter function
Examples: "jet", "gray"
saveto: (str or None)
If a string is provided, it saves the image as this filename.
If None, then it just shows the image.
y_fudge: (float)
A hacky fudge factor to use if the theoretical calculations of
vertical range do not match the actual data.
For a Velodyne HDL 64E, set this value to 5.
"""
# DUMMY PROOFING
    assert len(v_fov) == 2, "v_fov must be list/tuple of length 2"
assert v_fov[0] <= 0, "first element in v_fov must be 0 or negative"
assert val in {"depth", "height", "reflectance"}, \
'val must be one of {"depth", "height", "reflectance"}'
x_lidar = points[:, 0]
y_lidar = points[:, 1]
z_lidar = points[:, 2]
r_lidar = points[:, 3] # Reflectance
# Distance relative to origin when looked from top
d_lidar = np.sqrt(x_lidar ** 2 + y_lidar ** 2)
    # Absolute distance relative to origin
    # d_lidar = np.sqrt(x_lidar ** 2 + y_lidar ** 2 + z_lidar ** 2)
v_fov_total = -v_fov[0] + v_fov[1]
# Convert to Radians
v_res_rad = v_res * (np.pi/180)
h_res_rad = h_res * (np.pi/180)
# PROJECT INTO IMAGE COORDINATES
x_img = np.arctan2(-y_lidar, x_lidar)/ h_res_rad
y_img = np.arctan2(z_lidar, d_lidar)/ v_res_rad
# SHIFT COORDINATES TO MAKE 0,0 THE MINIMUM
x_min = -360.0 / h_res / 2 # Theoretical min x value based on sensor specs
x_img -= x_min # Shift
x_max = 360.0 / h_res # Theoretical max x value after shifting
y_min = v_fov[0] / v_res # theoretical min y value based on sensor specs
y_img -= y_min # Shift
    y_max = v_fov_total / v_res # Theoretical max y value after shifting
    y_max += y_fudge            # Fudge factor if the calculations based on
                                # spec sheet do not match the range of
                                # angles collected by the sensor in the data.
# WHAT DATA TO USE TO ENCODE THE VALUE FOR EACH PIXEL
if val == "reflectance":
pixel_values = r_lidar
elif val == "height":
pixel_values = z_lidar
else:
pixel_values = -d_lidar
# PLOT THE IMAGE
    dpi = 100           # Image resolution
fig, ax = plt.subplots(figsize=(x_max/dpi, y_max/dpi), dpi=dpi)
ax.scatter(x_img,y_img, s=1, c=pixel_values, linewidths=0, alpha=1, cmap=cmap)
    ax.set_facecolor((0, 0, 0))  # Set regions with no points to black (set_axis_bgcolor on matplotlib < 2.0)
ax.axis('scaled') # {equal, scaled}
ax.xaxis.set_visible(False) # Do not draw axis tick marks
ax.yaxis.set_visible(False) # Do not draw axis tick marks
plt.xlim([0, x_max]) # prevent drawing empty space outside of horizontal FOV
plt.ylim([0, y_max]) # prevent drawing empty space outside of vertical FOV
if saveto is not None:
fig.savefig(saveto, dpi=dpi, bbox_inches='tight', pad_inches=0.0)
else:
fig.show()
Here are some samples of it in use:
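These samples assume lidar already holds an Nx4 numpy array of points. If you are working with the KITTI dataset, a scan can be loaded from one of the velodyne .bin files, which store each point as four float32 values (the path below is just a placeholder):

import numpy as np

# Each KITTI velodyne scan is a flat binary of float32 values,
# four per point: x, y, z, reflectance.
lidar = np.fromfile("path/to/velodyne/000000.bin", dtype=np.float32)
lidar = lidar.reshape((-1, 4))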
import matplotlib.pyplot as plt
import numpy as np
HRES = 0.35 # horizontal resolution (assuming 20Hz setting)
VRES = 0.4 # vertical res
VFOV = (-24.9, 2.0) # Field of view (-ve, +ve) along vertical axis
Y_FUDGE = 5 # y fudge factor for velodyne HDL 64E
lidar_to_2d_front_view(lidar, v_res=VRES, h_res=HRES, v_fov=VFOV, val="depth",
saveto="/tmp/lidar_depth.png", y_fudge=Y_FUDGE)
lidar_to_2d_front_view(lidar, v_res=VRES, h_res=HRES, v_fov=VFOV, val="height",
saveto="/tmp/lidar_height.png", y_fudge=Y_FUDGE)
lidar_to_2d_front_view(lidar, v_res=VRES, h_res=HRES, v_fov=VFOV,
val="reflectance", saveto="/tmp/lidar_reflectance.png",
y_fudge=Y_FUDGE)
This produces the following three images:
[Images: depth, height, and reflectance renderings of the front view]
Creating each image is currently very slow, and I suspect it is because matplotlib does not handle large numbers of scatter points well. I want to create an implementation in pure numpy, or one that uses PIL. Below is a pure numpy solution; it should be faster, and more useful as a preprocessing step.
# ==============================================================================
# SCALE_TO_255
# ==============================================================================
def scale_to_255(a, min, max, dtype=np.uint8):
""" Scales an array of values from specified min, max range to 0-255
Optionally specify the data type of the output (default is uint8)
"""
return (((a - min) / float(max - min)) * 255).astype(dtype)
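For example, values spanning the range 0-100 get mapped linearly onto 0-255:

print(scale_to_255(np.array([0.0, 50.0, 100.0]), min=0, max=100))
# prints: [  0 127 255]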
# ==============================================================================
# POINT_CLOUD_TO_PANORAMA
# ==============================================================================
def point_cloud_to_panorama(points,
                            v_res=0.42,
                            h_res=0.35,
                            v_fov=(-24.9, 2.0),
                            d_range=(0, 100),
                            y_fudge=3
                            ):
""" Takes point cloud data as input and creates a 360 degree panoramic
image, returned as a numpy array.
Args:
points: (np array)
            The numpy array containing the point cloud.
The shape should be at least Nx3 (allowing for more columns)
- Where N is the number of points, and
- each point is specified by at least 3 values (x, y, z)
v_res: (float)
vertical angular resolution in degrees. This will influence the
height of the output image.
h_res: (float)
horizontal angular resolution in degrees. This will influence
the width of the output image.
v_fov: (tuple of two floats)
            Field of view in degrees (min_negative_angle, max_positive_angle)
d_range: (tuple of two floats) (default = (0,100))
Used for clipping distance values to be within a min and max range.
y_fudge: (float)
A hacky fudge factor to use if the theoretical calculations of
vertical image height do not match the actual data.
Returns:
A numpy array representing a 360 degree panoramic image of the point
cloud.
"""
# Projecting to 2D
x_points = points[:, 0]
y_points = points[:, 1]
z_points = points[:, 2]
    # r_points = points[:, 3]  # reflectance; unused here, and not required by Nx3 input
d_points = np.sqrt(x_points ** 2 + y_points ** 2) # map distance relative to origin
#d_points = np.sqrt(x_points**2 + y_points**2 + z_points**2) # abs distance
    # We use map distance, because otherwise it would not project onto a cylinder;
    # instead, it would map onto a segment of a sphere.
# RESOLUTION AND FIELD OF VIEW SETTINGS
v_fov_total = -v_fov[0] + v_fov[1]
# CONVERT TO RADIANS
v_res_rad = v_res * (np.pi / 180)
h_res_rad = h_res * (np.pi / 180)
# MAPPING TO CYLINDER
x_img = np.arctan2(y_points, x_points) / h_res_rad
y_img = -(np.arctan2(z_points, d_points) / v_res_rad)
# THEORETICAL MAX HEIGHT FOR IMAGE
d_plane = (v_fov_total/v_res) / (v_fov_total* (np.pi / 180))
h_below = d_plane * np.tan(-v_fov[0]* (np.pi / 180))
h_above = d_plane * np.tan(v_fov[1] * (np.pi / 180))
y_max = int(np.ceil(h_below+h_above + y_fudge))
# SHIFT COORDINATES TO MAKE 0,0 THE MINIMUM
x_min = -360.0 / h_res / 2
x_img = np.trunc(-x_img - x_min).astype(np.int32)
x_max = int(np.ceil(360.0 / h_res))
y_min = -((v_fov[1] / v_res) + y_fudge)
y_img = np.trunc(y_img - y_min).astype(np.int32)
# CLIP DISTANCES
d_points = np.clip(d_points, a_min=d_range[0], a_max=d_range[1])
# CONVERT TO IMAGE ARRAY
img = np.zeros([y_max + 1, x_max + 1], dtype=np.uint8)
img[y_img, x_img] = scale_to_255(d_points, min=d_range[0], max=d_range[1])
return img
Here is an example using values configured for a Velodyne HDL 64E:
im = point_cloud_to_panorama(points,
v_res=0.42,
h_res=0.35,
v_fov=(-24.9, 2.0),
y_fudge=3,
d_range=(0,100))
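Since the function returns a plain numpy array, the result can be saved or displayed without going through matplotlib's scatter at all; for example with PIL, as suggested above (a minimal sketch, assuming Pillow is installed):

from PIL import Image

# The panorama is a 2D uint8 array, so it maps directly to a grayscale image.
Image.fromarray(im).save("/tmp/lidar_panorama.png")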