3D Point Cloud Course: Running PointNet-Pytorch



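  • 3D Point Cloud Course: Running PointNet-Pytorch
  • 1. Classification
  • 1.1 Training
  • 1.2 Problems that may occur during classification training
  • 1.3 Testing the trained classification model
  • 1.4 Source-code problems you may hit when testing classification
  • 2. Segmentation
  • 2.1 Training
  • 2.2 Problems that may occur during segmentation training
  • 2.3 Testing the trained segmentation model
  • 2.4 Source-code problems you may hit when testing segmentation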



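PointNet source code download: https://github.com/fxia22/pointnet.pytorch

The source code has a few problems. If you enjoy tinkering, read on; if not, I also provide my debugged PointNet_Pytorch package at the end of the article, which runs as-is.

Environment (Windows 10):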

python==3.7.4

torch==1.6.0

cuda=10.1

cudnn=8.0

Installing the PointNet package

cd pointnet.pytorch-master
pip install -e .

Then run pip list in the terminal and check that the PointNet package appears in the output.
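The same check can be done from Python. This is a minimal sketch, assuming the editable install registers the import name pointnet (the name used by the imports in the training scripts); is_installed is a hypothetical helper, not part of the repository.

```python
import importlib.util


def is_installed(package_name):
    """Return True if `package_name` can be found on the current Python path."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # "pointnet" is an assumed import name; adjust it if your install differs.
    print("pointnet importable:", is_installed("pointnet"))
```

If this prints False inside the environment you train in, the pip install step most likely ran in a different Python environment.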

The dataset is of the shapenet type.



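1. Classification

1.1 Training

In the pointnet.pytorch-master/utils folder, open a terminal with a shortcut: in File Explorer press Ctrl+L, type cmd and press Enter. A cmd window opens, already located in that folder. In the terminal, enter: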

python train_classification.py --dataset=E:\PointNet\pointnet.pytorch-master\pointnet.pytorch-master\shapenetcore_partanno_segmentation_benchmark_v0\ --nepoch=4 --dataset_type=shapenet

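If nothing goes wrong, training starts in the terminal, and one test batch is evaluated after every 10 training batches. Output like the following means training has finished: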

[Screenshot: terminal output when classification training finishes]

A classification model folder named cls is now created in this folder.
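The training loop saves one file per epoch, named cls_model_<epoch>.pth, into that folder. A small sketch for listing them and picking the most recent one follows; list_checkpoints and latest_checkpoint are hypothetical helpers written for illustration, not functions from the repository.

```python
import os
import re


def list_checkpoints(folder, prefix="cls_model_"):
    """Return (epoch, path) pairs for files named <prefix><epoch>.pth, sorted by epoch."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)\.pth$")
    found = []
    for name in os.listdir(folder):
        match = pattern.match(name)
        if match:
            # Compare epochs as integers so that 10 sorts after 2.
            found.append((int(match.group(1)), os.path.join(folder, name)))
    return sorted(found)


def latest_checkpoint(folder, prefix="cls_model_"):
    """Return the path of the checkpoint with the highest epoch, or None if there is none."""
    checkpoints = list_checkpoints(folder, prefix)
    return checkpoints[-1][1] if checkpoints else None
```

For example, latest_checkpoint("cls") would return the path to pass to --model when testing; with --nepoch=4 the epochs run from 0 to 3.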



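1.2 Problems that may occur during classification training

Error 1: Detected call of `lr_scheduler.step()` before `optimizer.step()`

Solution: in train_classification.py, move the scheduler.step() call (just below for epoch in range(opt.nepoch)) so that it runs after each epoch's training has finished. The modified code is: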

for epoch in range(opt.nepoch):
        #scheduler.step()
        for i, data in enumerate(dataloader, 0):
            points, target = data
            target = target[:, 0]
            points = points.transpose(2, 1)
            points, target = points.cuda(), target.cuda()
            optimizer.zero_grad()
            classifier = classifier.train()
            pred, trans, trans_feat = classifier(points)
            loss = F.nll_loss(pred, target)
            if opt.feature_transform:
                loss += feature_transform_regularizer(trans_feat) * 0.001
            loss.backward()
            optimizer.step()
            pred_choice = pred.data.max(1)[1]
            correct = pred_choice.eq(target.data).cpu().sum()
            print('[%d: %d/%d] train loss: %f accuracy: %f' % (epoch, i, num_batch, loss.item(), correct.item() / float(opt.batchSize)))

            if i % 10 == 0:
                j, data = next(enumerate(testdataloader, 0))
                points, target = data
                target = target[:, 0]
                points = points.transpose(2, 1)
                points, target = points.cuda(), target.cuda()
                classifier = classifier.eval()
                pred, _, _ = classifier(points)
                loss = F.nll_loss(pred, target)
                pred_choice = pred.data.max(1)[1]
                correct = pred_choice.eq(target.data).cpu().sum()
                print('[%d: %d/%d] %s loss: %f accuracy: %f' % (epoch, i, num_batch, blue('test'), loss.item(), correct.item()/float(opt.batchSize)))
        
        scheduler.step()
        torch.save(classifier.state_dict(), '%s/cls_model_%d.pth' % (opt.outf, epoch))
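To see what the scheduler actually does once it is stepped once per epoch, the StepLR rule can be written out by hand. The sketch below is plain Python, not PyTorch; step_lr is a hypothetical helper, and the numbers are the ones train_classification.py passes to Adam and StepLR (lr=0.001, step_size=20, gamma=0.5).

```python
def step_lr(base_lr, step_size, gamma, epoch):
    """Learning rate after `epoch` scheduler steps under a StepLR-style schedule.

    The rate is multiplied by `gamma` once every `step_size` epochs.
    """
    return base_lr * gamma ** (epoch // step_size)


if __name__ == "__main__":
    for epoch in (0, 3, 19, 20, 40):
        print(epoch, step_lr(0.001, 20, 0.5, epoch))
```

With --nepoch=4 as in the command above, the learning rate stays at 0.001 for the whole run; the first halving would only happen at epoch 20.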

Error 2: The "freeze_support()" line can be omitted if the program is not going to be frozen to produce an executable.

Solution: put the code you want to run inside an if __name__=='__main__': block. train_classification.py is changed as follows:

if __name__=='__main__':

    optimizer = optim.Adam(classifier.parameters(), lr=0.001, betas=(0.9, 0.999))
    scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5)
    classifier.cuda()

    num_batch = len(dataset) / opt.batchSize

    for epoch in range(opt.nepoch):
        #scheduler.step()
        for i, data in enumerate(dataloader, 0):
            points, target = data
            target = target[:, 0]
            points = points.transpose(2, 1)
            points, target = points.cuda(), target.cuda()
            optimizer.zero_grad()
            classifier = classifier.train()
            pred, trans, trans_feat = classifier(points)
            loss = F.nll_loss(pred, target)
            if opt.feature_transform:
                loss += feature_transform_regularizer(trans_feat) * 0.001
            loss.backward()
            optimizer.step()
            pred_choice = pred.data.max(1)[1]
            correct = pred_choice.eq(target.data).cpu().sum()
            print('[%d: %d/%d] train loss: %f accuracy: %f' % (epoch, i, num_batch, loss.item(), correct.item() / float(opt.batchSize)))

            if i % 10 == 0:
                j, data = next(enumerate(testdataloader, 0))
                points, target = data
                target = target[:, 0]
                points = points.transpose(2, 1)
                points, target = points.cuda(), target.cuda()
                classifier = classifier.eval()
                pred, _, _ = classifier(points)
                loss = F.nll_loss(pred, target)
                pred_choice = pred.data.max(1)[1]
                correct = pred_choice.eq(target.data).cpu().sum()
                print('[%d: %d/%d] %s loss: %f accuracy: %f' % (epoch, i, num_batch, blue('test'), loss.item(), correct.item()/float(opt.batchSize)))
        
        scheduler.step()
        torch.save(classifier.state_dict(), '%s/cls_model_%d.pth' % (opt.outf, epoch))

    total_correct = 0
    total_testset = 0
    for i,data in tqdm(enumerate(testdataloader, 0)):
        points, target = data
        target = target[:, 0]
        points = points.transpose(2, 1)
        points, target = points.cuda(), target.cuda()
        classifier = classifier.eval()
        pred, _, _ = classifier(points)
        pred_choice = pred.data.max(1)[1]
        correct = pred_choice.eq(target.data).cpu().sum()
        total_correct += correct.item()
        total_testset += points.size()[0]

    print("final accuracy {}".format(total_correct / float(total_testset)))

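Error 3: PermissionError: [WinError 5] Access is denied

Solution: set the number of training workers to 0 (the default is 4). A labmate told me this setting is about threads; more precisely, it is the number of worker processes the DataLoader uses, and with 0 the data is loaded in the main process. She also said that setting it to 0 does no harm on Windows 10 and that the extra workers only really matter on Linux (I am a complete beginner in deep learning). The change is: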

parser.add_argument(
    '--workers', type=int, help='number of data loading workers', default=0)  # default was 4

Error 4: an out-of-memory message appears while the program is running

Solution: make the batch size smaller (the default is 32; 8 is enough). The change is:

parser.add_argument(
    '--batchSize', type=int, default=8, help='input batch size')  # default was 32
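Because --workers and --batchSize are ordinary argparse options, the same values can also be passed on the command line (for example by appending --workers=0 --batchSize=8 to the training command) instead of editing the defaults. The sketch below reproduces just these two options with their original defaults to show how the override works; build_parser is a hypothetical name, and the real script defines more options.

```python
import argparse


def build_parser():
    """Parser with only the two options discussed here, using the original defaults."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--batchSize', type=int, default=32, help='input batch size')
    parser.add_argument('--workers', type=int, default=4,
                        help='number of data loading workers')
    return parser


if __name__ == "__main__":
    # Equivalent to: python train_classification.py --workers=0 --batchSize=8
    opt = build_parser().parse_args(['--workers', '0', '--batchSize', '8'])
    print(opt.workers, opt.batchSize)
```

Passing flags keeps the source file identical to upstream, which makes it easier to compare against the repository later.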

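Error 5: after entering the command, the program neither reports an error nor responds. It is normally somewhat slow to start, but if nothing happens for a long time, something is wrong in the code. This is a hidden bug, and it took me several days to track it down.

Solution: pass drop_last=True to the PyTorch DataLoader, as shown below. For what drop_last does, see: https://www.jb51.net/article/178398.htm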

dataloader = torch.utils.data.DataLoader(
    dataset,
    batch_size=opt.batchSize,
    shuffle=True,
    num_workers=int(opt.workers),
    drop_last=True)  # important: discard the incomplete last batch

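These are the problems I ran into during classification. Yours may differ from mine; in that case, searching online for the error message usually helps.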



1.3 Testing the trained classification model

In the pointnet.pytorch-master/utils folder, enter:

python show_cls.py --model cls/cls_model_0.pth

If everything is working, the terminal starts printing results:

[Screenshot: terminal output of show_cls.py]


1.4 Source-code problems you may hit when testing classification

1. FileNotFoundError: [Errno 2] No such file or directory: 'shapenetcore_partanno_segmentation_benchmark_v0/synsetoffset2category.txt'

Solution

The file does exist. In show_cls.py, the dataset root is given as a relative path:
root='shapenetcore_partanno_segmentation_benchmark_v0'. Change it to an absolute path:

root=r'E:\PointNet\pointnet.pytorch-master\pointnet.pytorch-master\shapenetcore_partanno_segmentation_benchmark_v0',
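A relative root is resolved against the directory the script is started from, which is why the file appears to be missing even though it exists elsewhere. Before editing the source, it can help to confirm where Python is really looking. A minimal sketch, with find_dataset_file as a hypothetical helper; on Windows, write the path as a raw string (r'E:\...') or with forward slashes so that backslashes are not read as escape sequences.

```python
from pathlib import Path


def find_dataset_file(root, name="synsetoffset2category.txt"):
    """Return the absolute path of `name` under `root`, or None if it is missing."""
    candidate = Path(root).resolve() / name
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    # The relative root used in show_cls.py, resolved from the current directory.
    root = "shapenetcore_partanno_segmentation_benchmark_v0"
    print("looking in:", Path(root).resolve())
    print("found:", find_dataset_file(root))
```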

2. After that fix, every reported accuracy is 0. Change the following code in show_cls.py:

#correct = pred_choice.eq(target.data).cpu().sum()
correct = target.eq(pred_choice.data).cpu().sum().data.numpy()

Reference: https://github.com/fxia22/pointnet.pytorch/issues/29



2. Segmentation

2.1 Training

In the pointnet.pytorch-master/utils folder, enter:

python train_segmentation.py --dataset=E:\PointNet\pointnet.pytorch-master\pointnet.pytorch-master\shapenetcore_partanno_segmentation_benchmark_v0 --nepoch=5 --class_choice=Chair

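If everything is working, training starts in the terminal. Output like the following means training has finished, and a segmentation model folder named seg is created in the utils folder.

[Screenshot: terminal output when segmentation training finishes]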


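2.2 Problems that may occur during segmentation training

The code changes are the same as in "1.2 Problems that may occur during classification training", so they are not repeated here.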



2.3 Testing the trained segmentation model

In the pointnet.pytorch-master/utils folder, enter:

python show_seg.py --model seg/seg_model_Chair_3.pth --class_choice Airplane

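If everything is working, a window like the following is displayed:

[Screenshot: segmentation result rendered by show_seg.py]

In this command, Airplane can be replaced with any other category that has been trained. The viewer window also supports a few keyboard shortcuts: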

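q: quit the program; the viewer runs in an infinite loop, so it cannot be closed by force

t+q: change the colours

n: zoom in

m: zoom out

r: reset the view

s: save the image

For the exact key bindings, see show3d_balls.py.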



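2.4 Source-code problems you may hit when testing segmentation

1. FileNotFoundError: [Errno 2] No such file or directory: '~/dengjie/Paper/PointNet/pointnet.pytorch/shapenetcore_partanno_segmentation_benchmark_v0/synsetoffset2category.txt'

Solution: set root to the absolute path instead of taking it from opt.dataset: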

#root=opt.dataset,   
root='/home/dengjie/dengjie/Paper/PointNet/pointnet.pytorch/shapenetcore_partanno_segmentation_benchmark_v0',
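The path in the error message starts with ~, which is a shell shortcut for the home directory; Python's file functions do not expand it by themselves. As an alternative to hard-coding the path, the value can be expanded before use. A minimal sketch, with resolve_root as a hypothetical helper that is not part of the repository:

```python
import os


def resolve_root(path):
    """Expand a leading '~' and return an absolute path."""
    return os.path.abspath(os.path.expanduser(path))


if __name__ == "__main__":
    # Example value only; substitute your own dataset location.
    print(resolve_root("~/shapenetcore_partanno_segmentation_benchmark_v0"))
```

Wrapping the dataset root in such a helper would let both ~-style and relative paths work from the command line.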

2. If running the command above produces errors like these:

[Screenshots: error messages shown when running show_seg.py]

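The cause is that show_seg.py calls show3d_balls.py, and show3d_balls.py loads the render_balls_so library:

dll = np.ctypeslib.load_library('render_balls_so', '.')

The DLL has to be rebuilt, and the original file replaced with the new one.

The steps are as follows:

In Visual Studio, create a new DLL (dynamic-link library) project with the header file dlltest.h and the source file dllmain.cpp. After a successful build, render_balls_so.dll appears in the \x64\Debug folder; use it to replace the original.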

dlltest.h

#include <cstdio>
#include <vector>
#include <algorithm>
#include <math.h>
using namespace std;

struct PointInfo {
	int x, y, z;
	float r, g, b;
};

extern "C" __declspec(dllexport) void render_ball(int h, int w, unsigned char * show, int n, int * xyzs, float * c0, float * c1, float * c2, int r) {
		/*r = max(r, 1);
		vector<int> depth(h*w, -2100000000);
		vector<PointInfo> pattern;
		for (int dx = -r; dx <= r; dx++)
			for (int dy = -r; dy <= r; dy++)
				if (dx*dx + dy*dy<r*r) {
					double dz = sqrt(double(r*r - dx*dx - dy*dy));
					PointInfo pinfo;
					pinfo.x = dx;
					pinfo.y = dy;
					pinfo.z = dz;
					pinfo.r = dz / r;
					pinfo.g = dz / r;
					pinfo.b = dz / r;
					pattern.push_back(pinfo);
				}
		double zmin = 0, zmax = 0;
		for (int i = 0; i<n; i++) {
			if (i == 0) {
				zmin = xyzs[i * 3 + 2] - r;
				zmax = xyzs[i * 3 + 2] + r;
			}
			else {
				zmin = min(zmin, double(xyzs[i * 3 + 2] - r));
				zmax = max(zmax, double(xyzs[i * 3 + 2] + r));
			}
		}
		for (int i = 0; i<n; i++) {
			int x = xyzs[i * 3 + 0], y = xyzs[i * 3 + 1], z = xyzs[i * 3 + 2];
			for (int j = 0; j<int(pattern.size()); j++) {
				int x2 = x + pattern[j].x;
				int y2 = y + pattern[j].y;
				int z2 = z + pattern[j].z;
				if (!(x2<0 || x2 >= h || y2<0 || y2 >= w) && depth[x2*w + y2]<z2) {
					depth[x2*w + y2] = z2;
					double intensity = min(1.0, (z2 - zmin) / (zmax - zmin)*0.7 + 0.3);
					show[(x2*w + y2) * 3 + 0] = pattern[j].b*c2[i] * intensity;
					show[(x2*w + y2) * 3 + 1] = pattern[j].g*c0[i] * intensity;
					show[(x2*w + y2) * 3 + 2] = pattern[j].r*c1[i] * intensity;
				}
			}
		}*/
	}

//extern "C"

dllmain.cpp

#include <cstdio>
#include <vector>
#include <algorithm>
#include <math.h>
using namespace std;

struct PointInfo {
	int x, y, z;
	float r, g, b;
};

extern "C" {

	__declspec(dllexport) void render_ball(int h, int w, unsigned char * show, int n, int * xyzs, float * c0, float * c1, float * c2, int r) {
		r = max(r, 1);
		vector<int> depth(h*w, -2100000000);
		vector<PointInfo> pattern;
		for (int dx = -r; dx <= r; dx++)
			for (int dy = -r; dy <= r; dy++)
				if (dx*dx + dy*dy<r*r) {
					double dz = sqrt(double(r*r - dx*dx - dy*dy));
					PointInfo pinfo;
					pinfo.x = dx;
					pinfo.y = dy;
					pinfo.z = dz;
					pinfo.r = dz / r;
					pinfo.g = dz / r;
					pinfo.b = dz / r;
					pattern.push_back(pinfo);
				}
		double zmin = 0, zmax = 0;
		for (int i = 0; i<n; i++) {
			if (i == 0) {
				zmin = xyzs[i * 3 + 2] - r;
				zmax = xyzs[i * 3 + 2] + r;
			}
			else {
				zmin = min(zmin, double(xyzs[i * 3 + 2] - r));
				zmax = max(zmax, double(xyzs[i * 3 + 2] + r));
			}
		}
		for (int i = 0; i<n; i++) {
			int x = xyzs[i * 3 + 0], y = xyzs[i * 3 + 1], z = xyzs[i * 3 + 2];
			for (int j = 0; j<int(pattern.size()); j++) {
				int x2 = x + pattern[j].x;
				int y2 = y + pattern[j].y;
				int z2 = z + pattern[j].z;
				if (!(x2<0 || x2 >= h || y2<0 || y2 >= w) && depth[x2*w + y2]<z2) {
					depth[x2*w + y2] = z2;
					double intensity = min(1.0, (z2 - zmin) / (zmax - zmin)*0.7 + 0.3);
					show[(x2*w + y2) * 3 + 0] = pattern[j].b*c2[i] * intensity;
					show[(x2*w + y2) * 3 + 1] = pattern[j].g*c0[i] * intensity;
					show[(x2*w + y2) * 3 + 2] = pattern[j].r*c1[i] * intensity;
				}
			}
		}
	}

}//extern "C"
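To make it easier to see what the rebuilt DLL computes, the first loop of render_ball can be mirrored in Python. This is only an illustrative port for reading or sanity-checking, not a replacement for the DLL; ball_pattern is a hypothetical name.

```python
import math


def ball_pattern(r):
    """Pixel offsets and shading for one rendered ball of radius r.

    Mirrors the first loop of render_ball: returns (dx, dy, dz, shade) tuples,
    where dz is truncated to an int as in the C++ struct and shade = dz / r
    is computed from the untruncated value.
    """
    r = max(r, 1)
    pattern = []
    for dx in range(-r, r + 1):
        for dy in range(-r, r + 1):
            # Keep only offsets strictly inside the circle of radius r.
            if dx * dx + dy * dy < r * r:
                dz = math.sqrt(r * r - dx * dx - dy * dy)
                pattern.append((dx, dy, int(dz), dz / r))
    return pattern


if __name__ == "__main__":
    for entry in ball_pattern(2):
        print(entry)
```

Each point of the cloud is drawn by stamping this pattern around its projected position; the second half of the C++ function keeps, per pixel, the stamp with the largest depth value and scales the colour by both the shade and a depth-dependent intensity.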