1. Error: ValueError: signal only works in main thread

File "F:\ProgramData\Anaconda3\envs\yolo5\lib\site-packages\paddlex\cv\models\base.py", line 240, in net_initialize
    pretrain_weights = get_pretrain_weights(
  File "F:\ProgramData\Anaconda3\envs\yolo5\lib\site-packages\paddlex\cv\models\utils\pretrain_weights.py", line 208, in get_pretrain_weights
    import paddlehub as hub
  File "F:\ProgramData\Anaconda3\envs\yolo5\lib\site-packages\paddlehub\__init__.py", line 30, in <module>
    from . import dataset
  File "F:\ProgramData\Anaconda3\envs\yolo5\lib\site-packages\paddlehub\dataset\__init__.py", line 24, in <module>
    from .squad import SQUAD
  File "F:\ProgramData\Anaconda3\envs\yolo5\lib\site-packages\paddlehub\dataset\squad.py", line 20, in <module>
    from paddlehub.reader import tokenization
  File "F:\ProgramData\Anaconda3\envs\yolo5\lib\site-packages\paddlehub\reader\__init__.py", line 22, in <module>
    from .cv_reader import ImageClassificationReader
  File "F:\ProgramData\Anaconda3\envs\yolo5\lib\site-packages\paddlehub\reader\cv_reader.py", line 26, in <module>
    from ..contrib.ppdet.data.reader import Reader
  File "F:\ProgramData\Anaconda3\envs\yolo5\lib\site-packages\paddlehub\contrib\ppdet\data\reader.py", line 28, in <module>
    from .transform import build_mapper, map, batch, batch_map
  File "F:\ProgramData\Anaconda3\envs\yolo5\lib\site-packages\paddlehub\contrib\ppdet\data\transform\__init__.py", line 24, in <module>
    from .parallel_map import ParallelMappedDataset
  File "F:\ProgramData\Anaconda3\envs\yolo5\lib\site-packages\paddlehub\contrib\ppdet\data\transform\parallel_map.py", line 229, in <module>
    signal.signal(signal.SIGTERM, _reader_exit)
  File "F:\ProgramData\Anaconda3\envs\yolo5\lib\signal.py", line 47, in signal
    handler = _signal.signal(_enum_to_int(signalnum), _enum_to_int(handler))
ValueError: signal only works in main thread

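This error occurs because my code starts training from a child thread, and signal.signal() can only be called from the main thread. Importing paddlehub runs that call at module import time, so the import fails inside the training thread. I have reported this bug.

Workaround: comment out the line signal.signal(signal.SIGTERM, _reader_exit) in paddlehub\contrib\ppdet\data\transform\parallel_map.py (line 229 in the traceback above), and set num_workers=1 when creating the dataset so that data is not loaded by multiple workers: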

import os
import paddlex as pdx

# option and train_transforms are defined elsewhere in the training script.
train_dataset = pdx.datasets.CocoDetection(
    data_dir=option.trainDataDir,
    num_workers=1,  # set to 1 here
    ann_file=os.path.join(option.trainDataDir, 'train_annotations.json'),  # 'xiaoduxiong_ins_det/train.json',
    transforms=train_transforms,
    shuffle=True)

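2. Missing cublas64_100.dll
cublas64_100.dll is the cuBLAS library from CUDA 10.0. I downloaded a copy from the site below and placed it in C:\Windows\System32, which resolved the error.
https://www.dll-files.com/download/0e506d21dd9e1be9d60d2ad215af943f/cublas64_100.dll.html?c=eGFndlZTdzk3VGVpNk13Z09DanBLZz09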

3. KeyError during evaluation: a predicted class ID has no matching category, so the lookup fails


Traceback (most recent call last):
  File "f:\project\AI\ai.cycleblock.cn\common\TrainThread_Paddlex.py", line 135, in main
    self.starttrain(opt)
  File "f:\project\AI\ai.cycleblock.cn\common\TrainThread_Paddlex.py", line 257, in starttrain
    model.train(
  File "F:\ProgramData\Anaconda3\envs\yolo5\lib\site-packages\paddlex\cv\models\mask_rcnn.py", line 220, in train
    self.train_loop(
  File "F:\ProgramData\Anaconda3\envs\yolo5\lib\site-packages\paddlex\cv\models\base.py", line 557, in train_loop
    self.eval_metrics, self.eval_details = self.evaluate(
  File "F:\ProgramData\Anaconda3\envs\yolo5\lib\site-packages\paddlex\cv\models\mask_rcnn.py", line 316, in evaluate
    ap_stats, eval_details = eval_results(
  File "F:\ProgramData\Anaconda3\envs\yolo5\lib\site-packages\paddlex\cv\models\utils\detection_eval.py", line 56, in eval_results       
    box_ap_stats, xywh_results = coco_bbox_eval(
  File "F:\ProgramData\Anaconda3\envs\yolo5\lib\site-packages\paddlex\cv\models\utils\detection_eval.py", line 119, in coco_bbox_eval    
    xywh_results = bbox2out(
  File "F:\ProgramData\Anaconda3\envs\yolo5\lib\site-packages\paddlex\cv\models\utils\detection_eval.py", line 312, in bbox2out
    catid = (clsid2catid[int(clsid)])
KeyError: 12

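After ruling out other causes, I found that the categories list in my validation annotation file, val_annotations.json, had fewer entries than the one in train_annotations.json.

Fix: change the code that generates these two files so that both contain exactly the same categories. That resolved the error.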
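
A quick way to catch this mismatch before training is to compare the categories sections of the two annotation files. The sketch below assumes standard COCO-style JSON, where each entry in "categories" has "id" and "name" keys. load_categories and diff_categories are hypothetical helper names and not part of PaddleX.

```python
import json


def load_categories(ann_file):
    """Return {category_id: category_name} from a COCO-style annotation file."""
    with open(ann_file, encoding="utf-8") as f:
        data = json.load(f)
    return {cat["id"]: cat["name"] for cat in data["categories"]}


def diff_categories(train_cats, val_cats):
    """Compare two {id: name} mappings.

    Returns three sorted lists of category ids: those only in the
    training set, those only in the validation set, and those present
    in both but with different names.
    """
    only_train = sorted(set(train_cats) - set(val_cats))
    only_val = sorted(set(val_cats) - set(train_cats))
    renamed = sorted(
        cat_id
        for cat_id in set(train_cats) & set(val_cats)
        if train_cats[cat_id] != val_cats[cat_id]
    )
    return only_train, only_val, renamed
```

Loading train_annotations.json and val_annotations.json with load_categories and passing both results to diff_categories should give three empty lists when the files agree. This check covers ids and names only. It does not compare the order of the entries.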