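For this pattern recognition course assignment we had to implement road sign detection. The training set contains only 5 samples and the test set 50. I had heard that HOG features plus feature matching work well for this kind of task, so that is the approach I took.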

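python-opencv provides a class, cv2.HOGDescriptor, that extracts the HOG features of an image directly. There is no particular requirement on the input image: in my tests, 3-channel and single-channel input gave the same result.
There is very little material online about this class, and I had to dig through a lot of pages to piece things together. First, here is how to create a hog object directly through the constructor. The parameters are worth a quick look (only the first few are commonly set; the rest can stay at their defaults, and the OpenCV tutorial material uses the defaults for everything).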

hog = cv2.HOGDescriptor(winSize,blockSize,blockStride,cellSize,nbins,derivAperture,winSigma,
                        histogramNormType,L2HysThreshold,gammaCorrection,nlevels)

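The commonly used ones are the five parameters winSize, blockSize, blockStride, cellSize and nbins: the window size, block size, block stride and cell size (all in pixels), and the number of orientation bins.
For the underlying concepts I suggest reading a HOG tutorial. In practice, setting these few parameters yourself is about enough. (The defaults also work, but the results may be poor, since this descriptor is quite sensitive to its parameters; try a few different settings.)
Here is how I used it: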

import numpy as np
import cv2

img = cv2.imread(test)  #test is the path of the image to read

#set the parameters here
winSize = (128,128)
blockSize = (64,64)
blockStride = (8,8)
cellSize = (16,16)
nbins = 9

#create the hog object with the parameters defined above; the remaining ones keep their defaults
hog = cv2.HOGDescriptor(winSize,blockSize,blockStride,cellSize,nbins)
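To get a feel for how these parameters set the descriptor size, here is a small sketch that computes the length of the descriptor for a single window. The helper name hog_descriptor_length is made up for this illustration and is not part of OpenCV. It assumes the usual HOG layout: blocks slide over the window with blockStride, and each block contributes one nbins histogram per cell.

```python
def hog_descriptor_length(win_size, block_size, block_stride, cell_size, nbins):
    """Number of values in the HOG descriptor of one window (hypothetical helper)."""
    length = nbins
    for win, block, stride, cell in zip(win_size, block_size, block_stride, cell_size):
        # A block must split into whole cells, and the block positions
        # must fit the window exactly along this axis.
        if block % cell != 0 or (win - block) % stride != 0:
            raise ValueError("incompatible HOG parameters")
        blocks_along_axis = (win - block) // stride + 1
        cells_per_block_along_axis = block // cell
        length *= blocks_along_axis * cells_per_block_along_axis
    return length


# Parameters used in this post: 9*9 blocks, 4*4 cells per block, 9 bins.
print(hog_descriptor_length((128, 128), (64, 64), (8, 8), (16, 16), 9))  # 11664
# The widely used 64*128 pedestrian-detection setting.
print(hog_descriptor_length((64, 128), (16, 16), (8, 8), (8, 8), 9))     # 3780
```

With the settings in this post, one 128*128 window therefore yields 11664 values; sliding the window over a larger image multiplies that by the number of window positions.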

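Once the hog object is defined, it can be used to compute the HOG features of an image. The class is wrapped so well that using it takes almost no thought: the member function compute returns the HOG descriptor of an image directly. The return value is a single concatenated n*1 feature vector (as far as I can tell, the individual feature vectors are simply joined end to end; the exact n depends on your parameters and on the window and block strides). It is a numpy.ndarray, so it is easy to process with numpy.
compute has three commonly used arguments. The first is required: the image (a numpy.ndarray as read by OpenCV; in my tests both 3-channel BGR and single-channel grayscale input work, with the same result). The second is winStride, the stride of the sliding window, which affects the final n. The third is padding, which pads the outside of the image to handle the borders.
Then compute is called to get the HOG descriptor: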

winStride = (8,8)
padding = (8,8)
test_hog = hog.compute(img, winStride, padding).reshape((-1,))

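This gives me the HOG descriptor as an n*1 matrix (numpy.ndarray), which the reshape flattens into a 1-D vector. From here you can do whatever you like with it. That is all it takes to extract a HOG descriptor with python-opencv.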

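For the road sign problem above, I extract the HOG descriptor of every image and take pairwise inner products; the larger the inner product, the more similar the two images.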
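The matching step can be sketched as follows. The function best_match and its normalize flag are hypothetical names introduced for this illustration. With normalize=False it uses the raw inner product as described above; normalize=True switches to cosine similarity, a variation not used in this post that keeps a descriptor with a large norm from winning on length alone.

```python
import numpy as np


def best_match(test_vec, train_vecs, normalize=False):
    """Return (index of the best training vector, list of scores)."""
    test_vec = np.asarray(test_vec, dtype=np.float64).ravel()
    scores = []
    for vec in train_vecs:
        vec = np.asarray(vec, dtype=np.float64).ravel()
        score = float(np.dot(test_vec, vec))
        if normalize:
            # Cosine similarity: divide by both lengths (tiny constant avoids 0/0).
            score /= np.linalg.norm(test_vec) * np.linalg.norm(vec) + 1e-12
        scores.append(score)
    return int(np.argmax(scores)), scores


# Toy descriptors standing in for real HOG vectors.
templates = [[10.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
query = [0.1, 0.9, 0.2]
print(best_match(query, templates)[0])                  # raw inner product
print(best_match(query, templates, normalize=True)[0])  # cosine similarity
```

In the toy example the first template wins under the raw inner product only because it is ten times longer, while cosine similarity picks the second one.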

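Below is my own Python implementation of HOG feature extraction. The overall idea and the gradient computation are the parts that need some thought; the rest is simple.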

Extraction pipeline

[Figure: HOG extraction pipeline]

Gradient computation

[Figure: gradient computation]
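As a rough sketch of the gradient step, the central differences and the unsigned orientation can be computed for a whole grayscale image at once with numpy. The function name gradient_magnitude_orientation is made up for this illustration. It assumes x runs along columns and y along rows, and it leaves border pixels at zero gradient; both are choices of this sketch rather than something taken from the figure.

```python
import numpy as np


def gradient_magnitude_orientation(img):
    """Central-difference gradient of a 2-D image.

    Returns (magnitude, orientation), orientation being unsigned, in [0, pi).
    """
    img = np.asarray(img, dtype=np.float64)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    # [-1, 0, 1] differences; the outermost rows/columns stay zero.
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    gy[1:-1, :] = img[2:, :] - img[:-2, :]
    mag = np.hypot(gx, gy)
    # arctan2 gives (-pi, pi]; folding modulo pi makes the angle unsigned.
    ang = np.arctan2(gy, gx) % np.pi
    # Rounding can fold a tiny negative angle to exactly pi; map it back to 0.
    ang[ang >= np.pi] = 0.0
    return mag, ang


# A horizontal ramp has a constant gradient pointing along x.
ramp = np.tile(np.arange(5.0), (5, 1))
mag, ang = gradient_magnitude_orientation(ramp)
print(mag[2, 2], ang[2, 2])
```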

Normalization

[Figure: block normalization]
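A minimal sketch of the normalization, assuming plain L2 normalization with a small constant added under the square root so that an all-zero block does not cause a division by zero. The helper name l2_normalize is hypothetical; eps = 1e-6 is the constant used in this post's own code.

```python
import numpy as np


def l2_normalize(v, eps=1e-6):
    """Scale v to (almost) unit L2 length; eps guards against all-zero input."""
    v = np.asarray(v, dtype=np.float64)
    return v / np.sqrt(np.sum(v ** 2) + eps)


print(l2_normalize([3.0, 4.0]))   # roughly [0.6, 0.8]
print(l2_normalize([0.0, 0.0]))   # stays all zeros, no NaN
```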

The code below extracts single-channel and three-channel HOG features and performs the road sign recognition:

import numpy as np
import cv2
import os
import math

#small constant that keeps the block normalization away from division by zero
eps = 0.000001

#HOG from a grayscale image
def getHOG_1dims(pic_name):
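    img = cv2.imread(pic_name,cv2.IMREAD_GRAYSCALE)
    #scale pixel values to [0,1]
    img = img/255
    #cv2.resize takes (width,height), so the result has 194 rows and 207 columns
    img = cv2.resize(img,(207,194))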
    #g_img[i,j,0] holds the gradient magnitude, g_img[i,j,1] the unsigned orientation in [0,pi)
    g_img = np.zeros((img.shape[0],img.shape[1],2))
    for i in range(1,img.shape[0]-1):
        for j in range(1,img.shape[1]-1):
            gx = img[i+1,j] - img[i-1,j]
            gy = img[i,j+1] - img[i,j-1]
            g = (gx**2 + gy**2)**0.5
            if gx == 0 and gy == 0:
                dg = 0
            elif gx == 0 and gy != 0:
                dg = math.pi/2
            else:
                dg = math.atan(gy/gx) 
                if dg < 0:
                    dg = dg + math.pi
            if dg == math.pi:
                dg = 0
            g_img[i,j,0] = g
            g_img[i,j,1] = dg 
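    #number of cells along the height
    h = img.shape[0]//16
    #number of cells along the width
    w = img.shape[1]//16
    #cell height in pixels
    h_size = 16
    #cell width in pixels
    w_size = 16
    cell = np.zeros((h,w,9))
    for m in range(h):
        for n in range(w):
            #start a fresh 9-bin histogram for every cell (otherwise the counts accumulate across cells)
            cell_n = np.zeros((9))
            for i in range(h_size*m,h_size*(m+1)):
                for j in range(w_size*n,w_size*(n+1)):
                    #clamp to the last bin in case rounding yields index 9
                    bin_idx = min(int(g_img[i,j,1]//(math.pi/9)),8)
                    cell_n[bin_idx] += g_img[i,j,0]
            cell[m,n] = cell_n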
    #each block sums the histograms of 2*2 non-overlapping cells
    block = np.zeros((h//2,w//2,9))
    for p in range(h//2):
        for q in range(w//2):
            for i in range(2*p,2*p+2):
                for j in range(2*q,2*q+2):
                    block[p,q] += cell[i,j]
    #L2-normalize every block histogram; eps avoids division by zero
    block_norm = np.zeros((h//2,w//2,9))
    for i in range(h//2):
        for j in range(w//2):
            length = (np.linalg.norm(block[i,j])**2 + eps)**0.5
            block_norm[i,j] = block[i,j]/length
    block_norm = block_norm.reshape(block_norm.shape[0]*block_norm.shape[1],9)
    return block_norm  


#HOG from a 3-channel color image (OpenCV loads it in BGR order)
def getHOG_3dims(pic_name):
    img = cv2.imread(pic_name)
    img = cv2.resize(img,(192,192))
    #img = cv2.imread('xxx.jpg')
    #cv2.imshow('result',img)
    #key = cv2.waitKey()
    #print(img.shape[0])
    #if key & 0xff == ord('q'):
    #    cv2.destroyAllWindows()
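    #gamma normalization with gamma = 1, i.e. plain scaling to [0,1]
    img = img/255
    #compute the image gradients with a first-order difference operator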
    g_img = np.zeros((img.shape[0],img.shape[1],3,2))
    for i in range(1,img.shape[0]-1):
        for j in range(1,img.shape[1]-1):
            gx_b = img[i+1,j,0] - img[i-1,j,0]
            gy_b = img[i,j+1,0] - img[i,j-1,0]
            gx_g = img[i+1,j,1] - img[i-1,j,1]
            gy_g = img[i,j+1,1] - img[i,j-1,1]
            gx_r = img[i+1,j,2] - img[i-1,j,2]
            gy_r = img[i,j+1,2] - img[i,j-1,2]
            gb = (gx_b**2 + gy_b**2)**0.5
            gg = (gx_g**2 + gy_g**2)**0.5
            gr = (gx_r**2 + gy_r**2)**0.5
            if gx_b == 0 and gy_b == 0:
                dgb = 0
            elif gx_b == 0 and gy_b != 0:
                dgb = math.pi/2
            else:
                dgb = math.atan(gy_b/gx_b) 
                if dgb < 0:
                    dgb = dgb + math.pi
            if gx_g == 0 and gy_g == 0:
                dgg = 0
            elif gx_g == 0 and gy_g != 0:
                dgg = math.pi/2
            else:
                dgg = math.atan(gy_g/gx_g) 
                if dgg < 0:
                    dgg = dgg + math.pi
            if gx_r == 0 and gy_r == 0:
                dgr = 0
            elif gx_r == 0 and gy_r != 0:
                dgr = math.pi/2
            else:
                dgr = math.atan(gy_r/gx_r)
                if dgr < 0:
                    dgr = dgr + math.pi
            g_img[i,j,0,0] = gb
            g_img[i,j,1,0] = gg
            g_img[i,j,2,0] = gr
            g_img[i,j,0,1] = dgb
            g_img[i,j,1,1] = dgg
            g_img[i,j,2,1] = dgr        
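    #compute the gradient histogram of every cell; each cell is 8*8 pixels and each block is 16*16 pixels, i.e. 2*2 cells
    #number of cells along the height
    h = 24
    #number of cells along the width
    w = 24
    #cell height in pixels
    h_size = img.shape[0]//24
    #cell width in pixels
    w_size = img.shape[1]//24

    cell = np.zeros((h,w,27))
    for m in range(h):
        for n in range(w):
            #start a fresh 3*9 histogram (one row per channel) for every cell
            cell_n = np.zeros((3,9))
            for i in range(h_size*m,h_size*(m+1)):
                for j in range(w_size*n,w_size*(n+1)):
                    for k in range(3):
                        #clamp to the last bin in case rounding yields index 9
                        bin_idx = min(int(g_img[i,j,k,1]//(math.pi/9)),8)
                        cell_n[k,bin_idx] += g_img[i,j,k,0]
            cell[m,n] = cell_n.reshape(27)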
    block = np.zeros((h//2,w//2,27))
    for p in range(h//2):
        for q in range(w//2):
            for i in range(2*p,2*p+2):
                for j in range(2*q,2*q+2):
                    block[p,q] += cell[i,j]
    #L2-normalize every block histogram; eps avoids division by zero
    block_norm = np.zeros((h//2,w//2,27))
    for i in range(h//2):
        for j in range(w//2):
            length = (np.linalg.norm(block[i,j])**2 + eps)**0.5
            block_norm[i,j] = block[i,j]/length
    block_norm = block_norm.reshape(block_norm.shape[0]*block_norm.shape[1],27)
    return block_norm


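def judge(test):
    #train is the list of training descriptors built in the main block
    global train
    test_hog = getHOG_1dims(test)

    temp = 0
    result = 0
    for i in range(5):
        #entry [j,j] of the product is the inner product of block j in the two images
        matrix = np.dot(test_hog,train[i].T)
        num_sum = 0
        #36 = 6*6 blocks per image
        for j in range(36):
            num_sum += matrix[j,j]
        #keep the training sample with the largest total score
        if num_sum > temp:
            temp = num_sum
            result = i+1
    return result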


if __name__ == '__main__':
    train_1 = getHOG_1dims('xxx/1.jpg')
    train_2 = getHOG_1dims('/xxx/2.jpg')
    train_3 = getHOG_1dims('/xxx/3.jpg')
    train_4 = getHOG_1dims('/xxx/4.jpg')
    train_5 = getHOG_1dims('/xxx/5.jpg')
    train = [train_1,train_2,train_3,train_4,train_5]
    path = '/xxx/test/'
    path_list = os.listdir(path)
    #file names are a number plus a 4-character extension; sort them numerically
    path_list.sort(key=lambda x: int(x[:-4]))
    count = 0
    for filename in path_list:
        result_1 = judge(path + filename)
        print(result_1)
        #test images 1-10 belong to class 1, 11-20 to class 2, and so on
        if (int(filename[:-4])-1)//10 + 1 == result_1:
            count += 1
    print("accuracy is :" + str(count/50))
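The innermost pixel loops that fill a cell histogram in the listing above are slow in pure Python. As a possible alternative, the histogram of one cell can be built with np.bincount. This is only a sketch: cell_histogram is a hypothetical helper, and it assumes mag and ang are the magnitude and unsigned orientation (in [0, pi)) of the pixels in a single cell.

```python
import numpy as np


def cell_histogram(mag, ang, nbins=9):
    """Magnitude-weighted orientation histogram of one cell."""
    mag = np.asarray(mag, dtype=np.float64)
    ang = np.asarray(ang, dtype=np.float64)
    bin_width = np.pi / nbins
    # Bin index per pixel, clamped so rounding can never produce index nbins.
    idx = np.minimum((ang // bin_width).astype(int), nbins - 1)
    # Each pixel votes for its bin with its gradient magnitude.
    return np.bincount(idx.ravel(), weights=mag.ravel(), minlength=nbins)


# Four pixels: two near 0 rad, one at pi/2, one just below pi.
mag = np.ones((2, 2))
ang = np.array([[0.0, 0.1], [np.pi / 2, np.pi - 1e-12]])
print(cell_histogram(mag, ang))
```

Like the listing, this assigns each pixel to a single bin without interpolating between neighbouring bins.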