Basic Operations on Images

Goal

Learn to:

  • Access pixel values and modify them
  • Access image properties
  • Set a Region of Interest (ROI)
  • Split and merge images

Almost all the operations in this section relate to Numpy rather than OpenCV. A good knowledge of Numpy is required to write well-optimized code with OpenCV.

(Examples are shown in a Python terminal, since most of them are single lines of code.)

Accessing and Modifying pixel values

Let’s load a color image first:

>>> import numpy as np
>>> import cv2 as cv
>>> img = cv.imread("msddi5.jpg")

You can access a pixel value by its row and column coordinates. For a BGR image, it returns an array of Blue, Green, Red values. For a grayscale image, just the corresponding intensity is returned.

>>> px = img[100,100]
>>> print(px)
[157 166 200]
# accessing only the blue value of the pixel
>>> blue = img[100,100,0]
>>> print(blue)
157
# you can modify the pixel value the same way
>>> img[100,100] = [255,255,255]
>>> print(img[100,100])
[255 255 255]
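
One detail that is easy to get wrong is the index order: it is [row, column], that is (y, x), and the channel order is B, G, R. The sketch below illustrates this on a small synthetic array, which is an assumption standing in for a loaded image so that it runs without any file:

```python
import numpy as np

# Synthetic stand-in for a loaded BGR image: 4 rows, 6 columns, 3 channels.
img = np.zeros((4, 6, 3), dtype=np.uint8)

# Index order is [row, column]: this addresses row 1, column 4.
img[1, 4] = [255, 0, 0]      # B=255, G=0, R=0, i.e. pure blue in BGR order

px = img[1, 4]               # array holding the three channel values
blue = img[1, 4, 0]          # only the blue channel value

print(px)
print(int(blue))
```

Swapping the two indices (img[4, 1]) would raise an IndexError here, because the array has only 4 rows.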

Warning

Numpy is an optimized library for fast array calculations, so accessing and modifying pixel values one at a time is very slow and is discouraged.

  • Note
    The above method is normally used for selecting a region of an array, say the first 5 rows and last 3 columns. For individual pixel access, the Numpy array methods array.item() and array.itemset() are considered better. They always return a scalar, however, so if you want to access all the B, G, R values, you need to call array.item() separately for each one. Be aware that array.itemset() was removed in NumPy 2.0; with such versions, assign through plain indexing instead, for example img[10,10,2] = 100.

Better pixel accessing and editing method:

# accessing RED value
>>> img.item(10,10,2)
59
# modifying RED value (array.itemset() requires a NumPy version older than 2.0)
>>> img.itemset((10,10,2),100)
>>> img.item(10,10,2)
100
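
To make the performance warning concrete, here is a minimal sketch that sets the red channel of every pixel in two ways: a Python loop and a single vectorized assignment. The synthetic image and the names slow and fast are assumptions made for illustration. The last lines show single-pixel access through plain indexing, which does not depend on array.itemset() being available.

```python
import numpy as np

# Synthetic stand-in for a loaded BGR image (assumption: any uint8 image works).
img = np.zeros((100, 120, 3), dtype=np.uint8)

# Slow: visit every pixel in a Python loop and set its red channel.
slow = img.copy()
for row in range(slow.shape[0]):
    for col in range(slow.shape[1]):
        slow[row, col, 2] = 100

# Fast: one vectorized assignment over the whole red channel.
fast = img.copy()
fast[:, :, 2] = 100

same_result = np.array_equal(slow, fast)

# Single-pixel read and write with plain indexing.
red = int(fast[10, 10, 2])
fast[10, 10, 2] = 59
```

Both approaches produce the same array; the vectorized form leaves the looping to Numpy's compiled code, which is why it is preferred.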

Accessing Image Properties

Image properties include number of rows, columns, and channels; type of image data; number of pixels; etc.

The shape of an image is accessed by img.shape. It returns a tuple of the number of rows, columns, and channels (if the image is color):

>>> print(img.shape)
(342, 548, 3)
  • Note
    If an image is grayscale, the tuple returned contains only the number of rows and columns, so it is a good method to check whether the loaded image is grayscale or color.

The total number of array elements (rows × columns × channels) is accessed by img.size:

>>> print(img.size)
562248
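
The relationship between shape and size can be checked with a short sketch. It uses synthetic arrays with the same dimensions as the example image (an assumption, so that no file is needed), and is_grayscale is a hypothetical helper name, not an OpenCV or Numpy function:

```python
import numpy as np

# Synthetic stand-ins with the same dimensions as the example image.
color = np.zeros((342, 548, 3), dtype=np.uint8)
gray = np.zeros((342, 548), dtype=np.uint8)

def is_grayscale(image):
    """Hypothetical helper: True when the array has no channel axis."""
    return image.ndim == 2

rows, cols = color.shape[:2]          # works for both color and grayscale
channels = 1 if is_grayscale(color) else color.shape[2]

print(is_grayscale(color), is_grayscale(gray))
print(color.size == rows * cols * channels)
```

Taking shape[:2] rather than unpacking the whole tuple avoids an error when the same code receives a grayscale image.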

Image datatype is obtained by img.dtype:

>>> print(img.dtype)
uint8
  • Note
    img.dtype is very important while debugging because a large number of errors in OpenCV-Python code are caused by invalid datatype.
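
As one illustration of a datatype-related surprise (not a catalogue of such errors), arithmetic on uint8 arrays wraps around instead of stopping at 255. The sketch below uses a synthetic array, and widening to int16 before clipping is just one possible workaround:

```python
import numpy as np

# Synthetic uint8 image whose values are close to the 255 upper limit.
img = np.full((2, 2, 3), 250, dtype=np.uint8)
offset = np.full_like(img, 10)

# uint8 arithmetic wraps around modulo 256: 250 + 10 gives 4.
wrapped = img + offset

# One way to avoid this: compute in a wider type, clip, then convert back.
clipped = np.clip(img.astype(np.int16) + 10, 0, 255).astype(np.uint8)

print(wrapped.dtype, wrapped[0, 0])
print(clipped.dtype, clipped[0, 0])
```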

Image ROI

Sometimes, you will have to play with certain regions of images. For eye detection in images, first face detection is done over the entire image. When a face is obtained, we select the face region alone and search for eyes inside it instead of searching the whole image. It improves accuracy (because eyes are always on faces 😄 ) and performance (because we search in a small area).

ROI is again obtained using Numpy indexing. Here I am selecting the ball and copying it to another region in the image:

>>> ball = img[280:340, 330:390]
>>> img[273:333, 100:160] = ball
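
A point worth knowing about ROIs obtained by slicing is that they are views into the original array, not copies. The following sketch shows the difference on a synthetic image (an assumption; the bright square merely plays the role of the ball):

```python
import numpy as np

# Synthetic stand-in for the example image (same shape, all zeros).
img = np.zeros((342, 548, 3), dtype=np.uint8)
img[280:340, 330:390] = 200           # pretend this bright square is the ball

ball = img[280:340, 330:390]          # a view: shares memory with img
ball_copy = ball.copy()               # an independent copy

img[273:333, 100:160] = ball          # paste the ROI elsewhere (values are copied)

# Writing through the view changes the original image; the copy is unaffected.
ball[:] = 0
```

After the last line, the original ball region in img is zeroed, while the pasted region and ball_copy still hold the value 200. Call .copy() whenever the ROI must survive later changes to the image.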

Check the results below:

[Figure: the image with the ball copied to a second location]

Splitting and Merging Image Channels

Sometimes you will need to work separately on the B,G,R channels of an image. In this case, you need to split the BGR image into single channels. In other cases, you may need to join these individual channels to create a BGR image. You can do this simply by:

>>> b, g, r = cv.split(img)
>>> img = cv.merge((b, g, r))

Or

>>> b = img[:, :, 0]

Suppose you want to set all the red pixels to zero. You do not need to split the channels first; Numpy indexing is faster:

>>> img[:, :, 2] = 0

Warning

cv.split() is a costly operation (in terms of time). So use it only if necessary. Otherwise go for Numpy indexing.
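
For comparison, here is a Numpy-only sketch of the same idea. It is a counterpart to cv.split() and cv.merge(), not a reimplementation of them; the synthetic image and the choice of np.dstack for rebuilding the image are assumptions made for illustration:

```python
import numpy as np

# Synthetic BGR image with a distinct constant value in each channel.
img = np.zeros((4, 5, 3), dtype=np.uint8)
img[:, :, 0] = 10    # blue
img[:, :, 1] = 20    # green
img[:, :, 2] = 30    # red

# Channel access by indexing returns views; no pixel data is copied.
b, g, r = img[:, :, 0], img[:, :, 1], img[:, :, 2]

# Stacking the planes along a third axis rebuilds an image of the same shape.
merged = np.dstack((b, g, r))

# Zero the red channel in place, without splitting first.
img[:, :, 2] = 0
```

Because b, g and r are views, zeroing the red channel of img also changes r, whereas merged is a new array and keeps the original values.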

Making Borders for Images (Padding)

If you want to create a border around an image, something like a photo frame, you can use cv.copyMakeBorder(). It also has further applications, such as padding for convolution operations and zero padding. This function takes the following arguments:

  • src - input image
  • top, bottom, left, right - border width in number of pixels in the corresponding directions
  • borderType - Flag defining what kind of border is added. It can be one of the following types:
    • cv.BORDER_CONSTANT - Adds a constant colored border. The value should be given as the next argument.
    • cv.BORDER_REFLECT - Border will be a mirror reflection of the border elements, like this: fedcba|abcdefgh|hgfedcb
    • cv.BORDER_REFLECT_101 or cv.BORDER_DEFAULT - Same as above, but the edge element itself is not repeated, like this: gfedcb|abcdefgh|gfedcba
    • cv.BORDER_REPLICATE - Last element is replicated throughout, like this: aaaaaa|abcdefgh|hhhhhhh
    • cv.BORDER_WRAP - The image wraps around, so the border is taken from the opposite edge, like this: cdefgh|abcdefgh|abcdefg
  • value - Color of the border if the border type is cv.BORDER_CONSTANT
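
The letter patterns in the list can be reproduced in one dimension with Numpy's np.pad, whose padding modes behave analogously. This is only a sketch: the mapping between border types and np.pad modes is an assumption made for illustration, show is a hypothetical helper, and the code does not call cv.copyMakeBorder() itself.

```python
import numpy as np

letters = "abcdefgh"
idx = np.arange(len(letters))          # positions 0..7 stand for a..h

# Assumed correspondence between border types and np.pad modes.
modes = {
    "BORDER_REPLICATE": "edge",
    "BORDER_REFLECT": "symmetric",
    "BORDER_REFLECT_101": "reflect",
    "BORDER_WRAP": "wrap",
}

def show(mode, left=6, right=7):
    """Hypothetical helper: pad the index row and render it as letters."""
    padded = np.pad(idx, (left, right), mode=mode)
    text = "".join(letters[i] for i in padded)
    end = left + len(letters)
    return f"{text[:left]}|{text[left:end]}|{text[end:]}"

patterns = {name: show(mode) for name, mode in modes.items()}
for name, pattern in patterns.items():
    print(f"{name:20s} {pattern}")
```

Working on indices rather than on the characters keeps the padding purely numeric; the letters are only looked up when the row is printed.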

Below is a sample code demonstrating all these border types for better understanding:

import cv2 as cv
from matplotlib import pyplot as plt

BLUE = [255, 0, 0]  # blue in OpenCV's BGR channel order

img1 = cv.imread("opencv-logo.png")
# cv.imread() returns None when the file cannot be read
assert img1 is not None, "opencv-logo.png could not be read"

# add a 10-pixel border on every side, once per border type
replicate = cv.copyMakeBorder(img1, 10, 10, 10, 10, cv.BORDER_REPLICATE)
reflect = cv.copyMakeBorder(img1, 10, 10, 10, 10, cv.BORDER_REFLECT)
reflect101 = cv.copyMakeBorder(img1, 10, 10, 10, 10, cv.BORDER_REFLECT_101)
wrap = cv.copyMakeBorder(img1, 10, 10, 10, 10, cv.BORDER_WRAP)
constant = cv.copyMakeBorder(img1, 10, 10, 10, 10, cv.BORDER_CONSTANT, value=BLUE)

# show the original and the five results in a 2 x 3 grid
plt.subplot(231), plt.imshow(img1, "gray"), plt.title("ORIGINAL")
plt.subplot(232), plt.imshow(replicate, "gray"), plt.title("REPLICATE")
plt.subplot(233), plt.imshow(reflect, "gray"), plt.title("REFLECT")
plt.subplot(234), plt.imshow(reflect101, "gray"), plt.title("REFLECT_101")
plt.subplot(235), plt.imshow(wrap, "gray"), plt.title("WRAP")
plt.subplot(236), plt.imshow(constant, "gray"), plt.title("CONSTANT")
plt.show()

See the result below. (The image is displayed with matplotlib, so the red and blue channels are interchanged.)

[Figure: the original image alongside each border type]

Additional Resources

Exercises