Can't get past CAPTCHAs in automated tests? Look here: recognizing login CAPTCHAs with the pytesseract library

Reposted

mb607022e25a607 2021-06-09 21:56:15

Tags: software testing. Category: software testing

1. Call a third-party image recognition API. I have not found a good free one; if you know of one, please share it.

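2. Use cookies to skip the login step and open the page directly. webdriver provides methods for working with cookies: get, add, and delete.

# Idea: after a successful login, read the cookies and save them to a file -> on later runs, add the saved cookies and open the page directly

Note: the % character cannot be written to the .ini config file, so it has to be replaced when the cookie is saved.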
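The % restriction noted above is what Python's `configparser` does when its default value interpolation is enabled. The article does not show the `rwParams` helper, so the following is only a sketch under the assumption that it wraps `configparser`: turning interpolation off and storing the cookies as JSON avoids the character swap altogether. `dump_cookies` and `load_cookies` are hypothetical names used for illustration.

```python
import configparser
import io
import json


def dump_cookies(cookies, option, section="LOGIN_COOKIE"):
    """Serialize a list of cookie dicts into .ini text (hypothetical helper)."""
    # interpolation=None lets values contain a bare % without raising ValueError
    parser = configparser.ConfigParser(interpolation=None)
    parser[section] = {option: json.dumps(cookies)}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


def load_cookies(ini_text, option, section="LOGIN_COOKIE"):
    """Read the cookie list back from .ini text (hypothetical helper)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return json.loads(parser.get(section, option))


# Sample cookie containing both % and $ to show that nothing gets rewritten
sample = [{"name": "token", "value": "a%2Fb%3D$1", "secure": True}]
ini_text = dump_cookies(sample, option="admin")
restored = load_cookies(ini_text, option="admin")
```

With this approach a cookie value that legitimately contains `$` survives the round trip, which a plain `%` to `$` replacement cannot guarantee.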


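import ast  # at the top of the module


def get_cookie(self, option):
    """
    Read the cookies after login and save them to the config file
    :param option: option (key) name under the LOGIN_COOKIE section
    :return: None
    """
    # Writing a % character to the .ini file raises an error, so replace it with $ first
    cookieValue = str(self.driver.get_cookies()).replace("%", "$")
    Log.info("Got browser cookies: {}...".format(cookieValue[:50]))
    # Save the cookie string to the config file
    rwParams.write_ini_file(section="LOGIN_COOKIE", option=option, value=cookieValue)


def set_cookie(self, option):
    """
    Add the saved cookies to the browser before visiting the page
    :param option: option (key) name of the cookies to load
    :return: None
    """
    cookieStr = rwParams.read_ini_file(section="LOGIN_COOKIE", option=option)
    # Turn $ back into % and parse the string back into its original list form.
    # ast.literal_eval only accepts Python literals, so it is safer than eval.
    # Caveat: a cookie that originally contained $ is also changed to %.
    cookieValue = ast.literal_eval(cookieStr.replace("$", "%"))
    # The original format is a list, so add the cookies one by one
    for cookie in cookieValue:
        self.driver.add_cookie(cookie)
    Log.info("Added cookies to the browser: {}...".format(cookieStr[:50]))


def del_cookie(self):
    """Delete all cookies"""
    self.driver.delete_all_cookies()
    Log.info("Deleted all cookies that were added")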
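The cookie methods above could be combined into a login-free start roughly as follows. Selenium only accepts `add_cookie` for the domain that is currently loaded, so the sketch opens the page first, adds the cookies, and then reloads. `restore_session` and `FakeDriver` are hypothetical names; `get`, `add_cookie`, and `refresh` are the standard WebDriver methods, and the URL is a placeholder.

```python
def restore_session(driver, url, cookies):
    """Open url, attach the saved cookies, then reload so the page sees them."""
    driver.get(url)  # cookies can only be added for the domain that is loaded
    for cookie in cookies:
        driver.add_cookie(cookie)
    driver.refresh()


class FakeDriver:
    """Stand-in that records calls; it only exists to exercise the helper."""

    def __init__(self):
        self.calls = []

    def get(self, url):
        self.calls.append(("get", url))

    def add_cookie(self, cookie):
        self.calls.append(("add_cookie", cookie["name"]))

    def refresh(self):
        self.calls.append(("refresh",))


driver = FakeDriver()
restore_session(
    driver,
    "https://example.test/admin",  # placeholder URL
    [{"name": "token", "value": "abc"}],
)
```

In a real run the `FakeDriver` instance would be replaced by the project's webdriver object and the cookie list by the value read back from the config file.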

3. The simplest approach: ask the developers to configure a universal CAPTCHA code in the project, then just enter that code.

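4. Recognize the CAPTCHA image with OCR. The denoising step can be run several times on the same image, and the image can also be cut into single characters that are recognized one at a time. For now this only works for CAPTCHAs that are not distorted and whose interference lines are not thick.

# Idea: convert the CAPTCHA image to black and white -> for each pixel, check the colour of its 8 neighbours -> if more than 4 of them differ from it, treat the pixel as background -> run the OCR tool on the processed image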
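The binarize-then-denoise idea can be tried without any image library. The sketch below works on a plain 2D list of grayscale values; `binarize` and `denoise` are hypothetical helpers. It makes three choices of its own: dark pixels become 0 (ink) and light pixels 255 (background), neighbours outside the image count as background, and the result is written to a copy rather than in place.

```python
def binarize(gray, threshold=150):
    """Map grayscale values (0-255) to 255 (background) or 0 (ink)."""
    return [[255 if value > threshold else 0 for value in row] for row in gray]


def denoise(grid, background=255, min_background_neighbors=5):
    """Clear ink pixels that have more than four background neighbours."""
    height, width = len(grid), len(grid[0])
    result = [row[:] for row in grid]  # read from grid, write to the copy
    for y in range(height):
        for x in range(width):
            if grid[y][x] == background:
                continue
            count = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < height and 0 <= nx < width
                    # Neighbours outside the image are treated as background
                    if not inside or grid[ny][nx] == background:
                        count += 1
            if count >= min_background_neighbors:
                result[y][x] = background
    return result


# Synthetic 7x5 image: a 3-pixel-wide vertical stroke plus one stray pixel
B, K = 200, 30  # background gray and ink gray
gray = [[B] * 7 for _ in range(5)]
for y in range(5):
    for x in (1, 2, 3):
        gray[y][x] = K
gray[2][5] = K  # isolated noise pixel

binary = binarize(gray)
cleaned = denoise(binary)
```

The stray pixel is removed because all eight of its neighbours are background, while pixels in the middle of the stroke stay. The same rule also erodes the corners of a stroke, which is one reason thick interference lines and thin characters are hard to separate this way.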

# -*- coding: utf-8 -*-
# @Author : Mr.Deng
# @Time : 2021/3/21 14:34
"""
Image recognition: read the CAPTCHA text on the login page
"""
import os
import re

from PIL import Image
from pytesseract import pytesseract

from util.basePages import BasePages
from config.filePathConfig import FilePathConfig
from config.varConfig import SysConfig as SC


class ImageRecognize:

    def __init__(self, driver):
        self.base = BasePages(driver)
        # Path of the saved CAPTCHA screenshot
        self.imageSavePath = os.path.join(FilePathConfig.CODE_IMG_SAVE_PATH, "shotCode.png")

    def save_code_image(self, elementPath, zoomNum=1.25):
        """
        Take a screenshot and save the cropped CAPTCHA image
        :param elementPath: locator of the CAPTCHA image element
        :param zoomNum: display scaling of the screen, e.g. 125% -> zoomNum = 1.25
        :return: the cropped CAPTCHA image
        """
        self.base.driver.origin_driver.get_screenshot_as_file(self.imageSavePath)
        imageData = self.base.driver.get_location(elementPath)
        # Element coordinates and size must be multiplied by the display scaling
        left = imageData["x"] * zoomNum
        top = imageData["y"] * zoomNum
        right = left + imageData["width"] * zoomNum
        bottom = top + imageData["height"] * zoomNum
        self.imageObj = Image.open(self.imageSavePath)
        codeImage = self.imageObj.crop((left, top, right, bottom))
        codeImage.save(self.imageSavePath)
        return codeImage

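    def binarization_image(self, image):
        """
        Binarize the CAPTCHA image into pure black and white
        :param image: image object to process
        :return: the binarized image
        """
        imageCode = image.convert("L")
        pixelData = imageCode.load()
        width, height = image.size  # PIL returns (width, height)
        threshold = 150  # gray level used as the cut-off
        for x in range(width):
            for y in range(height):
                # Pixels lighter than the threshold become black, the rest white
                if pixelData[x, y] > threshold:
                    pixelData[x, y] = 0
                else:
                    pixelData[x, y] = 255
        return imageCode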

    def delete_noisy_point(self, image):
        """
        Denoise: remove the stray pixels left by interference lines
        :param image: binarized image object
        :return: the denoised image
        """
        pixelData = image.load()
        width, height = image.size  # PIL returns (width, height)
        # Collect every pixel value to find out which colour is the background
        poxList = []
        for x in range(width):
            for y in range(height):
                poxList.append(pixelData[x, y])
        # The more frequent colour is the background, the rarer one is the CAPTCHA text
        background = max(set(poxList), key=poxList.count)
        # For every interior pixel, check its 8 neighbours (up, down, left, right and diagonals).
        # The ranges start at 1 and stop before the last index so a - 1 and b + 1 stay inside
        # the image; the one-pixel border is left unchanged.
        for a in range(1, width - 1):
            for b in range(1, height - 1):
                count = 0
                for dx in (-1, 0, 1):
                    for dy in (-1, 0, 1):
                        if (dx or dy) and pixelData[a + dx, b + dy] == background:
                            count += 1
                # More than four background neighbours: this pixel becomes background too
                if count > 4:
                    pixelData[a, b] = background
        # Save the processed CAPTCHA image
        image.save(self.imageSavePath.replace("shotCode", "ProcessedImage"))
        return image

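    def image_str(self, image):
        """Recognize the text in the processed CAPTCHA image"""
        img = self.binarization_image(image)
        afterSpotImg = self.delete_noisy_point(img)
        # Path to the tesseract executable, taken from the project config
        pytesseract.tesseract_cmd = SC.PYTESSERACT_OCR
        # Convert the image to text
        result = pytesseract.image_to_string(afterSpotImg)
        return result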

if __name__ == '__main__':
    from util.pySelenium import PySelenium

    p = PySelenium(openType="pc")
    # p.open_url(url="https://XXXX/admin/login?redirect=%2Fadmin%2Fdashboard")
    # p.sleep(2)
    # IM = ImageRecognize(p).save_code_image('xpath->//div[@class="imgs"]/img')
    code = ImageRecognize(p).image_str(image=Image.open(r"C:\Users\kk\Desktop\下载.jpg"))
    print(code)
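Two small pieces around the class above can be checked in isolation: the crop box that is scaled by the display zoom, and tidying the raw OCR string, which usually carries whitespace or a trailing newline. Both helpers below, `crop_box` and `clean_ocr_text`, are hypothetical and not part of the original project. The cleanup assumes the CAPTCHA contains only ASCII letters and digits.

```python
import re


def crop_box(rect, zoom=1.25):
    """Scale an element rect (x, y, width, height) to screenshot pixels."""
    left = round(rect["x"] * zoom)
    top = round(rect["y"] * zoom)
    right = left + round(rect["width"] * zoom)
    bottom = top + round(rect["height"] * zoom)
    return left, top, right, bottom


def clean_ocr_text(raw):
    """Drop everything except ASCII letters and digits from OCR output."""
    return re.sub(r"[^0-9A-Za-z]", "", raw)


# Element reported at (100, 40) with size 80x32 on a display scaled to 125%
box = crop_box({"x": 100, "y": 40, "width": 80, "height": 32}, zoom=1.25)
text = clean_ocr_text(" aB3 k\n\x0c")
```

The tuple returned by `crop_box` has the (left, top, right, bottom) order that `Image.crop` expects, and the cleaned string can be typed into the CAPTCHA field directly.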
