Retinex Image Enhancement: Where Image Augmentation Is Performed

Reposted

云端筑梦大师 2024-04-05 15:26:22

Tags: Retinex image enhancement, convolution, deep learning, tensorflow, machine learning. Categories: computer vision, artificial intelligence

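What is image augmentation?

Image augmentation addresses the problem of limited data. It is a technique for artificially enlarging a training dataset by creating multiple modified versions of the images it already contains. It covers a range of transformations that increase the size and variety of the training set, so that better deep learning models can be built from it.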
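As a minimal illustration of the idea, the sketch below builds several modified versions of one tiny grayscale "image" stored as plain Python lists. The helper names (`flip_horizontal`, `shift_right`, `augment`) are made up for this example and are not part of any library; a real pipeline would use a tool such as the Keras ImageDataGenerator used later in this article.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def shift_right(image, pixels, fill=0):
    """Shift every row to the right, padding the left edge with `fill`."""
    shifted = []
    for row in image:
        step = max(0, min(pixels, len(row)))  # clamp to the row width
        shifted.append([fill] * step + row[:len(row) - step])
    return shifted


def augment(image):
    """Return the original image plus two modified copies of it."""
    return [image, flip_horizontal(image), shift_right(image, 1)]


image = [
    [1, 2, 3],
    [4, 5, 6],
]
for version in augment(image):
    print(version)
```

One source image yields three training samples here; with random transformations the number of distinct variants is effectively unlimited.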


Setting up the TensorFlow environment

Prerequisites

Linux, macOS, or Windows
Python ≥ 3.7

Installing TensorFlow

CPU only

pip install "tensorflow>=1.15.2,<2.0"
or
conda install "tensorflow>=1.15.2,<2.0.0"

With GPU

pip install "tensorflow-gpu>=1.15.2,<2.0"
or
conda install "tensorflow-gpu>=1.15.2,<2.0.0"

Checking the installation

>>> import tensorflow as tf
>>> tf.__version__
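'2.3.0'

Note that the version printed here, 2.3.0, falls outside the <2.0 range pinned in the install commands above. The outputs in the rest of this walkthrough appear to come from this 2.x install.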

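Now we will use the Rock-Paper-Scissors dataset from Kaggle to perform multi-class image classification.

1. Dataset exploration

The dataset has three directories: train, test, and validation. The train and test directories each contain one subfolder per class (three classes in total), while the validation directory is a flat list of images used for the final prediction check.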

import os

base_dir = "/kaggle/input/rock-paper-scissors-dataset/Rock-Paper-Scissors/"


# Train set: one subfolder per class
train_dir = os.path.join(base_dir, "train")
print("Train set --> ", os.listdir(train_dir))


# Test set: one subfolder per class
test_dir = os.path.join(base_dir, "test")
print("Test set --> ", os.listdir(test_dir))


# Validation set: a flat folder of images, so show only the first three file names
validation_dir = os.path.join(base_dir, "validation")
print("Validation set --> ", os.listdir(validation_dir)[:3])

The output is:

Train set -->  ['paper', 'scissors', 'rock']
Test set -->  ['paper', 'scissors', 'rock']
Validation set -->  ['paper8.png', 'paper1.png', 'scissors-hires1.png']
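Beyond listing folder names, it is often useful to check how many images each class holds. The sketch below is one way to do that; `count_images_per_class` is a hypothetical helper written for this example, and the demo runs on a throwaway folder with made-up file names rather than on the Kaggle data.

```python
import os
import tempfile


def count_images_per_class(root_dir):
    """Map each class subfolder of root_dir to the number of files in it."""
    counts = {}
    for name in sorted(os.listdir(root_dir)):
        class_dir = os.path.join(root_dir, name)
        if os.path.isdir(class_dir):
            counts[name] = len(os.listdir(class_dir))
    return counts


# Demo on a temporary folder that mimics the train/ layout (file names are made up)
with tempfile.TemporaryDirectory() as root:
    for class_name, n_files in [("paper", 2), ("rock", 3), ("scissors", 1)]:
        class_dir = os.path.join(root, class_name)
        os.makedirs(class_dir)
        for i in range(n_files):
            open(os.path.join(class_dir, f"{class_name}{i}.png"), "w").close()
    print(count_images_per_class(root))
```

On the real dataset the same function could be called as count_images_per_class(train_dir) to see whether the classes are balanced.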

2. Dataset samples

Let's display one random image from each class in the dataset.

import os
import random

import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing.image import load_img

fig, ax = plt.subplots(1, 3, figsize=(15, 10))


# Show one randomly chosen training image per class
classes = [("paper", "Paper"), ("rock", "Rock"), ("scissors", "Scissors")]
for axis, (folder, title) in zip(ax, classes):
    class_dir = os.path.join(train_dir, folder)
    sample = random.choice(os.listdir(class_dir))
    image = load_img(os.path.join(class_dir, sample))
    axis.imshow(image)
    axis.set_title(title)
    axis.axis("off")


plt.show()

[Figure: one randomly chosen training image from each class (Paper, Rock, Scissors).]

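3. Defining the CNN model

The model is built from five kinds of layers:

Convolutional layer: extracts the important features from the image
Pooling layer: shrinks the feature maps produced by the convolution while keeping the strongest responses
Flatten layer: flattens its input into a one-dimensional array
Hidden layer: a fully connected layer that links the flattened features to the output layer
Output layer: produces the final class of the image

We have three classes of images, so the output layer needs three neurons.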

import tensorflow as tf

model = tf.keras.models.Sequential([
    
    tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    
    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    
    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    
    tf.keras.layers.Dense(3, activation='softmax')
])


model.summary()

The model summary is:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 148, 148, 32)      896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 74, 74, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 72, 72, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 36, 36, 64)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 34, 34, 128)       73856     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 17, 17, 128)       0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 15, 15, 128)       147584    
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 7, 7, 128)         0         
_________________________________________________________________
flatten (Flatten)            (None, 6272)              0         
_________________________________________________________________
dense (Dense)                (None, 512)               3211776   
_________________________________________________________________
dense_1 (Dense)              (None, 3)                 1539      
=================================================================
Total params: 3,454,147
Trainable params: 3,454,147
Non-trainable params: 0
_________________________________________________________________
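The output shapes and parameter counts in a summary like this can be checked by hand. The sketch below reproduces the arithmetic for the model defined above, assuming the Keras defaults the code relies on: 3x3 convolutions with stride 1 and no padding, and 2x2 max pooling that rounds down. The helper names are made up for this example.

```python
def conv_output_size(size, kernel=3):
    """Spatial size after an unpadded, stride-1 convolution."""
    return size - kernel + 1


def pool_output_size(size, pool=2):
    """Spatial size after non-overlapping max pooling (rounds down)."""
    return size // pool


def conv_params(in_channels, filters, kernel=3):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units


size, channels, total = 150, 3, 0
for filters in (32, 64, 128, 128):
    total += conv_params(channels, filters)
    size = conv_output_size(size)
    print("conv:", (size, size, filters), conv_params(channels, filters))
    size = pool_output_size(size)
    print("pool:", (size, size, filters))
    channels = filters

flat = size * size * channels
total += dense_params(flat, 512) + dense_params(512, 3)
print("flatten:", flat)
print("dense:", dense_params(flat, 512), "output:", dense_params(512, 3))
print("total:", total)
```

With a three-unit output layer on 512 inputs, the last layer contributes 512 * 3 + 3 = 1539 parameters.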

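4. Compiling the model and defining a callback

For this model we use the Adam optimizer with categorical cross-entropy as the loss function. The callback checks the training accuracy at the end of every epoch and stops training once it exceeds 95%.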

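model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])


class myCallback(tf.keras.callbacks.Callback):
    """Stop training once the training accuracy exceeds 95%."""

    def on_epoch_end(self, epoch, logs=None):
        # logs may be None and 'accuracy' may be missing, so guard both cases
        accuracy = (logs or {}).get('accuracy')
        if accuracy is not None and accuracy > 0.95:
            print("\nReached >95% accuracy so cancelling training!")
            self.model.stop_training = True


callbacks = myCallback()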
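The stopping rule itself is independent of Keras and can be sketched in plain Python. The function name `epoch_where_training_stops` is made up for this example, and the accuracy values are the per-epoch training accuracies from the training log shown later in this article.

```python
def epoch_where_training_stops(accuracies, threshold=0.95):
    """Return the 1-based epoch at which accuracy first exceeds the threshold.

    Returns None if the threshold is never exceeded, in which case
    training would run for all scheduled epochs.
    """
    for epoch, accuracy in enumerate(accuracies, start=1):
        if accuracy > threshold:
            return epoch
    return None


training_accuracy = [0.4591, 0.7968, 0.9056, 0.9393, 0.9512]
print(epoch_where_training_stops(training_accuracy))  # 5
```

Because the comparison is strict, an accuracy of exactly 0.95 would not stop training. The rule also looks only at training accuracy, not at validation accuracy.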

5. Generators

Training generator with image augmentation

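from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
      rescale=1./255,          # Scale pixel values from [0, 255] to [0, 1]
      rotation_range=40,       # Random rotation of up to 40 degrees
      width_shift_range=0.2,   # Horizontal shift of up to 20% of the width
      height_shift_range=0.2,  # Vertical shift of up to 20% of the height
      shear_range=0.2,         # Shear angle of up to 0.2 degrees
      zoom_range=0.2,          # Zoom factor between 0.8 and 1.2
      horizontal_flip=True,    # Randomly mirror images left to right
      fill_mode='nearest')     # Fill newly exposed pixels with the nearest edge value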


train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size = (150, 150),
    class_mode = 'categorical',
    batch_size = 20
)

Found 2520 images belonging to 3 classes.
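To make the ranges concrete, the sketch below draws one random set of transformation parameters for a 150x150 image from the settings used above. It only illustrates the ranges and is not ImageDataGenerator's actual implementation: `sample_augmentation` is a hypothetical function, it assumes each value is drawn uniformly, and it uses a single zoom factor for simplicity. Note that in Keras shear_range is an angle in degrees, not a fraction.

```python
import random


def sample_augmentation(width=150, height=150, rng=random):
    """Draw one random set of augmentation parameters (illustrative only)."""
    return {
        "rotation_deg": rng.uniform(-40, 40),            # rotation_range=40
        "shift_x_px": rng.uniform(-0.2, 0.2) * width,    # width_shift_range=0.2
        "shift_y_px": rng.uniform(-0.2, 0.2) * height,   # height_shift_range=0.2
        "shear_deg": rng.uniform(-0.2, 0.2),             # shear_range=0.2
        "zoom": rng.uniform(0.8, 1.2),                   # zoom_range=0.2
        "flip_horizontal": rng.random() < 0.5,           # horizontal_flip=True
    }


params = sample_augmentation(rng=random.Random(0))
for name, value in params.items():
    print(name, value)
```

For a 150-pixel-wide image, a shift range of 0.2 means the content can move by up to 30 pixels in either direction.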

Validation generator (built from the labeled test directory, without augmentation)

validation_datagen = ImageDataGenerator(rescale=1./255)


validation_generator = validation_datagen.flow_from_directory(
    test_dir,
    target_size = (150, 150),
    class_mode = 'categorical',
    batch_size = 20
)

Found 372 images belonging to 3 classes.

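6. Training the model

Because the data comes from generators, we train with model.fit_generator. In TensorFlow 2.1 and later, model.fit accepts generators directly and fit_generator is deprecated.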

import numpy as np

history = model.fit_generator(
      train_generator,
      steps_per_epoch=int(np.ceil(2520/20)),   # 2520 images / batch size 20 = 126 steps
      epochs=10,
      validation_data=validation_generator,
      validation_steps=int(np.ceil(372/20)),   # 372 images / batch size 20 = 18.6, rounded up to 19 steps
      callbacks=[callbacks],
      verbose=2)

Epoch 1/10
126/126 - 46s - loss: 1.0141 - accuracy: 0.4591 - val_loss: 0.4937 - val_accuracy: 0.9301
Epoch 2/10
126/126 - 27s - loss: 0.5067 - accuracy: 0.7968 - val_loss: 0.0886 - val_accuracy: 0.9785
Epoch 3/10
126/126 - 27s - loss: 0.2712 - accuracy: 0.9056 - val_loss: 0.1290 - val_accuracy: 0.9624
Epoch 4/10
126/126 - 27s - loss: 0.1608 - accuracy: 0.9393 - val_loss: 0.1045 - val_accuracy: 0.9597
Epoch 5/10


Reached >95% accuracy so cancelling training!
126/126 - 26s - loss: 0.1408 - accuracy: 0.9512 - val_loss: 0.0784 - val_accuracy: 0.9677
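The 126 at the start of each log line is the number of batches per epoch. A minimal sketch of that arithmetic follows; `steps_for` and `last_batch_size` are names made up for this example.

```python
import math


def steps_for(num_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)


def last_batch_size(num_images, batch_size):
    """Size of the final batch, which may be smaller than the rest."""
    remainder = num_images % batch_size
    return remainder if remainder else batch_size


print(steps_for(2520, 20), last_batch_size(2520, 20))  # 126 20
print(steps_for(372, 20), last_batch_size(372, 20))    # 19 12
print(steps_for(33, 20), last_batch_size(33, 20))      # 2 13
```

Rounding up matters for the validation set: 372 is not a multiple of 20, so 18 steps would silently skip the last 12 images.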

7. Visualizing training

Let's plot the model's accuracy and loss over the epochs.

acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']


epochs = range(len(acc))


plt.figure(figsize=(7,7))


plt.plot(epochs, acc, 'r', label='Training accuracy')
plt.plot(epochs, val_acc, 'b', label='Validation accuracy')
plt.title('Training and validation accuracy')
plt.legend()


plt.figure(figsize=(7,7))


plt.plot(epochs, loss, 'r', label='Training Loss')
plt.plot(epochs, val_loss, 'b', label='Validation Loss')
plt.title('Training and validation loss')
plt.legend()


plt.show()

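The resulting plots are:

[Figure: training and validation accuracy per epoch]

[Figure: training and validation loss per epoch]

We can see that training accuracy rises and training loss falls with every epoch, while the validation metrics stay high and fluctuate slightly.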

8. Prediction

Preparing the test data

import pandas as pd

# The validation folder is a flat list of unlabeled image files
test_img = os.listdir(validation_dir)


test_df = pd.DataFrame({'Image': test_img})
test_df.head()

[Figure: first five rows of test_df, with a single Image column of file names]

Test generator

test_gen = ImageDataGenerator(rescale=1./255)


test_generator = test_gen.flow_from_dataframe(
    test_df, 
    validation_dir, 
    x_col = 'Image',
    y_col = None,
    class_mode = None,
    target_size = (150, 150),
    batch_size = 20,
    shuffle = False
)

Found 33 validated image filenames.

Model prediction

predict = model.predict_generator(test_generator, steps = int(np.ceil(33/20)))

Label mapping

To turn predicted class indices back into label names, we invert the training generator's class_indices attribute.

# Identifying the classes


label_map = dict((v,k) for k,v in train_generator.class_indices.items())
print(label_map)

{0: 'paper', 1: 'rock', 2: 'scissors'}
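The mapping step can be tried without a trained model. In the sketch below the probability rows are hypothetical softmax outputs, not real predictions, and the small `argmax` helper stands in for np.argmax so the example needs no third-party packages.

```python
def argmax(values):
    """Index of the largest element, like np.argmax on a 1-D array."""
    return max(range(len(values)), key=values.__getitem__)


# Same mapping the training generator reports in this article
class_indices = {'paper': 0, 'rock': 1, 'scissors': 2}
label_map = {index: name for name, index in class_indices.items()}

# Hypothetical softmax outputs for three images (one row per image)
predictions = [
    [0.90, 0.05, 0.05],
    [0.10, 0.20, 0.70],
    [0.15, 0.80, 0.05],
]

labels = [label_map[argmax(row)] for row in predictions]
print(labels)  # ['paper', 'scissors', 'rock']
```

flow_from_directory assigns indices in alphabetical order of the folder names, which is why paper is 0 even though os.listdir returned the folders in a different order.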

Plotting the predictions

test_df['Label'] = np.argmax(predict, axis = -1) # axis = -1 --> To compute the max element index within list of lists
test_df['Label'] = test_df['Label'].replace(label_map)
test_df.Label.value_counts().plot.bar(color = ['red','blue','green'])
plt.xticks(rotation = 0)
plt.show()

[Figure: bar chart of the number of predictions per label]

Model performance on other images

# Pick a random window of 18 consecutive rows so the 6x3 grid is always full
v = random.randint(0, max(0, len(test_df) - 18))


sample_test = test_df.iloc[v:(v+18)].reset_index(drop=True)


plt.figure(figsize=(12, 24))
for index, row in sample_test.iterrows():
    filename = row['Image']
    category = row['Label']
    img = load_img(os.path.join(validation_dir, filename), target_size=(150, 150))
    plt.subplot(6, 3, index + 1)
    plt.imshow(img)
    plt.xlabel(filename + ' ( ' + "{}".format(category) + ' )')
plt.tight_layout()
plt.show()

[Figure: grid of 18 validation images, each labeled with its file name and predicted class]

Model accuracy on unseen images

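# Each validation file name contains its true class (e.g. paper8.png), so a
# prediction counts as correct when the predicted label appears in the name.
lis = []
for ind in test_df.index:
    if test_df['Label'][ind] in test_df['Image'][ind]:
        lis.append(1)
    else:
        lis.append(0)

print("Accuracy of the model on test data is {:.2f}%".format((sum(lis)/len(lis))*100))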

Accuracy of the model on test data is 93.94%
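The same file-name check can be written more compactly. In the sketch below, `filename_accuracy` is a hypothetical helper, the first three file names come from the validation listing earlier in this article, and rock3.png and all four predicted labels are made up for illustration.

```python
def filename_accuracy(filenames, predicted_labels):
    """Percentage of files whose name contains the predicted label."""
    hits = sum(label in name for name, label in zip(filenames, predicted_labels))
    return 100.0 * hits / len(filenames)


files = ['paper8.png', 'paper1.png', 'scissors-hires1.png', 'rock3.png']
preds = ['paper', 'rock', 'scissors', 'rock']  # second prediction is wrong
print("{:.2f}%".format(filename_accuracy(files, preds)))  # 75.00%

# With 33 validation images, 93.94% corresponds to 31 correct predictions
print("{:.2f}%".format(100 * 31 / 33))  # 93.94%
```

This check relies on every file name containing its class name, so it would not carry over to a dataset with a different naming scheme.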
