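Deep Learning in Java?
Today I will show how to implement handwritten digit recognition in Java. Most of this article borrows from other bloggers, to whom I am grateful. I like to keep a record of what I have done and add a few personal observations, so I hope they will not mind. Thanks again.
This article is meant for Java enthusiasts who are new to Deeplearning4j. Feel free to leave a comment and discuss.
Tip: to avoid unnecessary trouble, update your JDK and Eclipse to 64-bit versions first.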
1.DL4J
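Compared with other deep learning frameworks, Deeplearning4j has the following advantages.
- Large-scale integration with mainstream JVM frameworks such as Spark, Hadoop, and Kafka
- Optimized to run on distributed CPUs and/or GPUs
- Serves the Java and Scala communities
- Commercial support is available for enterprise deployments
Beginners can download the DL4J example code from the official website.
The example project is built with Maven and contains several subprojects. The project layout is shown on the left of the figure below. A very handy GitHub code-reading tool is also recommended here.
Note that sample projects such as dl4j-examples and rl4j-examples all share the same parent project. The parent project itself contains nothing except a pom.xml file, which holds the settings that all child projects need. For example, when you open dl4j-examples, you will see a parent element in its pom.xml file, and that element points to the parent project's pom.xml. If the meaning of this element is unclear, look up what parent means in Maven.
When you need to create your own project, you can start from what the official examples already provide. There is a subproject named standalone-sample-project: copy the contents of its pom.xml into your own Maven project and you have a working starting point.
The full content is as follows: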
<?xml version="1.0" encoding="UTF-8"?>
<!--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~ Copyright (c) 2015-2019 Skymind, Inc.
~
~ This program and the accompanying materials are made available under the
~ terms of the Apache License, Version 2.0 which is available at
~ https://www.apache.org/licenses/LICENSE-2.0.
~
~ Unless required by applicable law or agreed to in writing, software
~ distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
~ WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
~ License for the specific language governing permissions and limitations
~ under the License.
~
~ SPDX-License-Identifier: Apache-2.0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-->
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<!-- Group-ID, artifact ID and version of the project. You can modify these as you want -->
<groupId>org.deeplearning4j</groupId>
<artifactId>deeplearning4j-examples</artifactId>
<version>1.0.0-beta4</version>
<!-- Properties Section. Change DL4J and ND4J versions here, if required -->
<properties>
<dl4j.version>1.0.0-beta4</dl4j.version>
<nd4j.version>1.0.0-beta4</nd4j.version>
<logback.version>1.2.3</logback.version>
<java.version>1.8</java.version>
<maven-shade-plugin.version>2.4.3</maven-shade-plugin.version>
</properties>
<dependencies>
<!-- deeplearning4j-core: contains main functionality and neural networks -->
<dependency>
<groupId>org.deeplearning4j</groupId>
<artifactId>deeplearning4j-core</artifactId>
<version>${dl4j.version}</version>
</dependency>
<!--
ND4J backend: every project needs one of these. The backend defines the hardware on which network training
will occur. "nd4j-native-platform" is for CPUs only (for running on all operating systems).
-->
<dependency>
<groupId>org.nd4j</groupId>
<artifactId>nd4j-native</artifactId>
<version>${nd4j.version}</version>
</dependency>
<!-- CUDA: to use GPU for training (CUDA) instead of CPU, uncomment this, and remove nd4j-native-platform -->
<!-- Requires CUDA to be installed to use. Change the version (8.0, 9.0, 9.1) to change the CUDA version -->
<!--
<dependency>
<groupId>org.nd4j</groupId>
<artifactId>nd4j-cuda-9.2-platform</artifactId>
<version>${nd4j.version}</version>
</dependency>
-->
<!-- Optional, but recommended: if you use CUDA, also use CuDNN. To use this, CuDNN must also be installed -->
<!-- See: https://deeplearning4j.org/cudnn -->
<!--
<dependency>
<groupId>org.deeplearning4j</groupId>
<artifactId>deeplearning4j-cuda-9.2</artifactId>
<version>${dl4j.version}</version>
</dependency>
-->
<dependency>
<groupId>ch.qos.logback</groupId>
<artifactId>logback-classic</artifactId>
<version>${logback.version}</version>
</dependency>
</dependencies>
<build>
<plugins>
<!-- Maven compiler plugin: compile for Java 8 -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.5.1</version>
<configuration>
<source>${java.version}</source>
<target>${java.version}</target>
</configuration>
</plugin>
<!--
Maven shade plugin configuration: this is required so that if you build a single JAR file (an "uber-jar")
it will contain all the required native libraries, and the backends will work correctly.
Used for example when running the following commands
mvn package
cd target
java -cp deeplearning4j-examples-1.0.0-beta-bin.jar org.deeplearning4j.LenetMnistExample
-->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>${maven-shade-plugin.version}</version>
<configuration>
<shadedArtifactAttached>true</shadedArtifactAttached>
<shadedClassifierName>bin</shadedClassifierName>
<createDependencyReducedPom>true</createDependencyReducedPom>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>org/datanucleus/**</exclude>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
<resource>reference.conf</resource>
</transformer>
<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
</transformer>
</transformers>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
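If you have no additional requirements, you only need to change the groupId and artifactId.
That covers the official examples. If anything is unclear while reading, leave a comment and we can discuss it.
2.MNIST Dataset
The MNIST dataset contains a training set of 60,000 examples and a test set of 10,000 examples. The training set is used to teach the algorithm to predict the integer label of an image accurately, while the test set is used to check how accurate the trained network's predictions are.
There is not much more to say about the dataset itself. To make learning easier, the Deeplearning4j team has already prepared the data, so you do not need to process it yourself. The MnistDataSetIterator class downloads the dataset for you and preprocesses it, which lowers the barrier for newcomers.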
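To give a feel for the work that is being hidden here, the following is a minimal sketch of reading the header of an MNIST image file in the original IDX format (four big-endian 32-bit integers: magic number 2051, image count, rows, columns). The class IdxHeaderSketch and its methods are hypothetical helpers written for this article; they are not part of DL4J and do not claim to mirror how MnistDataSetIterator is implemented internally.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of DL4J: parses the 16-byte header of an IDX image file.
class IdxHeaderSketch {
    static final int IMAGE_MAGIC = 2051; // 0x00000803 marks an IDX file of unsigned-byte images

    // Returns {imageCount, rows, columns}. ByteBuffer reads big-endian by default,
    // which is the byte order used by the IDX format.
    static int[] readImageHeader(byte[] header) {
        ByteBuffer buf = ByteBuffer.wrap(header);
        int magic = buf.getInt();
        if (magic != IMAGE_MAGIC) {
            throw new IllegalArgumentException("Not an IDX image file, magic = " + magic);
        }
        return new int[] { buf.getInt(), buf.getInt(), buf.getInt() };
    }

    // Builds an in-memory header so the sketch runs without downloading anything.
    static byte[] fakeHeader(int count, int rows, int cols) {
        return ByteBuffer.allocate(16)
                .putInt(IMAGE_MAGIC).putInt(count).putInt(rows).putInt(cols)
                .array();
    }

    public static void main(String[] args) {
        int[] h = readImageHeader(fakeHeader(60000, 28, 28));
        System.out.println(h[0] + " images of " + h[1] + "x" + h[2] + " pixels");
    }
}
```

With a real file you would read the first 16 bytes of the unzipped training image file instead of calling fakeHeader.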
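3.Handwritten Digit Recognition with DL4J
The official example for the MNIST dataset is rather simple. I found a blog post whose author did not use a plain fully connected (BP) neural network for recognition but trained a convolutional neural network instead, which gives better results. Convolutional networks and BP networks are not covered in detail here; if you have read this far, you probably have some theoretical background already. The code below is borrowed from that author, whose work is presented in the post "a handwritten digit recognition program trained with DL4J". It is really nice, isn't it? Only the training code is given below.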
package cn.rocket;

import java.io.File;
import java.io.IOException;

import org.deeplearning4j.datasets.iterator.impl.MnistDataSetIterator;
import org.deeplearning4j.eval.Evaluation;
import org.deeplearning4j.nn.api.OptimizationAlgorithm;
import org.deeplearning4j.nn.conf.BackpropType;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.ConvolutionLayer;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.conf.layers.SubsamplingLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.nn.weights.WeightInit;
import org.deeplearning4j.optimize.listeners.ScoreIterationListener;
import org.deeplearning4j.util.ModelSerializer;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.learning.config.Nesterovs;
import org.nd4j.linalg.lossfunctions.LossFunctions;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class TrainCNN {
    private static Logger log = LoggerFactory.getLogger(TrainCNN.class);

    public static void main(String[] args) throws IOException {
        // Image dimensions
        final int numRows = 28;
        final int numColumns = 28;
        final int channels = 1;
        int outputNum = 10; // number of neurons in the output layer (one per digit)
        int batchSize = 128; // size of each mini-batch
        int rngSeed = 123; // random seed, fixed for reproducibility
        int numEpochs = 1; // number of training epochs

        // Load the training set and the test set
        DataSetIterator mnistTrain = new MnistDataSetIterator(batchSize, true, rngSeed);
        DataSetIterator mnistTest = new MnistDataSetIterator(batchSize, false, rngSeed);

        log.info("Building the model....");
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder().seed(rngSeed).l2(0.0005)
                .weightInit(WeightInit.XAVIER).optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
                .updater(new Nesterovs(0.006, 0.9)).list()
                .layer(0,
                        new ConvolutionLayer.Builder(5, 5).nIn(channels).stride(1, 1).nOut(20)
                                .activation(Activation.IDENTITY).build())
                .layer(1,
                        new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX).kernelSize(2, 2).stride(2, 2)
                                .build())
                .layer(2,
                        new ConvolutionLayer.Builder(5, 5).stride(1, 1).nOut(50).activation(Activation.IDENTITY)
                                .build())
                .layer(3,
                        new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX).kernelSize(2, 2).stride(2, 2)
                                .build())
                .layer(4, new DenseLayer.Builder().activation(Activation.RELU).nOut(500).build())
                .layer(5,
                        new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD).nOut(outputNum)
                                .activation(Activation.SOFTMAX).build())
                .setInputType(InputType.convolutionalFlat(numRows, numColumns, channels))
                .backpropType(BackpropType.Standard).build();

        MultiLayerNetwork model = new MultiLayerNetwork(conf);
        model.init();
        // Log the score after every iteration
        model.setListeners(new ScoreIterationListener(1));

        log.info("Training the model....");
        model.fit(mnistTrain, numEpochs);

        log.info("Evaluating the model....");
        Evaluation eval = model.evaluate(mnistTest);
        log.info(eval.stats());

        // Save the trained model (false: do not save the updater state)
        File locationToSave = new File("模型.zip");
        ModelSerializer.writeModel(model, locationToSave, false);
    }
}
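The layer sizes in this network can be checked by hand. The sketch below walks the 28x28 input through the two convolution and two pooling layers and also computes how many iterations one epoch takes with a batch size of 128. It assumes no padding and the usual "(in - kernel) / stride + 1" rule, which I believe matches the defaults used above but have not verified against every DL4J version. LenetShapeSketch is a hypothetical helper class that needs no DL4J dependency.

```java
// Hypothetical helper: plain arithmetic for the layer sizes of the network above.
class LenetShapeSketch {
    // Output size along one dimension for a convolution or pooling window without padding.
    static int outSize(int in, int kernel, int stride) {
        return (in - kernel) / stride + 1;
    }

    // Number of mini-batches needed to see every example once (the last batch may be smaller).
    static int iterationsPerEpoch(int examples, int batchSize) {
        return (examples + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        int size = 28;
        size = outSize(size, 5, 1); // layer 0: 5x5 convolution, stride 1 -> 24
        size = outSize(size, 2, 2); // layer 1: 2x2 max pooling, stride 2 -> 12
        size = outSize(size, 5, 1); // layer 2: 5x5 convolution, stride 1 -> 8
        size = outSize(size, 2, 2); // layer 3: 2x2 max pooling, stride 2 -> 4
        int denseInputs = size * size * 50; // 50 feature maps come out of layer 2
        System.out.println("Dense layer inputs: " + denseInputs);
        System.out.println("Iterations per epoch: " + iterationsPerEpoch(60000, 128));
    }
}
```

Under these assumptions the dense layer receives 4 x 4 x 50 = 800 inputs, and one epoch over the 60,000 training images takes 469 iterations, so ScoreIterationListener(1) logs 469 scores per epoch.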
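My project on GitHub: https://github.com/jack13163/MNISTBYDL4J
4.Packaging as an exe
The program is so nice that I could not resist packaging it as an exe.
4.1 Building an executable jar
The first step in packaging a Java project as an exe is to build an executable jar. This jar should bundle every jar the project uses together with their dependencies. Since this is a Maven project, Maven is the natural tool for the job.
4.1.1 maven-compiler-plugin
The compiler plugin recompiles the project's Java source files into class files.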
<!-- Maven compiler plugin: compile for Java 8 -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.5.1</version>
<configuration>
<source>${java.version}</source>
<target>${java.version}</target>
</configuration>
</plugin>
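4.1.2 maven-dependency-plugin
The dependency plugin copies all the jars the project needs into a chosen directory under target. Here we choose lib. If the lib directory does not exist, it is created automatically.
<!-- Copy all dependency jars into the lib directory (the key step) -->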
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-dependency-plugin</artifactId>
<executions>
<execution>
<id>copy</id>
<phase>compile</phase>
<goals>
<goal>copy-dependencies</goal>
</goals>
<configuration>
<outputDirectory>
${project.build.directory}/lib
</outputDirectory>
</configuration>
</execution>
</executions>
</plugin>
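4.1.3 maven-jar-plugin
The jar plugin builds the project jar. With the Eclipse Maven plugin, invoke it via Maven -> build... and enter "install" in the Goals field, or simply right-click the project -> Maven -> install.
<!-- Build the project jar; the manifest Class-Path references the jars under lib/ -->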
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<configuration>
<archive>
<manifest>
<addClasspath>true</addClasspath>
<classpathPrefix>lib/</classpathPrefix>
<mainClass>cn.rocket.MainFrame</mainClass>
</manifest>
</archive>
</configuration>
</plugin>
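To see what addClasspath and classpathPrefix amount to, the sketch below builds a manifest with the JDK's java.util.jar API containing the same kind of entries: a Main-Class and a Class-Path whose entries all start with lib/. ManifestSketch is a hypothetical helper, and the two jar names are only illustrative; the real plugin lists every dependency of the project.

```java
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper: reproduces by hand the kind of manifest the configuration above asks for.
class ManifestSketch {
    static Manifest build(String mainClass, String prefix, String... jarNames) {
        Manifest mf = new Manifest();
        Attributes attrs = mf.getMainAttributes();
        // Manifest-Version must be set, otherwise Manifest.write emits no main attributes
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.MAIN_CLASS, mainClass);
        // Class-Path entries are separated by spaces and resolved relative to the jar's location
        StringBuilder cp = new StringBuilder();
        for (String jar : jarNames) {
            if (cp.length() > 0) {
                cp.append(' ');
            }
            cp.append(prefix).append(jar);
        }
        attrs.put(Attributes.Name.CLASS_PATH, cp.toString());
        return mf;
    }

    public static void main(String[] args) throws IOException {
        // Illustrative jar names only
        Manifest mf = build("cn.rocket.MainFrame", "lib/",
                "deeplearning4j-core-1.0.0-beta4.jar", "logback-classic-1.2.3.jar");
        mf.write(System.out);
    }
}
```

Because the Class-Path entries are relative, the lib directory has to travel together with the jar. That is the main difference from the single-jar approaches in the next two sections.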
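4.1.4 maven-assembly-plugin
The assembly plugin builds a single jar with all dependencies included. With the Eclipse Maven plugin, invoke it via Maven -> build... and enter "assembly:assembly" in the Goals field. If your version of the plugin no longer provides this goal, try "assembly:single" instead.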
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<configuration>
<!-- Optional: with this section the plugin directly produces a runnable jar -->
<archive>
<manifest>
<mainClass>cn.rocket.MainFrame</mainClass>
<addClasspath>true</addClasspath>
<classpathPrefix>lib/</classpathPrefix>
</manifest>
</archive>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
</plugin>
4.1.5 maven-shade-plugin
The shade plugin builds an uber-jar. With the Eclipse Maven plugin, invoke it via Maven -> build... and enter "package" in the Goals field.
<!--
Maven shade plugin configuration: this is required so that if you build a single JAR file (an "uber-jar")
it will contain all the required native libraries, and the backends will work correctly.
Used for example when running the following commands
mvn package
cd target
java -cp deeplearning4j-examples-1.0.0-beta-bin.jar org.deeplearning4j.LenetMnistExample
-->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>${maven-shade-plugin.version}</version>
<configuration>
<shadedArtifactAttached>true</shadedArtifactAttached>
<shadedClassifierName>bin</shadedClassifierName>
<createDependencyReducedPom>false</createDependencyReducedPom>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>org/datanucleus/**</exclude>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<mainClass>cn.rocket.MainFrame</mainClass>
</transformer>
<transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
<resource>reference.conf</resource>
</transformer>
<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
</transformers>
</configuration>
</execution>
</executions>
</plugin>
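Finally, place the generated jar and the model file in the same directory and run it with: java -jar XXX.jar. If an error appears, do not panic; leave a comment and we can sort it out.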
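One common cause of errors at this step is the model file not being found. A relative path such as the one passed to new File(...) in the training code is resolved against the directory the program is started from, not against the location of the jar. The sketch below shows one way to fail early with a readable message. ModelLocatorSketch and the default name model.zip are hypothetical; pass the actual model file name used in this article as the first argument.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: checks that the model file exists before the application tries to load it.
class ModelLocatorSketch {
    // Resolves fileName against baseDir and throws if no regular file is found there.
    static Path locate(Path baseDir, String fileName) {
        Path candidate = baseDir.resolve(fileName).toAbsolutePath();
        if (!Files.isRegularFile(candidate)) {
            throw new IllegalStateException("Model file not found: " + candidate);
        }
        return candidate;
    }

    public static void main(String[] args) {
        // "model.zip" is a placeholder default, not the name used by the project
        String name = args.length > 0 ? args[0] : "model.zip";
        try {
            // Paths.get("") is the working directory, i.e. where "java -jar" was started
            System.out.println("Using model: " + locate(Paths.get(""), name));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Starting the jar from the directory that contains both files, as described above, keeps the relative path valid.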
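4.2 Packaging the jar as an exe
This approach does not actually embed the JRE in the exe, but it does let the program run on a system where no Java environment has been configured. In the end only the three files shown below are needed to run the program on another Windows machine: minist.exe is the executable generated from the jar, jre is the accompanying Java runtime, and 模型 is the network model trained earlier in this project.
Packaging mainly relies on a tool called exe4j. For a tutorial, see: Packaging an exe with exe4j. An executable built with exe4j shows the notice below; just click OK.
Below is the recognition result. Not bad, right? The network was trained for only one epoch, which is too few. You can find the code, adjust it, and train it again.
Finally, thanks once more to the original author of this program.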