使用Maven有一段时间了,但是都是自己瞎折腾,终于有时间做一个基本的总结(但只限于基本的使用,因为没有系统的学习过)
基本的概念
maven是一个项目构建和管理的工具,提供了帮助管理 构建、文档、报告、依赖、scms、发布、分发的方法。可以方便的编译代码、进行依赖管理、管理二进制库等等。
maven的好处在于可以将项目过程规范化、自动化、高效化以及强大的可扩展性
利用maven自身及其插件还可以获得代码检查报告、单元测试覆盖率、实现持续集成等等。
maven的核心概念介绍
pom是project object Model的缩写,maven2 中直接是pom.xml,pom文件中包含了项目的信息和maven build项目所需的配置信息,通常有项目信息(如版本、成员)、项目的依赖、插件和goal、build选项等等
pom是可以继承的,通常对于一个大型的项目或是多个module的情况,子模块的pom需要指定父模块的pom(比如sspark-examples_2.11依赖于父模块spark-parent_2.11)
<?xml version="1.0" encoding="UTF-8"?>
<!--
~ Licensed to the Apache Software Foundation (ASF) under one or more
~ contributor license agreements. See the NOTICE file distributed with
~ this work for additional information regarding copyright ownership.
~ The ASF licenses this file to You under the Apache License, Version 2.0
~ (the "License"); you may not use this file except in compliance with
~ the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing, software
~ distributed under the License is distributed on an "AS IS" BASIS,
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~ See the License for the specific language governing permissions and
~ limitations under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.11</artifactId>
<version>2.3.0-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>
<artifactId>spark-examples_2.11</artifactId>
<packaging>jar</packaging>
<name>Spark Project Examples</name>
<url>http://spark.apache.org/</url>
<properties>
<sbt.project.name>examples</sbt.project.name>
<build.testJarPhase>none</build.testJarPhase>
<build.copyDependenciesPhase>package</build.copyDependenciesPhase>
<flume.deps.scope>provided</flume.deps.scope>
<hadoop.deps.scope>provided</hadoop.deps.scope>
<hive.deps.scope>provided</hive.deps.scope>
<parquet.deps.scope>provided</parquet.deps.scope>
</properties>
<dependencies>
<!-- Prevent our dummy JAR from being included in Spark distributions or uploaded to YARN -->
<dependency>
<groupId>org.spark-project.spark</groupId>
<artifactId>unused</artifactId>
<version>1.0.0</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.binary.version}</artifactId>
<version>${project.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_${scala.binary.version}</artifactId>
<version>${project.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_${scala.binary.version}</artifactId>
<version>${project.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_${scala.binary.version}</artifactId>
<version>${project.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-graphx_${scala.binary.version}</artifactId>
<version>${project.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-flume_${scala.binary.version}</artifactId>
<version>${project.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka-0-8_${scala.binary.version}</artifactId>
<version>${project.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql-kafka-0-10_${scala.binary.version}</artifactId>
<version>${project.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-math3</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.scalacheck</groupId>
<artifactId>scalacheck_${scala.binary.version}</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.github.scopt</groupId>
<artifactId>scopt_${scala.binary.version}</artifactId>
<version>3.3.0</version>
</dependency>
<dependency>
<groupId>com.twitter</groupId>
<artifactId>parquet-hadoop-bundle</artifactId>
<scope>provided</scope>
</dependency>
</dependencies>
<build>
<outputDirectory>target/scala-${scala.binary.version}/classes</outputDirectory>
<testOutputDirectory>target/scala-${scala.binary.version}/test-classes</testOutputDirectory>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-deploy-plugin</artifactId>
<configuration>
<skip>true</skip>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-install-plugin</artifactId>
<configuration>
<skip>true</skip>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<configuration>
<outputDirectory>${jars.target.dir}</outputDirectory>
</configuration>
</plugin>
</plugins>
</build>
<profiles>
<profile>
<id>kinesis-asl</id>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kinesis-asl_${scala.binary.version}</artifactId>
<version>${project.version}</version>
<scope>provided</scope>
</dependency>
</dependencies>
</profile>
</profiles>
</project>
<project>:文件的顶级元素
<modelVersion>: 所使用的object model版本,为了确保稳定的使用,这个元素是强制性的。除非maven开发者升级模板,否则不需要修改
<groupId>: 是项目创建团体或组织的唯一标志符,通常是域名倒写,如groupId org.apache.maven.plugins就是为所有maven插件预留的
<artifactId>: 是项目artifact唯一的基地址名
<packaging>: artifact打包的方式,如jar、war、ear等等。默认为jar。这个不仅表示项目最终产生何种后缀的文件,也表示build过程使用什么样的<a target="_blank" href="http://maven.apache.org/guides/introduction/introduction-to-the-lifecycle.html#Built-in_Lifecycle_Bindings">lifecycle</a>。
<version>: artifact的版本,通常能看见为类似0.0.1-SNAPSHOT,其中SNAPSHOT表示项目开发中,为开发版本
<name>:表示项目的展现名,在maven生成的文档中使用
<url>:表示项目的地址,在maven生成的文档中使用
<description>: 表示项目的描述,在maven生成的文档中使用
<dependencies>: 表示依赖,在子节点dependencies中添加具体依赖的groupId artifactId和version
<build> 表示build配置
<properties>:表示相应的配置属性(尤其是多次使用的dependencies中的version),如上面的pom 所示
<profiles>:maven能够自动适应外部的环境变化,比如Linux,window,
<modules>:module是指具体子模块的目录名,而不是子工程的artifactId
parent 表示父pom
<scope>:范围依赖(compile,test,runtime,provided,system)参考