使用Maven有一段时间了,但是都是自己瞎折腾,终于有时间做一个基本的总结(但只限于基本的使用,因为没有系统的学习过)

基本的概念

maven是一个项目构建和管理的工具,提供了帮助管理 构建文档报告依赖scms发布分发的方法。可以方便的编译代码、进行依赖管理、管理二进制库等等。

maven的好处在于可以将项目过程规范化、自动化、高效化以及强大的可扩展性

利用maven自身及其插件还可以获得代码检查报告、单元测试覆盖率、实现持续集成等等。

maven的核心概念介绍

pom是project object Model的缩写,maven2 中直接是pom.xml,pom文件中包含了项目的信息和maven build项目所需的配置信息,通常有项目信息(如版本、成员)、项目的依赖、插件和goal、build选项等等

pom是可以继承的,通常对于一个大型的项目或是多个module的情况,子模块的pom需要指定父模块的pom(比如sspark-examples_2.11依赖于父模块spark-parent_2.11)

<?xml version="1.0" encoding="UTF-8"?>
<!--
  ~ Licensed to the Apache Software Foundation (ASF) under one or more
  ~ contributor license agreements.  See the NOTICE file distributed with
  ~ this work for additional information regarding copyright ownership.
  ~ The ASF licenses this file to You under the Apache License, Version 2.0
  ~ (the "License"); you may not use this file except in compliance with
  ~ the License.  You may obtain a copy of the License at
  ~
  ~    http://www.apache.org/licenses/LICENSE-2.0
  ~
  ~ Unless required by applicable law or agreed to in writing, software
  ~ distributed under the License is distributed on an "AS IS" BASIS,
  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  ~ See the License for the specific language governing permissions and
  ~ limitations under the License.
  -->

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-parent_2.11</artifactId>
    <version>2.3.0-SNAPSHOT</version>
    <relativePath>../pom.xml</relativePath>
  </parent>

  <artifactId>spark-examples_2.11</artifactId>
  <packaging>jar</packaging>
  <name>Spark Project Examples</name>
  <url>http://spark.apache.org/</url>

  <properties>
    <sbt.project.name>examples</sbt.project.name>
    <build.testJarPhase>none</build.testJarPhase>
    <build.copyDependenciesPhase>package</build.copyDependenciesPhase>
    <flume.deps.scope>provided</flume.deps.scope>
    <hadoop.deps.scope>provided</hadoop.deps.scope>
    <hive.deps.scope>provided</hive.deps.scope>
    <parquet.deps.scope>provided</parquet.deps.scope>
  </properties>

  <dependencies>
    <!-- Prevent our dummy JAR from being included in Spark distributions or uploaded to YARN -->
    <dependency>
      <groupId>org.spark-project.spark</groupId>
      <artifactId>unused</artifactId>
      <version>1.0.0</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-mllib_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-hive_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-graphx_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-flume_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka-0-8_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql-kafka-0-10_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-math3</artifactId>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.scalacheck</groupId>
      <artifactId>scalacheck_${scala.binary.version}</artifactId>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>com.github.scopt</groupId>
      <artifactId>scopt_${scala.binary.version}</artifactId>
      <version>3.3.0</version>
    </dependency>
    <dependency>
      <groupId>com.twitter</groupId>
      <artifactId>parquet-hadoop-bundle</artifactId>
      <scope>provided</scope>
    </dependency>
  </dependencies>

  <build>
    <outputDirectory>target/scala-${scala.binary.version}/classes</outputDirectory>
    <testOutputDirectory>target/scala-${scala.binary.version}/test-classes</testOutputDirectory>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-deploy-plugin</artifactId>
        <configuration>
          <skip>true</skip>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-install-plugin</artifactId>
        <configuration>
          <skip>true</skip>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-jar-plugin</artifactId>
        <configuration>
          <outputDirectory>${jars.target.dir}</outputDirectory>
        </configuration>
      </plugin>
    </plugins>
  </build>
  <profiles>
    <profile>
      <id>kinesis-asl</id>
      <dependencies>
        <dependency>
          <groupId>org.apache.spark</groupId>
          <artifactId>spark-streaming-kinesis-asl_${scala.binary.version}</artifactId>
          <version>${project.version}</version>
          <scope>provided</scope>
        </dependency>
      </dependencies>
    </profile>
  </profiles>
</project>
<project>:文件的顶级元素  
<modelVersion>: 所使用的object model版本,为了确保稳定的使用,这个元素是强制性的。除非maven开发者升级模板,否则不需要修改  
<groupId>: 是项目创建团体或组织的唯一标志符,通常是域名倒写,如groupId  org.apache.maven.plugins就是为所有maven插件预留的  
<artifactId>: 是项目artifact唯一的基地址名  
<packaging>: artifact打包的方式,如jar、war、ear等等。默认为jar。这个不仅表示项目最终产生何种后缀的文件,也表示build过程使用什么样的<a target="_blank" href="http://maven.apache.org/guides/introduction/introduction-to-the-lifecycle.html#Built-in_Lifecycle_Bindings">lifecycle</a>。  
<version>: artifact的版本,通常能看见为类似0.0.1-SNAPSHOT,其中SNAPSHOT表示项目开发中,为开发版本  
<name>:表示项目的展现名,在maven生成的文档中使用  
<url>:表示项目的地址,在maven生成的文档中使用  
<description>: 表示项目的描述,在maven生成的文档中使用  
<dependencies>: 表示依赖,在子节点dependencies中添加具体依赖的groupId artifactId和version  
<build> 表示build配置  
<properties>:表示相应的配置属性(尤其是多次使用的dependencies中的version),如上面的pom 所示
<profiles>:maven能够自动适应外部的环境变化,比如Linux,window,
<modules>:module是指具体子模块的目录名,而不是子工程的artifactId
parent 表示父pom  
<scope>:范围依赖(compile,test,runtime,provided,system)参考