Preface

Hadoop on this cluster already has LZO configured. If Spark is not configured for LZO as well, submitted jobs fail with: Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found.
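The cause is that Spark loads the cluster's Hadoop configuration, and core-site.xml registers com.hadoop.compression.lzo.LzoCodec under io.compression.codecs, so every Spark job needs that class on its classpath. You can confirm the codec is registered with a quick grep (assuming your Hadoop config lives in the default directory):

grep -A 2 "io.compression.codecs" $HADOOP_HOME/etc/hadoop/core-site.xml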

Locate the LZO jar that Hadoop uses

"/root/soft/hadoop-2.7.2/share/hadoop/common/hadoop-lzo-0.4.21-SNAPSHOT.jar"
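If your Hadoop or hadoop-lzo version differs, the file name will too; a find locates it (the path above is just this cluster's layout):

find $HADOOP_HOME/share/hadoop/common -name "hadoop-lzo*.jar"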

Configure Spark

编辑"/root/soft/spark-2.1.1-bin-hadoop2.7/conf/spark-defaults.conf"配置文件

spark.driver.extraClassPath /root/soft/hadoop-2.7.2/share/hadoop/common/hadoop-lzo-0.4.21-SNAPSHOT.jar
spark.executor.extraClassPath /root/soft/hadoop-2.7.2/share/hadoop/common/hadoop-lzo-0.4.21-SNAPSHOT.jar
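These two properties put the jar on the classpath of the driver JVM and of every executor JVM. If you prefer not to change spark-defaults.conf, the same can be passed per job on the spark-submit command line (standard spark-submit options, shown with this cluster's path; remaining arguments as usual):

bin/spark-submit \
  --driver-class-path /root/soft/hadoop-2.7.2/share/hadoop/common/hadoop-lzo-0.4.21-SNAPSHOT.jar \
  --conf spark.executor.extraClassPath=/root/soft/hadoop-2.7.2/share/hadoop/common/hadoop-lzo-0.4.21-SNAPSHOT.jar \
  ...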

Since my hadoop-lzo jar sits directly under Hadoop's share/hadoop/common directory, the path can also be written in terms of $HADOOP_HOME instead of being hard-coded, as sketched below.
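A sketch of that variant (one caveat: spark-defaults.conf itself does not expand shell variables, so whether $HADOOP_HOME resolves depends on how each JVM's launch command is built; if in doubt, the hard-coded absolute path above is the safer choice):

spark.driver.extraClassPath $HADOOP_HOME/share/hadoop/common/hadoop-lzo-0.4.21-SNAPSHOT.jar
spark.executor.extraClassPath $HADOOP_HOME/share/hadoop/common/hadoop-lzo-0.4.21-SNAPSHOT.jar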

配置"/root/soft/spark-2.1.1-bin-hadoop2.7/conf/spark-env.sh" 配置文件

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native
export SPARK_LIBRARY_PATH=$SPARK_LIBRARY_PATH:$HADOOP_HOME/lib/native
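These lines let Spark find Hadoop's native libraries, which LZO needs at runtime. You can verify they are actually present (libgplcompression.so is the native library that hadoop-lzo builds; the exact listing varies by build):

ls $HADOOP_HOME/lib/native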

Run a Spark job to test

[root@zjj101 spark-2.1.1-bin-hadoop2.7]# bin/spark-submit  --master yarn  --deploy-mode client --class com.WordCount /root/soft/demo02-1.0-SNAPSHOT.jar hdfs://zjj101:9000/wordCountData

The job ran to completion without the error, so the configuration works; I won't paste the result here.
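If the absence of a stack trace is not convincing enough, you can also grep the finished application's logs for codec problems (the application id below is a placeholder; take the real one from the spark-submit output or from yarn application -list):

yarn logs -applicationId application_xxxxxxxxxxxxx_xxxx | grep -i lzo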