We have already covered setting up Hadoop in pseudo-distributed and fully distributed modes; see "Hadoop Cluster Setup in the Big Data Era".
I originally planned to jump straight into operating HDFS from Java code, but on reflection a quick review of the HDFS command line is worth doing first. In my view, operating HDFS feels very similar to operating files on a Linux system; you just prefix the command with hdfs dfs - or hadoop fs -, for example:
# create a directory
hdfs dfs -mkdir <path>
or
hadoop fs -mkdir /test
This naturally raises the question of how hadoop fs and hdfs dfs differ: in short, hadoop fs works against any filesystem Hadoop supports (local, HDFS, and so on), hdfs dfs operates specifically on HDFS, and the older hadoop dfs form is deprecated in favor of hdfs dfs.
Reference: Hadoop: the difference between the hadoop fs, hadoop dfs, and hdfs dfs commands
Back to the main topic. The following shows how to operate the HDFS file system from Java.
pom.xml — keep the Hadoop dependency version consistent with the Hadoop version on the server as far as possible (hadoop-client already pulls in hadoop-common and hadoop-hdfs transitively, so declaring all three is belt and braces):
<properties>
    <hadoop-version>2.6.5</hadoop-version>
</properties>
<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>${hadoop-version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>${hadoop-version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>${hadoop-version}</version>
    </dependency>
</dependencies>
HdfsUtil.java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsUtil {

    private Configuration configuration;
    private FileSystem fileSystem;
    /**
     * Initialize a util instance bound to an HDFS URI and user.
     * @param url  HDFS URI, e.g. hdfs://host:9000
     * @param user user name to connect as
     * @return an initialized HdfsUtil
     * @throws IOException
     * @throws InterruptedException
     */
    public static HdfsUtil getUtil(String url, String user) throws IOException, InterruptedException {
        HdfsUtil util = new HdfsUtil();
        util.configuration = new Configuration();
        util.fileSystem = FileSystem.get(URI.create(url), util.configuration, user);
        return util;
    }
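    // Note: the returned instance wraps a live FileSystem handle; reuse it across
    // calls and release it with close() (defined below) when you are finished.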
    /**
     * Create a directory (mkdirs also creates missing parents, like mkdir -p).
     * @param filePath directory to create
     * @return true if the directory now exists
     */
    public boolean createPath(String filePath) {
        boolean b = false;
        Path path = new Path(filePath);
        try {
            b = this.fileSystem.mkdirs(path);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return b;
    }
    /**
     * Check whether a path exists, and whether it points to a file or a directory.
     * Author: lwl liuwanli_eamil@163.com 2018/12/25 15:08
     * @param filePath path to check
     * @return int 0: does not exist, 1: file, 2: directory
     */
    public int checkFile(String filePath) {
        Path path = new Path(filePath);
        int result = 0;
        try {
            if (this.fileSystem.exists(path)) {
                if (this.fileSystem.isDirectory(path)) {
                    result = 2;
                } else {
                    result = 1;
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return result;
    }
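    // Example: after createPath("/test"), checkFile("/test") returns 2; for an
    // uploaded file such as /test/vmware.log it returns 1; for a missing path, 0.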
    /**
     * Upload a local file to HDFS.
     * @param sourcePath local source path
     * @param savePath   destination path on HDFS
     */
    public void uploadFile(String sourcePath, String savePath) {
        Path source = new Path(sourcePath);
        Path dest = new Path(savePath);
        try {
            this.fileSystem.copyFromLocalFile(source, dest);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    /**
     * Upload from an arbitrary InputStream to HDFS.
     * @param input    stream to read from (the caller is responsible for closing it)
     * @param savePath destination path on HDFS
     */
    public void uploadFile(InputStream input, String savePath) throws IOException {
        Path inFile = new Path(savePath);
        // create() creates (or overwrites) the target file, so no separate createNewFile call is needed
        FSDataOutputStream output = this.fileSystem.create(inFile);
        IOUtils.copyBytes(input, output, 1024 * 1024 * 64, false); // 64 MB copy buffer
        output.close();
    }
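    // Usage sketch, assuming a local file wrapped in a FileInputStream (hypothetical
    // path; add a java.io.FileInputStream import to use it):
    //   try (InputStream in = new FileInputStream("H:\\some\\local.file")) {
    //       util.uploadFile(in, "/test/local.file");
    //   }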
    /**
     * Download a file from HDFS and write it to the given OutputStream.
     * @param sourcePath source path on HDFS
     * @param out        stream to write to (the caller is responsible for closing it)
     * @throws IOException
     */
    public void downloadFile(String sourcePath, OutputStream out) throws IOException {
        Path inFile = new Path(sourcePath);
        FSDataInputStream input = this.fileSystem.open(inFile);
        IOUtils.copyBytes(input, out, 1024 * 1024 * 64, false); // 64 MB copy buffer
        input.close();
    }
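    // Usage sketch, mirroring the upload above (hypothetical local path; add a
    // java.io.FileOutputStream import to use it):
    //   try (OutputStream out = new FileOutputStream("H:\\vmware.log")) {
    //       util.downloadFile("/test/vmware.log", out);
    //   }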
    /**
     * Release the underlying FileSystem handle once all operations are done.
     */
    public void close() throws IOException {
        this.fileSystem.close();
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        String url = "hdfs://my-cdh-master:9000"; // note: port 9000 must match the fs.defaultFS setting in core-site.xml
        HdfsUtil util = HdfsUtil.getUtil(url, "root");
        util.uploadFile("H:\\VMachines\\my_cdh_slave1\\vmware.log", "/test/vmware.log");
        util.close();
    }
}
For more HDFS operations, consult the API as needed.
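As an example, here is a minimal sketch of two common operations the class above leaves out, listing and deleting, written as extra HdfsUtil methods. The names listFiles and delete are my own choice, and an additional import of org.apache.hadoop.fs.FileStatus is required:

    /**
     * List the entries directly under a directory.
     */
    public void listFiles(String dirPath) throws IOException {
        for (FileStatus status : this.fileSystem.listStatus(new Path(dirPath))) {
            System.out.println((status.isDirectory() ? "dir  " : "file ") + status.getPath());
        }
    }

    /**
     * Delete a path; pass recursive=true to remove a non-empty directory.
     */
    public boolean delete(String filePath, boolean recursive) throws IOException {
        return this.fileSystem.delete(new Path(filePath), recursive);
    }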
With that, operating HDFS from Java is done~~~~