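How Hadoop's replica redundancy storage strategy stores three replicas

Reposted

码海舵手之心 2024-11-04 10:32:59

Tags: hadoop, upload, HDFS. Categories: Hadoop, Big Data

A storage system based on Hadoop HDFS (a web network drive)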

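1. Advantages of HDFS

1.1 The source code comments put it clearly
1.2 How it appears as a single unit, and how fault tolerance works
1.3 Drawbacks of developing on a traditional storage platform

2. Code implementation

2.1 Upload
2.2 Download

Conclusion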

1. Advantages of HDFS

1.1 The source code comments put it clearly:

Hadoop DFS is a multi-machine system that appears as a single
disk. It’s useful because of its fault tolerance and potentially
very large capacity.
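Explanation: HDFS is a multi-machine system that appears to the outside world as a single disk. Its fault tolerance and large capacity are what make it so useful.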

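1.2 How it appears as a single unit, and how fault tolerance works

1) Unity (virtualization)
The NameNode holds the metadata and acts as the steward of the whole system. What the client faces directly is the NameNode, which manages all of the data nodes (DataNodes). To the client, the entire storage side is therefore a single black box, and it does not need to care which server actually holds the data.
2) Fault tolerance
Multi-machine replication of data, data checksums, heartbeat detection, block reports, and fault-tolerant reads and writes. Together these wrap our data in a much stronger layer of protection.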
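
The multi-machine replication listed above is where the three replicas come from: HDFS keeps three copies of each block by default. As a rough illustration, and not Hadoop's actual code, the sketch below models the default rack-aware placement, assuming the writer runs on a DataNode: the first replica goes to the writer's own node, the second to a node on a different rack, and the third to a different node on that same remote rack. `ReplicaPlacementSketch`, `Node`, and `chooseTargets` are hypothetical names. The real policy also picks nodes at random and weighs load and free space, which this toy version ignores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of three-replica, rack-aware placement; not Hadoop's real implementation. */
final class ReplicaPlacementSketch {

    /** Hypothetical description of a DataNode: its name and the rack it sits in. */
    static final class Node {
        final String name;
        final String rack;

        Node(String name, String rack) {
            this.name = name;
            this.rack = rack;
        }

        @Override
        public String toString() {
            return name + "@" + rack;
        }
    }

    /** Returns up to three targets: writer, a remote-rack node, a second node on that remote rack. */
    static List<Node> chooseTargets(Node writer, List<Node> cluster) {
        List<Node> targets = new ArrayList<>();
        targets.add(writer); // replica 1: the writer's own node

        Node second = null;
        for (Node n : cluster) {
            if (!n.rack.equals(writer.rack)) { // replica 2: first node found on another rack
                second = n;
                break;
            }
        }
        if (second == null) {
            return targets; // single-rack cluster: this sketch simply stops here
        }
        targets.add(second);

        for (Node n : cluster) {
            // replica 3: a different node on the same rack as replica 2
            if (n.rack.equals(second.rack) && !n.name.equals(second.name)) {
                targets.add(n);
                break;
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        Node writer = new Node("dn1", "rackA");
        List<Node> cluster = Arrays.asList(
                writer, new Node("dn2", "rackA"), new Node("dn3", "rackB"), new Node("dn4", "rackB"));
        System.out.println(chooseTargets(writer, cluster));
    }
}
```

Placing two of the three copies on one remote rack keeps cross-rack write traffic down while still surviving the loss of an entire rack.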

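1.3 Drawbacks of developing on a traditional storage platform

1)
Business code has to be aware of how the server-side storage nodes are distributed and what state they are in.
On every operation it must find out which nodes can accept data and which are short of space, and then decide where to put the file.
2)
ZooKeeper (ZK) or a database can of course be used to discover the storage nodes dynamically, but that still does not remove the coupling between the business side and the servers; it still needs to have…
3)
Data safety is hard to guarantee. Even where it is guaranteed, the business side has to take on too much extra logic, and the performance cost is considerable.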
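
To make the coupling in point 1) concrete, here is the kind of bookkeeping that business code ends up carrying when there is no layer like HDFS underneath. This is a hypothetical sketch: `ManualNodeSelection`, `StorageNode`, and `pick` are invented for illustration and do not come from the original project.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical node selection that an application would have to do by itself. */
final class ManualNodeSelection {

    /** Snapshot of one storage server as the business code sees it. */
    static final class StorageNode {
        final String host;
        final long freeBytes;
        final boolean alive;

        StorageNode(String host, long freeBytes, boolean alive) {
            this.host = host;
            this.freeBytes = freeBytes;
            this.alive = alive;
        }
    }

    /** Picks the live node with the most free space that can still hold the file. */
    static Optional<StorageNode> pick(List<StorageNode> nodes, long fileSize) {
        StorageNode best = null;
        for (StorageNode n : nodes) {
            if (!n.alive || n.freeBytes < fileSize) {
                continue; // skip dead nodes and nodes without enough room
            }
            if (best == null || n.freeBytes > best.freeBytes) {
                best = n;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

With HDFS this decision sits behind the NameNode, so the application only names a path.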

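2. Code implementation

Note: the backend is implemented with Spring Boot.
First we need to connect to the cluster. Through the master node we obtain the state of the storage space and the data nodes that will hold the data, which determines where the data is sent next and how it is distributed.

Interaction with the cluster can rely on an important open-source package, hadoop-common, the common support library for the other Hadoop modules. It includes conf, FS, IO, IPC, and other parts.
The hadoop-common source code is at:
https://github.com/apache/hadoop/tree/trunk/hadoop-common-project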

2.1 Upload

1) Receive the user's upload request

package com.hdlw.controls;

@Controller
public class UploadMap {
    ...
    // Accepts a multipart upload and hands its stream to the HDFS helper.
    @RequestMapping(value = "/upload")
    @ResponseBody
    public ReturnMessage upload(@RequestParam("file") MultipartFile file, HttpServletRequest httpServletRequest) {
        ...
        // path is the HDFS target path, set in the elided code above.
        int result = UploadDownload.upload(file.getInputStream(), path);
        ...
    }
}

2) Handle the upload and store the file

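public class UploadDownload {
    ...
    // Writes the stream to the given HDFS path.
    // Returns the value reported by IOUtils.copy, or -1 on failure.
    public static int upload(InputStream inputStream, String path) {
        int result = -1;
        // checkAndInit() is elided here; null means no FileSystem handle is available.
        FileSystem fileSystem = checkAndInit();
        if (fileSystem == null) {
            return result;
        }
        OutputStream outputStream = null;
        try {
            // Second argument (overwrite = true) replaces an existing file at this path.
            outputStream = fileSystem.create(new Path(path), true);
            result = IOUtils.copy(inputStream, outputStream);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Always release both streams, whether or not the copy succeeded.
            close(inputStream, outputStream);
        }
        return result;
    }
    ...
}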
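
One detail worth checking in the `IOUtils.copy` call above: the import is not shown, but if it is Apache Commons IO's `IOUtils` (an assumption), `copy(InputStream, OutputStream)` returns an `int` and reports -1 once more than `Integer.MAX_VALUE` bytes have been copied, which `upload` would then treat as a failure for files over 2 GB. A minimal sketch of a copy loop that counts in a `long` instead is shown below; `StreamCopySketch` is a hypothetical name and uses only the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper: buffered stream copy that reports the byte count as a long. */
final class StreamCopySketch {

    /** Copies everything from in to out and returns the number of bytes written. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        // read() returns -1 at end of stream
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[20000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink);
        System.out.println(copied + " bytes copied");
    }
}
```

The caller stays responsible for closing both streams, as the `finally` block in `upload` already does.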

2.2 Download

1) Receive the user's download request

@GetMapping("/download")
public String downloadFile(HttpServletRequest request, HttpServletResponse response, @RequestParam String filenameName) {
    .....
    try {
        // Ask the browser to save the response as a file instead of rendering it.
        response.setContentType("application/force-download");
        response.addHeader("Content-Disposition", "attachment;fileName=" + filenameName);
        // Stream the HDFS file straight into the HTTP response body.
        int result = UploadDownload.download(response.getOutputStream(), path);
        System.out.println(result);
        if (result != -1) {
            returnMessage.setMsg(constants.operate_ok);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    // The body has already been written to the response, so nothing is returned here.
    return null;
}
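
The `Content-Disposition` header above concatenates the raw file name, so names containing spaces or non-ASCII characters (common on a web drive) may reach the browser garbled. One possible approach, sketched here under the assumption that clients understand the RFC 6266 `filename*` form, is to percent-encode the name as UTF-8. `ContentDispositionSketch` and `attachment` are hypothetical names, not part of the original project.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper that builds a Content-Disposition value for a download. */
final class ContentDispositionSketch {

    /** Returns an attachment header value with the file name percent-encoded as UTF-8. */
    static String attachment(String fileName) {
        try {
            String encoded = URLEncoder.encode(fileName, "UTF-8")
                    .replace("+", "%20")   // URLEncoder writes spaces as '+'; headers need %20
                    .replace("*", "%2A");  // '*' is left as-is by URLEncoder but is not allowed here
            return "attachment; filename*=UTF-8''" + encoded;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(attachment("annual report.txt"));
    }
}
```

In the controller this value would be passed as the second argument of `response.addHeader("Content-Disposition", ...)`.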

2) Handle the download request

// Copies the HDFS file at path into outputStream.
// Returns the value reported by IOUtils.copy, or -1 if the file is missing or the copy fails.
public static int download(OutputStream outputStream, String path) {
    int result = -1;
    FileSystem fileSystem = checkAndInit();
    if (fileSystem == null) {
        return result;
    }
    InputStream inputStream = null;
    Path filePath = new Path(path);
    try {
        // Bail out early if the file does not exist; finally still closes the streams.
        if (!fileSystem.exists(filePath)) {
            return result;
        }
        inputStream = fileSystem.open(filePath);
        result = IOUtils.copy(inputStream, outputStream);
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        close(inputStream, outputStream);
    }
    return result;
}

Conclusion

With hadoop-common we can interact with the NameNode to request storage space or look up files.

The NameNode is where the file metadata is stored.

