OSS实现多文件多线程的断点下载(java)


开放存储服务(Open Storage Service,OSS),是阿里云对外提供的海量、安全和高可靠的云存储服务,目前越来越多的开发者将应用数据存放至OSS,对使用OSS实现文件的断点续传功能使用的也比较多,在这儿分享下自己使用OSS的实例。

所谓断点下载,就是要从文件已经下载的地方开始继续下载,对于断点续传这样有状态功能的实现,关键点在于如何在客户端完成状态维护。此篇主要介绍多文件的多线程的断点下载。

为了限定同时下载文件的个数和每个文件同时下载的线程数,使用了线程池嵌套线程池来完成,外层线程池downloadMainPool用来限定并发下载的文件数,内层线程池pool执行单个文件的多线程下载。

downloadMainPool 代码片段:

public static ExecutorService downloadMainPool = null;
    static{
        downloadMainPool = Executors.newFixedThreadPool(Constant.CONCURRENT_FILE_NUMBER,new ThreadFactory() {
            public Thread newThread(Runnable r) {
                Thread s = Executors.defaultThreadFactory().newThread(r);
                s.setDaemon(true);
                return s;
            }
        });
    }

在static中实例化为固定大小的线程池,由于默认的ThreadFactory创建的线程为非守护状态,为了避免java程序不能退出的问题,保证在文件下载完后当前java程序结束在jvm中的运行,需要重写ThreadFactory,使其创建的线程为守护线程。

Constant类中自定义了程序中需要使用到的变量,可在类中直接定义或读取配置文件,Constant.CONCURRENT_FILE_NUMBER定义了并发文件数。

downloadMainPool中的每个线程负责一个文件,每个线程下载文件时创建子线程池,由子线程池分块下载文件;要做到断点续传需要记录每个子线程池中的每个线程下载位置,这里使用定时序列化子线程池线程对象的方式,定时将包含了下载位置的线程序列化到文件,在再次下载同一文件时反序列化,直接丢到子线程池下载即可。下面是downloadMainPool中的线程OSSDownloadFile代码清单

OSSDownloadFile 代码:

package cloudStorage.oss;

import java.io.File;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import org.apache.log4j.Logger;
import cloudStorage.basis.Constant;
import cloudStorage.basis.Global;
import cloudStorage.basis.OSSClientFactory;
import cloudStorage.oss.download.DownloadPartObj;
import cloudStorage.oss.download.DownloadPartThread;
import cloudStorage.util.ObjectSerializableUtil;

import com.aliyun.oss.OSSClient;
import com.aliyun.oss.OSSException;
import com.aliyun.oss.model.ObjectMetadata;

/**
 * @Description: oss多线程分段下载文件
 * @author: zrk
 * @time: 2015年4月1日 上午10:37:35
 */
public class OSSDownloadFile  implements Callable<Integer>{
    public static final Logger LOGGER = Logger.getLogger(OSSDownloadFile.class);
    //外层线程池
    public static ExecutorService downloadMainPool = null;
    //内层线程池
    private ExecutorService pool ;

    static{
        downloadMainPool = Executors.newFixedThreadPool(Constant.CONCURRENT_FILE_NUMBER,new ThreadFactory() {
            public Thread newThread(Runnable r) {
                Thread s = Executors.defaultThreadFactory().newThread(r);
                s.setDaemon(true);
                return s;
            }
        });
    }
    private String localFilePath;//本地文件路径
    private String bucketName; //bucketName
    private String key;//云端存储路径

    public OSSDownloadFile() {
        super();
    }

    public OSSDownloadFile(String localFilePath,String bucketName,String key) {
        //初始化子线程池
        pool = Executors.newFixedThreadPool(Constant.SINGLE_FILE_CONCURRENT_THREADS);
        this.localFilePath = localFilePath;
        this.bucketName = bucketName;
        this.key = key;

    }

    //执行当前线程
    public Integer downloadFile() {
        Integer r = Global.ERROR;
        //向downloadMainPool中submit当前线程
        Future<Integer> result = downloadMainPool.submit(this);
        try {
            r=result.get();
        } catch (InterruptedException | ExecutionException e) {
            e.printStackTrace();
        } finally{
            return r;
        }

    }
    /**
     * 
     * @param localFilePath     需要存放的文件路径
     * @param bucketName        bucketName
     * @param key               存储key  -在oss的存储路径
     * @return
     */
    @Override
    public Integer call(){
        //OSSClient 使用单例
        OSSClient client = OSSClientFactory.getInstance();    
        ObjectMetadata objectMetadata = null;
        //判断文件在云端是否存在
        try {
            objectMetadata = client.getObjectMetadata(bucketName, key);
        } catch (OSSException e) {
            LOGGER.info("==请检查bucketName或key");
            return Global.ERROR;
        }
        long fileLength = objectMetadata.getContentLength();

        //自定义的每个下载分块大小
        Integer partSize = Constant.DOWNLOAD_PART_SIZE;

        //需要下载的文件分块数
        int partCount=calPartCount(fileLength, partSize);

        //子线程池的线程对象封装类(用于序列化的)
        DownloadPartObj downloadPartObj = null;
        boolean isSerializationFile = false;
        //序列化的文件路径(与下载文件同路径使用.dw.temp后缀)
        String serializationFilePath = localFilePath+".dw.temp";
        //若存在反序列化对象
        if(new File(serializationFilePath).exists()){
            downloadPartObj = (DownloadPartObj)ObjectSerializableUtil.load(serializationFilePath);
            isSerializationFile = true;
        }
        //序列化文件不存在,分配分块给子线程池线程对象
        if(downloadPartObj==null||!isSerializationFile){
            downloadPartObj = new DownloadPartObj();
            for (int i = 0; i < partCount; i++) {
                  final  long startPos = partSize * i;
                  final  long endPos = partSize * i +( partSize < (fileLength - startPos) ? partSize : (fileLength - startPos)) - 1;
                  //DownloadPartThread是执行每个分块下载任务的线程
                  downloadPartObj.getDownloadPartThreads().add(new DownloadPartThread(startPos, endPos, localFilePath, bucketName, key,Constant.ACCESS_ID,Constant.ACCESS_KEY));
                }
        }

        try {
            int i = 0;
            //download方法提交分块下载线程至子线程池下砸,while循环用于下载失败重复下载,Constant.RETRY定义重复下载次数
            while (download(downloadPartObj,serializationFilePath).isResult()==false) {
                if(++i == Constant.RETRY)break;
                LOGGER.info(Thread.currentThread().getName()+"重试第"+i+"次");
            }
        } catch (Exception e) {
            LOGGER.info("=="+e.getMessage());
            return Global.THREAD_ERROR;
        }
        if(!downloadPartObj.isResult()){
            return Global.NETWORK_ERROR;
        }
        return Global.SUCCESS;
    }


    /**
     *  多线程下载单个文件
     * @param partThreadObj
     * @param serializationFilePath
     * @return
     */
    private DownloadPartObj download(DownloadPartObj partThreadObj,String serializationFilePath){

        try {
            partThreadObj.setResult(true);
            //向子线程池中submit单个文件所有分块下载线程
            for (int i=0 ;i<partThreadObj.getDownloadPartThreads().size();i++) {
                if (partThreadObj.getDownloadPartThreads().get(i).geteTag() == null)
                    pool.submit(partThreadObj.getDownloadPartThreads().get(i));
            }
            //shutdown子线程池,池内所下载任务执行结束后停止当前线程池
            pool.shutdown();
            //循环检查线程池,同时在此序列化partThreadObj
            while (!pool.isTerminated()) {
                ObjectSerializableUtil.save(partThreadObj,serializationFilePath);
                pool.awaitTermination(Constant.SERIALIZATION_TIME, TimeUnit.SECONDS);
            }
            //判断下载结果
            for (DownloadPartThread downloadPartThread: partThreadObj.getDownloadPartThreads()) {
                if (downloadPartThread.geteTag() == null){
                    partThreadObj.setResult(false);
                }
            }
            //下载成功 删除序列化文件
            if (partThreadObj.isResult()==true) 
                ObjectSerializableUtil.delSerlzFile(serializationFilePath);

        } catch (Exception e) {
            LOGGER.info("=="+e.getMessage());
        }
        return partThreadObj;
    }

    /**
     * 获取分块数
     * @param fileLength
     * @param partSize
     * @return
     */
    private static int calPartCount(long fileLength,long partSize) {
        int partCount = (int) (fileLength / partSize);
        if (fileLength % partSize != 0){
            partCount++;
        }
        return partCount;
    }

    public String getLocalFilePath() {
        return localFilePath;
    }

    public void setLocalFilePath(String localFilePath) {
        this.localFilePath = localFilePath;
    }

    public String getBucketName() {
        return bucketName;
    }

    public void setBucketName(String bucketName) {
        this.bucketName = bucketName;
    }

    public String getKey() {
        return key;
    }

    public void setKey(String key) {
        this.key = key;
    }
}

此处引用了多个类
cloudStorage.basis.Constant;//定义程序中使用的变量
cloudStorage.basis.Global;//定义了全局的静态值,错误状态值
cloudStorage.basis.OSSClientFactory;//OSSClient工厂
cloudStorage.oss.download.DownloadPartObj;//分块下载线程类封装
cloudStorage.oss.download.DownloadPartThread;//分块下载线程
cloudStorage.util.ObjectSerializableUtil;//序列工具类

调下载方法是阻塞式,需要给调用者返回下结果,所以使用Callable和Future返回int型状态值,下面是子线程池pool中的分块下载线程DownloadPartThread的代码清单

DownloadPartThread 代码:

package cloudStorage.oss.download;

import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.Serializable;
import java.util.Date;
import java.util.concurrent.Callable;
import org.apache.log4j.Logger;
import cloudStorage.basis.OSSClientFactory;
import com.aliyun.oss.common.utils.IOUtils;
import com.aliyun.oss.model.GetObjectRequest;
import com.aliyun.oss.model.OSSObject;

/**
 * @Description: 用于上传每个part的线程类 可序列化 用于上传的断点续传
 * @author: zrk
 * @time: 2015年4月1日 上午10:35:34
 */
public class DownloadPartThread implements Callable<DownloadPartThread>,Serializable {

    private static final long serialVersionUID = 1L;
    public static final Logger LOGGER = Logger.getLogger(DownloadPartThread.class);
    // 当前线程的下载开始位置
    private long startPos;

    // 当前线程的下载结束位置
    private long endPos;

    // 保存文件路径
    private String localFilePath;

    private String bucketName;
    private String fileKey;
    private String eTag;
    private String accessId;
    private String accessKey;

    public DownloadPartThread(long startPos, long endPos, String localFilePath,
            String bucketName, String fileKey, String accessId,
            String accessKey) {
        this.startPos = startPos;
        this.endPos = endPos;
        this.localFilePath = localFilePath;
        this.bucketName = bucketName;
        this.fileKey = fileKey;
        this.accessId = accessId;
        this.accessKey = accessKey;
    }

    @Override
    public DownloadPartThread call() {
        RandomAccessFile file = null;
        OSSObject ossObject = null;
        try {
            File pFile = new File(localFilePath);
            if(!pFile.getParentFile().exists())
                pFile.getParentFile().mkdirs();
            file = new RandomAccessFile(localFilePath, "rw");
            //调用ossapi
            GetObjectRequest getObjectRequest = new GetObjectRequest(bucketName, fileKey);
            getObjectRequest.setRange(startPos, endPos);
            ossObject = OSSClientFactory.getInstance().getObject(getObjectRequest);
            file.seek(startPos);
            int bufSize = 1024;
            byte[] buffer = new byte[bufSize];
            int bytesRead;
            while ((bytesRead = ossObject.getObjectContent().read(buffer)) > -1) {
                file.write(buffer, 0, bytesRead);
                //更新开始位置,保证在出错后重下载是从上次结束的地方开始下,而不是下载整个块
                startPos += bytesRead; 
            }
            this.eTag = ossObject.getObjectMetadata().getETag();
        } catch (Exception e) {
            LOGGER.info("=="+e.getMessage());
        } finally{
            if(ossObject!=null)IOUtils.safeClose(ossObject.getObjectContent());
            try {if(file!=null)file.close();} catch (IOException e) {e.printStackTrace();}
            return this;
        }

    }
}

每个DownloadPartThread下载线程执行结束都会将自己作为返回值返回,当前文件是否下载完整有每个线程的返回值决定。

ObjectSerializableUtil 代码 点击查看

DownloadPartObj 代码:

package cloudStorage.oss.download;

import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * @Description: 单个文件的下载线程集合
 * @author: zrk
 * @time: 2015年5月5日 上午10:15:11
 */
public class DownloadPartObj implements Serializable{
    private static final long serialVersionUID = 1L;
    /**
     * 下载线程集合
     */
    List<DownloadPartThread> downloadPartThreads = Collections.synchronizedList(new ArrayList<DownloadPartThread>());
    /**
     * 下载结果
     */
    boolean result = true;

    public List<DownloadPartThread> getDownloadPartThreads() {
        return downloadPartThreads;
    }
    public void setDownloadPartThreads(List<DownloadPartThread> downloadPartThreads) {
        this.downloadPartThreads = downloadPartThreads;
    }
    public boolean isResult() {
        return result;
    }
    public void setResult(boolean result) {
        this.result = result;
    }

}

所有下载当前文件的线程都会操作downloadPartThreads,所以downloadPartThreads使用集合Collections.synchronizedList将其转换为一个线程安全的类。DownloadPartObj封装了downloadPartThreads和一个用于标识下载成功与失败的boolean值,定时保存序列化文件就直接序列化DownloadPartObj,DownloadPartObj的线程安全是至关重要的。

OSSClientFactory 代码:

package cloudStorage.basis;
import com.aliyun.oss.OSSClient;
public class OSSClientFactory {
    private static OSSClient ossClient = null;
    private OSSClientFactory() {
    }
    public static OSSClient getInstance() {
        if (ossClient == null) {
            // 可以使用ClientConfiguration对象设置代理服务器、最大重试次数等参数。
            // ClientConfiguration config = new ClientConfiguration();
            ossClient = new OSSClient(Constant.OSS_ENDPOINT,Constant.ACCESS_ID, Constant.ACCESS_KEY);
        }
        return ossClient;
    }
}

Constant 代码:

package cloudStorage.basis;

import cloudStorage.service.OSSConfigService;

/**
 * @Description: 
 * @author: zrk
 * @time: 2015年4月1日 下午5:22:28
 */
public class Constant {
    public static String OSS_ENDPOINT = "http://oss.aliyuncs.com/";
    public static String ACCESS_ID;
    public static String ACCESS_KEY;
    public static Integer DOWNLOAD_PART_SIZE ; // 每个下载Part的大小
    public static Integer UPLOAD_PART_SIZE ; // 每个上传Part的大小
    public static int CONCURRENT_FILE_NUMBER ; // 并发文件数。
    public static int SINGLE_FILE_CONCURRENT_THREADS ; // 单文件并发线程数。
    public static int RETRY ;//失败重试次数
    public static int SERIALIZATION_TIME;//断点保存时间间隔(秒)
    //。。。
}

Constant中数值是加载外部配置文件,也可在这儿直接配置,个别参数值断点上传时使用,只看下载的话请忽略。

Global代码:

package cloudStorage.basis;
/**
 * @Description: TODO
 * @author: zrk
 * @time: 2015年4月1日 下午5:22:46
 */
public class Global {
    public static final int SUCCESS = 1;
    public static final int ERROR = 10;
    public static final int FILE_NOT_FOUND_ERROR = 11;
    public static final int THREAD_ERROR = 12;
    public static final int NETWORK_ERROR = 13;
    public static final int OSS_SUBMIT_ERROR = 14;
//  META
    public static final String X_OSS_META_MY_MD5 = "x-oss-meta-my-md5";
}

调用下载方式:

实例化OSSDownloadFile后调用downloadFile方法:
return new OSSDownloadFile(localFilePath,bucketName, key).downloadFile();

OSS SDK文件的分块下载支持的很好,OSS官方的SDK里面也提供了一个多线程下载功能的实现,所以在实现文件的分块下载并没有什么难度。这块儿多线程下载仅供大家参考,后面自己在使用过程中继续优化。