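I needed to scrape a page whose requests must carry cookie data, so the first step is to obtain the cookies the site sets when the page is visited. The following method does this with Apache HttpClient: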

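import java.io.IOException;
import org.apache.http.client.CookieStore;
import org.apache.http.client.config.CookieSpecs;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.protocol.HttpClientContext;
import org.apache.http.cookie.Cookie;
import org.apache.http.impl.client.BasicCookieStore;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public static String getCookies(String url) throws IOException {
    // Global request configuration: use the standard cookie spec
    RequestConfig globalConfig = RequestConfig.custom().setCookieSpec(CookieSpecs.STANDARD).build();
    // Local cookie store that collects the cookies set by the server
    CookieStore cookieStore = new BasicCookieStore();
    // HttpClient context bound to the cookie store
    HttpClientContext context = HttpClientContext.create();
    context.setCookieStore(cookieStore);

    // GET request used only to obtain the required cookies, such as _xsrf
    HttpGet get = new HttpGet(url);

    // try-with-resources closes the client and the response even if the request fails
    try (CloseableHttpClient httpClient = HttpClients.custom().setDefaultRequestConfig(globalConfig)
            .setDefaultCookieStore(cookieStore).build();
         CloseableHttpResponse res = httpClient.execute(get, context)) {
        // Join all cookies into a single "name=value; name=value" string
        StringBuilder cookie = new StringBuilder();
        for (Cookie c : cookieStore.getCookies()) {
            if (cookie.length() > 0) {
                cookie.append("; ");
            }
            cookie.append(c.getName()).append('=').append(c.getValue());
            System.out.println(c.getName() + ": " + c.getValue());
        }
        // Returns an empty string when the server set no cookies
        return cookie.toString();
    }
}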

Once you have the cookies, send a POST or GET request that passes them along, and you get back the corresponding response, either JSON data or an HTML page.
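
As a rough illustration of that second step, here is a minimal sketch that attaches a cookie string to a GET request through the `Cookie` header. To keep it free of external dependencies it uses the JDK's built-in `java.net.http` client (Java 11+) instead of Apache HttpClient; the class name `CookieRequestExample` and the helpers `buildGet` and `fetch` are made up for this example and are not part of the original code.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper class, for illustration only.
class CookieRequestExample {

    // Builds a GET request that carries the cookie string in its Cookie header.
    // The header is skipped when there are no cookies to send.
    static HttpRequest buildGet(String url, String cookies) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(url)).GET();
        if (cookies != null && !cookies.isEmpty()) {
            builder.header("Cookie", cookies);
        }
        return builder.build();
    }

    // Sends the request and returns the response body (JSON or HTML) as a string.
    static String fetch(String url, String cookies) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(buildGet(url, cookies), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

With this in place, something like `CookieRequestExample.fetch(url, getCookies(url))` would return the page or JSON body. A POST request should work the same way: swap `.GET()` for `.POST(HttpRequest.BodyPublishers.ofString(body))` and keep the `Cookie` header.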