Getting URL parameters in Java: fetching data through a URL

Reposted

mob64ca1411e411 2023-08-19 22:08:32


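Fetching data from a URL

Send a POST request from Java and read the data that the URL returns.

The endpoint I was given cannot be called directly; a plain request is intercepted.

The request header must carry the token (a key-value pair) that was issued when the endpoint was published. With it, the data comes back.

How to use HttpClient: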

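Sending a request and receiving a response with HttpClient generally takes the following steps:

1. Create an HttpClient object.
2. Create an instance of the request method and specify the request URL. Create an HttpGet object for a GET request, or an HttpPost object for a POST request.
3. If request parameters are needed, call setParams(HttpParams params), which HttpGet and HttpPost share, to add them. For an HttpPost object you can also call setEntity(HttpEntity entity) to set the request parameters.
4. Call execute(HttpUriRequest request) on the HttpClient object to send the request. It returns an HttpResponse.
5. Call getAllHeaders(), getHeaders(String name) and similar methods on the HttpResponse to read the response headers. Call getEntity() to get the HttpEntity object that wraps the response content, and read the content through it.
6. Release the connection. The connection must be released whether or not the request succeeded.

Note that the class names in these steps (HttpGet, HttpPost, HttpResponse) belong to the HttpClient 4.x API. In commons-httpclient 3.1, which the code below uses, the counterparts are GetMethod and PostMethod, executed with executeMethod.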
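The six steps can be sketched without any third-party jar. The following is a minimal illustration that assumes Java 11 or later and swaps in the JDK's built-in java.net.http.HttpClient in place of the Apache library. The class name HttpStepsSketch, the /echo path and the throwaway local server are all made up for the example so that it can run on its own.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class HttpStepsSketch {

    /** Sends a form-encoded POST and returns "status:body". */
    static String postForm(String url, String type, String message) throws Exception {
        // Step 1: create the client
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1)
                .build();
        // Steps 2 and 3: build the POST request with its URL, header and form parameters
        String form = "type=" + URLEncoder.encode(type, StandardCharsets.UTF_8)
                + "&message=" + URLEncoder.encode(message, StandardCharsets.UTF_8);
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/x-www-form-urlencoded;charset=UTF-8")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();
        // Step 4: send the request
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        // Step 5: read the status code and the body
        return response.statusCode() + ":" + response.body();
    }

    /** Starts a local echo server (hypothetical, for the demo only) and posts to it. */
    static String demo() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/echo", exchange -> {
                byte[] body = exchange.getRequestBody().readAllBytes();
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
                exchange.close();
            });
            server.start();
            try {
                String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/echo";
                return postForm(url, "query", "hello world");
            } finally {
                // Step 6: release resources whether or not the call succeeded
                server.stop(0);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The JDK client manages connections itself, so step 6 here only shuts the demo server down. With the Apache client the explicit release in a finally block is still required.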

This uses the commons-httpclient jar provided by Apache. The dependency in the pom:

<dependency>
    <groupId>commons-httpclient</groupId>
    <artifactId>commons-httpclient</artifactId>
    <version>3.1</version>
</dependency>

Sample code:

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.NameValuePair;
import org.apache.commons.httpclient.methods.PostMethod;

// `log` is assumed to be a logger field of the enclosing class.
public String transRequest(String url, String type, String message) {
    // Response content
    String result = "";
    // HTTP client object
    HttpClient httpClient = new HttpClient();
    // POST method bound to the target URL
    PostMethod postMethod = new PostMethod(url);
    try {
        // Set the request header
        postMethod.setRequestHeader("Content-Type", "application/x-www-form-urlencoded;charset=UTF-8");
        // Fill in the form fields
        NameValuePair[] data = {
            new NameValuePair("type", type),
            new NameValuePair("message", message)
        };
        // Put the form values into the request body
        postMethod.setRequestBody(data);
        // Execute the request; the return value is the HTTP status code
        int statusCode = httpClient.executeMethod(postMethod);
        // 200 means the request succeeded
        if (statusCode == HttpStatus.SC_OK) {
            result = postMethod.getResponseBodyAsString();
        } else {
            log.error("Response status: " + statusCode);
        }
    } catch (Exception e) {
        log.error(e.getMessage(), e);
    } finally {
        // Release the connection
        postMethod.releaseConnection();
        httpClient.getHttpConnectionManager().closeIdleConnections(0);
    }
    return result;
}

This method returns the data carried in the POST response.
In real development, however, the token has to be added to the header to keep the data secure. Without it the request is intercepted.
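The implementation looks like this:

import java.text.SimpleDateFormat;
import java.util.Date;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.methods.PostMethod;

public String getURLInfo() {
    String result = "";
    SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMdd");
    String nowTime = sdf.format(new Date());
    String url = "http://10.161..:**/ checkDate=" + nowTime;
    // Create the HTTP client and the POST method for the URL
    HttpClient httpClient = new HttpClient();
    PostMethod postMethod = new PostMethod(url);
    try {
        // headerName / headerValue: the key-value token issued by the
        // interface provider, defined elsewhere in the class
        postMethod.setRequestHeader(headerName, headerValue);
        // Execute the request; the return value is the HTTP status code
        int statusCode = httpClient.executeMethod(postMethod);
        // 200 means the request succeeded
        if (statusCode == HttpStatus.SC_OK) {
            result = postMethod.getResponseBodyAsString();
        } else {
            System.out.println("Request failed, status code: " + statusCode);
        }
    } catch (Exception e) {
        System.out.println(e.getMessage());
    } finally {
        // Release the connection
        postMethod.releaseConnection();
        httpClient.getHttpConnectionManager().closeIdleConnections(0);
    }
    return result;
}

The result is the data behind the URL as a string, usually in JSON format. From there the string can be processed as needed, for example by extracting substrings or converting it to another type.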
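To make the effect of the header token concrete, here is a small self-contained sketch. It assumes Java 11 or later and uses the JDK's java.net.http.HttpClient instead of commons-httpclient. The header name X-Api-Key, the value demo-token, the /data path and the local server that stands in for the protected endpoint are all hypothetical; the real key-value pair comes from whoever publishes the interface.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class HeaderTokenSketch {
    // Hypothetical header name and value, used only for this demo
    static final String HEADER_NAME = "X-Api-Key";
    static final String HEADER_VALUE = "demo-token";

    /** POSTs to url, optionally attaching the token header, and returns the status code. */
    static int post(String url, boolean withToken) throws Exception {
        HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(url))
                .POST(HttpRequest.BodyPublishers.noBody());
        if (withToken) {
            builder.header(HEADER_NAME, HEADER_VALUE);
        }
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1)
                .build();
        return client.send(builder.build(), HttpResponse.BodyHandlers.discarding())
                .statusCode();
    }

    /** Calls a stand-in endpoint without and then with the token. */
    static String demo() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/data", exchange -> {
                // Reject any request that does not carry the expected token
                String token = exchange.getRequestHeaders().getFirst(HEADER_NAME);
                int status = HEADER_VALUE.equals(token) ? 200 : 403;
                exchange.sendResponseHeaders(status, -1); // -1: no response body
                exchange.close();
            });
            server.start();
            try {
                String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/data";
                return post(url, false) + "," + post(url, true);
            } finally {
                server.stop(0);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The first call is refused and the second succeeds, which mirrors the behaviour described above: the only difference between the two requests is the header.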

Differences between org.json and net.sf.json


How net.sf.json.JSONObject and org.json.JSONObject differ.

1. Creating a JSON object
String str = "{\"code\":\"0000\", \"msg\":{\"availableBalance\":31503079.02}}";

org.json.JSONObject:
JSONObject json = new JSONObject(str);

net.sf.json.JSONObject:
JSONObject json = JSONObject.fromObject(str);

net.sf.json.JSONObject has no JSONObject(String) constructor.
 
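2. Parsing JSON

The first approach reads values directly through the getXXX() methods of the JSON object.

net.sf.json.JSONObject: the type of the field does not have to match the type of getXXX() strictly.
org.json.JSONObject: the type of the field must match the type of getXXX().

e.g.
JSONObject msgObj = json.getJSONObject("msg");

String availableBalance = msgObj.getString("availableBalance");

With org.json.JSONObject this call throws an error. Use msgObj.getDouble("availableBalance") instead, which does not lose precision. With net.sf.json.JSONObject the call succeeds, but precision is lost. If the value is quoted in the JSON, as in String str = "{\"code\":\"0000\", \"msg\":{\"availableBalance\":\"31503079.02\"}}";
no precision is lost.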
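The advice to keep the amount as a string can be illustrated with plain JDK classes. This sketch does not reproduce the internals of either JSON library; it only shows, under that stated limitation, how the same text survives intact as a BigDecimal but not as a narrow floating-point type. The class and method names are made up for the example.

```java
import java.math.BigDecimal;

public class AmountPrecisionSketch {

    /** Parses the amount exactly, digit for digit. */
    static BigDecimal exact(String amount) {
        return new BigDecimal(amount);
    }

    /** Parses the amount as a 32-bit float, which cannot hold all the digits. */
    static float narrow(String amount) {
        return Float.parseFloat(amount);
    }

    public static void main(String[] args) {
        String amount = "31503079.02";
        System.out.println(exact(amount).toPlainString());
        System.out.println(new BigDecimal(narrow(amount)).toPlainString());
    }
}
```

When an amount arrives as a JSON string, passing it straight to new BigDecimal(String) avoids any intermediate floating-point conversion.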
 
The second approach converts the JSON object directly into an entity object.

public class BalanceDto {
    private String availableBalance;

    public String getAvailableBalance() {
        return availableBalance;
    }

    public void setAvailableBalance(String availableBalance) {
        this.availableBalance = availableBalance;
    }

    public String toString() {
        return "availableBalance   " + availableBalance;
    }
}

org.json.JSONObject:

BalanceDto balanceDto = (BalanceDto) JSONObject.stringToValue(msgObj.toString());

This statement compiles but fails at run time with a ClassCastException. stringToValue only turns a string into a Boolean, a Number, JSONObject.NULL or the original string, never into a bean such as BalanceDto.

net.sf.json.JSONObject:

JSONObject msg = json.getJSONObject("msg");
BalanceDto balanceDto = (BalanceDto) JSONObject.toBean(msg, BalanceDto.class);
 
3. Getting an array from JSON

// assumes "msg" holds a JSON array
JSONArray subArray = json.getJSONArray("msg");

net.sf.json.JSONObject:
int leng = subArray.size();

org.json.JSONObject:
int leng = subArray.length();

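Common HTTP status codes

1xx: Informational. The request has been received and processing continues.

2xx: Success. The request was successfully received, understood and accepted.

3xx: Redirection. Further action is needed to complete the request.

4xx: Client error. The request has a syntax error or cannot be fulfilled.

5xx: Server error. The server failed to fulfil a valid request.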


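200 OK                        // the client request succeeded
400 Bad Request               // the request has a syntax error and the server cannot understand it
401 Unauthorized              // the request is not authorized; this status code must be used together with the WWW-Authenticate header field
403 Forbidden                 // the server received the request but refuses to serve it
404 Not Found                 // the requested resource does not exist, e.g. a wrong URL was entered
500 Internal Server Error     // the server hit an unexpected error
503 Service Unavailable       // the server cannot handle the request right now; it may recover after a while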
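Because the first digit decides the class of a status code, client code can branch on the class instead of listing every code. A minimal sketch, with a hypothetical class and method name:

```java
public class StatusClassSketch {

    /** Maps a status code to its class by looking at the hundreds digit. */
    static String describe(int code) {
        switch (code / 100) {
            case 1: return "Informational";
            case 2: return "Success";
            case 3: return "Redirection";
            case 4: return "Client error";
            case 5: return "Server error";
            default: return "Unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(200));
        System.out.println(describe(404));
        System.out.println(describe(503));
    }
}
```

A caller might, for example, retry only on "Server error" and report "Client error" straight back to the user.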

Downloading the images on a page

This approach cannot target specific images, but it does download the images it finds.


package com.fly.test;
 
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.junit.Test;
 
import java.io.*;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;
 
/**
 * @Description :
 * @Create by FLY on 2017-11-02  14:56
 */
public class DemoDownloadPicture {
 
    String ALL_URL_STR = "";
    String ALL_SRC_STR = "";
    int nonameId = 1;
    int record = 0;
    int noPicname = 0;
 
    @Test
    public void start(){
        // URL of the page to crawl
        String urlStr = "http://cpu.baidu.com/wap/1022/1329713/detail/4970306611472452/news?blockId=2998&foward=block";
        String html = getHTML(urlStr);
        // Third argument: folder name under E://crawler//pic (created if missing); "" is the base folder itself
        getURL(html,0,"");
    }
 
    public String getHTML(String urlStr){
        StringBuilder html = new StringBuilder();
        try {
            URL url = new URL(urlStr);
            URLConnection conn = url.openConnection();
            conn.connect();

            // try-with-resources closes the reader even if reading fails
            try (BufferedReader buffer = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), Charset.forName("UTF-8")))) {
                String line;
                while((line = buffer.readLine()) != null){
                    html.append(line);
                }
            }
        }catch (Exception e){
            e.printStackTrace();
        }
        return html.toString();
    }
 
    public void getURL(String html, int tmp,String fileName){
        // Stop when the recursion depth exceeds 5 or the page is empty
        if(tmp > 5 || html == null || html.length() == 0){
            System.out.println("--------end-------");
            return;
        }

        // Stop once more than 1000 images have been recorded
        if(record > 1000){
            System.out.println("--------more than 1000 images-----");
            return;
        }

        System.out.println("------start----------");

        String urlMain = "http://cpu.baidu.com/wap/1022/1329713/detail/4970306611472452/news?blockId=2998&foward=block";

        // Parse the page; passing a base URI lets jsoup resolve relative addresses
        Document doc = Jsoup.parse(html, urlMain);
        // Collect the image links and download each image
        Elements imglinks = doc.select("img[src]");
        int picnum = 0;
        String dirFileName = "";
        for(Element imglink : imglinks){
            // absUrl returns an absolute URL, or "" when it cannot be resolved
            String src = imglink.absUrl("src");
            if(src.length() < 3){
                continue;
            }
            if(!ALL_SRC_STR.contains(src)){
                ALL_SRC_STR += src + " ## ";
                if(picnum == 0){
                    // Create the target directory on the first new image
                    dirFileName  = makedir(fileName);
                    picnum ++ ;
                }
                record ++;
                downloadPicture(src , dirFileName);
            }
        }
        Elements links = doc.select("a");
        for(Element link : links){
            String href = link.absUrl("href");
            String text = link.text();
            // Skip empty, unresolvable and non-HTTP links
            if(!href.startsWith("http")){
                continue;
            }
            if(text == null || "".equals(text)){
                text = "noName" + nonameId ++;
            }
            // Visit each URL only once
            if(!ALL_URL_STR.contains(href)){
                ALL_URL_STR += href + " ## ";
                System.out.println("***********");
                System.out.println("Found new URL: "+text+"--->"+href);
                // Recurse one level deeper, using the link text as the folder name
                getURL(getHTML(href) , tmp + 1 ,text);
            }
        }
    }
 
    public void downloadPicture(String src,String fileName){
        try {
            // Use the last path segment as the file name
            String imageName = src.substring(src.lastIndexOf("/") + 1);
            if(imageName.length() == 0){
                imageName = "" + noPicname++ ;
            }
            // Default to .png when the name has no short extension
            int index = imageName.lastIndexOf(".");
            if(index == -1 || imageName.length() - index > 5){
                imageName += ".png";
            }

            // Open the URL and copy the stream into the target file
            URL url = new URL(src);
            URLConnection uri = url.openConnection();
            try (InputStream is = uri.getInputStream();
                 OutputStream os = new FileOutputStream(new File(fileName, imageName))) {
                byte[] buf = new byte[1024];
                int length;
                while((length = is.read(buf, 0, buf.length)) != -1){
                    os.write(buf, 0, length);
                }
            }
            System.out.println(src + " downloaded =====");
        }catch (Exception e){
            System.out.println(src + " download failed =====");
        }
    }
 
    public String makedir(String filesName){
        // Build the directory path
        String filePath = "E://crawler//pic//"+filesName;
        File file = new File(filePath);
        if(!file.exists()&&!file.isDirectory()){
            file.mkdirs(); // create the directory
            if(file.exists()&&file.isDirectory()){
                System.out.println("Directory created");
                return filePath;
            }else{
                System.out.println("Failed to create directory");
                return "E://crawler//pic";
            }
        }
        else{
            System.out.println(filesName + " directory already exists");
            return filePath;
        }
    }
}
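
Image and link addresses found in a page are often relative, so they have to be turned into absolute URLs before they can be fetched. As a rough JDK-only sketch of that step, java.net.URI.resolve can be applied to the page address; the class name and the example.com addresses below are placeholders, not part of the crawler above.

```java
import java.net.URI;

public class UrlResolveSketch {

    /** Resolves a possibly relative reference against the address of the page it came from. */
    static String resolve(String pageUrl, String reference) {
        return URI.create(pageUrl).resolve(reference).toString();
    }

    public static void main(String[] args) {
        String page = "http://example.com/a/b.html";
        System.out.println(resolve(page, "pic.jpg"));
        System.out.println(resolve(page, "../img/x.png"));
        System.out.println(resolve(page, "//cdn.example.com/x.png"));
        System.out.println(resolve(page, "https://other.example.com/y.png"));
    }
}
```

References that are already absolute pass through unchanged, so the same call can be applied to every src and href without checking the form first.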