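Requirement: because of a system migration, the src attribute of the img tags in the web page content stored in the database has to be patched. For example: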
content="<p><img title=\"122444234\" src=\"/files/post/122444234.jpg\"/><p>其他字符";
After the replacement it should be:
content="<p><img title=\"122444234\" src=\"http://xxx.xxx.com/files/post/122444234_500.jpg\"/><p>其他字符";
A regular expression is enough to solve this. The code is as follows:
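    // Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;

    /**
     * Rewrites the src attribute of every img tag in the content.
     * @param content     the HTML content
     * @param replaceHttp the domain to prepend to a relative src
     * @param size        the value appended to the file name as _size
     * @return the content with repaired src attributes
     */
    public static String repairContent(String content, String replaceHttp, int size) {
        String patternStr = "<img\\s*([^>]*)\\s*src=\\\"(.*?)\\\"\\s*([^>]*)>";
        Pattern pattern = Pattern.compile(patternStr, Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(content);
        String result = content;
        while (matcher.find()) {
            String src = matcher.group(2);
            System.out.println("pattern string:" + src);
            // Start from the original src so a path without an extension is left intact.
            String replaceSrc = src;
            if (src.lastIndexOf(".") > 0) {
                // Insert _size before the file extension.
                replaceSrc = src.substring(0, src.lastIndexOf(".")) + "_" + size + src.substring(src.lastIndexOf("."));
            }
            if (!src.startsWith("http://") && !src.startsWith("https://")) {
                replaceSrc = replaceHttp + replaceSrc;
            }
            // replace() treats src literally; replaceAll() would interpret it as a regular expression.
            result = result.replace(src, replaceSrc);
        }
        System.out.println(" content == " + content);
        System.out.println(" result == " + result);
        return result;
    }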
Test code:
    public static void main(String[] args) {
        String content = "<p><img title=\"10010001\" src=\"/files/post/10010001.gif\" width=\"200\" height=\"300\" />" +
            "</p><p><img title=\"10010002\" src=\"/files/post/10010002.gif\" width=\"500\" height=\"300\" /><p> </p>" +
            "</p><p><img title=\"10010003\" src=\"/files/post/10010003.gif\" width=\"600\" height=\"300\" /><p> </p>";
        String replaceHttp = "http://www.baidu.com";
        int size = 500;
        String result = ApiUtil.repairContent(content, replaceHttp, size);
        System.out.println(result);
    }
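The file-name step inside repairContent can also be looked at on its own. The following is a minimal sketch of that step only; the class name SrcSuffix and the method addSizeSuffix are hypothetical and do not appear in the code above.

```java
// Hypothetical helper for illustration; not part of the original ApiUtil.
final class SrcSuffix {

    /**
     * Inserts "_size" before the last dot of the path.
     * A path without a dot (or with the dot at index 0) is returned unchanged.
     */
    static String addSizeSuffix(String src, int size) {
        int dot = src.lastIndexOf('.');
        if (dot <= 0) {
            return src; // no extension to anchor the suffix on
        }
        return src.substring(0, dot) + "_" + size + src.substring(dot);
    }

    public static void main(String[] args) {
        // prints /files/post/122444234_500.jpg
        System.out.println(addSizeSuffix("/files/post/122444234.jpg", 500));
    }
}
```

Like the lastIndexOf logic it mirrors, this only looks for the last dot in the whole string, so an absolute URL whose last path segment has no extension would get the suffix inserted into the host name instead.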
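The key is the regular expression: <img\\s*([^>]*)\\s*src=\\\"(.*?)\\\"\\s*([^>]*)>
In particular, ([^>]*) must not be replaced with .*. With .*, a single match runs from the first <img all the way to the last ">" in the string, so when the src values differ, only the last src gets replaced.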
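The difference between the two patterns can be checked with a small experiment. This is a sketch for illustration: the class GreedyDemo, its constants, and the two-image sample string are made up here; only the BOUNDED pattern is taken from the post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; compares the post's pattern with a ".*" variant.
final class GreedyDemo {

    // The pattern from the post: [^>]* cannot cross the end of a tag.
    static final String BOUNDED = "<img\\s*([^>]*)\\s*src=\\\"(.*?)\\\"\\s*([^>]*)>";
    // The same pattern with [^>]* swapped for .*, which can run past any ">".
    static final String GREEDY = "<img\\s*(.*)\\s*src=\\\"(.*?)\\\"\\s*(.*)>";

    /** Collects group(2), the src value, for every match of the pattern. */
    static List<String> findSrcs(String regex, String html) {
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(html);
        List<String> srcs = new ArrayList<>();
        while (m.find()) {
            srcs.add(m.group(2));
        }
        return srcs;
    }

    public static void main(String[] args) {
        String html = "<p><img title=\"a\" src=\"/a.gif\" /></p>"
                + "<p><img title=\"b\" src=\"/b.gif\" /></p>";
        System.out.println(findSrcs(BOUNDED, html)); // [/a.gif, /b.gif]
        System.out.println(findSrcs(GREEDY, html));  // [/b.gif]
    }
}
```

With the .* variant the first group swallows everything up to the last src=", so the whole string becomes one match and only the last src is captured.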
The code actually used is shown below. It is the same as above with the size parameter removed.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test2 {
    public static void main(String[] args) {
        String content = "<img alt=\"\" src=\"http://img5.imgtn.bdimg.com/it/u=1310232405,2471553728&fm=27&gp=0.jpg\" />"
            + "法鲁文件a<img alt=\"\" src=\"upload/20180608145137.png\" />sfdsfds<img alt=\"\" src=\"upload/20180608145137.png\" />"
            + "afd<img alt=\"\" src=\"http://img5.imgtn.bdimg.com/it/u=1310232405,2471553728&fm=27&gp=0.jpg\" />"
            + "s的速度NS<a href=\"http://www.baidu.com\" target=\"_blank\">A看到你</a>"
            + "<img alt=\"\" src=\"http://img5.imgtn.bdimg.com/it/u=1310232405,2471553728&fm=27&gp=0.jpg\" />法鲁"
            + "文件a<img alt=\"\" src=\"upload/20180608145137.png\" />sfdsfds<img alt=\"\" src=\"upload/20180608145137.png\" />a"
            + "fd<img alt=\"\" src=\"http://img5.imgtn.bdimg.com/it/u=1310232405,2471553728&fm=27&gp=0.jpg\" />s的速度"
            + "NS<a href=\"http://www.baidu.com\" target=\"_blank\">A看到你</a>";
        String replaceHttp = "http://xxx.xx.xxx.xxx:8089/xxbd/";
        String result = repairContent(content, replaceHttp);
        System.out.println(result);
    }

    public static String repairContent(String content, String replaceHttp) {
        String patternStr = "<img\\s*([^>]*)\\s*src=\\\"(.*?)\\\"\\s*([^>]*)>";
        Pattern pattern = Pattern.compile(patternStr, Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(content);
        String result = content;
        while (matcher.find()) {
            String src = matcher.group(2);
            String originalImg = matcher.group(0); // the whole img tag as matched
            // System.out.println("pattern string:"+src);
            // Only relative paths get the prefix; absolute http/https URLs are left alone.
            if (!src.startsWith("http:") && !src.startsWith("https:")) {
                String replaceSrc = replaceHttp + src;
                // replace() works on literal text, so regex metacharacters in the tag or URL are harmless.
                String repairedImg = originalImg.replace(src, replaceSrc);
                // Replacing the whole tag keeps a repeated src from being prefixed twice.
                result = result.replace(originalImg, repairedImg);
            }
        }
        // System.out.println(" content == " +content);
        // System.out.println(" result == " + result);
        return result;
    }
}
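An alternative that avoids searching the whole string again for every match is to build the result in a single pass with Matcher.appendReplacement. The sketch below is one possible variant, not the code used in the post: the class ImgSrcRewriter, the simplified pattern, and the example.com base URL are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical single-pass variant; not the author's implementation.
final class ImgSrcRewriter {

    // Group 1: the tag up to and including src=", group 2: the src value, group 3: the closing quote.
    private static final Pattern IMG_SRC =
            Pattern.compile("(<img\\b[^>]*?\\ssrc=\")([^\"]*)(\")", Pattern.CASE_INSENSITIVE);

    /** Prepends baseUrl to every relative img src; absolute http/https URLs are kept as-is. */
    static String repairContent(String content, String baseUrl) {
        Matcher m = IMG_SRC.matcher(content);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String src = m.group(2);
            boolean absolute = src.startsWith("http://") || src.startsWith("https://");
            String fixed = absolute ? src : baseUrl + src;
            // quoteReplacement keeps '$' and '\' in the URL from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(m.group(1) + fixed + m.group(3)));
        }
        m.appendTail(out); // copy whatever follows the last match
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "a<img alt=\"\" src=\"upload/1.png\" />b<img alt=\"\" src=\"upload/1.png\" />";
        System.out.println(repairContent(html, "http://example.com/")); // example.com is a placeholder
    }
}
```

Because each match is rewritten exactly where it was found, a src that occurs several times is handled once per occurrence, and nothing in the URL is ever interpreted as a regular expression.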