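Interviewer Q1: Why is concatenating Strings with "+" inefficient? Ideally, can you explain it from the JVM's point of view?
To answer this question, let's start with the following code: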
public class StringTest {
    public static void main(String[] args) {
        String a = "abc";
        String b = "def";
        String c = a + b;
        String d = "abc" + "def";
        System.out.println(c);
        System.out.println(d);
    }
}
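Output:
abcdef
abcdef
As the example shows, both ways of concatenating print the same result. But that is only on the surface; what actually runs underneath is different.
How exactly do the two differ?
To see the difference, change the code as follows: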
public class StringTest {
    public static void main(String[] args) {
        String a = "abc";
        String b = "def";
        String c = a + b;
        System.out.println(c);
    }
}
Let's see what it looks like after compilation:
C:\Users\GRACE\Documents>javac StringTest.java
C:\Users\GRACE\Documents>javap -verbose StringTest
Classfile /C:/Users/GRACE/Documents/StringTest.class
  Last modified 2018-7-21; size 607 bytes
  MD5 checksum a2729f11e22d7e1153a209e5ac968b98
  Compiled from "StringTest.java"
public class StringTest
  minor version: 0
  major version: 52
  flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
   #1 = Methodref #11.#20 // java/lang/Object."<init>":()V
   #2 = String #21 // abc
   #3 = String #22 // def
   #4 = Class #23 // java/lang/StringBuilder
   #5 = Methodref #4.#20 // java/lang/StringBuilder."<init>":()V
   #6 = Methodref #4.#24 // java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   #7 = Methodref #4.#25 // java/lang/StringBuilder.toString:()Ljava/lang/String;
   #8 = Fieldref #26.#27 // java/lang/System.out:Ljava/io/PrintStream;
   #9 = Methodref #28.#29 // java/io/PrintStream.println:(Ljava/lang/String;)V
  #10 = Class #30 // StringTest
  #11 = Class #31 // java/lang/Object
  #12 = Utf8 <init>
  #13 = Utf8 ()V
  #14 = Utf8 Code
  #15 = Utf8 LineNumberTable
  #16 = Utf8 main
  #17 = Utf8 ([Ljava/lang/String;)V
  #18 = Utf8 SourceFile
  #19 = Utf8 StringTest.java
  #20 = NameAndType #12:#13 // "<init>":()V
  #21 = Utf8 abc
  #22 = Utf8 def
  #23 = Utf8 java/lang/StringBuilder
  #24 = NameAndType #32:#33 // append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
  #25 = NameAndType #34:#35 // toString:()Ljava/lang/String;
  #26 = Class #36 // java/lang/System
  #27 = NameAndType #37:#38 // out:Ljava/io/PrintStream;
  #28 = Class #39 // java/io/PrintStream
  #29 = NameAndType #40:#41 // println:(Ljava/lang/String;)V
  #30 = Utf8 StringTest
  #31 = Utf8 java/lang/Object
  #32 = Utf8 append
  #33 = Utf8 (Ljava/lang/String;)Ljava/lang/StringBuilder;
  #34 = Utf8 toString
  #35 = Utf8 ()Ljava/lang/String;
  #36 = Utf8 java/lang/System
  #37 = Utf8 out
  #38 = Utf8 Ljava/io/PrintStream;
  #39 = Utf8 java/io/PrintStream
  #40 = Utf8 println
  #41 = Utf8 (Ljava/lang/String;)V
{
  public StringTest();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1 // Method java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
        line 1: 0

  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=2, locals=4, args_size=1
         0: ldc #2 // String abc
         2: astore_1
         3: ldc #3 // String def
         5: astore_2
         6: new #4 // class java/lang/StringBuilder
         9: dup
        10: invokespecial #5 // Method java/lang/StringBuilder."<init>":()V
        13: aload_1
        14: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        17: aload_2
        18: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        21: invokevirtual #7 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        24: astore_3
        25: getstatic #8 // Field java/lang/System.out:Ljava/io/PrintStream;
        28: aload_3
        29: invokevirtual #9 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        32: return
      LineNumberTable:
        line 3: 0
        line 4: 3
        line 5: 6
        line 6: 25
        line 7: 32
}
SourceFile: "StringTest.java"
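First, one reference is pointed at the constant-pool string "abc" and another at "def". Then a StringBuilder is allocated with new and its initializer (<init>) is invoked; append is called twice, and finally toString() is called. So with this compiler (class file major version 52, i.e. Java 8), a String "+" on variables is compiled into StringBuilder calls. Since JDK 9, javac emits an invokedynamic call to StringConcatFactory instead, but the result is still a newly created String. Two operations matter here: the new StringBuilder and the toString(). If you are familiar with the JVM, you know that an object created with new is never placed in the string constant pool. toString() copies the characters into a new String, and that String is not in the pool either. So here, a pooled String + a pooled String produces an ordinary String on the heap.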
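The bytecode above can be mirrored by hand. The following is a minimal sketch and not part of the original example: the class name ConcatIdentityDemo and the helper concat are made up for illustration. It writes out the StringBuilder sequence from the listing and checks that the result of a + b is a separate object from the pooled literal with the same characters.

```java
public class ConcatIdentityDemo {
    // Hand-written equivalent of the new/append/append/toString sequence
    // shown in the javap listing (hypothetical helper, for illustration only).
    static String concat(String a, String b) {
        return new StringBuilder().append(a).append(b).toString();
    }

    public static void main(String[] args) {
        String a = "abc";
        String b = "def";
        String c = a + b;           // built at run time, a new heap object
        String literal = "abcdef";  // literal, taken from the string pool

        System.out.println(c.equals(literal));      // true: same characters
        System.out.println(c == literal);           // false: different objects
        System.out.println(c.intern() == literal);  // true: intern() returns the pooled instance
        System.out.println(concat(a, b).equals(c)); // true: same result as a + b
    }
}
```

The == comparisons are deliberate here, to observe object identity; ordinary code should compare strings with equals().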
Now let's look at the other case and inspect the result in the same way:
The code is as follows:
public class StringTest {
    public static void main(String[] args) {
        String c = "abc" + "def";
        System.out.println(c);
    }
}
Again, let's see what it looks like after compilation:
1C:\Users\GRACE\Documents>javac StringTest.java
2
3C:\Users\GRACE\Documents>javap -verbose StringTest
4Classfile /C:/Users/GRACE/Documents/StringTest.class
5 Last modified 2018-7-21; size 426 bytes
6 MD5 checksum c659d48ff8aeb45a3338dea5d129f593
7 Compiled from "StringTest.java"
8public class StringTest
9 minor version: 0
10 major version: 52
11 flags: ACC_PUBLIC, ACC_SUPER
12Constant pool:
13 #1 = Methodref #6.#15 // java/lang/Object."<init>":()V
14 #2 = String #16 // abcdef
15 #3 = Fieldref #17.#18 // java/lang/System.out:Ljava/io/PrintStream;
16 #4 = Methodref #19.#20 // java/io/PrintStream.println:(Ljava/lang/String;)V
17 #5 = Class #21 // StringTest
18 #6 = Class #22 // java/lang/Object
19 #7 = Utf8 <init>
20 #8 = Utf8 ()V
21 #9 = Utf8 Code
22 #10 = Utf8 LineNumberTable
23 #11 = Utf8 main
24 #12 = Utf8 ([Ljava/lang/String;)V
25 #13 = Utf8 SourceFile
26 #14 = Utf8 StringTest.java
27 #15 = NameAndType #7:#8 // "<init>":()V
28 #16 = Utf8 abcdef
29 #17 = Class #23 // java/lang/System
30 #18 = NameAndType #24:#25 // out:Ljava/io/PrintStream;
31 #19 = Class #26 // java/io/PrintStream
32 #20 = NameAndType #27:#28 // println:(Ljava/lang/String;)V
33 #21 = Utf8 StringTest
34 #22 = Utf8 java/lang/Object
35 #23 = Utf8 java/lang/System
36 #24 = Utf8 out
37 #25 = Utf8 Ljava/io/PrintStream;
38 #26 = Utf8 java/io/PrintStream
39 #27 = Utf8 println
40 #28 = Utf8 (Ljava/lang/String;)V
41{
42 public StringTest();
43 descriptor: ()V
44 flags: ACC_PUBLIC
45 Code:
46 stack=1, locals=1, args_size=1
47 0: aload_0
48 1: invokespecial #1 // Method java/lang/Object."<init>":()V
49 4: return
50 LineNumberTable:
51 line 1: 0
52
53 public static void main(java.lang.String[]);
54 descriptor: ([Ljava/lang/String;)V
55 flags: ACC_PUBLIC, ACC_STATIC
56 Code:
57 stack=2, locals=2, args_size=1
58 0: ldc #2 // String abcdef
59 2: astore_1
60 3: getstatic #3 // Field java/lang/System.out:Ljava/io/PrintStream;
61 6: aload_1
62 7: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
63 10: return
64 LineNumberTable:
65 line 3: 0
66 line 4: 3
67 line 5: 10
68}
69SourceFile: "StringTest.java"
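This time the compiled output is much shorter. Look closely at line 14 of the output (#2 = String #16 // abcdef): the value has already become "abcdef" at compile time. Why? Because "abc" + "def" is a compile-time constant expression. Both operands are literals, so the compiler (javac, not the JVM at run time) evaluates the "+" itself and compiles the statement as if it had been written:
String d = "abcdef";
Likewise, String a = "a" + 1 is compiled to String a = "a1";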
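This folding can also be observed without javap by comparing references. The sketch below is illustrative only; the class name ConstantFoldingDemo and its two methods are hypothetical. It relies on the language rule that a string computed by a constant expression is treated like a literal, so it refers to the same pooled object as the equivalent literal.

```java
public class ConstantFoldingDemo {
    // "abc" + "def" is folded at compile time, so d refers to the
    // same pooled object as the literal "abcdef".
    static boolean literalsFolded() {
        String d = "abc" + "def";
        return d == "abcdef";
    }

    // A String literal plus an int literal is also a constant expression.
    static boolean mixedFolded() {
        String a = "a" + 1;
        return a == "a1";
    }

    public static void main(String[] args) {
        System.out.println(literalsFolded()); // true
        System.out.println(mixedFolded());    // true
    }
}
```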
One more example:
final String a = "a";
final String b = "ab";
String c = a + b;
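At compile time, c is compiled to String c = "aab";. However, if either a or b is not final, a new object is created at run time. Furthermore, if a or b is returned from a method, the value is never folded into the constant pool at compile time, no matter whether the method uses a final variable, a constant, or anything else, because the compiler does not look inside the method body to see what it does. Besides, anything that is a variable is treated as mutable: even if the code you are looking at seems immutable, it can still be hooked into at run time.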
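The three cases just described can be put side by side. This is a sketch under the same assumptions as the text, with made-up names (FinalConcatDemo, fromMethod, and the three check methods are hypothetical); each method reports whether a + b ended up being the same object as the literal "aab".

```java
public class FinalConcatDemo {
    // Stands in for "a value returned from some method".
    static String fromMethod() {
        return "a";
    }

    // Both operands are final and initialized with literals: folded at compile time.
    static boolean finalLocals() {
        final String a = "a";
        final String b = "ab";
        String c = a + b;
        return c == "aab";
    }

    // a is not declared final: concatenation happens at run time.
    static boolean nonFinalLocal() {
        String a = "a";
        final String b = "ab";
        String c = a + b;
        return c == "aab";
    }

    // a is final, but its initializer is a method call, not a constant expression.
    static boolean methodResult() {
        final String a = fromMethod();
        final String b = "ab";
        String c = a + b;
        return c == "aab";
    }

    public static void main(String[] args) {
        System.out.println(finalLocals());   // true
        System.out.println(nonFinalLocal()); // false
        System.out.println(methodResult());  // false
    }
}
```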
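So where does the efficiency problem come in?
After all of that, nothing so far has said anything about efficiency, has it?
In the two examples above the concatenation expressions are simple, so "+" and StringBuilder are essentially equivalent. But when the structure is more complex, for example when strings are concatenated in a loop, the generated Java bytecode differs considerably. Consider the following code: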
import java.util.*;
public class StringTest {
    public static void main(String[] args){
        String s = "";
        Random rand = new Random();
        for (int i = 0; i < 10; i++){
            s = s + rand.nextInt(1000) + " ";
        }
        System.out.println(s);
    }
}
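Disassembling the code above gives the following (the // lines are hand-written annotations showing the corresponding source, not javap output):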
C:\Java\jdk1.8.0_171\bin>javap -c E:\StringTest.class
Picked up _JAVA_OPTIONS: -Xmx512M
Compiled from "StringTest.java"
public class StringTest {
public StringTest();
Code:
0: aload_0
1: invokespecial #8 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
//String s = "";
0: ldc #16 // String
2: astore_1
//Random rand = new Random();
3: new #18 // class java/util/Random
6: dup
7: invokespecial #20 // Method java/util/Random."<init>":()V
10: astore_2
//for(int i = 0; i < 10; i++)
11: iconst_0
12: istore_3
13: goto 49
//s = (new StringBuilder(String.valueOf(s))).append(rand.nextInt(1000)).append(" ").toString();
16: new #21 // class java/lang/StringBuilder
19: dup
20: aload_1
21: invokestatic #23 // Method java/lang/String.valueOf:(Ljava/lang/Object;)Ljava/lang/String;
24: invokespecial #29 // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
27: aload_2
28: sipush 1000
31: invokevirtual #32 // Method java/util/Random.nextInt:(I)I
34: invokevirtual #36 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
37: ldc #40 // String
39: invokevirtual #42 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
42: invokevirtual #45 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
45: astore_1
46: iinc 3, 1
49: iload_3
50: bipush 10
52: if_icmplt 16
//System.out.println(s);
55: getstatic #49 // Field java/lang/System.out:Ljava/io/PrintStream;
58: aload_1
59: invokevirtual #55 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
62: return
}
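As we can see, although the compiler converts "+" into StringBuilder calls, the StringBuilder is created inside the for loop. That means every iteration creates a new StringBuilder (10 of them in this example), copies the string built so far into it, and then copies the contents again in toString(). Java does have a garbage collector, but when it runs is not deterministic, and if garbage like this keeps being produced it still consumes a lot of resources. The fix is to use StringBuilder directly to concatenate the strings, as follows: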
import java.util.Random;
public class StringTest {
    public static void main(String[] args) {
        Random rand = new Random();
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < 10; i++) {
            result.append(rand.nextInt(1000));
            result.append(" ");
        }
        System.out.println(result.toString());
    }
}
Disassembling the code above gives the following:
C:\Java\jdk1.8.0_171\bin>javap -c E:\Dubbo\Demo\bin\StringTest.class
Picked up _JAVA_OPTIONS: -Xmx512M
Compiled from "StringTest.java"
public class StringTest {
public StringTest();
Code:
0: aload_0
1: invokespecial #8 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
//Random rand = new Random();
0: new #16 // class java/util/Random
3: dup
4: invokespecial #18 // Method java/util/Random."<init>":()V
7: astore_1
//StringBuilder result = new StringBuilder();
8: new #19 // class java/lang/StringBuilder
11: dup
12: invokespecial #21 // Method java/lang/StringBuilder."<init>":()V
15: astore_2
//for(int i = 0; i < 10; i++)
16: iconst_0
17: istore_3
18: goto 43
//result.append(rand.nextInt(1000));
21: aload_2
22: aload_1
23: sipush 1000
26: invokevirtual #22 // Method java/util/Random.nextInt:(I)I
29: invokevirtual #26 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
32: pop
//result.append(" ");
33: aload_2
34: ldc #30 // String
36: invokevirtual #32 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
39: pop
40: iinc 3, 1
43: iload_3
44: bipush 10
46: if_icmplt 21
//System.out.println(result.toString());
49: getstatic #35 // Field java/lang/System.out:Ljava/io/PrintStream;
52: aload_2
53: invokevirtual #41 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
56: invokevirtual #45 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
59: return
}
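The disassembly shows that the code creating the StringBuilder now sits outside the for loop. The source looks slightly more complicated this way, but it is more efficient and uses fewer resources.
So the conclusion from these examples is this: the inefficiency of the String concatenation operator (+) comes from cases like the loop above, with many concatenations over a lot of data. Each pass through the "+" creates a StringBuilder, appends to it, and then throws it away. The next iteration creates another StringBuilder, appends again, and so on until the loop ends. Using a single StringBuilder and calling append directly saves the time spent creating and destroying those objects. For simple literal concatenation, or for just a few concatenations, performance is about the same.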
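To get a rough feel for the difference, the two loop styles can be timed against each other. The following is only a naive sketch with made-up names (ConcatTimingDemo, joinWithPlus, joinWithBuilder) and an arbitrary iteration count. Timings taken with System.nanoTime() vary by machine and JVM version and are affected by JIT warm-up, so treat the numbers as indicative rather than as a proper benchmark.

```java
public class ConcatTimingDemo {
    // Concatenation with "+" inside the loop: a new String is built on every iteration.
    static String joinWithPlus(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s = s + i + " ";
        }
        return s;
    }

    // One StringBuilder created outside the loop and reused for every append.
    static String joinWithBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i).append(' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int n = 10_000; // arbitrary size, chosen only for illustration

        long t0 = System.nanoTime();
        String viaPlus = joinWithPlus(n);
        long t1 = System.nanoTime();
        String viaBuilder = joinWithBuilder(n);
        long t2 = System.nanoTime();

        System.out.println("same result: " + viaPlus.equals(viaBuilder));
        System.out.printf("+ in loop:     %d ms%n", (t1 - t0) / 1_000_000);
        System.out.printf("StringBuilder: %d ms%n", (t2 - t1) / 1_000_000);
    }
}
```

Both methods return the same string; only the amount of intermediate allocation and copying differs.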
Reposted from: https://mp.weixin.qq.com/s/Z3_8_4OHUcqSQiBxqjyrtg