Reposted

String Concatenation in JDK 8

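String concatenation is something every Java programmer should know well; almost all of us have read advice about using StringBuffer or StringBuilder to build strings.

Most tutorials say that concatenating strings with + creates many String objects and performs poorly, and recommend StringBuffer or StringBuilder instead.

But is that really true?

This article ran the following experiment on JDK 8: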

public static void main(String[] args) {
    String result = "";
    result += "some more data";
    System.out.println(result);
}

Disassembling the class with javap -c gives:

Code:
       0: aload_0          // Push 'this' on to the stack
       1: invokespecial #1 // Invoke Object class constructor
                           // pop 'this' ref from the stack
       4: return           // Return from constructor

  public static void main(java.lang.String[]);
    Code:
       0: ldc           #2 // Load constant #2 on to the stack
       2: astore_1         // Create local var from stack (pop #2)
       3: new           #3 // Push new StringBuilder ref on stack
       6: dup              // Duplicate value on top of the stack
       7: invokespecial #4 // Invoke StringBuilder constructor
                           // pop object reference
      10: aload_1          // Push local variable containing #2
      11: invokevirtual #5 // Invoke method StringBuilder.append()
                           // pop obj reference + parameter
                           // push result (StringBuilder ref)
      14: ldc           #6 // Push "some more data" on the stack
      16: invokevirtual #5 // Invoke StringBuilder.append
                           // pop twice, push result
      19: invokevirtual #7 // Invoke StringBuilder.toString()
                           // pop obj reference, push result String
      22: astore_1         // Store result String in local var 1 (pop)
      23: getstatic     #8 // Push value System.out:PrintStream
      26: aload_1          // Push local var 1 (result String)
      27: invokevirtual #9 // Invoke method PrintStream.println()
                           // pop twice (object ref + parameter)
      30: return           // Return void from method

As you can see, the Java compiler optimized the generated bytecode: it automatically created a StringBuilder and called append on it.
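Read back into Java source, the bytecode above corresponds roughly to the second method below. This is a hand-written sketch rather than decompiler output, and the class and method names are made up for illustration.

```java
// Hypothetical demo class; the names are illustrative only.
class SingleConcatDemo {
    // What the source code says.
    static String withPlus(String result) {
        result += "some more data";
        return result;
    }

    // Roughly what javac on JDK 8 emits for the method above:
    // one StringBuilder, two append calls, one toString call.
    static String withBuilder(String result) {
        result = new StringBuilder()
                .append(result)
                .append("some more data")
                .toString();
        return result;
    }

    public static void main(String[] args) {
        System.out.println(withPlus(""));    // prints "some more data"
        System.out.println(withBuilder("")); // prints "some more data"
    }
}
```

Both methods produce the same string, so for a single statement there is nothing to gain from writing the builder by hand.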

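Here the substrings that make up the final string are all known at compile time and the whole concatenation sits in a single statement, so one StringBuilder is enough. The referenced article calls this a static string concatenation optimization; javac has applied it since JDK 5.

Does that mean that from JDK 5 onward we no longer need to create a StringBuilder by hand, and that + gives the same performance?

Let's try concatenating strings dynamically.

Dynamic string concatenation means that the substrings of the final string are only known at run time, for example when a string is extended inside a loop: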

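public static void main(String[] args) {
    String result = "";
    // The bound is the double literal 10e6, matching the i2d/ldc2_w/dcmpg
    // instructions in the disassembly that follows.
    for (int i = 0; i < 10e6; i++) {
        result += "some more data";
    }
    System.out.println(result);
}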

Disassemble it the same way:

Code:
       0: aload_0          // Push 'this' on to the stack
       1: invokespecial #1 // Invoke Object class constructor
                           // pop 'this' ref from the stack
       4: return           // Return from constructor

  public static void main(java.lang.String[]);
    Code:
       0: ldc            #2 // Load constant #2 on to the stack
       2: astore_1          // Create local var from stack, pop #2
       3: iconst_0          // Push value 0 onto the stack
       4: istore_2          // Pop value and store it in local var
       5: iload_2           // Push local var 2 on to the stack
       6: i2d               // Convert int to double on
                            // top of stack (pop + push)
       7: ldc2_w         #3 // Push constant 10e6 on to the stack
      10: dcmpg             // Compare two doubles on top of stack
                            // pop twice, push result: -1, 0 or 1
      11: ifge           40 // if value on top of stack is greater
                            // than or equal to 0 (pop once)
                            // branch to instruction at code 40
      14: new            #5 // Push new StringBuilder ref on stack
      17: dup               // Duplicate value on top of the stack
      18: invokespecial  #6 // Invoke StringBuilder constructor
                            // pop object reference
      21: aload_1           // Push local var 1 (empty String)
                            // on to the stack
      22: invokevirtual  #7 // Invoke StringBuilder.append
                            // pop obj ref + param, push result
      25: ldc            #8 // Push "some more data" on the stack
      27: invokevirtual  #7 // Invoke StringBuilder.append
                            // pop obj ref + param, push result
      30: invokevirtual  #9 // Invoke StringBuilder.toString
                            // pop object reference, push result
      33: astore_1          // Store result String in local var 1 (pop)
      34: iinc         2, 1 // Increment local variable 2 by 1
      37: goto            5 // Move to instruction at code 5
      40: getstatic     #10 // Push value System.out:PrintStream
      43: aload_1           // Push local var 1 (result String)
      44: invokevirtual #11 // Invoke method PrintStream.println()
                            // pop twice (object ref + parameter)
      47: return            // Return void from method

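A StringBuilder is created at offset 14, but the goto at offset 37 jumps back to offset 5. Inside the loop the code is therefore not optimal: every iteration creates a new StringBuilder.

So the code above is roughly equivalent to: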

String result = "";
for (int i = 0; i < 10e6; i++) {
    StringBuilder tmp = new StringBuilder();
    tmp.append(result);
    tmp.append("some more data");
    result = tmp.toString();
}
System.out.println(result);

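A new StringBuilder is created on every iteration, and each one copies the entire current contents of result before appending. After toString the old StringBuilder is no longer referenced and becomes garbage, which also adds GC cost.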
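To make the cost concrete, the sketch below replays both approaches by hand and counts how many characters are appended into builders. The class and method names are hypothetical, and the count deliberately ignores the builder's internal array growth and the extra copy made by toString(), so treat the numbers as a lower bound rather than a measurement.

```java
// Hypothetical helper for illustration; not part of the original experiment.
class CopyCostDemo {
    // Mimics what the compiled += loop does and counts appended characters.
    static long charsAppendedWithPlus(String piece, int count) {
        long total = 0;
        String result = "";
        for (int i = 0; i < count; i++) {
            StringBuilder tmp = new StringBuilder(); // one new builder per iteration
            tmp.append(result).append(piece);        // re-copies everything built so far
            total += tmp.length();
            result = tmp.toString();
        }
        return total;
    }

    // Same loop with a single builder created before the loop.
    static long charsAppendedWithOneBuilder(String piece, int count) {
        long total = 0;
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(piece);                        // only the new piece is copied
            total += piece.length();
        }
        return total;
    }

    public static void main(String[] args) {
        // "some more data" is 14 characters long.
        System.out.println(charsAppendedWithPlus("some more data", 10));       // 770
        System.out.println(charsAppendedWithOneBuilder("some more data", 10)); // 140
    }
}
```

With += the appended characters grow quadratically with the number of iterations (14 * (1 + 2 + ... + 10) = 770 here), while a single builder grows linearly (14 * 10 = 140).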

So in practice, when you cannot tell whether a concatenation is static or dynamic, just use StringBuilder.
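For the loop case, that advice amounts to creating the builder once, before the loop. A minimal sketch follows; the class name, method name and parameters are assumptions made for the example.

```java
// Hypothetical demo class; the names are illustrative only.
class LoopConcatDemo {
    // Appends `piece` to an initially empty string `count` times
    // using a single StringBuilder for the whole loop.
    static String repeat(String piece, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(piece);
        }
        return sb.toString(); // a single String is created at the end
    }

    public static void main(String[] args) {
        System.out.println(repeat("some more data", 10));
    }
}
```

If the final length is known in advance, passing it to the StringBuilder(int capacity) constructor can also avoid internal array resizing.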

Reference:

  1. http://www.pellegrino.link/2015/08/22/string-concatenation-with-java-8.html
Original: http://www.importnew.com/28486.html