Note: this article is based on the Oracle compiler supplied with SDKs up to Java 7.

Concatenating Strings in Loops

This Q&A article attempts to explain why it may be important to use StringBuilder objects when concatenating Strings within a loop.

Why use StringBuilder when we can concatenate Strings using the '+' operator?

This is a common question and there are 2 answers to it:

  1. Using '+' is perfectly acceptable where Strings are being concatenated as a one-off event, or where the Strings are short and the number of iterations is low.
  2. Using '+' is definitely not acceptable where Strings are being concatenated in a loop which has many iterations and/or where the Strings are long.

So why the different answers?

The key to understanding this is to understand how the Java compiler handles string concatenation. If you compile the following code:

    for ( int i = 0; i < scores.length; ++i )
        {
        allScores += scores[i] + ", ";
        }
    

And then view the generated bytecode using the javap command (for example, javap -c), you will see the following:

    0:   iconst_0
    1:   istore_2
    2:   iload_2
    3:   aload_1
    4:   arraylength
    5:   if_icmpge       40
    8:   new     #4; //class java/lang/StringBuilder
    11:  dup
    12:  invokespecial   #5; //Method java/lang/StringBuilder."<init>":()V
    15:  aload_0
    16:  invokevirtual   #7; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
    19:  aload_1
    20:  iload_2
    21:  iaload
    22:  invokevirtual   #12; //Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
    25:  ldc     #13; //String ,
    27:  invokevirtual   #7; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
    30:  invokevirtual   #9; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
    33:  astore_0
    34:  iinc    2, 1
    37:  goto    2
    

This output may not make much sense to you, but the javap command handily adds comments showing which methods are being called. The comments show that for every iteration of the loop a new StringBuilder object is created, three values are appended to it (a String, an int and a String), and then the StringBuilder is converted back to a String.

But we're not using a StringBuilder, we are using the concatenation operator?

True, but the compiler implements String concatenation using StringBuilder objects, so every time you use '+' to build a String at run time a new StringBuilder object is created to do the work.

But why are there 3 calls to append() when we are only appending 2 strings?

The first call to append() loads the current value of allScores into the empty StringBuilder, and the following 2 calls to append() add the score and the ", " separator to it. Finally, the toString() method is called to extract the String, which is assigned back to the allScores variable.
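
To make the three append() calls concrete, here is a hand-written approximation of what the bytecode listing above does on each iteration. This is only a sketch: the class and method names (DesugaredConcat, withPlus, desugared) are made up for illustration, and the code mirrors the listing rather than being the compiler's literal output.

```java
class DesugaredConcat {

    // The loop as written in the article, using '+=' on a String.
    static String withPlus(String allScores, int[] scores) {
        for (int i = 0; i < scores.length; ++i) {
            allScores += scores[i] + ", ";
        }
        return allScores;
    }

    // Roughly what the listing shows: a fresh StringBuilder per iteration,
    // three append() calls, then toString() assigned back to the variable.
    static String desugared(String allScores, int[] scores) {
        for (int i = 0; i < scores.length; ++i) {
            allScores = new StringBuilder()
                    .append(allScores)   // 1st append: the current String
                    .append(scores[i])   // 2nd append: the int score
                    .append(", ")        // 3rd append: the separator
                    .toString();
        }
        return allScores;
    }

    public static void main(String[] args) {
        int[] scores = {90, 85, 77};
        System.out.println(withPlus("Scores: ", scores));
        System.out.println(desugared("Scores: ", scores));
    }
}
```

Both methods build the same String; the second simply spells out the work that the '+=' version hides.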

So if the compiler is already using StringBuilder why do we need to explicitly use it?

The generated code creates a new StringBuilder for each iteration of the loop, but the problem can be solved using only one StringBuilder object, which is created outside the loop and then repeatedly appended to. Therefore, we can write the previous example as follows:

    StringBuilder sb = new StringBuilder(allScores);

    for ( int i = 0; i < scores.length; ++i )
        {
        // chained calls, matching the bytecode listing shown below
        sb.append(scores[i]).append(", ");
        }

    allScores = sb.toString();
    

Looking at the generated bytecode, you will see there is now only one StringBuilder object created, 2 calls to append() within the loop and one call to toString() after the loop:

    0:   new     #4; //class java/lang/StringBuilder
    3:   dup
    4:   aload_0
    5:   invokespecial   #14; //Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
    8:   astore_2
    9:   iconst_0
    10:  istore_3
    11:  iload_3
    12:  aload_1
    13:  arraylength
    14:  if_icmpge       36
    17:  aload_2
    18:  aload_1
    19:  iload_3
    20:  iaload
    21:  invokevirtual   #12; //Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
    24:  ldc     #13; //String ,
    26:  invokevirtual   #7; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
    29:  pop
    30:  iinc    3, 1
    33:  goto    11
    36:  aload_2
    37:  invokevirtual   #9; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
    40:  astore_0
    

So the second approach is likely to be more efficient than the first example, but by how much?

The answer to that is a little surprising. If we write a simple test program to time both approaches we can get some idea of the relative efficiency of each approach. For example:

    static void test()
        {
        String text = "";

        // warm up the concatenation path before timing it
        for ( int i = 0; i < 10000; ++i )
            text = text + "*";

        System.out.println("Running concatenation time test...");

        // approach one: concatenate with '+' inside the loop
        text = "";
        long t = System.currentTimeMillis();

        for ( int i = 0; i < 500000; ++i )
            text = text + "*";

        System.out.println("Concatenate strings - time = "+(System.currentTimeMillis()-t)+"ms, string length = "+text.length());

        text = "";
        StringBuilder sb = new StringBuilder(text);

        // warm up the StringBuilder path before timing it
        for ( int i = 0; i < 10000; ++i )
            sb.append("*");

        // approach two: start the timer only after the warm-up has finished
        sb = new StringBuilder(text);
        t = System.currentTimeMillis();

        for ( int i = 0; i < 500000; ++i )
            sb.append("*");

        text = sb.toString();

        System.out.println("Using StringBuilder - time = "+(System.currentTimeMillis()-t)+"ms, string length = "+text.length());
        }
    

On my system this produces the following output. You may have to adjust the number of iterations to produce reasonable output values on your system.

    Running concatenation time test...
    Concatenate strings - time = 2297953ms, string length = 500000
    Using StringBuilder - time = 31ms, string length = 500000
    

Yes, approach two really is approximately 74,000 times quicker in this example.

So why such a massive difference?

In the first approach, every iteration of the loop creates a new StringBuilder object and copies all of the characters of the String into it. Then, once the additional text has been appended, a new String object is created and all of the characters are copied again. When the String holds only a few characters this takes very little time, but as the String grows, copying every character twice per iteration becomes increasingly time consuming, so the total work grows roughly with the square of the number of iterations. The discarded objects also make garbage collection cycles more likely. A good way to see how approach one becomes increasingly inefficient is to run the following code and watch the rate of printing slow down as the String length grows.

    for ( int i = 0; i < 500000; ++i )
        {
        text = text + "*";

        // print a message every thousand iterations
        if ( i%1000 == 0 )
            System.out.println("Iteration "+i+", String length = "+text.length());
        }
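
The same effect can be estimated without waiting for the loop to finish. The sketch below is a deliberately simplified model, not a measurement: it only counts the characters copied between String and StringBuilder as described above, and it ignores allocation, garbage collection and the StringBuilder's internal buffer growth. The class and method names (CopyCostModel, charsCopiedByConcat, charsCopiedByBuilder) are hypothetical.

```java
class CopyCostModel {

    // Approach one: when the String already holds 'length' characters, one
    // iteration copies 'length' chars into the new StringBuilder and then
    // 'length + 1' chars into the new String.
    static long charsCopiedByConcat(int iterations) {
        long copied = 0;
        for (long length = 0; length < iterations; ++length) {
            copied += length;      // String -> new StringBuilder
            copied += length + 1;  // StringBuilder -> new String
        }
        return copied;
    }

    // Approach two: each appended char is written once, and toString()
    // copies everything once at the end (buffer regrowth is ignored here).
    static long charsCopiedByBuilder(int iterations) {
        return 2L * iterations;
    }

    public static void main(String[] args) {
        int n = 500000;
        System.out.println("Concatenation copies about " + charsCopiedByConcat(n) + " chars");
        System.out.println("StringBuilder copies about " + charsCopiedByBuilder(n) + " chars");
    }
}
```

Under this model approach one copies n squared characters for n iterations (250,000,000,000 for the 500,000 iterations used here) against about 2n (1,000,000) for the single StringBuilder, which is why the gap widens so quickly as the loop count rises.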

    

Conclusion

If you have performance issues with string concatenation and you are concatenating large strings and/or many strings, changing your code to explicitly use a StringBuilder object may well resolve your performance problems.
 