How to compare Strings in Java

A common mistake Java novices make is using the == operator to compare Strings. This almost certainly leads to unexpected and unwanted behaviour. There is a simple solution to this problem, and a not-so-simple explanation of why it is such a common mistake.

The simple solution is to use String.equals instead of the == operator.

So when you want to know whether two String objects hold the same value, instead of

String a = "Test";
String b = "Test";
if (a == b) {
  // Do something
}

just use this:

String a = "Test";
String b = "Test";
if (a.equals(b)) {
  // Do something
}

But look out for null Strings. The == operator handles null references without complaint, but when you call equals on a null reference, as in

String a = null;
String b = "Test";
if (a.equals(b)) {
  // Do something
}

it will result in a NullPointerException!
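
One way to sidestep the NullPointerException is sketched below. It assumes Java 7 or later for java.util.Objects; the class name NullSafeCompare and the helper sameValue are made up for this illustration. The idea is either to call equals on a value that can never be null (such as a literal), or to let Objects.equals do the null checks for you.

```java
import java.util.Objects;

// Hypothetical helper class for illustration; not part of any library.
class NullSafeCompare {

    // Objects.equals returns true if both arguments are null,
    // false if exactly one of them is null, and a.equals(b) otherwise.
    static boolean sameValue(String a, String b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        String a = null;
        String b = "Test";

        // Calling equals on the literal is safe: the literal is never null,
        // and equals(null) simply returns false.
        System.out.println("Test".equals(a));       // false

        System.out.println(sameValue(a, b));        // false
        System.out.println(sameValue(null, null));  // true
    }
}
```

Which variant you pick is mostly a matter of taste: the literal-first form works when one side is a known constant, while Objects.equals also covers the case where both sides may be null.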

The explanation

The explanation is that the == operator compares the references to the String objects rather than the values they store internally (if you want to read up on equality of Java objects, take a look at this great article Jeff posted).

Yet this does not explain why this is such a common error. Shouldn’t new Java programmers just try ==, fail, and then google the solution?

The reason a wrongly used == can stay hidden in the code for a long time is that it sometimes does what the programmer expected it to do. This is because Java uses some dirty tricks to save you memory and CPU cycles. The trick is called String interning and means that whenever you write a String literal in Java, say:

String a = "Test";

the JVM will add an instance of the String class to a so-called string intern pool so that it can be reused. This is possible because Strings are immutable in Java (the value they hold cannot be changed once the object has been created). So when you declare two Strings from the same literal like this:

String a = "Test";
String b = "Test";
if (a == b) {
  System.out.println("They are the same?");
}

You will indeed always see “They are the same?” on your console for this snippet. This is because the variables a and b both reference the same String object from the string intern pool. This behavior is not some random optimization that differs between JVMs or compiler implementations, but a behavior defined in the Java Language Specification. This means that when you are absolutely sure that two String variables reference the same interned object, you can use the == operator to squeeze out some performance. In most cases though, it’s safer to stick to .equals.
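
To get a feeling for where this guarantee ends, here is a small sketch (the class LiteralIdentity and its method names are invented for this example). It relies on the rule in the Java Language Specification that compile-time constant expressions are interned, while Strings created with new or by concatenating non-constant values at runtime are new objects.

```java
// Hypothetical demo class; all names are illustrative only.
class LiteralIdentity {

    // Both sides are compile-time constant expressions, so both
    // variables refer to the same interned String object.
    static boolean literalVsConstantExpression() {
        String a = "Test";
        String b = "Te" + "st"; // folded into "Test" by the compiler
        return a == b;
    }

    // new String(...) always creates a fresh object, even though
    // its value equals the literal.
    static boolean literalVsNewString() {
        String a = "Test";
        String c = new String("Test");
        return a == c;
    }

    // 'part' is not final, so it is not a constant variable. The
    // concatenation happens at runtime and yields a new String.
    static boolean literalVsRuntimeConcat() {
        String a = "Test";
        String part = "Te";
        String d = part + "st";
        return a == d;
    }

    public static void main(String[] args) {
        System.out.println(literalVsConstantExpression()); // true
        System.out.println(literalVsNewString());          // false
        System.out.println(literalVsRuntimeConcat());      // false
    }
}
```

All three pairs would compare as equal with .equals, which is exactly why == is the risky choice as soon as a value is not a plain literal.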

To show you why it can be dangerous to use == to compare Strings, take a look at this snippet:

String a = "Test";
String b = "TTest";
if (a == b.substring(1)) {
  System.out.println("I will not get printed :(");
}
 
if (a.equals(b.substring(1))) {
  System.out.println("But I will!");
}

So when we do String calculations at runtime, such as comparing our variable a to a substring of b, b.substring(1) creates a new instance of String, and therefore the == comparison fails.

When we know about Java’s string interning mechanism, we can even use the string intern pool ourselves via the String.intern() method.

String a = "Test";
String b = "TTest";
if (a == b.substring(1).intern()) {
  System.out.println("Wow, I will be printed!");
}

The b.substring(1).intern() call checks whether the value of the String instance returned by b.substring(1) is already stored in the string intern pool and, if so, returns the reference found there. If no such String can be found in the pool, the reference returned by b.substring(1) is stored in the pool and then returned. In our case such a String will be found in the pool though, since we declared String a = "Test", and all such constants are interned according to the specification. For this reason the a == b.substring(1).intern() comparison holds true.
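
The following sketch puts the two snippets above side by side in one runnable class (InternDemo and its method names are made up for this illustration). It also checks that intern() hands back the object already in the pool, not the String it was called on.

```java
// Hypothetical demo class; all names are illustrative only.
class InternDemo {

    // Builds a String with the value "Test" at runtime, so the
    // result is a new object and not the pooled literal.
    static String runtimeTest() {
        String b = "TTest";
        return b.substring(1);
    }

    // Without intern() the references differ.
    static boolean sameWithoutIntern() {
        String a = "Test";
        return a == runtimeTest();
    }

    // With intern() we get the pooled object, which is the literal.
    static boolean sameAfterIntern() {
        String a = "Test";
        return a == runtimeTest().intern();
    }

    // intern() returns the pooled literal rather than the String
    // it was called on, because "Test" is already in the pool.
    static boolean internReturnsPooledObject() {
        String a = "Test";
        String s = runtimeTest();
        return s.intern() == a && s != a;
    }

    public static void main(String[] args) {
        System.out.println(sameWithoutIntern());         // false
        System.out.println(sameAfterIntern());           // true
        System.out.println(internReturnsPooledObject()); // true
    }
}
```

Keep in mind that intern() has to look the value up in the pool, so calling it just to be able to use == is rarely a win over a plain .equals call.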
