Java:Language - When to use equals

From Juneday education

Introduction

The comparison operator == is easy to understand when it comes to primitive types. It is safe to assume that you all understand that the value of the expression 5 == 5 is true, just as 'A' == 'A' and 3.14 == 3.14 are also true.

When it comes to comparing references (variables and expressions of non-primitive type), we are comparing what object is referred to, not what's inside that object.

Comparing two references using == yields true only when the two references refer to the exact same object.
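
As a minimal sketch of this rule, the following example compares references to StringBuilder objects. The class name SameObjectDemo, the helper sameObject and the variable names are made up for the illustration.

```java
class SameObjectDemo {
    // Returns true only when both references point to one and the same object.
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        StringBuilder first = new StringBuilder("abc");
        StringBuilder second = new StringBuilder("abc"); // same content, but a separate object
        StringBuilder alias = first;                     // copies the reference, creates no object

        System.out.println(sameObject(first, second)); // false
        System.out.println(sameObject(first, alias));  // true

        // A change made through one reference is visible through the other,
        // because there is only one object behind both of them.
        alias.append("def");
        System.out.println(first); // abcdef
    }
}
```

Here first and alias are two variables, but there is only one object behind them, which is exactly what == reports.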

This is often not what we want to bother ourselves with when programming in Java and using references. For instance, when we create two String objects with the same character sequence as their values, we are not concerned with whether the JVM actually created one single object and let both our references refer to that single object. We are more interested in the content of the String. The same goes for most objects. It is not often that we want to check whether two objects are at the same memory address in the JVM's heap memory. Much more frequently, we are interested in checking whether two objects represent the same thing or value.

To compare the state of two objects, we can't rely on the == operator, since it only compares the references, that is, whether the two references point to the same memory address. We must then investigate the state of the objects using some other means. This is where the equals(Object) instance method comes in.

When we write a class, we implicitly inherit all the methods of the class java.lang.Object. See our chapters on inheritance if you didn't know this. One such inherited method is public boolean equals(Object o). We can override that method if we want objects of our class to be able to tell us whether they are "equal" in terms of state, that is, whether they are equivalent. Two Strings are "equal" if they contain the exact same sequence of characters. Two java.lang.Integer objects are equal if they represent the same integer number. Two Book objects could be considered equal if they represent the same book in e.g. a book store's product line. Not the same physical book, but the same book product that you can order from the store.
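
The Book example could be sketched as follows. This is only one possible design: we assume that a book product is identified by an isbn field, and the field names, the constructor and the values in main are all made up for the illustration.

```java
import java.util.Objects;

class Book {
    private final String isbn;  // assumed to identify the book product
    private final String title;

    Book(String isbn, String title) {
        this.isbn = isbn;
        this.title = title;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;             // same object, trivially equal
        if (!(o instanceof Book)) return false; // also covers o == null
        Book other = (Book) o;
        return Objects.equals(isbn, other.isbn);
    }

    @Override
    public int hashCode() {
        // By convention, hashCode uses the same field as equals.
        return Objects.hashCode(isbn);
    }

    @Override
    public String toString() {
        return title + " (" + isbn + ")";
    }
}

class BookDemo {
    public static void main(String[] args) {
        Book onShelf = new Book("isbn-0001", "Some Java Book");
        Book inOrder = new Book("isbn-0001", "Some Java Book");

        System.out.println(onShelf == inOrder);      // false: two distinct objects
        System.out.println(onShelf.equals(inOrder)); // true: same book product
    }
}
```

Without the overridden method, Book would inherit equals from java.lang.Object, which only returns true when both references refer to the same object.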

Value objects and objects with identities

This makes sense for value objects, where the identity isn't important. Money bills are another example. We can write a class Bill such that the equals method only looks at the value. A Swedish 100 SEK bill is equal to any other Swedish 100 SEK bill (if we decide this is the important factor to consider). Two such bills are not the same bill, but they are both equally good for ordering a beer and a sandwich. Similarly, two such Bill objects may be two distinct objects in the JVM, but they are still equal to each other, if we write the equals method such that it doesn't consider identity (like a bill serial number, issue date etc).
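
A Bill class along these lines might look like the sketch below. The fields currency, amount and serialNumber are our own assumptions about what such a class could contain; the point is only that equals looks at the value and ignores the serial number.

```java
import java.util.Objects;

class Bill {
    private final String currency;     // e.g. "SEK"
    private final int amount;          // face value, e.g. 100
    private final String serialNumber; // identity of the physical bill, ignored by equals

    Bill(String currency, int amount, String serialNumber) {
        this.currency = currency;
        this.amount = amount;
        this.serialNumber = serialNumber;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Bill)) return false;
        Bill other = (Bill) o;
        // Only the value counts: currency and amount.
        return amount == other.amount && Objects.equals(currency, other.currency);
    }

    @Override
    public int hashCode() {
        return Objects.hash(currency, amount);
    }
}

class BillDemo {
    public static void main(String[] args) {
        Bill mine = new Bill("SEK", 100, "A0000001");
        Bill yours = new Bill("SEK", 100, "B0000002");
        Bill bigger = new Bill("SEK", 500, "C0000003");

        System.out.println(mine == yours);       // false: two distinct objects
        System.out.println(mine.equals(yours));  // true: same value
        System.out.println(mine.equals(bigger)); // false: different value
    }
}
```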

A passport in a passport system for the border police could be handled in a similar way, but here identity is very important. We can't have two identical passports in the physical world. A passport is issued for one citizen, and it has unique information when compared to other passports. But in our computer system for the border police, we might have a list of all Swedish passports, and still create a passport object using e.g. a photo scanner at the airport. Here again, the two passport objects in the JVM would have two different heap memory addresses, but when we look for the scanned passport object and check it against the list of known passports, we are using the equals method to see if we get a match. Such an equals method would consider the identity of the physical passport.

In summary: when accepting a bill in a vending machine written in Java, we are only interested in the value in the equals method of the Bill class. When looking up a passport in a passport system, we are interested in identity too. The issue date and passport number are important in the equals method of the Passport class.
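
The passport lookup could be sketched like this. The Passport fields, the helper isKnown and all passport numbers and dates are invented for the example. We rely on the fact that List.contains uses the equals method of the elements to find a match.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Objects;

class Passport {
    private final String number;       // passport number (made-up values below)
    private final LocalDate issueDate;

    Passport(String number, LocalDate issueDate) {
        this.number = number;
        this.issueDate = issueDate;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Passport)) return false;
        Passport other = (Passport) o;
        // Identity of the physical passport: number and issue date.
        return Objects.equals(number, other.number)
            && Objects.equals(issueDate, other.issueDate);
    }

    @Override
    public int hashCode() {
        return Objects.hash(number, issueDate);
    }
}

class PassportLookupDemo {
    // List.contains calls equals on the elements to look for a match.
    static boolean isKnown(List<Passport> knownPassports, Passport scanned) {
        return knownPassports.contains(scanned);
    }

    public static void main(String[] args) {
        List<Passport> known = List.of(
            new Passport("P1234567", LocalDate.of(2020, 5, 17)),
            new Passport("P7654321", LocalDate.of(2021, 1, 3)));

        // A new object created from the scanner, with the same identity data.
        Passport scanned = new Passport("P1234567", LocalDate.of(2020, 5, 17));
        System.out.println(isKnown(known, scanned)); // true

        // Same number but a different issue date: no match.
        Passport suspicious = new Passport("P1234567", LocalDate.of(2019, 5, 17));
        System.out.println(isKnown(known, suspicious)); // false
    }
}
```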

We think that some of the confusion regarding == versus equals() might come from the String class, and the fact that Strings are immutable and therefore can be safely re-used by the JVM. Two references to strings with the same content may in fact refer to one and the same object. Read on for a discussion and some examples.
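
A small sketch of this re-use: identical string literals share one object, while new String always creates a separate one. The class name StringPoolDemo and the helper sameObject are made up for the example.

```java
class StringPoolDemo {
    // Returns true only when both references point to the same String object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String a = "Hi there";
        String b = "Hi there";             // same literal: the JVM re-uses one object
        String c = new String("Hi there"); // new always creates a separate object

        System.out.println(sameObject(a, b));          // true
        System.out.println(sameObject(a, c));          // false
        System.out.println(a.equals(c));               // true: same characters
        System.out.println(sameObject(a, c.intern())); // true: intern() returns the shared object
    }
}
```

This is why == sometimes appears to "work" for Strings. It only does so when both references happen to refer to the shared object, so equals remains the right tool for comparing text.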

Discussion and examples

Let's start this discussion with some code:

  String s1 = new String ("Hi there");
  String s2 = "Hi";
  String s3 = s1;
  String s4 = s2 + " there";

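In the code above we work with several String objects (the literals "Hi there", "Hi" and " there", the object created by new String("Hi there"), and the result of the concatenation s2 + " there") and four reference variables (s1, s2, s3 and s4).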

Let's start by asking whether s1 and s2 are the same. They're not: they refer to two different objects. If we really want to check whether s1 and s2 are the same, we simply use the == operator, like this: if (s1 == s2) { ... some code }. If we check whether s1 and s3 are the same, the answer is yes, even though they are different reference variables. They are the same since they refer to the same object. Compare this to looking at two pieces of paper with an address written on each of them. There are two pieces of paper, but we consider the addresses to be the same if they refer to the same house or apartment.

Are they equal? What do we mean by equal? Equality usually refers to comparing two objects with respect to some interesting attributes. For instance, two persons could be considered equal if their names are the same. Comparing (checking the equality of) two Strings is simply done by checking whether the text in the two objects is the same. The code that does this was written by the people who developed the String class. We check equality by invoking the method equals on one object and passing the other as an argument. Let's do some comparisons:

  s1.equals(s2) // evaluates to false: s1 refers to an object whose value is "Hi there", s2 to one whose value is "Hi"
  s4.equals(s1) // evaluates to true: s4 and s1 both refer to objects whose value is "Hi there"
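
Putting the pieces together, here is a complete program you can compile and run to try the comparisons yourself. The class name WhenToUseEquals and the helper method compare are our own additions for the sake of the example.

```java
class WhenToUseEquals {
    // Describes how two String references relate: by identity (==) and by content (equals).
    static String compare(String a, String b) {
        return "same object: " + (a == b) + ", equal: " + a.equals(b);
    }

    public static void main(String[] args) {
        String s1 = new String("Hi there");
        String s2 = "Hi";
        String s3 = s1;
        String s4 = s2 + " there";

        System.out.println("s1 vs s2 - " + compare(s1, s2)); // same object: false, equal: false
        System.out.println("s1 vs s3 - " + compare(s1, s3)); // same object: true, equal: true
        System.out.println("s1 vs s4 - " + compare(s1, s4)); // same object: false, equal: true
    }
}
```

Note the last line: s1 and s4 contain the same text, but they are two distinct objects, so only equals gives the answer we usually care about.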

In conclusion, the == operator, when applied to two references, checks whether the two references refer to one and the same object (in the JVM's heap memory area). This is a rather technical test, and not one we as programmers very often care about. Objects typically model some entity in the problem domain of our program, like a File, a Customer, an Order or a bank Account for instance. How these model objects are stored in memory is not as interesting as the question "Are the objects these two variables refer to equivalent to each other?" If the variables happen to refer to exactly the same object, then it is fair to consider the answer to that question to be true.

What various classes consider to be equality (via the equals method) varies, and is documented for each class. For some examples, we encourage you to read the documentation for the equals method of some API classes, e.g. java.lang.String, java.lang.Integer, java.io.File and java.util.ArrayList.
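
As a complement to reading the documentation, here is a small sketch that tries out equals on some of these classes. The class name ApiEqualsDemo and the file name notes.txt are made up; the file does not need to exist, since File.equals compares path names and not file contents.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ApiEqualsDemo {
    public static void main(String[] args) {
        // Integer: equal if both objects represent the same int value.
        Integer i1 = Integer.valueOf(1000);
        Integer i2 = Integer.valueOf(1000);
        System.out.println(i1.equals(i2)); // true

        // ArrayList: equal if both lists have equal elements in the same order.
        List<String> l1 = new ArrayList<>(Arrays.asList("a", "b"));
        List<String> l2 = new ArrayList<>(Arrays.asList("a", "b"));
        List<String> l3 = new ArrayList<>(Arrays.asList("b", "a"));
        System.out.println(l1.equals(l2)); // true
        System.out.println(l1.equals(l3)); // false: same elements, different order

        // File: equal if both objects denote the same abstract path name.
        File f1 = new File("notes.txt");
        File f2 = new File("notes.txt");
        System.out.println(f1.equals(f2)); // true
    }
}
```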

Note that there are some conventions for writing an equals method, like also providing a hashCode() method that involves the same instance variables as the equals test.
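
The hashCode() convention matters as soon as objects are put in hash-based collections such as java.util.HashSet. Below is a minimal sketch with a hypothetical Point class (our own example) where equals and hashCode use the same two instance variables.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point other = (Point) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        // Uses exactly the instance variables that equals compares.
        return Objects.hash(x, y);
    }
}

class HashCodeDemo {
    public static void main(String[] args) {
        Set<Point> points = new HashSet<>();
        points.add(new Point(1, 2));

        // A different object with the same state is found,
        // because equal objects produce the same hash code.
        System.out.println(points.contains(new Point(1, 2))); // true
        System.out.println(points.contains(new Point(2, 1))); // false
    }
}
```

If Point overrode equals but not hashCode, two equal points would most likely end up with different hash codes, and the first contains call could fail to find the point.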

Most classes that are Comparable return 0 from the compareTo method for objects that are equal according to the equals method. Those that don't should document that and explain why.
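
One API class that documents such a difference is java.math.BigDecimal, whose equals method also takes the scale (the number of decimals) into account. The sketch below compares it with String, where compareTo and equals agree. The class name CompareToVersusEquals and the helper agree are made up for the example.

```java
import java.math.BigDecimal;

class CompareToVersusEquals {
    // Returns true when compareTo and equals give the same verdict for a and b.
    static <T extends Comparable<T>> boolean agree(T a, T b) {
        return (a.compareTo(b) == 0) == a.equals(b);
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("2.0");
        BigDecimal b = new BigDecimal("2.00");

        System.out.println(a.compareTo(b) == 0); // true: same numeric value
        System.out.println(a.equals(b));         // false: equals also compares the scale
        System.out.println(agree(a, b));         // false

        System.out.println(agree("apple", "apple"));  // true
        System.out.println(agree("apple", "banana")); // true: both say "not equal"
    }
}
```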

Note: see also the discussion of the constant pool in Java: Constant confusion.

Note: we're using the variable names s1, s2 etc. even though we usually complain about variable names such as these... well, we're only human after all.

Links

See also

Further reading

Videos

Lecture slides

  • TODO