Java:Language - Arrays

From Juneday education

Introduction

This chapter introduces arrays, a built-in language construct for creating lists of similar objects (or primitives).

Arrays are reference types

In Java, an array is an object created on the heap. This means that arrays should be considered reference types. An array reference as an instance variable or static variable is thus initialized to null.

As with every other kind of local variable, an array reference declared as a local variable in a method (or constructor) isn't initialized at all and must be assigned a value before use. Arrays are no different in this regard.
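Here's a minimal sketch of the difference between fields and local variables. The class name ArrayReferenceDefaults and its fields are made up for this illustration only:

```java
public class ArrayReferenceDefaults {

  // Hypothetical fields, only here to show the default value of an array reference
  private static int[] staticScores;
  private int[] instanceScores;

  // Returns true if the static field still holds its default value (null)
  public static boolean staticFieldIsNull() {
    return staticScores == null;
  }

  // Returns true if the instance field still holds its default value (null)
  public boolean instanceFieldIsNull() {
    return instanceScores == null;
  }

  public static void main(String[] args) {
    System.out.println("static field is null: " + staticFieldIsNull());
    System.out.println("instance field is null: "
                       + new ArrayReferenceDefaults().instanceFieldIsNull());

    int[] localScores;
    // System.out.println(localScores.length); // would not compile:
    // the local variable has not been assigned a value yet
    localScores = new int[3]; // assign before use
    System.out.println("local array length: " + localScores.length);
  }
}
```

Both fields are null without us doing anything, while the local variable can't even be read until we have assigned something to it.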

Arrays get their elements initialized to default values

Arrays have their elements initialized to default values when they are created. An array of type int[] has its elements initialized to 0, and arrays with elements of a reference type have their elements initialized to null. This includes, of course, arrays of arrays (multi-dimensional arrays). Thus, an int[][] array (an array of references to int arrays) will have its elements initialized to the default value null if only the first dimension is given:

int[][] matrix = new int[3][];
System.out.println("matrix[0]: " + matrix[0]);
System.out.println("matrix[1]: " + matrix[1]);
System.out.println("matrix[2]: " + matrix[2]);

In the example above, a two-dimensional array of ints is created. A two-dimensional array of ints is actually just an array of references to int arrays. Since only the first dimension was given, the array above is a three element long array with all elements initialized to null:

$ javac Test.java && java Test
matrix[0]: null
matrix[1]: null
matrix[2]: null
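To make such a matrix usable, each null row has to be replaced with a real int[]. The sketch below does that, and (as a choice made for this example only) gives each row a different length. The class name JaggedMatrix and the method buildTriangle are our own inventions:

```java
public class JaggedMatrix {

  // Builds a 3-row matrix where row number i gets i + 1 elements (all default 0)
  public static int[][] buildTriangle() {
    int[][] matrix = new int[3][];      // three null row references
    for (int row = 0; row < matrix.length; row++) {
      matrix[row] = new int[row + 1];   // replace null with a real int[]
    }
    return matrix;
  }

  public static void main(String[] args) {
    int[][] matrix = buildTriangle();
    for (int row = 0; row < matrix.length; row++) {
      System.out.println("matrix[" + row + "].length: " + matrix[row].length);
    }
  }
}
```

Since every row is an array object of its own, nothing forces the rows to have the same length.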

Normal (one-dimensional) arrays have their elements initialized to default values according to the same rules as instance and static variables of classes and objects. All numeric types get their elements initialized to zero (0 for the integer types, 0.0 for float and double, and the character '\u0000' for char), boolean to false and reference types to null (as shown above). Some examples:

int[] results = new int[3];
System.out.println("results[0]: " + results[0]);
String[] names = new String[3];
System.out.println("names[0]: " + names[0]);
boolean[] truthValues = new boolean[3];
System.out.println("truthValues[0]: " + truthValues[0]);

The above example would yield the following output:

results[0]: 0
names[0]: null
truthValues[0]: false

Explicit initialization

For shorter arrays, you can initialize the values using a special array initialization syntax shown below:

int[] results = new int[]{10, 15, 12};

The values for the initialization are simply literals listed in a comma-separated list enclosed in curly braces. Note that the size of the array isn't explicit any more, but inferred by the compiler (which simply counts the elements in the list of literals). So the above creates a three element long array of int values, initialized to the literal values of the list: 10, 15, and 12.
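Java also accepts a shorter form of the initializer in a declaration, where new int[] is left out. Here's a small sketch comparing the two forms. The class InitializerForms and the helper method sum are made up for the example:

```java
public class InitializerForms {

  // Returns the sum of all elements; used to show an array created inline
  public static int sum(int[] values) {
    int total = 0;
    for (int value : values) {
      total += value;
    }
    return total;
  }

  public static void main(String[] args) {
    int[] longForm = new int[]{10, 15, 12};
    int[] shortForm = {10, 15, 12};   // short form, allowed only in a declaration
    System.out.println("longForm.length: " + longForm.length);
    System.out.println("shortForm.length: " + shortForm.length);
    // When creating the array directly in an argument list, the long form is required:
    System.out.println("sum: " + sum(new int[]{10, 15, 12}));
  }
}
```

Both forms create the same kind of three element long array. The short form is only a convenience for declarations.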

Indexing and dereferencing elements of an array

So, an array is nothing more than a list containing either primitive values of the same type (like int) or references to objects of a similar type (or the special value null). In order to use or change a value in an array, the special [] brackets construct is used. Arrays are indexed with an int value for each element in the array. Indexing starts from 0 and goes to the length of the array minus one.

The array from above int[] results = new int[]{10, 15, 12}; will thus be indexed from 0 to 2. results[0] contains the int value of 10, results[1] will contain the int value 15, and results[2] will contain the int value 12.

Java checks the indices at runtime, which ensures that we never access memory outside the boundaries of the array. Trying to access results[3] or results[666] of the array above (which has a length of 3 and a maximum index of 3-1 = 2) will result in a runtime exception being thrown: ArrayIndexOutOfBoundsException. The same goes for negative indices. Therefore, you should always check the length of an array before making assumptions about which indices are valid.
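A small sketch of reading and changing an element, together with a hypothetical helper method (isValidIndex is our own name, not part of any Java API) that tells whether an index is safe to use:

```java
public class IndexingDemo {

  // Returns true if index can be used with the given (non-null) array
  public static boolean isValidIndex(int[] array, int index) {
    return index >= 0 && index < array.length;
  }

  public static void main(String[] args) {
    int[] results = new int[]{10, 15, 12};
    results[1] = 20;   // change the second element
    System.out.println("results[1]: " + results[1]);
    System.out.println("index 2 valid: " + isValidIndex(results, 2));
    System.out.println("index 3 valid: " + isValidIndex(results, 3));
  }
}
```

The helper uses the length variable of the array, which is the topic of the next section.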

Checking the length of an array

Since arrays are Java objects, nothing prevented the inventors of Java from equipping arrays with a public instance variable called length. So they did. This instance variable is always guaranteed to be the length of the array (i.e. the number of elements) and it is set when the array object is created. Once an array is created, its size is fixed - the elements can be changed, but the length will remain constant over the lifetime of the array object.

To access a public instance variable, dot notation is used as always:

int numberOfElements = someArrayReference.length;
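Because the length never changes, "growing" an array really means creating a new, longer array and copying the elements over. Here's a sketch of that idea using java.util.Arrays.copyOf. The class FixedLengthDemo and the method append are our own names for the example:

```java
import java.util.Arrays;

public class FixedLengthDemo {

  // Returns a NEW array, one element longer than original, with value at the end
  public static int[] append(int[] original, int value) {
    int[] longer = Arrays.copyOf(original, original.length + 1);
    longer[longer.length - 1] = value;
    return longer;
  }

  public static void main(String[] args) {
    int[] results = new int[]{10, 15, 12};
    int[] extended = append(results, 7);
    System.out.println("results.length: " + results.length);   // still 3
    System.out.println("extended.length: " + extended.length); // 4
  }
}
```

The original array is left untouched. We simply end up with two array objects of different lengths.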

In real life, you will usually not create arrays by hand, hardcoding the element values. If you do, you will probably not check the length of your array on the next line (you probably know how many elements you just created!). So when do we need to check the length of an array, then?

In more realistic situations, you will be passed an array reference to a method. When writing the method body, you won't know the size of the array passed as the argument, so before trying to access any element in the array, you should check 1) that the reference is not null and 2) that the index you are interested in is a valid index. Valid indices are, as we said above, between 0 and length - 1.

Here's a code snippet explaining how to access and use the length instance variable of an array via the reference to the array:

int[] results = new int[]{10, 15, 12};
if (results != null) {
  int resultsLength = results.length;
  System.out.println("results.length: " + resultsLength);
  System.out.println("first index of results is 0");
  System.out.println("Max index of results is: " + (resultsLength-1) );
}

The output from the above snippet (placed inside the main method of a small test class, for instance) would be:

results.length: 3
first index of results is 0
Max index of results is: 2

This is useful knowledge for looping through each element of an array, using a simple for loop, for instance.

Looping through an array

Code says more than a textual explanation in this case:

int[] results = new int[]{10, 15, 12};
for (int i = 0; i < results.length; i++) {
  System.out.println("results[" + i + "]: " + results[i]);
}

The printout from the above snippet (from the main method of a test class) would be:

$ javac Test.java && java Test
results[0]: 10
results[1]: 15
results[2]: 12

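If the index value isn't of any interest to you, you can use the simpler for-each loop instead (the enhanced for statement works on arrays as well as on objects implementing the Iterable interface, even though arrays themselves don't implement Iterable):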

int[] results = new int[]{10, 15, 12};
for (int i : results) {
  System.out.println(i);
}

The printout from the above snippet would be:

$ javac Test.java && java Test
10
15
12

You can look at the source code here. If you want to download it, you should use the link labelled "raw".
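One thing to be aware of with the for-each loop is that the loop variable holds a copy of each element. Assigning to it doesn't change the array. The sketch below (with class and method names made up for the example) compares the two kinds of loops when we want to change the elements:

```java
public class ForEachCopyDemo {

  // Tries (and fails) to double every element via the for-each variable
  public static void doubleWithForEach(int[] values) {
    for (int value : values) {
      value = value * 2;   // changes only the local copy, not the array
    }
  }

  // Doubles every element in place using the index
  public static void doubleWithIndex(int[] values) {
    for (int i = 0; i < values.length; i++) {
      values[i] = values[i] * 2;
    }
  }

  public static void main(String[] args) {
    int[] results = new int[]{10, 15, 12};
    doubleWithForEach(results);
    System.out.println("after for-each: " + results[0]);   // still 10
    doubleWithIndex(results);
    System.out.println("after index loop: " + results[0]); // now 20
  }
}
```

So if you need to change the elements, use the index-based loop.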

Note that in real programs, as opposed to academic programs, we'd get the reference to an array passed as an argument to the method we are writing the code for. In such cases (as with all reference parameters) we should first check that the array reference is not null, or we'd get a runtime exception of type NullPointerException when we try to use the reference to access the length instance variable. Arrays are Java objects, so they are no different from any other object in this regard. If we were passed a String reference to a method, we couldn't call an instance method or access a public instance variable via the reference if it were null, of course.

Here's a small static method called from a small test program to show the point made above:

public class TestArrayParameter {

  public static void main(String[] args) {
    int[] numbers = new int[]{5,3,4,1,9};
    printArray(numbers);
    printArray(null);
    printArray(new int[0]);
  }

  private static void printArray(int[] nums) {
    if (nums == null) {
      System.out.println("null");
    } else {
      System.out.println("Length of parameter nums: " + nums.length);
      for (int i = 0; i < nums.length; i++) {
        System.out.println("nums[" + i + "]: " + nums[i]);
      }
    }
  }
}

The printout from the small main method above when run would be:

$ javac TestArrayParameter.java && java TestArrayParameter
Length of parameter nums: 5
nums[0]: 5
nums[1]: 3
nums[2]: 4
nums[3]: 1
nums[4]: 9
null
Length of parameter nums: 0

As you see from the last call to the method, printArray(new int[0]);, an array can be of length 0, which signifies an existing but permanently empty array. When an empty (length 0) array is given as an argument to the method, the loop doesn't execute at all, since 0<0 evaluates to false. So for the last call, only the length (0) is printed, and the loop is never entered.

Examples of careless and unsafe use of array references

To prove the importance of checking an array reference given as argument for null and length, we've added two unsafe methods to the test class. The first one neglects to check for null and gets a runtime exception (NullPointerException) when trying to dereference the length variable:

// in main:
{  ...
   unsafePrintLength(null);
   ...
}
// The unsafe method, taking for granted that the parameter isn't null:
  private static void unsafePrintLength(int[] nums) {
    System.out.println("Length of the parameter nums: " + nums.length);
  }

This is what happens when you are sloppy like that:

$ javac TestArrayParameter.java && java TestArrayParameter
Exception in thread "main" java.lang.NullPointerException
	at TestArrayParameter.unsafePrintLength(TestArrayParameter.java:11)
	at TestArrayParameter.main(TestArrayParameter.java:7)

Here's a second unsafe method, neglecting the check of length, trying to access an index outside the legal range:

// in main():
{   ...
    int[] numbers = new int[]{5,3,4,1,9}; // length will be 5
    unsafeIndexing(numbers);
    ...
}
//The unsafe method:
  private static void unsafeIndexing(int[] nums) {
    System.out.println("Element with index 10: " + nums[10]);
  }

This is the penalty for forgetting to check that the index you want exists, by checking the length:

$ javac TestArrayParameter.java && java TestArrayParameter
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 10
	at TestArrayParameter.unsafeIndexing(TestArrayParameter.java:14)
	at TestArrayParameter.main(TestArrayParameter.java:8)

That should make it pretty obvious that you should check your parameters!

The remedy would be to check that the nums reference isn't null and check length to see that the index you want to access is less than nums.length. Why strictly less than? Because the greatest number less than nums.length is nums.length - 1 which happens to always correspond to the maximum index of an array (unless the length is 0 in which case there are no indices!).
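Here's one possible sketch of safer versions of the two unsafe methods. The class SafeArrayAccess and its methods are our own names, and treating null as "length 0" and returning a fallback value for a bad index are assumptions made for this example. In your own code, throwing an exception with a good message might be a better choice:

```java
public class SafeArrayAccess {

  // Returns the length of nums, or 0 if the reference is null (our assumption)
  public static int safeLength(int[] nums) {
    return nums == null ? 0 : nums.length;
  }

  // Returns nums[index] if that element exists, otherwise the fallback value
  public static int getOrDefault(int[] nums, int index, int fallback) {
    if (nums == null || index < 0 || index >= nums.length) {
      return fallback;
    }
    return nums[index];
  }

  public static void main(String[] args) {
    int[] numbers = new int[]{5, 3, 4, 1, 9};
    System.out.println("Length of null: " + safeLength(null));
    System.out.println("Length of numbers: " + safeLength(numbers));
    System.out.println("Element with index 4: " + getOrDefault(numbers, 4, -1));
    System.out.println("Element with index 10: " + getOrDefault(numbers, 10, -1));
  }
}
```

Note that the null check comes first in the condition. Since || evaluates from left to right and stops as soon as one part is true, nums.length is never touched when nums is null.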

So when should we use arrays?

It is almost always more convenient to use some kind of List from the collections API which comes with the JDK, than to use arrays. There are a few implementations of java.util.List such as java.util.ArrayList and java.util.LinkedList which you could use instead. An important advantage of using these classes and interfaces over plain arrays is that the collections API has been generic since Java 5. This provides compile-time type safety which cannot be achieved (as easily) using plain arrays. Another big advantage of using the collection interfaces and classes over arrays is that the collection implementations are dynamic in that they don't have a fixed predetermined size (as the plain arrays do). For instance, a java.util.ArrayList will dynamically grow its internal storage (actually a plain old array) if needed, so that you can keep adding elements to it without having to worry about ArrayIndexOutOfBoundsException just for adding new elements. The ArrayList will manage its internal size itself.
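A small sketch of the dynamic behaviour described above. The class GrowingListDemo and the method collect are made up for the example:

```java
import java.util.ArrayList;
import java.util.List;

public class GrowingListDemo {

  // Adds the squares 0, 1, 4, 9, ... to a list without deciding a size up front
  public static List<Integer> collect(int count) {
    List<Integer> numbers = new ArrayList<>();
    for (int i = 0; i < count; i++) {
      numbers.add(i * i);   // the list grows as needed
    }
    return numbers;
  }

  public static void main(String[] args) {
    List<Integer> squares = collect(100);
    System.out.println("squares.size(): " + squares.size());
    System.out.println("squares.get(3): " + squares.get(3));
    // squares.add("hello"); // would not compile: the list is typed to Integer
  }
}
```

We never told the list how many elements to expect, and the compiler stops us from adding anything but Integer values to it.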

So, when would we choose to use a plain old array then? Well, for example when some API forces you to do so. There are many methods of the I/O streams API which take byte arrays as arguments for low level I/O, for instance. And don't forget about the good old public static void main(String[] args) method! When you pass arguments to your Java applications, you know that they end up in the args array of the main method. So it is good to know how to check arrays for length, how to loop through the elements and how to access individual elements by their index using the [] (square brackets) syntax.
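As a sketch of the args case, here's a small program which checks and loops through its arguments. The class ArgsDemo and the helper method describe are our own names:

```java
public class ArgsDemo {

  // Builds one line of text per argument, or a short message if there are none
  public static String describe(String[] args) {
    if (args == null || args.length == 0) {
      return "no arguments";
    }
    StringBuilder description = new StringBuilder();
    for (int i = 0; i < args.length; i++) {
      description.append("args[").append(i).append("]: ")
                 .append(args[i]).append('\n');
    }
    return description.toString();
  }

  public static void main(String[] args) {
    System.out.print(describe(args));
  }
}
```

Running it as java ArgsDemo one two would list the two arguments with their indices, and running it without arguments would take the "no arguments" branch.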

Another quite useful application of arrays is creating a generic List on the fly. Let's say some method needs an argument of type List<String> (a generic list typed to String references). A pretty convenient way of creating such a list is to use the java.util.Arrays convenience method asList(). Again, we believe that a code example shows the point better than explanatory text.

Consider a method which expects an argument of type List<String>. For the sake of the argument (no pun intended), we'll say that the method should loop through the list and print each string as uppercase only texts:

  public static void printListAsUpperCase(List<String> list) {
    for (String elem : list) {
      System.out.println(elem.toUpperCase());
    }
  }

Now, in order to test this simple method and send it an argument of some type of List<String>, do we need to create a new ArrayList<String>, for instance, and add the strings one by one? We could, but there's a short-hand for that, using java.util.Arrays.asList():

    printListAsUpperCase(Arrays.asList
                         ( new String[]{"abba", "europe", "ace of base"} )
                         );

The result of the call to Arrays.asList with the new array created as above is that we'll get a reference to a List<String> back, which suits the call to the printListAsUpperCase() method!

We'll actually get a reference to a fixed-size list backed by the array (an instance of a private class nested inside java.util.Arrays, not a java.util.ArrayList), but we don't have to care about that. We'll get the reference as a reference of type "reference to an object whose class implements java.util.List<String>", which is exactly what we want. Here's the output from the small application:

$ javac ListExample.java && java ListExample
ABBA
EUROPE
ACE OF BASE

And here's the complete source code for this small (and rather stupid) application:

import java.util.Arrays;
import java.util.List;

public class ListExample {

  public static void main(String[] args) {
    printListAsUpperCase(Arrays.asList
                         (new String[]{"abba", "europe", "ace of base"})
                         );
  }

  public static void printListAsUpperCase(List<String> list) {
    for (String elem : list) {
      System.out.println(elem.toUpperCase());
    }
  }
}
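One caveat is worth knowing about: the list returned from Arrays.asList() is a fixed-size view of the array. You can read from it, but adding elements is not supported, and changes to the array show through in the list. Here's a sketch of that behaviour. The class AsListCaveat and the method addIsRejected are made up for the example:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AsListCaveat {

  // Returns true if the list refuses to have an element added to it
  public static boolean addIsRejected(List<String> list) {
    try {
      list.add("roxette");
      return false;
    } catch (UnsupportedOperationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    String[] bands = {"abba", "europe"};
    List<String> fixed = Arrays.asList(bands);
    System.out.println("add rejected by asList result: " + addIsRejected(fixed));

    bands[0] = "ace of base";   // writes through to the list
    System.out.println("fixed.get(0): " + fixed.get(0));

    // Copying into an ArrayList gives an independent list which can grow
    List<String> growable = new ArrayList<>(fixed);
    System.out.println("add rejected by ArrayList copy: " + addIsRejected(growable));
  }
}
```

For a method which only reads the list, like printListAsUpperCase(), none of this matters. If you need a list you can add to, wrap the result in a new ArrayList as in the last lines of main.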

Now, actually, Arrays.asList() uses so-called varargs arguments. Varargs is Java's mechanism for a variable number of arguments. The signature for the method is asList(T... a). It is a generic method and the complete method declaration reads public static <T> List<T> asList(T... a). The dots between T and a mean "zero or more arguments of type T". How has Java implemented varargs? The parameter a above is actually an array (which could be of length 0). This means that any method whose last parameter is an array can be converted to varargs. As a curiosity, you could actually write the main method like this: public static void main(String...args){/* some code */ }.

This means that we could simplify the code above, skipping the new String[]{} thingy. We could add a call to printListAsUpperCase() like this:

    printListAsUpperCase(Arrays.asList("apa", "bepa", "cepa", "depa"));

Arrays.asList() uses varargs, so the strings above would be converted to an array of String references, four long.

Note that it doesn't work the other way around. A method declared as accepting a normal array can't be called using the varargs syntax:

  public static void printInts(int[] ints) {
    for (int i : ints) {
      System.out.println(i);
    }
  }

The above cannot be called like this: printInts(1,2,3);, because the method requires a true array of ints. If we tried to call it like that, the compiler would complain:

error: method printInts in class ListExample cannot be applied to given types;
    printInts(1,2,3);
    ^
  required: int[]
  found: int,int,int
  reason: actual and formal argument lists differ in length
1 error

If you write a method accepting varargs, you will need to know how to find out how many arguments were passed and how to access them. Just consider the varargs parameter an array of the appropriate type, and you'll be fine. So now you have one more reason to learn the array syntax of Java.
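As a sketch of a varargs method of our own, here's a sum method which treats its parameter exactly like an array. The class VarargsDemo and the method sum are names made up for the example:

```java
public class VarargsDemo {

  // numbers is an int[] inside the method, possibly of length 0
  public static int sum(int... numbers) {
    int total = 0;
    for (int i = 0; i < numbers.length; i++) {
      total += numbers[i];
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println("sum(): " + sum());                 // no arguments at all
    System.out.println("sum(1, 2, 3): " + sum(1, 2, 3));   // varargs syntax
    System.out.println("sum(array): " + sum(new int[]{4, 5})); // a real array works too
  }
}
```

Calling sum() without arguments gives the method an array of length 0, so the loop is never entered and 0 is returned.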

For the really curious - how can we be sure about the length being an int variable?

We've seen numerous attempts by beginners trying to access the length variable as if it were in fact an instance method called size:

int size = someArray.size();

That's a pretty obvious and easy-to-understand beginner's mistake: confusing the length variable with the size() method of the various collections classes.

Arrays are objects, but they don't have a size() method. They don't have any methods of their own (except those inherited from java.lang.Object - the only overridden method of those is clone()). How can we be so sure? Well, as always when thinking about a Java construct, there is nothing stopping you from using the compiler to tell you the truth. Rather than guessing and making up our own mock-explanation of what's going on, we can pretty simply ask the compiler for leads and clues about the implementation of the language.

Let's try to compile the source code snippet above:

public class TonyArraya {
  public static void main(String[] args) {
    int[] someArray = new int[10];
    int length = someArray.size();
  }
}

Here's the result of that futile attempt:

$ javac TonyArraya.java 
TonyArraya.java:4: error: cannot find symbol
    int length = someArray.size();
                          ^
  symbol:   method size()
  location: variable someArray of type int[]
1 error

The key part of the error message is found in this information: cannot find symbol, symbol: method size(), location: variable someArray of type int[]

OK, from this we learn that there is no method called size() in objects of type int[]. If there isn't, there won't be any such method in any type of array. Try yourself with different types of arrays if you don't believe us.

To be sure it's not just the name we got wrong, we'll try to compile the same thing using the method call length() instead:

public class TonyArraya {
  public static void main(String[] args) {
    int[] someArray = new int[10];
    int length = someArray.length();
  }
}

Here's the result of that futile attempt:

$ javac TonyArraya.java 
TonyArraya.java:4: error: cannot find symbol
    int length = someArray.length();
                          ^
  symbol:   method length()
  location: variable someArray of type int[]
1 error

Same kind of problem, no such method length(). We're pretty sure now that we shouldn't use a method.

But what about the theory of it being an instance variable instead? Let's try to access someArray.size as if there were an instance variable called size and investigate any error messages from the know-it-all javac compiler:

public class TonyArraya {
  public static void main(String[] args) {
    int[] someArray = new int[10];
    int length = someArray.size;
  }
}

Here's the result of that futile attempt:

$ javac TonyArraya.java 
TonyArraya.java:4: error: cannot find symbol
    int length = someArray.size;
                          ^
  symbol:   variable size
  location: variable someArray of type int[]
1 error

Here, the key to understanding the error message is this part (which we've assembled for you): cannot find symbol, symbol: variable size, location: variable someArray of type int[].

OK, so it looked for a variable called size via the array reference but couldn't find such a variable. Let's try again with the correct variable name length, and verify that it now compiles.

public class TonyArraya {
  public static void main(String[] args) {
    int[] someArray = new int[10];
    int length = someArray.length;
  }
}

No compilation errors this time, which teaches us that there is indeed a variable called length in an array object.

$ javac TonyArraya.java
$

OK. Are you following? When accessing a member via a reference variable, adding parentheses after the member name will make javac interpret the member as a method. Using a plain name (like size or length) after the dot will make javac interpret the member as a variable.

Fine. But how can you be sure that it is a variable of type int (we did mention that length was a variable of type int, didn't we?). Well, once again, there is not much gained by guessing or making up a mock-theory about this. Let's try to assign the value of the length variable to a variable of type short. Now, if length is indeed of type int as we claimed, the compiler will get upset and complain about us trying to assign a value of type int to a mere short variable. As we all know, a 32-bit Java int will not fit inside a variable with only 16 bits of space (such as a short). Let's try!

public class TonyArraya {
  public static void main(String[] args) {
    int[] someArray = new int[10];
    short length = someArray.length;
  }
}

As you will see below, the compiler didn't like that a bit. (No pun intended):

$ javac TonyArraya.java 
TonyArraya.java:4: error: incompatible types: possible lossy conversion from int to short
    short length = someArray.length;
                            ^
1 error

The key part of the upset error message here was: error: incompatible types: possible lossy conversion from int to short. Aha! Thank you, javac, for being loose-lipped! Now we know that length is of type int (because you just said it was!).

Can we figure out all of the modifiers for the length variable? Not quite all of them (we could of course read the specification). We have already seen that length is declared public int. But is it final too? We know that final variables cannot be re-assigned, so that is trivial to test! Let's try to re-assign the length variable:

public class TonyArraya {
  public static void main(String[] args) {
    int[] someArray = new int[10];
    int length = someArray.length;
    someArray.length = 11;
  }
}

Let's inspect the error message:

$ javac TonyArraya.java
TonyArraya.java:7: error: cannot assign a value to final variable length
    someArray.length = 11;
             ^
1 error

The key part here is of course: "cannot assign a value to final variable length".

If we want to be real smart-asses (and who doesn't?) we could do this:

someArray.length = (long)11;

In that case, we'd get two pieces of information from the compiler:

$ javac TonyArraya.java
TonyArraya.java:7: error: cannot assign a value to final variable length
    someArray.length = (long)11;
             ^
TonyArraya.java:7: error: incompatible types: possible lossy conversion from long to int
    someArray.length = (long)11;
                       ^
2 errors

Using that smart-ass-ness, we'd learn that length

  • is a variable
  • declared final
  • of type int

Unfortunately, we can't use Java reflection to figure out the type etc. of the length variable. Arrays are indeed Java objects, and every array type has a Class object (int[].class, for instance), but reflection does not report length as a field of that class. We can, however, use the compiler to get all the clues, as done above.
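For the extra curious, here's a small sketch of what reflection does and doesn't report for an array. The class ArrayReflectionDemo and its two helper methods are our own names. The calls Class.getFields() and java.lang.reflect.Array.getLength() are part of the standard API:

```java
import java.lang.reflect.Array;

public class ArrayReflectionDemo {

  // Number of public fields which reflection reports for the given type
  public static int publicFieldCount(Class<?> type) {
    return type.getFields().length;
  }

  // Reads the length of any array object using the reflection helper class Array
  public static int lengthViaReflection(Object array) {
    return Array.getLength(array);
  }

  public static void main(String[] args) {
    int[] someArray = new int[10];
    System.out.println("is array: " + someArray.getClass().isArray());
    System.out.println("public fields: " + publicFieldCount(someArray.getClass()));
    System.out.println("length via reflection: " + lengthViaReflection(someArray));
  }
}
```

No public fields are reported for the array type, so length can't be inspected that way. The length value itself is still available through Array.getLength(), but that tells us nothing about modifiers such as final.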

Links

Source code

Some code from the above examples on our github repo

External links