19 Subarrays
One of Java’s most convenient features is the length
attribute associated with each array. Because of this feature, it isn’t necessary to pass both the array and its length to a method (as it is in some other languages). However, it leads to people creating methods with inflexible signatures. The subarray pattern is a way to remedy this shortcoming.
Motivation
Most of the examples of arrays that you have seen probably involve iterating over all of the elements. However, there are many situations in which you only need to iterate over some of the elements in the array. Unfortunately, because you have only/mostly been exposed to examples in which this isn’t the case, you may have started to use a pattern that makes this difficult (and, hence, is sometimes called an anti-pattern).
Review
Suppose you were asked to write a method that is passed an array of int
values and returns the total. You would probably use an accumulator (see Chapter 13) as in the following method:
public static int total(int[] data) { int result; result = 0; for (int i = 0; i < data.length; ++i) { result += data[i]; } return result; }
This method takes advantage of the fact that the number of iterations is determinate (or definite) and uses a for
loop. This method also takes advantage of the fact that the array has a length
attribute and, so, does not include it in its signature.[1]
The problem with this implementation is that it does not allow you to find the sum of a subset of the elements. For example, if the indexes represent months and the elements represent sales data, then you might want to find the total sales for only the second quarter (i.e., April, May, and June; 0-based months 3, 4, 5).
Thinking About The Problem
Obviously, what you need to do is add formal parameters to the method that describe the subset of interest. You could, for example, pass another array that contained the indexes to consider. Or, you could pass a conformal boolean[]
array that contains true
for the elements to use and false
for the elements to ignore. In practice, however, both of these solutions are more complicated than is necessary because the most common need is to iterate over a contiguous subset of the elements (i.e., loosely speaking, a subset that contains no “gaps”, or a subset in which the difference between two sequential is exactly 1).
The Pattern
The easiest way to solve this problem of iterating over a contiguous subset of the elements is to add two formal parameters, the index to start with and the size of the subset.[2] Traditionally, the index to start with is thought of as an offset from 0 and, hence, is named offset
. The size of the subset is traditionally named length
.
The example of monthly sales data can be illustrated as in Figure 19.1. As is apparent from this illustration, the bounds on the loop control variable are now going to be offset
and offset + length
. As is also apparent from the illustration, you have to be careful to use a strong inequality when comparing the loop control variable to the upper bound (i.e., <
rather that <=
) to avoid an off-by-one defect. So, the loop control variable will be initialized to offset
and the iterations will continue as long as the loop control variable is strictly less than offset + length
.
The one drawback of adding these parameters is that they must be included in every invocation of the method. To avoid this, you can add an overloaded version of the method that is only passed the array and invokes the three-parameter version, passing it 0
for offset
and the array’s length attribute for length
.
Examples
It’s trivially easy to create and find examples of this pattern.
The Motivating Example
Returning to the example above, the three-parameter version of the method is as follows:
public static double total(double[] data, int offset, int length) { double result; result = 0; for (int i = offset; i < offset + length; ++i) { result += data[i]; } return result; }
The one-parameter version then invokes the three-parameter version as follows:
public static double total(double[] data) { return total(data, 0, data.length); }
Examples in the Java Library
Examples of this pattern abound in the Java library. For example, this pattern is used by the fill()
methods and the copyOfRange()
method in the Arrays
class, and the arraycopy()
method in the System
class.
A Less Obvious Example
The same kind of thing can be done with rectangular arrays of arrays (sometimes called multidimensional arrays). In this case, the flexible method is as follows:
public static double total(double[][] data, int roffset, int coffset, int rlength, int clength) { double result; result = 0; for (int r = roffset; r < roffset + rlength; ++r) { for (int c = coffset; c < coffset + clength; ++c) { result += data[r][c]; } } return result; }
The one-parameter version then invokes the five-parameter version as follows:
public static double total(double[][] data) { return total(data, 0, 0, data.length, data[0].length); }
A Warning
It is possible for the invoker to pass an invalid offset
and/or invalid length
. Hence, you should validate these parameters and respond to invalid values appropriately.
There are two common responses to invalid values. One is to throw an IllegalArgumentException
. The other is to use 0
for values of offset
that are too small, the array’s length for values of offset
that are too large, and the array’s length for values of offset + length
that are too large.
- If you have not yet learned about the
length
attribute, you can instead invoke theArray.getLength()
, passing it the array. ↵ - You could, instead, have the second parameter contain the index to end with. The approach discussed here is more common, in part because it eliminates any confusion about whether the end element is or isn't included. ↵