Miscellaneous Goodies - Java SE 8 for the Really Impatient (2014)

Java SE 8 for the Really Impatient (2014)

Chapter 8. Miscellaneous Goodies

Topics in This Chapter

Image 8.1 Strings

Image 8.2 Number Classes

Image 8.3 New Mathematical Functions

Image 8.4 Collections

Image 8.5 Working with Files

Image 8.6 Annotations

Image 8.7 Miscellaneous Minor Changes

Image Exercises

Java 8 is a big release with important language and library enhancements. But there are also numerous smaller changes throughout the library that are quite useful. I pored over all @since 1.8 notes in the API documentation and grouped the changes into categories for easy reference. In this chapter, you will find what is new for strings, numbers, math, collections, files, annotations, regular expressions, and JDBC.

The key points of this chapter are:

• Joining strings with a delimiter is finally easy: String.join(", ", a, b, c) instead of a + ", " + b + ", " + c.

• Integer types now support unsigned arithmetic.

• The Math class has methods to detect integer overflow.

• Use Math.floorMod(x, n) instead of x % n if x might be negative.

• There are new mutators in Collection (removeIf) and List (replaceAll, sort).

• Files.lines lazily reads a stream of lines.

• Files.list lazily lists the entries of a directory, and Files.walk traverses them recursively.

• There is finally official support for Base64 encoding.

• Annotations can now be repeated and applied to type uses.

• Convenient support for null parameter checks can be found in the Objects class.

8.1. Strings

A common task is to combine several strings, separating them with a delimiter such as ", " or "/". This has now been added to Java 8. The strings can come from an array or an Iterable<? extends CharSequence>:

String joined = String.join("/", "usr", "local", "bin"); // "usr/local/bin"
System.out.println(joined);
String ids = String.join(", ", ZoneId.getAvailableZoneIds());
System.out.println(ids);
// All time zone identifiers, separated by commas

Think of join as the opposite of the String.split instance method. This is the only method added to the String class in Java 8.


Image NOTE

As already mentioned in Chapter 2, the CharSequence interface provides a useful instance method codePoints that returns a stream of Unicode values, and a less useful method chars that returns a stream of UTF-16 code units.


8.2. Number Classes

Ever since Java 5, each of the seven numeric primitive type wrappers (i.e., not Boolean) had a static SIZE field that gives the size of the type in bits. You will be glad to know that there is now a BYTES field that reports the size in bytes, for those who cannot divide by eight.

All eight primitive type wrappers now have static hashCode methods that return the same hash code as the instance method, but without the need for boxing.

The five types Short, Integer, Long, Float, and Double now have static methods sum, max, and min, which can be useful as reduction functions in stream operations. The Boolean class has static methods logicalAnd, logicalOr, and logicalXor for the same purpose.

Integer types now support unsigned arithmetic. For example, instead of having a Byte represent the range from –128 to 127, you can call the static method Byte.toUnsignedInt(b) and get a value between 0 and 255. In general, with unsigned numbers, you lose the negative values and get twice the range of positive values. The Byte and Short classes have methods toUnsignedInt, and Byte, Short, and Integer have methods toUnsignedLong.

The Integer and Long classes have methods compareUnsigned, divideUnsigned, and remainderUnsigned to work with unsigned values. You don’t need special methods for addition, subtraction, and multiplication. The + and - operators do the right thing already for unsigned values. Integer multiplication would overflow with unsigned integers larger than Integer.MAX_VALUE, so you should call toUnsignedLong and multiply them as long values.


Image NOTE

In order to work with unsigned numbers, you need to have a clear understanding of base-two arithmetic and the binary representation of negative numbers. In C and C++, mixing signed and unsigned types is a common cause of subtle errors. Java has wisely decided to stay away from this area, and has managed to live with only signed numbers for many years. The primary reason to use unsigned numbers is if you work with file formats or network protocols that require them.


The Float and Double classes have static methods isFinite. The call Double.isFinite(x) returns true if x is not infinity, negative infinity, or a NaN (not a number). In the past, you had to call the instance methods isInfinite and isNaN to get the same result.

Finally, the BigInteger class has instance methods (long|int|short|byte)ValueExact that return the value as a long, int, short, or byte, throwing an ArithmeticException if the value is not within the target range.

8.3. New Mathematical Functions

The Math class provides several methods for “exact” arithmetic that throw an exception when a result overflows. For example, 100000 * 100000 quietly gives the wrong result 1410065408, whereas multiplyExact(100000, 100000) throws an exception. The provided methods are (add|subtract|multiply|increment|decrement|negate)Exact, with int and long parameters. The toIntExact method converts a long to the equivalent int.

The floorMod and floorDiv methods aim to solve a long-standing problem with integer remainders. Consider the expression n % 2. Everyone knows that this is 0 if n is even and 1 if n is odd. Except, of course, when n is negative. Then it is –1. Why? When the first computers were built, someone had to make rules for how integer division and remainder should work for negative operands. Mathematicians had known the optimal (or “Euclidean”) rule for a few hundred years: always leave the remainder ≥ 0. But, rather than open a math textbook, those pioneers came up with rules that seemed reasonable but are actually inconvenient.

Consider this problem. You compute the position of the hour hand of a clock. An adjustment is applied, and you want to normalize to a number between 0 and 11. That is easy: (position + adjustment) % 12. But what if adjustment is negative? Then you might get a negative number. So you have to introduce a branch, or use ((position + adjustment) % 12 + 12) % 12. Either way, it is a hassle.

The new floorMod method makes it easier: floorMod(position + adjustment, 12) always yields a value between 0 and 11.


Image NOTE

Unfortunately, floorMod gives negative results for negative divisors, but that situation doesn’t often occur in practice.


The nextDown method, defined for both double and float parameters, gives the next smaller floating-point number for a given number. For example, if you promise to produce a number < b, but you happen to have computed exactly b, then you can return Math.nextDown(b). (The corresponding Math.nextUp method exists since Java 6.)


Image NOTE

All methods described in this section also exist in the StrictMath class.


8.4. Collections

The big change for the collections library is, of course, support for streams which you have seen in Chapter 2. There are some smaller changes as well.

8.4.1. Methods Added to Collection Classes

Table 8–1 shows miscellaneous methods added to collection classes and interfaces in Java 8, other than the stream, parallelStream, and spliterator methods.

Image

Table 8–1 Methods Added to Collection Classes and Interface in Java 8

You may wonder why the Stream interface has so many methods that accept lambda expressions but just one such method, removeIf, was added to the Collection interface. If you review the Stream methods, you will find that most of them return a single value or a stream of transformed values that are not present in the original stream. The exceptions are the filter and distinct methods. The removeIf method can be thought of as the opposite of filter, removing rather than producing all matches and carrying out the removal in place. The distinctmethod would be costly to provide on arbitrary collections.

The List interface has a replaceAll method, which is an in-place equivalent of map, and a sort method that is obviously useful.

The Map interface has a number of methods that are particularly important for maps accessed concurrently. See Chapter 6 for more information on these methods.

The Iterator interface has a forEachRemaining method that exhausts the iterator by feeding the remaining iterator elements to a function.

Finally, the BitSet class has a method that yields all members of the set as a stream of int values.

8.4.2. Comparators

The Comparator interface has a number of useful new methods, taking advantage of the fact that interfaces can now have concrete methods.

The static comparing method takes a “key extractor” function that maps a type T to a comparable type (such as String). The function is applied to the objects to be compared, and the comparison is then made on the returned keys. For example, suppose you have an array of Personobjects. Here is how you can sort them by name:

Arrays.sort(people, Comparator.comparing(Person::getName));

You can chain comparators with the thenComparing method for breaking ties. For example,

Arrays.sort(people,
Comparator.comparing(Person::getLastName)
.thenComparing(Person::getFirstName));

If two people have the same last name, then the second comparator is used.

There are a few variations of these methods. You can specify a comparator to be used for the keys that the comparing and thenComparing methods extract. For example, here we sort people by the length of their names:

Arrays.sort(people, Comparator.comparing(Person::getName,
(s, t) -> Integer.compare(s.length(), t.length())));

Moreover, both the comparing and thenComparing methods have variants that avoid boxing of int, long, or double values. An easier way of producing the preceding operation would be

Arrays.sort(people, Comparator.comparingInt(p -> p.getName().length()));

If your key function can return null, you will like the nullsFirst and nullsLast adapters. These static methods take an existing comparator and modify it so that it doesn’t throw an exception when encountering null values but ranks them as smaller or larger than regular values. For example, suppose getMiddleName returns a null when a person has no middle name. Then you can use Comparator.comparing(Person::getMiddleName(), Comparator.nullsFirst(...)).

The nullsFirst method needs a comparator—in this case, one that compares two strings. The naturalOrder method makes a comparator for any class implementing Comparable. A Comparator.<String>naturalOrder() is what we need. Here is the complete call for sorting by potentially null middle names. I use a static import of java.util.Comparator.*, to make the expression more legible. Note that the type for naturalOrder is inferred.

Arrays.sort(people, comparing(Person::getMiddleName,
nullsFirst(naturalOrder())));

The static reverseOrder method gives the reverse of the natural order. To reverse any comparator, use the reversed instance method. For example, naturalOrder().reversed() is the same as reverseOrder().

8.4.3. The Collections Class

Java 6 introduced NavigableSet and NavigableMap classes that take advantage of the ordering of the elements or keys, providing efficient methods to locate, for any given value v, the smallest element ≥ or > v, or the largest element ≤ or < v. Now the Collections class supports these classes as it does other collections, with methods (unmodifiable|synchronized|checked|empty)Navigable(Set|Map).

A checkedQueue wrapper, that has apparently been overlooked all these years, has also been added. As a reminder, the checked wrappers have a Class parameter and throw a ClassCastException when you insert an element of the wrong type. These classes are intended as debugging aids. Suppose you declare a Queue<Path>, and somewhere in your code there is a ClassCastException trying to cast a String to a Path. This could have happened because you passed the queue to a method void getMoreWork(Queue q) with no type parameter. Then, someone somewhere inserted a String into q. (Because the generic type was suppressed, the compiler could not detect that.) Much later, you took out that String, thinking it was a Path, and the error manifested itself. If you temporarily replace the queue with aCheckedQueue(new LinkedList<Path>, Path.class), then every insertion is checked at runtime, and you can locate the faulty insertion code.

Finally, there are emptySorted(Set|Map) methods that give lightweight instances of sorted collections, analogous to the empty(Set|Map) methods that have been around since Java 5.

8.5. Working with Files

Java 8 brings a small number of convenience methods that use streams for reading lines from files and for visiting directory entries. Also, there is finally an official way of performing Base64 encoding and decoding.

8.5.1. Streams of Lines

To read the lines of a file lazily, use the Files.lines method. It yields a stream of strings, one per line of input:

Stream<String> lines = Files.lines(path);
Optional<String> passwordEntry = lines.filter(s -> s.contains("password")).findFirst();

As soon as the first line containing password is found, no further lines are read from the underlying file.


Image NOTE

Unlike the FileReader class, which was a portability nightmare since it opened files in the local character encoding, the Files.lines method defaults to UTF-8. You can specify other encodings by supplying a Charset argument.


You will want to close the underlying file. Fortunately, the Stream interface extends AutoCloseable. The streams that you have seen in Chapter 2 didn’t need to close any resources. But the Files.lines method produces a stream whose close method closes the file. The easiest way to make sure the file is indeed closed is to use a Java 7 try-with-resources block:

try (Stream<String> lines = Files.lines(path)) {
Optional<String> passwordEntry
= lines.filter(s -> s.contains("password")).findFirst();
...
} // The stream, and hence the file, will be closed here

When a stream spawns another, the close methods are chained. Therefore, you can also write

try (Stream<String> filteredLines
= Files.lines(path).filter(s -> s.contains("password"))) {
Optional<String> passwordEntry = filteredLines.findFirst();
...
}

When filteredLines is closed, it closes the underlying stream, which closes the underlying file.


Image NOTE

If you want to be notified when the stream is closed, you can attach an onClose handler. Here is how you can verify that closing filteredLines actually closes the underlying stream:

try (Stream<String> filteredLines
= Files.lines(path).onClose(() -> System.out.println("Closing"))
.filter(s -> s.contains("password"))) { ... }


If an IOException occurs as the stream fetches the lines, that exception is wrapped into an UncheckedIOException which is thrown out of the stream operation. (This subterfuge is necessary because stream operations are not declared to throw any checked exceptions.)

If you want to read lines from a source other than a file, use the BufferedReader.lines method instead:

try (BufferedReader reader
= new BufferedReader(new InputStreamReader(url.openStream()))) {
Stream<String> lines = reader.lines();
...
}

With this method, closing the resulting stream does not close the reader. For that reason, you must place the BufferedReader object, and not the stream object, into the header of the try statement.


Image NOTE

Almost ten years ago, Java 5 introduced the Scanner class to replace the cumbersome BufferedReader. It is unfortunate that the Java 8 API designers decided to add the lines method to BufferedReader but not to Scanner.


8.5.2. Streams of Directory Entries

The static Files.list method returns a Stream<Path> that reads the entries of a directory. The directory is read lazily, making it possible to efficiently process directories with huge numbers of entries.

Since reading a directory involves a system resource that needs to be closed, you should use a try block:

try (Stream<Path> entries = Files.list(pathToDirectory)) {
...
}


Image NOTE

Under the hood, the stream uses a DirectoryStream, a construct introduced in Java 7 for efficient traversal of huge directories. That interface has nothing to do with Java 8 streams; it extends Iterable so that it can be used in an enhanced for loop.

try (DirectoryStream stream = Files.newDirectoryStream(pathToDirectory)) {
for (Path entry : stream) {
...
}
}

In Java 8, just use Files.list.


The list method does not enter subdirectories. To process all descendants of a directory, use the Files.walk method instead.

try (Stream<Path> entries = Files.walk(pathToRoot)) {
// Contains all descendants, visited in depth-first order
}

You can limit the depth of the tree that you want to visit by calling Files.walk(pathToRoot, depth). Both walk methods have a varargs parameter of type FileVisitOption..., but there is currently only one option you can supply: FOLLOW_LINKS to follow symbolic links.


Image NOTE

If you filter the paths returned by walk and your filter criterion involves the file attributes stored with a directory, such as size, creation time, or type (file, directory, symbolic link), then use the find method instead of walk. Call that method with a predicate function that accepts a path and a BasicFileAttributes object. The only advantage is efficiency. Since the directory is being read anyway, the attributes are readily available.


8.5.3. Base64 Encoding

The Base64 encoding encodes a sequence of bytes into a (longer) sequence of printable ASCII characters. It is used for binary data in email messages and “basic” HTTP authentication. For many years, the JDK had a nonpublic (and therefore unusable) class java.util.prefs.Base64and an undocumented class sun.misc.BASE64Encoder. Finally, Java 8 provides a standard encoder and decoder.

The Base64 encoding uses 64 characters to encode six bits of information:

• 26 uppercase letters A . . . Z

• 26 lowercase letters a . . . z

• 10 digits 0 . . . 9

• 2 symbols, + and / (basic) or - and _ (URL- and filename-safe variant)

Normally, an encoded string has no line breaks, but the MIME standard used for email requires a "\r\n" line break every 76 characters.

For encoding, request a Base64.Encoder with one of the static methods getEncoder, getUrlEncoder, or getMimeEncoder of the Base64 class.

That class has methods to encode an array of bytes or a NIO ByteBuffer. For example,

Base64.Encoder encoder = Base64.getEncoder();
String original = username + ":" + password;
String encoded = encoder.encodeToString(original.getBytes(StandardCharsets.UTF_8));

Alternatively, you can “wrap” an output stream, so that all data sent to it is automatically encoded.

Path originalPath = ..., encodedPath = ...;
Base64.Encoder encoder = Base64.getMimeEncoder();
try (OutputStream output = Files.newOutputStream(encodedPath)) {
Files.copy(originalPath, encoder.wrap(output));
}

To decode, reverse these operations:

Path encodedPath = ..., decodedPath = ...;
Base64.Decoder decoder = Base64.getMimeDecoder();
try (InputStream input = Files.newInputStream(encodedPath)) {
Files.copy(decoder.wrap(input), decodedPath);
}

8.6. Annotations

Annotations are tags inserted into the source code that some tools can process. In Java SE, annotations are used for simple purposes, such as marking deprecated features or suppressing warnings. Annotations have a much more important role in Java EE where they are used to configure just about any aspect of an application, replacing painful boilerplate code and XML customization that was the bane of older Java EE versions.

Java 8 has two enhancements to annotation processing: repeated annotations and type use annotations. Moreover, reflection has been enhanced to report method parameter names. This has the potential to simplify annotations on method parameters.

8.6.1. Repeated Annotations

When annotations were first created, they were envisioned to mark methods and fields for processing, for example,

@PostConstruct public void fetchData() { ... } // Call after construction
@Resource("jdbc:derby:sample") private Connection conn;
// Inject resource here

In this context, it made no sense to apply the same annotation twice. You can’t inject a field in two ways. Of course, different annotations on the same element are fine and quite common:

@Stateless @Path("/service") public class Service { ... }

Soon, more and more uses for annotations emerged, leading to situations where one would have liked to repeat the same annotation. For example, to denote a composite key in a database, you need to specify multiple columns:

@Entity
@PrimaryKeyJoinColumn(name="ID")
@PrimaryKeyJoinColumn(name="REGION")
public class Item { ... }

Since that wasn’t possible, the annotations were packed into a container annotation, like this:

@Entity
@PrimaryKeyJoinColumns({
@PrimaryKeyJoinColumn(name="ID")
@PrimaryKeyJoinColumn(name="REGION")
})
public class Item { ... }

That’s pretty ugly, and it is no longer necessary in Java 8.

As an annotation user, that is all you need to know. If your framework provider has enabled repeated annotations, you can just use them.

For a framework implementor, the story is not quite as simple. After all, the AnnotatedElement interface has a method

public <T extends Annotation> T getAnnotation(Class<T> annotationClass)

that gets the annotation of type T, if present. What should that method do if multiple annotations of the same type are present? Return the first one only? That could have all sorts of undesirable behavior with legacy code.

To solve this problem, the inventor of a repeatable annotation must

1. Annotate the annotation as @Repeatable

2. Provide a container annotation

For example, for a simple unit testing framework, we might define a repeatable @TestCase annotation, to be used like this:

@TestCase(params="4", expected="24")
@TestCase(params="0", expected="1")
public static long factorial(int n) { ... }

Here is how the annotation can be defined:

@Repeatable(TestCases.class)
@interface TestCase {
String params();
String expected();
}

@interface TestCases {
TestCase[] value();
}

Whenever the user supplies two or more @TestCase annotations, they are automatically wrapped into a @TestCases annotation.

When annotation processing code calls element.getAnnotation(TestCase.class) on the element representing the factorial method, null is returned. This is because the element is actually annotated with the container annotation TestCases.

When implementing an annotation processor for your repeatable annotation, you will find it simpler to use the getAnnotationsByType method. The call element.getAnnotationsByType(TestCase.class) “looks through” any TestCases container and gives you an array of TestCase annotations.


Image NOTE

What I just described relates to processing runtime annotations with the reflection API. If you process source-level annotations, you use the javax.lang.model and javax.annotation.processing APIs. In those APIs, there is no support for “looking through” a container. You will need to process both the individual annotation (if it is supplied once) and the container (if the same annotation is supplied more than once).


8.6.2. Type Use Annotations

Prior to Java 8, an annotation was applied to a declaration. A declaration is a part of code that introduces a new name. Here are a couple of examples, with the declared name in bold:

@Entity public class Person { ... }
@SuppressWarnings("unchecked") List<Person> people = query.getResultList();

In Java 8, you can annotate any type use. This can be useful in combination with tools that check for common programming errors. One common error is throwing a NullPointerException because the programmer didn’t anticipate that a reference might be null. Now suppose you annotated variables that you never want to be null as @NonNull. A tool can check that the following is correct:

private @NonNull List<String> names = new ArrayList<>();
...
names.add("Fred"); // No possibility of a NullPointerException

Of course, the tool should detect any statement that might cause names to become null:

names = null; // Null checker flags this as an error
names = readNames(); // OK if readNames returns a @NonNull String

It sounds tedious to put such annotations everywhere, but in practice, some of the drudgery can be avoided by simple heuristics. The null checker in the Checker Framework (http://types.cs.washington.edu/checker-framework) assumes that any nonlocal variables are implicitly @NonNull, but that local variables might be null unless the code shows otherwise. If a method may return a null, it needs to be annotated as @Nullable. That may not be any worse than documenting the nullness behavior. (The Java API documentation has over 5,000 occurrences of “NullPointerException.”)

In the preceding example, the names variable was declared as @NonNull. That annotation was possible before Java 8. But how can one express that the list elements should be non-null? Logically, that would be

private List<@NonNull String> names;

It is this kind of annotation that was not possible before Java 8 but has now become legal.


Image NOTE

These annotations are not part of standard Java. There are currently no standard annotations that are meaningful for type use. All examples in this section come from the Checker Framework or from the author’s imagination.


Type use annotations can appear in the following places:

• With generic type arguments: List<@NonNull String>, Comparator.<@NonNull String> reverseOrder().

• In any position of an array: @NonNull String[][] words (words[i][j] is not null), String @NonNull [][] words (words is not null), String[] @NonNull [] words (words[i] is not null).

• With superclasses and implemented interfaces: class Image implements @Rectangular Shape.

• With constructor invocations: new @Path String("/usr/bin").

• With casts and instanceof checks: (@Path String) input, if (input instanceof @Path String). (The annotations are only for use by external tools. They have no effect on the behavior of a cast or an instanceof check.)

• With exception specifications: public Person read() throws @Localized IOException.

• With wildcards and type bounds: List<@ReadOnly ? extends Person>, List<? extends @ReadOnly> Person.

• With method and constructor references: @Immutable Person::getName.

There are a few type positions that cannot be annotated:

@NonNull String.class // Illegal—cannot annotate class literal
import java.lang.@NonNull String; // Illegal—cannot annotate import

It is also impossible to annotate an annotation. For example, given @NonNull String name, you cannot annotate @NonNull. (You can supply a separate annotation, but it would apply to the name declaration.)

The practical use of these annotations hinges on the viability of the tools. If you are interested in exploring the potential of extended type checking, a good place to start is the Checker Framework tutorial at http://types.cs.washington.edu/checker-framework/tutorial.

8.6.3. Method Parameter Reflection

The names of parameters are now available through reflection. That is promising because it can reduce annotation boilerplate. Consider a typical JAX-RS method

Person getEmployee(@PathParam("dept") Long dept, @QueryParam("id") Long id)

In almost all cases, the parameter names are the same as the annotation arguments, or they can be made to be the same. If the annotation processor could read the parameter names, then one could simply write

Person getEmployee(@PathParam Long dept, @QueryParam Long id)

This is possible in Java 8, with the new class java.lang.reflect.Parameter.

Unfortunately, for the necessary information to appear in the classfile, the source must be compiled as javac -parameters SourceFile.java. Let’s hope annotation writers will enthusiastically embrace this mechanism, so there will be momentum to drop that compiler flag in the future.

8.7. Miscellaneous Minor Changes

We end this chapter with miscellaneous minor changes that you might find useful. This section covers the new features of the Objects, Logger, and Locale classes, as well as changes to regular expressions and JDBC.

8.7.1. Null Checks

The Objects class has static predicate methods isNull and nonNull that can be useful for streams. For example,

stream.anyMatch(Object::isNull)

checks whether a stream contains a null, and

stream.filter(Object::nonNull)

gets rid of all of them.

8.7.2. Lazy Messages

The log, logp, severe, warning, info, config, fine, finer, and finest methods of the java.util.Logger class now support lazily constructed messages.

For example, consider the call

logger.finest("x: " + x + ", y:" + y);

The message string is formatted even when the logging level is such that it would never be used. Instead, use

logger.finest(() -> "x: " + x + ", y:" + y);

Now the lambda expression is only evaluated at the FINEST logging level, when the cost of the additional lambda invocation is presumably the least of one’s problems.

The requireNonNull of the Objects class (which is described in Chapter 9) also has a version that computes the message string lazily.

this.directions = Objects.requireNonNull(directions,
() -> "directions for " + this.goal + " must not be null");

In the common case that directions is not null, this.directions is simply set to directions. If directions is null, the lambda is invoked, and a NullPointerException is thrown whose message is the returned string.

8.7.3. Regular Expressions

Java 7 introduced named capturing groups. For example, a valid regular expression is

(?<city>[\p{L} ]+),\s*(?<state>[A-Z]{2})

In Java 8, you can use the names in the start, end, and group methods of Matcher:

Matcher matcher = pattern.matcher(input);
if (matcher.matches()) {
String city = matcher.group("city");
...
}

The Pattern class has a splitAsStream method that splits a CharSequence along a regular expression:

String contents = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
Stream<String> words = Pattern.compile("[\\P{L}]+").splitAsStream(contents);

All nonletter sequences are word separators.

The method asPredicate can be used to filter strings that match a regular expression:

Stream<String> acronyms = words.filter(Pattern.compile("[A-Z]{2,}").asPredicate());

8.7.4. Locales

A locale specifies everything you need to know to present information to a user with local preferences concerning language, date formats, and so on. For example, an American user prefers to see a date formatted as December 24, 2013 or 12/24/2013, whereas a German user expects 24. Dezember 2013 or 24.12.2013.

It used to be that locales were simple, consisting of location, language, and (for a few oddball cases, such as the Norwegians who have two spelling systems) variants. But those oddball cases mushroomed, and the Internet Engineering Task Force issued its “Best Current Practices” memo BCP 47 (http://tools.ietf.org/html/bcp47) to bring some order into the chaos. Nowadays, a locale is composed of up to five components.

1. A language, specified by two or three lowercase letters, such as en (English) or de (German or, in German, Deutsch).

2. A script, specified by four letters with an initial uppercase, such as Latn (Latin), Cyrl (Cyrillic), or Hant (traditional Chinese characters). This is useful because some languages, such as Serbian, are written in Latin or Cyrillic, and some Chinese readers prefer the traditional over the simplified characters.

3. A country, specified by two uppercase letters or three digits, such as US (United States) or CH (Switzerland).

4. Optionally, a variant. Variants are not common any more. For example, the Nynorsk spelling of Norwegian is now expressed with a different language code, nn, instead of a variant NY of the language no.

5. Optionally, an extension. Extensions describe local preferences for calendars (such as the Japanese calendar), numbers (Thai digits), and so on. The Unicode standard specifies some of these extensions. Such extensions start with u- and a two-letter code specifying whether the extension deals with the calendar (ca), numbers (nu), and so on. For example, the extension u-nu-thai denotes the use of Thai numerals. Other extensions are entirely arbitrary and start with x-, such as x-java.

You can still construct a locale the old-fashioned way, such as new Locale("en", "US"), but since Java 7 you can simply call Locale.forLanguageTag("en-US"). Java 8 adds methods for finding locales that match user needs.

A language range is a string that denotes the locale characteristics that a user desires, with * for wildcards. For example, a German speaker in Switzerland might prefer anything in German, followed by anything in Switzerland. This is expressed with two Locale.LanguageRange objects specified with strings "de" and "*-CH". One can optionally specify a weight between 0 and 1 when constructing a Locale.LanguageRange.

Given a list of weighted language ranges and a collection of locales, the filter method produces a list of matching locales, in descending order of match quality:

List<Locale.LanguageRange> ranges = Stream.of("de", "*-CH")
.map(Locale.LanguageRange::new)
.collect(Collectors.toList());
// A list containing the Locale.LanguageRange objects for the given strings
List<Locale> matches = Locale.filter(ranges,
Arrays.asList(Locale.getAvailableLocales()));
// The matching locales: de, de-CH, de-AT, de-LU, de-DE, de-GR, fr-CH, it_CH

The static lookup method just finds the best locale:

Locale bestMatch = Locale.lookup(ranges, locales);

In this case, the best match is de, which isn’t very interesting. But if locales contains a more restricted set of locales, such as those in which a document was available, then this mechanism can be useful.

8.7.5. JDBC

In Java 8, JDBC has been updated to version 4.2. There are a few minor changes.

The Date, Time, and Timestamp classes in the java.sql package have methods to convert from and to their java.time analogs LocalDate, LocalTime, and LocalDateTime.

The Statement class has a method executeLargeUpdate for executing an update whose row count exceeds Integer.MAX_VALUE.

JDBC 4.1 (which was a part of Java 7) specified a generic method getObject(column, type) for Statement and ResultSet, where type is a Class instance. For example, URL url = result.getObject("link", URL.class) retrieves a DATALINK as a URL. Now the corresponding setObject method is provided as well.

Exercises

1. Write a program that adds, subtracts, divides, and compares numbers between 0 and 232 – 1, using int values and unsigned operations. Show why divideUnsigned and remainderUnsigned are necessary.

2. For which integer n does Math.negateExact(n) throw an exception? (Hint: There is only one.)

3. Euclid’s algorithm (which is over two thousand years old) computes the greatest common divisor of two numbers as gcd(a, b) = a if b is zero, and gcd(b, rem(a, b)) otherwise, where rem is the remainder. Clearly, the gcd should not be negative, even if a or b are (since its negation would then be a greater divisor). Implement the algorithm with %, floorMod, and a rem function that produces the mathematical (non-negative) remainder. Which of the three gives you the least hassle with negative values?

4. The Math.nextDown(x) method returns the next smaller floating-point number than x, just in case some random process hit x exactly, and we promised a number < x. Can this really happen? Consider double r = 1 - generator.nextDouble(), where generator is an instance of java.util.Random. Can it ever yield 1? That is, can generator.nextDouble() ever yield 0? The documentation says it can yield any value between 0 inclusive and 1 exclusive. But, given that there are 253 such floating-point numbers, will you ever get a zero? Indeed, you do. The random number generator computes the next seed as next(s) = s · m + a % N, where m = 25214903917, a = 11, and N = 248. The inverse of m modulo N is v = 246154705703781, and therefore you can compute the predecessor of a seed as prev(s) = (sa) · v% N. To make a double, two random numbers are generated, and the top 26 and 27 bits are taken each time. When s is 0, next(s) is 11, so that’s what we want to hit: two consecutive numbers whose top bits are zero. Now, working backwards, let’s start with s = prev(prev(prev(0))). Since the Random constructor sets s = (initialSeed ^ m) % N, offer it s = prev(prev(prev(0))) ^ m = 164311266871034, and you’ll get a zero after two calls to nextDouble. But that is still too obvious. Generate a million predecessors, using a stream of course, and pick the minimum seed. Hint: You will get a zero after 376050 calls to nextDouble.

5. At the beginning of Chapter 2, we counted long words in a list as words.stream().filter(w -> w.length() > 12).count(). Do the same with a lambda expression, but without using streams. Which operation is faster for a long list?

6. Using only methods of the Comparator class, define a comparator for Point2D which is a total ordering (that is, the comparator only returns zero for equal objects). Hint: First compare the x-coordinates, then the y-coordinates. Do the same for Rectangle2D.

7. Express nullsFirst(naturalOrder()).reversed() without calling reversed.

8. Write a program that demonstrates the benefits of the CheckedQueue class.

9. Write methods that turn a Scanner into a stream of words, lines, integers, or double values. Hint: Look at the source code for BufferedReader.lines.

10. Unzip the src.zip file from the JDK. Using Files.walk, find all Java files that contain the keywords transient and volatile.

11. Write a program that gets the contents of a password-protected web page. Call URLConnection connection = url.openConnection();. Form the string username:password and encode it in Base64. Then callconnection.setRequestProperty("Authorization", "Basic " + encoded string), followed by connection.connect() and connection.getInputStream().

12. Implement the TestCase annotation and a program that loads a class with such annotations and invokes the annotated methods, checking whether they yield the expected values. Assume that parameters and return values are integers.

13. Repeat the preceding exercise, but build a source-level annotation processor emitting a program that, when executed, runs the tests in its main method. (See Horstmann and Cornell, Core Java, 9th Edition, Volume 2, Section 10.6 for an introduction into processing source-level annotations.)

14. Demonstrate the use of the Objects.requireNonNull method and show how it leads to more useful error messages.

15. Using Files.lines and Pattern.asPredicate, write a program that acts like the grep utility, printing all lines that contain a match for a regular expression.

16. Use a regular expression with named capturing groups to parse a line containing a city, state, and zip code. Accept both 5- and 9-digit zip codes.