
Everything about Java 8 - TechEmpower Blog


URL:http://www.techempower.com/blog/2013/03/26/everything-about-java-8/


The following post is a comprehensive summary of the developer-facing changes coming in Java 8. This next iteration of the JDK is currently scheduled for general availability in September 2013.

At the time of this writing, Java 8 development is still very much in progress. Language features and APIs may still change. I'll do my best to keep this document up to date.

Preview builds of Java 8, specifically the "Project Lambda" builds, can be downloaded from java.net: Java™ Platform, Standard Edition 8 Early Access with Lambda Support

I used preview builds of IntelliJ for my IDE. It had the best support for the Java 8 language features at the time I went looking. You can find those builds here: IntelliJ IDEA EAP.

The Java 8 javadocs are hosted locally for the time being because I don't know of an official, Oracle-hosted location for them. When Oracle eventually hosts them, I'll change the links to point to the official home for the javadocs.

Interfaces can now define static methods. A common scenario in Java libraries is, for some interface Foo, there would be a companion utility class Foos with static methods for generating or working with Foo instances. Now that static methods can exist on interfaces, in many cases the Foos utility class can go away (or be made package-private), with its public methods going on the interface instead.

More importantly, interfaces can now define default methods. For instance, a forEach method was added to java.lang.Iterable:

public default void forEach(Consumer<? super T> action) {
    Objects.requireNonNull(action);
    for (T t : this) {
        action.accept(t);
    }
}

In the past it was essentially impossible for Java libraries to add methods to interfaces. Adding a method to an interface would mean breaking all existing code that implements the interface. Now, as long as a sensible default implementation of a method can be provided, library maintainers can add methods to these interfaces.
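As a minimal sketch of both features, consider a hypothetical Greeter interface (none of these names come from the JDK). LegacyGreeter stands in for code written before greet() was added; it still compiles and picks up the default. The static factory plays the role a separate "Greeters" utility class would have played.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical library interface, invented for illustration.
interface Greeter {
    // The original abstract method that every implementor already provides.
    String name();

    // Added later as a default method, so existing implementors keep compiling.
    default String greet() {
        return "Hello, " + name() + "!";
    }

    // Static factory that would previously have lived in a companion utility class.
    static Greeter of(String name) {
        return () -> name;
    }
}

// An "old" implementation that knows nothing about greet().
class LegacyGreeter implements Greeter {
    @Override
    public String name() {
        return "legacy";
    }
}

class DefaultMethodDemo {
    static List<String> greetings() {
        List<String> out = new ArrayList<>();
        out.add(new LegacyGreeter().greet());   // uses the inherited default
        out.add(Greeter.of("world").greet());   // uses the static factory
        return out;
    }

    public static void main(String[] args) {
        greetings().forEach(System.out::println);
    }
}
```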

In Java 8, a large number of default methods have been added to core JDK interfaces. I'll discuss many of them later.

Why can't default methods override equals, hashCode, and toString?

An interface cannot provide a default implementation for any of the methods of the Object class. In particular, this means one cannot provide a default implementation for equals, hashCode, or toString from within an interface.

This seems odd at first, given that some interfaces actually define their equals behavior in documentation. The List interface is an example. So, why not allow this?

Brian Goetz gave four reasons in a lengthy response on the matter. I'll only describe one here, because that one was enough to convince me:

It would become more difficult to reason about when a default method is invoked. Right now it's simple: if a class implements a method, that always wins over a default implementation. Since all instances of interfaces are Objects, all instances of interfaces have non-default implementations of equals/hashCode/toString already. Therefore, a default version of these on an interface is always useless, and it may as well not compile.

For further reading, see this explanation written by Brian Goetz: response to "Allow default methods to override Object's methods"

A core concept introduced in Java 8 is that of a "functional interface". An interface is a functional interface if it defines exactly one abstract method. For instance, java.lang.Runnable is a functional interface because it only defines one abstract method:

public abstract void run();

Note that the "abstract" modifier is implied because the method lacks a body. It is not necessary to specify the "abstract" modifier, as this code does, in order to qualify as a functional interface.

Default methods are not abstract, so a functional interface can define as many default methods as it likes.

A new annotation, @FunctionalInterface, has been introduced. It can be placed on an interface to declare the intention of it being a functional interface. It will cause the interface to refuse to compile unless you've managed to make it a functional interface. It's sort of like @Override in this way; it declares intention and doesn't allow you to use it incorrectly.
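Here is a small sketch of an annotated functional interface, using a hypothetical Transformer type. It has exactly one abstract method plus a default method, so it still qualifies, and it is instantiated with a lambda (the syntax is described in the paragraphs that follow).

```java
// Hypothetical interface, invented for illustration.
@FunctionalInterface
interface Transformer {
    // The single abstract method.
    String apply(String input);

    // Default methods do not count against the "exactly one abstract method" rule.
    default Transformer twice() {
        return s -> apply(apply(s));
    }
}

class FunctionalInterfaceDemo {
    static String shout(String s) {
        Transformer exclaim = x -> x + "!";
        return exclaim.twice().apply(s);
    }

    public static void main(String[] args) {
        System.out.println(shout("hi"));
    }
}
```

Adding a second abstract method to Transformer would turn the annotation into a compile error, which is the point of declaring it.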

An extremely valuable property of functional interfaces is that they can be instantiated using lambdas. Here are a few examples of lambdas:

Comma-separated list of inputs with specified types on the left, a block with a return on the right:

(int x, int y) -> { return x + y; }

Comma-separated list of inputs with inferred types on the left, a return value on the right:

(x, y) -> x + y

Single parameter with inferred type on the left, a return value on the right:

x -> x * x

No inputs on left (official name: "burger arrow"), return value on the right:

() -> x

Single parameter with inferred type on the left, a block with no return (void return) on the right:

x -> { System.out.println(x); }

Static method reference:

String::valueOf

Non-static method reference:

Object::toString

Capturing method reference:

x::toString

Constructor reference:

ArrayList::new

The method reference forms are shorthand for the other forms.

Method reference    Equivalent lambda expression
String::valueOf     x -> String.valueOf(x)
Object::toString    x -> x.toString()
x::toString         () -> x.toString()
ArrayList::new      () -> new ArrayList()
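To make the equivalences concrete, the sketch below assigns each method reference and its lambda counterpart to a java.util.function type and checks that they behave the same. The target types (Function<Object, String>, Supplier<...>) are my choice for the illustration; other compatible functional interfaces would work too.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

class MethodReferenceDemo {
    static List<String> results() {
        Object x = 42; // captured by the "capturing" method reference below

        Function<Object, String> staticRef = String::valueOf;
        Function<Object, String> staticLambda = v -> String.valueOf(v);

        Function<Object, String> unboundRef = Object::toString;
        Function<Object, String> unboundLambda = v -> v.toString();

        Supplier<String> boundRef = x::toString;
        Supplier<String> boundLambda = () -> x.toString();

        Supplier<List<String>> ctorRef = ArrayList::new;
        Supplier<List<String>> ctorLambda = () -> new ArrayList<>();

        return Arrays.asList(
                staticRef.apply(7), staticLambda.apply(7),
                unboundRef.apply(7), unboundLambda.apply(7),
                boundRef.get(), boundLambda.get(),
                // Two freshly constructed empty lists compare equal.
                String.valueOf(ctorRef.get().equals(ctorLambda.get())));
    }

    public static void main(String[] args) {
        System.out.println(results());
    }
}
```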

A lambda is compatible with a given functional interface when their "shapes" match. By "shapes", I'm referring to the types of the inputs, outputs, and declared checked exceptions.

To give a couple of concrete, valid examples:

Comparator<String> c = (a, b) -> Integer.compare(a.length(),
 b.length());

A Comparator<String>'s compare method takes two strings as input, and returns an int. That's consistent with the lambda on the right, so this assignment is valid.

Runnable r = () -> { System.out.println("Running!"); };

A Runnable's run method takes no arguments and does not have a return value. That's consistent with the lambda on the right, so this assignment is valid.

The checked exceptions (if present) in the abstract method's signature matter too. The lambda can only throw a checked exception if the functional interface declares that exception in its signature.
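A sketch of the checked-exception rule, using a hypothetical IOAction interface whose single method declares IOException. The lambda body may call Appendable.append, which throws IOException, only because the interface's signature allows it; the caller of run() then has to handle the exception.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;

class CheckedLambdaDemo {
    // Hypothetical functional interface, invented for illustration.
    @FunctionalInterface
    interface IOAction {
        void run() throws IOException;
    }

    static String write(String text) {
        StringWriter out = new StringWriter();
        Appendable target = out;

        // Legal: IOAction.run() declares IOException, matching what append may throw.
        IOAction action = () -> target.append(text);

        try {
            action.run();
        } catch (IOException e) {
            // The caller of run() is the one who must deal with the checked exception.
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(write("abc"));
    }
}
```

Assigning the same lambda to a Runnable would not compile, because Runnable.run() declares no checked exceptions.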

Capturing versus non-capturing lambdas

Lambdas are said to be "capturing" if they access a non-static variable or object that was defined outside of the lambda body. For example, this lambda captures the variable x:

int x = 5;
return y -> x + y;

In order for this lambda declaration to be valid, the variables it captures must be "effectively final". So, either they must be marked with the final modifier, or they must not be modified after they're assigned.

Whether a lambda is capturing or not has implications for performance. A non-capturing lambda is generally going to be more efficient than a capturing one. Although this is not defined in any specifications (as far as I know), and you shouldn't count on it for a program's correctness, a non-capturing lambda only needs to be evaluated once. From then on, it will return an identical instance. Capturing lambdas need to be evaluated every time they're encountered, and currently that performs much like instantiating a new instance of an anonymous class.
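For comparison, here is a short sketch with one capturing and one non-capturing lambda. The class and method names are made up for the example. It deliberately makes no claim about instance identity, since that behavior is not specified.

```java
import java.util.function.IntUnaryOperator;

class CaptureDemo {
    // Capturing: the lambda reads the effectively final parameter x.
    static IntUnaryOperator adder(int x) {
        return y -> x + y;
    }

    // Non-capturing: the lambda only uses its own parameter.
    static IntUnaryOperator doubler() {
        return y -> y * 2;
    }

    public static void main(String[] args) {
        System.out.println(adder(5).applyAsInt(3));
        System.out.println(doubler().applyAsInt(4));
    }
}
```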

What lambdas don't do

There are a few features that lambdas don't provide, which you should keep in mind. They were considered for Java 8 but were not included, for simplicity and due to time constraints.

Non-final variable capture - If a variable is assigned a new value, it can't be used within a lambda. This code does not compile:

int count = 0;
List<String> strings = Arrays.asList("a", "b", "c");
strings.forEach(s -> {
    count++; // error: can't modify the value of count
});
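Two common ways around this restriction are sketched below. This is my own illustration rather than anything from the lambda proposal: either capture a mutable holder such as AtomicInteger (the reference is effectively final even though its contents change), or avoid the side effect entirely with a stream operation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class CounterWorkaroundDemo {
    // The captured variable is the holder, which is never reassigned.
    static int countWithHolder(List<String> strings) {
        AtomicInteger count = new AtomicInteger();
        strings.forEach(s -> count.incrementAndGet());
        return count.get();
    }

    // Often simpler: let the stream compute the result.
    static long countWithStream(List<String> strings) {
        return strings.stream().count();
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("a", "b", "c");
        System.out.println(countWithHolder(strings));
        System.out.println(countWithStream(strings));
    }
}
```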

Exception transparency - If a checked exception may be thrown from inside a lambda, the functional interface must also declare that the checked exception can be thrown. The exception is not propagated to the containing method. This code does not compile:

void appendAll(Iterable<String> values, Appendable out)
        throws IOException { // doesn't help with the error
    values.forEach(s -> {
        out.append(s); // error: can't throw IOException here
                       // Consumer.accept(T) doesn't allow it
    });
}

There are ways to work around this, where you can define your own functional interface that extends Consumer and sneaks the IOException through as a RuntimeException. I tried this out in code and found it to be too confusing to be worthwhile.
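For reference, here is a sketch of a simpler variation than the one described above. Instead of a custom Consumer subtype, it wraps the IOException in java.io.UncheckedIOException inside the lambda and unwraps it outside. This is a different technique, shown only to illustrate the smuggling idea. It assumes the Java 8 API as released.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

class AppendAllDemo {
    static void appendAll(Iterable<String> values, Appendable out) throws IOException {
        try {
            values.forEach(s -> {
                try {
                    out.append(s);
                } catch (IOException e) {
                    // Consumer.accept can't throw IOException, so wrap it.
                    throw new UncheckedIOException(e);
                }
            });
        } catch (UncheckedIOException e) {
            // Restore the original checked exception for callers.
            throw e.getCause();
        }
    }

    static String demo() {
        StringBuilder sb = new StringBuilder();
        try {
            appendAll(Arrays.asList("a", "b", "c"), sb);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```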

Control flow (break, early return) - In the forEach examples above, a traditional continue is possible by placing a "return;" statement within the lambda. However, there is no way to break out of the loop or return a value as the result of the containing method from within the lambda. For example:

final String secret = "foo";
boolean containsSecret(Iterable<String> values) {
    values.forEach(s -> {
        if (secret.equals(s)) {
            ??? // want to end the loop and return true, but can't
        }
    });
}
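When early exit is needed, two alternatives to forEach are sketched here: a plain for-each loop, or a short-circuiting stream operation such as anyMatch. The method names are mine, and the stream variant takes a Collection so that stream() is available.

```java
import java.util.Arrays;
import java.util.Collection;

class ContainsSecretDemo {
    private static final String SECRET = "foo";

    // A traditional loop can return from the containing method directly.
    static boolean containsSecretLoop(Iterable<String> values) {
        for (String s : values) {
            if (SECRET.equals(s)) {
                return true;
            }
        }
        return false;
    }

    // anyMatch stops at the first matching element.
    static boolean containsSecretStream(Collection<String> values) {
        return values.stream().anyMatch(SECRET::equals);
    }

    public static void main(String[] args) {
        System.out.println(containsSecretLoop(Arrays.asList("bar", "foo")));
        System.out.println(containsSecretStream(Arrays.asList("bar", "baz")));
    }
}
```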

For further reading about these issues, see this explanation written by Brian Goetz: response to "Checked exceptions within Block<T>"

Why abstract classes can't be instantiated using a lambda

An abstract class, even if it declares only one abstract method, cannot be instantiated with a lambda.

Two examples of classes with one abstract method are Ordering and CacheLoader from the Guava library. Wouldn't it be nice to be able to declare instances of them using lambdas like this?

Ordering<String> order = (a, b) -> ...;
CacheLoader<String, String> loader = (key) -> ...;

The most common argument against this was that it would add to the difficulty of reading a lambda. Instantiating an abstract class in this way could lead to execution of hidden code: that in the constructor of the abstract class.

Another reason is that it throws out possible optimizations for lambdas. In the future, it may be the case that lambdas are not evaluated into object instances. Letting users declare abstract classes with lambdas would prevent optimizations like this.

Besides, there's an easy workaround. Actually, the two example classes from Guava already demonstrate this workaround. Add factory methods to convert from a lambda to an instance:

Ordering<String> order = Ordering.from((a, b) -> ...);
CacheLoader<String, String> loader =
 CacheLoader.from((key) -> ...);
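The same factory-method pattern can be sketched without Guava. Loader below is a hypothetical abstract class, not Guava's CacheLoader. Its static from method accepts a Function, so callers can pass a lambda or method reference, while the abstract class keeps its own state and constructor logic.

```java
import java.util.function.Function;

// Hypothetical abstract class with a single abstract method.
abstract class Loader<K, V> {
    private int loads = 0; // state a bare lambda could not carry

    protected abstract V load(K key);

    final V get(K key) {
        loads++;
        return load(key);
    }

    int loadCount() {
        return loads;
    }

    // Factory method that adapts a functional interface to the abstract class.
    static <K, V> Loader<K, V> from(Function<K, V> fn) {
        return new Loader<K, V>() {
            @Override
            protected V load(K key) {
                return fn.apply(key);
            }
        };
    }
}

class LoaderDemo {
    static int lengthOf(String s) {
        Loader<String, Integer> loader = Loader.from(String::length);
        return loader.get(s);
    }

    public static void main(String[] args) {
        System.out.println(lengthOf("abcd"));
    }
}
```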

For further reading, see this explanation written by Brian Goetz: response to "Allow lambdas to implement abstract classes"

Package summary: java.util.function

As demonstrated earlier with Comparator and Runnable, interfaces already defined in the JDK that happen to be functional interfaces are compatible with lambdas. The same goes for any functional interfaces defined in your own code or in third party libraries.

But there are certain forms of functional interfaces that are widely, commonly useful, which did not exist previously in the JDK. A large number of these interfaces have been added to the new java.util.function package. Here are a few:

  • Function<T, R> - take a T as input, return an R as output
  • Predicate<T> - take a T as input, return a boolean as output
  • Consumer<T> - take a T as input, perform some action and don't return anything
  • Supplier<T> - with nothing as input, return a T
  • BinaryOperator<T> - take two T's as input, return one T as output, useful for "reduce" operations

Primitive specializations for most of these exist as well. They're provided in int, long, and double forms. For instance:

  • IntConsumer - take an int as input, perform some action and don't return anything

These exist for performance reasons, to avoid boxing and unboxing when the inputs or outputs are primitives.
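The sketch below exercises each of the interfaces listed above once. The variable names and the log list are invented for the example; the interface and method names are those of java.util.function in the released Java 8.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.IntConsumer;
import java.util.function.Predicate;
import java.util.function.Supplier;

class FunctionPackageDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();

        Function<String, Integer> length = String::length;   // T -> R
        Predicate<String> isEmpty = String::isEmpty;          // T -> boolean
        Consumer<String> record = log::add;                   // T -> (nothing)
        Supplier<String> greeting = () -> "hello";            // () -> T
        BinaryOperator<Integer> sum = Integer::sum;           // (T, T) -> T
        IntConsumer recordInt = i -> log.add("int:" + i);     // int -> (nothing), no boxing

        record.accept(greeting.get());
        record.accept(String.valueOf(length.apply("abc")));
        record.accept(String.valueOf(isEmpty.test("")));
        record.accept(String.valueOf(sum.apply(2, 3)));
        recordInt.accept(7);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```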

Package summary: java.util.stream

The new java.util.stream package provides utilities "to support functional-style operations on streams of values" (quoting the javadoc). Probably the most common way to obtain a stream will be from a collection:

Stream<T> stream = collection.stream();

A stream is something like an iterator. The values "flow past" (analogy to a stream of water) and then they're gone. A stream can only be traversed once, then it's used up. Streams may also be infinite.

Streams can be sequential or parallel. They start off as one and may be switched to the other using stream.sequential() or stream.parallel(). The actions of a sequential stream occur in serial fashion on one thread. The actions of a parallel stream may be happening all at once on multiple threads.

So, what do you do with a stream? Here is the example given in the package javadocs:

int sumOfWeights = blocks.stream().filter(b -> b.getColor() == RED)
 .map(b -> b.getWeight())
 .sum();

Note: The above code makes use of a primitive stream, and a sum() method is only available on primitive streams. There will be more detail on primitive streams shortly.
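A self-contained version of that example might look like the following. Block and Color are stand-in types I made up so the code compiles on its own. It uses mapToInt, the name under which the int-producing mapping appears in the released Java 8 API; the preview builds described here may differ.

```java
import java.util.Arrays;
import java.util.List;

class BlockWeightDemo {
    enum Color { RED, BLUE }

    // Stand-in for the javadoc's block type.
    static final class Block {
        private final Color color;
        private final int weight;

        Block(Color color, int weight) {
            this.color = color;
            this.weight = weight;
        }

        Color getColor() { return color; }
        int getWeight() { return weight; }
    }

    static List<Block> sample() {
        return Arrays.asList(
                new Block(Color.RED, 3),
                new Block(Color.BLUE, 5),
                new Block(Color.RED, 4));
    }

    static int sumOfRedWeights(List<Block> blocks) {
        return blocks.stream()
                .filter(b -> b.getColor() == Color.RED) // intermediate
                .mapToInt(Block::getWeight)             // intermediate, yields an IntStream
                .sum();                                 // terminal
    }

    public static void main(String[] args) {
        System.out.println(sumOfRedWeights(sample()));
    }
}
```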

A stream provides a fluent API for transforming values and performing some action on the results. Stream operations are either "intermediate" or "terminal".

  • Intermediate - An intermediate operation keeps the stream open and allows further operations to follow. The filter and map methods in the example above are intermediate operations. The return type of these methods is Stream; they return the current stream to allow chaining of more operations.
  • Terminal - A terminal operation must be the final operation invoked on a stream. Once a terminal operation is invoked, the stream is "consumed" and is no longer usable. The sum method in the example above is a terminal operation.

Usually, dealing with a stream will involve these steps:

  1. Obtain a stream from some source.
  2. Perform one or more intermediate operations.
  3. Perform one terminal operation.

It's likely that you'll want to perform all those steps within one method. That way, you know the properties of the source and the stream and can ensure that it's used properly. You probably don't want to accept arbitrary Stream<T> instances as input to your method because they may have properties you're ill-equipped to deal with, such as being parallel or infinite.

There are a couple more general properties of stream operations to consider:

  • Stateful - A stateful operation imposes some new property on the stream, such as uniqueness of elements, or a maximum number of elements, or ensuring that the elements are consumed in sorted fashion. These are typically more expensive than stateless intermediate operations.
  • Short-circuiting - A short-circuiting operation potentially allows processing of a stream to stop early without examining all the elements. This is an especially desirable property when dealing with infinite streams; if none of the operations being invoked on a stream are short-circuiting, then the code may never terminate.
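To illustrate short-circuiting on an infinite stream, the sketch below builds an endless sequence with Stream.iterate and relies on limit to make the pipeline terminate. The method name is mine, and the API names are from the released Java 8.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ShortCircuitDemo {
    static List<Integer> firstEvenSquares(int n) {
        return Stream.iterate(1, i -> i + 1)     // infinite: 1, 2, 3, ...
                .filter(i -> i % 2 == 0)         // stateless
                .map(i -> i * i)                 // stateless
                .limit(n)                        // stateful and short-circuiting
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstEvenSquares(3));
    }
}
```

Without the limit call, collect would never return, because nothing else in the pipeline can stop the source.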

Here are short, general descriptions for each Stream method. See the javadocs for more thorough explanations.

Intermediate operations:

  • filter - Exclude all elements that don't match a Predicate.
  • map - Perform a one-to-one transformation of elements using a Function.
  • flatMap - Transform each element into zero or more elements using a FlatMapper.
  • peek - Perform some action on each element as it is encountered. Primarily useful for debugging.
  • distinct - Exclude all duplicate elements according to their .equals behavior. This is a stateful operation.
  • sorted - Ensure that stream elements in subsequent operations are encountered according to the order imposed by a Comparator. This is a stateful operation.
  • limit - Ensure that subsequent operations only see up to a maximum number of elements. This is a stateful, short-circuiting operation.
  • substream - Ensure that subsequent operations only see a range (by index) of elements. Like String.substring except for streams. There are two forms, one with a begin index and one with an end index as well. Both are stateful operations, and the form with an end index is also a short-circuiting operation.

Terminal operations:

  • forEach - Perform some action for each element in the stream.
  • toArray - Dump the elements in the stream to an array.
  • reduce - Combine the stream elements into one using a BinaryOperator.
  • collect - Dump the elements in the stream into some container, such as a Collection or Map.
  • min - Find the minimum element of the stream according to a Comparator.
  • max - Find the maximum element of the stream according to a Comparator.
  • count - Find the number of elements in the stream.
  • anyMatch - Find out whether at least one of the elements in the stream matches a Predicate. This is a short-circuiting operation.
  • allMatch - Find out whether every element in the stream matches a Predicate.
  • noneMatch - Find out whether zero elements in the stream match a Predicate.
  • findFirst - Find the first element in the stream. This is a short-circuiting operation.
  • findAny - Find any element in the stream, which may be cheaper than findFirst for some streams. This is a short-circuiting operation.

Going back to the concept of parallel streams, it's important to note that parallelism is not free. It's not free from a performance standpoint, and you can't simply swap out a sequential stream for a parallel one and expect the results to be identical without further thought. There are properties to consider about your stream, its operations, and the destination for its data before you can (or should) parallelize a stream. For instance: Does encounter order matter to me? Are my functions stateless? Is my stream large enough and are my operations complex enough to make parallelism worthwhile?
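As a small sketch of a case where parallelizing is safe, the pipeline below uses a stateless mapping function and an associative reduction (sum), so the sequential and parallel versions agree. Whether the parallel version is actually faster for an input this small is doubtful; the example only shows that the result is unaffected.

```java
import java.util.stream.IntStream;

class ParallelDemo {
    static long sumOfSquares(boolean parallel) {
        IntStream range = IntStream.rangeClosed(1, 1000);
        if (parallel) {
            range = range.parallel();
        }
        // Stateless function plus an order-insensitive reduction.
        return range.mapToLong(i -> (long) i * i).sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(false));
        System.out.println(sumOfSquares(true));
    }
}
```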

There are primitive-specialized versions of Stream for ints, longs, and doubles: IntStream, LongStream, and DoubleStream.

One can convert back and forth between an object stream and a primitive stream using the primitive-specialized map and flatMap functions, among others. To give a few contrived examples:

List<String> strings = Arrays.asList("a", "b", "c");
strings.stream() // Stream<String>
 .mapToInt(String::length) // IntStream
 .longs() // LongStream
 .mapToDouble(x -> x / 10.0) // DoubleStream
 .boxed() // Stream<Double>
 .mapToLong(x -> 1L) // LongStream
 .mapToObj(x -> "") // Stream<String>
 ...

The primitive streams also provide methods for obtaining basic numeric statistics about the stream as a data structure. You can find the count, sum, min, max, and mean of the elements all from one terminal operation.
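In the released Java 8 API this is exposed as summaryStatistics(), which returns an IntSummaryStatistics (or the long/double equivalent). A minimal sketch, assuming that API:

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;

class StatsDemo {
    static IntSummaryStatistics lengthStats(List<String> words) {
        // One terminal operation gathers count, sum, min, max, and average.
        return words.stream()
                .mapToInt(String::length)
                .summaryStatistics();
    }

    public static void main(String[] args) {
        IntSummaryStatistics stats = lengthStats(Arrays.asList("a", "bbb", "cc"));
        System.out.println(stats);
    }
}
```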

There are no primitive versions for the rest of the primitive types because it would have required an unacceptable amount of bloat in the JDK. IntStream, LongStream, and DoubleStream were deemed useful enough to include, and streams of other numeric primitives can be represented using these three via widening primitive conversion.

The FlatMapper interface used in the flatMap operations is a functional interface with one abstract method:

void flattenInto(T element, Consumer<U> sink);

In the context of a flatMap operation, the stream provides the element and sink to you, and then you define what is to be done with the element and sink. The element is the current element in the stream, and the sink represents what should appear in the stream after the flatMap operation is complete. For example:

Set<Color> colors = ...;
List<Person> people = ...;
Stream<Color> stream = people.stream().flatMap(
    (Person person, Consumer<Color> sink) -> {
        // Map each person to the colors they like.
        for (Color color : colors) {
            if (person.likesColor(color)) {
                sink.accept(color);
            }
        }
    });

Note that the parameter types in the lambda above are specified. In most other contexts you can get away without specifying the types, but here, due to the nature of FlatMapper, the compiler needs your help to figure out the types. If you're using flatMap and wondering why it's not compiling, it may be because you didn't specify the types.
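For readers on the released Java 8, note that flatMap there takes a Function that returns a Stream for each element, rather than the FlatMapper shown above. The sketch below uses that released signature with made-up data (a map from person name to liked colors) to express the same "each person to the colors they like" idea.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FlatMapDemo {
    // Hypothetical sample data: person name -> colors that person likes.
    static Map<String, List<String>> sample() {
        Map<String, List<String>> likes = new LinkedHashMap<>();
        likes.put("alice", Arrays.asList("red", "blue"));
        likes.put("bob", Arrays.asList("blue", "green"));
        return likes;
    }

    static List<String> likedColors(Map<String, List<String>> likesByPerson) {
        return likesByPerson.values().stream()
                .flatMap(List::stream)   // each list becomes zero or more elements
                .distinct()
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(likedColors(sample()));
    }
}
```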

One of the most confusing, intricate, and useful terminal stream operations is collect. It introduces a new, non-functional interface called Collector. This interface is somewhat difficult to understand, but fortunately there is a Collectors utility class for generating all sorts of useful Collectors. For example:

List<String> strings = values.stream()
 .filter(...)
 .map(...)
 .collect(Collectors.toList());

If you want to put your stream elements into a Collection, Map, or String, then Collectors probably has what you need. It's definitely worthwhile to browse through the javadoc of that class.
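A few of the Collectors factories, sketched with the names they have in the released Java 8 (joining, groupingBy, toSet). The helper method names are mine.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class CollectorsDemo {
    // Into a String.
    static String joined(List<String> words) {
        return words.stream().collect(Collectors.joining(", "));
    }

    // Into a Map, keyed by a classifier function.
    static Map<Integer, List<String>> byLength(List<String> words) {
        return words.stream().collect(Collectors.groupingBy(String::length));
    }

    // Into a Collection.
    static Set<String> unique(List<String> words) {
        return words.stream().collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "cc", "a");
        System.out.println(joined(words));
        System.out.println(byLength(words));
        System.out.println(unique(words));
    }
}
```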

Summary of proposal: JEP 101: Generalized Target-Type Inference

This was an effort to improve the ability of the compiler to determine generic types where it was previously unable to. There were many cases in previous versions of Java where the compiler could not figure out the generic types for a method in the context of nested or chained method invocations, even when it seemed "obvious" to the programmer. Those situations required the programmer to explicitly specify a "type witness". It's a feature of generics that surprisingly few Java programmers know about (I'm saying this based on personal interactions and reading StackOverflow questions). It looks like this:

// In Java 7:
foo(Utility.<Type>bar());
Utility.<Type>foo().bar();

Without the type witnesses, the compiler might fill in <Object> as the generic type, and the code would fail to compile if a more specific type was required instead.

Java 8 improves this situation tremendously. In many more cases, it can figure out a more specific generic type based on the context.

// In Java 8:
foo(Utility.bar());
Utility.foo().bar();

This one is still a work in progress, so I'm not sure how many of the examples listed in the proposal will actually be included for Java 8. Hopefully it's all of them.
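A concrete instance of the argument-position case, using Collections.emptyList(). The helper names are made up. The second method compiles on a Java 8 compiler but is rejected by Java 7, which infers List<Object> for the argument. The chained-call case is not shown here.

```java
import java.util.Collections;
import java.util.List;

class InferenceDemo {
    static int countStrings(List<String> strings) {
        return strings.size();
    }

    // Works in Java 7 and 8: explicit type witness.
    static int withWitness() {
        return countStrings(Collections.<String>emptyList());
    }

    // Java 8 only: the target type List<String> is inferred from the parameter.
    static int inferred() {
        return countStrings(Collections.emptyList());
    }

    public static void main(String[] args) {
        System.out.println(withWitness());
        System.out.println(inferred());
    }
}
```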

Package summary: java.time

The new date/time API in Java 8 is contained in the java.time package. If you're familiar with Joda Time, it will be really easy to pick up. Actually, I think it's so well-designed that even people who have never heard of Joda Time should find it easy to pick up.

Almost everything in the API is immutable, including the value types and the formatters. No more worrying about exposing Date fields or dealing with thread-local date formatters.

The intermingling with the legacy date/time API is minimal. It was a clean break.

The new API prefers enums over integer constants for things like months and days of the week.

So, what's in it? The package-level javadocs do an excellent job of explaining the additional types. I'll give a brief rundown of some noteworthy parts.

Extremely useful value types:

Less useful value types:

Other useful types:

  • DateTimeFormatter - for converting datetime objects to strings
  • ChronoUnit - for figuring out the amount of time between two points, e.g. ChronoUnit.DAYS.between(t1, t2)
  • TemporalAdjuster - e.g. date.with(TemporalAdjuster.firstDayOfMonth())
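A short sketch tying these types together, written against the released Java 8 API. One naming difference to be aware of: in the release, the ready-made adjusters live on the TemporalAdjusters utility class, which is what the code below uses. The fixed date is just this post's publication date.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.Month;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.time.temporal.TemporalAdjusters;

class JavaTimeDemo {
    // Immutable value; Month is an enum rather than an int constant.
    static final LocalDate POST_DATE = LocalDate.of(2013, Month.MARCH, 26);

    static LocalDate firstOfMonth() {
        // with(...) returns a new LocalDate; POST_DATE is unchanged.
        return POST_DATE.with(TemporalAdjusters.firstDayOfMonth());
    }

    static long daysIntoMonth() {
        return ChronoUnit.DAYS.between(firstOfMonth(), POST_DATE);
    }

    static String formatted() {
        return POST_DATE.format(DateTimeFormatter.ISO_LOCAL_DATE);
    }

    static DayOfWeek dayOfWeek() {
        return POST_DATE.getDayOfWeek();
    }

    public static void main(String[] args) {
        System.out.println(firstOfMonth());
        System.out.println(daysIntoMonth());
        System.out.println(formatted());
        System.out.println(dayOfWeek());
    }
}
```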

The new value types are, for the most part, supported by JDBC. There are minor exceptions, such as ZonedDateTime which has no counterpart in SQL.

The fact that interfaces can define default methods allowed the JDK authors to make a large number of additions to the collection API interfaces. Default implementations for these are provided on all the core interfaces, and more efficient or well-behaved overridden implementations were added to all the concrete classes, where applicable.

Here's a list of the new methods:

Also, Iterator.remove() now has a default, throwing implementation, which makes it slightly easier to define unmodifiable iterators.

Collection.stream() and Collection.parallelStream() are the main gateways into the stream API. There are other ways to generate streams, but those are going to be the most common by far.

The addition of List.sort(Comparator) is fantastic. Previously, the way to sort an ArrayList was this:

Collections.sort(list, comparator);

That code, which was your only option in Java 7, was frustratingly inefficient. It would dump the list into an array, sort the array, then use a ListIterator to insert the array contents into the list in new positions.

The default implementation of List.sort(Comparator) still does this, but concrete implementing classes are free to optimize. For instance, ArrayList.sort invokes Arrays.sort on the ArrayList's internal array. CopyOnWriteArrayList does the same.
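A minimal usage sketch of the new method. The comparator and helper name are my own; Comparator.comparingInt is the name used in the released Java 8.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ListSortDemo {
    static List<String> sortedByLength(List<String> input) {
        List<String> copy = new ArrayList<>(input); // leave the caller's list alone
        // ArrayList overrides sort, so its backing array is sorted directly.
        copy.sort(Comparator.comparingInt(String::length));
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sortedByLength(Arrays.asList("ccc", "a", "bb")));
    }
}
```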

Performance isn't the only potential gain from these new methods. They can have more desirable semantics, too. For instance, sorting a Collections.synchronizedList() is an atomic operation using list.sort. You can iterate over all its elements as an atomic operation using list.forEach. Previously this was not possible.

Map.computeIfAbsent makes working with multimap-like structures easier:

// Index strings by length:
Map<Integer, List<String>> map = new HashMap<>();
for (String s : strings) {
    map.computeIfAbsent(s.length(),
                        key -> new ArrayList<String>())
       .add(s);
}
// Although in this case the stream API may be a better choice:
Map<Integer, List<String>> map = strings.stream()
    .collect(Collectors.groupingBy(String::length));

ForkJoinPool.commonPool() is the structure that handles all parallel stream operations. It is intended as an easy, good way to obtain a ForkJoinPool/ExecutorService/Executor when you need one.

ConcurrentHashMap<K, V> was completely rewritten. Internally it looks nothing like the version that was in Java 7. Externally it's mostly the same, except it has a large number of bulk operation methods: many forms of reduce, search, and forEach.

ConcurrentHashMap.newKeySet() provides a concurrent java.util.Set implementation. It is essentially another way of writing Collections.newSetFromMap(new ConcurrentHashMap<T, Boolean>()).

StampedLock is a new lock implementation that can probably replace ReentrantReadWriteLock in most cases. It performs better than RRWL when used as a plain read-write lock. It also provides an API for "optimistic reads", where you obtain a weak, cheap version of a read lock, do the read operation, then check afterwards if your lock was invalidated by a write. There's more detail about this class and its performance in a set of slides put together by Heinz Kabutz (starting about half-way through the set of slides): "Phaser and StampedLock Presentation"
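The optimistic-read pattern can be sketched roughly as follows. OptimisticPoint is a made-up class for the illustration: read the fields under an optimistic stamp, validate it, and fall back to a real read lock only if a write intervened.

```java
import java.util.concurrent.locks.StampedLock;

class OptimisticPoint {
    private final StampedLock lock = new StampedLock();
    private double x;
    private double y;

    void move(double dx, double dy) {
        long stamp = lock.writeLock();
        try {
            x += dx;
            y += dy;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    double distanceFromOrigin() {
        // Cheap, non-blocking attempt first.
        long stamp = lock.tryOptimisticRead();
        double currentX = x;
        double currentY = y;
        if (!lock.validate(stamp)) {
            // A write happened in between; retry under a full read lock.
            stamp = lock.readLock();
            try {
                currentX = x;
                currentY = y;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return Math.sqrt(currentX * currentX + currentY * currentY);
    }

    public static void main(String[] args) {
        OptimisticPoint p = new OptimisticPoint();
        p.move(3, 4);
        System.out.println(p.distanceFromOrigin());
    }
}
```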

CompletableFuture<T> is a nice implementation of the Future interface that provides a ton of methods for performing (and chaining together) asynchronous tasks. It relies on functional interfaces heavily; lambdas are a big reason this class was worth adding. If you are currently using Guava's Future utilities, such as Futures, ListenableFuture, and SettableFuture, you may want to check out CompletableFuture as a potential replacement.
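A small sketch of chaining, assuming the method names from the released Java 8 (supplyAsync, thenCombine, thenApply, join). The two constant suppliers stand in for real asynchronous work.

```java
import java.util.concurrent.CompletableFuture;

class CompletableFutureDemo {
    static String compute() {
        // Both tasks run asynchronously on the common pool by default.
        CompletableFuture<Integer> a = CompletableFuture.supplyAsync(() -> 20);
        CompletableFuture<Integer> b = CompletableFuture.supplyAsync(() -> 22);

        return a.thenCombine(b, Integer::sum)     // runs once both are complete
                .thenApply(n -> "answer=" + n)    // transforms the combined result
                .join();                          // blocks for the final value
    }

    public static void main(String[] args) {
        System.out.println(compute());
    }
}
```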

Basically, these are ways to obtain java.util.stream.Stream from files and InputStreams. They're a bit different from the streams you obtain from regular collections though. They introduce two new concepts:

  • UncheckedIOException - thrown when an IO error occurs, but since IOException isn't allowed to be in the signature of Iterator/Stream, it has to be smuggled through with an unchecked exception.
  • CloseableStream - a stream that can (and should) be declared in a try-with-resources statement.
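Here is a sketch of the try-with-resources usage, written against the released Java 8, where Files.lines returns a plain Stream<String> (Stream itself is AutoCloseable there). The temp-file setup in demo() exists only to make the example self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

class FileLinesDemo {
    static long countNonBlankLines(Path file) throws IOException {
        // Closing the stream releases the underlying file handle.
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.filter(line -> !line.trim().isEmpty()).count();
        }
    }

    static long demo() {
        try {
            Path tmp = Files.createTempFile("lines-demo", ".txt");
            try {
                Files.write(tmp, Arrays.asList("one", "", "two"), StandardCharsets.UTF_8);
                return countNonBlankLines(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```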

Annotations are allowed in more places, e.g. List<@Nullable String>. The biggest impact of this is likely to be for static analysis tools such as Sonar and FindBugs.

This JSR 308 website may be outdated at this point, but it does a better job of explaining the motivation for these changes than I could possibly do: "Type Annotations (JSR 308) and the Checker Framework"

Summary of proposal: JEP 174: Nashorn JavaScript Engine

I did not experiment with Nashorn so I know very little beyond what's described in the proposal above. Short version: It's the successor to Rhino. Rhino is old and a little bit slow, and the developers decided they'd be better off starting from scratch.

There is too much there to talk about, but I'll pick out a few noteworthy items.

ThreadLocal.withInitial(Supplier<T>) makes declaring thread-local variables with initial values much nicer. Previously you would supply an initial value like this:

ThreadLocal<List<String>> strings =
    new ThreadLocal<List<String>>() {
        @Override
        protected List<String> initialValue() {
            return new ArrayList<String>();
        }
    };

Now it's like this:

ThreadLocal<List<String>> strings =
    ThreadLocal.withInitial(ArrayList::new);

Optional<T> appears in the stream API as the return value for methods like min/max, findFirst/Any, and some forms of reduce. It's used because there might not be any elements in the stream, and it provides a fluent API for handling the "some result" versus "no result" cases. You can provide a default value, throw an exception, or execute some action only if the result exists.
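A brief sketch of handling both cases with the released Java 8 Optional API. The helper name and the "<none>" default are arbitrary choices for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class OptionalDemo {
    static String longest(List<String> words) {
        // max returns Optional because the stream may be empty.
        Optional<String> longest =
                words.stream().max(Comparator.comparingInt(String::length));
        return longest.map(String::toUpperCase) // applied only if a value exists
                      .orElse("<none>");        // default for the empty case
    }

    public static void main(String[] args) {
        System.out.println(longest(Arrays.asList("a", "ccc", "bb")));
        System.out.println(longest(Collections.<String>emptyList()));
    }
}
```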

It's very, very similar to Guava's Optional class. It's nothing at all like Option in Scala, nor is it trying to be, and the name similarity there is purely coincidental.

Aside: it's interesting that Java 8's Optional and Guava's Optional ended up being so similar, despite the absurd amount of debate that occurred over its addition to both libraries.

"FYI.... Optional was the cause of possibly the single greatest conflagration on the internal Java libraries discussion lists ever."

Kevin Bourrillion in response to "Some new Guava classes targeted for release 10"

"On a purely practical note, the discussions surrounding Optional have exceeded its design budget by several orders of magnitude."

Brian Goetz in response to "Optional require(s) NonNull"

StringJoiner and String.join(...) are long, long overdue. They are so long overdue that the vast majority of Java developers likely have already written or have found utilities for joining strings, but it is nice for the JDK to finally provide this itself. Everyone has encountered situations where joining strings is required, and it is a Good Thing™ that we can now express that through a standard API that every Java developer (eventually) will know.
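Both forms in a minimal sketch (helper names are mine). String.join covers the simple delimiter case, and StringJoiner adds an optional prefix and suffix.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

class JoinDemo {
    static String csv(List<String> parts) {
        return String.join(",", parts);
    }

    static String bracketed(List<String> parts) {
        // Delimiter, prefix, suffix.
        StringJoiner joiner = new StringJoiner(", ", "[", "]");
        for (String part : parts) {
            joiner.add(part);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("a", "b", "c");
        System.out.println(csv(parts));
        System.out.println(bracketed(parts));
    }
}
```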

Comparators and Comparator.thenComparing(...) provide some very nice utilities for doing chained comparisons and field-based comparisons. For example:

people.sort(Comparators.comparing(Person::getLastName)
 .thenComparing(Person::getFirstName));

These additions provide good, readable shorthand for complex sorts. Many of the use cases served by Guava's ComparisonChain and Ordering utility classes are now served by these JDK additions. And for what it's worth, I think the JDK versions read better than the functionally-equivalent versions expressed in Guava-ese.
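A runnable sketch of the same chained sort, with a stand-in Person class. In the released Java 8 the static factory is Comparator.comparing (on the interface itself), so that is what the code below uses; the preview builds described here may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class PersonSortDemo {
    // Stand-in type for the example.
    static final class Person {
        private final String firstName;
        private final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        String getFirstName() { return firstName; }
        String getLastName() { return lastName; }
    }

    static List<String> sortedNames() {
        List<Person> people = new ArrayList<>(Arrays.asList(
                new Person("Ann", "Smith"),
                new Person("Bob", "Jones"),
                new Person("Al", "Smith")));

        // Last name first, then first name to break ties.
        people.sort(Comparator.comparing(Person::getLastName)
                              .thenComparing(Person::getFirstName));

        return people.stream()
                .map(p -> p.getFirstName() + " " + p.getLastName())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sortedNames());
    }
}
```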

There are lots of various small bug fixes and performance improvements that were not covered in this post. But they are appreciated too!

This post was intended to cover every single language-level and API-level change coming in Java 8. If any were missed, it was an error that should be corrected. Please let me know if you discover an omission. You can contact me via e-mail or post in the Hacker News comment thread.

