The Stream API, introduced in Java 8, supports functional-style processing of sequences of elements. It works with collections and arrays and lets you express data manipulation as a chain of declarative operations. Streams don’t store data; instead, they operate on a data source, such as a Collection or an array, without modifying it, and enable aggregate operations on its elements.

Core Concepts

A stream pipeline consists of three parts:

  1. A source: Where the stream originates from (e.g., a List, Set, or array).
  2. Zero or more intermediate operations: These transform the stream into another stream. Examples include filter, map, and sorted.
  3. A terminal operation: This produces a result or a side-effect, and triggers the execution of the pipeline. Examples include forEach, collect, and reduce.
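
The three parts can be seen in a single chained expression. The following is a minimal sketch; the class name PipelineDemo and the helper longNamesUpperCased are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PipelineDemo {
    // Hypothetical helper: keeps names longer than three characters,
    // upper-cases them, and returns them in natural (alphabetical) order.
    static List<String> longNamesUpperCased(List<String> names) {
        return names.stream()                       // 1. source
                .filter(name -> name.length() > 3)  // 2. intermediate operation
                .map(String::toUpperCase)           // 2. intermediate operation
                .sorted()                           // 2. intermediate operation
                .collect(Collectors.toList());      // 3. terminal operation
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Charlie", "Bob", "Alice");
        System.out.println(longNamesUpperCased(names)); // [ALICE, CHARLIE]
    }
}
```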

One of the key features of streams is laziness. Intermediate operations are not executed until a terminal operation is invoked. This lets the implementation process the elements in a single pass and stop early when a short-circuiting operation such as limit or findFirst has everything it needs.
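
Laziness can be observed by recording when a predicate actually runs. The sketch below is illustrative only: the class LazinessDemo, the method run, and the log list are hypothetical, and writing to a list from inside filter is done here purely to make the timing visible (it is not a pattern to copy, and it is only safe because the stream is sequential).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazinessDemo {
    // Appends a line to `log` each time the filter predicate is evaluated,
    // so the caller can see whether filtering ran before the terminal operation.
    static List<String> run(List<String> log) {
        Stream<String> pipeline = Stream.of("Alice", "Bob", "Charlie")
                .filter(name -> {
                    log.add("filter " + name);
                    return name.length() > 3;
                });
        log.add("pipeline built");                     // no element has been filtered yet
        return pipeline.collect(Collectors.toList());  // the terminal operation starts the work
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<String> result = run(log);
        System.out.println(log);    // [pipeline built, filter Alice, filter Bob, filter Charlie]
        System.out.println(result); // [Alice, Charlie]
    }
}
```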

Creating Streams

There are several ways to create a stream:

From a Collection

You can create a stream from any Collection (e.g., List, Set) using the stream() method.

List<String> names = Arrays.asList("Alice", "Bob", "Charlie");
Stream<String> nameStream = names.stream();

From an Array

You can create a stream from an array using the Arrays.stream() method or Stream.of().

String[] nameArray = {"Alice", "Bob", "Charlie"};
Stream<String> nameStreamFromArray = Arrays.stream(nameArray);

Stream<String> nameStreamFromOf = Stream.of("Alice", "Bob", "Charlie");

From a Range of Numbers

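The IntStream, LongStream, and DoubleStream interfaces provide methods for creating streams of primitive numeric types. The range() and rangeClosed() methods are available on IntStream and LongStream only; range() excludes the upper bound, while rangeClosed() includes it.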

IntStream intStream = IntStream.range(1, 5); // 1, 2, 3, 4
LongStream longStream = LongStream.rangeClosed(1, 5); // 1, 2, 3, 4, 5

Using Stream.iterate() and Stream.generate()

You can also create infinite streams using iterate() and generate().

Stream<Integer> evenNumbers = Stream.iterate(0, n -> n + 2);
Stream<Double> randomNumbers = Stream.generate(Math::random);

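An infinite stream must be bounded before a terminal operation consumes it, otherwise operations such as forEach or collect never finish. Use limit(), or a short-circuiting terminal operation such as findFirst() or anyMatch(), to stop the pipeline.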
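
As a rough sketch of both ways to bound an infinite stream, the example below uses limit() in one method and a short-circuiting findFirst() in the other. The class InfiniteStreamDemo and its two helper methods are hypothetical names introduced here.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class InfiniteStreamDemo {
    // Bounds the infinite stream 0, 2, 4, ... with limit().
    static List<Integer> firstEvenNumbers(int count) {
        return Stream.iterate(0, n -> n + 2)
                .limit(count)
                .collect(Collectors.toList());
    }

    // Bounds the infinite stream 1, 2, 4, 8, ... with the short-circuiting findFirst().
    // Assumes the threshold is well below Integer.MAX_VALUE; otherwise the
    // doubling overflows and no matching element is ever produced.
    static int firstPowerOfTwoAbove(int threshold) {
        return Stream.iterate(1, n -> n * 2)
                .filter(n -> n > threshold)
                .findFirst()
                .orElseThrow(IllegalStateException::new);
    }

    public static void main(String[] args) {
        System.out.println(firstEvenNumbers(5));        // [0, 2, 4, 6, 8]
        System.out.println(firstPowerOfTwoAbove(1000)); // 1024
    }
}
```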

Stream Operations

Stream operations are divided into two categories: intermediate and terminal.

Intermediate Operations

Intermediate operations return a new stream and are always lazy.

  • filter(Predicate<T>): Returns a stream consisting of the elements of this stream that match the given predicate.

    List<String> names = Arrays.asList("Alice", "Bob", "Charlie", "Anna");
    names.stream()
         .filter(name -> name.startsWith("A"))
         .forEach(System.out::println); // Alice, Anna
    
  • map(Function<T, R>): Returns a stream consisting of the results of applying the given function to the elements of this stream.

    List<String> names = Arrays.asList("Alice", "Bob", "Charlie");
    names.stream()
         .map(String::length)
         .forEach(System.out::println); // 5, 3, 7
    
  • flatMap(Function<T, Stream<R>>): Transforms each element of the stream into a stream of other objects and then flattens all the generated streams into a single stream.

    List<List<Integer>> listOfLists = Arrays.asList(
        Arrays.asList(1, 2),
        Arrays.asList(3, 4),
        Arrays.asList(5, 6)
    );
    listOfLists.stream()
               .flatMap(List::stream)
               .forEach(System.out::print); // 123456
    
  • distinct(): Returns a stream consisting of the distinct elements (according to Object.equals(Object)) of this stream.

  • sorted(): Returns a stream consisting of the elements of this stream, sorted according to natural order.

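  • peek(Consumer<T>): Returns a stream consisting of the elements of this stream, additionally performing the provided action on each element as elements are consumed from the resulting stream. It exists mainly to support debugging; because the action runs only for elements the pipeline actually processes, it should not be relied on for essential side effects.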
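
The last three operations have no example above, so here is a small sketch that combines them. The class DistinctSortedPeekDemo, the method distinctSorted, and the seen list are hypothetical; peek is used only to show which elements reach that stage of a sequential pipeline.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DistinctSortedPeekDemo {
    // Removes duplicates, records the surviving elements in `seen`
    // (in encounter order), then sorts them into natural order.
    static List<Integer> distinctSorted(List<Integer> numbers, List<Integer> seen) {
        return numbers.stream()
                .distinct()
                .peek(seen::add)   // debugging aid: what passed distinct()
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> seen = new ArrayList<>();
        List<Integer> result = distinctSorted(Arrays.asList(3, 1, 3, 2, 1), seen);
        System.out.println(seen);   // [3, 1, 2]
        System.out.println(result); // [1, 2, 3]
    }
}
```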

Terminal Operations

Terminal operations trigger the stream processing and produce a result.

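  • forEach(Consumer<T>): Performs an action for each element of this stream. On a parallel stream the order of execution is not guaranteed; use forEachOrdered when the encounter order matters.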

  • collect(Collector<T, A, R>): Performs a mutable reduction operation on the elements of this stream using a Collector. This is one of the most powerful terminal operations.

    List<String> names = Arrays.asList("Alice", "Bob", "Charlie");
    List<String> upperCaseNames = names.stream()
                                       .map(String::toUpperCase)
                                       .collect(Collectors.toList());
    
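  • reduce(T identity, BinaryOperator<T> accumulator): Performs a reduction on the elements of this stream, using an identity value and an associative accumulation function. The identity must be neutral for the accumulator (0 for addition, 1 for multiplication) and is the result when the stream is empty.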

    List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
    int sum = numbers.stream().reduce(0, (a, b) -> a + b); // 15
    
  • count(): Returns the count of elements in this stream.

  • anyMatch(Predicate<T>), allMatch(Predicate<T>), noneMatch(Predicate<T>): These operations check if any, all, or no elements of this stream match the given predicate.

  • findFirst(), findAny(): Return an Optional describing the first element of this stream, or an arbitrary element of the stream, respectively. Both return an empty Optional if the stream is empty.
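
The counting, matching, and finding operations above can be sketched as follows. The class MatchAndFindDemo and its helper methods are hypothetical names used for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class MatchAndFindDemo {
    // Counts the names that start with the given prefix.
    static long countStartingWith(List<String> names, String prefix) {
        return names.stream().filter(n -> n.startsWith(prefix)).count();
    }

    // Returns the first name longer than `length`, or an empty Optional if there is none.
    static Optional<String> firstLongerThan(List<String> names, int length) {
        return names.stream().filter(n -> n.length() > length).findFirst();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "Bob", "Charlie", "Anna");
        System.out.println(countStartingWith(names, "A"));                  // 2
        System.out.println(names.stream().anyMatch(n -> n.length() == 3));  // true
        System.out.println(names.stream().allMatch(n -> n.length() > 3));   // false
        System.out.println(names.stream().noneMatch(String::isEmpty));      // true
        System.out.println(firstLongerThan(names, 5));                      // Optional[Charlie]
        System.out.println(firstLongerThan(names, 10).orElse("none"));      // none
    }
}
```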

Collectors

The Collectors class provides a set of static factory methods for creating Collector instances.

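  • toList(), toSet(), toMap(keyMapper, valueMapper): Collect elements into a List, Set, or Map. toMap requires functions that derive the key and the value from each element, and the two-argument form throws an IllegalStateException on duplicate keys.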
  • joining(CharSequence delimiter): Joins the elements into a String.
  • groupingBy(Function<T, K>): Groups elements according to a classification function.
  • partitioningBy(Predicate<T>): Partitions elements into a Map<Boolean, List<T>> based on a predicate.
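
A short sketch of these collectors in use follows. The class CollectorsDemo and its helper methods are hypothetical; the two-argument toMap call assumes the input names are unique.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CollectorsDemo {
    // Joins all names into one comma-separated String.
    static String joinNames(List<String> names) {
        return names.stream().collect(Collectors.joining(", "));
    }

    // Groups names by their length.
    static Map<Integer, List<String>> byLength(List<String> names) {
        return names.stream().collect(Collectors.groupingBy(String::length));
    }

    // Splits names into those that start with "A" (true) and the rest (false).
    static Map<Boolean, List<String>> byStartsWithA(List<String> names) {
        return names.stream().collect(Collectors.partitioningBy(n -> n.startsWith("A")));
    }

    // Maps each name to its length; assumes names are unique,
    // since duplicate keys make this form of toMap throw.
    static Map<String, Integer> lengthByName(List<String> names) {
        return names.stream().collect(Collectors.toMap(n -> n, String::length));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "Bob", "Charlie", "Anna");
        System.out.println(joinNames(names));                   // Alice, Bob, Charlie, Anna
        System.out.println(byLength(names).get(5));             // [Alice]
        System.out.println(byStartsWithA(names).get(true));     // [Alice, Anna]
        System.out.println(lengthByName(names).get("Charlie")); // 7
    }
}
```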

Parallel Streams

You can create a parallel stream by calling the parallelStream() method on a collection or by calling the parallel() intermediate method on a stream. For large datasets and CPU-intensive, independent operations, this can improve performance, because the work is split across multiple threads (by default, those of the common ForkJoinPool).

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
int sum = numbers.parallelStream()
                 .mapToInt(Integer::intValue)
                 .sum();

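However, be mindful of the overhead of parallelism. For small datasets or simple operations, a sequential stream might be faster. Operations in a parallel pipeline should also be stateless and must not modify shared mutable state; let collect or reduce combine the results instead.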
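
The following sketch shows a parallel pipeline that stays correct because it avoids shared mutable state: the results are combined by sum() and collect() rather than written to an external variable. The class ParallelDemo and its methods are hypothetical, and the example makes no claim about which variant runs faster.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelDemo {
    // Sums the squares of 1..n, optionally in parallel. The result is the
    // same either way because addition is associative.
    static long sumOfSquares(int n, boolean parallel) {
        IntStream range = IntStream.rangeClosed(1, n);
        if (parallel) {
            range = range.parallel();
        }
        return range.mapToLong(i -> (long) i * i).sum();
    }

    // Collects the even numbers in 1..n in parallel. collect() merges the
    // partial results and preserves the encounter order of the source.
    static List<Integer> evens(int n) {
        return IntStream.rangeClosed(1, n)
                .parallel()
                .filter(i -> i % 2 == 0)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1000, false)); // 333833500
        System.out.println(sumOfSquares(1000, true));  // 333833500
        System.out.println(evens(10));                 // [2, 4, 6, 8, 10]
    }
}
```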

Conclusion

The Java Stream API is a fundamental part of modern Java development. It enables you to write more concise, readable, and potentially more performant code for data processing. Once you understand stream creation, intermediate operations, and terminal operations, you can apply functional-style programming to most collection-processing tasks in Java.