Java官方笔记14流

2023-07-20 19:57:44 浏览数 (2)

Processing Data in Memory

The Stream API is probably the second most important feature added to Java SE 8, after the lambda expressions. In a nutshell, the Stream API is about providing an implementation of the well known map-filter-reduce algorithm to the JDK.

map-filter-reduce:

代码语言:javascript复制
List<Sale> sales = ...; // this is the list of all the sales
int amountSoldInMarch = 0;
for (Sale sale: sales) {
    if (sale.getDate().getMonth() == Month.MARCH) {
        amountSoldInMarch  = sale.getAmount();
    }
}
System.out.println("Amount sold in March: "   amountSoldInMarch);

map:通过get取值,将部分字段映射到新数据(select字段)

filter:根据if判断过滤部分数据(where条件)

reduce:聚合,求和(聚合函数)

简而言之,相当于写一段SQL:

代码语言:javascript复制
select sum(amount)
from Sales
where extract(month from date) = 3;

看看是如何从原始代码转换为Stream API的:

代码语言:javascript复制
List<City> cities = ...;

int sum = 0;
for (City city: cities) {
    int population = city.getPopulation();
    if (population > 100_000) {
        sum  = population;
    }
}

System.out.println("Sum = "   sum);

假设Collection有这几个方法:

代码语言:javascript复制
int sum = cities.map(city -> city.getPopulation())
                .filter(population -> population > 100_000)
                .sum();

为什么Collection不提供这些方法呢?拆分为每一步:

代码语言:javascript复制
Collection<Integer> populations         = cities.map(city -> city.getPopulation());
Collection<Integer> filteredPopulations = populations.filter(population -> population > 100_000);
int sum                                 = filteredPopulations.sum();

假如有1000个city,那么中间数据也是Collection,就会产生很多冗余的中间数据。而for循环却不存在这个问题,因为它不会存储中间数据。虽然Collection提供方法能让代码看起来更好理解,但却会导致大量的冗余数据。所以不得不设计一套Stream API来支持map-filter-reduce。

代码语言:javascript复制
Stream<City> streamOfCities         = cities.stream();
Stream<Integer> populations         = streamOfCities.map(city -> city.getPopulation());
Stream<Integer> filteredPopulations = populations.filter(population -> population > 100_000);
int sum = filteredPopulations.sum(); // in fact this code does not compile; we'll fix it later

The streams created in this code, streamOfCitiespopulations and filteredPopulations must all be empty objects.

It leads to a very important property of streams:

A stream is an object that does not store any data.

Using streams is about creating pipelines of operations. A pipeline is made of a series of method calls on a stream. Each call produces another stream. Then at some point, a last call produces a result.

Adding Intermediate Operations

collect()

Stream本身不会存储数据,通过collect存储为List:

代码语言:javascript复制
List<String> strings = List.of("one", "two", "three", "four");
Function<String, Integer> toLength = String::length;
Stream<Integer> ints = strings.stream()
                              .map(toLength);
代码语言:javascript复制
List<String> strings = List.of("one", "two", "three", "four");
List<Integer> lengths = strings.stream()
                               .map(String::length)
                               .collect(Collectors.toList());
System.out.println("lengths = "   lengths);
代码语言:javascript复制
lengths = [3, 3, 5, 4]

一些方法

  • distinct()
  • sorted()
  • skip()
  • limit()

contact()

连接流

代码语言:javascript复制
List<Integer> list0 = List.of(1, 2, 3);
List<Integer> list1 = List.of(4, 5, 6);
List<Integer> list2 = List.of(7, 8, 9);

// 1st pattern: concat
List<Integer> concat = 
    Stream.concat(list0.stream(), list1.stream())
          .collect(Collectors.toList());

// 2nd pattern: flatMap
List<Integer> flatMap =
    Stream.of(list0.stream(), list1.stream(), list2.stream())
          .flatMap(Function.identity())
          .collect(Collectors.toList());

System.out.println("concat  = "   concat);
System.out.println("flatMap = "   flatMap);
代码语言:javascript复制
concat  = [1, 2, 3, 4, 5, 6]
flatMap = [1, 2, 3, 4, 5, 6, 7, 8, 9]

连接流,推荐使用flatMap()

With the flatmap pattern, you just create a single stream to hold all your streams and do the flatmap. The overhead is much lower.

concat produces a SIZED stream, whereas flatmap does not.

Creating Streams

前面我们看到Collection的stream()方法可以创建流,此外还有很多其他方式创建流:

  • a vararg argument;
  • a supplier;
  • a unary operator, that generates the next element from the previous one;
  • a builder;
  • the characters of a string;
  • the lines of a text file;
  • the elements created by splitting a string of characters with a regular expressions;
  • a random variable, that can create a stream of random numbers.
代码语言:javascript复制
Iterator<String> iterator = ...;

long estimateSize = 10L;
int characteristics = 0;
Spliterator<String> spliterator = Spliterators.spliterator(strings.iterator(), estimateSize, characteristics);

boolean parallel = false;
Stream<String> stream = StreamSupport.stream(spliterator, parallel);

空流:

代码语言:javascript复制
Stream<String> empty = Stream.empty();
List<String> strings = empty.collect(Collectors.toList());

System.out.println("strings = "   strings);

Creating a Stream from a Vararg or an Array

代码语言:javascript复制
Stream<Integer> intStream = Stream.of(1, 2, 3);
List<Integer> ints = intStream.collect(Collectors.toList());

System.out.println("ints = "   ints);
代码语言:javascript复制
String[] stringArray = {"one", "two", "three"};
Stream<String> stringStream = Arrays.stream(stringArray);
List<String> strings = stringStream.collect(Collectors.toList());

System.out.println("strings = "   strings);

Creating a Stream from a Supplier

代码语言:javascript复制
Stream<String> generated = Stream.generate(() -> " ");
List<String> strings = 
        generated
           .limit(10L)
           .collect(Collectors.toList());

System.out.println("strings = "   strings);

Creating a Stream from a UnaryOperator and a Seed

代码语言:javascript复制
Stream<String> iterated = Stream.iterate(" ", s -> s   " ");
iterated.limit(5L).forEach(System.out::println);

Creating a Stream from a Range of Numbers

代码语言:javascript复制
String[] letters = {"A", "B", "C", "D"};
List<String> listLetters =
    IntStream.range(0, 10)
             .mapToObj(index -> letters[index % letters.length])
             .collect(Collectors.toList());
System.out.println("listLetters = "   listLetters);

Creating a Stream of Random Numbers

代码语言:javascript复制
Random random = new Random(314L);
List<Integer> randomInts = 
    random.ints(10, 1, 5)
          .boxed()
          .collect(Collectors.toList());
System.out.println("randomInts = "   randomInts);

Creating a Stream from the Characters of a String

Java SE 10

代码语言:javascript复制
String sentence = "Hello Duke";
List<String> letters =
    sentence.chars()
            .mapToObj(codePoint -> (char)codePoint)
            .map(Object::toString)
            .collect(Collectors.toList());
System.out.println("letters = "   letters);

Creating a Stream from the Lines of a Text File

代码语言:javascript复制
Path log = Path.of("/tmp/debug.log"); // adjust to fit your installation
try (Stream<String> lines = Files.lines(log)) {
    
    long warnings = 
        lines.filter(line -> line.contains("WARNING"))
             .count();
    System.out.println("Number of warnings = "   warnings);
    
} catch (IOException e) {
    // do something with the exception
}

Creating a Stream from a Regular Expression

代码语言:javascript复制
String sentence = "For there is good news yet to hear and fine things to be seen";

Pattern pattern = Pattern.compile(" ");
Stream<String> stream = pattern.splitAsStream(sentence);
List<String> words = stream.collect(Collectors.toList());

System.out.println("words = "   words);

Creating a Stream with the Builder Pattern

代码语言:javascript复制
Stream.Builder<String> builder = Stream.<String>builder();

builder.add("one")
       .add("two")
       .add("three")
       .add("four");

Stream<String> stream = builder.build();

List<String> list = stream.collect(Collectors.toList());
System.out.println("list = "   list);

Creating a Stream on an HTTP Source

代码语言:javascript复制
// The URI of the file
URI uri = URI.create("https://www.gutenberg.org/files/98/98-0.txt");

// The code to open create an HTTP request
HttpClient client = HttpClient.newHttpClient();
HttpRequest request = HttpRequest.newBuilder(uri).build();


// The sending of the request
HttpResponse<Stream<String>> response = client.send(request, HttpResponse.BodyHandlers.ofLines());
List<String> lines;
try (Stream<String> stream = response.body()) {
    lines = stream
        .dropWhile(line -> !line.equals("A TALE OF TWO CITIES"))
        .takeWhile(line -> !line.equals("*** END OF THE PROJECT GUTENBERG EBOOK A TALE OF TWO CITIES ***"))
        .collect(Collectors.toList());
}
System.out.println("# lines = "   lines.size());

Reducing a Stream

Compute a reduction by just providing a binary operator that operates on only two elements. This is how the reduce() method works in the Stream API.

代码语言:javascript复制
Stream<Integer> ints = Stream.of(0, 0, 0, 0);

int sum = ints.reduce(10, (a, b) -> a   b);
System.out.println("sum = "   sum);

Adding a Terminal Operation

In fact, you should use this reduce() method as a last resort, only if you have no other solution.

要想reduce stream,还有其他更多方法,比如count()、sum()等。

count()

代码语言:javascript复制
Collection<String> strings =
        List.of("one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten");

long count =
        strings.stream()
                .filter(s -> s.length() == 3)
                .count();
System.out.println("count = "   count);

forEach()

代码语言:javascript复制
Stream<String> strings = Stream.of("one", "two", "three", "four");
strings.filter(s -> s.length() == 3)
       .map(String::toUpperCase)
       .forEach(System.out::println);

collect()

代码语言:javascript复制
Stream<String> strings = Stream.of("one", "two", "three", "four");

List<String> result = 
    strings.filter(s -> s.length() == 3)
           .map(String::toUpperCase)
           .collect(Collectors.toList());

max() min()

代码语言:javascript复制
Stream<String> strings = Stream.of("one", "two", "three", "four");
String longest =
     strings.max(Comparator.comparing(String::length))
            .orElseThrow();
System.out.println("longest = "   longest);

findFirst() findAny()

代码语言:javascript复制
Collection<String> strings =
        List.of("one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten");

String first =
    strings.stream()
           // .unordered()
           // .parallel()
           .filter(s -> s.length() == 3)
           .findFirst()
           .orElseThrow();

System.out.println("first = "   first);

allMatch() anyMatch() noneMatch()

代码语言:javascript复制
Collection<String> strings =
    List.of("one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten");

boolean noBlank  = 
        strings.stream()
               .allMatch(Predicate.not(String::isBlank));
boolean oneGT3   = 
        strings.stream()
               .anyMatch(s -> s.length() == 3);
boolean allLT10  = 
        strings.stream()
               .noneMatch(s -> s.length() > 10);
        
System.out.println("noBlank = "   noBlank);
System.out.println("oneGT3  = "   oneGT3);
System.out.println("allLT10 = "   allLT10);

Finding the Characteristics

ORDERED

The order in which the elements of the stream are processed matters.

DISTINCT

There are no doubles in the elements processed by that stream.

NONNULL

There are no null elements in that stream.

SORTED

The elements of that stream are sorted.

SIZED

The number of elements this stream processes is known.

SUBSIZED

Splitting this stream produces two SIZED streams.

代码语言:javascript复制
Collection<String> stringCollection = List.of("one", "two", "two", "three", "four", "five");

Stream<String> strings = stringCollection.stream().sorted();
Stream<String> filteredStrings = strings.filtered(s -> s.length() < 5);
Stream<Integer> lengths = filteredStrings.map(String::length);
代码语言:javascript复制
Collection<String> stringCollection = List.of("one", "two", "two", "three", "four", "five");

Stream<String> strings = stringCollection.stream().distinct();
Stream<String> filteredStrings = strings.filtered(s -> s.length() < 5);
Stream<Integer> lengths = filteredStrings.map(String::length);

Using a Collector

代码语言:javascript复制
List<Integer> numbers =
IntStream.range(0, 10)
         .boxed()
         .collect(Collectors.toList());
System.out.println("numbers = "   numbers);
代码语言:javascript复制
Set<Integer> evenNumbers =
IntStream.range(0, 10)
         .map(number -> number / 2)
         .boxed()
        .collect(Collectors.toSet());
System.out.println("evenNumbers = "   evenNumbers);
代码语言:javascript复制
LinkedList<Integer> linkedList =
IntStream.range(0, 10)
         .boxed()
         .collect(Collectors.toCollection(LinkedList::new));
System.out.println("linked listS = "   linkedList);

couting

代码语言:javascript复制
Collection<String> strings = List.of("one", "two", "three");

long count = strings.stream().count();
long countWithACollector = strings.stream().collect(Collectors.counting());

System.out.println("count = "   count);
System.out.println("countWithACollector = "   countWithACollector);

joining

代码语言:javascript复制
String joined = 
    IntStream.range(0, 10)
             .boxed()
             .map(Object::toString)
             .collect(Collectors.joining(", "));

System.out.println("joined = "   joined);

partitioningBy

代码语言:javascript复制
Collection<String> strings =
    List.of("one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
            "ten", "eleven", "twelve");

Map<Boolean, List<String>> map =
    strings.stream()
           .collect(Collectors.partitioningBy(s -> s.length() > 4));

map.forEach((key, value) -> System.out.println(key   " :: "   value));

groupingBy

代码语言:javascript复制
Collection<String> strings =
    List.of("one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
            "ten", "eleven", "twelve");

Map<Integer, List<String>> map =
    strings.stream()
           .collect(Collectors.groupingBy(String::length));

map.forEach((key, value) -> System.out.println(key   " :: "   value));

groupingBy counting

代码语言:javascript复制
Collection<String> strings =
        List.of("one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
                "ten", "eleven", "twelve");

Map<Integer, Long> map =
    strings.stream()
           .collect(
               Collectors.groupingBy(
                   String::length, 
                   Collectors.counting()));

map.forEach((key, value) -> System.out.println(key   " :: "   value));
代码语言:javascript复制
3 :: 4
4 :: 3
5 :: 3
6 :: 2

groupingBy joining

代码语言:javascript复制
Collection<String> strings =
        List.of("one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
                "ten", "eleven", "twelve");

Map<Integer, String> map =
        strings.stream()
                .collect(
                        Collectors.groupingBy(
                                String::length,
                                Collectors.joining(", ")));
map.forEach((key, value) -> System.out.println(key   " :: "   value));
代码语言:javascript复制
3 :: one, two, six, ten
4 :: four, five, nine
5 :: three, seven, eight
6 :: eleven, twelve

toMap

代码语言:javascript复制
Collection<String> strings =
    List.of("one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
            "ten", "eleven", "twelve");

Map<Integer, String> map =
    strings.stream()
            .collect(
                    Collectors.toMap(
                            element -> element.length(),
                            element -> element, 
                            (element1, element2) -> element1   ", "   element2));

map.forEach((key, value) -> System.out.println(key   " :: "   value));
代码语言:javascript复制
3 :: one, two, six, ten
4 :: four, five, nine
5 :: three, seven, eight
6 :: eleven, twelve
  1. element -> element.length() is the key mapper.
  2. element -> element is the value mapper.
  3. (element1, element2) -> element1 ", " element2) is the merge function, called with the two elements that have generated the same key.

Parallelizing Streams

代码语言:javascript复制
int parallelSum = 
    IntStream.range(0, 10)
             .parallel()
             .sum();

参考资料: The Stream API https://dev.java/learn/api/streams/

0 人点赞