Flink Forward 2019--Flink Topics (2)--How to Join Two Streams

2019-06-21 16:12:26

How to Join Two Data Streams--Piotr Nowojski (Ververica)

Joins are one of the most common operations in SQL. However, it is far from trivial to express and execute them in a streaming environment with continuously running queries. During this talk we will first look into why join operations are more difficult on infinite data streams. Next we will check a couple of different approaches to tackling this problem, such as Time Windowed Joins or the recent addition to Flink SQL: Temporal Joins. Temporal Tables and Temporal Joins are new concepts that provide an efficient solution to a common problem such as data enrichment. Before Flink 1.7, data enrichment in SQL was often impossible to express using Windowed Joins, or very inefficient when using Regular Joins. With Temporal Joins, Flink provides an interesting and ANSI SQL compliant alternative way to join two data streams.
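To make the data enrichment case concrete, here is a minimal sketch in plain Python of what a temporal join computes: each row of an append-only stream is matched with the version of a changing table that was valid at that row's timestamp. This only illustrates the semantics and is not Flink's API or implementation. The names `rates_history`, `orders`, `rate_as_of` and `temporal_join`, as well as the sample data, are hypothetical and do not come from the talk.

```python
from bisect import bisect_right

# Hypothetical versioned table: currency -> list of (valid_from, rate),
# sorted by valid_from. Each entry is one version of the row.
rates_history = {
    "EUR": [(0, 1.10), (10, 1.15), (20, 1.20)],
    "GBP": [(0, 1.30), (15, 1.25)],
}

# Hypothetical append-only stream of (order_time, currency, amount).
orders = [
    (5, "EUR", 100.0),
    (12, "EUR", 100.0),
    (16, "GBP", 10.0),
    (25, "EUR", 2.0),
]


def rate_as_of(currency, ts):
    """Return the rate version valid at time ts, or None if none exists yet."""
    versions = rates_history.get(currency, [])
    valid_from = [v[0] for v in versions]
    # Index of the last version whose valid_from is <= ts.
    idx = bisect_right(valid_from, ts) - 1
    return versions[idx][1] if idx >= 0 else None


def temporal_join(order_stream):
    """Enrich each order with the rate that was valid at its order time."""
    for ts, currency, amount in order_stream:
        rate = rate_as_of(currency, ts)
        if rate is not None:  # inner-join behaviour: drop unmatched orders
            yield ts, currency, amount, rate


enriched = list(temporal_join(orders))
for row in enriched:
    print(row)
```

In this sketch every order is matched against exactly one version of the rate, the one valid at the order's own timestamp, so a later rate update never changes a result that was already emitted.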

The recording of the talk has been uploaded to Bilibili: https://www.bilibili.com/video/av53226934/
