All Articles

foreign join on kafka stream

๋ฉ”์‹œ์ง€์˜ ํ‚ค = ํŒŒํ‹ฐ์…˜ ํ‚ค ๊ฐ€์ง€๊ณ  ์ŠคํŠธ๋ฆผ-์ŠคํŠธ๋ฆผ ์กฐ์ธํ•˜๋Š” ๊ฒƒ์€ ๋งŽ์ด ๋ดค๋Š”๋ฐโ€ฆ์š”๊ตฌ์‚ฌํ•ญ์ด ํ•ญ์ƒ ๊ทธ๋Ÿฐ ๊ฒƒ์ด ์•„๋‹ˆ์–ด์„œ ํ‚ค๊ฐ€ ์•„๋‹Œ ํ•„๋“œ ๊ฐ€์ง€๊ณ  ์กฐ์ธํ•ด์•ผํ•  ๊ฒฝ์šฐ๋„ ์žˆ๋‹ค. ์ด๋Ÿฐ๊ฑฐ๋ฅผ ๋ญ๋ผ๊ณ  ๋ถ€๋ฅด๋Š”์ง€๋ฅผ ๋ชจ๋ฅด๋‹ค๊ฐ€ ์ด๋Ÿฐ ๊ธฐ๋Šฅ์„ Foreign-Key Join์ด๋ผ๊ณ  ๋ถ€๋ฅธ๋‹ค๋Š” ๊ฒƒ์„ ์•Œ๊ฒŒ ๋˜์—ˆ๋‹ค.

https://www.confluent.io/resources/kafka-summit-2020/crossing-the-streams-the-new-streaming-foreign-key-join-feature-in-kafka-streams/

์˜์ƒ์€ ์ด๋ฉ”์ผ, ์ด๋ฆ„ ๋“ฑ์„ ์ž…๋ ฅํ•ด์•ผ ๋ณผ ์ˆ˜ ์žˆ์ง€๋งŒ, ์Šฌ๋ผ์ด๋“œ๋งŒ ๋ด๋„ ํฐ ๋ฌด๋ฆฌ๋Š” ์—†๋‹ค. ์นดํ”„์นด 2.4๋ถ€ํ„ฐ ์žˆ๋‹ค๋‹ˆ๊นŒ ์ง€๊ธˆ ์“ธ ์ˆ˜ ์žˆ๋Š” ๊ธฐ๋Šฅ์ด๊ธฐ๋„ ํ•˜๋‹ค (KSQL์—์„œ๋Š” ์•„์ง ์ง€์›ํ•˜์ง€ ์•Š๋Š”๋‹ค https://github.com/confluentinc/ksql/issues/4424)

์ข€ ์šฐ๋ ค?๊ฐ€ ๋๋˜ ๊ฒƒ์€ ๋น„์šฉ์ด๋‹ค. Ktable - Ktable์„ joinํ•˜๋ฉด ์˜ˆ๋ฅผ ๋“ค์–ด์„œ left join์ด๋ผ๊ณ  ํ•˜๋ฉด ์™ผ์ชฝ ํ…Œ์ด๋ธ” ํฌ๊ธฐ ๋งŒํผ rocksdb ์Šคํ† ๋ฆฌ์ง€๊ฐ€ ํ•„์š”ํ•˜๊ณ , ์นดํ”„์นด ๋ธŒ๋กœ์ปค ์Šคํ† ๋ฆฌ์ง€๋„ ํ•„์š”ํ•˜๋‹ค (Log compaction์ด ๋˜๋„ ์–ด์จŒ๋“  ํ‚ค์˜ ๊ฐฏ์ˆ˜์— ๋น„๋ก€ํ•ด์„œ ๊ณต๊ฐ„์ด ํ•„์š”ํ•˜๋‹ค sizing ๋ฌธ์„œ ์ฐธ์กฐ). ๊ฒŒ๋‹ค๊ฐ€ ๋‹น์—ฐํžˆ ๊ธฐ์กด์— ์žˆ๋˜ Ktable - ktable๋“ค๋„ ๊ณต๊ฐ„์„ ์ฐจ์ง€ํ•œ๋‹ค. ์›๋ž˜๋ถ€ํ„ฐ ์นดํ”„์นด๋ฅผ confluent์—์„œ ๋ฐ€๋“ฏ์ด source of truth๋กœ ์‚ผ๊ณ , ์ŠคํŠธ๋ฆผ์„ ๋งŽ์ด ์ผ์œผ๋ฉด ๋ณ„ ์ƒ๊ด€์ด ์—†๊ฒ ์ง€๋งŒ, ๋‹จ์ˆœํžˆ ๋ฉ”์‹œ์ง€ ์ „์†ก ๋ ˆ์ด์–ด๋กœ๋งŒ ์ƒ๊ฐ์„ ํ–ˆ๋‹ค๋ฉด ํˆฌ์ž๋ฅผ ํ›จ์”ฌ ๋Š˜๋ ค์•ผ ํ•  ๊ฒƒ์ด๋‹ค. ์‚ฌ์‹ค Stateful ์ŠคํŠธ๋ฆฌ๋ฐ์ด ์šฉ๋Ÿ‰ ์ฐจ์ง€ํ•˜๋Š” ๊ฒƒ์€ ๋‹น์—ฐํ•œ ๊ฒƒ์ด๊ณ , ์ด๊ฒŒ ์•„๊นŒ์šฐ๋ฉด ๊ทธ๋ƒฅ ์ฝ์„ ๋•Œ joinํ•˜๋ฉด ๋œ๋‹ค.


์ฐธ๊ณ ๋กœ Flink์—์„œ๋Š” ์ด๊ฑธ regular join์ด๋ผ๊ณ  ๋ถ€๋ฅธ๋‹ค (PK, foreignkey ๊ตฌ๋ถ„์„ ์•ˆ ํ•จ) https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/streaming/joins.html#regular-joins https://stackoverflow.com/questions/57118053/flink-dynamic-table-vs-kafka-stream-ktable

Published 30 Jan 2021

If I keep marking the dots, someday they will ๐Ÿ”—๐Ÿ”—
Hyeungshik Jung on Twitter