Flink offers TTL configuration for managed state and, when RocksDB is the state backend, performs the cleanup in a custom compaction filter (if I understand correctly). However, for keyed per-window state in a ProcessWindowFunction, we are expected to override the clear method and explicitly call something like

context.windowState().*.clear()

If the state descriptor does not configure TTL, does cleanup still occur after the clear callback? If not, and cleanup for this type of state depends solely on sizes in RocksDB's levels, what's the default setting and is it configurable?

1 Answer

If the state descriptor does not configure TTL, does cleanup still occur after the clear callback?

Yes, unless the state descriptor was used to create state in the KeyedStateStore returned by ProcessWindowFunction.Context#globalState. This global state is the only state that is kept after windows are cleared. If you have an ever-growing key space, you should configure state TTL for any globalState you use; otherwise, globalState for stale keys will never be cleaned up.
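To make the distinction concrete, here is a minimal sketch of a ProcessWindowFunction that uses both kinds of state. The class name, the state names, and the 7-day TTL are hypothetical, and it assumes a Flink release where StateTtlConfig.newBuilder accepts a java.time.Duration (older 1.x releases take org.apache.flink.api.common.time.Time instead). The per-window state is cleared explicitly in clear(), while only the global state carries a TTL.

```java
import java.time.Duration;

import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

// Hypothetical example: tracks how often a window fired (per-window state)
// and how many firings a key has seen overall (global state).
public class FiringCounter
        extends ProcessWindowFunction<Long, String, String, TimeWindow> {

    // Per-window state: no TTL, it is removed in clear().
    private final ValueStateDescriptor<Long> perWindowDesc =
            new ValueStateDescriptor<>("firingsInWindow", Long.class);

    // Global state: survives window cleanup, so it gets a TTL.
    private final ValueStateDescriptor<Long> globalDesc =
            new ValueStateDescriptor<>("firingsForKey", Long.class);

    public FiringCounter() {
        // Assumption: 7 days is long enough for a key to count as stale.
        // On older 1.x releases, pass Time.days(7) instead of a Duration.
        StateTtlConfig ttl = StateTtlConfig.newBuilder(Duration.ofDays(7)).build();
        globalDesc.enableTimeToLive(ttl);
    }

    @Override
    public void process(
            String key,
            Context context,
            Iterable<Long> elements,
            Collector<String> out) throws Exception {

        ValueState<Long> perWindow = context.windowState().getState(perWindowDesc);
        ValueState<Long> global = context.globalState().getState(globalDesc);

        long windowFirings = orZero(perWindow.value()) + 1;
        long keyFirings = orZero(global.value()) + 1;
        perWindow.update(windowFirings);
        global.update(keyFirings);

        out.collect(key + " windowFirings=" + windowFirings + " keyFirings=" + keyFirings);
    }

    @Override
    public void clear(Context context) throws Exception {
        // Called when the window is purged; drop the per-window state here.
        // The global state is deliberately left alone and relies on its TTL.
        context.windowState().getState(perWindowDesc).clear();
    }

    private static long orZero(Long value) {
        return value == null ? 0L : value;
    }
}
```

It would be attached in the usual way, e.g. `stream.keyBy(...).window(...).process(new FiringCounter())`.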

FWIW, there's nothing RocksDB-specific about this. The answer is the same for any of the state backends.
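Where the backend does become visible is in the optional cleanup strategies on StateTtlConfig, which only tune how eagerly expired entries are physically removed. The following is a sketch rather than a recommendation: the helper class name, the TTL, and the numeric arguments are assumptions chosen for illustration, and it again assumes a Flink release where newBuilder accepts a java.time.Duration.

```java
import java.time.Duration;

import org.apache.flink.api.common.state.StateTtlConfig;

// Hypothetical helper that builds a TTL config for state kept in globalState.
public final class TtlConfigs {

    private TtlConfigs() {}

    public static StateTtlConfig globalStateTtl() {
        return StateTtlConfig.newBuilder(Duration.ofDays(7))
                // Refresh the TTL timestamp whenever the state is created or written.
                .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
                // Never hand expired values back to user code, even if not yet removed.
                .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
                // Heap backend: check up to 10 entries per state access.
                .cleanupIncrementally(10, false)
                // RocksDB backend: the compaction filter re-reads the current
                // timestamp after every 1000 processed entries.
                .cleanupInRocksdbCompactFilter(1000)
                .build();
    }
}
```

A descriptor opts in with `descriptor.enableTimeToLive(TtlConfigs.globalStateTtl())`. Each strategy only takes effect on the backend it targets, so the calling code stays the same whichever backend is configured.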


2 Comments

My understanding was that, for RocksDB, data is first marked as stale/expired and only deleted during compaction. Does your answer mean that compaction doesn't come into play at all in this scenario?
No, I'm just saying that the mechanism used for the cleanup of expired state is an implementation detail that doesn't affect how you use these APIs, or whether you need to set up TTL. Each state backend has some way to eventually clean up expired state.
