21 questions
-3
votes
1
answer
145
views
Flink Job Manager Direct Buffer Memory gets exhausted when checkpointing enabled
Issue:
Flink application throws Thread 'jobmanager-io-thread-25' produced an uncaught exception. java.lang.OutOfMemoryError: Direct buffer memory and terminates after running for 2-3 days.
No matter ...
1
vote
1
answer
212
views
Problem with Kafka committing offsets when stopping a Flink job (AT_LEAST_ONCE and EXACTLY_ONCE)
We are using Apache Flink with Kafka and the AT_LEAST_ONCE strategy. The following problem occurs: when stopping a job, Flink commits offsets for messages that have not yet been processed, resulting ...
0
votes
0
answers
29
views
Flink checkpoint storage keeps on increasing in taskmanager
I have a flink Streaming job that takes in stream of events as input from kafka. This stream is then filtered, and the events are then collected in EventtimeWindow of 1 min. This 1 minute window is ...
-2
votes
1
answer
121
views
What data is stored in flink checkpoints?
I have a case to handle restart flink job. I need use checkpointing and use metadata (state of kafkasource input) of it to process. Currently, checkpointing auto use metadata to recovery, but i wanna ...
0
votes
0
answers
71
views
Role of execution.checkpointing.min-pause in case of checkpoint failure
Basic question but just got stumbled on expected behavior of
execution.checkpointing.min-pause in case of checkpoint failure and appreciate your help to correct my understanding
Assuming:
execution....
1
vote
0
answers
96
views
Why is my Flink Tumbling window's checkpointed data size keeps growing
I have a Flink pipeline with RMQ source, some filters, enrichers ,keyby, aggregator ,Tumbling window(of 2sec,no watermark strategy is used. Processing time is used for windows trigger) and then RMQ ...
0
votes
1
answer
239
views
How to configure State TTL for aggregateFunction along with windows operator
I am having a flink streaming pipeline with Rabbitmq source,some filter, map , aggregatorFunction and windows opertors (Tumbling window with 5mins), Rabbitmq sink configured. And I'm using incremental ...
1
vote
1
answer
563
views
Flink Checkpoints Stalling and Timing Out with Latency-related Errors
Recently, I've upgraded an existing Flink job (previously running Flink 1.15) to run against the official Flink Kubernetes Operator (targeting Flink 1.18) and have started to see some strange ...
1
vote
0
answers
218
views
Flink HA unable to recover jobs from recovery folder in S3 and jobmanager goes in CrashLoopBackOff indefinitely
The flink cluster is running on version 1.14.3. The JobManager goes in CrashLoopBackOff and unable to recover the flink jobs from the HA recovery folder.
ERROR: The recovery file ...
1
vote
0
answers
594
views
Flink checkpoints retained
I have deployed task manager and jobmanager in docker container and configured following in flink-conf.yaml.
state.checkpoints.dir: file:///tmp/flink/checkpoints
state.storage.fs.memory-threshold: 0
...
1
vote
1
answer
489
views
How to recover Flink job (Kubernetes) from savepoint?
I'm running Flink 1.14 app (jar embded in docker image) over Kubernetes cluster. Configs like parallelism, numberOfTaskSlots and etc. are specified in ConfigMap as flink-conf.yaml.
Checkpoint ...
1
vote
0
answers
45
views
Modifying State in Flink's Aggregate Function within a Windowed Operation
I'm working on a Flink application where I'm using an aggregate function over a window. I've been able to successfully read the state of its outputs using the provided APIs.
However, in my specific ...
0
votes
1
answer
1k
views
Incremental Checkpoint Data Size ( Flink)
we are using Flink and we have enabled the incremental checkpointing by setting state.backend.incremental=true. We are using rocksdb as state backend. With incremental checkpointing, we expect "...
0
votes
0
answers
504
views
Old Flink checkpoints not always correctly deleted
Summary
We are using Flink 1.15.1 and have long-running stateful Flink jobs ingesting data from Kafka topics. They are configured to write checkpoints, with a RocksDB backend on S3.
We have noticed ...
0
votes
1
answer
294
views
Flink: Key serializer used in Java DataSet API incompatible with that used in Scala DataStream API
Our Flink code (currently, using Flink 1.12) is written in Scala and generally contains a bunch of keyed time windows and process functions as operators. We have externalized savepoint storage to ...
1
vote
1
answer
870
views
Flink upload checkpoint to AWS S3 ERROR: Forbidden Status Code: 403
I deployed a flink application on a Kind cluster(https://kind.sigs.k8s.io/)(1 master & 2 worker nodes) using a yaml file.
As I want to upload flink checkpoint to a S3 bucket, I manually created ...
0
votes
2
answers
226
views
upgrade Flink minor version and restore from checkpoint
From official doc, it says Flink support minor version upgrade - restoring a snapshot taken with an older minor version of Flink (1.x → 1.y)..
Q1. Does it means I can upgrade Flink version of my job ...
0
votes
1
answer
351
views
Flink Incremental CheckPointing Compaction
We have a forever running flink job which reads from kafka , creates sliding time windows with (stream intervals :1hr , 2 hr to 24 hr) and (slide intervals : 1 min , 10 min to 1 hours).
basically ...
0
votes
1
answer
893
views
How to Control Size of Flink Checkpoints
I am running a simple Flink aggregation job which consumes from Kafka and applies multiple windows(1 hr, 2 hr...upto 24 hours) with specific sliding interval and does the aggregation on windows. ...
0
votes
1
answer
380
views
Retain Flink Checkpoint on cancellation
I'm using Flink 1.15.0 and I want to keep triggered checkpoint when job is cancelled.
Flink indicates to set ExternalizeCheckpointCleanup mode in this way
env.getCheckpointConfig()....
1
vote
1
answer
1k
views
Apache Flink - streaming app doesn't start from checkpoint after stop and start
I have the following Flink streaming application running locally, written with the SQL API:
object StreamingKafkaJsonsToCsvLocalFs {
val brokers = "localhost:9092"
val topic = "...