Newest 'flink-checkpoint' Questions

-3 votes

1 answer

145 views

Flink Job Manager Direct Buffer Memory gets exhausted when checkpointing enabled

Issue: Flink application throws Thread 'jobmanager-io-thread-25' produced an uncaught exception. java.lang.OutOfMemoryError: Direct buffer memory and terminates after running for 2-3 days. No matter ...

Strange

1,514

asked Nov 12 at 18:14

1 vote

1 answer

212 views

Problem with Kafka committing offsets when stopping a Flink job (AT_LEAST_ONCE and EXACTLY_ONCE)

We are using Apache Flink with Kafka and the AT_LEAST_ONCE strategy. The following problem occurs: when stopping a job, Flink commits offsets for messages that have not yet been processed, resulting ...

Руслан Цегельников

11

asked Feb 12 at 14:07

0 votes

0 answers

29 views

Flink checkpoint storage keeps on increasing in taskmanager

I have a flink Streaming job that takes in stream of events as input from kafka. This stream is then filtered, and the events are then collected in EventtimeWindow of 1 min. This 1 minute window is ...

user24351367

1

asked Feb 6 at 11:15

-2 votes

1 answer

121 views

What data is stored in flink checkpoints?

I have a case to handle restart flink job. I need use checkpointing and use metadata (state of kafkasource input) of it to process. Currently, checkpointing auto use metadata to recovery, but i wanna ...

ndycuong

1

asked Jan 21 at 8:22

0 votes

0 answers

71 views

Role of execution.checkpointing.min-pause in case of checkpoint failure

Basic question but just got stumbled on expected behavior of execution.checkpointing.min-pause in case of checkpoint failure and appreciate your help to correct my understanding Assuming: execution....

Faisal Ahmed Siddiqui

717

asked Dec 5, 2024 at 16:55

1 vote

0 answers

96 views

Why is my Flink Tumbling window's checkpointed data size keeps growing

I have a Flink pipeline with RMQ source, some filters, enrichers ,keyby, aggregator ,Tumbling window(of 2sec,no watermark strategy is used. Processing time is used for windows trigger) and then RMQ ...

Banupriya

194

asked May 15, 2024 at 5:11

0 votes

1 answer

239 views

How to configure State TTL for aggregateFunction along with windows operator

I am having a flink streaming pipeline with Rabbitmq source,some filter, map , aggregatorFunction and windows opertors (Tumbling window with 5mins), Rabbitmq sink configured. And I'm using incremental ...

Banupriya

194

asked May 8, 2024 at 4:24

1 vote

1 answer

563 views

Flink Checkpoints Stalling and Timing Out with Latency-related Errors

Recently, I've upgraded an existing Flink job (previously running Flink 1.15) to run against the official Flink Kubernetes Operator (targeting Flink 1.18) and have started to see some strange ...

Rion Williams

76.9k

asked Apr 15, 2024 at 16:33

1 vote

0 answers

218 views

Flink HA unable to recover jobs from recovery folder in S3 and jobmanager goes in CrashLoopBackOff indefinitely

The flink cluster is running on version 1.14.3. The JobManager goes in CrashLoopBackOff and unable to recover the flink jobs from the HA recovery folder. ERROR: The recovery file ...

Rupesh More

45

asked Jan 19, 2024 at 6:51

1 vote

0 answers

594 views

Flink checkpoints retained

I have deployed task manager and jobmanager in docker container and configured following in flink-conf.yaml. state.checkpoints.dir: file:///tmp/flink/checkpoints state.storage.fs.memory-threshold: 0 ...

Banupriya

194

asked Dec 5, 2023 at 6:40

1 vote

1 answer

489 views

How to recover Flink job (Kubernetes) from savepoint?

I'm running Flink 1.14 app (jar embded in docker image) over Kubernetes cluster. Configs like parallelism, numberOfTaskSlots and etc. are specified in ConfigMap as flink-conf.yaml. Checkpoint ...

deeplay

416

asked Oct 9, 2023 at 18:50

1 vote

0 answers

45 views

Modifying State in Flink's Aggregate Function within a Windowed Operation

I'm working on a Flink application where I'm using an aggregate function over a window. I've been able to successfully read the state of its outputs using the provided APIs. However, in my specific ...

Jakub Berezowski

11

asked Sep 20, 2023 at 9:29

0 votes

1 answer

1k views

Incremental Checkpoint Data Size ( Flink)

we are using Flink and we have enabled the incremental checkpointing by setting state.backend.incremental=true. We are using rocksdb as state backend. With incremental checkpointing, we expect "...

vikas J

11

asked Aug 24, 2023 at 18:16

0 votes

0 answers

504 views

Old Flink checkpoints not always correctly deleted

Summary We are using Flink 1.15.1 and have long-running stateful Flink jobs ingesting data from Kafka topics. They are configured to write checkpoints, with a RocksDB backend on S3. We have noticed ...

Colin Smetz

181

asked Aug 7, 2023 at 14:42

0 votes

1 answer

294 views

Flink: Key serializer used in Java DataSet API incompatible with that used in Scala DataStream API

Our Flink code (currently, using Flink 1.12) is written in Scala and generally contains a bunch of keyed time windows and process functions as operators. We have externalized savepoint storage to ...

vanrzk

11

asked Jul 25, 2023 at 10:15

1 vote

1 answer

870 views

Flink upload checkpoint to AWS S3 ERROR: Forbidden Status Code: 403

I deployed a flink application on a Kind cluster(https://kind.sigs.k8s.io/)(1 master & 2 worker nodes) using a yaml file. As I want to upload flink checkpoint to a S3 bucket, I manually created ...

aqqqqqq

15

asked Jul 5, 2023 at 22:22

0 votes

2 answers

226 views

upgrade Flink minor version and restore from checkpoint

From official doc, it says Flink support minor version upgrade - restoring a snapshot taken with an older minor version of Flink (1.x → 1.y).. Q1. Does it means I can upgrade Flink version of my job ...

Shenjiaqi

11

asked Jan 29, 2023 at 8:39

0 votes

1 answer

351 views

Flink Incremental CheckPointing Compaction

We have a forever running flink job which reads from kafka , creates sliding time windows with (stream intervals :1hr , 2 hr to 24 hr) and (slide intervals : 1 min , 10 min to 1 hours). basically ...

Pritam Agarwala

1

asked Nov 7, 2022 at 15:28

0 votes

1 answer

893 views

How to Control Size of Flink Checkpoints

I am running a simple Flink aggregation job which consumes from Kafka and applies multiple windows(1 hr, 2 hr...upto 24 hours) with specific sliding interval and does the aggregation on windows. ...

Pritam Agarwala

1

asked Nov 3, 2022 at 8:04

0 votes

1 answer

380 views

Retain Flink Checkpoint on cancellation

I'm using Flink 1.15.0 and I want to keep triggered checkpoint when job is cancelled. Flink indicates to set ExternalizeCheckpointCleanup mode in this way env.getCheckpointConfig()....

Vin

873

asked Sep 1, 2022 at 16:20

1 vote

1 answer

1k views

Apache Flink - streaming app doesn't start from checkpoint after stop and start

I have the following Flink streaming application running locally, written with the SQL API: object StreamingKafkaJsonsToCsvLocalFs { val brokers = "localhost:9092" val topic = "...

Gabio

9,564

asked Mar 14, 2022 at 16:08

Collectives™ on Stack Overflow

Flink Job Manager Direct Buffer Memory gets exhausted when checkpointing enabled

Problem with Kafka committing offsets when stopping a Flink job (AT_LEAST_ONCE and EXACTLY_ONCE)

Flink checkpoint storage keeps on increasing in taskmanager

What data is stored in flink checkpoints?

Role of execution.checkpointing.min-pause in case of checkpoint failure

Why is my Flink Tumbling window's checkpointed data size keeps growing

How to configure State TTL for aggregateFunction along with windows operator

Flink Checkpoints Stalling and Timing Out with Latency-related Errors

Flink HA unable to recover jobs from recovery folder in S3 and jobmanager goes in CrashLoopBackOff indefinitely

Flink checkpoints retained

How to recover Flink job (Kubernetes) from savepoint?

Modifying State in Flink's Aggregate Function within a Windowed Operation

Incremental Checkpoint Data Size ( Flink)

Old Flink checkpoints not always correctly deleted

Flink: Key serializer used in Java DataSet API incompatible with that used in Scala DataStream API

Flink upload checkpoint to AWS S3 ERROR: Forbidden Status Code: 403

upgrade Flink minor version and restore from checkpoint

Flink Incremental CheckPointing Compaction

How to Control Size of Flink Checkpoints

Retain Flink Checkpoint on cancellation

Apache Flink - streaming app doesn't start from checkpoint after stop and start

Hot Network Questions