My team migrated to Airbyte at the start of 2024, and mid-year we started using the SQL Server CDC capability.
However, one of the jobs has started failing again with a Java heap space error. The CDC _CT table had approx. 91M rows at the time of investigation, and the log file retention period on the DB is 3 days.
An important point to note: the job, which contains only one table, runs fine when syncing about 2M rows several times a day.
However, once a month, when a month-end process kicks off and initiates a large change on the table, the job starts failing.
This is our current values.yml configuration:
global:
  edition: "community"
  jobs:
    resources:
      limits:
        cpu: 1000m
        memory: 12Gi ## e.g. 500m
      requests:
        cpu: 500m
        memory: 2Gi
  env_vars:
    HTTP_IDLE_TIMEOUT: 1800s
    DEBEZIUM_MAX_QUEUE_SIZE_IN_BYTES: 536870912
    #LOG_LEVEL: DEBUG
    CDC_LOG_LEVEL: DEBUG
    #DEBEZIUM_LOG_LEVEL: DEBUG
    MSSQL_CDC_LOG_LEVEL: DEBUG
    JOB_MAIN_CONTAINER_MEMORY_REQUEST: 2Gi
    JOB_MAIN_CONTAINER_MEMORY_LIMIT: 15Gi
    NORMALIZATION_JOB_MAIN_CONTAINER_MEMORY_REQUEST: 2Gi
    NORMALIZATION_JOB_MAIN_CONTAINER_MEMORY_LIMIT: 8Gi
    JAVA_OPTS: "-XX:+ExitOnOutOfMemoryError -XX:MaxRAMPercentage=80.0 -XX:+UseG1GC"
webapp:
  ingress:
    annotations:
      kubernetes.io/ingress.class: internal
      nginx.ingress.kubernetes.io/proxy-body-size: 16m
      nginx.ingress.kubernetes.io/proxy-send-timeout: 1800
      nginx.ingress.kubernetes.io/proxy-read-timeout: 1800
airbyte-bootloader:
  resources:
    limits:
      cpu: 1000m
      memory: 5Gi ## e.g. 500m
    requests:
      cpu: 500m
      memory: 1Gi
worker:
  enabled: true
  # -- Number of worker replicas
  replicaCount: 1
  image:
    # -- The repository to use for the airbyte worker image.
    repository: airbyte/worker
    # -- the pull policy to use for the airbyte worker image
    pullPolicy: IfNotPresent
  ## worker resource requests and limits
  ## ref: http://kubernetes.io/docs/user-guide/compute-resources/
  ## We usually recommend not to specify default resources and to leave this as a conscious
  ## choice for the user. This also increases chances charts run on environments with little
  ## resources, such as Minikube. If you do want to specify resources, uncomment the following
  ## lines, adjust them as necessary, and remove the curly braces after 'resources:'.
  resources:
    #! -- The resources limits for the worker container
    limits:
      memory: 5Gi
      cpu: 500m
    # -- The requested resources for the worker container
    requests:
      memory: 1Gi
      cpu: 250m
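
For reference, here is a rough sketch of what these settings would imply for the JVM heap. It assumes that the JAVA_OPTS value actually reaches the source connector's JVM and that the JVM treats the container memory limit as its total RAM; I have not confirmed either, and I am not sure whether the 12Gi limit under jobs.resources or the 15Gi JOB_MAIN_CONTAINER_MEMORY_LIMIT is the one in effect, so both are computed. The helper names are made up for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def parse_k8s_memory(value: str) -> int:
    """Convert a Kubernetes binary quantity such as '15Gi' to bytes.

    Hypothetical helper: only the Gi and Mi suffixes used in the config are handled.
    """
    units = {"Gi": GIB, "Mi": MIB}
    for suffix, factor in units.items():
        if value.endswith(suffix):
            return int(float(value[: -len(suffix)]) * factor)
    return int(value)  # plain byte count


def max_heap_bytes(container_limit: str, max_ram_percentage: float) -> int:
    """Approximate max heap for -XX:MaxRAMPercentage, assuming the JVM
    sees the container limit as its available memory."""
    return int(parse_k8s_memory(container_limit) * max_ram_percentage / 100.0)


# Values taken from the values.yml above.
heap_15 = max_heap_bytes("15Gi", 80.0)  # JOB_MAIN_CONTAINER_MEMORY_LIMIT
heap_12 = max_heap_bytes("12Gi", 80.0)  # global.jobs.resources.limits.memory
queue_bytes = 536870912                 # DEBEZIUM_MAX_QUEUE_SIZE_IN_BYTES

print(f"heap at 15Gi limit: {heap_15 / GIB:.1f} GiB")
print(f"heap at 12Gi limit: {heap_12 / GIB:.1f} GiB")
print(f"Debezium queue cap: {queue_bytes / MIB:.0f} MiB")
```

Under those assumptions the heap ceiling works out to about 12 GiB or 9.6 GiB depending on which limit applies, and the Debezium queue cap is 512 MiB, so on paper the queue setting alone should not exhaust the heap.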
Related GitHub discussion raised on Nov 05: https://github.com/airbytehq/airbyte/discussions/48348?sort=new