
I have a Java pod that gets restarted after a few days.

Looking at kubectl describe pod ... there is just the following:

Last State:     Terminated
  Reason:       Error
  Exit Code:    137

This message, in my experience, usually means that there is an OutOfMemoryError somewhere, but looking at the log I don't see anything useful.

Is there a way to execute a script (or save a few files) just before the inevitable restart? Something that could help me identify the problem.

For example: in case the restart was caused by an OutOfMemoryError, it would be wonderful if I could save the memory dump or the garbage collection logs.

  • See docs.oracle.com/en/java/javase/17/docs/specs/man/java.html and look at the -XX:OnOutOfMemoryError option. Commented Aug 10, 2022 at 22:36
  • @tgdavies thanks, I had a look, but I was unable to use that option because the pod was restarted before I could do anything. Commented Aug 10, 2022 at 23:08
  • Probably not an OOM exception then. Maybe the JVM is trying to allocate more memory than the pod has available? Try running with a smaller heap size. Commented Aug 10, 2022 at 23:27
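Along the lines of that first comment, a diagnostic set of JVM flags might look like this (a sketch; -Xlog assumes Java 9+ unified logging, and /dumps is a placeholder path that should point at a mounted, persistent volume):

```shell
# Sketch: JVM flags to capture a heap dump and GC logs when an OOM occurs.
# /dumps is a placeholder path; persist it on a volume or the data is lost on restart.
JAVA_OPTS="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/dumps -Xlog:gc*:file=/dumps/gc.log"
echo "$JAVA_OPTS"
```

Note that if the container is killed by the kernel or kubelet (exit code 137 is SIGKILL) rather than by a Java-level OutOfMemoryError, these hooks never fire, which matches the behavior described in the question.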

2 Answers


There are a few ways to do that:

  • you can mount a volume in your application and configure log4j to write the log to a file on the volume, so the log will persist
  • the best solution is to use a log collector (Fluentd, Logstash) to save the logs in Elasticsearch or an S3 bucket, or to use a managed service like AWS CloudWatch, Datadog, ...

To address the OOM problem, you can set a bigger memory request for your application (2-4 GiB), then watch the memory usage with the top command or a cluster monitoring tool (e.g. Prometheus):

apiVersion: v1
kind: Pod
metadata:
  name: ...
spec:
  containers:
  - name: ...
    image: ...
    resources:
      requests:
        memory: "2G"

3 Comments

Argh. The Deployment ... is invalid: spec.template.spec.restartPolicy: Unsupported value: "Never": supported values: "Always"
In fact, a Deployment doesn't support the Never restart policy, but a bare Pod does, so you can run your application as a simple Pod (kubectl get pod ... -o yaml > pod.yml, then update the value of restartPolicy, remove the owner info, and apply it under a new name). Or just update your question; maybe someone else has an answer for the Deployment
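The edited pod.yml from the comment above would look roughly like this (a sketch; the name and image are placeholders, and the ownerReferences, status, uid, and resourceVersion fields have been removed):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app-debug        # a new name (placeholder)
spec:
  restartPolicy: Never      # allowed on a bare Pod, unlike in a Deployment
  containers:
    - name: java-container  # placeholder
      image: my-app:latest  # placeholder
```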

I found the below two ways to investigate an out-of-memory error in Kubernetes. Ideally you would have a logging solution that keeps the logs; otherwise you can use --previous to read the logs of the previous run, which I generally use for debugging as long as it is the same pod that is in a crash loop.

Write a thread dump to the stdout of the pod

You can take advantage of the preStop lifecycle hook to take a thread dump and write it to stdout, so you will be able to see it with kubectl logs -f pod_name -c container_name --previous:

          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "jcmd 1 Thread.print > /proc/1/fd/1"]          

pod-lifecycle

This will also help if you ship the logs to Datadog or Elasticsearch.

Writing to volume

You will need to update the Java command or env, and the deployment chart.

      serviceAccountName: {{ include "helm-chart.fullname" . }}
      volumes:
        - name: heap-dumps
          emptyDir: {}
      containers:
        - name: java-container
          volumeMounts:
          - name: heap-dumps
            mountPath: /dumps

Add this env:

ENV JAVA_OPTS="-XX:+CrashOnOutOfMemoryError  -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/dumps/oom.bin -Djava.io.tmpdir=/tmp"

you will be able to see what's going on in the JVM.
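In a Kubernetes Deployment, the same options can be set through the env field rather than a Dockerfile ENV line (a sketch; it assumes your entrypoint actually reads a JAVA_OPTS variable):

```yaml
          env:
            - name: JAVA_OPTS
              value: "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/dumps/oom.bin"
```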

Some more JVM configuration that can help you utilize the advanced options of a JVM running inside a container:

-XX:InitialRAMPercentage=50.0 -XX:MinRAMPercentage=50.0 -XX:MaxRAMPercentage=85.0

The JVM has been modified to be aware that it is running in a Docker container and will extract container-specific configuration information instead of querying the operating system.
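As a back-of-the-envelope check, the effective max heap under -XX:MaxRAMPercentage can be estimated from the container memory limit (a sketch; it assumes the JVM sizes the heap as limit × percentage / 100, and real JVM rounding may differ slightly):

```shell
# Estimate the max heap for a 2 GiB container limit with -XX:MaxRAMPercentage=85.0.
limit_bytes=$((2 * 1024 * 1024 * 1024))   # 2 GiB container memory limit
max_ram_percentage=85
heap_bytes=$((limit_bytes * max_ram_percentage / 100))
echo "Estimated max heap: $((heap_bytes / 1024 / 1024)) MiB"   # ~1740 MiB
```

This kind of estimate helps explain surprises like the JVM being OOM-killed even though the heap looks fine: the remaining 15% must cover metaspace, thread stacks, and native allocations.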

jvm-in-a-container

best-practices-java-memory-arguments-for-containers

4 Comments

Hi @Adiii, thanks for sharing. It is not clear to me how I can mount a volume in a kube deployment. Could you please elaborate on this?
I already mentioned it in the section Writing to volume
See kubernetes.io/docs/concepts/storage/volumes for further details, but as I mentioned, for a quick workaround I would go for the thread-dump option
A heap dump is usually very heavy and hard to analyze; although there are tools for it, a thread dump to stdout is pretty straightforward and you can easily check which function triggered the OOM
