I have deployed a MySQL database (statefulset) on Kubernetes zonal cluster, running as a service (GKE) in Google Cloud Platform.
The zonal cluster consist of 3 instances of type e2-medium.
The MySQL container cannot start due to the following error.
kubectl logs mysql-statefulset-0
2022-02-07 05:55:38+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.35-1debian10 started.
find: '/var/lib/mysql/': Input/output error
Last seen events.
4m57s Warning Ext4Error gke-cluster-default-pool-rnfh kernel-monitor, gke-cluster-default-pool-rnfh EXT4-fs error (device sdb): __ext4_find_entry:1532: inode #2: comm mysqld: reading directory lblock 0 40d 8062 gke-cluster-default-pool-rnfh
3m22s Warning BackOff pod/mysql-statefulset-0 spec.containers{mysql} kubelet, gke-cluster-default-pool-rnfh Back-off restarting failed container
Nodes.
kubectl get node -owide
gke-cluster-default-pool-ayqo Ready <none> 54d v1.21.5-gke.1302 So.Me.I.P So.Me.I.P Container-Optimized OS from Google 5.4.144+ containerd://1.4.8
gke-cluster-default-pool-rnfh Ready <none> 54d v1.21.5-gke.1302 So.Me.I.P So.Me.I.P Container-Optimized OS from Google 5.4.144+ containerd://1.4.8
gke-cluster-default-pool-sc3p Ready <none> 54d v1.21.5-gke.1302 So.Me.I.P So.Me.I.P Container-Optimized OS from Google 5.4.144+ containerd://1.4.8
I also noticed that rnfh node is out of memory.
kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
gke-cluster-default-pool-ayqo 117m 12% 992Mi 35%
gke-cluster-default-pool-rnfh 180m 19% 2953Mi 104%
gke-cluster-default-pool-sc3p 179m 19% 1488Mi 52%
MySql mainfest
# HEADLESS SERVICE
apiVersion: v1
kind: Service
metadata:
name: mysql-headless-service
labels:
kind: mysql-headless-service
spec:
clusterIP: None
selector:
tier: mysql-db
ports:
- name: 'mysql-http'
protocol: 'TCP'
port: 3306
---
# STATEFUL SET
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: mysql-statefulset
spec:
selector:
matchLabels:
tier: mysql-db
serviceName: mysql-statefulset
replicas: 1
template:
metadata:
labels:
tier: mysql-db
spec:
terminationGracePeriodSeconds: 10
containers:
- name: my-mysql
image: my-mysql:latest
imagePullPolicy: Always
args:
- "--ignore-db-dir=lost+found"
ports:
- name: 'http'
protocol: 'TCP'
containerPort: 3306
volumeMounts:
- name: mysql-pvc
mountPath: /var/lib/mysql
env:
- name: MYSQL_ROOT_USER
valueFrom:
secretKeyRef:
name: mysql-secret
key: mysql-root-username
- name: MYSQL_ROOT_PASSWORD
valueFrom:
secretKeyRef:
name: mysql-secret
key: mysql-root-password
- name: MYSQL_USER
valueFrom:
configMapKeyRef:
name: mysql-config
key: mysql-username
- name: MYSQL_PASSWORD
valueFrom:
configMapKeyRef:
name: mysql-config
key: mysql-password
- name: MYSQL_DATABASE
valueFrom:
configMapKeyRef:
name: mysql-config
key: mysql-database
volumeClaimTemplates:
- metadata:
name: mysql-pvc
spec:
storageClassName: 'mysql-fast'
resources:
requests:
storage: 120Gi
accessModes:
- ReadWriteOnce
- ReadOnlyMany
MySQL storage class manifest:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: mysql-fast
provisioner: kubernetes.io/gce-pd
parameters:
type: pd-ssd
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: Immediate
Why Kubernetes is trying to schedule pod in out of memory node?
UPDATES
I've added requests and limits to MySQL manifest to improve the Qos Class. Now the Qos Class is Guaranteed.
Unfortunately, Kubernetes still trying to schedule to out of memory rnfh node.
kubectl describe po mysql-statefulset-0 | grep node -i
Node: gke-cluster-default-pool-rnfh/So.Me.I.P
kubectl describe po mysql-statefulset-0 | grep qos -i
QoS Class: Guaranteed
I also noticed that rnfh node is out of memory.- this is AFTER your mysql pod was deployed to the node?my-mysql:latest. Is that your own image? When I've usedimage: mysql:5.7everything is working as expected and last entry from log is[Note] mysqld: ready for connections.If you would useimage: mysql:5.7you also are getting this issue? If this is your custom image, there is possibility that something fell into loop and it's "eating" your resource. With one pod with image mysql:5.7 on e2-medium I have only754Mi 26%my-mysql:latestbased onmysql:5.7with additional COPYmysql.cnfto/etc/mysql/conf.d/custom.cnf. In cnf file I defineddefault-character-set = utf8mb4mysql:5.7image? Do you have more statefulsets or deploymends in this claster? Also what output you will get when you execute$ kubectl top pods -A | sort? Maybe there is something else which consumes your memory. I don't see any node affinity or anything similar. Maybe your sql was deployed and later another application started to consume memory. It's whole new cluster?rnfamachine. The container was probably evicted from another node.