A Step-By-Step Tutorial for Deploying PostgreSQL on Kubernetes

By Albert - On Dec 13, 2023

Deploying PostgreSQL on Kubernetes can make your database easier to keep available and easier to scale.

PostgreSQL is a stateful application, meaning it needs to store data persistently. A container's ephemeral storage is lost when the container is replaced, so the database must keep its data on a volume requested through a PersistentVolumeClaim (PVC).
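A claim for such a volume can be sketched as follows. This is a minimal illustration rather than a production manifest: the claim name postgres-data and the 10Gi size are assumptions, and the manifest is built as a Python dictionary and printed as JSON, which kubectl accepts in place of YAML.

```python
import json


def postgres_pvc(name: str, size: str = "10Gi") -> dict:
    """Build a PersistentVolumeClaim manifest for a PostgreSQL data directory."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteOnce: the volume is mounted read-write by a single node.
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # "postgres-data" is a hypothetical claim name.
    print(json.dumps(postgres_pvc("postgres-data"), indent=2))
```

Saved as pvc.py (a hypothetical file name), the output can be piped straight into the cluster with: python pvc.py | kubectl apply -f -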

Deployment Manifest

PostgreSQL is a stateful workload that requires a dedicated persistent volume for its data files and its write-ahead log (WAL). The WAL protects data integrity after a crash and, combined with a base backup and archived WAL segments, allows an operator to restore the database to an earlier point in time after a server failure.

The deployment manifest defines the PostgreSQL instance in YAML, including its configuration and environment variables. It also sets a securityContext for the database Pod and container, restricting them to the Linux privileges the application actually requires.
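To make the shape of such a manifest concrete, here is a hedged sketch built as a Python dictionary (kubectl accepts the JSON form). Everything named here is an assumption for illustration: the postgres:16 image, the Secret postgres-credentials, the claim postgres-data, and UID 999, which is the postgres user in the Debian-based official image. An operator-managed instance would use the operator's own resource type instead.

```python
import json


def postgres_deployment(name: str = "postgres", pvc: str = "postgres-data") -> dict:
    """Build a single-replica Deployment manifest for PostgreSQL."""
    labels = {"app": name}
    container = {
        "name": "postgres",
        "image": "postgres:16",  # assumed image tag
        "ports": [{"containerPort": 5432}],
        "env": [
            # A subdirectory keeps initdb away from the volume's lost+found.
            {"name": "PGDATA", "value": "/var/lib/postgresql/data/pgdata"},
            {"name": "POSTGRES_PASSWORD", "valueFrom": {
                "secretKeyRef": {"name": "postgres-credentials", "key": "password"}}},
        ],
        # Container-level restrictions: no privilege escalation, no capabilities.
        "securityContext": {
            "allowPrivilegeEscalation": False,
            "capabilities": {"drop": ["ALL"]},
        },
        "volumeMounts": [{"name": "data", "mountPath": "/var/lib/postgresql/data"}],
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,
            # Recreate stops the old Pod before starting the new one, so two
            # Pods never open the same data directory at once.
            "strategy": {"type": "Recreate"},
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Pod-level settings: run as the (assumed) postgres UID/GID.
                    "securityContext": {"runAsNonRoot": True, "runAsUser": 999, "fsGroup": 999},
                    "containers": [container],
                    "volumes": [{"name": "data", "persistentVolumeClaim": {"claimName": pvc}}],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(postgres_deployment(), indent=2))
```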

In addition, the operator supports Kubernetes storage classes, so users can request expandable storage volumes for the database instances. Expansion works only when the storage class sets allowVolumeExpansion to true, and a volume can be grown but not shrunk. This is useful for databases whose data outgrows the size originally requested.
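As a rough sketch of what expandable storage involves, the snippet below builds a StorageClass with expansion enabled and the patch that requests a larger size on an existing claim. The class name and the provisioner string are placeholders; the real provisioner depends on your cluster's storage driver.

```python
import json


def expandable_storage_class(name: str, provisioner: str) -> dict:
    """Build a StorageClass whose volumes may be expanded after creation."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        "allowVolumeExpansion": True,
    }


def grow_pvc_patch(new_size: str) -> dict:
    """Build a merge patch that raises the storage request of a PVC."""
    return {"spec": {"resources": {"requests": {"storage": new_size}}}}


if __name__ == "__main__":
    # "example.com/csi-driver" is a placeholder, not a real provisioner.
    print(json.dumps(expandable_storage_class("expandable", "example.com/csi-driver"), indent=2))
    print(json.dumps(grow_pvc_patch("20Gi")))
```

The patch would be applied with something like: kubectl patch pvc postgres-data --type merge -p '{"spec":{"resources":{"requests":{"storage":"20Gi"}}}}', where postgres-data is again a hypothetical claim name.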

Finally, the operator lets you specify a persistent volume claim policy for the database instance, which determines whether the PVC is deleted when the instance is terminated. The exact field name depends on the operator you use. To avoid losing your data, set this parameter to retain in your instance YAML file.

StatefulSet

Kubernetes is well known for its ability to manage stateless applications. Stateful applications, however, require more careful deployment and scaling strategies. In these scenarios, you want to ensure that each Pod has a unique, stable network identity and access to its own persistent storage. You also want to guarantee that Pods are created and removed in a specific order. StatefulSets enable you to accomplish this.

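A StatefulSet is a workload controller, separate from a Deployment, that manages a set of Pods with stable names and ordinals, such as the replicas of a database application. It supports a RollingUpdate strategy that replaces Pods one at a time, from the highest ordinal to the lowest, which keeps disruption during an update to a minimum. The partition field limits an update to Pods whose ordinal is greater than or equal to the given value, so a change can be staged on a few replicas first. Unlike a Deployment, a StatefulSet has no maxSurge field, and maxUnavailable is available for StatefulSets only behind a feature gate in some Kubernetes versions.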
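The effect of a partitioned rolling update on a StatefulSet can be illustrated with a small model. The function names are hypothetical; the behaviour they model (Pods updated in reverse ordinal order, and only those at or above the partition) is the documented StatefulSet behaviour.

```python
def update_strategy(partition: int = 0) -> dict:
    """Build the updateStrategy stanza of a StatefulSet spec."""
    return {"type": "RollingUpdate", "rollingUpdate": {"partition": partition}}


def ordinals_updated(replicas: int, partition: int = 0) -> list[int]:
    """Pod ordinals a rolling update touches, in the order they are replaced."""
    # Highest ordinal first; ordinals below the partition keep the old revision.
    return [o for o in range(replicas - 1, -1, -1) if o >= partition]


if __name__ == "__main__":
    print(ordinals_updated(3))               # all three Pods, highest ordinal first
    print(ordinals_updated(3, partition=2))  # only the Pod with ordinal 2
```

Setting the partition to the highest ordinal and lowering it step by step gives a simple staged rollout.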

The StatefulSet can also define volumeClaimTemplates to provision a PersistentVolumeClaim dynamically for each of its Pods. For example, a template named www in a StatefulSet with three replicas produces three claims, one per Pod, and each Pod reaches its own volume through the configured mount path. This is especially useful for database applications that use replication, because every replica keeps its own copy of the data.
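The sketch below shows where a claim template sits in a StatefulSet and how the resulting claim names are derived (template name, then Pod name). It illustrates structure only: the image tag, mount path, and size are assumptions, the password variable the image needs is omitted for brevity, and no replication is configured between the Pods. The persistentVolumeClaimRetentionPolicy field is the native StatefulSet setting for keeping or deleting claims and is available in recent Kubernetes versions.

```python
import json


def postgres_statefulset(name: str, replicas: int = 3, volume: str = "www") -> dict:
    """Build a StatefulSet manifest with one claim template per Pod."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,  # headless Service that provides stable DNS names
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            # Keep the claims when the StatefulSet is deleted or scaled down.
            "persistentVolumeClaimRetentionPolicy": {
                "whenDeleted": "Retain", "whenScaled": "Retain"},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{
                    "name": "postgres",
                    "image": "postgres:16",  # assumed image tag
                    "volumeMounts": [
                        {"name": volume, "mountPath": "/var/lib/postgresql/data"}],
                }]},
            },
            "volumeClaimTemplates": [{
                "metadata": {"name": volume},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": "10Gi"}},
                },
            }],
        },
    }


def claim_names(sts: dict) -> list[str]:
    """Names of the PVCs Kubernetes creates: <template>-<statefulset>-<ordinal>."""
    sts_name = sts["metadata"]["name"]
    return [
        f"{tpl['metadata']['name']}-{sts_name}-{ordinal}"
        for tpl in sts["spec"]["volumeClaimTemplates"]
        for ordinal in range(sts["spec"]["replicas"])
    ]


if __name__ == "__main__":
    sts = postgres_statefulset("postgres")
    print(json.dumps(sts, indent=2))
    print(claim_names(sts))
```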

Pods

As a stateful service, PostgreSQL requires its data to be persistent. It records every change in a write-ahead log (WAL) and replays the logged changes during recovery. The WAL and the data files are stored on a volume bound to a PersistentVolumeClaim, whose lifecycle is independent of the Pod that uses it. This means that even if the Pod is deleted or OOM-killed for exceeding its memory limit, the data remains on the volume and is reattached to the replacement Pod.

A Pod can read its configuration from a ConfigMap, which stores parameters that the Pod fetches when it starts, either as environment variables or as mounted files. The values in a ConfigMap can be key-value strings, whole files, or both. Values in a ConfigMap are not encrypted, so use it only for data that is not sensitive, and keep passwords and similar values in a Secret.
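A minimal sketch of this split between non-sensitive and sensitive settings might look like the following. The variable names POSTGRES_DB, POSTGRES_USER, and POSTGRES_PASSWORD are those read by the official postgres image; the object names and the Secret key are hypothetical.

```python
import json


def postgres_configmap(name: str, database: str, user: str) -> dict:
    """Build a ConfigMap holding non-sensitive PostgreSQL settings."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": {"POSTGRES_DB": database, "POSTGRES_USER": user},
    }


def container_env(configmap: str, secret: str) -> dict:
    """Container fields that load the ConfigMap and a password from a Secret."""
    return {
        # Every key in the ConfigMap becomes an environment variable.
        "envFrom": [{"configMapRef": {"name": configmap}}],
        # The password comes from a Secret, not from the ConfigMap.
        "env": [{
            "name": "POSTGRES_PASSWORD",
            "valueFrom": {"secretKeyRef": {"name": secret, "key": "password"}},
        }],
    }


if __name__ == "__main__":
    print(json.dumps(postgres_configmap("postgres-config", "appdb", "appuser"), indent=2))
    print(json.dumps(container_env("postgres-config", "postgres-credentials"), indent=2))
```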

To confirm that your deployment is up and running, use kubectl to check the status of all Pods in the postgres namespace. Ensure that the Pod named postgresql-dev-0 has a status of Running and that its PersistentVolumeClaim is Bound. You can also check the Pod's logs to verify that the database is ready for connections: PostgreSQL logs "database system is ready to accept connections" once startup has completed.
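The same check can be scripted. The sketch below builds the kubectl command and parses its JSON output into a name-to-phase mapping; the helper names are hypothetical, and the namespace and Pod name are written in lowercase because Kubernetes object names must be lowercase. Note that the phase field is slightly coarser than the STATUS column kubectl prints.

```python
import json


def kubectl_get_pods(namespace: str) -> list[str]:
    """Command line that lists the Pods of a namespace as JSON."""
    return ["kubectl", "get", "pods", "--namespace", namespace, "--output", "json"]


def pod_phases(kubectl_json: str) -> dict[str, str]:
    """Map each Pod name to its phase (Pending, Running, Succeeded, Failed, Unknown)."""
    items = json.loads(kubectl_json)["items"]
    return {pod["metadata"]["name"]: pod["status"]["phase"] for pod in items}


def is_running(kubectl_json: str, pod_name: str) -> bool:
    """True when the named Pod exists and reports the Running phase."""
    return pod_phases(kubectl_json).get(pod_name) == "Running"


# Usage against a live cluster (requires kubectl on the PATH):
#   import subprocess
#   out = subprocess.run(kubectl_get_pods("postgres"), check=True,
#                        capture_output=True, text=True).stdout
#   print(is_running(out, "postgresql-dev-0"))
```

The logs can be inspected separately with: kubectl logs postgresql-dev-0 --namespace postgres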

Service

A Service is a Kubernetes object that gives your database Pods a stable name and address, so that applications inside the cluster (and, depending on the Service type, outside it) can connect to the database. The deployment manifest configures the Service and so provides the path to your database. The manifest also specifies the number of replicas to deploy and the container image used to run the PostgreSQL cluster, along with details such as CPU resources, configured secrets, and the persistent volume claim that should be attached to the cluster.
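A Service for the database can be sketched as below. The names and the app label are assumptions and must match the labels on your database Pods; ClusterIP, the default type, makes the database reachable only from inside the cluster.

```python
import json


def postgres_service(name: str = "postgres", app: str = "postgres",
                     service_type: str = "ClusterIP") -> dict:
    """Build a Service that routes port 5432 to Pods labelled app=<app>."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": service_type,
            # Traffic goes to every ready Pod carrying this label.
            "selector": {"app": app},
            "ports": [{"name": "postgres", "port": 5432, "targetPort": 5432}],
        },
    }


if __name__ == "__main__":
    print(json.dumps(postgres_service(), indent=2))
```

With this Service in place, a client in the same namespace would connect to the host name postgres on port 5432.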

When deploying the service, you can also use a blue-green deployment strategy. In this approach you run two environments side by side and keep one of them active while the other is updated, tested, or restored without affecting production traffic. Once the inactive environment has been verified, traffic is switched over to it.
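Kubernetes has no dedicated blue-green setting on a Service; one common way to implement the switch is to change the Service's label selector. The sketch below assumes, purely for illustration, that the two environments carry a track label with the value blue or green, and that the target environment already holds up-to-date data.

```python
import json

TRACKS = ("blue", "green")  # hypothetical label values for the two environments


def switch_patch(target: str, app: str = "postgres") -> dict:
    """Build a merge patch that points a Service at one environment."""
    if target not in TRACKS:
        raise ValueError(f"target must be one of {TRACKS}, got {target!r}")
    return {"spec": {"selector": {"app": app, "track": target}}}


def other_track(current: str) -> str:
    """Return the environment that is not currently receiving traffic."""
    if current not in TRACKS:
        raise ValueError(f"unknown track {current!r}")
    return "green" if current == "blue" else "blue"


if __name__ == "__main__":
    # Apply with: kubectl patch service postgres --type merge -p '<printed JSON>'
    print(json.dumps(switch_patch(other_track("blue"))))
```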

You can also set a security policy for the database Pods through the seccomp profile field in the deployment manifest. This restricts the system calls the containers may make to those permitted by the selected profile. The operator additionally lets you specify a readOnlyUser or readWriteUser to control the privileges granted to the Service Binding application user. As noted earlier, the instance's persistent volume claim retention policy determines whether the PVC is deleted when the instance is deleted.
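In plain Kubernetes, the seccomp profile lives in the Pod's securityContext. The helper below is a small sketch with a hypothetical name that adds the RuntimeDefault profile, the container runtime's default syscall filter, to an existing Pod spec; an operator may expose the same setting under a different field.

```python
import json


def with_seccomp(pod_spec: dict, profile_type: str = "RuntimeDefault") -> dict:
    """Return a copy of a Pod spec with a Pod-level seccomp profile set."""
    updated = dict(pod_spec)
    # Copy the nested dict too, so the caller's spec is left untouched.
    security = dict(updated.get("securityContext", {}))
    security["seccompProfile"] = {"type": profile_type}
    updated["securityContext"] = security
    return updated


if __name__ == "__main__":
    spec = {"containers": [{"name": "postgres", "image": "postgres:16"}]}
    print(json.dumps(with_seccomp(spec), indent=2))
```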

Database

PostgreSQL requires persistent storage to keep its data available. For durability it relies on WAL (write-ahead logging), which records each change in a log before the change is written to the data files. After a disaster, these changes can be recovered by restoring a base backup and replaying the WAL archive on top of it.

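Kubernetes provides a simple way to deploy your PostgreSQL database as a StatefulSet fronted by a headless Service, which gives each Pod a stable DNS name. Alternatively, the CloudNativePG operator can deploy and scale a PostgreSQL cluster with automated upgrades, backups, and self-healing features; it manages the Pods and PersistentVolumeClaims itself rather than through a StatefulSet.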
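Both options can be sketched as manifests. The first function builds a headless Service (clusterIP set to None) for a StatefulSet; the second builds a CloudNativePG Cluster resource using the instances and storage.size fields from the operator's API. Names and sizes are assumptions, and the Cluster resource only works once the operator is installed.

```python
import json


def headless_service(name: str, app: str) -> dict:
    """Build a headless Service that gives StatefulSet Pods stable DNS names."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "clusterIP": "None",  # headless: DNS resolves to the Pods directly
            "selector": {"app": app},
            "ports": [{"name": "postgres", "port": 5432}],
        },
    }


def cnpg_cluster(name: str, instances: int = 3, size: str = "1Gi") -> dict:
    """Build a CloudNativePG Cluster: one primary plus (instances - 1) replicas."""
    return {
        "apiVersion": "postgresql.cnpg.io/v1",
        "kind": "Cluster",
        "metadata": {"name": name},
        "spec": {"instances": instances, "storage": {"size": size}},
    }


if __name__ == "__main__":
    print(json.dumps(headless_service("postgres", "postgres"), indent=2))
    print(json.dumps(cnpg_cluster("cluster-example"), indent=2))
```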

You must have a Kubernetes cluster to deploy PostgreSQL with the CloudNativePG operator. You can use any public cloud provider that supports Kubernetes or create a local Kubernetes environment. You also need Helm and the kubectl command-line tool installed on your machine. Helm is a package manager for Kubernetes that simplifies deploying applications, makes updates easy, and allows charts to be shared. It also has a rollback feature that restores a previous release of your application.
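The Helm steps can be sketched as the command lines below, assembled in Python so they are easy to script. The chart repository URL, chart name, release name, and namespace follow the CloudNativePG project's published chart instructions as best I know them; treat them as assumptions and verify them against the project's documentation before running anything.

```python
import shlex


def cnpg_install_commands(release: str = "cnpg",
                          namespace: str = "cnpg-system") -> list[list[str]]:
    """Helm commands that add the chart repository and install the operator."""
    return [
        # Assumed repository alias and URL; check the project's documentation.
        ["helm", "repo", "add", "cnpg", "https://cloudnative-pg.github.io/charts"],
        ["helm", "upgrade", "--install", release,
         "--namespace", namespace, "--create-namespace", "cnpg/cloudnative-pg"],
    ]


def rollback_command(release: str, revision: int, namespace: str) -> list[str]:
    """Helm command that returns a release to an earlier revision."""
    return ["helm", "rollback", release, str(revision), "--namespace", namespace]


if __name__ == "__main__":
    for command in cnpg_install_commands():
        print(shlex.join(command))
    print(shlex.join(rollback_command("cnpg", 1, "cnpg-system")))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; passing each list to subprocess.run would execute it on a machine where Helm is installed.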