Kubernetes: PersistentVolume and PersistentVolumeClaim – an overview with examples

By | 08/05/2020
 

For persistent data, Kubernetes provides two main types of objects: the PersistentVolume and the PersistentVolumeClaim.

A PersistentVolume (PV) is a storage device with a filesystem volume on it, for example, an AWS EBS attached to an AWS EC2 instance. From the cluster’s point of view, a PersistentVolume is a resource in the same way as, say, a Kubernetes Worker Node.

A PersistentVolumeClaim (PVC), in its turn, is a request to use such a PersistentVolume and is similar to a Kubernetes Pod: as a Pod requests CPU and memory from a WorkerNode, a PersistentVolumeClaim requests a storage size and an access mode (ReadWriteOnce, ReadOnlyMany, or ReadWriteMany, see the AccessModes).
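To make the analogy concrete, here is a minimal sketch that puts the two kinds of requests side by side. The names example-pod and example-pvc and the requested amounts are made up for this illustration and are not used elsewhere in this post:

```yaml
# A Pod asks a WorkerNode for CPU and memory...
apiVersion: v1
kind: Pod
metadata:
  name: example-pod          # hypothetical name
spec:
  containers:
    - name: app
      image: nginx
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
---
# ...while a PersistentVolumeClaim asks a PersistentVolume
# for a storage size and an access mode.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc          # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```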

A PersistentVolume can be created in two ways: statically or dynamically (the recommended one).

When creating a PV statically, you have to create a storage device first, for example, an AWS EBS, which will then be used by the PersistentVolume.

If a cluster cannot find an appropriate PV for a PersistentVolumeClaim, it can create a new storage device specifically for this PVC: this is the dynamic way of creating a PV.

For this to work, the PVC has to have a StorageClass set, and this class has to be supported by the cluster.

For example, for the AWS EKS, we have the gp2 StorageClass:

kubectl get storageclass
NAME            PROVISIONER             AGE
gp2 (default)   kubernetes.io/aws-ebs   64d

Storage types

For a better understanding of the PersistentVolume concept, let’s look at the available storage types:

  • Node-local storage (emptyDir and hostPath)
  • Cloud volumes (for example, awsElasticBlockStore, gcePersistentDisk, and azureDiskVolume)
  • File-sharing volumes, such as Network File System
  • Distributed-file systems (for example, CephFS, RBD, and GlusterFS)
  • Special types, such as PersistentVolumeClaim, secret, and gitRepo

emptyDir and hostPath are tied to a pod and its node: an emptyDir is removed together with its pod, and hostPath data stays on one particular node. Cloud volumes, NFS, and PersistentVolumes are independent of pods and keep their data until the volume itself is deleted.
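To illustrate the node-local types from the list above, here is a minimal sketch of a pod that uses both an emptyDir and a hostPath volume. The pod name, the volume names, and the mount paths are made up for this example:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: node-local-volumes-pod   # hypothetical name
spec:
  volumes:
    # Created empty when the pod starts, removed together with the pod
    - name: scratch
      emptyDir: {}
    # A directory of the node where the pod happens to be running
    - name: host-logs
      hostPath:
        path: /var/log
        type: Directory
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - mountPath: /cache
          name: scratch
        - mountPath: /host-logs
          name: host-logs
          readOnly: true
```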

Create a PersistentVolumeClaim

Static PersistentVolume provisioning

Create an EBS

For static provisioning, we first need to create a storage device, in this case an AWS EBS, and then create a PersistentVolume that will use this EBS.

Create an EBS:

aws ec2 --profile arseniy --region us-east-2 create-volume --availability-zone us-east-2a --size 50
{
    "AvailabilityZone": "us-east-2a",
    "CreateTime": "2020-07-29T13:10:12.000Z",
    "Encrypted": false,
    "Size": 50,
    "SnapshotId": "",
    "State": "creating",
    "VolumeId": "vol-0928650905a2491e2",
    "Iops": 150,
    "Tags": [],
    "VolumeType": "gp2"
}

Save the volume ID: “vol-0928650905a2491e2”.

Create a PersistentVolume

Write a manifest file, let’s call it pv-static.yaml:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-static
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2
  awsElasticBlockStore:
    fsType: ext4
    volumeID: vol-0928650905a2491e2

Here:

  • capacity: storage size
  • accessModes: access mode, here it is ReadWriteOnce, which means that this PV can be attached to only one WorkerNode at a time
  • storageClassName: the storage class, see below
  • awsElasticBlockStore: the device type used
    • fsType: a filesystem type to be created on this volume
    • volumeID: an AWS EBS disk ID

Create the PersistentVolume:

kubectl apply -f pv-static.yaml
persistentvolume/pv-static created

Check it:

kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                                                                  STORAGECLASS   REASON   AGE
pv-static                                  5Gi        RWO            Retain           Available                                                                                                  69s

StorageClass

The storageClassName parameter sets the storage class.

Both the PVC and the PV must have the same class; otherwise, the PVC will not find the PV, and the STATUS of such a PVC will be Pending.

If a PVC has no StorageClass set, then the default one will be used:

kubectl get storageclass -o wide
NAME            PROVISIONER             AGE
gp2 (default)   kubernetes.io/aws-ebs   65d

Meanwhile, if the StorageClass is not set for a PV, this PV will be created without a class, and our PVC with the default class will not be able to use it, failing with the “Cannot bind to requested volume “pvname”: storageClassName does not match” error:

...
Events:
Type       Reason          Age                  From                         Message
----       ------          ----                 ----                         -------
Warning    VolumeMismatch  12s (x17 over 4m2s)  persistentvolume-controller  Cannot bind to requested volume "pvname": storageClassName does not match
...
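If the PV really is meant to stay without a class, one way around this mismatch is to request “no class” explicitly on the PVC side: an empty storageClassName turns off the default class for this claim, so it can only bind to PVs that have no class. A minimal sketch, where the claim name pvc-no-class is made up and pvname is the PV name from the error above:

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-no-class        # hypothetical name
spec:
  # Empty string: do not apply the default StorageClass,
  # bind only to a PV that has no class set
  storageClassName: ""
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  volumeName: pvname
```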

Create a PersistentVolumeClaim

Now we can create a PersistentVolumeClaim that will use the PersistentVolume we’ve created above. Save it to the pvc-static.yaml file:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-static
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  volumeName: pv-static

Create this PVC:

kubectl apply -f pvc-static.yaml
persistentvolumeclaim/pvc-static created

Check it:

kubectl get pvc pvc-static
NAME         STATUS   VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-static   Bound    pv-static   5Gi        RWO            gp2            31s

Dynamic PersistentVolume provisioning

The dynamic way of creating a PersistentVolume is similar to the static one, with the only difference that you don’t need to create the AWS EBS and the PersistentVolume manually. Instead, you just create a PersistentVolumeClaim, and Kubernetes creates an EBS via the AWS API and attaches it to the AWS EC2 instance that plays the WorkerNode role in the Kubernetes cluster. Save the following manifest as pvc-dynamic.yaml:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-dynamic
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

Create this PVC:

kubectl apply -f pvc-dynamic.yaml
persistentvolumeclaim/pvc-dynamic created

Check it:

kubectl get pvc pvc-dynamic
NAME          STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-dynamic   Pending                                      gp2            45s

Okay, but why is it in the Pending STATUS? Check its Events:

kubectl describe pvc pvc-dynamic
...
Events:
Type       Reason                Age               From                         Message
----       ------                ----              ----                         -------
Normal     WaitForFirstConsumer  1s (x4 over 33s)  persistentvolume-controller  waiting for first consumer to be created before binding
Mounted By:  <none>

WaitForFirstConsumer

Let’s look at our default StorageClass’s settings:

kubectl describe sc gp2
Name:            gp2
IsDefaultClass:  Yes
...
Provisioner:           kubernetes.io/aws-ebs
Parameters:            fsType=ext4,type=gp2
...
VolumeBindingMode:     WaitForFirstConsumer
Events:                <none>

Here, the VolumeBindingMode defines when exactly a PersistentVolume is created. With the Immediate value, a PV is created as soon as a PVC requesting it appears. With WaitForFirstConsumer, as in this case, Kubernetes waits for the first consumer, such as a pod, that requests this PVC, and then, depending on the AvailabilityZone of the WorkerNode where this pod is scheduled, creates a new PV and an AWS EBS disk.
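For comparison, below is a sketch of a StorageClass that uses the Immediate mode. It copies the provisioner and parameters from the gp2 class shown above; the gp2-immediate name is made up for this example, and such a class does not exist on the cluster by default:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-immediate        # hypothetical name
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  fsType: ext4
# Provision the volume as soon as a PVC is created,
# without waiting for a pod to consume it
volumeBindingMode: Immediate
```

Keep in mind that with Immediate the disk is provisioned without any knowledge of where the pod will be scheduled, so on a cluster that spans several AvailabilityZones the EBS may end up in a zone where the pod cannot run.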

Now, let’s create pods to consume those volumes.

Using PersistentVolumeClaim in Pods

Dynamic PersistentVolumeClaim

Let’s describe a pod that will use our dynamic PVC, in the pv-pods.yaml file:

apiVersion: v1
kind: Pod
metadata:
  name: pv-dynamic-pod
spec:
  volumes:
    - name: pv-dynamic-storage
      persistentVolumeClaim:
        claimName: pvc-dynamic
  containers:
    - name: pv-dynamic-container
      image: nginx
      ports:
        - containerPort: 80
          name: "nginx"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: pv-dynamic-storage

Here:

  • volumes:
    • persistentVolumeClaim:
      • claimName: the name of the PVC that will be requested when the pod is created
  • containers:
    • volumeMounts: mount the pv-dynamic-storage volume to the /usr/share/nginx/html directory in the pod

Create it:

kubectl apply -f pv-pods.yaml
pod/pv-dynamic-pod created

Check our PVC again:

kubectl get pvc pvc-dynamic
NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-dynamic   Bound    pvc-6d024b40-a239-4c35-8694-f060bd117053   5Gi        RWO            gp2            21h

Now we can see a new Volume with the ID pvc-6d024b40-a239-4c35-8694-f060bd117053 – check it:

kubectl describe pvc pvc-dynamic
Name:          pvc-dynamic
Namespace:     default
StorageClass:  gp2
Status:        Bound
Volume:        pvc-6d024b40-a239-4c35-8694-f060bd117053
...
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      5Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Events:        <none>
Mounted By:    pv-dynamic-pod

Check that volume:

kubectl describe pv pvc-6d024b40-a239-4c35-8694-f060bd117053
Name:              pvc-6d024b40-a239-4c35-8694-f060bd117053
...
StorageClass:      gp2
Status:            Bound
Claim:             default/pvc-dynamic
Reclaim Policy:    Delete
Access Modes:      RWO
VolumeMode:        Filesystem
Capacity:          5Gi
Node Affinity:
  Required Terms:
    Term 0:        failure-domain.beta.kubernetes.io/zone in [us-east-2b]
                   failure-domain.beta.kubernetes.io/region in [us-east-2]
Message:
Source:
    Type:       AWSElasticBlockStore (a Persistent Disk resource in AWS)
    VolumeID:   aws://us-east-2b/vol-040a5e004876f1a40
    FSType:     ext4
    Partition:  0
    ReadOnly:   false
Events:         <none>

And AWS EBS vol-040a5e004876f1a40:

aws ec2 --profile arseniy --region us-east-2 describe-volumes --volume-ids vol-040a5e004876f1a40 --output json
{
    "Volumes": [
        {
            "Attachments": [
                {
                    "AttachTime": "2020-07-30T11:08:29.000Z",
                    "Device": "/dev/xvdcy",
                    "InstanceId": "i-0a3225e9fe7cb7629",
                    "State": "attached",
                    "VolumeId": "vol-040a5e004876f1a40",
                    "DeleteOnTermination": false
                }
            ],
...

Check inside of the pod:

kubectl exec -ti pv-dynamic-pod -- bash
root@pv-dynamic-pod:/# lsblk
NAME          MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme0n1       259:0    0  50G  0 disk
|-nvme0n1p1   259:1    0  50G  0 part /etc/hosts
`-nvme0n1p128 259:2    0   1M  0 part
nvme1n1       259:3    0   5G  0 disk /usr/share/nginx/html

nvme1n1 is our volume, mounted at /usr/share/nginx/html.

Let’s write some data:

root@pv-dynamic-pod:/# echo Test > /usr/share/nginx/html/index.html

Drop the pod:

kubectl delete pod pv-dynamic-pod
pod "pv-dynamic-pod" deleted

Re-create it:

kubectl apply -f pv-pods.yaml
pod/pv-dynamic-pod created

Check the data:

kubectl exec -ti pv-dynamic-pod -- cat /usr/share/nginx/html/index.html
Test

Everything is still in its place.

Static PersistentVolumeClaim

Now, let’s try to use our statically created PV.

We can use the same manifest – the pv-static.yaml:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-static
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2
  awsElasticBlockStore:
    fsType: ext4
    volumeID: vol-0928650905a2491e2

And let’s use the pvc-static.yaml manifest for our PVC:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-static
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  volumeName: pv-static

Create the PV:

kubectl apply -f pv-static.yaml
persistentvolume/pv-static created

Check it:

kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                                                                  STORAGECLASS   REASON   AGE
pv-static                                  5Gi        RWO            Retain           Available                                                                          gp2                     58s
...

Create the PVC:

kubectl apply -f pvc-static.yaml
persistentvolumeclaim/pvc-static created

Check it:

kubectl get pvc pvc-static
NAME         STATUS   VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-static   Bound    pv-static   5Gi        RWO            gp2            9s

The Bound STATUS means that the PVC was able to find its PV and was successfully bound to it.

Pod nodeAffinity

Next, we need to find out the AWS AvailabilityZone where the AWS EBS for our static PV was created:

aws ec2 --profile arseniy --region us-east-2 describe-volumes --volume-ids vol-0928650905a2491e2 --query '[Volumes[*].AvailabilityZone]'  --output text
us-east-2a

us-east-2a – okay, then we need to create a pod on a Kubernetes Worker Node in the same AvailabilityZone.

Create a manifest:

apiVersion: v1
kind: Pod
metadata:
  name: pv-static-pod
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: failure-domain.beta.kubernetes.io/zone
            operator: In
            values:
            - us-east-2a
  volumes:
    - name: pv-static-storage
      persistentVolumeClaim:
        claimName: pvc-static
  containers:
    - name: pv-static-container
      image: nginx
      ports:
        - containerPort: 80
          name: "nginx"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: pv-static-storage

As opposed to the dynamic PVC, here we’ve used nodeAffinity to specify that we want a node from the us-east-2a AZ.
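As a side note, for a simple “exactly this zone” rule like ours, the shorter nodeSelector field can express the same constraint. This is an alternative sketch of the same pod, not the manifest used in this post:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pv-static-pod
spec:
  # Schedule only on nodes that carry this exact label value
  nodeSelector:
    failure-domain.beta.kubernetes.io/zone: us-east-2a
  volumes:
    - name: pv-static-storage
      persistentVolumeClaim:
        claimName: pvc-static
  containers:
    - name: pv-static-container
      image: nginx
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: pv-static-storage
```

The rest of this post keeps the nodeAffinity variant.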

Create that pod:

kubectl apply -f pv-pod-stat.yaml
pod/pv-static-pod created

Check events:

0s    Normal   Scheduled   Pod   Successfully assigned default/pv-static-pod to ip-10-3-47-58.us-east-2.compute.internal
0s    Normal   SuccessfulAttachVolume   Pod   AttachVolume.Attach succeeded for volume "pv-static"
0s    Normal   Pulling   Pod   Pulling image "nginx"
0s    Normal   Pulled   Pod   Successfully pulled image "nginx"
0s    Normal   Created   Pod   Created container pv-static-container
0s    Normal   Started   Pod   Started container pv-static-container

Block devices in the pod:

kubectl exec -ti pv-static-pod -- bash
root@pv-static-pod:/# lsblk
NAME          MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme0n1       259:0    0  50G  0 disk
|-nvme0n1p1   259:1    0  50G  0 part /etc/hosts
`-nvme0n1p128 259:2    0   1M  0 part
nvme1n1       259:3    0  50G  0 disk /usr/share/nginx/html

nvme1n1 is mounted, everything works.

PersistentVolume nodeAffinity

Another option is to set nodeAffinity on the PersistentVolume instead of the pod.

In this case, when a pod that uses this PV is created, Kubernetes first checks which Worker Nodes this volume can be attached to and then schedules the pod on such a node.

Delete the nodeAffinity from the pod’s manifest:

apiVersion: v1
kind: Pod
metadata:
  name: pv-static-pod
spec:
  volumes:
    - name: pv-static-storage
      persistentVolumeClaim:
        claimName: pvc-static
  containers:
    - name: pv-static-container
      image: nginx
      ports:
        - containerPort: 80
          name: "nginx"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: pv-static-storage

And add to the PV’s manifest:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-static
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: failure-domain.beta.kubernetes.io/zone
          operator: In
          values:
          - us-east-2a    
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2
  awsElasticBlockStore:
    fsType: ext4
    volumeID: vol-0928650905a2491e2
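One thing to be aware of: the failure-domain.beta.kubernetes.io/zone label is deprecated in newer Kubernetes releases in favor of topology.kubernetes.io/zone. Below is a sketch of the same PV with the newer label, assuming your nodes actually carry it (this was not checked on the cluster used in this post):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-static
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        # Assumption: nodes are labeled with the newer topology label
        - key: topology.kubernetes.io/zone
          operator: In
          values:
          - us-east-2a
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2
  awsElasticBlockStore:
    fsType: ext4
    volumeID: vol-0928650905a2491e2
```

The examples below keep using the failure-domain.beta.kubernetes.io/zone label.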

Create this PV:

kubectl apply -f pv-static.yaml
persistentvolume/pv-static created

Create its PVC – nothing was changed here:

kubectl apply -f pvc-static.yaml
persistentvolumeclaim/pvc-static created

Create the pod:

kubectl apply -f pv-pod-stat.yaml
pod/pv-static-pod created

Check the events:

0s    Normal   Scheduled   Pod   Successfully assigned default/pv-static-pod to ip-10-3-47-58.us-east-2.compute.internal
0s    Normal   SuccessfulAttachVolume   Pod   AttachVolume.Attach succeeded for volume "pv-static"
0s    Normal   Pulling   Pod   Pulling image "nginx"
0s    Normal   Pulled   Pod   Successfully pulled image "nginx"
0s    Normal   Created   Pod   Created container pv-static-container
0s    Normal   Started   Pod   Started container pv-static-container

Delete PersistentVolume and PersistentVolumeClaim

When a user deletes a PVC that is currently used by a running pod, the PVC is not removed immediately: it stays until the pod that uses it is gone.

Similarly, when a PersistentVolume that is bound to a PersistentVolumeClaim is deleted, the PV is not removed as long as the binding exists, i.e. until its PVC is deleted.
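Kubernetes implements this protection with finalizers: the kubernetes.io/pvc-protection finalizer can be seen in the kubectl describe pvc output earlier in this post, and PersistentVolumes get a similar kubernetes.io/pv-protection one. Below is a trimmed sketch of the relevant metadata only; it is not a full manifest and is not meant to be applied:

```yaml
# PersistentVolumeClaim (fragment)
metadata:
  name: pvc-static
  finalizers:
    - kubernetes.io/pvc-protection   # blocks removal while a pod uses the PVC
---
# PersistentVolume (fragment)
metadata:
  name: pv-static
  finalizers:
    - kubernetes.io/pv-protection    # blocks removal while the PV is bound to a PVC
```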

Reclaiming

When we have finished working with a PersistentVolume, we can delete its PVC from the cluster so that the underlying resource, such as an AWS EBS, can be reclaimed.

The reclaim policy of a PersistentVolume tells the cluster what to do with such a released volume and can be Retain, Recycle, or Delete.

Retain

The Retain policy allows us to clean up a disk manually.

After the related PersistentVolumeClaim is deleted, the PersistentVolume is not deleted but is marked as “Released”. It is not yet available for new PersistentVolumeClaims, as it still keeps the data of the previous PersistentVolumeClaim.

To make it available for the next use, you need to delete the PersistentVolume object from the cluster and create it again.

Delete

With the Delete value, deleting a PVC also removes its PersistentVolume and the underlying device, such as an AWS EBS, GCE PD, or Azure Disk.

Keep in mind that dynamically created volumes inherit the reclaim policy from their StorageClass, which is Delete by default.
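If dynamically provisioned volumes should survive the deletion of their PVC, the policy can be set on the StorageClass itself. A minimal sketch that reuses the provisioner and parameters of our gp2 class; the gp2-retain name is made up for this example:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-retain           # hypothetical name
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  fsType: ext4
# PVs created through this class get the Retain policy instead of Delete
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
```

A PVC would then select this class with storageClassName: gp2-retain.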

Recycle

Deprecated. It was used to wipe the data with a plain rm -rf.

Deleting PV and PVC – an example

So, we have a pod running:

kubectl get pod pv-static-pod
NAME            READY   STATUS    RESTARTS   AGE
pv-static-pod   1/1     Running   0          19s

Which is using a PVC:

kubectl get pvc pvc-static
NAME         STATUS   VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-static   Bound    pv-static   50Gi       RWO            gp2            19h

And this PVC is bound to the PV:

kubectl get pv pv-static
NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                STORAGECLASS   REASON   AGE
pv-static   50Gi       RWO            Retain           Bound    default/pvc-static   gp2                     19h

And our PV has its RECLAIM POLICY set to Retain, so after we drop its PVC and the PV itself, all data must be kept.

Let’s check – add some data:

kubectl exec -ti pv-static-pod -- bash
root@pv-static-pod:/# echo Test > /usr/share/nginx/html/test.txt
root@pv-static-pod:/# cat /usr/share/nginx/html/test.txt
Test

Exit from the pod and delete it, and then its PVC:

kubectl delete pod pv-static-pod
pod "pv-static-pod" deleted
kubectl delete pvc pvc-static
persistentvolumeclaim "pvc-static" deleted

Check the PV’s status:

kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                                                                  STORAGECLASS   REASON   AGE
pv-static                                  50Gi       RWO            Retain           Released   default/pvc-static                                                     gp2                     25s

STATUS == Released, and at this moment we are not able to attach this volume again via a new PVC.

Let’s check – create a PVC again:

kubectl apply -f pvc-static.yaml
persistentvolumeclaim/pvc-static created

Create a pod:

kubectl apply -f pv-pod-stat.yaml
pod/pv-static-pod created

And check its PVC status:

kubectl get pvc pvc-static
NAME         STATUS    VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-static   Pending   pv-static   0                         gp2            59s

The STATUS is Pending.

Delete the pod and the PVC, and this time delete the PersistentVolume too:

kubectl delete -f pv-pod-stat.yaml
pod "pv-static-pod" deleted
kubectl delete -f pvc-static.yaml
persistentvolumeclaim "pvc-static" deleted
kubectl delete -f pv-static.yaml
persistentvolume "pv-static" deleted

Create everything again:

kubectl apply -f pv-static.yaml
persistentvolume/pv-static created
kubectl apply -f pvc-static.yaml
persistentvolumeclaim/pvc-static created
kubectl apply -f pv-pod-stat.yaml
pod/pv-static-pod created

Check the PVC:

kubectl get pvc pvc-static
NAME         STATUS   VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-static   Bound    pv-static   50Gi       RWO            gp2            27s

And check the data we’ve added earlier:

kubectl exec -ti pv-static-pod -- cat /usr/share/nginx/html/test.txt
Test

All good – the data is still in its place.

Changing Reclaim Policy for PersistentVolume

Currently, our PV has the Retain value:

kubectl get pv pv-static -o jsonpath='{.spec.persistentVolumeReclaimPolicy}'
Retain

Apply a patch – update its persistentVolumeReclaimPolicy parameter to the Delete value:

kubectl patch pv pv-static -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}'
persistentvolume/pv-static patched

Check it:

kubectl get pv pv-static -o jsonpath='{.spec.persistentVolumeReclaimPolicy}'
Delete
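Instead of patching a live object, the policy can also be declared in the PV manifest, so that it is set from the moment the PV is created. A sketch based on our pv-static.yaml with one extra field:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-static
spec:
  # Retain is what our manually created PV showed above; here we ask for Delete
  persistentVolumeReclaimPolicy: Delete
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2
  awsElasticBlockStore:
    fsType: ext4
    volumeID: vol-0928650905a2491e2
```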

Delete the pod and its PVC:

kubectl delete -f pv-pod-stat.yaml
pod "pv-static-pod" deleted
kubectl delete -f pvc-static.yaml
persistentvolumeclaim "pvc-static" deleted

Check the PersistentVolume:

kubectl get pv pv-static
Error from server (NotFound): persistentvolumes "pv-static" not found

And the AWS EBS that was used for this PV:

aws ec2 --profile arseniy --region us-east-2 describe-volumes --volume-ids vol-0928650905a2491e2
An error occurred (InvalidVolume.NotFound) when calling the DescribeVolumes operation: The volume 'vol-0928650905a2491e2' does not exist.

Actually, that’s all.

Also published on Medium.