For persistent data, Kubernetes provides two main types of objects – the PersistentVolume and the PersistentVolumeClaim.
A PersistentVolume is a storage device and a filesystem volume on it – for example, an AWS EBS volume attached to an AWS EC2 instance. From the cluster’s point of view, a PersistentVolume is a resource similar to, say, a Kubernetes Worker Node.
A PersistentVolumeClaim, in its turn, is a request to use such a PersistentVolume resource and is similar to a Kubernetes Pod: as a Pod requests CPU and memory from a Worker Node, a PersistentVolumeClaim requests a necessary storage size and an access type – ReadWriteOnce, ReadOnlyMany, or ReadWriteMany, see the AccessModes.
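In its shortest form, such a request looks like this (a minimal sketch – the my-claim name is arbitrary, full examples follow below):

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi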
A PersistentVolume can be created in two ways – static and dynamic (the recommended one).
When creating a PV statically, you first have to create a storage device, for example, an AWS EBS volume, which will be used by the PersistentVolume.
If a cluster can’t find an appropriate PV for a PersistentVolumeClaim, it can create a new storage device exactly for this PVC – this is the dynamic way of PV creation.
To make this work, a PVC has to have a StorageClass set, and this class has to be supported by the cluster.
For example, for AWS EKS, we have the gp2 StorageClass:
[simterm]
$ kubectl get storageclass
NAME            PROVISIONER             AGE
gp2 (default)   kubernetes.io/aws-ebs   64d
[/simterm]
Storage types
For a better understanding of the PersistentVolume concept, let’s review all the available storage types:
- Node-local storage (emptyDir and hostPath)
- Cloud volumes (for example, awsElasticBlockStore, gcePersistentDisk, and azureDiskVolume)
- File-sharing volumes, such as Network File System (NFS)
- Distributed filesystems (for example, CephFS, RBD, and GlusterFS)
- Special types such as PersistentVolumeClaim, secret, and gitRepo
emptyDir and hostPath are attached to pods directly and can store data only while such a pod is alive, while cloud volumes, NFS, and PersistentVolume are independent of pods and will keep data until the volume itself is deleted.
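For comparison, a minimal sketch of a pod with node-local storage via an emptyDir volume (the emptydir-example and cache names here are just for illustration) – its data is lost as soon as the pod is gone:

apiVersion: v1
kind: Pod
metadata:
  name: emptydir-example
spec:
  volumes:
    - name: cache
      emptyDir: {}
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - mountPath: /cache
          name: cache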
Create a PersistentVolumeClaim
Static PersistentVolume provisioning
Create an EBS
For static provisioning, we first need to create a storage device, in this case, an AWS EBS volume, and then create a PersistentVolume that will use this EBS.
Create an EBS:
[simterm]
$ aws ec2 --profile arseniy --region us-east-2 create-volume --availability-zone us-east-2a --size 50
{
    "AvailabilityZone": "us-east-2a",
    "CreateTime": "2020-07-29T13:10:12.000Z",
    "Encrypted": false,
    "Size": 50,
    "SnapshotId": "",
    "State": "creating",
    "VolumeId": "vol-0928650905a2491e2",
    "Iops": 150,
    "Tags": [],
    "VolumeType": "gp2"
}
[/simterm]
Save the VolumeId – “vol-0928650905a2491e2”.
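Optionally, to make this volume easier to find later, you can tag it – a sketch with the same profile and region (the pv-static Name value is an assumption):

[simterm]

$ aws ec2 --profile arseniy --region us-east-2 create-tags --resources vol-0928650905a2491e2 --tags Key=Name,Value=pv-static

[/simterm]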
Create a PersistentVolume
Write a manifest file, let’s call it pv-static.yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-static
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2
  awsElasticBlockStore:
    fsType: ext4
    volumeID: vol-0928650905a2491e2
Here:
- capacity: the storage size
- accessModes: the access type; here it is ReadWriteOnce, which means that this PV can be attached to only one Worker Node at the same time
- storageClassName: the storage class, see below
- awsElasticBlockStore: the device type used
  - fsType: a filesystem type to be created on this volume
  - volumeID: an AWS EBS disk ID
Create the PersistentVolume:
[simterm]
$ kubectl apply -f pv-static.yaml
persistentvolume/pv-static created
[/simterm]
Check it:
[simterm]
$ kubectl get pv
NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
pv-static   5Gi        RWO            Retain           Available                                    69s
[/simterm]
StorageClass
The storageClassName parameter sets the storage class. Both PVC and PV must have the same class; otherwise, the PVC will not find the PV, and the STATUS of such a PVC will be Pending.
If a PVC has no StorageClass set, then the default value will be used:
[simterm]
$ kubectl get storageclass -o wide
NAME            PROVISIONER             AGE
gp2 (default)   kubernetes.io/aws-ebs   65d
[/simterm]
At the same time, if the StorageClass is not set for a PV, this PV will be created without a class, and our PVC with the default class will not be able to use this PV, failing with the “Cannot bind to requested volume “pvname”: storageClassName does not match” error:
[simterm]
...
Events:
  Type     Reason          Age                  From                         Message
  ----     ------          ----                 ----                         -------
  Warning  VolumeMismatch  12s (x17 over 4m2s)  persistentvolume-controller  Cannot bind to requested volume "pvname": storageClassName does not match
...
[/simterm]
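If needed, you can also define your own class – a minimal sketch for the in-tree AWS EBS provisioner (the gp2-custom name here is an assumption for illustration):

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2-custom
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  fsType: ext4
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer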
See documentation here>>> and here>>>.
Create a PersistentVolumeClaim
Now, we can create a PersistentVolumeClaim that will use the PersistentVolume created above. Describe it in the pvc-static.yaml file:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-static
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  volumeName: pv-static
Create this PVC:
[simterm]
$ kubectl apply -f pvc-static.yaml
persistentvolumeclaim/pvc-static created
[/simterm]
Check it:
[simterm]
$ kubectl get pvc pvc-static
NAME         STATUS   VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-static   Bound    pv-static   5Gi        RWO            gp2            31s
[/simterm]
Dynamic PersistentVolume provisioning
The dynamic way of creating a PersistentVolume is similar to the static one, with the only difference that you don’t need to create the AWS EBS and PersistentVolume resources manually. Instead, you just create a PersistentVolumeClaim object, and Kubernetes will create an EBS volume via the AWS API and attach it to the AWS EC2 instance playing the Worker Node role in the Kubernetes cluster:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-dynamic
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
Create this PVC:
[simterm]
$ kubectl apply -f pvc-dynamic.yaml
persistentvolumeclaim/pvc-dynamic created
[/simterm]
Check it:
[simterm]
$ kubectl get pvc pvc-dynamic
NAME          STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-dynamic   Pending                                      gp2            45s
[/simterm]
Okay, but why is it in the Pending STATUS? Check its Events:
[simterm]
$ kubectl describe pvc pvc-dynamic
...
Events:
  Type    Reason                Age               From                         Message
  ----    ------                ----              ----                         -------
  Normal  WaitForFirstConsumer  1s (x4 over 33s)  persistentvolume-controller  waiting for first consumer to be created before binding
Mounted By:  <none>
[/simterm]
WaitForFirstConsumer
Let’s see our default StorageClass’s settings:
[simterm]
$ kubectl describe sc gp2
Name:            gp2
IsDefaultClass:  Yes
...
Provisioner:           kubernetes.io/aws-ebs
Parameters:            fsType=ext4,type=gp2
...
VolumeBindingMode:     WaitForFirstConsumer
Events:                <none>
[/simterm]
Here, the VolumeBindingMode defines how exactly a PersistentVolume will be created. With the Immediate value, such a PV will be created as soon as a requesting PVC appears, but with WaitForFirstConsumer, as in this case, Kubernetes will wait for a first consumer, such as a pod, that requests this PV, and then, depending on the AvailabilityZone of the Worker Node where this pod is running, will create a new PV and an AWS EBS disk.
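The binding mode of a class can also be checked quickly with a jsonpath query instead of the full describe output – a sketch, assuming the same gp2 class:

[simterm]

$ kubectl get sc gp2 -o jsonpath='{.volumeBindingMode}'
WaitForFirstConsumer

[/simterm]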
Now, let’s create pods to consume those volumes.
Using PersistentVolumeClaim in Pods
Dynamic PersistentVolumeClaim
Let’s describe a pod which will use our dynamic PVC:
apiVersion: v1
kind: Pod
metadata:
  name: pv-dynamic-pod
spec:
  volumes:
    - name: pv-dynamic-storage
      persistentVolumeClaim:
        claimName: pvc-dynamic
  containers:
    - name: pv-dynamic-container
      image: nginx
      ports:
        - containerPort: 80
          name: "nginx"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: pv-dynamic-storage
Here:
- volumes:
  - persistentVolumeClaim:
    - claimName: a PVC name which will be requested when a pod is created
- containers:
  - volumeMounts: mount the pv-dynamic-storage volume to the /usr/share/nginx/html directory in the pod
Create it:
[simterm]
$ kubectl apply -f pv-pods.yaml
pod/pv-dynamic-pod created
[/simterm]
Check our PVC again:
[simterm]
$ kubectl get pvc pvc-dynamic
NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-dynamic   Bound    pvc-6d024b40-a239-4c35-8694-f060bd117053   5Gi        RWO            gp2            21h
[/simterm]
Now we can see a new Volume with the ID pvc-6d024b40-a239-4c35-8694-f060bd117053 – check it:
[simterm]
$ kubectl describe pvc pvc-dynamic
Name:          pvc-dynamic
Namespace:     default
StorageClass:  gp2
Status:        Bound
Volume:        pvc-6d024b40-a239-4c35-8694-f060bd117053
...
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      5Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Events:        <none>
Mounted By:    pv-dynamic-pod
[/simterm]
Check that volume:
[simterm]
$ kubectl describe pv pvc-6d024b40-a239-4c35-8694-f060bd117053
Name:              pvc-6d024b40-a239-4c35-8694-f060bd117053
...
StorageClass:      gp2
Status:            Bound
Claim:             default/pvc-dynamic
Reclaim Policy:    Delete
Access Modes:      RWO
VolumeMode:        Filesystem
Capacity:          5Gi
Node Affinity:
  Required Terms:
    Term 0:        failure-domain.beta.kubernetes.io/zone in [us-east-2b]
                   failure-domain.beta.kubernetes.io/region in [us-east-2]
Message:
Source:
    Type:       AWSElasticBlockStore (a Persistent Disk resource in AWS)
    VolumeID:   aws://us-east-2b/vol-040a5e004876f1a40
    FSType:     ext4
    Partition:  0
    ReadOnly:   false
Events:         <none>
[/simterm]
And the AWS EBS volume vol-040a5e004876f1a40:
[simterm]
$ aws ec2 --profile arseniy --region us-east-2 describe-volumes --volume-ids vol-040a5e004876f1a40 --output json
{
    "Volumes": [
        {
            "Attachments": [
                {
                    "AttachTime": "2020-07-30T11:08:29.000Z",
                    "Device": "/dev/xvdcy",
                    "InstanceId": "i-0a3225e9fe7cb7629",
                    "State": "attached",
                    "VolumeId": "vol-040a5e004876f1a40",
                    "DeleteOnTermination": false
                }
            ],
...
[/simterm]
Check inside the pod:
[simterm]
$ kk exec -ti pv-dynamic-pod bash
root@pv-dynamic-pod:/# lsblk
NAME          MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme0n1       259:0   0  50G  0 disk
|-nvme0n1p1   259:1   0  50G  0 part /etc/hosts
`-nvme0n1p128 259:2   0   1M  0 part
nvme1n1       259:3   0   5G  0 disk /usr/share/nginx/html
[/simterm]
nvme1n1 – here is our new disk, mounted to the /usr/share/nginx/html directory.
Let’s write some data:
[simterm]
root@pv-dynamic-pod:/# echo Test > /usr/share/nginx/html/index.html
[/simterm]
Drop the pod:
[simterm]
$ kk delete pod pv-dynamic-pod
pod "pv-dynamic-pod" deleted
[/simterm]
Re-create it:
[simterm]
$ kubectl apply -f pv-pods.yaml
pod/pv-dynamic-pod created
[/simterm]
Check the data:
[simterm]
$ kk exec -ti pv-dynamic-pod cat /usr/share/nginx/html/index.html
Test
[/simterm]
Everything is still in its place.
Static PersistentVolumeClaim
Now, let’s try to use our statically created PV.
We can use the same manifest – the pv-static.yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-static
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2
  awsElasticBlockStore:
    fsType: ext4
    volumeID: vol-0928650905a2491e2
And let’s use the pvc-static.yaml manifest for our PVC:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-static
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  volumeName: pv-static
Create the PV:
[simterm]
$ kk apply -f pv-static.yaml
persistentvolume/pv-static created
[/simterm]
Check it:
[simterm]
$ kk get pv
NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
pv-static   5Gi        RWO            Retain           Available           gp2                     58s
...
[/simterm]
Create the PVC:
[simterm]
$ kk apply -f pvc-static.yaml
persistentvolumeclaim/pvc-static created
[/simterm]
Check it:
[simterm]
$ kk get pvc pvc-static
NAME         STATUS   VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-static   Bound    pv-static   5Gi        RWO            gp2            9s
[/simterm]
The STATUS Bound means that the PVC has found its PV and was successfully bound to it.
Pod nodeAffinity
Next, we need to determine the AWS AvailabilityZone where the AWS EBS for our static PV was created:
[simterm]
$ aws ec2 --profile arseniy --region us-east-2 describe-volumes --volume-ids vol-0928650905a2491e2 --query '[Volumes[*].AvailabilityZone]' --output text
us-east-2a
[/simterm]
us-east-2a – okay, then we need to create a pod on a Kubernetes Worker Node in the same AvailabilityZone.
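To see which Worker Nodes live in which AvailabilityZone, the zone label can be printed as an extra column – a sketch using the same legacy failure-domain label as in this cluster (the node’s age and version will differ):

[simterm]

$ kubectl get nodes -L failure-domain.beta.kubernetes.io/zone
NAME                                       STATUS   ROLES    AGE   VERSION   ZONE
ip-10-3-47-58.us-east-2.compute.internal   Ready    <none>   ...   ...       us-east-2a

[/simterm]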
Create a manifest:
apiVersion: v1
kind: Pod
metadata:
  name: pv-static-pod
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: failure-domain.beta.kubernetes.io/zone
                operator: In
                values:
                  - us-east-2a
  volumes:
    - name: pv-static-storage
      persistentVolumeClaim:
        claimName: pvc-static
  containers:
    - name: pv-static-container
      image: nginx
      ports:
        - containerPort: 80
          name: "nginx"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: pv-static-storage
As opposed to the dynamic PVC, here we’ve used the nodeAffinity to specify that we want to use a node from the us-east-2a AZ.
Create that pod:
[simterm]
$ kk apply -f pv-pod-stat.yaml
pod/pv-static-pod created
[/simterm]
Check events:
[simterm]
0s  Normal  Scheduled               Pod  Successfully assigned default/pv-static-pod to ip-10-3-47-58.us-east-2.compute.internal
0s  Normal  SuccessfulAttachVolume  Pod  AttachVolume.Attach succeeded for volume "pv-static"
0s  Normal  Pulling                 Pod  Pulling image "nginx"
0s  Normal  Pulled                  Pod  Successfully pulled image "nginx"
0s  Normal  Created                 Pod  Created container pv-static-container
0s  Normal  Started                 Pod  Started container pv-static-container
[/simterm]
Partitions in the pod:
[simterm]
$ kk exec -ti pv-static-pod bash
root@pv-static-pod:/# lsblk
NAME          MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme0n1       259:0   0  50G  0 disk
|-nvme0n1p1   259:1   0  50G  0 part /etc/hosts
`-nvme0n1p128 259:2   0   1M  0 part
nvme1n1       259:3   0  50G  0 disk /usr/share/nginx/html
[/simterm]
nvme1n1 is mounted, everything works.
PersistentVolume nodeAffinity
Another option is to set the nodeAffinity on the PersistentVolume itself. In this case, when creating a pod that will use this PV, Kubernetes will first check which Worker Nodes this volume can be attached to, and then will create the pod on such a node.
In the pod’s manifest, delete the nodeAffinity:
apiVersion: v1
kind: Pod
metadata:
  name: pv-static-pod
spec:
  volumes:
    - name: pv-static-storage
      persistentVolumeClaim:
        claimName: pvc-static
  containers:
    - name: pv-static-container
      image: nginx
      ports:
        - containerPort: 80
          name: "nginx"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: pv-static-storage
And add to the PV’s manifest:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-static
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: failure-domain.beta.kubernetes.io/zone
              operator: In
              values:
                - us-east-2a
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2
  awsElasticBlockStore:
    fsType: ext4
    volumeID: vol-0928650905a2491e2
Create this PV:
[simterm]
$ kk apply -f pv-static.yaml
persistentvolume/pv-static created
[/simterm]
Create its PVC – nothing was changed here:
[simterm]
$ kk apply -f pvc-static.yaml
persistentvolumeclaim/pvc-static created
[/simterm]
Create the pod:
[simterm]
$ kk apply -f pv-pod-stat.yaml
pod/pv-static-pod created
[/simterm]
Check the events:
[simterm]
0s  Normal  Scheduled               Pod  Successfully assigned default/pv-static-pod to ip-10-3-47-58.us-east-2.compute.internal
0s  Normal  SuccessfulAttachVolume  Pod  AttachVolume.Attach succeeded for volume "pv-static"
0s  Normal  Pulling                 Pod  Pulling image "nginx"
0s  Normal  Pulled                  Pod  Successfully pulled image "nginx"
0s  Normal  Created                 Pod  Created container pv-static-container
0s  Normal  Started                 Pod  Started container pv-static-container
[/simterm]
Delete PersistentVolume and PersistentVolumeClaim
When a user wants to delete a PVC that is currently used by a live pod, such a PVC will not be deleted immediately – it will stay present as long as the pod using it is running.
Similarly, when deleting a PersistentVolume that is bound by a PersistentVolumeClaim, the PV will not be deleted while that binding exists, i.e. until its PVC is deleted.
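This protection is implemented via finalizers on the objects, which can be checked with a jsonpath query – a sketch, assuming the pvc-static claim from above:

[simterm]

$ kubectl get pvc pvc-static -o jsonpath='{.metadata.finalizers[0]}'
kubernetes.io/pvc-protection

[/simterm]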
Reclaiming
Documentation is here>>>.
When we want to finish working with our PersistentVolume, we can delete it from the cluster to release the corresponding AWS EBS volume (reclaim it).
The Reclaim policy of a PersistentVolume tells the cluster what to do with such a released volume and can have the Retain, Recycle, or Delete value.
Retain
The Retain policy allows us to clean up the disk manually. After deleting the related PersistentVolumeClaim, the PersistentVolume will not be deleted and will be marked as “released“, but it is not yet available for new PersistentVolumeClaims, as it still keeps data from the previous claim.
To make the underlying storage available for the next use, you need to delete the PersistentVolume object from the cluster and, if needed, clean up the data on the device manually.
Delete
With the Delete value, deleting a PVC will also delete its corresponding PersistentVolume and the underlying storage device, such as an AWS EBS, GCE PD, or Azure Disk volume. Keep in mind that volumes created in the dynamic way inherit the reclaim policy of the StorageClass used, which is set to Delete by default.
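This can be checked on the class itself – a quick jsonpath sketch against the default gp2 class:

[simterm]

$ kubectl get sc gp2 -o jsonpath='{.reclaimPolicy}'
Delete

[/simterm]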
Recycle
Deprecated; it was used to perform a basic scrub of the data via rm -rf.
Deleting PV and PVC – an example
So, we have a pod running:
[simterm]
$ kk get pod pv-static-pod
NAME            READY   STATUS    RESTARTS   AGE
pv-static-pod   1/1     Running   0          19s
[/simterm]
Which is using a PVC:
[simterm]
$ kk get pvc pvc-static
NAME         STATUS   VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-static   Bound    pv-static   50Gi       RWO            gp2            19h
[/simterm]
And this PVC is bound to the PV:
[simterm]
$ kk get pv pv-static
NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                STORAGECLASS   REASON   AGE
pv-static   50Gi       RWO            Retain           Bound    default/pvc-static   gp2                     19h
[/simterm]
And our PV has its RECLAIM POLICY set to Retain – so, after we drop its PVC and the PV itself, all data must be kept.
Let’s check – add some data:
[simterm]
$ kk exec -ti pv-static-pod bash
root@pv-static-pod:/# echo Test > /usr/share/nginx/html/test.txt
root@pv-static-pod:/# cat /usr/share/nginx/html/test.txt
Test
[/simterm]
Exit from the pod, delete it, and then delete its PVC:
[simterm]
$ kubectl delete pod pv-static-pod
pod "pv-static-pod" deleted
$ kubectl delete pvc pvc-static
persistentvolumeclaim "pvc-static" deleted
[/simterm]
Check the PV’s status:
[simterm]
$ kubectl get pv
NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                STORAGECLASS   REASON   AGE
pv-static   50Gi       RWO            Retain           Released   default/pvc-static   gp2                     25s
[/simterm]
The STATUS is Released, and at this moment we are not able to attach this volume again via a new PVC.
Let’s check – create a PVC again:
[simterm]
$ kubectl apply -f pvc-static.yaml
persistentvolumeclaim/pvc-static created
[/simterm]
Create a pod:
[simterm]
$ kubectl apply -f pv-pod-stat.yaml
pod/pv-static-pod created
[/simterm]
And check its PVC status:
[simterm]
$ kubectl get pvc pvc-static
NAME         STATUS    VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-static   Pending   pv-static   0                         gp2            59s
[/simterm]
The STATUS is Pending.
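The PV stays Released because its spec still keeps a claimRef to the old, already deleted PVC. A known workaround (a sketch – use it with care, as the previous data is still on the volume) is to drop the claimRef, after which the PV becomes Available again:

[simterm]

$ kubectl patch pv pv-static -p '{"spec":{"claimRef": null}}'
persistentvolume/pv-static patched

[/simterm]

Here, instead, we will simply recreate all the objects.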
Delete the pod and the PVC, and this time delete the PersistentVolume too:
[simterm]
$ kubectl delete -f pv-pod-stat.yaml
pod "pv-static-pod" deleted
$ kubectl delete -f pvc-static.yaml
persistentvolumeclaim "pvc-static" deleted
$ kubectl delete -f pv-static.yaml
persistentvolume "pv-static" deleted
[/simterm]
Create everything all over again:
[simterm]
$ kubectl apply -f pv-static.yaml
persistentvolume/pv-static created
$ kubectl apply -f pvc-static.yaml
persistentvolumeclaim/pvc-static created
$ kubectl apply -f pv-pod-stat.yaml
pod/pv-static-pod created
[/simterm]
Check the PVC:
[simterm]
$ kubectl get pvc pvc-static
NAME         STATUS   VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-static   Bound    pv-static   50Gi       RWO            gp2            27s
[/simterm]
And check the data we’ve added earlier:
[simterm]
$ kubectl exec -ti pv-static-pod cat /usr/share/nginx/html/test.txt
Test
[/simterm]
All good – the data is still in its place.
Changing Reclaim Policy for PersistentVolume
Documentation is here>>>.
Currently, our PV has the Retain value:
[simterm]
$ kubectl get pv pv-static -o jsonpath='{.spec.persistentVolumeReclaimPolicy}'
Retain
[/simterm]
Apply a patch – update its persistentVolumeReclaimPolicy parameter to the Delete value:
[simterm]
$ kubectl patch pv pv-static -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}'
persistentvolume/pv-static patched
[/simterm]
Check it:
[simterm]
$ kubectl get pv pv-static -o jsonpath='{.spec.persistentVolumeReclaimPolicy}'
Delete
[/simterm]
Delete the pod and its PVC:
[simterm]
$ kubectl delete -f pv-pod-stat.yaml
pod "pv-static-pod" deleted
$ kubectl delete -f pvc-static.yaml
persistentvolumeclaim "pvc-static" deleted
[/simterm]
Check the PersistentVolume:
[simterm]
$ kubectl get pv pv-static
Error from server (NotFound): persistentvolumes "pv-static" not found
[/simterm]
And the AWS EBS volume that was used for this PV:
[simterm]
$ aws ec2 --profile arseniy --region us-east-2 describe-volumes --volume-ids vol-0928650905a2491e2
An error occurred (InvalidVolume.NotFound) when calling the DescribeVolumes operation: The volume 'vol-0928650905a2491e2' does not exist.
[/simterm]
Actually, that’s all.
Useful links
- Topology-Aware Volume Provisioning in Kubernetes
- Using preexisting persistent disks as PersistentVolumes
- Persistent volumes with persistent disks
- Kubernetes Persistent Storage: Why, Where and How
- Stateful Containers on Kubernetes using Persistent Volume and Amazon EBS