I actually started to write about creating my own Kubernetes Operator, but decided to make a separate post about what a Kubernetes CustomResourceDefinition is, and how creating a CRD works at the level of the Kubernetes API and etcd.
That is, to start with how Kubernetes actually works with resources, and what happens when we create or edit resources.
The second part: Kubernetes: what is Kubernetes Operator and CustomResourceDefinition.
Kubernetes API
So, all communication with a Kubernetes cluster takes place through its main endpoint – the Kubernetes API, served by the API Server, a component of the Kubernetes Control Plane – see Cluster Architecture.
Documentation – The Kubernetes API and Kubernetes API Concepts.
Through the API, we communicate with Kubernetes, and all resources and information about them are stored in its database – etcd.
Other components of the Control Plane are the Kube Controller Manager with a set of default controllers that are responsible for working with resources, and the Scheduler, which is responsible for how resources will be placed on Worker Nodes.
The Kubernetes API is just a regular HTTPS REST API that we can access even with curl.
To access the cluster, we can use kubectl proxy, which takes the API Server address and token from ~/.kube/config and creates a tunnel to it.
I have access to AWS EKS configured, so the connection will go to it:
```
$ kubectl proxy --port=8080
Starting to serve on 127.0.0.1:8080
```
And now we can query the API:
```
$ curl -s localhost:8080 | jq
{
  "paths": [
    "/.well-known/openid-configuration",
    "/api",
    "/api/v1",
    "/apis",
    ...
    "/version"
  ]
}
```
Actually, what we see is a list of API endpoints supported by the Kubernetes API:
- /api: information on the Kubernetes API itself and the entry point to the core API Groups (see below)
- /api/v1: the core API group with Pods, ConfigMaps, Services, etc.
- /apis: APIGroupList – the rest of the API Groups in the system and their versions, including API Groups created from CRDs
  - for example, for the API Group operator.victoriametrics.com we can see support for two versions – "operator.victoriametrics.com/v1" and "operator.victoriametrics.com/v1beta1"
- /version: information on the cluster version
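For example, we can print just the names of all the API Groups registered in the cluster (the exact list depends on what is installed there):

```
# list only the names of the registered API Groups
$ curl -s localhost:8080/apis | jq -r '.groups[].name'
apps
events.k8s.io
...
```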
And then we can go deeper and see what’s inside each endpoint, for example, to get information about all Pods in the cluster:
```
$ curl -s localhost:8080/api/v1/pods | jq
...
    {
      "metadata": {
        "name": "backend-ws-deployment-6db58cc97c-k56lm",
        ...
        "namespace": "staging-backend-api-ns",
        "labels": {
          "app": "backend-ws",
          "component": "backend",
          ...
      "spec": {
        "volumes": [
          {
            "name": "eks-pod-identity-token",
            ...
        "containers": [
          {
            "name": "backend-ws-container",
            "image": "492***148.dkr.ecr.us-east-1.amazonaws.com/challenge-backend-api:v0.171.9",
            "command": [
              "gunicorn",
              "websockets_backend.run_api:app",
              ...
            "resources": {
              "requests": {
                "cpu": "200m",
                "memory": "512Mi"
              }
            },
...
```
Here we can see information about the Pod named "backend-ws-deployment-6db58cc97c-k56lm", which lives in the Kubernetes Namespace "staging-backend-api-ns", and the rest of the information about it – its volumes, containers, resources, etc.
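And to fetch a single object, we add the Namespace and the name to the URI – for example, for the same Pod (the names here are taken from the output above):

```
# get one Pod: /api/v1/namespaces/<namespace>/pods/<name>
$ curl -s localhost:8080/api/v1/namespaces/staging-backend-api-ns/pods/backend-ws-deployment-6db58cc97c-k56lm | jq '.metadata.name, .status.phase'
```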
Kubernetes API Groups and Kind
API Groups are a way to organize resources in Kubernetes: resources are grouped by group, version, and resource type (Kind).
That is, the structure of the API:

- API Group
  - versions
    - kind
For example, in /api/v1 we see the Kubernetes Core API Group, and in /apis – the API Groups apps, batch, events, and so on.
The structure will be as follows:
- /apis/<group> – the group itself and its versions
- /apis/<group>/<version> – a specific version of the group with its resources (Kind)
- /apis/<group>/<version>/<resource> – access to a specific resource and the objects in it
Note: Kind vs resource: the Kind is the name of the object type as specified in the resource's schema, while the resource is the name used to build the URI when querying the API Server.
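Both names can be seen side by side with kubectl api-resources – the NAME column is the resource, and the KIND column is the Kind (output shortened):

```
$ kubectl api-resources --api-group=apps
NAME           SHORTNAMES   APIVERSION   NAMESPACED   KIND
deployments    deploy       apps/v1      true         Deployment
statefulsets   sts          apps/v1      true         StatefulSet
...
```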
For example, the API Group apps has the version v1:
```
$ curl -s localhost:8080/apis/apps | jq
{
  "kind": "APIGroup",
  "apiVersion": "v1",
  "name": "apps",
  "versions": [
    {
      "groupVersion": "apps/v1",
      "version": "v1"
    }
  ],
...
```
And inside the version – resources, for example deployments:
```
$ curl -s localhost:8080/apis/apps/v1 | jq
{
  ...
    {
      "name": "deployments",
      "singularName": "deployment",
      "namespaced": true,
      "kind": "Deployment",
      "verbs": [
        "create",
        "delete",
        "deletecollection",
        "get",
        "list",
        "patch",
        "update",
        "watch"
      ],
      "shortNames": [
        "deploy"
      ],
      "categories": [
        "all"
      ],
      ...
```
And using the group, version, and resource name, we can get all the objects:
```
$ curl -s localhost:8080/apis/apps/v1/deployments/ | jq
{
  "kind": "DeploymentList",
  "apiVersion": "apps/v1",
  "metadata": {
    "resourceVersion": "1534"
  },
  "items": [
    {
      "metadata": {
        "name": "coredns",
        "namespace": "kube-system",
        "uid": "9d7f6de3-041e-4afe-84f4-e124d2cc6e8a",
        "resourceVersion": "709",
        "generation": 2,
        "creationTimestamp": "2025-07-12T10:15:33Z",
        "labels": {
          "k8s-app": "kube-dns"
        },
        ...
```
Okay, so we’ve accessed the API – but where does it get all that data that we’re being shown?
Kubernetes and etcd
For storing data in Kubernetes, we have another key component of the Control Plane – etcd.
Actually, this is just a key:value database with all the data that forms our cluster – all its settings, all resources, all states of these resources, RBAC rules, etc.
When the Kubernetes API Server receives a request, for example POST /apis/apps/v1/namespaces/default/deployments, it first checks whether the object matches the resource schema (validation), and only then saves it to etcd.
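For example, a minimal sketch of such a request – creating a hypothetical nginx-demo Deployment by POSTing JSON directly through the kubectl proxy we started above:

```
# the API Server validates this body against the apps/v1 Deployment schema,
# and only then persists the object to etcd
$ curl -s -X POST localhost:8080/apis/apps/v1/namespaces/default/deployments \
    -H 'Content-Type: application/json' \
    -d '{
      "apiVersion": "apps/v1",
      "kind": "Deployment",
      "metadata": {"name": "nginx-demo"},
      "spec": {
        "replicas": 1,
        "selector": {"matchLabels": {"app": "nginx-demo"}},
        "template": {
          "metadata": {"labels": {"app": "nginx-demo"}},
          "spec": {"containers": [{"name": "nginx", "image": "nginx:1.25"}]}
        }
      }
    }'
```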
The etcd database consists of a set of keys. For example, a Pod named "nginx-abc" in the default Namespace will be stored under the key /registry/pods/default/nginx-abc.
See the documentation Operating etcd clusters for Kubernetes.
In AWS EKS, we don't have access to etcd (and that's a good thing), but we can start Minikube and have a look at it:
```
$ minikube start
...
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
```
Check the system pods:
```
$ kubectl -n kube-system get pod
NAME                       READY   STATUS              RESTARTS   AGE
coredns-674b8bbfcf-68q8p   0/1     ContainerCreating   0          57s
etcd-minikube              1/1     Running             0          62s
...
```
Connect to the Minikube instance:
```
$ minikube ssh
```
If we had used minikube start --driver=virtualbox, then minikube ssh would have taken us into the VirtualBox VM. But since we are using the default docker driver, we simply enter the minikube container.
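With the docker driver, this is essentially the same as exec'ing into the node container directly:

```
# roughly equivalent to minikube ssh with the docker driver
# (note: docker exec enters as root, while minikube ssh logs in as the docker user)
$ docker exec -it minikube bash
```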
Install etcd here to get the etcdctl CLI utility:
```
docker@minikube:~$ sudo apt update
docker@minikube:~$ sudo apt install etcd
```
Check it:
```
docker@minikube:~$ etcdctl -version
etcdctl version: 3.3.25
```
And now we can see what’s in the database:
```
docker@minikube:~$ sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/var/lib/minikube/certs/etcd/ca.crt \
  --cert=/var/lib/minikube/certs/etcd/server.crt \
  --key=/var/lib/minikube/certs/etcd/server.key \
  get "" --prefix --keys-only
...
/registry/namespaces/kube-system
/registry/pods/kube-system/coredns-674b8bbfcf-68q8p
/registry/pods/kube-system/etcd-minikube
...
/registry/services/endpoints/default/kubernetes
/registry/services/endpoints/kube-system/kube-dns
...
```
The data in the keys is stored in the Protobuf (Protocol Buffers) format, so with a plain etcdctl get KEY the data will look a little crooked.
Let's see what is in the database about the Pod of etcd itself:
```
docker@minikube:~$ sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/var/lib/minikube/certs/etcd/ca.crt \
  --cert=/var/lib/minikube/certs/etcd/server.crt \
  --key=/var/lib/minikube/certs/etcd/server.key \
  get "/registry/pods/kube-system/etcd-minikube"
```
The result is mostly binary Protobuf data with just a few readable fragments. OK.
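To make it a bit more readable, the value can be piped through strings – a quick hack that keeps only the printable fragments (assuming strings is available in the container; for proper decoding there are dedicated tools):

```
# peek at the readable fragments inside the Protobuf value
docker@minikube:~$ sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/var/lib/minikube/certs/etcd/ca.crt \
  --cert=/var/lib/minikube/certs/etcd/server.crt \
  --key=/var/lib/minikube/certs/etcd/server.key \
  get "/registry/pods/kube-system/etcd-minikube" --print-value-only | strings | head
```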
CustomResourceDefinitions and Kubernetes API
So, when we create a CRD, we extend the Kubernetes API by creating our own API Group with our own name, version, and a new resource type (Kind) that is described in the CRD.
Documentation – Extend the Kubernetes API with CustomResourceDefinitions.
Let’s write a simple CRD:
```
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: myapps.mycompany.com
spec:
  group: mycompany.com
  names:
    kind: MyApp
    plural: myapps
    singular: myapp
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                image:
                  type: string
```
Here we have:

- use the existing API Group apiextensions.k8s.io and version v1
- from it, take the schema of the CustomResourceDefinition object
- and based on this schema, create our own API Group named mycompany.com
  - in this API Group, we describe a single resource type – kind: MyApp
  - and one version – v1
  - then, using openAPIV3Schema, we describe the schema of our resource – what fields it has and their types; here you can also set default values (see the OpenAPI Specification)
With this CRD, we will be able to create new Custom Resources with a manifest in which we pass the apiVersion, kind, and the spec.image field from the schema.openAPIV3Schema.properties.spec.properties.image of our CRD:
```
apiVersion: mycompany.com/v1
kind: MyApp
metadata:
  name: example
spec:
  image: nginx:1.25
```
Create the CRD (kk here is an alias for kubectl):
```
$ kk apply -f test-crd.yaml
customresourcedefinition.apiextensions.k8s.io/myapps.mycompany.com created
```
Check in the Kubernetes API (you can use the | jq '.groups[] | select(.name == "mycompany.com")' selector):
```
$ curl -s localhost:8080/apis/ | jq
...
    {
      "name": "mycompany.com",
      "versions": [
        {
          "groupVersion": "mycompany.com/v1",
          "version": "v1"
        }
      ],
      ...
    }
...
```
And the API Group mycompany.com itself:
```
$ curl -s localhost:8080/apis/mycompany.com/v1 | jq
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "mycompany.com/v1",
  "resources": [
    {
      "name": "myapps",
      "singularName": "myapp",
      "namespaced": true,
      "kind": "MyApp",
      "verbs": [
        "delete",
        "deletecollection",
        "get",
        "list",
        "patch",
        "create",
        "update",
        "watch"
      ],
      "storageVersionHash": "MZjF6nKlCOU="
    }
  ]
}
```
And in etcd:
```
docker@minikube:~$ sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/var/lib/minikube/certs/etcd/ca.crt \
  --cert=/var/lib/minikube/certs/etcd/server.crt \
  --key=/var/lib/minikube/certs/etcd/server.key \
  get "" --prefix --keys-only
/registry/apiextensions.k8s.io/customresourcedefinitions/myapps.mycompany.com
...
/registry/apiregistration.k8s.io/apiservices/v1.mycompany.com
...
```
Here, the /registry/apiextensions.k8s.io/customresourcedefinitions/myapps.mycompany.com key stores information about the new CRD itself – its structure, OpenAPI schema, versions, etc., while the /registry/apiregistration.k8s.io/apiservices/v1.mycompany.com key registers the API Service for this group, so that the group is accessible via the Kubernetes API.
And of course, we can see the CRD with kubectl:
```
$ kk get crd
NAME                   CREATED AT
myapps.mycompany.com   2025-07-12T11:23:19Z
```
Create the CustomResource itself from the manifest we wrote above:
```
$ kk apply -f test-resource.yaml
myapp.mycompany.com/example created
```
Test it:
```
$ kk describe MyApp
Name:         example
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  mycompany.com/v1
Kind:         MyApp
Metadata:
  Creation Timestamp:  2025-07-12T13:34:52Z
  Generation:          1
  Resource Version:    4611
  UID:                 a88e37fd-1477-4a7e-8c00-46c925f510ac
Spec:
  Image:  nginx:1.25
```
But for now, this is just data in etcd – no real Pods or other resources are created, because there is no controller that handles resources of Kind: MyApp.
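We can also find this object in etcd. Unlike built-in resources, Custom Resources are stored as plain JSON, and the key follows the /registry/&lt;group&gt;/&lt;plural&gt;/&lt;namespace&gt;/&lt;name&gt; pattern, so with the same etcdctl setup as above the value should be perfectly readable:

```
# Custom Resources are stored as JSON, not Protobuf
docker@minikube:~$ sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/var/lib/minikube/certs/etcd/ca.crt \
  --cert=/var/lib/minikube/certs/etcd/server.crt \
  --key=/var/lib/minikube/certs/etcd/server.key \
  get "/registry/mycompany.com/myapps/default/example" --print-value-only
```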
Note: looking ahead to the next post: actually, a Kubernetes Operator is a set of CRDs plus a controller that "controls" resources of the specified Kind.
Kubernetes API Service
When we add a new CRD, Kubernetes not only has to create a new key in etcd with the new API Group and the corresponding resource schema, but also to add a new endpoint to its routes – just as we do in Python with @app.get("/") in FastAPI – so that the API Server knows that a GET request to /apis/mycompany.com/v1/myapps should return resources of this type.
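We can verify that the route is actually served (with the kubectl proxy from the beginning of the post still running):

```
# list our Custom Resources via the freshly registered route;
# should return a MyAppList with the "example" object created above
$ curl -s localhost:8080/apis/mycompany.com/v1/namespaces/default/myapps | jq '.kind, .items[].metadata.name'
```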
The corresponding API Service will contain a spec with the group and version:
```
$ kk get apiservice v1.mycompany.com -o yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  creationTimestamp: "2025-07-12T11:53:52Z"
  labels:
    kube-aggregator.kubernetes.io/automanaged: "true"
  name: v1.mycompany.com
  resourceVersion: "2632"
  uid: 26fc8c6b-6770-422f-8996-3f35d86be6c7
spec:
  group: mycompany.com
  groupPriorityMinimum: 1000
  version: v1
  versionPriority: 100
...
```
That is, when we create a new CRD, the Kubernetes API Server creates an API Service (writing it to /registry/apiregistration.k8s.io/apiservices/v1.mycompany.com) and adds it to its own routes under the /apis endpoint.
And now, having an idea of what the API looks like and of the database that stores all the resources, we can move on to creating a CRD and a controller – that is, to actually writing the Operator itself.
But this is already in the next part.