Kubernetes: the Kubernetes API, API Groups, CRDs, and etcd

07/21/2025

I actually started to write about creating my own Kubernetes Operator, but decided to make a separate topic about what a Kubernetes CustomResourceDefinition is, and how creating a CRD works at the level of the Kubernetes API and etcd.

That is, to start with how Kubernetes actually works with resources, and what happens when we create or edit resources.

The second part: Kubernetes: what is Kubernetes Operator and CustomResourceDefinition.

Kubernetes API

So, all communication with the Kubernetes Control Plane takes place through its main endpoint – the Kubernetes API, which is a component of the Kubernetes Control Plane – see Cluster Architecture.

Documentation – The Kubernetes API and Kubernetes API Concepts.

Through the API, we communicate with Kubernetes, and all resources and information about them are stored in the database – etcd.

Other components of the Control Plane are the Kube Controller Manager with a set of default controllers that are responsible for working with resources, and the Scheduler, which decides on which Worker Nodes Pods will be placed.

The Kubernetes API is just a regular HTTPS REST API that we can access even with plain curl.

To access the cluster, we can use kubectl proxy, which will take the parameters from ~/.kube/config with the API Server address and token, and create a tunnel to it.

I have access to AWS EKS configured, so the connection will go to it:

$ kubectl proxy --port=8080 
Starting to serve on 127.0.0.1:8080

And we turn to the API:

$ curl -s localhost:8080 | jq
{
  "paths": [
    "/.well-known/openid-configuration",
    "/api",
    "/api/v1",
    "/apis",
    ...
    "/version"
  ]
}

Actually, what we see is a list of API endpoints supported by the Kubernetes API:

  • /api/: information on the Kubernetes API itself and the entry point to the core API Groups (see below)
  • /api/v1: core API group with Pods, ConfigMaps, Services, etc.
  • /apis/: APIGroupList – the rest of the API Groups in the system and their versions, including API Groups created from different CRDs
    • for example, for the API Group operator.victoriametrics.com we can see support for two versions – operator.victoriametrics.com/v1 and operator.victoriametrics.com/v1beta1
  • /version: information on the cluster version
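
The discovery responses can also be processed in code. Below is a small sketch of my own (not part of kubectl) that extracts group names and versions from an APIGroupList document; the sample JSON is a trimmed, hard-coded stand-in for what `GET /apis` returns through the `kubectl proxy` above:

```python
import json

# A trimmed sample of what GET /apis returns (an APIGroupList object);
# with `kubectl proxy` running you would fetch it from http://localhost:8080/apis instead.
APIS_RESPONSE = """
{
  "kind": "APIGroupList",
  "apiVersion": "v1",
  "groups": [
    {"name": "apps",
     "versions": [{"groupVersion": "apps/v1", "version": "v1"}]},
    {"name": "operator.victoriametrics.com",
     "versions": [{"groupVersion": "operator.victoriametrics.com/v1", "version": "v1"},
                  {"groupVersion": "operator.victoriametrics.com/v1beta1", "version": "v1beta1"}]}
  ]
}
"""

def group_versions(apigrouplist: dict) -> dict:
    """Map each API Group name to the list of versions it serves."""
    return {g["name"]: [v["version"] for v in g["versions"]]
            for g in apigrouplist["groups"]}

print(group_versions(json.loads(APIS_RESPONSE)))
# {'apps': ['v1'], 'operator.victoriametrics.com': ['v1', 'v1beta1']}
```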

And then we can go deeper and see what’s inside each endpoint, for example, to get information about all Pods in the cluster:

$ curl -s localhost:8080/api/v1/pods | jq
...
    {
      "metadata": {
        "name": "backend-ws-deployment-6db58cc97c-k56lm",
      ...
        "namespace": "staging-backend-api-ns"
        "labels": {
          "app": "backend-ws",
          "component": "backend",
      ...
      "spec": {
        "volumes": [
          {
            "name": "eks-pod-identity-token",
      ...
        "containers": [
          {
            "name": "backend-ws-container",
            "image": "492***148.dkr.ecr.us-east-1.amazonaws.com/challenge-backend-api:v0.171.9",
            "command": [
              "gunicorn",
              "websockets_backend.run_api:app",
      ...
            "resources": {
              "requests": {
                "cpu": "200m",
                "memory": "512Mi"
              }
            },
...

Here we can see information about the Pod named "backend-ws-deployment-6db58cc97c-k56lm", which lives in the Kubernetes Namespace "staging-backend-api-ns", and the rest of the information about it: its volumes, containers, resource requests, etc.

Kubernetes API Groups and Kind

API Groups are a way to organize resources in Kubernetes: resources are grouped by API Group, version, and resource type (Kind).

That is the structure of the API:

  • API Group
    • versions
      • kind

For example, in /api/v1 we see the Kubernetes Core API Group, in /apis – API Groups apps, batch, events, and so on.

The structure will be as follows:

  • /apis/<group> – the group itself and its versions
  • /apis/<group>/<version> – a specific version of the group with specific resources (Kind)
  • /apis/<group>/<version>/<resource> – access to a specific resource and objects in it

Note: Kind vs resource: Kind is the name of the object type as specified in the resource's schema, while resource is the (plural, lowercase) name used to build the URI when requesting the API Server – for example, Kind Deployment vs resource deployments.

For example, for the API Group apps we have the version v1:

$ curl -s localhost:8080/apis/apps | jq
{
  "kind": "APIGroup",
  "apiVersion": "v1",
  "name": "apps",
  "versions": [
    {
      "groupVersion": "apps/v1",
      "version": "v1"
    }
  ],
...

And inside the version – resources, for example deployments:

$ curl -s localhost:8080/apis/apps/v1 | jq
{
...
    {
      "name": "deployments",
      "singularName": "deployment",
      "namespaced": true,
      "kind": "Deployment",
      "verbs": [
        "create",
        "delete",
        "deletecollection",
        "get",
        "list",
        "patch",
        "update",
        "watch"
      ],
      "shortNames": [
        "deploy"
      ],
      "categories": [
        "all"
      ],
...

And using this group, version, and resource name, we get all the objects:

$ curl -s localhost:8080/apis/apps/v1/deployments/ | jq
{
  "kind": "DeploymentList",
  "apiVersion": "apps/v1",
  "metadata": {
    "resourceVersion": "1534"
  },
  "items": [
    {
      "metadata": {
        "name": "coredns",
        "namespace": "kube-system",
        "uid": "9d7f6de3-041e-4afe-84f4-e124d2cc6e8a",
        "resourceVersion": "709",
        "generation": 2,
        "creationTimestamp": "2025-07-12T10:15:33Z",
        "labels": {
          "k8s-app": "kube-dns"
        },
...

Okay, so we’ve accessed the API – but where does it get all that data that we’re being shown?

Kubernetes and etcd

For storing data in Kubernetes, we have another key component of the Control Plane – etcd.

Actually, this is just a key:value database with all the data that forms our cluster – all its settings, all resources, all states of these resources, RBAC rules, etc.

When the Kubernetes API Server receives a request, for example, POST /apis/apps/v1/namespaces/default/deployments, it first checks if the object matches the resource schema (validation), and only then saves it to etcd.

The etcd database consists of a set of keys. For example, a Pod named “nginx-abc” will be stored in a key named /registry/pods/default/nginx-abc.

See the documentation Operating etcd clusters for Kubernetes.

In AWS EKS, we don’t have access to etcd (and that’s a good thing), but we can start Minikube and have a look at it:

$ minikube start
...
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default

Check the system pods:

$ kubectl -n kube-system get pod
NAME                               READY   STATUS              RESTARTS      AGE
coredns-674b8bbfcf-68q8p           0/1     ContainerCreating   0             57s
etcd-minikube                      1/1     Running             0             62s
...

Connect to the cluster:

$ minikube ssh

If we had started Minikube with minikube start --driver=virtualbox, minikube ssh would take us into the VirtualBox VM; with the default docker driver, it simply drops us into the minikube container.

Install etcd here to get the etcdctl CLI utility:

docker@minikube:~$ sudo apt update 
docker@minikube:~$ sudo apt install etcd

Check it:

docker@minikube:~$ etcdctl -version 
etcdctl version: 3.3.25

And now we can see what’s in the database:

docker@minikube:~$ sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/var/lib/minikube/certs/etcd/ca.crt \
  --cert=/var/lib/minikube/certs/etcd/server.crt \
  --key=/var/lib/minikube/certs/etcd/server.key \
  get "" --prefix --keys-only
...
/registry/namespaces/kube-system
/registry/pods/kube-system/coredns-674b8bbfcf-68q8p
/registry/pods/kube-system/etcd-minikube
...
/registry/services/endpoints/default/kubernetes
/registry/services/endpoints/kube-system/kube-dns
...

The data in the keys is stored in Protobuf (Protocol Buffers) format, so with a plain etcdctl get KEY the output will be partly binary and hard to read.
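
As far as I know, the API Server wraps Protobuf-encoded objects in an envelope that starts with the 4 magic bytes "k8s\x00", while JSON-encoded values (used, for example, for Custom Resources) start with "{". A small sketch that guesses the encoding of a raw etcd value:

```python
# Magic prefix of the Kubernetes Protobuf storage envelope.
PROTO_MAGIC = b"k8s\x00"

def storage_format(raw: bytes) -> str:
    """Guess how an etcd value was encoded by the API Server."""
    if raw.startswith(PROTO_MAGIC):
        return "protobuf"
    if raw.lstrip().startswith(b"{"):
        return "json"
    return "unknown"

print(storage_format(b"k8s\x00\x0a\x09\x0a\x02v1"))        # protobuf
print(storage_format(b'{"apiVersion":"mycompany.com/v1"}'))  # json
```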

Let’s see what is in the database about the Pod of etcd itself:

docker@minikube:~$ sudo ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/var/lib/minikube/certs/etcd/ca.crt --cert=/var/lib/minikube/certs/etcd/server.crt --key=/var/lib/minikube/certs/etcd/server.key get "/registry/pods/kube-system/etcd-minikube"

The result is the raw value of the key: mostly binary Protobuf, with some readable fragments.

OK.

CustomResourceDefinitions and Kubernetes API

So, when we create a CRD, we extend the Kubernetes API by creating our own API Group with our own name, version, and a new resource type (Kind) that is described in the CRD.

Documentation – Extend the Kubernetes API with CustomResourceDefinitions.

Let’s write a simple CRD:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: myapps.mycompany.com
spec:
  group: mycompany.com
  names:
    kind: MyApp
    plural: myapps
    singular: myapp
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                image:
                  type: string

Here we have:

  • use the existing API Group apiextensions.k8s.io and version v1
    • from it take the schema of the CustomResourceDefinition object
  • and based on this schema, we create our own API Group named mycompany.com
    • in this API Group, we describe a single resource type – kind: MyApp
    • and one version – v1
    • then using openAPIV3Schema we describe the schema of our resource – what fields it has and their types, and here you can also set default values (see OpenAPI Specification)
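
To make the role of openAPIV3Schema concrete, here is a minimal toy validator of my own that handles only the two keywords this CRD uses (`type` and `properties`); the real apiextensions validator supports the full OpenAPI v3 vocabulary (required, defaults, patterns, etc.):

```python
def validate(obj, schema, path="spec"):
    """Minimal sketch of openAPIV3Schema validation: checks only `type` and
    `properties`, which is all the CRD above uses."""
    errors = []
    t = schema.get("type")
    if t == "object":
        if not isinstance(obj, dict):
            return [f"{path}: expected object, got {type(obj).__name__}"]
        for key, sub in schema.get("properties", {}).items():
            if key in obj:
                errors += validate(obj[key], sub, f"{path}.{key}")
    elif t == "string" and not isinstance(obj, str):
        errors.append(f"{path}: expected string, got {type(obj).__name__}")
    return errors

# The spec schema from the MyApp CRD above.
crd_spec_schema = {"type": "object",
                   "properties": {"image": {"type": "string"}}}

print(validate({"image": "nginx:1.25"}, crd_spec_schema))  # [] - valid
print(validate({"image": 125}, crd_spec_schema))           # spec.image type error
```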

With this CRD, we will be able to create new Custom Resources with a manifest in which we pass the apiVersion, kind, and spec.image fields from the schema.openAPIV3Schema.properties.spec.properties.image of our CRD:

apiVersion: mycompany.com/v1
kind: MyApp
metadata:
  name: example
spec:
  image: nginx:1.25

Create the CRD:

$ kk apply -f test-crd.yaml 
customresourcedefinition.apiextensions.k8s.io/myapps.mycompany.com created

Check in the Kubernetes API (you can use the | jq '.groups[] | select(.name == "mycompany.com")' selector):

$ curl -s localhost:8080/apis/ | jq
...
{
  "name": "mycompany.com",
  "versions": [
    {
      "groupVersion": "mycompany.com/v1",
      "version": "v1"
    }
  ],
  ...
}
...

And the API Group mycompany.com itself:

$ curl -s localhost:8080/apis/mycompany.com/v1 | jq
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "mycompany.com/v1",
  "resources": [
    {
      "name": "myapps",
      "singularName": "myapp",
      "namespaced": true,
      "kind": "MyApp",
      "verbs": [
        "delete",
        "deletecollection",
        "get",
        "list",
        "patch",
        "create",
        "update",
        "watch"
      ],
      "storageVersionHash": "MZjF6nKlCOU="
    }
  ]
}

And in etcd:

docker@minikube:~$ sudo ETCDCTL_API=3 etcdctl   --endpoints=https://127.0.0.1:2379   --cacert=/var/lib/minikube/certs/etcd/ca.crt   --cert=/var/lib/minikube/certs/etcd/server.crt   --key=/var/lib/minikube/certs/etcd/server.key   get "" --prefix --keys-only
/registry/apiextensions.k8s.io/customresourcedefinitions/myapps.mycompany.com
...
/registry/apiregistration.k8s.io/apiservices/v1.mycompany.com
...

Here, the /registry/apiextensions.k8s.io/customresourcedefinitions/myapps.mycompany.com key stores information about the new CRD itself – its structure, OpenAPI schema, versions, etc. – and the /registry/apiregistration.k8s.io/apiservices/v1.mycompany.com key registers the API Service for this group, so the group can be accessed via the Kubernetes API.

And of course, we can see the CRD with kubectl:

$ kk get crd
NAME                   CREATED AT
myapps.mycompany.com   2025-07-12T11:23:19Z

Create the CustomResource itself from the manifest we wrote above:

$ kk apply -f test-resource.yaml 
myapp.mycompany.com/example created

Test it:

$ kk describe MyApp
Name:         example
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  mycompany.com/v1
Kind:         MyApp
Metadata:
  Creation Timestamp:  2025-07-12T13:34:52Z
  Generation:          1
  Resource Version:    4611
  UID:                 a88e37fd-1477-4a7e-8c00-46c925f510ac
Spec:
  Image:  nginx:1.25

But this is just data in etcd for now – no real Pods are created, because there is no controller that handles resources of Kind: MyApp.

Note: looking ahead to the next post: a Kubernetes Operator is, in fact, a set of CRDs plus a controller that “controls” resources of the specified Kind.

Kubernetes API Service

When we add a new CRD, Kubernetes not only has to create a new key in etcd with the new API Group and the corresponding resource schema, but also add a new endpoint to its routes – just like we do in Python with @app.get("/") in FastAPI – so that the API Server knows that a GET request to /apis/mycompany.com/v1/myapps should return resources of this type.
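
The analogy can be sketched as a toy in-process “API server” that registers a route at runtime when a CRD appears. The ToyAPIServer class below is entirely my own invention for illustration; only the group/version/plural names come from the CRD example above:

```python
# A toy model of dynamic route registration: when a "CRD" is applied,
# a new endpoint /apis/<group>/<version>/<plural> starts serving requests.
class ToyAPIServer:
    def __init__(self):
        self.routes = {}   # path -> handler
        self.store = {}    # stand-in for etcd: path -> list of objects

    def register_crd(self, group: str, version: str, plural: str):
        path = f"/apis/{group}/{version}/{plural}"
        self.store[path] = []
        # The handler lists the objects stored under this path.
        self.routes[path] = lambda: {"kind": "List", "items": self.store[path]}

    def get(self, path: str):
        handler = self.routes.get(path)
        return handler() if handler else {"code": 404}

api = ToyAPIServer()
print(api.get("/apis/mycompany.com/v1/myapps"))   # {'code': 404} - no CRD yet
api.register_crd("mycompany.com", "v1", "myapps")
print(api.get("/apis/mycompany.com/v1/myapps"))   # {'kind': 'List', 'items': []}
```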

The corresponding API Service will contain a spec with the group and version:

$ kk get apiservice v1.mycompany.com -o yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  creationTimestamp: "2025-07-12T11:53:52Z"
  labels:
    kube-aggregator.kubernetes.io/automanaged: "true"
  name: v1.mycompany.com
  resourceVersion: "2632"
  uid: 26fc8c6b-6770-422f-8996-3f35d86be6c7
spec:
  group: mycompany.com
  groupPriorityMinimum: 1000
  version: v1
  versionPriority: 100
...

That is, when we create a new CRD, the Kubernetes API Server creates an API Service (writing it to /registry/apiregistration.k8s.io/apiservices/v1.mycompany.com) and adds it to its routing under the /apis endpoint.

And now, having an idea of what the API looks like and the database that stores all the resources, we can move on to creating the CRD and controller, that is, to actually write the Operator itself.

But this is already in the next part.