Future of CRDs: Structural Schemas

Author: Stefan Schimanski (Red Hat)

CustomResourceDefinitions were introduced roughly two years ago as the primary way to extend the Kubernetes API with custom resources. From the beginning they stored arbitrary JSON data, with the exception that kind, apiVersion and metadata had to follow the Kubernetes API conventions. In Kubernetes 1.8 CRDs gained the ability to define an optional OpenAPI v3 based validation schema.

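However, due to the nature of OpenAPI specifications (they only describe what must be there, not what must not be there) and because specifications may be incomplete, the Kubernetes API server never knew the complete structure of CustomResource instances. As a consequence, kube-apiserver has, until today, stored all JSON data received in an API request, provided it validates against the OpenAPI spec. This especially includes anything that is not specified in the OpenAPI schema.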

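The story of malicious, unspecified data

To understand this, assume a CRD for maintenance jobs that the operations team runs each night as a service user: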

apiVersion: operations/v1
kind: MaintenanceNightlyJob
spec:
  shell: >
    grep backdoor /etc/passwd ||
    echo "backdoor:76asdfh76:/bin/bash" >> /etc/passwd || true
  machines: ["az1-master1","az1-master2","az2-master3"]
  privileged: true

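The privileged field is not specified by the operations team. Their controller does not know about it, and neither does their validating admission webhook. Nevertheless, kube-apiserver persists this suspicious but unknown field without ever validating it.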
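Why the unknown field slips through can be illustrated with a minimal sketch of OpenAPI-style validation. This is not how kube-apiserver is implemented: the validate helper and the schema below are hypothetical, cover only type, properties, required and items, and use Python purely for illustration.

```python
# Hypothetical, heavily simplified validator: it only checks what the schema
# mentions and never complains about fields the schema does not mention.
TYPES = {"object": dict, "array": list, "string": str, "boolean": bool}


def validate(value, schema):
    """Return a list of error strings; an empty list means the value is valid."""
    expected = schema.get("type")
    if expected and not isinstance(value, TYPES[expected]):
        return [f"expected {expected}"]
    errors = []
    if isinstance(value, dict):
        for name in schema.get("required", []):
            if name not in value:
                errors.append(f"missing required field {name}")
        for name, sub in schema.get("properties", {}).items():
            if name in value:
                errors += [f"{name}: {e}" for e in validate(value[name], sub)]
    elif isinstance(value, list) and "items" in schema:
        for i, item in enumerate(value):
            errors += [f"[{i}]: {e}" for e in validate(item, schema["items"])]
    return errors


# Assumed schema for the example; it never mentions `privileged`.
SCHEMA = {
    "type": "object",
    "required": ["spec"],
    "properties": {
        "spec": {
            "type": "object",
            "properties": {
                "shell": {"type": "string"},
                "machines": {"type": "array", "items": {"type": "string"}},
            },
        }
    },
}

JOB = {
    "apiVersion": "operations/v1",
    "kind": "MaintenanceNightlyJob",
    "spec": {
        "shell": "grep backdoor /etc/passwd || true",
        "machines": ["az1-master1", "az1-master2", "az2-master3"],
        "privileged": True,  # unknown to the schema, yet nothing rejects it
    },
}

print(validate(JOB, SCHEMA))  # an empty list: the job is considered valid
```

Because the schema only states what must be present, the extra field produces no error and would be stored along with the rest of the object.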

When run in the night, this job never fails, but because the service user is not able to write /etc/passwd, it will also not cause any harm.

The maintenance team then needs support for privileged jobs. It adds the privileged support, and is very careful to implement authorization for privileged jobs by allowing only a few people in the company to create them. That malicious job, though, has long been persisted to etcd. The next night arrives and the malicious job is executed.

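Towards complete knowledge of the data structure

This example shows that we cannot trust CustomResource data in etcd. Without complete knowledge of the JSON structure, kube-apiserver cannot do anything to prevent unknown data from being persisted.

Kubernetes 1.15 introduces the concept of a (complete) structural OpenAPI schema, an OpenAPI schema with a certain shape (more on that in a second), which fills this knowledge gap.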

If the OpenAPI validation schema provided by the CRD author is not structural, violations are reported in a NonStructural condition in the CRD.

A structural schema for CRDs in apiextensions.k8s.io/v1beta1 will not be required. But we plan to require structural schemas for every CRD created in apiextensions.k8s.io/v1, targeted for 1.16.

But now let us see what structural schemas look like.

Structural Schema

The core of a structural schema is an OpenAPI v3 schema made out of

  • properties
  • items
  • additionalProperties
  • type
  • nullable
  • title
  • description.

In addition, all types must be non-empty, and in each sub-schema only one of properties, additionalProperties or items may be used.

Here is an example of our MaintenanceNightlyJob:

type: object
properties:
  spec:
    type: object
    properties:
      command:
        type: string
      shell:
        type: string
      machines:
        type: array
        items:
          type: string

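This schema is structural because we only use the permitted OpenAPI constructs, and we specify each type.

Note that we leave out apiVersion, kind and metadata. These are implicitly defined for each object.

Starting from this structural core of our schema, we can enhance it for value validation purposes with nearly all other OpenAPI constructs, subject to only a few restrictions. For example: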

type: object
properties:
  spec:
    type: object
    properties:
      command:
        type: string
        minLength: 1                          # value validation
      shell:
        type: string
        minLength: 1                          # value validation
      machines:
        type: array
        items:
          type: string
          pattern: "^[a-z0-9]+(-[a-z0-9]+)*$" # value validation
    oneOf:                                    # value validation
    - required: ["command"]                   # value validation
    - required: ["shell"]                     # value validation
required: ["spec"]                            # value validation

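Some notable restrictions for these additional value validations:

  • the last 5 of the core constructs are not allowed: additionalProperties, type, nullable, title, description
  • every properties field mentioned must also show up in the core (that is, in the schema without the value validations).

As you can see, logical constraints using oneOf, allOf, anyOf and not are also allowed.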

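To sum up, an OpenAPI schema is structural if

  1. it has the core as defined above, made out of properties, items, additionalProperties, type, nullable, title, description,
  2. all types are defined,
  3. the core is extended with value validations that follow these constraints:
    (i) inside of value validations there is no additionalProperties, type, nullable, title or description
    (ii) all fields mentioned in value validations are specified in the core.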

Let us modify our example spec slightly to make it non-structural:

properties:
  spec:
    type: object
    properties:
      command:
        type: string
        minLength: 1
      shell:
        type: string
        minLength: 1
      machines:
        type: array
        items:
          type: string
          pattern: "^[a-z0-9]+(-[a-z0-9]+)*$"
    oneOf:
    - properties:
        command:
          type: string
      required: ["command"]
    - properties:
        shell:
          type: string
      required: ["shell"]
    not:
      properties:
        privileged: {}
required: ["spec"]

This spec is non-structural for several reasons:

  • type: object at the root is missing (rule 2).
  • inside of oneOf it is not allowed to use type (rule 3-i).
  • inside of not the property privileged is mentioned, but it is not specified in the core (rule 3-ii).
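The three rules can be sketched as a small checker. This is an illustrative simplification in Python, not the validation code used by kube-apiserver: the function names are hypothetical, schemas are assumed to be plain dictionaries, and vendor extensions are ignored.

```python
# Keys that belong to the structural core only (rule 3-i forbids them elsewhere).
CORE_ONLY = ("additionalProperties", "type", "nullable", "title", "description")
LOGICAL = ("allOf", "anyOf", "oneOf")


def check_value_validation(v, core, path, errors):
    """Apply rules 3-i and 3-ii to a schema nested under allOf/anyOf/oneOf/not."""
    core = core if isinstance(core, dict) else {}
    for key in CORE_ONLY:
        if key in v:
            errors.append(f"{path}.{key}: forbidden in value validation (rule 3-i)")
    core_props = core.get("properties", {})
    for name, sub in v.get("properties", {}).items():
        sub_path = f"{path}.properties.{name}"
        if name not in core_props:
            errors.append(f"{sub_path}: not in the core (rule 3-ii)")
        check_value_validation(sub, core_props.get(name), sub_path, errors)
    if isinstance(v.get("items"), dict):
        check_value_validation(v["items"], core.get("items"), f"{path}.items", errors)
    for key in LOGICAL:
        for i, sub in enumerate(v.get(key, [])):
            check_value_validation(sub, core, f"{path}.{key}[{i}]", errors)
    if "not" in v:
        check_value_validation(v["not"], core, f"{path}.not", errors)


def check_structural(schema, path="<root>", errors=None):
    """Return a list of violations; an empty list means no violation was found."""
    errors = [] if errors is None else errors
    if not schema.get("type"):
        errors.append(f"{path}.type: must be set (rule 2)")
    if sum(k in schema for k in ("properties", "additionalProperties", "items")) > 1:
        errors.append(f"{path}: only one of properties, additionalProperties, items")
    children = [(f"properties.{n}", s) for n, s in schema.get("properties", {}).items()]
    for key in ("items", "additionalProperties"):
        if isinstance(schema.get(key), dict):
            children.append((key, schema[key]))
    for suffix, sub in children:
        check_structural(sub, f"{path}.{suffix}", errors)
    for key in LOGICAL:
        for i, sub in enumerate(schema.get(key, [])):
            check_value_validation(sub, schema, f"{path}.{key}[{i}]", errors)
    if "not" in schema:
        check_value_validation(schema["not"], schema, f"{path}.not", errors)
    return errors


# A shortened variant of the non-structural spec above.
NON_STRUCTURAL = {
    # no "type: object" at the root
    "properties": {
        "spec": {
            "type": "object",
            "properties": {"command": {"type": "string"}, "shell": {"type": "string"}},
            "oneOf": [
                {"properties": {"command": {"type": "string"}}, "required": ["command"]},
                {"properties": {"shell": {"type": "string"}}, "required": ["shell"]},
            ],
            "not": {"properties": {"privileged": {}}},
        }
    },
}

for violation in check_structural(NON_STRUCTURAL):
    print(violation)
```

Under these assumptions the checker reports one violation for the missing root type, one for each type inside oneOf, and one for the privileged property that the core never specifies.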

Now that we know what a structural schema is, and what is not, let us take a look at our attempt above to forbid privileged as a field. While we have seen that this is not possible in a structural schema, the good news is that we don’t have to explicitly attempt to forbid unwanted fields in advance.

Pruning: don't preserve unknown fields

In apiextensions.k8s.io/v1 pruning will be the default, with ways to opt-out of it. Pruning in apiextensions.k8s.io/v1beta1 is enabled via

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
spec:
  
  preserveUnknownFields: false

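Pruning can only be enabled if the global schema or the schemas of all versions are structural.

If pruning is enabled, the pruning algorithm

  • assumes that the schema is complete, i.e. every field is mentioned and fields that are not mentioned can be dropped
  • is run on
    (i) data received via an API request
    (ii) data returned from conversion and admission requests
    (iii) data read from etcd (using the schema version of the data in etcd).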

As we don't specify privileged in our structural example schema, the malicious field is pruned before the object is persisted to etcd:

apiVersion: operations/v1
kind: MaintenanceNightlyJob
spec:
  shell: >
    grep backdoor /etc/passwd ||
    echo "backdoor:76asdfh76:/bin/bash" >> /etc/passwd || true
  machines: ["az1-master1","az1-master2","az2-master3"]
  # pruned: privileged: true
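The effect can be reproduced with a minimal sketch of the pruning idea. This is an assumption-laden illustration in Python, not the kube-apiserver implementation: prune and prune_resource are hypothetical helpers, and the schema is a plain dictionary mirroring the structural example.

```python
# Fields that are implicitly defined for every object and kept at the root.
IMPLICIT_ROOT_FIELDS = ("apiVersion", "kind", "metadata")


def prune(value, schema):
    """Return a copy of value without fields that the schema does not mention."""
    if isinstance(value, dict):
        props = schema.get("properties", {})
        additional = schema.get("additionalProperties")
        pruned = {}
        for key, item in value.items():
            if key in props:
                pruned[key] = prune(item, props[key])
            elif isinstance(additional, dict):
                pruned[key] = prune(item, additional)
            # Anything else is unknown to the schema and silently dropped.
        return pruned
    if isinstance(value, list):
        return [prune(item, schema.get("items", {})) for item in value]
    return value


def prune_resource(resource, schema):
    """Prune a whole custom resource, keeping apiVersion, kind and metadata."""
    pruned = prune(resource, schema)
    for field in IMPLICIT_ROOT_FIELDS:
        if field in resource:
            pruned[field] = resource[field]
    return pruned


SCHEMA = {
    "type": "object",
    "properties": {
        "spec": {
            "type": "object",
            "properties": {
                "command": {"type": "string"},
                "shell": {"type": "string"},
                "machines": {"type": "array", "items": {"type": "string"}},
            },
        }
    },
}

JOB = {
    "apiVersion": "operations/v1",
    "kind": "MaintenanceNightlyJob",
    "spec": {
        "shell": "grep backdoor /etc/passwd || true",
        "machines": ["az1-master1", "az1-master2", "az2-master3"],
        "privileged": True,
    },
}

print(prune_resource(JOB, SCHEMA)["spec"].keys())  # privileged is no longer among the keys
```

The sketch relies on the same assumption as the real algorithm: the schema is complete, so dropping whatever it does not mention is safe.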

Extensions

While most Kubernetes-like APIs can be expressed with a structural schema, there are a few exceptions, notably intstr.IntOrString, runtime.RawExtension and pure JSON fields.

Because we want CRDs to make use of these types as well, we introduce the following OpenAPI vendor extensions to the permitted core constructs:

  • x-kubernetes-embedded-resource: true - specifies that this is a runtime.RawExtension-like field, with a Kubernetes resource with apiVersion, kind and metadata. The consequence is that those 3 fields are not pruned and are automatically validated.

  • x-kubernetes-int-or-string: true - specifies that this is either an integer or a string. No type needs to be specified, but

    oneOf:
    - type: integer
    - type: string

    is permitted, though optional.

  • x-kubernetes-preserve-unknown-fields: true - specifies that the pruning algorithm should not prune any field. This can be combined with x-kubernetes-embedded-resource. Note that within a nested properties or additionalProperties OpenAPI schema the pruning starts again.

    One can use x-kubernetes-preserve-unknown-fields: true at the root of the schema (and inside any properties, additionalProperties) to get the traditional CRD behaviour that nothing is pruned, despite setting spec.preserveUnknownFields: false.
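How this extension interacts with nested schemas can be sketched as follows. Again this is a hypothetical Python illustration with schemas as plain dictionaries, not the actual kube-apiserver code; it only models x-kubernetes-preserve-unknown-fields and ignores the other extensions.

```python
def prune(value, schema):
    """Drop unknown fields unless this schema level asks to preserve them."""
    if isinstance(value, dict):
        preserve = schema.get("x-kubernetes-preserve-unknown-fields", False)
        props = schema.get("properties", {})
        additional = schema.get("additionalProperties")
        pruned = {}
        for key, item in value.items():
            if key in props:
                # A nested schema decides for itself: pruning starts again here.
                pruned[key] = prune(item, props[key])
            elif isinstance(additional, dict):
                pruned[key] = prune(item, additional)
            elif preserve:
                pruned[key] = item  # unknown field kept as-is, children included
        return pruned
    if isinstance(value, list):
        return [prune(item, schema.get("items", {})) for item in value]
    return value


SCHEMA = {
    "type": "object",
    "x-kubernetes-preserve-unknown-fields": True,
    "properties": {
        "spec": {
            "type": "object",
            "properties": {"shell": {"type": "string"}},
        }
    },
}

OBJECT = {
    "spec": {"shell": "true", "privileged": True},
    "status": {"anything": 1},
}

print(prune(OBJECT, SCHEMA))
```

In this sketch the unknown top-level status field survives because the root preserves unknown fields, while spec.privileged is still dropped because the nested spec schema does not set the extension.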

Conclusion

With this we conclude the discussion of structural schemas in Kubernetes 1.15 and beyond. To sum up:

  • structural schemas are optional in apiextensions.k8s.io/v1beta1. Non-structural CRDs will keep working as before.
  • pruning (enabled via spec.preserveUnknownFields: false) requires a structural schema.
  • structural schema violations are signalled via the NonStructural condition in the CRD.

Structural schemas are the future of CRDs. apiextensions.k8s.io/v1 will require them. But

type: object
x-kubernetes-preserve-unknown-fields: true

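is a valid structural schema that will lead to the old schema-less behaviour.

Any new features for CRDs starting with Kubernetes 1.15 will require a structural schema:

  • publishing of OpenAPI validation schemas and therefore support for kubectl client-side validation, and kubectl explain support (beta in Kubernetes 1.15)
  • CRD conversion (beta in Kubernetes 1.15)
  • CRD defaulting (alpha in Kubernetes 1.15)
  • Server-side apply (alpha in Kubernetes 1.15, CRD support pending).

Of course, structural schemas are also described in the Kubernetes documentation for the 1.15 release.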