wao-core

module
v1.27.0-alpha.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Oct 26, 2023 License: Apache-2.0

README

wao-core

CRDs and libraries for WAO.

Description

This repository contains CRDs and libraries for WAO. They are intended to be used with wao-metrics-adapter, wao-scheduler, etc.

Getting Started

Installation

Install CRDs and controllers.

kubectl apply -f https://github.com/waok8s/wao-core/releases/download/v1.27.0/wao-core.yaml

Wait for the pod to be ready.

kubectl wait pod $(kubectl get pods -n wao-system -l control-plane=controller-manager -o jsonpath="{.items[0].metadata.name}") -n wao-system --for condition=Ready
Prerequisites for NodeConfig[Template]

Before using NodeConfig[Template], you need to do the following.

  • Make sure that your nodes have Redfish API enabled and inlet temperature sensors are available. The client implementation can be found in pkg/metrics/inlettemp/redfish.go.
  • Make sure that you have a differential pressure API server running. The client implementation can be found in pkg/metrics/deltap/dpapi.go.
  • Make sure that you have a power consumption predictor running. The client implementation can be found in pkg/predictor/v2inferenceprotocol/powerconsumption.go.
  • (Optional) Make sure that your nodes have /redfish/v1/Systems/{systemId}/MachineLearningModel Redfish property. This property is currently not supported in most Redfish implementations, but you can use MLMM to provide it. The client implementation can be found in pkg/predictor/endpointprovider/redfish.go.

Then, you can configure your nodes with NodeConfig[Template]. See Configuration for details.

Configuration

NodeConfig CRD

NodeConfig CRD is used to configure a node for other WAO components. It provides the following information:

  • Metrics collector: how to collect metrics from the node.
  • Predictor: how to predict power consumption of the node.

Here is an example.

[!IMPORTANT] Currently, only "wao-system" namespace is supported for NodeConfig, NodeConfigTemplate and related Secrets. (This is due to RBAC.)

apiVersion: wao.bitmedia.co.jp/v1beta1
kind: NodeConfig
metadata:
  name: worker-0
  namespace: wao-system
spec:
  nodeName: worker-0
  metricsCollector:
    inletTemp:
      type: Redfish
      endpoint: "https://10.0.0.100"
      basicAuthSecret:
        name: "worker-0-redfish-basicauth"
      fetchInterval: 10s
    deltaP:
      type: DifferentialPressureAPI
      endpoint: "http://10.0.0.1:5000"
      fetchInterval: 10s
  predictor:
    powerConsumption:
      type: V2InferenceProtocol
      endpoint: "http://10.0.0.1:8080/v2/models/myModel/versions/v0.1.0/infer"

The above example uses Redfish and DifferentialPressureAPI to collect inlet temperature and differential pressure, and uses V2InferenceProtocol to predict power consumption.

Metrics Collector: Inlet Temperature

This part of the spec is used to configure how to collect inlet temperature.

  • type: Redfish or Fake.
  • endpoint: Endpoint URL. Ignored when type is Fake.
  • basicAuthSecret (Optional): Secret containing username and password for basic authentication. Ignored when the type does not require authentication.
  • fetchInterval (Optional): Interval to fetch metrics. Default is 15s.
    inletTemp:
      type: Redfish
      endpoint: "https://10.0.0.100"
      basicAuthSecret:
        name: "worker-0-redfish-basicauth"
      fetchInterval: 10s
Metrics Collector: Differential Pressure

This part of the spec is used to configure how to collect differential pressure.

  • type: DifferentialPressureAPI or Fake.
  • endpoint: Endpoint URL. Ignored when type is Fake.
  • basicAuthSecret (Optional): Secret containing username and password for basic authentication. Ignored when the type does not require authentication.
  • fetchInterval (Optional): Interval to fetch metrics. Default is 15s.
    deltaP:
      type: DifferentialPressureAPI
      endpoint: "http://10.0.0.1:5000"
      fetchInterval: 10s
Predictor: Power Consumption

This part of the spec is used to configure how to predict power consumption.

  • type: V2InferenceProtocol or Fake.
  • endpoint: Endpoint URL. Ignored when type is Fake.
  • basicAuthSecret (Optional): Secret containing username and password for basic authentication. Ignored when the type does not require authentication.
  • fetchInterval (Unused): Ignored.
    powerConsumption:
      type: V2InferenceProtocol
      endpoint: "http://10.0.0.1:8080/v2/models/myModel/versions/v0.1.0/infer"
Predictor: Power Consumption Endpoint Provider

This part of the spec is used to configure how to get endpoint for power consumption predictor. This is useful when the endpoint is described in Redfish or other APIs.

  • type: Redfish or Fake.
  • endpoint: Endpoint URL. Ignored when type is Fake.
  • basicAuthSecret (Optional): Secret containing username and password for basic authentication. Ignored when the type does not require authentication.
  • fetchInterval (Unused): Ignored.
    powerConsumptionEndpointProvider:
      type: Redfish
      endpoint: "https://10.0.0.1"
      basicAuthSecret:
        name: "worker-0-redfish-basicauth"

If your predictor requires authentication, you can set your Secret in powerConsumption.basicAuthSecret while leaving other fields empty.

    powerConsumption:
      type: ""      # Endpoint provider will set this.
      endpoint: ""  # Endpoint provider will set this.
      basicAuthSecret:
        name: "predictor-basicauth"
    powerConsumptionEndpointProvider:
      type: Redfish
      endpoint: "https://10.0.0.1"
      basicAuthSecret:
        name: "worker-0-redfish-basicauth"
NodeConfigTemplate CRD

NodeConfigTemplate CRD is used to configure a group of nodes by selecting nodes with labels. The controller will create NodeConfig for each node.

Here is an example.

apiVersion: wao.bitmedia.co.jp/v1beta1
kind: NodeConfigTemplate
metadata:
  name: redfish-enabled-nodes
  namespace: wao-system
spec:
  nodeSelector:
    matchLabels:
      node.kubernetes.io/instance-type: "redfish-enabled"
  metricsCollector:
    inletTemp:
      type: Redfish
      endpoint: "https://10.0.0.100"
      basicAuthSecret:
        name: "worker-0-redfish-basicauth"
      fetchInterval: 10s
    deltaP:
      type: DifferentialPressureAPI
      endpoint: "http://10.0.0.1:5000"
      fetchInterval: 10s
  predictor:
    powerConsumptionEndpointProvider:
      type: Redfish
      endpoint: "https://10.0.0.1"
      basicAuthSecret:
        name: "worker-0-redfish-basicauth"

After applying the above NodeConfigTemplate, you can see NodeConfig for each node.

$ kubectl get nodeconfig -n wao-system

NAME                              AGE
redfish-enabled-nodes-worker-0    10s
redfish-enabled-nodes-worker-1    10s
redfish-enabled-nodes-worker-2    10s

Development

This project uses Kubebuilder (v3.11) to generate the CRDs and controllers. However, codes under pkg (contain libraries) do not follow Kubebuilder conventions.

Components
  • api/wao: CRDs.
  • internal/controller: Controllers.
  • pkg/controller: Controllers not run in the controller manager.
  • pkg/metrics: Custom metrics library.
  • pkg/predictor: Predictor library.
  • pkg/client: Cached clients for metrics and predictors.

Changelog

Versioning: we use the same major.minor as Kubernetes, and the patch is our own.

  • What comes next:
    • TBD
  • 2023-xx-xx v1.27.0
    • First release.
    • CRDs, controllers and libraries.

License

Copyright 2023 Bitmedia, Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Directories

Path Synopsis
api
internal
pkg

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL
JackTT - Gopher 🇻🇳