PIKA and NotifyMaintenance
Controller and API for coordinating Maintenance operations in Kubernetes.
Our Approach
How each End User drains workloads and conducts Maintenance is unique; however, every Maintenance follows the same steps:
- Schedule a Drain
- Start a Drain
- Complete a Drain
- Start a Maintenance
- Complete a Maintenance or Fail a Maintenance
- Start a Validation Job
- Pass Validation or Fail Validation
- Return to Production
We created the NotifyMaintenance API to track these common states.
NotifyMaintenance API
The NotifyMaintenance API is a Kubernetes CRD that provides a description of a Maintenance and the state of that Maintenance. Using well-known states, controllers and operators can coordinate with one another to run complex drain and upgrade operations.
Maintenance Status | Event | Description |
---|---|---|
MaintenanceUnknown | NotifyMaintenance Creation | A NotifyMaintenance has been created, but it's yet to be reconciled |
MaintenanceScheduled | Maintenance was scheduled, ready to start | A NotifyMaintenance has been reconciled by PIKA |
MaintenanceStarted | Node was cordoned and labeled | Notify watchers that a Node is actively draining |
SLAExpired | Workload drain SLA has been met | Notify watchers that any objects remaining may be forcefully removed at any time |
ObjectsDrained | Node is drained | Notify watchers that a Node is drained and maintenance can start on the Node |
Validating | Test that a Node is ready for Production | Notify watchers that a Node is being validated as ready for production |
MaintenanceIncomplete | Maintenance wasn't completed | A maintenance wasn't completed for any reason. The Node is not ready for production |
*MaintenanceEnded | A request was made to delete a NotifyMaintenance | A NotifyMaintenance object has a deletionTimestamp |
*MaintenanceComplete | A Node was uncordoned and the NotifyMaintenance removed | A NotifyMaintenance object was removed and a Node uncordoned. Node is returned to production |
*Implied state on the NotifyMaintenance CRD
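As a rough illustration of how a watcher might model the statuses in the table, the sketch below declares them as Go constants and derives the implied MaintenanceEnded state from a deletionTimestamp. The type name `MaintenanceStatus`, the constant identifiers, and the helpers `IsImplied` and `EffectiveStatus` are assumptions made for this example; they are not taken from the api/v1alpha1 package.

```go
package main

import "fmt"

// MaintenanceStatus mirrors the status names in the table above.
// The type and identifiers are hypothetical, not the real API definitions.
type MaintenanceStatus string

const (
	MaintenanceUnknown    MaintenanceStatus = "MaintenanceUnknown"
	MaintenanceScheduled  MaintenanceStatus = "MaintenanceScheduled"
	MaintenanceStarted    MaintenanceStatus = "MaintenanceStarted"
	SLAExpired            MaintenanceStatus = "SLAExpired"
	ObjectsDrained        MaintenanceStatus = "ObjectsDrained"
	Validating            MaintenanceStatus = "Validating"
	MaintenanceIncomplete MaintenanceStatus = "MaintenanceIncomplete"
	MaintenanceEnded      MaintenanceStatus = "MaintenanceEnded"
	MaintenanceComplete   MaintenanceStatus = "MaintenanceComplete"
)

// IsImplied reports whether a status is one of the starred rows,
// i.e. a state inferred from the object rather than stored on it.
func IsImplied(s MaintenanceStatus) bool {
	return s == MaintenanceEnded || s == MaintenanceComplete
}

// EffectiveStatus returns the status a watcher should act on.
// An object with a deletionTimestamp is treated as MaintenanceEnded,
// and an empty stored status as MaintenanceUnknown (not yet reconciled).
// MaintenanceComplete is not handled here: by then the object no longer
// exists, so there is nothing to inspect.
func EffectiveStatus(stored MaintenanceStatus, hasDeletionTimestamp bool) MaintenanceStatus {
	if hasDeletionTimestamp {
		return MaintenanceEnded
	}
	if stored == "" {
		return MaintenanceUnknown
	}
	return stored
}

func main() {
	fmt.Println(EffectiveStatus(ObjectsDrained, false))
	fmt.Println(EffectiveStatus(ObjectsDrained, true))
	fmt.Println(IsImplied(MaintenanceComplete))
}
```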
How to use NotifyMaintenance
- NotifyMaintenance has 3 entry points for End User Maintenance
  - Drain
  - Do Maintenance
  - Validate

  Admins should write operators to fulfill these roles. The operators will watch for NotifyMaintenance transitions, then act on each state.
- NotifyMaintenance is designed to work with a "scheduler"

  Creating a NotifyMaintenance CR is not enough to start maintenance on a Node. Admins need to create an operator that tells PIKA to transition NotifyMaintenance CRs through each state based on the criteria for the state. For example, the "scheduler" operator will tell PIKA to transition a NotifyMaintenance CR to ObjectsDrained state if it's safe to start an upgrade.
At this point, you may be asking yourself: why do I have to write these operators? The answer is, we don't want to tell you how to do your Maintenance. Instead, we want to provide one less API you have to build and maintain.
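The "watch for transitions, then act on each state" pattern described in this section might look roughly like the sketch below. It is deliberately independent of any Kubernetes client library: `Dispatcher`, `Handler`, `On`, and `Observe` are hypothetical names, and a real operator would feed `Observe` from a watch or informer on NotifyMaintenance objects. The mapping of the three entry points to MaintenanceStarted, ObjectsDrained, and Validating follows the status table, but which states your operators react to is your choice.

```go
package main

import "fmt"

// Handler is a hypothetical callback run when a NotifyMaintenance
// enters a given status for a Node.
type Handler func(node string) error

// Dispatcher remembers the last status handled per NotifyMaintenance so
// that a handler fires once per transition, not on every resync.
type Dispatcher struct {
	handlers map[string]Handler
	last     map[string]string
}

func NewDispatcher() *Dispatcher {
	return &Dispatcher{handlers: map[string]Handler{}, last: map[string]string{}}
}

// On registers the handler for a status name.
func (d *Dispatcher) On(status string, h Handler) { d.handlers[status] = h }

// Observe is called with the current status of a NotifyMaintenance.
// It reports whether a handler ran. A failed handler is not recorded,
// so a later Observe with the same status retries it.
func (d *Dispatcher) Observe(name, node, status string) (bool, error) {
	if d.last[name] == status {
		return false, nil // no transition since the last call
	}
	h, ok := d.handlers[status]
	if !ok {
		d.last[name] = status
		return false, nil
	}
	if err := h(node); err != nil {
		return true, err
	}
	d.last[name] = status
	return true, nil
}

func main() {
	d := NewDispatcher()
	d.On("MaintenanceStarted", func(n string) error { fmt.Println("drain", n); return nil })
	d.On("ObjectsDrained", func(n string) error { fmt.Println("do maintenance on", n); return nil })
	d.On("Validating", func(n string) error { fmt.Println("validate", n); return nil })

	seen := []string{"MaintenanceScheduled", "MaintenanceStarted", "MaintenanceStarted", "ObjectsDrained", "Validating"}
	for _, s := range seen {
		if _, err := d.Observe("nm-1", "node-a", s); err != nil {
			fmt.Println("handler failed:", err)
		}
	}
}
```

Keeping the dispatch logic separate from the handlers lets each entry point stay a small, replaceable piece, which matches the idea that how you drain, maintain, and validate is up to you.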
How NVIDIA uses NotifyMaintenance
NVIDIA's workloads on Kubernetes are mostly made up of Cloud Gaming and AI. What these workloads have in common is that they want to access GPU resources for a period of time.
Our perspective is that if we cordon a Node, it will eventually drain on its own once enough time passes. This solution works, but it's incredibly inefficient. Instead, we want to stack the odds in our favor and target Nodes that are likely to drain quickly. The "scheduler" operator we built implements this solution.
For the 3 entry points, we mostly use kubectl drain to drain Nodes, Ansible powered by Jenkins Jobs for doing maintenance, and Go test suites to Validate a Node for Production.
Finite-state Diagram
Getting Started
A released image in nvcr.io is a work in progress. In the meantime, you can build the container yourself.
make build-image
Deploy with Kustomize.
make deploy
Directories
Path | Synopsis |
---|---|
api | |
api/v1alpha1 | Package v1alpha1 contains API Schema definitions for the ngn2 v1alpha1 API group +kubebuilder:object:generate=true +groupName=ngn2.nvidia.com |
cmd | |
pkg | |
pkg/notify | Package notify is a generated GoMock package. |
pkg/notify/sns | Package sns is a generated GoMock package. |