Quick Start
Get AIK8s Operator up and running on your Kubernetes cluster in minutes.
Prerequisites
Ensure you have the following before starting:
- Kubernetes cluster (v1.24+)
- kubectl configured to access your cluster
- Helm 3.x installed
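The prerequisites above can be checked from a terminal before installing. The following is a minimal sketch, not part of the operator: the helper names `version_at_least` and `check_tools` are hypothetical, and the script only inspects version strings you pass to it.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a version string such as "v1.27.3"
# is at least the given major and minor (for example: 1 24).
version_at_least() {
  ver=${1#v}                 # strip a leading "v"
  major=${ver%%.*}           # text before the first dot
  rest=${ver#*.}             # text after the first dot
  minor=${rest%%[!0-9]*}     # leading digits of the minor field (handles "24+")
  [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

# Hypothetical helper: report which of the required CLIs are missing from PATH.
check_tools() {
  missing=0
  for tool in kubectl helm; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `version_at_least "$(helm version --short)" 3 0` checks the Helm requirement, and the cluster requirement can be checked by passing the server's `gitVersion` (reported by `kubectl version -o json`) with `1 24`.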
Step 1: Install the Operator
Choose your installation method: Helm (recommended) or manifests
Option A: Using Helm (Recommended)
# Create namespace
kubectl create namespace aik8s-system
# Add your LLM API key as a secret
kubectl create secret generic openai-key \
  --from-literal=api-key='your-actual-openai-api-key' \
  -n aik8s-system
# Install the operator
helm install aik8s-operator deploy/helm-chart -n aik8s-system \
  --set llm.enabled=true \
  --set llm.provider=openai \
  --set llm.model=gpt-4
Option B: Using Manifests
# Apply CRDs
kubectl apply -f config/crd/bases/
# Apply RBAC
kubectl apply -f config/rbac/role.yaml
kubectl apply -f config/rbac/role_binding.yaml
kubectl apply -f config/rbac/service_account.yaml
# Apply examples
kubectl apply -f examples/00-namespace.yaml
kubectl apply -f examples/01-agentcontroller.yaml
Step 2: Verify Installation
Confirm the operator is running correctly
# Check operator is running
kubectl get pods -n aik8s-system
# Expected output:
# NAME READY STATUS RESTARTS AGE
# aik8s-operator-<hash>-manager- 1/1 Running 0 1m
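Instead of polling `kubectl get pods` by hand, you can block until the pod is ready. This is a sketch that assumes the operator pod carries the `app.kubernetes.io/name=aik8s-operator` label used by the log command in this guide; the function name `wait_for_operator` is hypothetical.

```shell
# Sketch: block until the operator pod reports Ready, or fail after a timeout.
# $1 = optional timeout (default 120s)
wait_for_operator() {
  kubectl wait pod \
    --for=condition=Ready \
    -l app.kubernetes.io/name=aik8s-operator \
    -n aik8s-system \
    --timeout="${1:-120s}"
}
```

Calling `wait_for_operator 60s` returns a non-zero exit status if the pod is not Ready within 60 seconds, which makes it convenient in install scripts.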
# Check operator logs
kubectl logs -n aik8s-system -l app.kubernetes.io/name=aik8s-operator -f
Step 3: Create an AgentController
Deploy your AI agent configuration
# Apply the example AgentController
kubectl apply -f examples/01-agentcontroller.yaml
# Check status
kubectl get agentcontroller -n aik8s-system
# Watch status changes
kubectl get agentcontroller -n aik8s-system -w
Step 4: Monitor Incidents
Watch for AI-detected incidents in real time
# List detected incidents
kubectl get incidents -n aik8s-system
# Get incident details
kubectl get incident <incident-name> -n aik8s-system -o yaml
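When many incidents accumulate, it can help to look only at the newest ones. The sketch below uses only standard kubectl flags and the `metadata.creationTimestamp` field that every Kubernetes object carries, so it makes no assumption about the Incident resource's own fields. The function name `recent_incidents` is hypothetical.

```shell
# Sketch: show the N most recently created incidents (default 5).
# Rows are sorted oldest to newest, so the last N rows are the newest.
recent_incidents() {
  kubectl get incidents -n aik8s-system \
    --sort-by=.metadata.creationTimestamp \
    --no-headers | tail -n "${1:-5}"
}
```

For example, `recent_incidents 3` prints the three newest incidents without the header row.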
# Watch for new incidents
kubectl get incidents -n aik8s-system -w
Step 5: Review Actions
Examine remediation actions taken by the AI
# List remediation actions
kubectl get actions -n aik8s-system
# Get action details
kubectl get action <action-name> -n aik8s-system -o yaml
Configuration Examples
Common AgentController configurations for different use cases
Minimum Configuration
Basic setup with essential features enabled
agentcontroller-minimal.yaml
apiVersion: ai.aik8s.io/v1alpha1
kind: AgentController
metadata:
  name: minimal-agent
  namespace: aik8s-system
spec:
  enableKnowledgeGraph: true
  llm:
    provider: openai
    model: gpt-4
    apiKeySecret:
      name: openai-key
      namespace: aik8s-system
Full Configuration with Auto-Remediation
Complete setup with predictive engine, auto-remediation, and approval workflows
agentcontroller-full.yaml
apiVersion: ai.aik8s.io/v1alpha1
kind: AgentController
metadata:
  name: full-featured-agent
  namespace: aik8s-system
spec:
  enablePredictiveEngine: true
  enableAutoRemediation: true
  enableKnowledgeGraph: true
  llm:
    provider: anthropic
    model: claude-3-sonnet-20240229
    maxTokens: 4000
    apiKeySecret:
      name: anthropic-key
      namespace: aik8s-system
      key: api-key
  autoRemediation:
    enableTier1: true
    enableTier2: true
    approval:
      platform: slack
      channel: "#k8s-incidents"
      timeout: 10m
    rollbackTimeout: 15m
  observability:
    prometheusUrl: http://prometheus-server.monitoring.svc:9090
    lokiUrl: http://loki.monitoring.svc:3100
    metricsInterval: 15s
Using Local Ollama
Self-hosted LLM setup with local inference
agentcontroller-ollama.yaml
apiVersion: ai.aik8s.io/v1alpha1
kind: AgentController
metadata:
  name: ollama-agent
  namespace: aik8s-system
spec:
  enablePredictiveEngine: true
  llm:
    provider: ollama
    model: llama2
    maxTokens: 2000
    endpoint: http://ollama.ollama.svc.cluster.local:11434
Troubleshooting
Common issues and their solutions
Operator Pod Not Starting
If the operator pod fails to start or crashloops, check pod status and logs
# Check pod status
kubectl describe pod -n aik8s-system -l app.kubernetes.io/name=aik8s-operator
# Check logs for errors
kubectl logs -n aik8s-system -l app.kubernetes.io/name=aik8s-operator --previous
CRD Not Applied
If CRDs are not registered, manually apply them
# Check CRDs
kubectl get crd | grep ai.aik8s.io
# Manually apply CRDs if needed
kubectl apply -f config/crd/bases/
Permission Denied Errors
Verify RBAC roles and service account are correctly configured
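# Verify RBAC is applied (ClusterRoles and ClusterRoleBindings are cluster-scoped, so -n does not apply)
# The grep filter assumes the role names contain "aik8s"
kubectl get clusterrole | grep aik8s
kubectl get clusterrolebinding | grep aik8s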
# Check service account
kubectl get sa -n aik8s-system
LLM Connection Issues
Check environment variables, secrets, and API connectivity
# Check environment variables
kubectl exec -n aik8s-system deployment/aik8s-operator -- env | grep LLM
# Verify secret exists
kubectl get secret openai-key -n aik8s-system
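A secret can exist and still be empty or use a different key name. The sketch below checks that a value is stored under the `api-key` key (the key name used when the secret was created in Step 1) and prints only its length, so the key itself never appears in the terminal. The function name `secret_key_length` is hypothetical.

```shell
# Sketch: print the byte length of the value stored under a key in a secret.
# $1 = secret name, $2 = key inside the secret, $3 = namespace
secret_key_length() {
  kubectl get secret "$1" -n "$3" -o "jsonpath={.data.$2}" \
    | base64 --decode \
    | wc -c
}
```

For example, `secret_key_length openai-key api-key aik8s-system` should print a number greater than zero; a result of 0 suggests the key name is wrong or the value is empty.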
# Test connectivity
kubectl exec -n aik8s-system deployment/aik8s-operator -- curl -v https://api.openai.com/v1/models
Next Steps
Explore more advanced configurations and features
Uninstall
Remove AIK8s Operator from your cluster
# Remove AgentController
kubectl delete agentcontroller -n aik8s-system --all
# Remove operator
helm uninstall aik8s-operator -n aik8s-system
# Remove CRDs
kubectl delete -f config/crd/bases/
# Remove namespace (optional)
kubectl delete namespace aik8s-system
Support
- Documentation: https://github.com/aik8s/aik8s-operator
- Issues: https://github.com/aik8s/aik8s-operator/issues
- Discussions: https://github.com/aik8s/aik8s-operator/discussions