Quick Start

Get AIK8s Operator up and running on your Kubernetes cluster in minutes.

Prerequisites
Make sure the following are in place before you start:
  • Kubernetes cluster (v1.24+)
  • kubectl configured to access your cluster
  • Helm 3.x installed
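
A quick preflight check can confirm these prerequisites before Step 1. The sketch below assumes a POSIX shell; `check_tools` is a hypothetical helper and not part of AIK8s Operator.

```shell
# Hypothetical helper (not part of AIK8s Operator): report whether each
# required CLI is available on PATH. Returns non-zero if any is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_tools kubectl helm`, followed by `kubectl version` and `helm version --short`, shows the client, cluster, and Helm versions to compare against the list above.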
Step 1: Install the Operator
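Choose an installation method: Helm (recommended) or plain manifests. Both options use relative paths (deploy/helm-chart, config/, examples/), so run the commands from a directory that contains them, typically the root of the operator repository.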

Option A: Using Helm (Recommended)

# Create namespace
kubectl create namespace aik8s-system

# Add your LLM API key as a secret
kubectl create secret generic openai-key \
  --from-literal=api-key='your-actual-openai-api-key' \
  -n aik8s-system

# Install the operator
helm install aik8s-operator deploy/helm-chart -n aik8s-system \
  --set llm.enabled=true \
  --set llm.provider=openai \
  --set llm.model=gpt-4

Option B: Using Manifests

# Apply CRDs
kubectl apply -f config/crd/bases/

# Apply RBAC
kubectl apply -f config/rbac/role.yaml
kubectl apply -f config/rbac/role_binding.yaml
kubectl apply -f config/rbac/service_account.yaml

# Apply examples
kubectl apply -f examples/00-namespace.yaml
kubectl apply -f examples/01-agentcontroller.yaml
Step 2: Verify Installation
Confirm the operator is running correctly
# Check operator is running
kubectl get pods -n aik8s-system

# Expected output:
# NAME                              READY   STATUS    RESTARTS   AGE
# aik8s-operator-manager-<hash>     1/1     Running   0          1m

# Check operator logs
kubectl logs -n aik8s-system -l app.kubernetes.io/name=aik8s-operator -f
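
For scripted installs, it can be more convenient to block until the rollout finishes than to poll `kubectl get pods`. A minimal sketch, assuming the Deployment is named aik8s-operator (the name used in the troubleshooting commands in this guide); `wait_for_operator` is a hypothetical helper.

```shell
# Hypothetical helper: block until the operator Deployment has rolled out.
# Assumes the Deployment is named aik8s-operator in aik8s-system.
# $1 = timeout passed to kubectl (default 120s).
wait_for_operator() {
  timeout=${1:-120s}
  kubectl rollout status deployment/aik8s-operator \
    -n aik8s-system --timeout="$timeout"
}
```

`wait_for_operator 300s` returns a non-zero exit code if the rollout does not complete in time, which makes it usable as a gate in CI scripts.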
Step 3: Create an AgentController
Deploy your AI agent configuration
# Apply the example AgentController
kubectl apply -f examples/01-agentcontroller.yaml

# Check status
kubectl get agentcontroller -n aik8s-system

# Watch status changes
kubectl get agentcontroller -n aik8s-system -w
Step 4: Monitor Incidents
Watch for AI-detected incidents in real-time
# List detected incidents
kubectl get incidents -n aik8s-system

# Get incident details
kubectl get incident <incident-name> -n aik8s-system -o yaml

# Watch for new incidents
kubectl get incidents -n aik8s-system -w
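
When many incidents accumulate, sorting by creation time makes the newest ones easier to find. The sketch below relies only on standard kubectl flags (`--sort-by`, `--no-headers`); `recent_incidents` is a hypothetical helper name.

```shell
# Hypothetical helper: print the N most recently created incidents.
# kubectl sorts ascending, so the newest entries are at the end.
# $1 = number of incidents to show (default 5).
recent_incidents() {
  count=${1:-5}
  kubectl get incidents -n aik8s-system \
    --sort-by=.metadata.creationTimestamp --no-headers | tail -n "$count"
}
```

For example, `recent_incidents 10` lists the ten newest incidents, oldest first.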
Step 5: Review Actions
Examine remediation actions taken by the AI
# List remediation actions
kubectl get actions -n aik8s-system

# Get action details
kubectl get action <action-name> -n aik8s-system -o yaml
Configuration Examples
Common AgentController configurations for different use cases

Minimum Configuration

Basic setup with essential features enabled

agentcontroller-minimal.yaml
apiVersion: ai.aik8s.io/v1alpha1
kind: AgentController
metadata:
  name: minimal-agent
  namespace: aik8s-system
spec:
  enableKnowledgeGraph: true
  llm:
    provider: openai
    model: gpt-4
    apiKeySecret:
      name: openai-key
      namespace: aik8s-system
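
Before applying a manifest such as agentcontroller-minimal.yaml, a server-side dry run can catch schema errors without changing the cluster. A minimal sketch using the standard `--dry-run=server` flag; `apply_validated` is a hypothetical helper.

```shell
# Hypothetical helper: validate a manifest against the API server and
# apply it only if validation succeeds.
# $1 = path to the manifest file.
apply_validated() {
  file=$1
  kubectl apply --dry-run=server -f "$file" && kubectl apply -f "$file"
}
```

`apply_validated agentcontroller-minimal.yaml` stops at the dry run if the API server rejects the manifest, for example because the CRDs are not installed yet.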

Full Configuration with Auto-Remediation

Complete setup with predictive engine, auto-remediation, and approval workflows

agentcontroller-full.yaml
apiVersion: ai.aik8s.io/v1alpha1
kind: AgentController
metadata:
  name: full-featured-agent
  namespace: aik8s-system
spec:
  enablePredictiveEngine: true
  enableAutoRemediation: true
  enableKnowledgeGraph: true
  
  llm:
    provider: anthropic
    model: claude-3-sonnet-20240229
    maxTokens: 4000
    apiKeySecret:
      name: anthropic-key
      namespace: aik8s-system
      key: api-key
  
  autoRemediation:
    enableTier1: true
    enableTier2: true
    approval:
      platform: slack
      channel: "#k8s-incidents"
      timeout: 10m
    rollbackTimeout: 15m
  
  observability:
    prometheusUrl: http://prometheus-server.monitoring.svc:9090
    lokiUrl: http://loki.monitoring.svc:3100
    metricsInterval: 15s
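
This configuration references a Secret named anthropic-key with the key api-key, which the installation steps do not create. One way to create it is sketched below. `create_api_key_secret` is a hypothetical helper, and the `ANTHROPIC_API_KEY` environment variable in the usage note is an assumption, not something the operator reads.

```shell
# Hypothetical helper: create or update a Secret holding an LLM API key
# under the key "api-key". Piping the generated manifest into
# "kubectl apply" makes the command safe to re-run.
# $1 = secret name, $2 = API key value, $3 = namespace (default aik8s-system).
create_api_key_secret() {
  name=$1
  value=$2
  ns=${3:-aik8s-system}
  kubectl create secret generic "$name" \
    --from-literal=api-key="$value" \
    -n "$ns" \
    --dry-run=client -o yaml | kubectl apply -f -
}
```

With the key exported in the shell, `create_api_key_secret anthropic-key "$ANTHROPIC_API_KEY"` creates the Secret that the manifest above expects.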

Using Local Ollama

Self-hosted LLM setup with local inference

agentcontroller-ollama.yaml
apiVersion: ai.aik8s.io/v1alpha1
kind: AgentController
metadata:
  name: ollama-agent
  namespace: aik8s-system
spec:
  enablePredictiveEngine: true
  llm:
    provider: ollama
    model: llama2
    maxTokens: 2000
    endpoint: http://ollama.ollama.svc.cluster.local:11434
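
Because the endpoint is an in-cluster Service address, it is worth confirming that it resolves and responds from inside the cluster. A sketch under two assumptions: the cluster can pull the public curlimages/curl image, and the Ollama server exposes its model-listing route at /api/tags. `check_ollama` is a hypothetical helper.

```shell
# Hypothetical helper: query an Ollama endpoint from a temporary pod in
# the aik8s-system namespace. The pod is removed when the command exits.
# $1 = Ollama base URL (defaults to the endpoint used in this guide).
check_ollama() {
  endpoint=${1:-http://ollama.ollama.svc.cluster.local:11434}
  kubectl run ollama-check --rm -i --restart=Never \
    --image=curlimages/curl -n aik8s-system -- \
    curl -sS "$endpoint/api/tags"
}
```

A JSON response that lists the configured model (llama2 in this example) suggests the endpoint is reachable and the model has been pulled.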
Troubleshooting
Common issues and their solutions

Operator Pod Not Starting

If the operator pod fails to start or enters CrashLoopBackOff, check the pod status and the logs of the previous container run.

# Check pod status
kubectl describe pod -n aik8s-system -l app.kubernetes.io/name=aik8s-operator

# Check logs for errors
kubectl logs -n aik8s-system -l app.kubernetes.io/name=aik8s-operator --previous
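
Namespace events often explain failures that never reach the container logs, such as image pull errors or failed scheduling. A minimal sketch using standard kubectl flags; `recent_events` is a hypothetical helper.

```shell
# Hypothetical helper: show the most recent events in the operator
# namespace, oldest first.
# $1 = number of lines to show (default 20).
recent_events() {
  count=${1:-20}
  kubectl get events -n aik8s-system --sort-by=.lastTimestamp | tail -n "$count"
}
```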

CRD Not Applied

If the CRDs are not registered, apply them manually.

# Check CRDs
kubectl get crd | grep ai.aik8s.io

# Manually apply CRDs if needed
kubectl apply -f config/crd/bases/
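
To see which resource types are still missing after applying the CRDs, the API server can be queried per API group. The resource names below (agentcontrollers, incidents, actions) are assumptions inferred from the kubectl commands in this guide, and `missing_aik8s_resources` is a hypothetical helper.

```shell
# Hypothetical helper: print each expected resource that the API server
# does not list under the ai.aik8s.io group. The expected names are
# assumptions inferred from this guide, not a confirmed CRD list.
missing_aik8s_resources() {
  registered=$(kubectl api-resources --api-group=ai.aik8s.io -o name)
  for res in agentcontrollers incidents actions; do
    case "$registered" in
      *"$res.ai.aik8s.io"*) ;;
      *) echo "missing: $res" ;;
    esac
  done
}
```

No output means all three resource types are registered.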

Permission Denied Errors

Verify that the RBAC roles and the service account are correctly configured.

# Verify RBAC is applied (ClusterRoles and ClusterRoleBindings are
# cluster-scoped, so the -n flag does not apply)
kubectl get clusterrole
kubectl get clusterrolebinding

# Check service account
kubectl get sa -n aik8s-system
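
Listing the RBAC objects shows that they exist, not that they grant the right permissions. `kubectl auth can-i` with impersonation tests a specific permission directly. In this sketch, `sa_can_i` is a hypothetical helper, and the service account name must be taken from the `kubectl get sa` output because this guide does not state it.

```shell
# Hypothetical helper: ask the API server whether a service account in
# aik8s-system may perform an action. Prints "yes" or "no".
# $1 = service account name, $2 = verb, $3 = resource.
sa_can_i() {
  kubectl auth can-i "$2" "$3" \
    --as="system:serviceaccount:aik8s-system:$1" -n aik8s-system
}
```

For example, `sa_can_i <service-account-name> list pods` checks whether the operator's account can list pods in its own namespace. Running it requires permission to impersonate service accounts.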

LLM Connection Issues

Check the operator's environment variables, the API key secret, and connectivity to the LLM API.

# Check environment variables
kubectl exec -n aik8s-system deployment/aik8s-operator -- env | grep LLM

# Verify secret exists
kubectl get secret openai-key -n aik8s-system

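# Test connectivity (requires curl in the operator image; without an API key,
# an HTTP 401 response still confirms that the endpoint is reachable)
kubectl exec -n aik8s-system deployment/aik8s-operator -- curl -v https://api.openai.com/v1/models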
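
A Secret can exist and still hold an empty or truncated key. The sketch below reports the length of the stored value without printing it. It assumes the key is stored under api-key, as in the installation step; `secret_key_length` is a hypothetical helper.

```shell
# Hypothetical helper: print the byte length of the "api-key" value in a
# Secret without revealing the value itself.
# $1 = secret name (default openai-key).
# Note: older macOS versions of base64 use -D instead of -d.
secret_key_length() {
  name=${1:-openai-key}
  kubectl get secret "$name" -n aik8s-system \
    -o jsonpath='{.data.api-key}' | base64 -d | wc -c | tr -d ' '
}
```

A result of 0 means the key is missing or empty, and the Secret should be recreated.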
Uninstall
Remove AIK8s Operator from your cluster
# Remove AgentController
kubectl delete agentcontroller -n aik8s-system --all

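# Remove operator (only if it was installed with Helm)
helm uninstall aik8s-operator -n aik8s-system

# Remove CRDs (this also deletes any remaining custom resources of those types)
kubectl delete -f config/crd/bases/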

# Remove namespace (optional)
kubectl delete namespace aik8s-system