Installation & Configuration
Configure your AI agent with LLM providers, observability integrations, and auto-remediation tiers.
LLM Configuration
| Provider | Models | Notes |
|---|---|---|
| OpenAI | gpt-4, gpt-3.5-turbo | Requires API key |
| Anthropic | claude-3-opus, claude-3-sonnet | Requires API key |
| Ollama | Various models | Self-hosted, runs locally |
| Z.AI | glm-4.7, glm-4, glm-4-air, glm-4-flash, glm-4-plus | Requires API key from Z.AI Coding Plan |
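The table above can be turned into a small pre-flight check that runs before a manifest is written. The following is a minimal sketch, not part of the product: the function name check_llm_choice is hypothetical, and the provider identifiers "anthropic" and "ollama" are assumed by analogy with "openai" and "zai", which are the only two that appear in this page's example manifests.

```python
# Models per provider, copied from the table above.
# ASSUMPTION: the identifiers "anthropic" and "ollama" are guesses; only
# "openai" and "zai" are confirmed by the example manifests on this page.
SUPPORTED_MODELS = {
    "openai": {"gpt-4", "gpt-3.5-turbo"},
    "anthropic": {"claude-3-opus", "claude-3-sonnet"},
    "zai": {"glm-4.7", "glm-4", "glm-4-air", "glm-4-flash", "glm-4-plus"},
    "ollama": None,  # "Various models": no fixed list to check against
}


def check_llm_choice(provider: str, model: str) -> list[str]:
    """Return a list of problems; an empty list means the pair matches the table."""
    if provider not in SUPPORTED_MODELS:
        return [f"unknown provider {provider!r}"]
    models = SUPPORTED_MODELS[provider]
    if models is not None and model not in models:
        return [f"model {model!r} is not listed for provider {provider!r}"]
    return []
```

Calling check_llm_choice("zai", "gpt-4") returns a one-element list describing the mismatch, while check_llm_choice("openai", "gpt-4") returns an empty list.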
Observability Integration
Integrates with Prometheus for real-time metrics collection. Uses custom metrics and recording rules to improve incident detection accuracy.
Connects to Loki for centralized log aggregation. Parses and analyzes logs to identify error patterns and correlations with system events.
Integrates with Jaeger for distributed tracing. Tracks request flows across microservices to pinpoint performance bottlenecks and failure points.
Supports OpenTelemetry for standardized telemetry collection. Provides a unified approach to metrics, logs, and traces across your entire infrastructure.
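To make the Prometheus integration more concrete, here is a rough sketch of reading metrics through Prometheus's standard HTTP API (the /api/v1/query instant-query endpoint). It is not the agent's own code, which this page does not show. The helper names are hypothetical, and only Prometheus is covered; Loki, Jaeger, and OpenTelemetry are left out.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for Prometheus's instant-query endpoint."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_vector(payload: dict) -> list:
    """Turn an instant-vector response into (labels, value) pairs."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector result, got {data['resultType']!r}")
    # Each sample is [unix_timestamp, "value-as-string"].
    return [(series["metric"], float(series["value"][1])) for series in data["result"]]


def query(base_url: str, promql: str, timeout: float = 5.0) -> list:
    """Run an instant query against a reachable Prometheus server."""
    with urlopen(instant_query_url(base_url, promql), timeout=timeout) as resp:
        return parse_vector(json.load(resp))
```

For example, query(prometheus_url, "increase(kube_pod_container_status_restarts_total[15m])") would list recent container restarts per series, assuming kube-state-metrics is installed and the server is reachable from where the script runs.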
Auto-Remediation Tiers
Tier 1: Low-risk operations that can safely run without human intervention.
- Restart failing pods automatically
- Scale replicas based on load
- Clear stale cache entries
Tier 2: Operations that require approval before execution, with a configurable timeout.
- Rollback deployments to previous versions
- Drain nodes for maintenance
- Restart services across namespaces
Approval Channels: Slack, Teams, Discord, PagerDuty
Tier 3: Critical operations that must be reviewed and executed by operators.
- Delete namespaces or resources
- Modify cluster-wide configurations
- Change networking policies
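The gating described above can be sketched as a small decision function. This is an illustration under stated assumptions, not the controller's actual logic: the three groups are numbered 1 to 3 in the order listed, the action names are hypothetical, and a disabled tier is assumed to fall back to manual handling (the page does not say what happens in that case). The two flags mirror the enableTier1 and enableTier2 fields used in the example manifests.

```python
from enum import Enum


class Decision(Enum):
    EXECUTE = "execute"                # run without human intervention
    AWAIT_APPROVAL = "await_approval"  # ask in the approval channel first
    MANUAL_ONLY = "manual_only"        # leave to operators


# Hypothetical action names, grouped as in the lists above.
ACTION_TIER = {
    "restart_pod": 1,
    "scale_replicas": 1,
    "clear_cache": 1,
    "rollback_deployment": 2,
    "drain_node": 2,
    "restart_services": 2,
    "delete_namespace": 3,
    "modify_cluster_config": 3,
    "change_network_policy": 3,
}


def decide(action: str, enable_tier1: bool, enable_tier2: bool) -> Decision:
    """Map an action to a decision given which tiers are enabled."""
    # Unknown actions are treated as the most restrictive tier.
    tier = ACTION_TIER.get(action, 3)
    if tier == 1 and enable_tier1:
        return Decision.EXECUTE
    if tier == 2 and enable_tier2:
        return Decision.AWAIT_APPROVAL
    # ASSUMPTION: disabled tiers and tier 3 both fall back to operators.
    return Decision.MANUAL_ONLY
```

With enableTier1 set to true and enableTier2 set to false, decide("restart_pod", True, False) yields Decision.EXECUTE and decide("drain_node", True, False) yields Decision.MANUAL_ONLY.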
Environment Variables
OPENAI_API_KEY
Required for OpenAI provider. Get your API key from the OpenAI dashboard.
export OPENAI_API_KEY=your-openai-api-key-here
ANTHROPIC_API_KEY
Required for Anthropic provider. Get your API key from the Anthropic console.
export ANTHROPIC_API_KEY=your-anthropic-api-key-here
ZAI_API_KEY
Required for Z.AI provider. Get your API key from Z.AI Coding Plan.
export ZAI_API_KEY=your-zai-api-key-here
Configuration Examples
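The manifests in this section read the API key from a Kubernetes Secret through apiKeySecret, while the previous section exports it as an environment variable. The page does not say how the two are connected, so the following is only one possible bridge: a sketch that renders a standard Opaque Secret from an exported variable. The function names are hypothetical; the default namespace and key match the values used in the examples.

```python
import base64
import json
import os


def secret_manifest(name: str, namespace: str, key: str, value: str) -> dict:
    """Build an Opaque Secret; Kubernetes expects base64 under `data`."""
    encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": {key: encoded},
    }


def render_from_env(env_var: str, name: str,
                    namespace: str = "aik8s-system", key: str = "api-key") -> str:
    """Read the key from the environment and return the Secret as JSON."""
    value = os.environ.get(env_var)
    if not value:
        raise RuntimeError(f"{env_var} is not set")
    return json.dumps(secret_manifest(name, namespace, key, value), indent=2)
```

Printing render_from_env("OPENAI_API_KEY", "openai-key") and piping the result to kubectl apply -f - would create the Secret that the first example refers to, since kubectl accepts JSON manifests as well as YAML.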
apiVersion: ai.aik8s.io/v1alpha1
kind: AgentController
metadata:
  name: my-agent
  namespace: aik8s-system
spec:
  enablePredictiveEngine: true
  enableAutoRemediation: true
  enableKnowledgeGraph: true
  llm:
    provider: openai
    model: gpt-4
    maxTokens: 2000
    apiKeySecret:
      name: openai-key
      namespace: aik8s-system
      key: api-key
  autoRemediation:
    enableTier1: true
    enableTier2: true
    approval:
      platform: slack
      channel: "#ops-alerts"
      timeout: 5m
    rollbackTimeout: 10m
  observability:
    prometheusUrl: http://prometheus-operated.monitoring.svc.cluster.local:9090
    lokiUrl: http://loki.monitoring.svc.cluster.local:3100
    metricsInterval: 30s
  clusters:
    - name: prod-cluster
      region: us-west-2
    - name: staging-cluster
      region: us-west-2
# Example AgentController with Z.AI Coding Plan Integration
apiVersion: ai.aik8s.io/v1alpha1
kind: AgentController
metadata:
  name: zai-agent
  namespace: aik8s-system
spec:
  enablePredictiveEngine: true
  enableAutoRemediation: true
  enableKnowledgeGraph: true
  llm:
    provider: zai
    model: glm-4.7
    maxTokens: 2000
    apiKeySecret:
      name: zai-api-key
      namespace: aik8s-system
      key: api-key
  autoRemediation:
    enableTier1: true
    enableTier2: false
    approval:
      platform: slack
      channel: "#ops-alerts"
      timeout: 5m
    rollbackTimeout: 10m
  observability:
    prometheusUrl: http://prometheus-operated.monitoring.svc.cluster.local:9090
    lokiUrl: http://loki.monitoring.svc.cluster.local:3100
    metricsInterval: 30s
  clusters:
    - name: prod-cluster
      region: us-west-2
    - name: staging-cluster
      region: us-east-1
Need More Details?
Explore the complete API Reference for detailed CRD specifications, field definitions, and advanced configuration options.