adamatdevops/mlifecycle-orchestrator

ML Lifecycle Orchestrator

Zero-touch ML model deployment platform with policy-as-code governance.

Overview

Data scientists push a model manifest. The platform deploys it to production automatically.

[Diagram: Zero-Touch ML Deployment Flow]

No DevOps tickets. No manual intervention. Full governance and auditability.

Key Features

| Feature | Description |
|---|---|
| Zero-Touch Deploy | Push manifest, get production endpoint |
| Policy-as-Code | OPA/Rego governance for quality, fairness, drift |
| Auto-Scaling | Kubernetes HPA based on load |
| Security First | Container scanning, SBOM, non-root |
| Observability | Prometheus metrics, Grafana dashboards |

Quick Start

1. Create Model Manifest

# examples/valid/model-manifest.yaml
apiVersion: mlifecycle.io/v1
kind: ModelDeployment
spec:
  model:
    name: fraud-detector
    version: "2.1.0"
    framework: pytorch
    experiment_id: exp-001
  metrics:
    accuracy: 0.94
    f1_score: 0.90
  fairness:
    demographic_parity: 0.02
  drift:
    score: 0.05
  dependencies:
    - torch>=2.0.0
  data_sources:
    - source: transactions-prod
      approved: true
  inference:
    resources:
      requests:
        cpu: "500m"
        memory: "1Gi"
      limits:
        cpu: "2"
        memory: "4Gi"
  monitoring:
    enabled: true
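
Kubernetes expects each resource request to be no larger than the matching limit. The following is a minimal sketch of that sanity check for the `inference.resources` block above. The helper names (`parse_cpu`, `parse_memory`, `requests_within_limits`) are hypothetical and not part of the platform, and only the most common quantity suffixes are handled.

```python
# Sketch only: these helpers are illustrative, not the platform's code.

_MEMORY_UNITS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,  # binary suffixes
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,     # decimal suffixes
}


def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity such as "500m" or "2" to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as "1Gi" to bytes."""
    # Try two-letter binary suffixes before one-letter decimal ones.
    for suffix in sorted(_MEMORY_UNITS, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * _MEMORY_UNITS[suffix])
    return int(quantity)  # no suffix: plain bytes


def requests_within_limits(resources: dict) -> bool:
    """True when both CPU and memory requests fit inside their limits."""
    req, lim = resources["requests"], resources["limits"]
    return (
        parse_cpu(req["cpu"]) <= parse_cpu(lim["cpu"])
        and parse_memory(req["memory"]) <= parse_memory(lim["memory"])
    )


# Values copied from the example manifest.
resources = {
    "requests": {"cpu": "500m", "memory": "1Gi"},
    "limits": {"cpu": "2", "memory": "4Gi"},
}
print(requests_within_limits(resources))
```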

2. Push to Repository

git add examples/valid/model-manifest.yaml
git commit -m "Deploy fraud-detector v2.1.0"
git push origin main

3. Automatic Deployment

After the push, the platform:

  1. Validates the manifest against governance policies
  2. Builds the container image
  3. Scans the image for vulnerabilities
  4. Deploys to Kubernetes
  5. Registers the model in the model catalog
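
The five steps behave like a fail-fast gate: a manifest that fails an early step never reaches the later ones. The sketch below illustrates that control flow only. The stage functions are placeholders invented for this example, and the real pipeline runs as GitHub Actions workflows rather than Python.

```python
from typing import Callable

# Each stage takes the manifest spec and returns (ok, detail).
Stage = Callable[[dict], tuple[bool, str]]


def run_pipeline(spec: dict, stages: list[tuple[str, Stage]]) -> list[str]:
    """Run stages in order and stop at the first failure."""
    log = []
    for name, stage in stages:
        ok, detail = stage(spec)
        log.append(f"{name}: {'ok' if ok else 'FAILED'} ({detail})")
        if not ok:
            break  # later stages never run for a rejected manifest
    return log


def validate(spec: dict) -> tuple[bool, str]:
    # Placeholder for policy validation: checks a single threshold.
    return spec["metrics"]["accuracy"] >= 0.85, "accuracy threshold"


def always_ok(detail: str) -> Stage:
    # Placeholder factory for stages that are not modelled here.
    return lambda spec: (True, detail)


STAGES = [
    ("validate", validate),
    ("build", always_ok("image built")),
    ("scan", always_ok("no findings")),
    ("deploy", always_ok("rolled out")),
    ("register", always_ok("catalog updated")),
]

for line in run_pipeline({"metrics": {"accuracy": 0.94}}, STAGES):
    print(line)
```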

Governance Policies

Models must pass all policies before deployment:

| Policy | Threshold | Purpose |
|---|---|---|
| Accuracy | >= 0.85 | Minimum model quality |
| F1 Score | >= 0.80 | Balanced precision/recall |
| Fairness | Required | Bias evaluation present |
| Demographic Parity | < 0.10 | Fair across groups |
| Drift Score | < 0.30 | Model stability |
| Dependencies | Allowlist | Supply chain security |
| Monitoring | Required | Observability enabled |
| Explainability | Required | Model interpretability |
| Version Format | Semantic | Valid semver (e.g., 1.0.0) |
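
To make the thresholds concrete, here is a small Python mirror of the checks whose manifest fields appear in the Quick Start example. It is a sketch for illustration, not the project's Rego code. The function name `deny` and the message strings are assumptions. The dependency allowlist and explainability checks are left out because this README shows neither the allowlist contents nor the corresponding manifest field.

```python
import re

# Simplified semver pattern: MAJOR.MINOR.PATCH without pre-release or build tags.
SEMVER = re.compile(r"^\d+\.\d+\.\d+$")


def deny(spec: dict) -> list[str]:
    """Return one message per violated policy; an empty list means the spec passes."""
    msgs = []
    metrics = spec.get("metrics", {})
    if metrics.get("accuracy", 0.0) < 0.85:
        msgs.append("accuracy below 0.85")
    if metrics.get("f1_score", 0.0) < 0.80:
        msgs.append("f1_score below 0.80")
    fairness = spec.get("fairness")
    if not fairness:
        msgs.append("fairness evaluation missing")
    elif fairness.get("demographic_parity", 1.0) >= 0.10:
        msgs.append("demographic_parity not below 0.10")
    if spec.get("drift", {}).get("score", 1.0) >= 0.30:
        msgs.append("drift score not below 0.30")
    if not spec.get("monitoring", {}).get("enabled", False):
        msgs.append("monitoring not enabled")
    if not SEMVER.match(spec.get("model", {}).get("version", "")):
        msgs.append("version is not MAJOR.MINOR.PATCH")
    return msgs


# Values copied from the example manifest.
spec = {
    "model": {"name": "fraud-detector", "version": "2.1.0"},
    "metrics": {"accuracy": 0.94, "f1_score": 0.90},
    "fairness": {"demographic_parity": 0.02},
    "drift": {"score": 0.05},
    "monitoring": {"enabled": True},
}
print(deny(spec))
```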

Policy Validation

# Validate model manifest
opa eval \
  --input examples/valid/model-manifest.yaml \
  --data src/policies/model-governance.rego \
  'data.model.governance.deny'
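
For scripting, the same check can be wrapped in Python. This sketch assumes that the `deny` rule yields a collection of messages and that `opa` is on the PATH. It passes `--format json` to `opa eval` and reads the first expression's value from the JSON result. The function names are hypothetical.

```python
import json
import subprocess


def extract_denials(opa_output: str) -> list:
    """Read the first expression's value from `opa eval --format json` output."""
    doc = json.loads(opa_output)
    results = doc.get("result", [])
    if not results:
        return []  # the query was undefined, so there is nothing to report
    return list(results[0]["expressions"][0]["value"])


def evaluate(manifest_path: str, policy_path: str) -> list:
    """Run the OPA CLI and return whatever the deny rule produced."""
    proc = subprocess.run(
        ["opa", "eval", "--format", "json",
         "--input", manifest_path,
         "--data", policy_path,
         "data.model.governance.deny"],
        capture_output=True, text=True, check=True,
    )
    return extract_denials(proc.stdout)


# Parsing demo on a hand-written sample; the message text is made up.
sample = json.dumps({
    "result": [{"expressions": [{
        "value": ["accuracy below threshold"],
        "text": "data.model.governance.deny",
    }]}]
})
print(extract_denials(sample))
```

A CI step could then fail the build whenever `evaluate(...)` returns a non-empty list.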

Inference Service

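The inference service is built on FastAPI and exposes the following endpoints:

| Endpoint | Method | Description |
|---|---|---|
| /health | GET | Liveness probe |
| /ready | GET | Model loaded check |
| /predict | POST | Run inference |
| /model/info | GET | Model metadata |
| /metrics | GET | Prometheus metrics |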

Example Request

curl -X POST https://inference.example.com/predict \
  -H "Content-Type: application/json" \
  -d '{"instances": [[1.0, 2.0, 3.0, 4.0, 5.0]]}'

Response

{
  "predictions": [
    {
      "prediction": 1,
      "confidence": 0.92,
      "probabilities": [0.08, 0.92]
    }
  ],
  "model_name": "fraud-detector",
  "model_version": "2.1.0",
  "inference_time_ms": 12.5
}
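
The same call can be made from Python with only the standard library. This is a sketch built on the request and response shapes shown above. The function names are hypothetical, and `https://inference.example.com` is the placeholder host from the curl example.

```python
import json
import urllib.request


def build_predict_request(base_url: str, instances: list) -> urllib.request.Request:
    """Build a POST request for /predict with a JSON body."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def top_predictions(response: dict) -> list:
    """Reduce a response to (prediction, confidence) pairs."""
    return [(p["prediction"], p["confidence"]) for p in response["predictions"]]


def predict(base_url: str, instances: list, timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON response (needs a reachable service)."""
    req = build_predict_request(base_url, instances)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


# Offline demo using the example response from this README.
example_response = {
    "predictions": [
        {"prediction": 1, "confidence": 0.92, "probabilities": [0.08, 0.92]}
    ],
    "model_name": "fraud-detector",
    "model_version": "2.1.0",
    "inference_time_ms": 12.5,
}
print(top_predictions(example_response))
```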

Project Structure

mlifecycle-orchestrator/
├── .github/workflows/       # CI/CD pipelines
│   ├── zero-touch-deploy.yml
│   ├── model-governance.yml
│   ├── security-scan.yml
│   └── ci.yml
├── src/
│   ├── policies/            # OPA/Rego governance
│   │   ├── model-governance.rego
│   │   └── deployment-policy.rego
│   └── inference-service/   # FastAPI inference
│       ├── app/main.py
│       ├── Dockerfile
│       └── tests/
├── infrastructure/
│   ├── terraform/           # AWS EKS infrastructure
│   └── kubernetes/          # K8s manifests
│       ├── base/
│       └── overlays/
├── examples/
│   ├── valid/               # Valid model manifests
│   └── invalid/             # Policy violation examples
└── docs/
    ├── architecture.md
    ├── zero-touch-workflow.md
    ├── model-governance.md
    └── adr/

Local Development

Prerequisites

  • Python 3.11+
  • Docker
  • OPA CLI
  • kubectl (optional)

Run Inference Service

cd src/inference-service
pip install -r requirements.txt
uvicorn app.main:app --reload  # serves on http://127.0.0.1:8000 by default

Run Policy Tests

# Install OPA
brew install opa  # macOS
# or download from https://www.openpolicyagent.org/

# Run tests
opa test src/policies/ -v

Build Container

cd src/inference-service
docker build -t inference-service:local .
docker run -p 8080:8080 inference-service:local

Kubernetes Architecture

[Diagram: Kubernetes Deployment Architecture]

Key Components

| Component | Description |
|---|---|
| Ingress | TLS termination, routing |
| Service | ClusterIP load balancing |
| Deployment | Rolling updates, replicas |
| HPA | Auto-scaling (2-10 pods) |
| PDB | High availability guarantee |
| NetworkPolicy | Pod isolation |
| ServiceMonitor | Prometheus scraping |
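
The Horizontal Pod Autoscaler picks a replica count from the ratio of the observed metric to its target, then clamps it to the configured bounds. The sketch below applies that documented rule with the 2-10 pod range from the table. The 60% CPU target is an assumed value for illustration, and the real controller also applies a tolerance band and stabilization windows that are not modelled here.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 2,
    max_replicas: int = 10,
) -> int:
    """ceil(current_replicas * current / target), clamped to [min, max]."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Assumed target: 60% average CPU utilization.
print(desired_replicas(4, current_metric=90, target_metric=60))   # scale up
print(desired_replicas(8, current_metric=100, target_metric=50))  # capped at max
print(desired_replicas(2, current_metric=10, target_metric=60))   # held at min
```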

Infrastructure

Detailed setup guides are available in the documentation under docs/.

Security & Governance Pipeline

[Diagram: Security & Governance Pipeline]

Security Scanning

| Category | Tools | Purpose |
|---|---|---|
| Container | Trivy, Grype | Vulnerability scanning |
| Dependencies | pip-audit, Safety | Python package vulnerabilities |
| Secrets | Gitleaks | Credential detection |
| SAST | Bandit, Semgrep | Static code analysis |
| Dockerfile | Hadolint, Dockle | Best practices linting |

Container Security

  • Multi-stage Docker builds for minimal attack surface
  • Non-root user execution
  • Read-only filesystem
  • Health checks enabled

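Documentation

Guides live in the docs/ directory: architecture.md, zero-touch-workflow.md, model-governance.md, and the architecture decision records in adr/.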

Technology Stack

| Layer | Technology |
|---|---|
| Policy Engine | Open Policy Agent, Rego |
| ML Framework | PyTorch |
| API Framework | FastAPI, Uvicorn |
| Container | Docker, Buildx |
| Orchestration | Kubernetes, Kustomize |
| Infrastructure | Terraform, AWS EKS |
| CI/CD | GitHub Actions |
| Observability | Prometheus, Grafana |
| Code Quality | Ruff, Black, isort, mypy |
| Security | Trivy, Grype, Gitleaks, Hadolint, Bandit |
| Testing | pytest, OPA test |

License

MIT License - see LICENSE for details.