Cloud Native, GitOps, Platform Engineering

Part 4 – Building a Scalable, Multi‑Environment GitOps Architecture with Argo CD

In the previous parts of this series, we focused on instrumentation, the OpenTelemetry Operator, and the Collector. Now we shift gears and build the GitOps architecture that will manage everything: platform components, workloads, and environment‑specific configuration, all in a clean, scalable, production‑ready way.

This is where the project becomes a real platform.

All manifests, ApplicationSets, and configuration used in this series are available in the companion GitHub repository.

🎯 What We’re Building in Part 4

By the end of this part, you will have:

  • A multi‑environment GitOps structure
  • A clean separation between:
    • Platform components (cert-manager, OTel Operator, Collector)
    • Application workloads (demo-dotnet)
  • A split ApplicationSet model:
    • One ApplicationSet for Helm‑based platform components
    • One ApplicationSet for plain‑YAML platform components
    • One ApplicationSet for application workloads
  • Matrix generators that produce environment‑named instances, e.g. dev-cert-manager, dev-collector, dev-demo-dotnet; adding another environment such as staging would yield staging-cert-manager, staging-collector, and staging-demo-dotnet
  • Sync waves to enforce deterministic ordering
  • Namespace isolation per environment
  • Environment‑specific overrides via environments/{{.environment}}/values/

This is the architecture used by real platform teams running GitOps at scale.

📁 Repository Structure

A clean repo structure makes GitOps easier. This series uses:

argocd/
  app-of-apps.yaml
  applicationset-platform-helm.yaml
  applicationset-platform.yaml
  applicationset-apps.yaml

platform/
  cert-manager/
  opentelemetry-operator/
  collector/

apps/
  demo-dotnet/

environments/
  dev/
    values/
      platform-values.yaml
      apps-values.yaml

This structure is intentionally simple, scalable, and DRY.

🌱 The App-of-Apps Root

Argo CD starts with a single root Application:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/<your-repo>
    targetRevision: main
    path: argocd
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: false
      selfHeal: false
    syncOptions:
      - CreateNamespace=true
      - PruneLast=true

This Application discovers and applies all ApplicationSets inside argocd/.
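
Bootstrapping is a one‑time manual step, assuming the manifest lives at argocd/app-of-apps.yaml as in the layout above; after that, Git drives everything:

kubectl apply -n argocd -f argocd/app-of-apps.yaml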

🏗️ Splitting Platform Components

Not all platform components are created equal:

  • cert-manager → Helm chart
  • opentelemetry-operator → Helm chart
  • collector → plain YAML

Trying to force everything through Helm adds unnecessary complexity and invites avoidable errors.
So we split the platform into two ApplicationSets:

1. applicationset-platform-helm.yaml

Manages cert-manager + OTel Operator.

2. applicationset-platform.yaml

Manages the Collector.

This keeps the repo clean and avoids Helm‑related errors for components that are just plain YAML.

🧬 Matrix Generators: env × component

Each ApplicationSet uses a matrix generator:

  • One list defines environments (dev, staging, prod)
  • One list defines components (e.g., cert-manager, operator, collector)

This series includes only a dev environment, but the structure supports adding staging and prod without any structural changes.

Argo CD multiplies them:

(dev × cert-manager)
(dev × operator)
(dev × collector)
(staging × cert-manager)
...

This produces a clean, predictable set of Applications per environment.
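
As a concrete reference, here is a minimal sketch of the Helm platform ApplicationSet. The shape follows the repo layout above, but names, namespaces, and the sync policy are illustrative; the exact manifests live in the companion repository:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: platform-helm
  namespace: argocd
spec:
  goTemplate: true
  generators:
    - matrix:
        generators:
          - list:
              elements:
                - environment: dev          # add staging / prod here later
          - list:
              elements:
                - component: cert-manager
                - component: opentelemetry-operator
  template:
    metadata:
      name: '{{.environment}}-{{.component}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/<your-repo>
        targetRevision: main
        path: 'platform/{{.component}}'     # assumes a wrapper Helm chart per component
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{.component}}'         # illustrative; match your per-environment isolation model
      syncPolicy:
        automated: {}
        syncOptions:
          - CreateNamespace=true

The plain‑YAML platform ApplicationSet and the apps ApplicationSet follow the same shape; they simply point at platform/collector and apps/ respectively, without the Helm specifics.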

⏱️ Sync Waves: Ordering Matters

Platform components must deploy in the correct order:

Wave  Component
0     cert-manager, opentelemetry-operator
1     collector
3     workloads

This ensures:

  • CRDs exist before the Operator starts
  • The Collector exists before workloads send telemetry
  • Workloads deploy last
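
Argo CD reads the wave from the argocd.argoproj.io/sync-wave annotation. One way to wire it in this setup is to stamp the annotation onto each generated Application via the ApplicationSet template; a sketch for the Collector:

  template:
    metadata:
      name: '{{.environment}}-{{.component}}'
      annotations:
        argocd.argoproj.io/sync-wave: "1"   # after wave 0 (cert-manager, opentelemetry-operator)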

🌍 Environment-Specific Overrides

Each environment has its own values files, for example:

environments/dev/values/platform-values.yaml
environments/staging/values/platform-values.yaml
environments/prod/values/platform-values.yaml

Only dev is included in this series, but the pattern scales to additional environments easily.

This keeps platform definitions DRY while allowing environment‑specific behaviour.
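
For the Helm‑based platform components, these files can be pulled in through Helm valueFiles, which Argo CD resolves relative to the chart path within the same repository. A sketch of the source block (paths assume the layout above):

      source:
        repoURL: https://github.com/<your-repo>
        targetRevision: main
        path: 'platform/{{.component}}'
        helm:
          valueFiles:
            # two levels up from platform/<component> back to the repo root
            - '../../environments/{{.environment}}/values/platform-values.yaml'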

🚀 The Result

By the end of Part 4, you have:

  • A fully declarative, multi‑environment GitOps architecture
  • Clean separation of platform vs apps
  • Deterministic ordering via sync waves
  • Environment‑specific overrides
  • Namespace isolation
  • A scalable pattern for adding new apps or environments

This is the foundation for everything that follows in the series.

.NET, Observability, OpenTelemetry

Part 3 – Auto‑Instrumenting .NET with OpenTelemetry

In Part 2, we deployed Jaeger v2 using the OpenTelemetry Collector and exposed the Jaeger UI. Now it’s time to generate real traces without modifying application code or rebuilding container images.

This part shows how to use the OpenTelemetry Operator to inject the .NET auto‑instrumentation agent automatically. This approach is fully declarative, GitOps‑friendly, and ideal for platform teams who want consistent instrumentation across many services.

All manifests, ApplicationSets, code, and configuration used in this series are available in the companion GitHub repository.

🧠 How Operator‑Managed .NET Auto‑Instrumentation Works

The OpenTelemetry Operator can automatically:

  • Inject the .NET auto‑instrumentation agent into your pod
  • Mount the agent files
  • Set all required environment variables
  • Configure OTLP exporters
  • Apply propagators
  • Ensure consistent agent versions across workloads

This means:

  • No Dockerfile changes
  • No manual environment variables
  • No code changes
  • No per‑service configuration drift

Instrumentation becomes a cluster‑level concern, not an application‑level burden.

📦 Defining the .NET Instrumentation Resource

To enable .NET auto‑instrumentation, create an Instrumentation CR:

apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: auto-dotnet
  namespace: apps
spec:
  dotnet:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-dotnet:latest

This tells the Operator:

  • Manage the lifecycle of the agent declaratively
  • Use the official .NET auto‑instrumentation agent
  • Inject it into workloads in this namespace (or those that opt‑in)

Commit this file to Git and let ArgoCD sync it.
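
In this setup the Instrumentation CR typically also carries the exporter endpoint and propagators; the injected environment variables shown later in this post reflect these values. A sketch of the fuller CR (endpoint taken from the Jaeger Collector service used in Part 2):

apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: auto-dotnet
  namespace: apps
spec:
  exporter:
    # OTLP/HTTP port on the Collector service deployed in Part 2
    endpoint: http://jaeger-inmemory-instance-collector.monitoring.svc.cluster.local:4318
  propagators:
    - tracecontext
    - baggage
  dotnet:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-dotnet:latest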

🏗️ Instrumenting a .NET Application (No Image Changes Required)

To instrument a .NET application, you simply annotate the Deployment’s pod template:

spec:
  template:
    metadata:
      annotations:
        instrumentation.opentelemetry.io/inject-dotnet: "true"

That’s it.

The Operator will:

  • Inject the agent
  • Mount the instrumentation files
  • Set all required environment variables
  • Configure the OTLP exporter
  • Enrich traces with Kubernetes metadata

Your Deployment YAML stays clean and simple.

📁 Example .NET Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dev-demo-dotnet
  namespace: apps
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dev-demo-dotnet
  template:
    metadata:
      labels:
        app: dev-demo-dotnet
      annotations:
        instrumentation.opentelemetry.io/inject-dotnet: "true"
    spec:
      containers:
        - name: dev-demo-dotnet
          image: demo-dotnet:latest
          ports:
            - containerPort: 8080

Notice what’s missing:

  • No agent download
  • No Dockerfile changes
  • No environment variables
  • No profiler configuration

The Operator handles everything.

🔬 What the Operator Injects (Real Example)

Here is a simplified version of the mutated pod as it ends up in your cluster. It shows exactly what the Operator adds:

initContainers:
  - name: opentelemetry-auto-instrumentation-dotnet
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-dotnet:latest
    command: ["cp", "-r", "/autoinstrumentation/.", "/otel-auto-instrumentation-dotnet"]

Injected environment variables

env:
  - name: CORECLR_ENABLE_PROFILING
    value: "1"
  - name: CORECLR_PROFILER
    value: "{918728DD-259F-4A6A-AC2B-B85E1B658318}"
  - name: CORECLR_PROFILER_PATH
    value: /otel-auto-instrumentation-dotnet/linux-x64/OpenTelemetry.AutoInstrumentation.Native.so
  - name: DOTNET_STARTUP_HOOKS
    value: /otel-auto-instrumentation-dotnet/net/OpenTelemetry.AutoInstrumentation.StartupHook.dll
  - name: DOTNET_ADDITIONAL_DEPS
    value: /otel-auto-instrumentation-dotnet/AdditionalDeps
  - name: DOTNET_SHARED_STORE
    value: /otel-auto-instrumentation-dotnet/store
  - name: OTEL_DOTNET_AUTO_HOME
    value: /otel-auto-instrumentation-dotnet
  - name: OTEL_SERVICE_NAME
    value: dev-demo-dotnet
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    value: http://jaeger-inmemory-instance-collector.monitoring.svc.cluster.local:4318

Kubernetes metadata enrichment

- name: OTEL_RESOURCE_ATTRIBUTES
  value: k8s.container.name=dev-demo-dotnet,...

Volume for instrumentation files

volumes:
  - name: opentelemetry-auto-instrumentation-dotnet
    emptyDir:
      sizeLimit: 200Mi

This is the Operator doing exactly what it was designed to do:
injecting a complete, production‑grade instrumentation layer without touching your application code.

🚀 Deploying the Instrumented App

Once the Instrumentation CR and Deployment are committed:

  1. ArgoCD syncs the changes
  2. The Operator mutates the pod
  3. The .NET agent is injected
  4. The app begins emitting OTLP traces

Check the pod:

kubectl get pods -n apps

You’ll see (and can confirm with the command below):

  • An init container
  • A mounted instrumentation volume
  • Injected environment variables
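
To look at what was injected, dump the mutated pod spec using the label from the Deployment above:

kubectl -n apps get pods -l app=dev-demo-dotnet -o yaml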

🔍 Verifying That Traces Are Flowing

1. Port‑forward the Jaeger UI

kubectl -n monitoring port-forward svc/jaeger-inmemory-instance-collector 16686:16686

Open:

http://localhost:16686

2. Generate traffic

kubectl -n apps port-forward svc/dev-demo-dotnet 8080:8080
curl http://localhost:8080/

3. Check the Jaeger UI

You should now see:

  • Service: dev-demo-dotnet
  • HTTP server spans
  • Outgoing calls (if any)
  • Full trace graphs

If you see traces, the Operator‑managed pipeline is working end‑to‑end.

🧪 Troubleshooting Common Issues

No traces appear

  • Ensure the Deployment has the annotation
  • Ensure the Instrumentation CR is in the same namespace
  • Check Operator logs for mutation errors
  • Verify the Collector is listening on its OTLP ports (4317/4318) using the quick checks below
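
A couple of quick checks (the Operator’s namespace and pod name depend on how the Helm chart was installed, so those are placeholders):

kubectl get pods -A | grep opentelemetry-operator
kubectl -n <operator-namespace> logs <operator-pod-name>
kubectl -n monitoring get svc jaeger-inmemory-instance-collector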

App restarts repeatedly

  • The Operator may be injecting into a non‑.NET container
  • Ensure your image is .NET 8+

Traces appear but missing context

  • The Operator configures the tracecontext and baggage propagators automatically
  • Ensure no middleware strips headers

🧭 What’s Next

With Jaeger v2 deployed and .NET auto‑instrumentation fully automated, you now have a working observability pipeline that requires:

  • No code changes
  • No image modifications
  • No per‑service configuration

In Part 4, we’ll take this setup and make it fully declarative using ArgoCD:

  • Repo structure
  • ArgoCD Applications
  • Sync strategies
  • Drift correction
  • Multi‑component GitOps workflows

This is where the system becomes operationally robust.