MLOps

Build Event-Driven ML Pipelines with Argo Workflows

I’ve been diving into the world of MLOps lately, curious about how modern ML pipelines can be made more scalable and maintainable. Machine learning is so much more than just training a model: there’s data pre-processing, feature engineering, evaluation, deployment, and the ongoing need for everything to be reproducible.

As a DevOps engineer, I’ve spent years designing reliable workflows for CI/CD and infrastructure automation, but hadn’t explored how those same principles could apply to ML pipelines. That’s where Argo Workflows and Argo Events caught my attention. They’re lightweight, Kubernetes-native, and from what I’ve seen so far, they’re gaining real traction in the MLOps space.

This post is my first hands-on look at these tools, setting up Argo Workflows and Argo Events on a local cluster with kind and exploring how they might enable event-driven, reproducible ML pipelines.

🧠Why Argo Workflows for MLOps?

Traditional ML pipelines are often stitched together using ad-hoc scripts, cron jobs, or heavy frameworks like Kubeflow. Argo Workflows offers a Kubernetes-native, lightweight alternative for orchestrating ML pipelines with:

  • Containerised tasks: Each step runs in its own container for reproducibility.
  • DAG-based workflows: Easily express complex pipelines with dependencies.
  • Event-driven triggers: With Argo Events, workflows can be launched automatically when new data arrives or other events occur.
  • Parallel execution: Fan-out tasks for hyperparameter tuning, multi-model training, or batch inference.
  • Retry strategies & exit handlers: Add robustness with built-in error handling and graceful exits.
  • Artifact management: Integrate with MinIO or volume mounts to persist model files, metrics, or datasets across steps.

Compared to tools like Kubeflow, Argo is simpler and less opinionated, which makes it easier to integrate with tools such as MLflow or Seldon Core. Its flexibility lets you tailor pipelines to your needs without locking into rigid frameworks.
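
To give a flavour of the retry strategies and exit handlers listed above, here’s a minimal sketch of a standalone Workflow (not part of the pipeline built below) that retries a flaky step and always runs a cleanup template on exit; the names, images, and limits are illustrative:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: retry-example-
spec:
  entrypoint: main
  onExit: cleanup                 # exit handler, runs whether main succeeds or fails
  templates:
    - name: main
      retryStrategy:
        limit: 3                  # retry the step up to 3 times
        retryPolicy: OnFailure
        backoff:
          duration: "30s"
          factor: 2
      container:
        image: python:3.9
        command: ["python", "-c"]
        args: ["print('flaky training step')"]
    - name: cleanup
      container:
        image: alpine:3.19
        command: ["sh", "-c"]
        args: ["echo 'cleaning up temporary artifacts'"]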


🛠️Initial Setup

Here’s the setup I used for experimenting locally:

Create a kind Cluster

kind create cluster --name mlops-local

Install Argo Workflows

Run these commands to add it to the cluster:

kubectl create namespace argo
kubectl apply -n argo -f https://github.com/argoproj/argo-workflows/releases/latest/download/install.yaml

Install Argo Events

Run these commands to add it to the kind cluster:

kubectl create namespace argo-events
kubectl apply -f https://raw.githubusercontent.com/argoproj/argo-events/stable/manifests/install.yaml

With everything installed, let’s walk through building an event-driven pipeline step-by-step.

Setting Up Event-Driven Pipelines

The fun part of MLOps with Argo is how events can trigger workflows. Here’s a minimal example I tried using webhooks.

Define an EventBus

The EventBus is the transport layer that carries events from EventSources to Sensors; the default NATS-based bus is fine for local use.

🔽📄eventbus.yaml
apiVersion: argoproj.io/v1alpha1
kind: EventBus
metadata:
  name: default
spec:
  nats:
    native:
      # minimum of 3 replicas required
      replicas: 3
      auth: token

Once defined, load it into the kind cluster:

kubectl apply -n argo-events -f eventbus.yaml
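
The NATS pods can take a moment to come up; you can confirm the bus is running with:

kubectl get eventbus -n argo-events
kubectl get pods -n argo-events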

Roles and Service Account

You’ll need a role, rolebinding, and service account to allow Argo Events to trigger workflows.

🔽📄role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: argo-workflow-role
  namespace: argo-events
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["patch", "create", "get", "list", "watch", "delete"]
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["argoproj.io"]
    resources: ["workflows"]
    verbs: ["create", "get", "list", "watch", "update", "patch", "delete"]
  - apiGroups: ["argoproj.io"]
    resources: ["workflows/finalizers"]
    verbs: ["update"]
  - apiGroups: ["argoproj.io"]
    resources: ["workflowtaskresults"]
    verbs: ["create","patch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["serviceaccounts"]
    verbs: ["get"]
🔽📄rolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: argo-workflowtaskresults-binding
  namespace: argo-events
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name:  argo-workflow-role
subjects:
  - kind: ServiceAccount
    name: operate-workflow-sa
    namespace: argo-events
🔽📄serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: operate-workflow-sa
  namespace: argo-events

With everything configured, load the manifests into the kind cluster:

kubectl apply -n argo-events -f role.yaml
kubectl apply -n argo-events -f rolebinding.yaml
kubectl apply -n argo-events -f serviceaccount.yaml
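
As a quick sanity check that the RBAC wiring works, kubectl can impersonate the service account (this assumes your local kind context is allowed to impersonate):

kubectl auth can-i create workflows.argoproj.io -n argo-events \
  --as=system:serviceaccount:argo-events:operate-workflow-sa

This should print yes once the role binding is in place.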

Define a Webhook EventSource

This sets up a simple HTTP endpoint that triggers a workflow when called.

🔽📄event-source.yaml
apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: webhook
  namespace: argo-events
spec:
  service:
    ports:
      - port: 12000
        targetPort: 12000
  webhook:
    trigger:
      port: "12000"
      endpoint: /trigger
      method: POST

Once defined, load it into your cluster:

kubectl apply -n argo-events -f event-source.yaml

Define a Sensor to Trigger Workflows

🔽📄sensor.yaml
apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: ml-pipeline-sensor
  namespace: argo-events
spec:
  template:
    serviceAccountName: operate-workflow-sa
  dependencies:
    - name: training-event
      eventSourceName: webhook
      eventName: trigger
  triggers:
    - template:
        name: trigger-ml-pipeline
        k8s:
          operation: create
          source:
            resource:
              apiVersion: argoproj.io/v1alpha1
              kind: Workflow
              metadata:
                generateName: ml-pipeline-run-
              spec:
                workflowTemplateRef:
                  name: ml-pipeline-template
                arguments:
                  parameters:
                    - name: model
                      value: default-model
                    - name: dataset
                      value: default-dataset
          parameters:
            - src:
                dependencyName: training-event
                dataKey: body.model
              dest: spec.arguments.parameters.0.value
            - src:
                dependencyName: training-event
                dataKey: body.dataset
              dest: spec.arguments.parameters.1.value

Once defined, load it into your cluster:

kubectl apply -n argo-events -f sensor.yaml

Define a Workflow Template

This is a mock ML pipeline with train and evaluate steps, showing how parameters are passed into each step.

🔽📄workflowtemplate.yaml
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: ml-pipeline-template
  namespace: argo-events
spec:
  entrypoint: pipeline
  serviceAccountName: operate-workflow-sa 
  templates:
    - name: pipeline
      dag:
        tasks:
          - name: train-model
            template: train
            arguments:
              parameters:
                - name: model
                  value: "{{workflow.parameters.model}}"
          - name: evaluate-model
            dependencies: [train-model]
            template: evaluate
            arguments:
              parameters:
                - name: dataset
                  value: "{{workflow.parameters.dataset}}"

    - name: train
      inputs:
        parameters:
          - name: model
      container:
        image: python:3.9
        command: ["python"]
        args: ["-c", "print('Training {{inputs.parameters.model}}...')"]

    - name: evaluate
      inputs:
        parameters:
          - name: dataset
      container:
        image: python:3.9
        command: ["python"]
        args: ["-c", "print('Evaluating {{inputs.parameters.dataset}}...')"]

Once defined, load it into your cluster:

kubectl apply -n argo-events -f workflowtemplate.yaml
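
If you have the Argo CLI installed, you can also run the template directly as a quick sanity check before wiring up the event (the parameter values here are arbitrary):

argo submit -n argo-events --from workflowtemplate/ml-pipeline-template \
  -p model=resnet -p dataset=imagenet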

📬Trigger Event

First, expose the webhook so it can be called locally:

kubectl -n argo-events port-forward svc/webhook-eventsource-svc 12000:12000

Trigger it by sending a POST request via curl:

curl -d '{"model":"resnet","dataset":"imagenet"}' \
  -H "Content-Type: application/json" -X POST http://localhost:12000/trigger

👀Visualizing the Pipeline

Patch the argo-server for local access:

kubectl patch deployment argo-server --namespace argo --type='json' \
  -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/args", "value": ["server","--auth-mode=server"]}]'

Then:

kubectl port-forward svc/argo-server -n argo 2746:2746

Navigate to https://localhost:2746 to visualise your pipeline. In the Argo UI you should see the configuration created above: the event flow, event sources, the sensor, workflow templates, and the workflow runs.


🌐Real-World MLOps Use Cases

This is where I see Argo Workflows + Events fitting into real ML pipelines:

  1. Event-Driven Model Training
  2. Continuous Model Evaluation (CME) – e.g. scheduled evaluation runs (see the sketch after this list)
  3. ETL for ML Pipelines
  4. Model Deployment Automation
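
For the continuous evaluation case, Argo Events also provides a calendar event source, so an evaluation workflow could be triggered nightly instead of by a webhook. A minimal sketch (the name and schedule are illustrative):

apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: nightly-eval
  namespace: argo-events
spec:
  calendar:
    nightly:
      schedule: "0 2 * * *"     # every day at 02:00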

🚀 What’s Next

Argo Workflows and Argo Events have opened the door to scalable, event-driven ML pipelines, but there’s much more to explore:

📦 GitOps Delivery with ArgoCD

Pairing ArgoCD with Argo Workflows would enable declarative, version-controlled deployment of ML pipelines across environments. Imagine triggering new workflow templates from a Git commit and syncing changes automatically.
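
As a rough sketch of what that might look like, an Argo CD Application could point at a Git repository of WorkflowTemplates and keep the argo-events namespace in sync; the repository URL and path here are hypothetical:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ml-pipeline-templates
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/ml-pipelines.git   # hypothetical repo of WorkflowTemplates
    targetRevision: main
    path: templates
  destination:
    server: https://kubernetes.default.svc
    namespace: argo-events
  syncPolicy:
    automated:
      prune: true
      selfHeal: true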

📡 Real-World Event Sources

How about connecting Argo Events to cloud-native services such as:

  • AWS SQS / SNS
  • Azure Service Bus
  • etc.

These integrations could allow upstream events to dynamically trigger ML pipelines.
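
For example, swapping the webhook EventSource for an AWS SQS one would look roughly like this; the region, queue, and secret names are placeholders, and the exact fields are worth checking against the Argo Events docs for your version:

apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: aws-sqs
  namespace: argo-events
spec:
  sqs:
    new-training-data:
      region: eu-west-1
      queue: training-data-queue        # placeholder queue name
      waitTimeSeconds: 20
      accessKey:
        name: aws-credentials           # placeholder Kubernetes secret
        key: accesskey
      secretKey:
        name: aws-credentials
        key: secretkey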

🔧 Tool Integrations Ahead

What about adding further integrations with popular tools such as:

  • MLflow – for experiment tracking and lifecycle management (see the sketch after this list)
  • KServe – to enable model serving within a Kubernetes-native stack
  • etc.
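
For instance, an MLflow logging step could be slotted into the WorkflowTemplate above as just another container template. This is only a sketch, assuming an in-cluster MLflow tracking server at the URI shown:

    - name: log-metrics
      container:
        image: python:3.9
        command: ["bash", "-c"]
        args:
          - |
            pip install --quiet mlflow
            python - <<'PY'
            import mlflow
            # assumed in-cluster MLflow tracking server (adjust to your setup)
            mlflow.set_tracking_uri("http://mlflow.mlflow.svc.cluster.local:5000")
            with mlflow.start_run():
                mlflow.log_param("model", "resnet")
                mlflow.log_metric("accuracy", 0.92)
            PY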

I hope this post sparked some ideas. Go give Argo a spin and explore your own event-driven ML workflows!

Azure Pipelines, DevOps

Dynamic Multistage Azure Pipelines Part 1

In a previous post I looked at multistage YAML pipelines. In this post I am going to look at dynamic multistage YAML pipelines.

What do I mean by dynamic multistage? I mean running multiple stages where all of the configuration is loaded dynamically from one or more sources, e.g. parameters, variable templates, or variable groups.

Why?

What problem am I trying to solve with this? Firstly, to reduce duplication: in a lot of cases the difference between dev and prod is just the configuration. Secondly, to provide the groundwork for a base setup so that I can concentrate on what steps are needed in the pipeline and not worry about the environments.

Anything else? Well, I often have multiple projects that all need to deploy to the same set of environments, so it would be good to share that configuration between projects as well.

Next Steps

Ok, I need a pipeline. Let’s start with something simple: a pipeline with an initial build stage and then multiple deployment stages defined by a parameter:

trigger: none 
pr: none 

pool:  
  vmImage: 'ubuntu-latest' 

parameters:
- name: stages
  type: object
  default:
    - 'dev'
    - 'prod'

stages:
- stage: build
  displayName: 'Build/Package Code or IaC'  
  jobs:  
  - job: build
    displayName: 'Build/Package Code'
    steps:
    # Steps to perform the build and/or package of code or IaC

- ${{ each stage in parameters.stages }}:
  - stage: ${{ stage }}
    displayName: 'Deploy to ${{ stage }}'
    jobs:
    - deployment: deploy_${{ stage }}
      displayName: 'Deploy app to ${{ stage }}'
      environment: ${{ stage }}
      strategy:
        runOnce:
          deploy:
            steps:
            # Steps to perform the deployment

This very small example achieves configuring multiple deployment stages, and adding another stage is very easy to do: just update the parameter to include a new stage name, as sketched below.
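
For instance, adding a hypothetical test environment is just one more entry in the parameter default:

parameters:
- name: stages
  type: object
  default:
    - 'dev'
    - 'test'
    - 'prod'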

Now we have the basic configuration, let’s add loading of a variable group. This could be done by using dynamic naming or by changing the stages parameter.

I have a variable group for each environment, groupvars_dev and groupvars_prod, each with a single variable mygroupvar.

Dynamic Naming

I’ll add the variable group to the variables at the Stage level (this could also be done at the job level) and include the stage name dynamically.

- ${{ each stage in parameters.stages }}:
  - stage: ${{ stage }}
    displayName: 'Deploy to ${{ stage }}'
    variables:
      - group: groupvars_${{ stage }}
    jobs:
    - deployment: deploy_${{ stage }}
      displayName: 'Deploy app to ${{ stage }}'
      environment: ${{ stage }}
      strategy:
        runOnce:
          deploy:
            steps:
            - bash: |
                echo '$(mygroupvar)'
              displayName: 'Deploy Steps'

Parameter Change

Another way to define the dynamic group is to update the parameter object to provide additional configuration e.g.

parameters:
- name: stages
  type: object
  default:
    - name: 'dev'
      group: 'groupvars_dev'
    - name: 'prod'
      group: 'groupvars_prod'

   ...

- ${{ each stage in parameters.stages }}:
  - stage: ${{ stage.name }}
    displayName: 'Deploy to ${{ stage.name }}'
    variables:
      - group: ${{ stage.group }}
    jobs:
    - deployment: deploy_${{ stage.name }}
      displayName: 'Deploy app to ${{ stage.name }}'
      environment: ${{ stage.name }}
      strategy:
        runOnce:
          deploy:
            steps:
            - bash: |
                echo '$(mygroupvar)'
              displayName: 'Deploy Steps'

Both ways of adding the variable group dynamically achieved the same goal and loaded in the expected group when each stage ran.

Variable Templates

Variable groups are not the only way to dynamically load variables; you could also use variable templates. Let’s say I have a variable template for each environment, vars_dev.yml and vars_prod.yml.
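
For reference, a variable template is just a file with a variables block; a hypothetical vars_dev.yml might look like this:

# vars_dev.yml
variables:
  myfilevar: 'value-from-dev-file'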

Using dynamic naming you can load the variables like this:

- ${{ each stage in parameters.stages }}:
  - stage: ${{ stage }}
    displayName: 'Deploy to ${{ stage }}'
    variables:
      - template: vars_${{ stage }}.yml
    jobs:
    - deployment: deploy_${{ stage }}
      displayName: 'Deploy app to ${{ stage }}'
      environment: ${{ stage }}
      strategy:
        runOnce:
          deploy:
            steps:
            - bash: |
                echo '$(myfilevar)'
              displayName: 'Deploy Steps'

Now that variable files and groups are being added, adding a new stage becomes a little more complex, as I would need to create the matching group and template for the new environment as well.

Shared Template

Now I have a dynamic multistage pipeline, how can I create a template to share with other projects?

Before I answer that, I should say that I usually use a separate repository for shared templates; that way I can version them. I covered this in a previous post if you want some more information.

Ok, on to the how. Based on the above scenario, wouldn’t it be great to have a really simple pipeline that concentrated on just the steps, like this?

trigger: none
pr: none

pool: 
  vmImage: 'ubuntu-latest'

resources:
  repositories:
    - repository: templates
      type: git
      name: shared-templates
      ref: main

extends:
  template: environments.yml@templates
  parameters:
    variableFilePrefix: 'vars'
    buildSteps:
        # Steps to perform the build and/or package of code or IaC
    releaseSteps:
       # Steps to perform the deployment

This could be your boilerplate code for multiple projects extending from a base template. You might be asking: how do I create such a template?

Let’s convert what we started with into a template a bit at a time.

Firstly, create a new file, e.g. environments.yml, to be the base template and add the parameters that make up the stage configuration:

parameters:
- name: stages
  type: object
  default:
    - 'dev'
    - 'prod'

Next, add the build stage up to the steps:

stages:
- stage: build
  displayName: 'Build/Package Code or IaC'  
  jobs:  
  - job: build
    displayName: 'Build/Package Code'
    steps:

At this point we need to be able to pass in the build steps. Using the Azure Pipelines built-in type stepList, we can add a parameter ‘buildSteps’:

parameters:
- name: stages
  type: object
  default:
    - 'dev'
    - 'prod'
- name: buildSteps  
  type: stepList  
  default: []

stages:
- stage: build
  displayName: 'Build/Package Code or IaC'  
  jobs:  
  - job: build
    displayName: 'Build/Package Code'
    steps: ${{ parameters.buildSteps }}

Next, add the dynamic stages up to the steps:

- ${{ each stage in parameters.stages }}:
  - stage: ${{ stage }}
    displayName: 'Deploy to ${{ stage }}'
    jobs:
    - deployment: deploy_${{ stage }}
      displayName: 'Deploy app to ${{ stage }}'
      environment: ${{ stage }}
      strategy:
        runOnce:
          deploy:
            steps:

And then, as before, add a stepList for the release steps:

parameters:
- name: stages
  type: object
  default:
    - 'dev'
    - 'prod'
- name: buildSteps  
  type: stepList  
  default: []
- name: releaseSteps  
  type: stepList  
  default: []

stages:
- stage: build
  displayName: 'Build/Package Code or IaC'  
  jobs:  
  - job: build
    displayName: 'Build/Package Code'
    # Steps to perform the build and/or package of code or IaC
    steps: ${{ parameters.buildSteps }}

- ${{ each stage in parameters.stages }}:
  - stage: ${{ stage }}
    displayName: 'Deploy to ${{ stage }}'
    variables:
      - template: vars_${{ stage }}.yml
    jobs:
    - deployment: deploy_${{ stage }}
      displayName: 'Deploy app to ${{ stage }}'
      environment: ${{ stage }}
      strategy:
        runOnce:
          deploy:
            steps: ${{ parameters.releaseSteps }}

The next part is adding support for variable groups and/or templates. This can be achieved by adding two parameters for the name prefixes, e.g.

- name: variableGroupPrefix  
  type: string  
  default: ''  
- name: variableFilePrefix  
  type: string  
  default: ''  

There will also need to be a check to only load the group and/or file if the parameter is not empty ('').

parameters:
- name: stages
  type: object
  default:
    - 'dev'
    - 'prod'
- name: buildSteps  
  type: stepList  
  default: []
- name: releaseSteps  
  type: stepList  
  default: []
- name: variableGroupPrefix  
  type: string  
  default: ''  
- name: variableFilePrefix  
  type: string  
  default: ''

stages:
- stage: build
  displayName: 'Build/Package Code or IaC'  
  jobs:  
  - job: build
    displayName: 'Build/Package Code'
    # Steps to perform the build and/or package of code or IaC
    steps: ${{ parameters.buildSteps }}

- ${{ each stage in parameters.stages }}:
  - stage: ${{ stage }}
    displayName: 'Deploy to ${{ stage }}'
    variables:
      - ${{ if ne(parameters.variableGroupPrefix, '') }}:
        - group: ${{ parameters.variableGroupPrefix }}_${{ stage }}
      - ${{ if ne(parameters.variableFilePrefix, '') }}:
        - template: ${{ parameters.variableFilePrefix }}_${{ stage }}.yml
    jobs:
    - deployment: deploy_${{ stage }}
      displayName: 'Deploy app to ${{ stage }}'
      environment: ${{ stage }}
      strategy:
        runOnce:
          deploy:
            steps: ${{ parameters.releaseSteps }}

Note: If I were running this template from the same repository, loading the variable file would be fine, but when it’s in a separate repository there needs to be a slight adjustment: add @self on the end so the file loads from the calling repository instead of the remote template repository.

- template: ${{ parameters.variableFilePrefix }}_${{ stage }}.yml@self

And that is it: one base template that handles the desired configuration, ready for reuse.

Expanding the Concept

Let’s say you had a requirement to deploy multiple projects’ IaC (Infrastructure as Code) and applications to multiple subscriptions and multiple regions in your Azure estate. How nice would it be to define all of that in a central configuration? Here is one possible configuration for such a requirement:

parameters:
- name: environments
  type: object
  default:
  - name: 'dev'
    subscriptions:
      - subscription: 'Dev Subscription'
        regions:
          - location: 'westus'
            locationShort: 'wus'
  - name: 'prod'
    subscriptions:
      - subscription: 'Prod Subscription'
        regions:
          - location: 'eastus'
            locationShort: 'eus'
          - location: 'westus'
            locationShort: 'wus'
- name: buildSteps
  type: stepList
  default: []
- name: releaseSteps
  type: stepList
  default: []
- name: customReleaseTemplate
  type: string
  default: ''
- name: variableGroupPrefix
  type: string
  default: ''
- name: variableFilePrefix
  type: string
  default: ''

stages:
- stage: build
  displayName: 'Build/Package Code or IaC'
  jobs:
  - job: build
    displayName: 'Build/Package Code'
    steps: ${{ parameters.buildSteps }}

- ${{ each env in parameters.environments }}:
  - stage: ${{ env.name }}
    displayName: 'Deploy to ${{ env.name }}'
    condition: succeeded()
    variables:
      - ${{ if ne(parameters.variableFilePrefix, '') }}:
        - template: ${{ parameters.variableFilePrefix }}_${{ env.name }}.yml@self
      - ${{ if ne(parameters.variableGroupPrefix, '') }}:
        - group: ${{ parameters.variableGroupPrefix }}_${{ env.name }}
    jobs:
    - ${{ each sub in env.subscriptions }}:
      - ${{ each region in sub.regions }}:
        - ${{ if ne(parameters.customReleaseTemplate, '') }}:
          - template: ${{ parameters.customReleaseTemplate }}
            parameters:
              env: ${{ env.name }}
              location: ${{ region.location }}
              locationShort: ${{ region.locationShort }}
              subscription: ${{ sub.subscription }}
        - ${{ else }}:
          - deployment: deploy_${{ region.locationShort }}
            displayName: 'Deploy app to ${{ env.name }} in ${{ region.location }}'
            environment: ${{ env.name }}_${{ region.locationShort }}
            strategy:
              runOnce:
                deploy:
                  steps:
                  - ${{ parameters.releaseSteps }}

You may notice that with this configuration there is an option for a custom release template where you can override the job(s) required; you would just need to make sure the template includes the parameters supplied from the base template:

parameters:
- name: env
  type: string
- name: location
  type: string
- name: locationShort
  type: string
- name: subscription
  type: string

Then you can add the custom jobs for a given project.
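
To make that concrete, here is a sketch of what such a custom release template might look like; the AzureCLI@2 task is just an example step, and it assumes the subscription value maps to the name of a service connection:

# custom-release.yml (example custom release template)
parameters:
- name: env
  type: string
- name: location
  type: string
- name: locationShort
  type: string
- name: subscription
  type: string

jobs:
- deployment: deploy_${{ parameters.locationShort }}
  displayName: 'Deploy app to ${{ parameters.env }} in ${{ parameters.location }}'
  environment: ${{ parameters.env }}_${{ parameters.locationShort }}
  strategy:
    runOnce:
      deploy:
        steps:
        - task: AzureCLI@2
          displayName: 'Deploy to ${{ parameters.location }}'
          inputs:
            azureSubscription: ${{ parameters.subscription }}   # assumes a service connection with this name
            scriptType: bash
            scriptLocation: inlineScript
            inlineScript: |
              echo "Deploying to ${{ parameters.env }} (${{ parameters.location }})"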

Final Thoughts

Shared templates are so powerful, and combined with the often-forgotten built-in types step, stepList, job, jobList, deployment, deploymentList, stage, and stageList, they really allow for some interesting templates to be created.

For additional information see the Azure Pipelines Parameters docs.

You are no doubt thinking: this all sounds very good, but what about a real application of such a template? In the next post I will use this last template to deploy some Infrastructure as Code to Azure and then deploy an application into that infrastructure to show real usage.