
Part 2 – Running Dapr Locally: Setup, Run, and Debug Your First Service

In Part 1, we explored what Dapr is and why it exists. Now it’s time to make it real. Before you can use state management, pub/sub, or any other building block, you need a smooth local development workflow, one that feels natural, fast, and familiar.

Dapr is often associated with Kubernetes and cloud deployments, but most development happens on a laptop. If Dapr doesn’t fit cleanly into your inner loop, it won’t be adopted at all. This post focuses on exactly that: running and debugging Dapr locally, using the same workflow you’d expect for any other service.

What “Running Dapr Locally” Actually Means

Running Dapr locally does not mean:

  • Running Kubernetes
  • Deploying to the cloud
  • Learning a new development model

It means:

  • Running your application as a normal process
  • Running Dapr as a sidecar alongside it
  • Using local infrastructure (or containers) for dependencies

Dapr was designed for fast, iterative development, and that’s what we’ll focus on here.

Installing Dapr Locally

Dapr consists of two main parts:

  • The Dapr CLI
  • The Dapr runtime

Once the CLI is installed, initialising Dapr locally is a one‑time step:

dapr init

This sets up:

  • The Dapr runtime
  • A local Redis instance (used by default for state and pub/sub)
  • The placement service (used only for actors)

You don’t need to understand all of these yet. The important part is: Dapr now has everything it needs to run locally.
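If you want to confirm what dapr init set up, a couple of quick checks go a long way (the container names below are what a default Docker‑based install typically creates):

dapr --version
docker ps --filter "name=dapr_"
ls ~/.dapr/components

On a default install you should see dapr_redis and dapr_placement containers (and usually dapr_zipkin for local tracing), plus statestore.yaml and pubsub.yaml in the default components directory.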

Note: In local mode, Dapr loads components at startup and does not hot‑reload them. In Kubernetes, components can be updated dynamically.

Your First Local Dapr App

At its simplest, running an app with Dapr looks like this:

.NET example

dapr run \
  --app-id myapp \
  --app-port 8080 \
  --dapr-http-port 3500 \
  -- dotnet run

Or for Go:

Go example

dapr run \
  --app-id myapp \
  --app-port 8080 \
  --dapr-http-port 3500 \
  -- go run main.go

What’s happening here:

  • Your application runs exactly as it normally would
  • Dapr starts a sidecar process alongside it
  • Dapr listens on port 3500
  • Your app listens on its own port (e.g. 8080)

From your application’s point of view, nothing special is happening, and that’s the point.
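To underline that, the app behind the Go example above can be nothing more than a plain HTTP server on port 8080. This is a minimal sketch rather than the code from the series; there is no Dapr SDK and no special wiring:

package main

import (
    "fmt"
    "log"
    "net/http"
)

func main() {
    // An entirely ordinary handler; Dapr never touches this code.
    http.HandleFunc("/hello", func(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintln(w, "hello from myapp")
    })

    // Must match the --app-port passed to dapr run.
    log.Println("listening on :8080")
    log.Fatal(http.ListenAndServe(":8080", nil))
}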

Understanding the Local Architecture

Locally, the architecture looks like this:

Your App (8080)
      ↓
Dapr Sidecar (3500)
      ↓
Local Infrastructure (Redis, etc.)

Your application:

  • Receives HTTP requests as usual
  • Calls Dapr via HTTP or gRPC when it needs state, pub/sub, or bindings

Dapr:

  • Handles communication with infrastructure
  • Manages retries, timeouts, and serialisation
  • Emits logs and metrics independently

This separation is key to understanding how Dapr fits into your workflow.
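To make the separation concrete, here is what a state call to the sidecar looks like over plain HTTP, assuming the sidecar is on port 3500 and the default state store component named statestore:

curl -X POST http://localhost:3500/v1.0/state/statestore \
  -H "Content-Type: application/json" \
  -d '[{ "key": "order-1", "value": { "status": "created" } }]'

curl http://localhost:3500/v1.0/state/statestore/order-1

Your application only ever talks to the sidecar; the sidecar talks to Redis.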

Adding Components Locally

Dapr integrations are configured using components, which are simple YAML files.

Locally, components are usually placed in a components/ directory:

components/
└── statestore.yaml
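For reference, the Redis state store that dapr init creates looks roughly like this, and your own components follow the same shape:

apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: statestore
spec:
  type: state.redis
  version: v1
  metadata:
  - name: redisHost
    value: localhost:6379
  - name: redisPassword
    value: ""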

When you run Dapr, you point it at this directory:

dapr run \
  --app-id myapp \
  --app-port 8080 \
  --components-path ./components \
  -- dotnet run

This mirrors how Dapr is configured in production: the same components, the same structure, just running locally.

Note: If you don’t specify a components path, Dapr uses the default directory at ~/.dapr/components.

Debugging with Dapr

This is where Dapr fits surprisingly well into normal development workflows.

Debugging the application

Your application runs as a normal process:

  • Attach a debugger
  • Set breakpoints
  • Step through code
  • Inspect variables

Nothing about Dapr changes this.

Debugging Dapr itself

Dapr runs as a separate process with its own logs. In self‑hosted mode, the sidecar’s output streams to the same terminal as dapr run, with application lines prefixed by == APP == so the two are easy to tell apart.

To see which apps and sidecars are currently running:

dapr list

(The dapr logs command also exists, but it targets Kubernetes deployments rather than local runs.)

This separation makes it easier to answer an important question:

“Is this a bug in my application, or a configuration/infrastructure issue?”

Common Local Pitfalls

A few things that commonly trip people up:

Port conflicts

Dapr needs its own HTTP and gRPC ports.

Forgetting to restart Dapr

Component changes require restarting the sidecar.

Confusing app logs with Dapr logs

They are separate processes, check both.

Missing components path

If Dapr can’t find your components, integrations won’t work.
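A fully explicit dapr run command sidesteps most of these; the port values are just examples:

dapr run \
  --app-id myapp \
  --app-port 8080 \
  --dapr-http-port 3500 \
  --dapr-grpc-port 50001 \
  --components-path ./components \
  -- dotnet run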

Once you understand these, local development becomes predictable and fast.

Why This Matters for the Rest of the Series

Everything else in this series builds on this local setup:

  • State management
  • Pub/Sub
  • Bindings and storage
  • End‑to‑end workflows

The same dapr run workflow applies everywhere. Once you’re comfortable running and debugging Dapr locally, the rest of the building blocks feel much less intimidating.

What’s Next

Now that we can run and debug Dapr locally, we can start using it for real work.

In the next post, we’ll look at State Management with Dapr, using Redis and Postgres running locally with the setup described here.


Part 5 – Troubleshooting, Scaling, and Production Hardening

By this point in the series, you have a fully working, multi‑environment observability pipeline, deployed and reconciled entirely through GitOps:

  • Jaeger v2 running on the OpenTelemetry Collector
  • Applications emitting traces automatically via the OpenTelemetry Operator
  • Environment‑scoped Collectors and Instrumentation CRs
  • Argo CD managing everything through ApplicationSets and sync waves

This final part focuses on what matters most in real‑world environments: operability. Deploying Jaeger v2 is easy. Running it reliably at scale with predictable performance, clear failure modes, and secure communication is where engineering judgment comes in.

This guide covers the most important lessons learned from operating OpenTelemetry and Jaeger in production.

All manifests, ApplicationSets, and configuration used in this series are available in the companion GitHub repository.

🩺 Troubleshooting: The Most Common Issues (and How to Fix Them)

1. “I don’t see any traces.”

This is the most common issue, and it almost always comes down to one of three things:

a. Wrong OTLP endpoint

Check the app’s environment variables (injected by the Operator):

  • OTEL_EXPORTER_OTLP_ENDPOINT
  • OTEL_EXPORTER_OTLP_TRACES_ENDPOINT

If these are missing, your Instrumentation CR is not configured with an exporter.

Protocol mismatch also matters:

Your Instrumentation CR should point to port 4318 (OTLP over HTTP) for .NET auto‑instrumentation.
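As a reference point, a .NET‑focused Instrumentation CR along these lines covers both the endpoint and the protocol. Treat it as a sketch: the resource name is a placeholder, and the exact manifest in the companion repository may differ.

apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: dotnet-instrumentation
  namespace: apps
spec:
  exporter:
    # OTLP over HTTP for .NET auto-instrumentation
    endpoint: http://jaeger-inmemory-instance-collector.monitoring.svc.cluster.local:4318
  propagators:
    - tracecontext
    - baggage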

b. Collector not listening on the expected ports

Verify:

kubectl get svc jaeger-inmemory-instance-collector -n monitoring

In this architecture, the Collector runs in the monitoring namespace, while instrumented workloads run in the apps namespace.

You must see:

  • 4318 (OTLP HTTP)
  • 4317 (OTLP gRPC)

c. Auto‑instrumentation not activated

For .NET, the Operator must inject:

  • DOTNET_STARTUP_HOOKS
  • CORECLR_ENABLE_PROFILING=1
  • CORECLR_PROFILER
  • CORECLR_PROFILER_PATH

If any of these are missing:

  • The annotation is wrong
  • The Instrumentation CR is missing
  • The Operator webhook failed to mutate the pod
  • The workload was deployed before the Operator (sync wave ordering issue)
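A quick way to see what the webhook actually injected is to dump the environment of a running pod (the deployment name here is a placeholder):

kubectl exec -n apps deploy/my-dotnet-app -- env | grep -E "OTEL|CORECLR|DOTNET_STARTUP"

If the CORECLR_* variables are absent, the pod was never mutated; redeploying the workload after the Operator and Instrumentation CR are in place usually resolves it.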

2. “Traces appear, but they’re incomplete.”

Common causes:

  • Missing propagation headers
  • Reverse proxies stripping traceparent
  • Sampling too aggressive
  • Instrumentation library not loaded

For .NET, ensure:

  • OTEL_PROPAGATORS=tracecontext,baggage
  • No middleware overwrites headers

3. “Collector is dropping spans.”

Check Collector logs:

kubectl logs deploy/jaeger-inmemory-instance-collector -n monitoring

Look for:

  • batch processor timeout
  • queue full
  • exporter failed

Fixes:

  • Increase batch processor size
  • Increase memory limits
  • Add more Collector replicas
  • Use a more performant storage backend
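For the batch and memory settings, the relevant knobs live in the Collector’s processor configuration. The values below are illustrative starting points rather than recommendations for every workload:

processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 800
    spike_limit_mib: 200
  batch:
    send_batch_size: 2048
    send_batch_max_size: 4096
    timeout: 5s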

📈 Scaling the Collector

The OpenTelemetry Collector is extremely flexible, but scaling it requires understanding its architecture.

Horizontal scaling

You can run multiple Collector replicas behind a Service. This works well when:

  • Apps send OTLP over gRPC (load‑balanced)
  • You use stateless exporters (e.g., Tempo, OTLP → another Collector)
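If the Jaeger v2 instance is deployed as an Operator‑managed OpenTelemetryCollector resource, as in the earlier parts, horizontal scaling is a small change on the CR (the apiVersion depends on your Operator version):

apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: jaeger-inmemory-instance
  namespace: monitoring
spec:
  mode: deployment
  replicas: 3   # only meaningful with a shared storage backend, not memstore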

Vertical scaling

Increase CPU/memory when:

  • You use heavy processors (tail sampling, attributes filtering)
  • You export to slower backends (Elasticsearch, Cassandra)

Pipeline separation

For large systems, split pipelines:

  • Gateway Collectors – Receive traffic from apps
  • Aggregation Collectors – Apply sampling, filtering
  • Export Collectors – Write to storage

This isolates concerns and improves reliability.
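In configuration terms, a gateway Collector simply forwards OTLP to the next tier instead of writing to storage. This sketch reuses the receiver and processor definitions shown earlier; the hostname is a placeholder:

exporters:
  otlp:
    endpoint: aggregation-collector.monitoring.svc.cluster.local:4317
    tls:
      insecure: false

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp]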

🗄️ Choosing a Production Storage Backend

The demo uses memstore, which is perfect for local testing but not for production. Real deployments typically use:

1. Tempo (Grafana)

  • Highly scalable
  • Cheap object storage
  • Great for high‑volume traces
  • No indexing required

2. Elasticsearch

  • Mature
  • Powerful search
  • Higher operational cost

3. ClickHouse (via Jaeger or SigNoz)

  • Extremely fast
  • Efficient storage
  • Great for long retention

4. Cassandra

  • Historically used by Jaeger v1
  • Still supported
  • Operationally heavy

For most modern setups, Tempo and ClickHouse are the best choices.

🔐 Security Considerations

1. TLS everywhere

Enable TLS for:

  • OTLP ingestion
  • Collector → backend communication
  • Jaeger UI

2. mTLS for workloads

The Collector supports mTLS for OTLP:

  • Prevents spoofed telemetry
  • Ensures only trusted workloads send data
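On the Collector side, mTLS for OTLP ingestion is configured on the receiver. The certificate paths below are illustrative and would normally be mounted from a Secret:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        tls:
          cert_file: /etc/otel/certs/tls.crt
          key_file: /etc/otel/certs/tls.key
          client_ca_file: /etc/otel/certs/ca.crt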

3. Network policies

Lock down:

  • Collector ports
  • Storage backend
  • Jaeger UI

4. Secrets management

Use:

  • Kubernetes Secrets (encrypted at rest)
  • External secret stores (Vault, SSM, Azure Key Vault)

Never hardcode credentials in Collector configs.

🧪 Sampling Strategies

Sampling is one of the most misunderstood parts of tracing. The wrong sampling strategy can make your traces useless.

Head sampling (default)

  • Simple
  • Fast
  • Drops spans early
  • Good for high‑volume systems

Tail sampling

  • Makes decisions after seeing the full trace
  • Better for error‑focused sampling
  • More expensive
  • Requires dedicated Collector pipelines

Adaptive sampling

  • Dynamically adjusts sampling rates
  • Useful for spiky workloads

Best practice

Start with head sampling, then introduce tail sampling only if you need it.
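Concretely, head sampling can be set at the SDK level through the Instrumentation CR, while tail sampling lives in a dedicated Collector processor. Both snippets below are sketches with illustrative ratios.

On the Instrumentation CR (head sampling):

spec:
  sampler:
    type: parentbased_traceidratio
    argument: "0.25"   # keep ~25% of root traces; children follow the parent's decision

In a dedicated Collector pipeline (tail sampling):

processors:
  tail_sampling:
    decision_wait: 10s
    policies:
      - name: keep-errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: baseline
        type: probabilistic
        probabilistic:
          sampling_percentage: 10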

🌐 Multi‑Cluster and Multi‑Environment Patterns

As your platform grows, you may need:

1. Per‑cluster Collectors, shared backend

Each cluster runs its own Collector, exporting to a central storage backend.

2. Centralized Collector fleet

Apps send OTLP to a global Collector layer.

3. GitOps per environment

Structure your repo like:

environments/
  dev/
  staging/
  prod/

This series includes only dev, but the structure supports adding staging and prod easily.
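An ApplicationSet with a Git directory generator maps cleanly onto that layout, creating one Argo CD Application per environment directory. The repository URL here is a placeholder:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: observability
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/your-org/observability-gitops.git
        revision: main
        directories:
          - path: environments/*
  template:
    metadata:
      name: 'observability-{{path.basename}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/your-org/observability-gitops.git
        targetRevision: main
        path: '{{path}}'
      destination:
        server: https://kubernetes.default.svc
        namespace: monitoring
      syncPolicy:
        automated:
          prune: true
          selfHeal: true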

Each environment can have:

  • Different sampling
  • Different storage
  • Different Collector pipelines

🧭 Final Thoughts

Jaeger v2, OpenTelemetry, and GitOps form a powerful, modern observability stack. Across this series, you’ve built:

  • A Jaeger v2 deployment using the OpenTelemetry Collector
  • A .NET application emitting traces with zero code changes
  • A GitOps workflow that keeps everything declarative and self‑healing
  • A production‑ready understanding of scaling, troubleshooting, and hardening

This is the kind of architecture that scales with your platform, not against it. It’s simple where it should be simple, flexible where it needs to be flexible, and grounded in open standards.