Part 29: Distributed Tracing Deep Dive - Following Requests Across Services
"In a distributed system, a single user request might touch dozens of services. Distributed tracing is how you follow that request's journey and understand what actually happened."
Welcome to Part 29 of our distributed systems course! After exploring service mesh architecture, we now take a deep dive into distributed tracing - essential for understanding request flows across services.
Why Distributed Tracing?
Traditional logging falls short in distributed systems:
- No correlation: Hard to connect logs across services
- Lost context: Timing and causality are unclear
- Needle in haystack: Finding relevant logs is difficult
Distributed tracing provides:
- End-to-end visibility: See the entire request journey
- Performance analysis: Identify bottlenecks
- Dependency mapping: Understand service relationships
- Root cause analysis: Debug production issues
Core Concepts
Traces, Spans, and Context
```go
package tracing

// Imports cover every snippet in this part, which all live in one package.
import (
	"bytes"
	"context"
	"crypto/rand"
	"database/sql"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"hash/fnv"
	"net/http"
	"os"
	"sync"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/metadata"
	"google.golang.org/grpc/status"
)

// TraceID uniquely identifies a trace
type TraceID string

// SpanID uniquely identifies a span within a trace
type SpanID string

// Span represents a unit of work in a trace
type Span struct {
	TraceID       TraceID           `json:"trace_id"`
	SpanID        SpanID            `json:"span_id"`
	ParentSpanID  SpanID            `json:"parent_span_id,omitempty"`
	OperationName string            `json:"operation_name"`
	ServiceName   string            `json:"service_name"`
	StartTime     time.Time         `json:"start_time"`
	Duration      time.Duration     `json:"duration"`
	Status        SpanStatus        `json:"status"`
	Tags          map[string]string `json:"tags,omitempty"`
	Logs          []SpanLog         `json:"logs,omitempty"`
	References    []SpanReference   `json:"references,omitempty"`

	// Internal
	mu       sync.Mutex
	finished bool
	tracer   *Tracer
}

// SpanStatus represents the outcome of a span
type SpanStatus int

const (
	SpanStatusUnset SpanStatus = iota
	SpanStatusOK
	SpanStatusError
)

// SpanLog represents a log entry within a span
type SpanLog struct {
	Timestamp time.Time         `json:"timestamp"`
	Fields    map[string]string `json:"fields"`
}

// SpanReference represents a relationship to another span
type SpanReference struct {
	Type    ReferenceType `json:"type"`
	TraceID TraceID       `json:"trace_id"`
	SpanID  SpanID        `json:"span_id"`
}

// ReferenceType defines the relationship between spans
type ReferenceType int

const (
	ChildOf ReferenceType = iota
	FollowsFrom
)

// SpanContext carries trace context across boundaries
type SpanContext struct {
	TraceID    TraceID
	SpanID     SpanID
	Baggage    map[string]string
	TraceFlags byte
}

// generateID generates a random 8-byte ID, hex-encoded to 16 characters
func generateID() string {
	b := make([]byte, 8)
	rand.Read(b)
	return hex.EncodeToString(b)
}

// NewTraceID creates a new 16-byte trace ID, hex-encoded to 32 characters
func NewTraceID() TraceID {
	b := make([]byte, 16)
	rand.Read(b)
	return TraceID(hex.EncodeToString(b))
}

// NewSpanID creates a new span ID
func NewSpanID() SpanID {
	return SpanID(generateID())
}
```
The Tracer
```go
// Tracer creates and manages spans
type Tracer struct {
	serviceName string
	exporter    SpanExporter
	sampler     Sampler
	propagator  Propagator
	activeSpans sync.Map
}

// TracerConfig configures a tracer
type TracerConfig struct {
	ServiceName string
	Exporter    SpanExporter
	Sampler     Sampler
	Propagator  Propagator
}

// NewTracer creates a new tracer
func NewTracer(cfg TracerConfig) *Tracer {
	if cfg.Sampler == nil {
		cfg.Sampler = AlwaysSample()
	}
	if cfg.Propagator == nil {
		cfg.Propagator = NewW3CPropagator()
	}

	return &Tracer{
		serviceName: cfg.ServiceName,
		exporter:    cfg.Exporter,
		sampler:     cfg.Sampler,
		propagator:  cfg.Propagator,
	}
}

// StartSpan starts a new span
func (t *Tracer) StartSpan(ctx context.Context, operationName string, opts ...SpanOption) (context.Context, *Span) {
	cfg := &spanConfig{}
	for _, opt := range opts {
		opt(cfg)
	}

	var traceID TraceID
	var parentSpanID SpanID

	// A span already in the context takes precedence over an explicit parent
	if parentSpan := SpanFromContext(ctx); parentSpan != nil {
		traceID = parentSpan.TraceID
		parentSpanID = parentSpan.SpanID
	} else if cfg.parentContext != nil {
		traceID = cfg.parentContext.TraceID
		parentSpanID = cfg.parentContext.SpanID
	} else {
		traceID = NewTraceID()
	}

	// Apply sampling decision
	if !t.sampler.ShouldSample(traceID) {
		// Return a no-op span. It is already finished, so every setter is a
		// no-op and nothing is exported. It still carries the trace ID and is
		// stored in the context, so child spans reuse the trace ID and
		// SpanFromContext never hands callers a half-initialized span.
		noop := &Span{
			TraceID:      traceID,
			SpanID:       NewSpanID(),
			ParentSpanID: parentSpanID,
			finished:     true,
		}
		return ContextWithSpan(ctx, noop), noop
	}

	// Honor WithStartTime; default to now
	startTime := cfg.startTime
	if startTime.IsZero() {
		startTime = time.Now()
	}

	span := &Span{
		TraceID:       traceID,
		SpanID:        NewSpanID(),
		ParentSpanID:  parentSpanID,
		OperationName: operationName,
		ServiceName:   t.serviceName,
		StartTime:     startTime,
		Tags:          make(map[string]string),
		tracer:        t,
	}

	// Add default tags
	span.Tags["service.name"] = t.serviceName

	// Apply custom tags
	for k, v := range cfg.tags {
		span.Tags[k] = v
	}

	// Store in active spans
	t.activeSpans.Store(span.SpanID, span)

	return ContextWithSpan(ctx, span), span
}

// spanConfig holds span creation options
type spanConfig struct {
	parentContext *SpanContext
	tags          map[string]string
	startTime     time.Time
}

// SpanOption configures a span
type SpanOption func(*spanConfig)

// WithParentContext sets the parent context
func WithParentContext(ctx *SpanContext) SpanOption {
	return func(cfg *spanConfig) {
		cfg.parentContext = ctx
	}
}

// WithTags sets initial tags
func WithTags(tags map[string]string) SpanOption {
	return func(cfg *spanConfig) {
		cfg.tags = tags
	}
}

// WithStartTime sets the start time
func WithStartTime(t time.Time) SpanOption {
	return func(cfg *spanConfig) {
		cfg.startTime = t
	}
}

// Finish completes a span and hands it to the exporter exactly once
func (s *Span) Finish() {
	s.mu.Lock()
	defer s.mu.Unlock()

	if s.finished {
		return
	}

	s.finished = true
	s.Duration = time.Since(s.StartTime)

	if s.tracer != nil {
		s.tracer.activeSpans.Delete(s.SpanID)
		if s.tracer.exporter != nil {
			s.tracer.exporter.Export(s)
		}
	}
}

// SetTag sets a tag on the span
func (s *Span) SetTag(key, value string) *Span {
	s.mu.Lock()
	defer s.mu.Unlock()

	if !s.finished {
		if s.Tags == nil {
			// Spans not created by StartSpan may lack a tag map
			s.Tags = make(map[string]string)
		}
		s.Tags[key] = value
	}
	return s
}

// SetStatus sets the span status
func (s *Span) SetStatus(status SpanStatus) *Span {
	s.mu.Lock()
	defer s.mu.Unlock()

	if !s.finished {
		s.Status = status
	}
	return s
}

// LogEvent logs an event in the span
func (s *Span) LogEvent(event string, fields map[string]string) *Span {
	s.mu.Lock()
	defer s.mu.Unlock()

	if !s.finished {
		log := SpanLog{
			Timestamp: time.Now(),
			Fields:    make(map[string]string),
		}
		log.Fields["event"] = event
		for k, v := range fields {
			log.Fields[k] = v
		}
		s.Logs = append(s.Logs, log)
	}
	return s
}

// RecordError records an error in the span; a nil error is ignored
func (s *Span) RecordError(err error) *Span {
	if err == nil {
		return s
	}
	s.SetTag("error", "true")
	s.SetTag("error.message", err.Error())
	s.SetStatus(SpanStatusError)
	s.LogEvent("error", map[string]string{
		"error.object": err.Error(),
	})
	return s
}

// Context key for spans
type spanContextKey struct{}

// ContextWithSpan returns a context with the span
func ContextWithSpan(ctx context.Context, span *Span) context.Context {
	return context.WithValue(ctx, spanContextKey{}, span)
}

// SpanFromContext retrieves a span from context, or nil if there is none
func SpanFromContext(ctx context.Context) *Span {
	if span, ok := ctx.Value(spanContextKey{}).(*Span); ok {
		return span
	}
	return nil
}
```
Context Propagation
W3C Trace Context
```go
// Propagator handles trace context propagation
type Propagator interface {
	Inject(ctx context.Context, carrier Carrier)
	Extract(ctx context.Context, carrier Carrier) context.Context
}

// Carrier carries trace context
type Carrier interface {
	Get(key string) string
	Set(key, value string)
}

// HTTPHeaderCarrier adapts http.Header to Carrier
type HTTPHeaderCarrier http.Header

func (c HTTPHeaderCarrier) Get(key string) string {
	return http.Header(c).Get(key)
}

func (c HTTPHeaderCarrier) Set(key, value string) {
	http.Header(c).Set(key, value)
}

// W3CPropagator implements W3C Trace Context
type W3CPropagator struct{}

func NewW3CPropagator() *W3CPropagator {
	return &W3CPropagator{}
}

// W3C Trace Context header names
const (
	TraceparentHeader = "traceparent"
	TracestateHeader  = "tracestate"
)

// Inject injects trace context into carrier
func (p *W3CPropagator) Inject(ctx context.Context, carrier Carrier) {
	span := SpanFromContext(ctx)
	if span == nil {
		return
	}

	// Format: version-traceid-spanid-flags
	// Example: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01
	traceparent := fmt.Sprintf("00-%s-%s-%02x",
		span.TraceID,
		span.SpanID,
		0x01, // sampled flag
	)

	carrier.Set(TraceparentHeader, traceparent)
}

// Extract extracts trace context from carrier
func (p *W3CPropagator) Extract(ctx context.Context, carrier Carrier) context.Context {
	traceparent := carrier.Get(TraceparentHeader)
	if traceparent == "" {
		return ctx
	}

	// Parse traceparent header; ignore it unless it has four fields with a
	// 32-character trace ID and a 16-character span ID
	parts := splitTraceparent(traceparent)
	if len(parts) != 4 || len(parts[1]) != 32 || len(parts[2]) != 16 {
		return ctx
	}

	spanCtx := &SpanContext{
		TraceID:    TraceID(parts[1]),
		SpanID:     SpanID(parts[2]),
		TraceFlags: parseTraceFlags(parts[3]),
		Baggage:    make(map[string]string),
	}

	// Store the remote caller as a placeholder span. StartSpan reads its
	// TraceID and SpanID, so the next span started from this context
	// becomes a child of the remote span.
	return context.WithValue(ctx, spanContextKey{}, &Span{
		TraceID: spanCtx.TraceID,
		SpanID:  spanCtx.SpanID,
	})
}

// splitTraceparent splits a header value on '-'
func splitTraceparent(traceparent string) []string {
	var parts []string
	current := ""
	for _, c := range traceparent {
		if c == '-' {
			parts = append(parts, current)
			current = ""
		} else {
			current += string(c)
		}
	}
	parts = append(parts, current)
	return parts
}

// parseTraceFlags decodes the two-character hex flags field
func parseTraceFlags(s string) byte {
	if len(s) >= 2 {
		b, _ := hex.DecodeString(s[:2])
		if len(b) > 0 {
			return b[0]
		}
	}
	return 0
}

// B3Propagator implements B3 propagation (Zipkin)
type B3Propagator struct {
	singleHeader bool
}

func NewB3Propagator(singleHeader bool) *B3Propagator {
	return &B3Propagator{singleHeader: singleHeader}
}

func (p *B3Propagator) Inject(ctx context.Context, carrier Carrier) {
	span := SpanFromContext(ctx)
	if span == nil {
		return
	}

	if p.singleHeader {
		// Single header format: {TraceId}-{SpanId}-{SamplingState}-{ParentSpanId}
		b3 := fmt.Sprintf("%s-%s-1", span.TraceID, span.SpanID)
		if span.ParentSpanID != "" {
			b3 += fmt.Sprintf("-%s", span.ParentSpanID)
		}
		carrier.Set("b3", b3)
	} else {
		carrier.Set("X-B3-TraceId", string(span.TraceID))
		carrier.Set("X-B3-SpanId", string(span.SpanID))
		carrier.Set("X-B3-Sampled", "1")
		if span.ParentSpanID != "" {
			carrier.Set("X-B3-ParentSpanId", string(span.ParentSpanID))
		}
	}
}

func (p *B3Propagator) Extract(ctx context.Context, carrier Carrier) context.Context {
	var traceID, spanID, parentSpanID string

	if p.singleHeader {
		b3 := carrier.Get("b3")
		if b3 == "" {
			return ctx
		}
		parts := splitTraceparent(b3) // Reuse split function
		if len(parts) >= 2 {
			traceID = parts[0]
			spanID = parts[1]
			if len(parts) >= 4 {
				parentSpanID = parts[3]
			}
		}
	} else {
		traceID = carrier.Get("X-B3-TraceId")
		spanID = carrier.Get("X-B3-SpanId")
		parentSpanID = carrier.Get("X-B3-ParentSpanId")
	}

	if traceID == "" || spanID == "" {
		return ctx
	}

	return context.WithValue(ctx, spanContextKey{}, &Span{
		TraceID:      TraceID(traceID),
		SpanID:       SpanID(spanID),
		ParentSpanID: SpanID(parentSpanID),
	})
}
```
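Headers arrive from outside your service, so it can help to validate them more strictly than the propagator above does. The following is a minimal sketch of a standalone `traceparent` parser for version `00` only; the `traceparent` struct and `parseTraceparent` function are hypothetical names, and rejecting every other version is a simplifying assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// traceparent holds the fields of a parsed header (hypothetical type).
type traceparent struct {
	TraceID  string
	ParentID string
	Sampled  bool
}

// isLowerHex reports whether s is exactly n lowercase hex characters.
func isLowerHex(s string, n int) bool {
	if len(s) != n {
		return false
	}
	for i := 0; i < len(s); i++ {
		c := s[i]
		if !(c >= '0' && c <= '9' || c >= 'a' && c <= 'f') {
			return false
		}
	}
	return true
}

// parseTraceparent validates a version-00 traceparent value:
// 00-<32 hex trace-id>-<16 hex parent-id>-<2 hex flags>.
// All-zero IDs are rejected as invalid.
func parseTraceparent(header string) (traceparent, error) {
	parts := strings.Split(header, "-")
	if len(parts) != 4 {
		return traceparent{}, errors.New("traceparent: want 4 dash-separated fields")
	}
	version, traceID, parentID, flags := parts[0], parts[1], parts[2], parts[3]

	switch {
	case version != "00":
		return traceparent{}, fmt.Errorf("traceparent: unsupported version %q", version)
	case !isLowerHex(traceID, 32) || traceID == strings.Repeat("0", 32):
		return traceparent{}, errors.New("traceparent: invalid trace-id")
	case !isLowerHex(parentID, 16) || parentID == strings.Repeat("0", 16):
		return traceparent{}, errors.New("traceparent: invalid parent-id")
	case !isLowerHex(flags, 2):
		return traceparent{}, errors.New("traceparent: invalid flags")
	}

	f, err := strconv.ParseUint(flags, 16, 8)
	if err != nil {
		return traceparent{}, err
	}
	// The lowest flag bit is the sampled flag.
	return traceparent{TraceID: traceID, ParentID: parentID, Sampled: f&0x01 == 1}, nil
}

func main() {
	tp, err := parseTraceparent("00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01")
	fmt.Println(tp, err)

	_, err = parseTraceparent("00-not-a-header")
	fmt.Println(err)
}
```

Returning an error instead of silently ignoring a malformed header lets the caller decide whether to log it, count it in a metric, or start a fresh trace.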
Sampling
```go
// Sampler determines whether to sample a trace
type Sampler interface {
	ShouldSample(traceID TraceID) bool
	Description() string
}

// AlwaysSampler always samples
type AlwaysSampler struct{}

func AlwaysSample() Sampler {
	return &AlwaysSampler{}
}

func (s *AlwaysSampler) ShouldSample(traceID TraceID) bool {
	return true
}

func (s *AlwaysSampler) Description() string {
	return "AlwaysSample"
}

// NeverSampler never samples
type NeverSampler struct{}

func NeverSample() Sampler {
	return &NeverSampler{}
}

func (s *NeverSampler) ShouldSample(traceID TraceID) bool {
	return false
}

func (s *NeverSampler) Description() string {
	return "NeverSample"
}

// ratioThreshold maps a ratio in [0, 1] onto the 64-bit hash space.
// It saturates at both ends, because converting a float64 of 2^64 or more
// to uint64 is not well defined.
func ratioThreshold(ratio float64) uint64 {
	if ratio <= 0 {
		return 0
	}
	if ratio >= 1 {
		return ^uint64(0)
	}
	return uint64(ratio * float64(^uint64(0)))
}

// RatioSampler samples at a ratio
type RatioSampler struct {
	ratio     float64
	threshold uint64
}

func RatioBasedSampler(ratio float64) Sampler {
	if ratio <= 0 {
		return NeverSample()
	}
	if ratio >= 1 {
		return AlwaysSample()
	}
	return &RatioSampler{
		ratio:     ratio,
		threshold: ratioThreshold(ratio),
	}
}

func (s *RatioSampler) ShouldSample(traceID TraceID) bool {
	// Hash trace ID to get deterministic sampling: every service that sees
	// the same trace ID and ratio reaches the same decision
	h := fnv.New64a()
	h.Write([]byte(traceID))
	return h.Sum64() < s.threshold
}

func (s *RatioSampler) Description() string {
	return fmt.Sprintf("RatioSampler{%.2f}", s.ratio)
}

// AdaptiveSampler adjusts sampling rate based on throughput
type AdaptiveSampler struct {
	targetRate     float64 // Target samples per second
	minRatio       float64
	maxRatio       float64
	currentRatio   float64
	windowCount    int64
	windowStart    time.Time
	windowDuration time.Duration
	mu             sync.Mutex
}

func NewAdaptiveSampler(targetRate, minRatio, maxRatio float64) *AdaptiveSampler {
	return &AdaptiveSampler{
		targetRate:     targetRate,
		minRatio:       minRatio,
		maxRatio:       maxRatio,
		currentRatio:   maxRatio,
		windowStart:    time.Now(),
		windowDuration: time.Second,
	}
}

func (s *AdaptiveSampler) ShouldSample(traceID TraceID) bool {
	s.mu.Lock()
	defer s.mu.Unlock()

	s.windowCount++

	// Check if window expired
	if elapsed := time.Since(s.windowStart); elapsed >= s.windowDuration {
		// Observed requests per second over the window that just ended.
		// Dividing by the real elapsed time matters when traffic is sparse
		// and the window runs longer than windowDuration.
		currentRate := float64(s.windowCount) / elapsed.Seconds()
		desiredRatio := s.targetRate / currentRate

		// Smoothly adjust (exponential moving average)
		s.currentRatio = s.currentRatio*0.7 + desiredRatio*0.3

		// Clamp to bounds
		if s.currentRatio < s.minRatio {
			s.currentRatio = s.minRatio
		}
		if s.currentRatio > s.maxRatio {
			s.currentRatio = s.maxRatio
		}

		// Reset window
		s.windowCount = 0
		s.windowStart = time.Now()
	}

	// Apply current ratio
	h := fnv.New64a()
	h.Write([]byte(traceID))
	return h.Sum64() < ratioThreshold(s.currentRatio)
}

func (s *AdaptiveSampler) Description() string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return fmt.Sprintf("AdaptiveSampler{target=%.0f/s, current=%.2f}", s.targetRate, s.currentRatio)
}
```
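The key property of hash-based sampling is that the decision depends only on the trace ID and the ratio, so every service reaches the same verdict without coordination. The standalone sketch below illustrates this under the same FNV-1a scheme; `shouldSample` and `randomTraceID` are hypothetical helpers written for this example.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"hash/fnv"
)

// shouldSample hashes the trace ID and keeps the trace when the hash falls
// below ratio * 2^64. The same ID and ratio always give the same answer.
func shouldSample(traceID string, ratio float64) bool {
	if ratio <= 0 {
		return false
	}
	if ratio >= 1 {
		return true
	}
	h := fnv.New64a()
	h.Write([]byte(traceID))
	return h.Sum64() < uint64(ratio*float64(^uint64(0)))
}

// randomTraceID returns 16 random bytes as 32 hex characters.
func randomTraceID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

func main() {
	const n = 100000
	kept := 0
	for i := 0; i < n; i++ {
		if shouldSample(randomTraceID(), 0.1) {
			kept++
		}
	}
	// With random IDs the kept fraction should land near the ratio.
	fmt.Printf("kept %d of %d traces (%.1f%%)\n", kept, n, 100*float64(kept)/n)

	id := randomTraceID()
	fmt.Println("same decision twice:", shouldSample(id, 0.1) == shouldSample(id, 0.1))
}
```

Because the threshold grows with the ratio, a trace kept at 10% is also kept at any higher ratio, so services configured with different ratios still produce complete traces for the subset the strictest one keeps.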
Span Exporters
```go
// SpanExporter exports completed spans
type SpanExporter interface {
	Export(span *Span) error
	Shutdown(ctx context.Context) error
}

// ConsoleExporter exports spans to console
type ConsoleExporter struct {
	encoder *json.Encoder
	mu      sync.Mutex
}

func NewConsoleExporter() *ConsoleExporter {
	return &ConsoleExporter{
		encoder: json.NewEncoder(os.Stdout),
	}
}

func (e *ConsoleExporter) Export(span *Span) error {
	e.mu.Lock()
	defer e.mu.Unlock()
	return e.encoder.Encode(span)
}

func (e *ConsoleExporter) Shutdown(ctx context.Context) error {
	return nil
}

// BatchExporter batches spans before export
type BatchExporter struct {
	inner         SpanExporter
	batchSize     int
	flushInterval time.Duration
	spans         []*Span
	mu            sync.Mutex
	stopCh        chan struct{}
	doneCh        chan struct{}
	stopOnce      sync.Once // makes Shutdown safe to call more than once
}

func NewBatchExporter(inner SpanExporter, batchSize int, flushInterval time.Duration) *BatchExporter {
	e := &BatchExporter{
		inner:         inner,
		batchSize:     batchSize,
		flushInterval: flushInterval,
		spans:         make([]*Span, 0, batchSize),
		stopCh:        make(chan struct{}),
		doneCh:        make(chan struct{}),
	}

	go e.flushLoop()
	return e
}

// Export queues a span and flushes synchronously once the batch is full
func (e *BatchExporter) Export(span *Span) error {
	e.mu.Lock()
	e.spans = append(e.spans, span)

	if len(e.spans) >= e.batchSize {
		batch := e.spans
		e.spans = make([]*Span, 0, e.batchSize)
		e.mu.Unlock()
		return e.exportBatch(batch)
	}

	e.mu.Unlock()
	return nil
}

// flushLoop flushes on every tick and one last time on shutdown
func (e *BatchExporter) flushLoop() {
	defer close(e.doneCh)

	ticker := time.NewTicker(e.flushInterval)
	defer ticker.Stop()

	for {
		select {
		case <-ticker.C:
			e.flush()
		case <-e.stopCh:
			e.flush()
			return
		}
	}
}

func (e *BatchExporter) flush() {
	e.mu.Lock()
	if len(e.spans) == 0 {
		e.mu.Unlock()
		return
	}

	batch := e.spans
	e.spans = make([]*Span, 0, e.batchSize)
	e.mu.Unlock()

	e.exportBatch(batch)
}

// exportBatch exports every span in the batch. A failure does not stop the
// remaining spans; the first error is returned to the caller.
func (e *BatchExporter) exportBatch(batch []*Span) error {
	var firstErr error
	for _, span := range batch {
		if err := e.inner.Export(span); err != nil {
			// Log error but continue
			fmt.Printf("Failed to export span: %v\n", err)
			if firstErr == nil {
				firstErr = err
			}
		}
	}
	return firstErr
}

func (e *BatchExporter) Shutdown(ctx context.Context) error {
	// Closing a channel twice panics, so guard the close
	e.stopOnce.Do(func() { close(e.stopCh) })

	select {
	case <-e.doneCh:
		return e.inner.Shutdown(ctx)
	case <-ctx.Done():
		return ctx.Err()
	}
}

// JaegerExporter exports to Jaeger
//
// NOTE: the JSON payload below is simplified for illustration. Jaeger's
// collector endpoint at /api/traces expects Thrift-encoded batches, so a
// real exporter would use that encoding or OTLP instead.
type JaegerExporter struct {
	endpoint    string
	client      *http.Client
	serviceName string
}

func NewJaegerExporter(endpoint, serviceName string) *JaegerExporter {
	return &JaegerExporter{
		endpoint:    endpoint,
		client:      &http.Client{Timeout: 5 * time.Second},
		serviceName: serviceName,
	}
}

func (e *JaegerExporter) Export(span *Span) error {
	// Convert to Jaeger format
	jaegerSpan := e.convertToJaeger(span)

	data, err := json.Marshal(jaegerSpan)
	if err != nil {
		return err
	}

	req, err := http.NewRequest("POST", e.endpoint, bytes.NewReader(data))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/json")

	resp, err := e.client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	if resp.StatusCode >= 400 {
		return fmt.Errorf("jaeger export failed: %s", resp.Status)
	}

	return nil
}

func (e *JaegerExporter) convertToJaeger(span *Span) map[string]interface{} {
	tags := make([]map[string]interface{}, 0)
	for k, v := range span.Tags {
		tags = append(tags, map[string]interface{}{
			"key":   k,
			"type":  "string",
			"value": v,
		})
	}

	logs := make([]map[string]interface{}, 0)
	for _, log := range span.Logs {
		fields := make([]map[string]interface{}, 0)
		for k, v := range log.Fields {
			fields = append(fields, map[string]interface{}{
				"key":   k,
				"type":  "string",
				"value": v,
			})
		}
		logs = append(logs, map[string]interface{}{
			"timestamp": log.Timestamp.UnixMicro(),
			"fields":    fields,
		})
	}

	return map[string]interface{}{
		"traceIdLow":    span.TraceID,
		"spanId":        span.SpanID,
		"parentSpanId":  span.ParentSpanID,
		"operationName": span.OperationName,
		"startTime":     span.StartTime.UnixMicro(),
		"duration":      span.Duration.Microseconds(),
		"tags":          tags,
		"logs":          logs,
		"process": map[string]interface{}{
			"serviceName": e.serviceName,
		},
	}
}

func (e *JaegerExporter) Shutdown(ctx context.Context) error {
	return nil
}
```
HTTP Middleware
```go
// TracingMiddleware adds tracing to HTTP handlers
func TracingMiddleware(tracer *Tracer) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			// Extract context from incoming request
			ctx := tracer.propagator.Extract(r.Context(), HTTPHeaderCarrier(r.Header))

			// Start server span
			ctx, span := tracer.StartSpan(ctx, fmt.Sprintf("%s %s", r.Method, r.URL.Path),
				WithTags(map[string]string{
					"http.method":     r.Method,
					"http.url":        r.URL.String(),
					"http.host":       r.Host,
					"http.user_agent": r.UserAgent(),
					"span.kind":       "server",
				}),
			)
			defer span.Finish()

			// Wrap response writer to capture status code
			wrapped := &statusResponseWriter{ResponseWriter: w, statusCode: 200}

			// Serve request
			next.ServeHTTP(wrapped, r.WithContext(ctx))

			// Record response info
			span.SetTag("http.status_code", fmt.Sprintf("%d", wrapped.statusCode))
			if wrapped.statusCode >= 400 {
				span.SetStatus(SpanStatusError)
			} else {
				span.SetStatus(SpanStatusOK)
			}
		})
	}
}

// statusResponseWriter remembers the status code written by the handler
type statusResponseWriter struct {
	http.ResponseWriter
	statusCode int
}

func (w *statusResponseWriter) WriteHeader(code int) {
	w.statusCode = code
	w.ResponseWriter.WriteHeader(code)
}

// TracingHTTPClient wraps an HTTP client with tracing
type TracingHTTPClient struct {
	client *http.Client
	tracer *Tracer
}

func NewTracingHTTPClient(tracer *Tracer) *TracingHTTPClient {
	return &TracingHTTPClient{
		client: &http.Client{Timeout: 30 * time.Second},
		tracer: tracer,
	}
}

func (c *TracingHTTPClient) Do(ctx context.Context, req *http.Request) (*http.Response, error) {
	// Start client span
	ctx, span := c.tracer.StartSpan(ctx, fmt.Sprintf("%s %s", req.Method, req.URL.Host),
		WithTags(map[string]string{
			"http.method": req.Method,
			"http.url":    req.URL.String(),
			"span.kind":   "client",
		}),
	)
	defer span.Finish()

	// Inject context into request headers
	c.tracer.propagator.Inject(ctx, HTTPHeaderCarrier(req.Header))

	// Make request
	resp, err := c.client.Do(req.WithContext(ctx))
	if err != nil {
		span.RecordError(err)
		return nil, err
	}

	span.SetTag("http.status_code", fmt.Sprintf("%d", resp.StatusCode))
	if resp.StatusCode >= 400 {
		span.SetStatus(SpanStatusError)
	} else {
		span.SetStatus(SpanStatusOK)
	}

	return resp, nil
}
```
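Middleware like this is easy to check in-process with `net/http/httptest`, without opening a socket. The sketch below uses a stripped-down, hypothetical cousin of `TracingMiddleware` called `withTraceID` that only copies the trace ID from `traceparent` into the request context, then confirms that the wrapped handler can see it.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// traceIDKey is the context key under which the trace ID is stored.
type traceIDKey struct{}

// withTraceID is a hypothetical, minimal middleware: it reads the trace ID
// from the traceparent header and stores it in the request context.
func withTraceID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		parts := strings.Split(r.Header.Get("traceparent"), "-")
		if len(parts) == 4 {
			r = r.WithContext(context.WithValue(r.Context(), traceIDKey{}, parts[1]))
		}
		next.ServeHTTP(w, r)
	})
}

// echoTraceID writes back whatever trace ID the middleware stored.
func echoTraceID(w http.ResponseWriter, r *http.Request) {
	id, _ := r.Context().Value(traceIDKey{}).(string)
	fmt.Fprint(w, id)
}

// traceIDSeenBy sends one in-memory request through the middleware and
// returns the trace ID observed by the handler.
func traceIDSeenBy(traceparent string) string {
	req := httptest.NewRequest(http.MethodGet, "/orders", nil)
	if traceparent != "" {
		req.Header.Set("traceparent", traceparent)
	}
	rec := httptest.NewRecorder()
	withTraceID(http.HandlerFunc(echoTraceID)).ServeHTTP(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Println(traceIDSeenBy("00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01"))
}
```

The same recorder-based pattern applies to the full middleware: build a request with a `traceparent` header, run it through the handler chain, and assert on the spans the tracer exported.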
Database Tracing
```go
// TracingDB wraps a database connection with tracing
type TracingDB struct {
	db     *sql.DB
	tracer *Tracer
}

func NewTracingDB(db *sql.DB, tracer *Tracer) *TracingDB {
	return &TracingDB{
		db:     db,
		tracer: tracer,
	}
}

func (tdb *TracingDB) QueryContext(ctx context.Context, query string, args ...interface{}) (*sql.Rows, error) {
	ctx, span := tdb.tracer.StartSpan(ctx, "db.Query",
		WithTags(map[string]string{
			"db.type":      "sql",
			"db.statement": truncateQuery(query, 100),
			"span.kind":    "client",
		}),
	)
	defer span.Finish()

	rows, err := tdb.db.QueryContext(ctx, query, args...)
	if err != nil {
		span.RecordError(err)
		return nil, err
	}

	span.SetStatus(SpanStatusOK)
	return rows, nil
}

func (tdb *TracingDB) ExecContext(ctx context.Context, query string, args ...interface{}) (sql.Result, error) {
	ctx, span := tdb.tracer.StartSpan(ctx, "db.Exec",
		WithTags(map[string]string{
			"db.type":      "sql",
			"db.statement": truncateQuery(query, 100),
			"span.kind":    "client",
		}),
	)
	defer span.Finish()

	result, err := tdb.db.ExecContext(ctx, query, args...)
	if err != nil {
		span.RecordError(err)
		return nil, err
	}

	span.SetStatus(SpanStatusOK)
	return result, nil
}

// truncateQuery caps the statement at maxLen bytes to keep tags small
func truncateQuery(query string, maxLen int) string {
	if len(query) <= maxLen {
		return query
	}
	return query[:maxLen] + "..."
}
```
gRPC Tracing
```go
// UnaryTracingInterceptor creates a gRPC unary server interceptor
func UnaryTracingInterceptor(tracer *Tracer) grpc.UnaryServerInterceptor {
	return func(
		ctx context.Context,
		req interface{},
		info *grpc.UnaryServerInfo,
		handler grpc.UnaryHandler,
	) (interface{}, error) {
		// Extract context from metadata
		md, ok := metadata.FromIncomingContext(ctx)
		if ok {
			carrier := &metadataCarrier{md: md}
			ctx = tracer.propagator.Extract(ctx, carrier)
		}

		// Start span
		ctx, span := tracer.StartSpan(ctx, info.FullMethod,
			WithTags(map[string]string{
				"rpc.system":  "grpc",
				"rpc.service": info.FullMethod,
				"span.kind":   "server",
			}),
		)
		defer span.Finish()

		// Handle request
		resp, err := handler(ctx, req)
		if err != nil {
			span.RecordError(err)
			span.SetTag("rpc.grpc.status_code", fmt.Sprintf("%d", status.Code(err)))
		} else {
			span.SetStatus(SpanStatusOK)
			span.SetTag("rpc.grpc.status_code", "0")
		}

		return resp, err
	}
}

// UnaryClientTracingInterceptor creates a gRPC unary client interceptor
func UnaryClientTracingInterceptor(tracer *Tracer) grpc.UnaryClientInterceptor {
	return func(
		ctx context.Context,
		method string,
		req, reply interface{},
		cc *grpc.ClientConn,
		invoker grpc.UnaryInvoker,
		opts ...grpc.CallOption,
	) error {
		// Start span
		ctx, span := tracer.StartSpan(ctx, method,
			WithTags(map[string]string{
				"rpc.system":  "grpc",
				"rpc.service": method,
				"span.kind":   "client",
			}),
		)
		defer span.Finish()

		// Inject context into metadata. Work on a copy so metadata already
		// attached to the caller's context is never mutated in place.
		md, ok := metadata.FromOutgoingContext(ctx)
		if ok {
			md = md.Copy()
		} else {
			md = metadata.New(nil)
		}
		carrier := &metadataCarrier{md: md}
		tracer.propagator.Inject(ctx, carrier)
		ctx = metadata.NewOutgoingContext(ctx, md)

		// Make call
		err := invoker(ctx, method, req, reply, cc, opts...)
		if err != nil {
			span.RecordError(err)
			span.SetTag("rpc.grpc.status_code", fmt.Sprintf("%d", status.Code(err)))
		} else {
			span.SetStatus(SpanStatusOK)
			span.SetTag("rpc.grpc.status_code", "0")
		}

		return err
	}
}

// metadataCarrier adapts gRPC metadata to Carrier
type metadataCarrier struct {
	md metadata.MD
}

func (c *metadataCarrier) Get(key string) string {
	vals := c.md.Get(key)
	if len(vals) > 0 {
		return vals[0]
	}
	return ""
}

func (c *metadataCarrier) Set(key, value string) {
	c.md.Set(key, value)
}
```
Complete Example
```go
// For brevity, this example assumes the tracer code above is compiled into
// the same package. If it lives in a separate tracing package, import that
// package and qualify the names.
package main

import (
	"fmt"
	"log"
	"net/http"
	"time"
)

func main() {
	// Create exporter
	jaegerExporter := NewJaegerExporter(
		"http://localhost:14268/api/traces",
		"order-service",
	)

	// Create batch exporter
	batchExporter := NewBatchExporter(jaegerExporter, 100, 5*time.Second)

	// Create tracer
	tracer := NewTracer(TracerConfig{
		ServiceName: "order-service",
		Exporter:    batchExporter,
		Sampler:     RatioBasedSampler(0.1), // Sample 10%
		Propagator:  NewW3CPropagator(),
	})

	// Create traced HTTP client
	httpClient := NewTracingHTTPClient(tracer)

	// Setup HTTP server with tracing middleware
	mux := http.NewServeMux()

	mux.HandleFunc("/orders", func(w http.ResponseWriter, r *http.Request) {
		ctx := r.Context()

		// Get current span. SpanFromContext returns nil when the context
		// holds no span, so fall back to starting one here.
		span := SpanFromContext(ctx)
		if span == nil {
			ctx, span = tracer.StartSpan(ctx, "create-order")
			defer span.Finish()
		}
		span.SetTag("order.type", "standard")

		// Call downstream service. The returned context is discarded so that
		// later spans stay children of the request span, not of this one.
		_, childSpan := tracer.StartSpan(ctx, "validate-order")
		time.Sleep(10 * time.Millisecond)
		childSpan.Finish()

		// Call external service
		req, err := http.NewRequest("GET", "http://inventory-service/check", nil)
		if err != nil {
			span.RecordError(err)
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		resp, err := httpClient.Do(ctx, req)
		if err != nil {
			span.RecordError(err)
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		defer resp.Body.Close()

		// Database operation
		_, dbSpan := tracer.StartSpan(ctx, "db.insert-order",
			WithTags(map[string]string{"db.type": "postgresql"}))
		time.Sleep(20 * time.Millisecond)
		dbSpan.Finish()

		span.LogEvent("order_created", map[string]string{
			"order_id": "12345",
		})

		w.WriteHeader(http.StatusCreated)
		fmt.Fprint(w, `{"order_id": "12345"}`)
	})

	// Apply tracing middleware
	handler := TracingMiddleware(tracer)(mux)

	// Start server
	log.Println("Starting server on :8080")
	log.Fatal(http.ListenAndServe(":8080", handler))
}
```
Best Practices
- Always propagate context - Pass context through all function calls
- Use meaningful span names - Include the operation type and target, and keep variable values such as IDs in tags rather than in the name
- Add relevant tags - But never include PII or other sensitive data
- Sample appropriately - High-traffic systems need lower sample rates
- Batch exports - Don't export each span individually
- Handle errors properly - Record errors and set status
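One way to enforce the rule about sensitive data is to scrub tags in a single place before they reach a span. The sketch below is a minimal example of that idea; `redactTags`, `isSensitive`, and the list of key fragments are hypothetical and would need tuning for a real system.

```go
package main

import (
	"fmt"
	"strings"
)

// sensitiveKeyParts lists key fragments treated as sensitive. This list is
// an assumption for illustration, not an exhaustive policy.
var sensitiveKeyParts = []string{"password", "token", "authorization", "email", "card"}

// isSensitive reports whether a tag key contains a sensitive fragment,
// ignoring case.
func isSensitive(key string) bool {
	lower := strings.ToLower(key)
	for _, part := range sensitiveKeyParts {
		if strings.Contains(lower, part) {
			return true
		}
	}
	return false
}

// redactTags returns a copy of tags with sensitive values replaced, leaving
// the input map untouched.
func redactTags(tags map[string]string) map[string]string {
	clean := make(map[string]string, len(tags))
	for k, v := range tags {
		if isSensitive(k) {
			clean[k] = "[REDACTED]"
		} else {
			clean[k] = v
		}
	}
	return clean
}

func main() {
	tags := redactTags(map[string]string{
		"http.method":   "POST",
		"user.email":    "alice@example.com",
		"Authorization": "Bearer abc123",
	})
	fmt.Println(tags["http.method"], tags["user.email"], tags["Authorization"])
}
```

Key-based redaction only catches tags whose names give them away; values such as SQL statements or URLs with query strings can still carry personal data and need their own handling.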
What's Next?
In Part 30, we'll explore Building Production-Ready Systems - bringing together everything we've learned to create robust distributed systems.
"Distributed tracing is your system's story - each span is a chapter, and together they tell you exactly what happened."