How We Cut Go Service P99 Latency by 82% and Reduced EC2 Costs by $14k/Month Using Context-Aware Connection Routing

By Codcompass Team · 12 min read

Current Situation Analysis

When we migrated our payment orchestration layer from Java to Go, we expected a straightforward win: lower memory footprint, faster cold starts, and simpler concurrency. Instead, we hit a wall during our first major traffic spike. P99 latency jumped from 120ms to 890ms. CPU utilization on our m5.large instances spiked to 94%, not from computation, but from goroutine scheduling and TCP state management. File descriptors hit the 65k limit. The service started returning 503 Service Unavailable despite upstream dependencies being healthy.

Most Go backend tutorials get connection management wrong because they treat net/http as a black box. They show you http.Get() or &http.Client{Timeout: 5 * time.Second} and call it a day. That works for CRUD apps. It fails catastrophically for high-throughput backend services. The standard advice is to tune MaxIdleConns, MaxConnsPerHost, and IdleConnTimeout on a global http.DefaultTransport. We tried that. We bumped MaxConnsPerHost to 500. We increased MaxIdleConns to 200. It didn't fix the problem; it just moved the bottleneck. Under bursty traffic, the default transport creates connections aggressively, then thrashes them when upstreams rate-limit or drop half-open connections. You end up with thousands of sockets in TIME_WAIT or CLOSE_WAIT. Sockets in TIME_WAIT exhaust ephemeral ports, which surfaces as dial errors ending in "cannot assign requested address". Sockets stuck in CLOSE_WAIT keep holding file descriptors until the process hits its limit and dials fail with "too many open files".

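The worst anti-pattern we inherited from legacy code was creating a new http.Client, with its own http.Transport, per request. Developers did this to isolate timeouts:

// BAD: Allocates a new Transport (and a new, empty connection pool) on every call
client := &http.Client{Timeout: 3 * time.Second, Transport: &http.Transport{}}
resp, err := client.Get("https://upstream.example.com/api")

This bypasses connection pooling entirely, because idle connections are cached on the Transport, not on the Client. (A Client whose Transport field is nil falls back to the shared http.DefaultTransport and still pools; it is the per-request Transport that defeats reuse.) Every request performs a full TCP handshake + TLS negotiation. You see latency spike from 15ms to 200ms+ per call, and your connection count scales linearly with RPS instead of stabilizing at a pool size.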

We spent three weeks chasing TCP tuning parameters, adjusting sysctl limits, and implementing retry loops. The real issue wasn't the pool size. It was static routing. All requests (health checks, idempotent reads, non-idempotent writes, and low-priority batch jobs) shared the same transport configuration, and therefore the same per-host connection budget. Head-of-line blocking in the connection pool meant a slow upstream response for a batch job delayed critical payment confirmations, which had to wait for a connection to free up.

We needed a paradigm that decoupled connection lifecycle from request volume and SLA requirements.

WOW Moment

Stop tuning the connection pool. Start routing connections based on request context.

The paradigm shift is simple: instead of forcing every request through a single global transport or allocating transports ad-hoc, we route requests through a dynamic transport selector that reads context values (priority, retry budget, target upstream, idempotency) and picks the optimal http.Transport. The "aha" moment: Let the request context dictate the transport, not the other way around. This eliminates static pool tuning, prevents cross-SLA interference, and turns connection management into a deterministic, observable routing problem.

Core Solution

We built a ContextAwareTransportRouter that implements http.RoundTripper. It maintains a pool of pre-configured transports keyed by routing tags. When a request arrives, the middleware extracts routing metadata from headers or query parameters, attaches it to the context, and the router selects the matching transport. Each transport has its own connection limits, timeouts, and TLS configuration.

Step 1: Context-Aware Transport Router

This is the core engine. It uses sync.Map for concurrent lookups (reads of keys that are already registered typically avoid taking a lock), pre-warms transports on startup, and falls back to a default transport if no tag matches.

// transport_router.go
package router

import (
	"context"
	"crypto/tls"
	"fmt"
	"net"
	"net/http"
	"sync"
	"time"
)

// RoutingKey holds the context values used to select a transport.
type RoutingKey struct {
	Upstream     string // e.g., "payment-gateway", "fraud-detection"
	Priority     string // "critical", "standard", "batch"
	IsIdempotent bool
}

// TransportRouter implements http.RoundTripper and routes requests
// to pre-configured transports based on context values.
type TransportRouter struct {
	transports sync.Map
	defaultRT  http.RoundTripper
}

// NewTransportRouter initializes the router with a default transport.
func NewTransportRouter() *TransportRouter {
	return &TransportRouter{
		defaultRT: &http.Transport{
			MaxIdleConns:          100,
			MaxIdleConnsPerHost:   10,
			IdleConnTimeout:       90 * time.Second,
			TLSHandshakeTimeout:   10 * time.Second,
			ExpectContinueTimeout: 1 * time.Second,
		},
	}
}

// RegisterTransport creates a new transport with custom limits and stores it.
// Call this during application startup, not during request handling.
func (r *TransportRouter) RegisterTransport(key RoutingKey, cfg TransportConfig) error {
	if key.Upstream == "" {
		return fmt.Errorf("routing key upstream cannot be empty")
	}

	tlsCfg := &tls.Config{
		MinVersion: tls.VersionTLS13,
		// In production, load CA pool explicitly. Omitted for brevity.
	}

	transport := &http.Transport{
		MaxIdleConns:          cfg.MaxIdleConns,
		MaxIdleConnsPerHost:   cfg.MaxIdleConnsPerHost,
		MaxConnsPerHost:       cfg.MaxConnsPerHost,
		IdleConnTimeout:       cfg.IdleConnTimeout,
		TLSHandshakeTimeout:   cfg.TLSHandshakeTimeout,
		ResponseHeaderTimeout: cfg.R
