Concurrency Controls in Golang
February 19, 2026
From Simple Data Pipelines to Token Buckets
In my day-to-day role as a Site Reliability Engineer, my primary focus naturally gravitates toward system stability, infrastructure scaling, and Kubernetes management. While I write a considerable amount of Go and Python to automate these operations, I rarely get the opportunity to build complex, highly concurrent backend patterns from scratch. Lately, I decided to look closer at how high-throughput systems actually process all that data, which naturally led me to study Go’s concurrency models. I wanted to step outside my usual operational boundaries and understand exactly how software engineers manage massive data streams efficiently without overwhelming their infrastructure.
I’ve always found that the best way to actually understand a complex topic is to try explaining it. I’m writing this post to solidify my own grasp on these architectural patterns, breaking them down from a basic data pipeline all the way up to a production-grade token bucket rate limiter.
Level 1: The Basic Pipeline for Stream Processing
Imagine you need to parse a massive 50GB log file to extract specific error messages. If you attempt to load that entire file into memory at once, your application will inevitably crash due to memory exhaustion. The elegant solution to this problem is the Pipeline Pattern, which allows you to break the workload into distinct processing stages that run simultaneously. By passing data downstream through Go channels, your application only ever needs to hold a single line of the file in memory at any given time.
When designing a basic pipeline, the architecture generally flows through three primary components:
- The Reader (Submitter): This stage reads the source file line-by-line and pushes each string into an outgoing channel.
- The Transformer: This middle stage reads from the incoming channel, performs the necessary filtering or modifications, and pushes the successful results to the next channel.
- The Sinker (Consumer): The final stage collects the processed results and handles the final output or database insertion.
```go
func aggregateLog(ctx context.Context, filepath string) int {
	pending := reader(ctx, filepath)
	collector := transformer(ctx, pending)
	return sinker(collector)
}
```
⚠️ Common Pitfall: The Blocking Trap
When writing these individual stages, it is incredibly easy to accidentally block a goroutine forever, which creates a silent memory leak. If a user cancels the operation or a system timeout occurs, your goroutines must be able to recognize that signal and exit immediately.
If you simply send data to a channel like pending <- line, and the downstream consumer has already stopped listening, your program will hang indefinitely. To prevent this, you should always wrap your channel sends and receives in a select statement alongside the context cancellation signal.
```go
select {
case pending <- line:
	// The data was successfully sent downstream
case <-ctx.Done():
	return // The context was cancelled, so we shut down gracefully
}
```

In a real-world backend, requests frequently get cancelled, connections drop, or API timeouts expire. The context is how Go signals your application to stop working on a task that is no longer needed. If your goroutines ignore this cancellation signal while waiting to send or receive data on a channel, they will hang in the background forever. Over time, these stranded goroutines accumulate and cause silent memory leaks that degrade performance and eventually crash your application.
Level 2: Scaling Up with Fan-Out and Fan-In
While a linear pipeline is fantastic for memory management, it introduces a new bottleneck if your middle transformation stage is computationally expensive. If your transformer performs a heavy regex match or a slow database lookup that takes a full second, your entire program operates at a maximum speed of one line per second.
To resolve this bottleneck, we introduce the Fan-Out pattern. Instead of relying on a single transformer, we spawn a pool of worker goroutines that all read concurrently from the exact same input channel.
Managing multiple workers safely requires a sync.WaitGroup to track their progress. The most challenging aspect of this pattern is knowing exactly when to close the output channel. If you close the channel prematurely, your active workers will panic when they try to send data. Conversely, if you forget to close it entirely, your final sinker stage will wait forever.
```go
func transformDispatcher(ctx context.Context, pending <-chan string) <-chan string {
	collector := make(chan string)
	wg := &sync.WaitGroup{}
	// Fan-Out: Spawn multiple parallel workers
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go worker(ctx, pending, collector, wg)
	}
	// Fan-In: Wait for all workers to finish in a background routine, then close
	go func() {
		wg.Wait()
		close(collector)
	}()
	return collector
}
```

This design gives you some great built-in benefits: true parallelism, natural load balancing (since multiple goroutines are actively pulling from one shared channel), and clean, coordinated shutdown once every worker has finished.
However, there is one major caveat you must keep in mind: output order is not preserved. Because the workers operate independently and process data at slightly different speeds, the results will arrive at the final sinker completely out of sequence. If your application strictly requires the output to match the original input order, this specific Fan-Out pattern is incorrect and you will need to reach for a different design.
⚠️ Common Pitfall: The “Busy Wait” Loop
When you write the infinite loop for your worker routines, it is tempting to include a default case in your select block to handle moments when the channel is empty. You must avoid doing this.
If you include a default case, you accidentally create a “busy wait.” Instead of pausing, the loop runs millions of times per second whenever there is no immediate data available, which will entirely consume your CPU resources.
Here is what that mistake looks like in our log transformer worker:
The Bad Way (Burns CPU):
```go
for {
	select {
	case <-ctx.Done():
		return
	case checkStr := <-pending:
		// Process the log line
	default:
		// DANGER: If 'pending' is empty, the code instantly falls through to here.
		// The loop restarts immediately, spinning endlessly and burning 100% CPU.
	}
}
```
To fix this, you simply remove the default case. Without it, the select statement blocks naturally. The Go scheduler recognizes that the goroutine has nothing to do, so it puts the routine to sleep until new data arrives in the channel or the context cancels.
The Good Way (Zero CPU while waiting):
```go
for {
	select {
	case <-ctx.Done():
		return
	case checkStr, ok := <-pending:
		// If the channel is closed, ok is false and checkStr is "".
		// Without this check, the loop would spin on zero-values
		// just like a busy wait — except harder to spot.
		if !ok {
			return
		}
		// If 'pending' is open but empty, the goroutine safely sleeps here.
		// It wakes up the moment a new log line arrives or the channel closes.
	}
}
```

Level 3: Controlling Flow with a Rate Limiter
Sometimes the architectural challenge isn’t moving data too slowly; it is moving data too quickly. If your pipeline calls a third-party API that strictly enforces a limit of five requests per second, executing your tasks instantly will result in blocked connections or banned IP addresses.
To mitigate this risk, we can build a concurrency turnstile using time.Ticker to enforce a strict, mandatory delay between each operation. A reasonable worry at this point: doesn't the upstream code just keep dumping data into the incoming channel at full speed while the rate limiter builds a massive internal queue? If that were true, your application would quickly exhaust its available memory by stockpiling thousands of unprocessed items.
Instead, Go channels naturally provide a powerful system design feature called backpressure.
Because our rate limiter explicitly pauses to wait for the next timer tick, it completely stops pulling new items out of the pending channel. Once that incoming channel is full, the upstream reader is forced to pause because it physically cannot send any more data. The slow speed of the rate limiter naturally pushes back on the fast reader, which forces the entire pipeline to slow down and synchronize to a safe, predictable pace.
Here is what that pipeline stage looks like when we wrap it in a proper function. Notice how the logic intentionally blocks reading the next item from the pending channel until the throttled channel successfully emits the current one alongside the timer tick.
```go
// limiter takes a channel of pending requests and outputs them at a strict, steady pace.
func limiter(ctx context.Context, pending <-chan string, interval time.Duration) <-chan string {
	// The channel where we will send our rate-limited data downstream
	throttled := make(chan string)
	go func() {
		// Ensure we clean up the channel and ticker when we exit
		defer close(throttled)
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			// Step 1: Read a single pending item.
			// If we are waiting for a tick below, this read cannot happen, which creates backpressure!
			case item, ok := <-pending:
				if !ok {
					return // Upstream closed the channel, meaning all work is done
				}
				// Step 2: We have the item. Now, we must wait for the green light.
				select {
				case <-ticker.C:
					// Step 3: The tick arrived. Now we safely send the item downstream.
					// We must check context again here to prevent a deadlock if downstream crashes!
					select {
					case throttled <- item:
					case <-ctx.Done():
						return
					}
				case <-ctx.Done():
					return
				}
			}
		}
	}()
	return throttled
}
```
Level 4: Token Buckets and Handling Bursts
While our basic rate limiter does the job, it is too rigid for most real-world applications. If a user hasn’t made a single request in an hour, the system still forces them to wait 200 milliseconds before processing their very first action. Real-world APIs like AWS or Stripe solve this by using the Token Bucket algorithm, which allows an initial burst of immediate traffic before the strict throttling takes over.
To build a burstable rate limiter, we decouple our timing mechanism from our consumer logic by utilizing a buffered channel as our token bucket.
This architecture requires three distinct components interacting together:
- The Bucket: A buffered channel where the capacity represents our maximum allowed burst limit.
- The Refiller: A background goroutine that adds a single token to the bucket every interval.
- The Consumer: Our primary loop that simply attempts to read a token from the bucket before it processes the next incoming item.
Here is what this looks like when we wrap it into a proper pipeline stage:
```go
// burstableLimiter allows an initial burst of traffic, then throttles to a steady interval.
func burstableLimiter(ctx context.Context, pending <-chan string, interval time.Duration, burstLimit int) <-chan string {
	throttled := make(chan string)

	// 1. The Bucket: A buffered channel to hold our available tokens
	bucket := make(chan struct{}, burstLimit)
	// Pre-fill the bucket so the initial burst executes immediately
	for i := 0; i < burstLimit; i++ {
		bucket <- struct{}{}
	}

	// 2. The Refiller: Runs in the background and adds tokens
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				// We use 'default' here to make this send non-blocking.
				// If the bucket is full, we simply drop the new token and move on.
				select {
				case bucket <- struct{}{}:
				default:
				}
			}
		}
	}()

	// 3. The Consumer: Reads data and waits for tokens
	go func() {
		defer close(throttled)
		for {
			select {
			case <-ctx.Done():
				return
			case item, ok := <-pending:
				if !ok {
					return // Upstream closed the channel
				}
				// Wait for a token. As long as the bucket is not empty, this is instant!
				select {
				case <-bucket:
				case <-ctx.Done():
					return
				}
				// We have the item and the token. Safely send it downstream.
				select {
				case throttled <- item:
				case <-ctx.Done():
					return
				}
			}
		}
	}()
	return throttled
}
```
Because we deliberately pre-filled the bucket during the initialization phase, the pipeline processes the first few items instantly. Once that initial burst depletes the bucket, the consumer naturally pauses and waits for the refiller goroutine to drop a new token into the bucket at our specified interval.
⚠️ The Strategic Exception: Why We Used default in the Refiller
This part confused me at first. Earlier in this post, I explicitly called out the dangers of using a default case, noting how it creates a CPU-burning busy loop. However, you might have noticed that I intentionally used one inside the refiller goroutine. This isn’t a mistake; it’s a specific pattern called a non-blocking send.
If the system experiences a lull in traffic, the token bucket quickly fills up to its maximum burst capacity. When the next timer tick fires, the refiller attempts to push another token into that full channel. If we simply wrote bucket <- struct{}{}, the operation would block. The refiller goroutine would freeze, waiting for a consumer to take a token.
By wrapping that send operation in a select statement with an empty default case, we tell the Go runtime: “Try to drop a token into the bucket. If the bucket is full, simply discard the token and go back to sleep until the next tick.” It gracefully enforces the maximum burst limit without locking up our background process.
You might wonder why this doesn’t spike the CPU like the earlier example. In the Level 2 pitfall, the default case was a direct option in the main select block. Whenever the channel was empty, the select instantly fell through to the default case, which caused the outer for loop to spin continuously without pausing.
In our refiller, the default case sits inside a nested select block. The outer select still forces the goroutine to pause and wait on the ticker.C channel, so the program only evaluates that nested default case once per tick. The loop runs once per interval rather than spinning, keeping CPU usage negligible.
Final Thoughts and Best Practices
Stepping away from my usual infrastructure and reliability work to study these backend patterns has definitely changed how I look at low-level design. Writing highly concurrent code in Go yields incredibly powerful applications, but it requires strict operational discipline to keep those systems stable under heavy load.
As I worked through these iterations, I developed a practical checklist that I now use when reviewing any concurrent code.
- Respect the Context to Prevent Goroutine Leaks: In a real-world backend, requests frequently get cancelled, connections drop, or API timeouts expire. The context is how Go signals your application to stop working on a task that is no longer needed. Goroutines that ignore that signal while blocked on a channel hang in the background forever, and those stranded goroutines accumulate into silent memory leaks. Always pair your blocking channel operations with a case <-ctx.Done():.
- Embrace Backpressure: A well-designed pipeline does not rely on massive, bottomless queues that hoard memory. Instead, it leverages the natural blocking behavior of Go channels. If a downstream stage (like our rate limiter) slows down, let the channel fill up. That full channel forces the upstream reader to pause, which safely synchronizes the entire application to a sustainable pace.
- Watch Out for the “Busy Wait” Trap: Be extremely careful when using a default case inside an infinite for loop. Normally, a select statement pauses your goroutine until a channel is ready to send or receive data. If you add a default case, the select becomes non-blocking: when the channels are empty, the code instantly executes the default case and restarts the loop, spinning millions of times per second and consuming your CPU. Only use default when you explicitly want to make a single channel operation non-blocking, like dropping a token when a bucket is full, and only if the outer loop is already paused by something else, like a timer tick.
- Understand the Trade-offs of Parallelism: When you implement a Fan-Out pattern, you gain massive speed improvements and natural load balancing across multiple worker goroutines. However, because those workers operate independently, the final output order is not preserved. Always evaluate whether your specific use case requires strict sequential processing before throwing a worker pool at the problem.
Documenting this progression helped me solidify my own understanding of these concurrency models. Hopefully, breaking down these concepts from a basic pipeline to a production-grade token bucket helps you build more resilient systems as well.