.NET March 24, 2026

Building a Resilient .NET Web API - Part 2


Retry is one of the most commonly used resilience strategies when building applications that rely on external services. While the basic retry concept is simple (try again if something fails), there are different configuration options that control how retries behave, including Backoff Strategies, Jitter, Selective Retry based on status codes, and Retry-After header support. In this article, I explore these retry strategies with practical examples using a .NET 10 Web API and Microsoft’s HTTP resilience package, which is powered by Polly.

If you want to check the first article of this series, which contains an introduction to resilience and the setup of the demo project, you can find it here: https://henriquesd.com/articles/building-a-resilient-net-web-api-part-1

The Retry Strategy

The Retry strategy is one of the simplest and most effective ways to handle transient failures in distributed systems, and it acts as a first line of defence in resilience strategies. Instead of failing immediately when an operation does not succeed, the application attempts the same operation again, giving some time for the issue to resolve itself.

For example, imagine a scenario of a payment system where a payment request fails due to a timeout. With a retry strategy, the application can automatically retry the request and if the retry succeeds, the user never notices the failure.

As another example, suppose a downstream API briefly returns a 503 (Service Unavailable) error. Instead of immediately showing an error, the application can wait for a short predefined interval and retry the request, often succeeding once the service recovers.

Without retries, these temporary glitches would surface as user-facing errors, negatively impacting the overall experience.

How Retry Works

When a request to an external dependency fails (whether due to a timeout, a network glitch, or a temporary service disruption), the retry strategy intercepts that failure and, based on the configuration created for it, it can:

  • Decide whether the failure should be retried at all
  • Retry the request up to a pre-defined number of times
  • Wait a specific amount of time between each retry attempt

If the conditions are met, the request is executed again according to the configured policy.
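Stripped of configuration details, the core mechanism can be sketched as a simple loop. The helper below is a hypothetical illustration, not Polly's actual implementation: it retries a failing operation up to a fixed number of times, with a fixed delay between attempts.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper (not Polly's implementation): retry a failing
// operation up to maxRetryAttempts times, waiting between attempts.
static async Task<T> ExecuteWithRetryAsync<T>(
    Func<Task<T>> operation, int maxRetryAttempts, TimeSpan delay)
{
    for (int attempt = 0; ; attempt++)
    {
        try
        {
            return await operation();
        }
        catch (Exception) when (attempt < maxRetryAttempts)
        {
            // Transient failure: wait, then try the same operation again.
            await Task.Delay(delay);
        }
    }
}

// Usage: the operation fails twice and succeeds on the third call.
int calls = 0;
var result = await ExecuteWithRetryAsync(
    () =>
    {
        calls++;
        if (calls < 3) throw new InvalidOperationException("transient failure");
        return Task.FromResult("ok");
    },
    maxRetryAttempts: 5,
    delay: TimeSpan.FromMilliseconds(100));

Console.WriteLine($"Result: {result} after {calls} calls");
// Prints: Result: ok after 3 calls
```

Everything Polly adds on top of this loop (backoff, jitter, selective handling) is configuration around when to retry and how long to wait.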

Retry Backoff Types

When a request fails and Polly retries it, the backoff type determines how long Polly waits between each retry attempt. Polly v8 provides three backoff types through the DelayBackoffType enum:

  • Constant: The delay between retries is always the same. If you set the delay to 2 seconds, every retry will wait exactly 2 seconds. For example: 2s, 2s, 2s, 2s, 2s.
  • Linear: The delay increases linearly with each retry attempt. If you set the base delay to 1 second, the first retry waits 1 second, the second waits 2 seconds, and so on. For example: 1s, 2s, 3s, 4s, 5s.
  • Exponential: The delay doubles with each retry attempt. If you set the base delay to 1 second, the first retry waits 1 second, the second waits 2 seconds, the third waits 4 seconds, and so on. For example: 1s, 2s, 4s, 8s, 16s.

Exponential backoff is the recommended default for most scenarios, because it progressively reduces the load on a struggling service, giving it more time to recover with each retry.
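To make the three patterns concrete, the sketch below reproduces the delay sequences described above with a small hypothetical helper. This is just the arithmetic for illustration; in practice Polly derives the delays internally from the configured DelayBackoffType.

```csharp
using System;

// Hypothetical helper mirroring how each backoff type derives the wait
// before retry attempt n (1-based) from the configured base delay.
static TimeSpan GetDelay(string backoffType, TimeSpan baseDelay, int attempt) =>
    backoffType switch
    {
        "Constant"    => baseDelay,                            // 2s, 2s, 2s, ...
        "Linear"      => baseDelay * attempt,                  // 1s, 2s, 3s, ...
        "Exponential" => baseDelay * Math.Pow(2, attempt - 1), // 1s, 2s, 4s, ...
        _ => throw new ArgumentOutOfRangeException(nameof(backoffType))
    };

for (int attempt = 1; attempt <= 5; attempt++)
{
    Console.WriteLine(
        $"Attempt {attempt}: " +
        $"constant={GetDelay("Constant", TimeSpan.FromSeconds(2), attempt).TotalSeconds}s, " +
        $"linear={GetDelay("Linear", TimeSpan.FromSeconds(1), attempt).TotalSeconds}s, " +
        $"exponential={GetDelay("Exponential", TimeSpan.FromSeconds(1), attempt).TotalSeconds}s");
}
```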

The UnreliableWeatherApi

Before configuring the different retry strategies, let's add three new endpoints to the UnreliableWeatherApi that simulate specific failure scenarios.

First, let's add a variable to track the requests:

using Microsoft.AspNetCore.Mvc;

namespace HttpResilienceDemo.UnreliableWeatherApi.Controllers;

[ApiController]
[Route("api/[controller]")]
public class WeatherController : ControllerBase
{
    private static int _slowRequestCount;

    // ...
}
  • On line 9, a static counter _slowRequestCount is declared to track how many requests the slow endpoint has received.

Next, the GetSlow method simulates a service that is temporarily down:

[HttpGet("slow")]
public IActionResult GetSlow()
{
    var count = Interlocked.Increment(ref _slowRequestCount);

    if (count <= 4)
    {
        return StatusCode(503, new { Message = "Service temporarily unavailable" });
    }

    Interlocked.Exchange(ref _slowRequestCount, 0);

    return Ok(new { Temperature = 20, Summary = "Cloudy" });
}
  • It returns 503 Service Unavailable for the first 4 requests and then succeeds on the 5th, simulating a service that recovers after a period.
  • On line 4, Interlocked.Increment is used to safely increment the counter in a thread-safe manner.
  • On line 11, the counter is reset to 0 after a successful response, so the cycle can repeat for the next demo run.

Next, the GetRateLimited method simulates a rate-limited service:

[HttpGet("rate-limited")]
public IActionResult GetRateLimited()
{
    if (Random.Shared.Next(1, 4) == 1)
    {
        return Ok(new { Temperature = 22, Summary = "Partly Cloudy" });
    }

    Response.Headers["Retry-After"] = "2";

    return StatusCode(429, new { Message = "Too many requests" });
}
  • This method succeeds approximately 33% of the time, and for failed requests, it returns a 429 Too Many Requests status code.
  • On line 9, the Retry-After response header is set to 2 seconds, indicating to the client how long it should wait before retrying.

Finally, the GetMixed method simulates a service that returns a mix of success and error codes:

[HttpGet("mixed")]
public IActionResult GetMixed()
{
    var roll = Random.Shared.Next(1, 4);

    if (roll == 1)
    {
        return Ok(new { Temperature = 24, Summary = "Mild" });
    }

    if (roll == 2)
    {
        return StatusCode(503, new { Message = "Service unavailable" });
    }

    return StatusCode(500, new { Message = "Internal server error" });
}
  • Approximately 33% of requests succeed with 200, 33% return 503 Service Unavailable, and 33% return 500 Internal Server Error. This endpoint is used to demonstrate selective retry, where Polly only retries on specific status codes, retrying on 503 but stopping immediately on 500.

Organising Resilience Configuration with Extension Methods

Since in this demo we are registering multiple resilient HttpClient instances with different retry strategies, it is a good practice to organise the configuration into extension methods or separate classes. This keeps Program.cs clean and makes each strategy easy to find and maintain.

For this demo, I created a static class called HttpClientResilienceExtensions with one extension method per retry strategy. All methods share a common OnRetry callback defined as a static property:

using Microsoft.Extensions.Http.Resilience;
using Polly;
using Polly.Retry;
using System.Net;

namespace HttpResilienceDemo.ResilientApi.Extensions;

public static class HttpClientResilienceExtensions
{
    private static Func<OnRetryArguments<HttpResponseMessage>, ValueTask> OnRetry => args =>
    {
        Console.WriteLine($"Retry attempt {args.AttemptNumber + 1} after {args.RetryDelay.TotalSeconds:F1}s delay. Status: {args.Outcome.Result?.StatusCode}");
        return default;
    };

    // Extension methods for each retry strategy go here...
}
  • On lines 10 to 14, the OnRetry property defines a shared callback that is reused by all retry strategies. It logs the retry attempt number, the delay before the next attempt, and the HTTP status code of the failed response. By defining it once, we avoid duplicating the same logging logic across every resilience handler.
  • On line 12, AttemptNumber + 1 is used because AttemptNumber is zero-based; this way the user sees "Retry attempt 1" for the first retry.
  • On line 13, default is returned, which is equivalent to ValueTask.CompletedTask.

Note that in a production application, you should consider using ILogger instead of Console.WriteLine for structured logging. You can access the IServiceProvider through the AddResilienceHandler overload that receives a ResilienceHandlerContext as the second parameter, and resolve an ILoggerFactory from it.
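As a rough sketch of that approach (the handler name and logger category below are arbitrary; the relevant parts are the two-parameter AddResilienceHandler overload and context.ServiceProvider):

```csharp
// Sketch only: the two-parameter overload exposes a ResilienceHandlerContext,
// whose ServiceProvider can be used to resolve services such as ILoggerFactory.
.AddResilienceHandler("constant-retry", (pipelineBuilder, context) =>
{
    var logger = context.ServiceProvider
        .GetRequiredService<ILoggerFactory>()
        .CreateLogger("HttpRetries");

    pipelineBuilder.AddRetry(new HttpRetryStrategyOptions
    {
        MaxRetryAttempts = 5,
        Delay = TimeSpan.FromSeconds(2),
        OnRetry = args =>
        {
            // Structured logging instead of Console.WriteLine.
            logger.LogWarning(
                "Retry attempt {Attempt} after {Delay}s delay. Status: {Status}",
                args.AttemptNumber + 1,
                args.RetryDelay.TotalSeconds,
                args.Outcome.Result?.StatusCode);
            return default;
        }
    });
});
```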

When the application runs and a request is retried with exponential backoff, you will see console output like:

Retry attempt 1 after 1.1s delay. Status: ServiceUnavailable
Retry attempt 2 after 2.3s delay. Status: ServiceUnavailable
Retry attempt 3 after 4.5s delay. Status: ServiceUnavailable
Retry attempt 4 after 8.8s delay. Status: ServiceUnavailable

Now let's look at each retry strategy extension method.

Configuring Retry with Constant Backoff

Let's start with the simplest backoff type. The following extension method registers an HttpClient with a constant delay of 2 seconds between each retry:

public static IServiceCollection AddConstantRetryClient(this IServiceCollection services, Uri baseAddress)
{
    services.AddHttpClient("UnreliableWeatherApi-Constant", client =>
    {
        client.BaseAddress = baseAddress;
    })
    .AddResilienceHandler("constant-retry", pipelineBuilder =>
    {
        pipelineBuilder.AddRetry(new HttpRetryStrategyOptions
        {
            MaxRetryAttempts = 5,
            Delay = TimeSpan.FromSeconds(2),
            BackoffType = DelayBackoffType.Constant,
            OnRetry = OnRetry
        });
    });

    return services;
}
  • On line 1, the method extends IServiceCollection and receives the baseAddress as a parameter, so the URL is not hardcoded.
  • On line 11, MaxRetryAttempts is set to 5.
  • On line 12, Delay is set to 2 seconds.
  • On line 13, BackoffType is set to Constant.
  • On line 14, OnRetry references the shared logging callback.

With this configuration, every retry waits exactly 2 seconds: 2s, 2s, 2s, 2s, 2s. This is the simplest option and can be useful when you know the downstream service has a fixed recovery time.

Configuring Retry with Linear Backoff

With linear backoff, the delay increases by the base delay value with each attempt:

public static IServiceCollection AddLinearRetryClient(this IServiceCollection services, Uri baseAddress)
{
    services.AddHttpClient("UnreliableWeatherApi-Linear", client =>
    {
        client.BaseAddress = baseAddress;
    })
    .AddResilienceHandler("linear-retry", pipelineBuilder =>
    {
        pipelineBuilder.AddRetry(new HttpRetryStrategyOptions
        {
            MaxRetryAttempts = 5,
            Delay = TimeSpan.FromSeconds(1),
            BackoffType = DelayBackoffType.Linear,
            OnRetry = OnRetry
        });
    });

    return services;
}
  • On line 12, Delay is set to 1 second as the base delay.
  • On line 13, BackoffType is set to Linear.

With this configuration, the delays increase linearly: 1s, 2s, 3s, 4s, 5s. This provides a gradual increase that gives the service progressively more time to recover.

Configuring Retry with Exponential Backoff

Exponential backoff doubles the delay with each attempt, providing the most aggressive delay growth:

public static IServiceCollection AddExponentialRetryClient(this IServiceCollection services, Uri baseAddress)
{
    services.AddHttpClient("UnreliableWeatherApi-Exponential", client =>
    {
        client.BaseAddress = baseAddress;
    })
    .AddResilienceHandler("exponential-retry", pipelineBuilder =>
    {
        pipelineBuilder.AddRetry(new HttpRetryStrategyOptions
        {
            MaxRetryAttempts = 5,
            Delay = TimeSpan.FromSeconds(1),
            BackoffType = DelayBackoffType.Exponential,
            UseJitter = true,
            OnRetry = OnRetry
        });
    });

    return services;
}
  • On line 12, Delay is set to 1 second as the base delay.
  • On line 13, BackoffType is set to Exponential.
  • On line 14, UseJitter is set to true.

With this configuration, the delays grow exponentially: approximately 1s, 2s, 4s, 8s, 16s (with jitter adding small random variations). This is the recommended default for most scenarios.

Adding Jitter to Prevent Thundering Herd

The UseJitter property adds a small random variation to each retry delay. This is important in distributed systems where multiple clients might be calling the same service. Without jitter, if the service goes down and all clients retry at the same time, they will all retry again at the same time, creating a thundering herd effect that can overwhelm the service just as it tries to recover.

In computer science, the thundering herd problem is a performance-degrading phenomenon that happens when a large number of concurrent requests (often caused by cache expiration, a system restart, or a sudden surge in traffic) simultaneously hit a shared resource or backend service.

With jitter enabled, each client's retry delay is slightly different, spreading the retries over time:

  • Without jitter: Client A retries at 1s, 2s, 4s, 8s, 16s and Client B retries at 1s, 2s, 4s, 8s, 16s (simultaneous)
  • With jitter: Client A retries at 1.1s, 2.3s, 3.8s, 8.5s, 15.2s and Client B retries at 0.9s, 1.7s, 4.2s, 7.6s, 16.8s (spread out)

Note that UseJitter works with all three backoff types, but it is most commonly used with exponential backoff. The combination of exponential backoff with jitter is considered a best practice for resilient HTTP calls.
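The effect is easy to simulate. The sketch below is purely illustrative: Polly's built-in jitter uses a more sophisticated decorrelated algorithm, and the ±25% variation here is an assumed factor for demonstration. The point is that randomizing each delay is what pulls concurrent clients apart.

```csharp
using System;

// Illustrative only: apply a random +/-25% variation (assumed factor) to a
// base delay, the way jitter spreads out retries from concurrent clients.
static double JitteredDelay(double baseSeconds, Random random) =>
    baseSeconds * (0.75 + random.NextDouble() * 0.5);

var clientA = new Random(1); // fixed seeds so each "client" has its own sequence
var clientB = new Random(2);

foreach (var baseDelay in new[] { 1.0, 2.0, 4.0, 8.0, 16.0 })
{
    // Without jitter both clients would wait exactly baseDelay seconds;
    // with jitter their retry moments drift apart.
    Console.WriteLine(
        $"base={baseDelay}s  clientA={JitteredDelay(baseDelay, clientA):F1}s  " +
        $"clientB={JitteredDelay(baseDelay, clientB):F1}s");
}
```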

Retrying on Specific HTTP Status Codes

By default, the HttpRetryStrategyOptions retries on HttpRequestException, on timeouts, and on the transient status codes: 5xx server errors, 408 Request Timeout, and 429 Too Many Requests. In some scenarios, you may want to retry only on specific status codes. For example, you might want to retry on 429 Too Many Requests and 503 Service Unavailable, but not on 500 Internal Server Error.

For that, you can use the ShouldHandle property:

public static IServiceCollection AddSelectiveRetryClient(this IServiceCollection services, Uri baseAddress)
{
    services.AddHttpClient("UnreliableWeatherApi-Selective", client =>
    {
        client.BaseAddress = baseAddress;
    })
    .AddResilienceHandler("selective-retry", pipelineBuilder =>
    {
        pipelineBuilder.AddRetry(new HttpRetryStrategyOptions
        {
            MaxRetryAttempts = 5,
            Delay = TimeSpan.FromSeconds(1),
            BackoffType = DelayBackoffType.Exponential,
            ShouldHandle = static args => ValueTask.FromResult(
                args.Outcome.Result?.StatusCode is HttpStatusCode.TooManyRequests
                    or HttpStatusCode.ServiceUnavailable),
            OnRetry = OnRetry
        });
    });

    return services;
}
  • On lines 14 to 16, the ShouldHandle delegate inspects the response status code. It returns true only when the status code is 429 (TooManyRequests) or 503 (ServiceUnavailable), meaning Polly will only retry for these specific codes.

To demonstrate this, the selective retry endpoint calls /api/weather/mixed, which randomly returns 200, 503, or 500. In the console logs, you will see Polly retry when it receives a 503 but stop immediately when it receives a 500, clearly showing the selectivity in action. When the endpoint returns 200, the request succeeds without any retries. This is useful when you know that certain errors are transient (429, 503) while others indicate a permanent problem that retrying will not solve.

Respecting the Retry-After Header

Some APIs return a Retry-After header in 429 responses, telling the client how many seconds to wait before making the next request. Polly allows you to read this header and use it as the retry delay through the DelayGenerator property:

public static IServiceCollection AddRetryAfterClient(this IServiceCollection services, Uri baseAddress)
{
    services.AddHttpClient("UnreliableWeatherApi-RetryAfter", client =>
    {
        client.BaseAddress = baseAddress;
    })
    .AddResilienceHandler("retry-after", pipelineBuilder =>
    {
        pipelineBuilder.AddRetry(new HttpRetryStrategyOptions
        {
            MaxRetryAttempts = 5,
            BackoffType = DelayBackoffType.Constant,
            Delay = TimeSpan.FromSeconds(1),
            DelayGenerator = static args =>
            {
                if (args.Outcome.Result is HttpResponseMessage response
                    && response.Headers.RetryAfter?.Delta is TimeSpan retryAfter)
                {
                    return ValueTask.FromResult<TimeSpan?>(retryAfter);
                }

                return ValueTask.FromResult<TimeSpan?>(null);
            },
            OnRetry = OnRetry
        });
    });

    return services;
}
  • On lines 14 to 23, the DelayGenerator delegate is configured. It inspects the failed response for a Retry-After header.
  • On line 16, it checks if the response exists and if the RetryAfter header has a Delta value (a TimeSpan).
  • On line 18, if the header is present, its value is returned as the retry delay, overriding the configured Delay and BackoffType.
  • On line 21, if the header is not present, null is returned, which tells Polly to fall back to the default delay calculation.

This way, when the UnreliableWeatherApi returns a 429 response with Retry-After: 2, Polly will wait exactly 2 seconds before retrying, respecting the server's request.

Registering Resilient HttpClients in Program.cs

With all the retry strategies defined as extension methods, Program.cs becomes clean and concise. The base address is read from configuration, and each resilient HttpClient is registered through a fluent chain of extension method calls:

using HttpResilienceDemo.ResilientApi.Extensions;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddControllers();

var unreliableWeatherApiBaseAddress = new Uri(builder.Configuration["UnreliableWeatherApi:BaseAddress"]!);

builder.Services
    .AddConstantRetryClient(unreliableWeatherApiBaseAddress)
    .AddLinearRetryClient(unreliableWeatherApiBaseAddress)
    .AddExponentialRetryClient(unreliableWeatherApiBaseAddress)
    .AddSelectiveRetryClient(unreliableWeatherApiBaseAddress)
    .AddRetryAfterClient(unreliableWeatherApiBaseAddress);

var app = builder.Build();

app.UseHttpsRedirection();

app.MapControllers();

app.Run();
  • On line 1, the HttpResilienceDemo.ResilientApi.Extensions namespace is imported to access the extension methods.
  • On line 7, the UnreliableWeatherApi base address is read from configuration (appsettings.json), so the URL is not hardcoded in the source code.
  • On lines 9 to 14, all five resilient HttpClient registrations are chained together. Each extension method returns the IServiceCollection, allowing the fluent syntax.

Note that the UnreliableWeatherApi:BaseAddress value is configured in appsettings.Development.json:

{
  "UnreliableWeatherApi": {
    "BaseAddress": "https://localhost:5001"
  }
}

The Retry Controller

Now let's look at the RetryController that uses each of these named HttpClients to demonstrate the different retry behaviors:

First, the class declaration and constructor, along with the GetWithConstantRetry action:

using Microsoft.AspNetCore.Mvc;

namespace HttpResilienceDemo.ResilientApi.Controllers;

[ApiController]
[Route("api/[controller]")]
public class RetryController : ControllerBase
{
    private readonly IHttpClientFactory _httpClientFactory;

    public RetryController(IHttpClientFactory httpClientFactory)
    {
        _httpClientFactory = httpClientFactory;
    }

    [HttpGet("constant")]
    public async Task<IActionResult> GetWithConstantRetry()
    {
        var client = _httpClientFactory.CreateClient("UnreliableWeatherApi-Constant");
        var response = await client.GetAsync("/api/weather/slow");

        if (response.IsSuccessStatusCode)
        {
            var content = await response.Content.ReadAsStringAsync();
            return Ok(content);
        }

        return StatusCode((int)response.StatusCode);
    }
}
  • On line 9, the IHttpClientFactory is injected through the constructor, as we saw in the previous article.
  • On line 19, the GetWithConstantRetry method creates the "UnreliableWeatherApi-Constant" client, which uses the constant backoff configuration. It calls the /api/weather/slow endpoint that fails 4 times then succeeds on the 5th request.

The GetWithLinearRetry action uses the "UnreliableWeatherApi-Linear" client with linear backoff:

[HttpGet("linear")]
public async Task<IActionResult> GetWithLinearRetry()
{
    var client = _httpClientFactory.CreateClient("UnreliableWeatherApi-Linear");
    var response = await client.GetAsync("/api/weather/slow");

    if (response.IsSuccessStatusCode)
    {
        var content = await response.Content.ReadAsStringAsync();
        return Ok(content);
    }

    return StatusCode((int)response.StatusCode);
}
  • It also calls /api/weather/slow, but retries with linearly increasing delays.

The GetWithExponentialRetry action uses the "UnreliableWeatherApi-Exponential" client with exponential backoff and jitter:

[HttpGet("exponential")]
public async Task<IActionResult> GetWithExponentialRetry()
{
    var client = _httpClientFactory.CreateClient("UnreliableWeatherApi-Exponential");
    var response = await client.GetAsync("/api/weather/slow");

    if (response.IsSuccessStatusCode)
    {
        var content = await response.Content.ReadAsStringAsync();
        return Ok(content);
    }

    return StatusCode((int)response.StatusCode);
}
  • It also calls /api/weather/slow, but retries with exponentially increasing delays.

The GetWithSelectiveRetry action uses the "UnreliableWeatherApi-Selective" client and calls the /api/weather/mixed endpoint:

[HttpGet("selective")]
public async Task<IActionResult> GetWithSelectiveRetry()
{
    var client = _httpClientFactory.CreateClient("UnreliableWeatherApi-Selective");
    var response = await client.GetAsync("/api/weather/mixed");

    if (response.IsSuccessStatusCode)
    {
        var content = await response.Content.ReadAsStringAsync();
        return Ok(content);
    }

    return StatusCode((int)response.StatusCode);
}
  • This client only retries on 429 and 503 status codes, so when the endpoint returns a 500, Polly does not retry.

The GetWithRetryAfter action uses the "UnreliableWeatherApi-RetryAfter" client and calls the /api/weather/rate-limited endpoint:

[HttpGet("retry-after")]
public async Task<IActionResult> GetWithRetryAfter()
{
    var client = _httpClientFactory.CreateClient("UnreliableWeatherApi-RetryAfter");
    var response = await client.GetAsync("/api/weather/rate-limited");

    if (response.IsSuccessStatusCode)
    {
        var content = await response.Content.ReadAsStringAsync();
        return Ok(content);
    }

    return StatusCode((int)response.StatusCode);
}
  • This client respects the Retry-After header, waiting the server-specified duration before each retry.

Note that each action method follows the same pattern; the only difference is which named HttpClient is created and which UnreliableWeatherApi endpoint is called. All the retry behavior is configured in the HttpClientResilienceExtensions class, keeping the controller code clean.

Running the Demo

To demonstrate these examples, I configured the UnreliableWeatherApi project to run on port 5001 and the ResilientApi project on port 5002.

To run the demo, start both applications. First, start the UnreliableWeatherApi:

dotnet run --project src/HttpResilienceDemo.UnreliableWeatherApi

Then, in a separate terminal, start the ResilientApi:

dotnet run --project src/HttpResilienceDemo.ResilientApi

Now you can test each retry strategy:

# Constant backoff (2s, 2s, 2s, 2s, 2s)
curl https://localhost:5002/api/retry/constant

# Linear backoff (1s, 2s, 3s, 4s, 5s)
curl https://localhost:5002/api/retry/linear

# Exponential backoff with jitter (approximately 1s, 2s, 4s, 8s, 16s)
curl https://localhost:5002/api/retry/exponential

# Selective retry (only on 429 and 503; stops immediately on 500)
curl https://localhost:5002/api/retry/selective

# Retry-After header support
curl https://localhost:5002/api/retry/retry-after

Alternatively, you can also run the Web APIs in Visual Studio and make use of the .http file to send the requests (this file is also included in the demo project).

All endpoints log retry attempts to the console, so you can observe the different delay patterns in real time.

  • When testing the constant endpoint, you will see 4 retries with all delays at exactly 2.0s before the request succeeds on the 5th attempt.
  • When testing the linear endpoint, you will see 4 retries with delays increasing by 1 second each time (1.0s, 2.0s, 3.0s, 4.0s) before success.
  • When testing the exponential endpoint, you will see 4 retries with delays doubling (approximately 1.0s, 2.0s, 4.0s, 8.0s with jitter) before success.
  • When testing the selective endpoint, you will see Polly retry on 503 responses but return immediately on 500 responses, demonstrating that the ShouldHandle filter controls which errors trigger retries.
  • When testing the retry-after endpoint, the delay between retries will be 2.0 seconds as specified by the server's Retry-After header.

Conclusion

Polly offers three backoff types: Constant, Linear, and Exponential. Each one creates a different pattern of delays, tailored to different scenarios. For most applications, exponential backoff with jitter is the recommended default, as it gradually reduces the load on failing services and prevents the thundering herd problem. In the next article, I will cover the Circuit Breaker pattern with Polly.

This is the link for the project in GitHub: https://github.com/henriquesd/HttpResilienceDemo

If you like this demo, I kindly ask you to give a ⭐ in the repository.

Thanks for reading!

