Security Under the Hood | HCS Passwordless - Part 6

In this last post in the series I want to talk about some of the security techniques woven throughout the library. I'm always hesitant when doing things relating to authentication because security is so important. When writing this package I've been asking Claude to help, especially when it came to security audits of the approaches taken. So let's get exploring.

Timing Attacks: The Problem People Forget About

Let's start with one that I've been aware of for quite some time (no pun intended): timing attacks.

Imagine a magic link request endpoint that does this:

var member = await _lookup.FindApprovedAsync(model.Email);
if (member is null) return Accepted();  // fast — no DB results

var token = await GenerateToken(member);
await SendEmail(member.Email, token);   // slow — email sending
return Accepted();

Both branches return 202 Accepted, so the status code is the same. But the response time is different, which you can see by comparing two requests in your browser's dev tools. The "email not found" path takes a few milliseconds, whereas the "email found, send email" path takes hundreds of milliseconds.

An attacker can automate requests and measure response times, and within a few thousand requests they can build a reliable list of which email addresses are registered on your site. So despite having implemented email enumeration protection correctly from a response body perspective, you would have completely undermined it from a timing perspective.

To tackle this, I've added in a FakeWork.DelayAsync(). Every authentication endpoint applies it on every code path: success, failure, rate limit, not found, all of them. The method looks like this:

public static Task DelayAsync(TimeSpan budget, CancellationToken ct = default)
{
    // Random duration between 50% and 100% of budget
    var half = (int)(budget.TotalMilliseconds / 2);
    var ms = half + RandomNumberGenerator.GetInt32(half + 1);
    return Task.Delay(ms, ct);
}

The delay is varied on purpose. If it were fixed, an attacker could use statistical analysis to identify the two different paths. Randomising the delay within a range helps to defeat that analysis approach.
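As a quick sanity check of that range, here's a minimal sketch that pulls the arithmetic out of DelayAsync so it can be sampled without actually waiting. PickDelayMs is a name I've made up for this example, it isn't part of the library, and the snippet assumes a .NET 6+ top-level program.

```csharp
using System;
using System.Security.Cryptography;

// Same arithmetic as DelayAsync, minus the Task.Delay, so the range is easy to inspect.
// PickDelayMs is a hypothetical helper for this example only.
static int PickDelayMs(TimeSpan budget)
{
    var half = (int)(budget.TotalMilliseconds / 2);
    // GetInt32(n) returns a value in [0, n), so the result lies in [half, 2 * half].
    return half + RandomNumberGenerator.GetInt32(half + 1);
}

var budget = TimeSpan.FromMilliseconds(250);
int min = int.MaxValue, max = int.MinValue;

for (var i = 0; i < 10_000; i++)
{
    var sample = PickDelayMs(budget);
    min = Math.Min(min, sample);
    max = Math.Max(max, sample);
}

// With a 250 ms budget every sample falls between 125 ms and 250 ms inclusive.
Console.WriteLine($"min={min} ms, max={max} ms");
```

RandomNumberGenerator is used rather than System.Random so the attacker can't predict the next delay from earlier ones.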

The default budget is 250ms. One critical mistake to avoid: the delay must be awaited before returning. Don't fire-and-forget it (yes, I did this in my first implementation and wondered why Claude still flagged it as a security issue):

// WRONG — response is already sent when delay fires
_ = FakeWork.DelayAsync(_options.FakeWorkBudget);
return Unauthorized();

// RIGHT
await FakeWork.DelayAsync(_options.FakeWorkBudget, ct);
return Unauthorized();
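One way to make it hard to forget the await on a new code path is to put the delay in a finally block. This is only a sketch of that idea, not how the library's endpoints are actually written: RequestMagicLinkAsync, the hard-coded email and the 20ms stand-in for the real work are all assumptions for the example.

```csharp
using System;
using System.Diagnostics;
using System.Security.Cryptography;
using System.Threading;
using System.Threading.Tasks;

// Same shape as FakeWork.DelayAsync from the post.
static Task DelayAsync(TimeSpan budget, CancellationToken ct = default)
{
    var half = (int)(budget.TotalMilliseconds / 2);
    return Task.Delay(half + RandomNumberGenerator.GetInt32(half + 1), ct);
}

// Hypothetical handler: returns the HTTP status code it would send.
static async Task<int> RequestMagicLinkAsync(string email, TimeSpan budget)
{
    try
    {
        if (email != "known@example.com") return 202; // fast path: nothing to send
        await Task.Delay(20);                          // stands in for token + email work
        return 202;
    }
    finally
    {
        // Awaited on every exit path, so no branch can skip the delay.
        await DelayAsync(budget);
    }
}

var sw = Stopwatch.StartNew();
var status = await RequestMagicLinkAsync("unknown@example.com", TimeSpan.FromMilliseconds(100));
sw.Stop();

// Even the fast path waits for roughly half the budget or more (about 50 ms here).
Console.WriteLine($"{status} after {sw.ElapsedMilliseconds} ms");
```

Because the await sits in finally, an early return or an exception still pays the delay before the response goes out.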

Constant-Time Comparison

This next one was completely new to me, but it is another timing based risk area.

When you compare a submitted code against a stored hash, the natural thing to write is a simple string comparison like this:

if (submittedHash == storedHash) ...

However, what I didn't know is that string comparison in .NET is early-exit. What this means is that it stops as soon as it finds a mismatch. First character wrong? Returns in nanoseconds. First 99 characters right? Takes slightly longer.

A sufficiently patient attacker can time these differences across many requests and gradually home in on the correct value one character at a time.

This is apparently called a timing oracle attack and while it's difficult to exploit in practice (you need to control the submitted value and measure many thousands of responses), according to Claude "it's the kind of thing that belongs in the threat model of an auth library".

.NET provides a fix for this in the form of CryptographicOperations.FixedTimeEquals() from the BCL, which always takes the same time regardless of where the mismatch occurs.

[MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.NoOptimization)]
public static bool Equals(string a, string b)
{
    var aBytes = Encoding.UTF8.GetBytes(a);
    var bBytes = Encoding.UTF8.GetBytes(b);
    return CryptographicOperations.FixedTimeEquals(aBytes, bBytes);
}

Claude's explanation of the code:

The MethodImpl attribute is defensive belt-and-braces — it tells the JIT not to inline or optimise this method, which could otherwise introduce the very branching behaviour we're trying to avoid. In practice FixedTimeEquals already handles this internally, but being explicit costs nothing.

This is used in the OTP token provider for code comparison. The magic link verify path is different: it goes through UserManager.VerifyUserTokenAsync(), which uses the Data Protection API internally and already does constant-time comparison.
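To show how the pieces fit together for an OTP, here's a minimal sketch that hashes the submitted code and compares it to the stored hash in constant time. HashOtp and VerifyOtp are names invented for this example, and comparing raw hash bytes (rather than hex strings) is my assumption, not necessarily what the token provider does.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: SHA-256 of the code as raw bytes (always 32 bytes long).
static byte[] HashOtp(string code) =>
    SHA256.HashData(Encoding.UTF8.GetBytes(code));

// Hypothetical helper: hash the submitted code, then compare in constant time.
static bool VerifyOtp(string submittedCode, byte[] storedHash) =>
    CryptographicOperations.FixedTimeEquals(HashOtp(submittedCode), storedHash);

var stored = HashOtp("482913");

Console.WriteLine(VerifyOtp("482913", stored)); // True
Console.WriteLine(VerifyOtp("482914", stored)); // False
```

FixedTimeEquals returns early when the two inputs have different lengths, so it only hides where a mismatch is, not a length difference. Comparing two SHA-256 hashes sidesteps that because both sides are always the same length.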

Rate Limiting Atomicity

The rate limiter implementation has a subtlety worth explaining. If I'd taken an approach to the increment that looks like this, there would have been a risk around concurrency:

var count = _buckets.GetValueOrDefault(key, 0);
_buckets[key] = count + 1;
return count < limit;

Under concurrent load, two requests can both read 0 and both write 1, slipping under the limit. This is because the read and the write are two separate operations, so there's no guarantee they're atomic.

To tackle this, .NET provides ConcurrentDictionary. Its GetOrAdd method ensures the counter object exists, then Interlocked.Increment is used for the actual count:

var counter = _buckets.GetOrAdd(key, _ => new Counter());
var count = Interlocked.Increment(ref counter.Value);
return count <= limit;

Claude's explanation of the code:

Interlocked.Increment is a single CPU instruction (LOCK XADD on x86). No thread can observe the value between the read and the write. This makes the rate limiter safe under concurrent load without needing locks.
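Here's a small runnable sketch of that pattern under load. I've used StrongBox<int> as a stand-in for the Counter class because it exposes a public Value field that can be passed by ref. TryAcquire is a hypothetical name, and window expiry is left out to keep the example short.

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// StrongBox<int> stands in for the Counter class: a reference type with a public int field.
var buckets = new ConcurrentDictionary<string, StrongBox<int>>();

// Hypothetical helper: true while the key is still within its limit.
bool TryAcquire(string key, int limit)
{
    // GetOrAdd hands every thread the same stored counter instance for a key.
    var counter = buckets.GetOrAdd(key, _ => new StrongBox<int>(0));
    // Each caller gets a distinct count back, so exactly `limit` callers pass.
    var count = Interlocked.Increment(ref counter.Value);
    return count <= limit;
}

var allowed = 0;
Parallel.For(0, 1_000, _ =>
{
    if (TryAcquire("203.0.113.7", 10)) Interlocked.Increment(ref allowed);
});

Console.WriteLine(allowed); // 10
```

However the 1,000 calls interleave, exactly 10 get through, which is the property the read-then-write version can't guarantee.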

The Distributed Cache TOCTOU Window

For the single-use token store, the distributed cache implementation has a documented limitation I need to talk about. IDistributedCache doesn't expose an atomic check-and-set operation, meaning the implementation has to make two separate calls:

public async Task<bool> TryMarkUsedAsync(string tokenHash, TimeSpan ttl, CancellationToken ct)
{
    var key = $"pwl:singleuse:{tokenHash}";
    var existing = await _cache.GetAsync(key, ct);    // step 1: get
    if (existing is not null) return false;            // already used

    await _cache.SetAsync(key, Placeholder, ttl, ct); // step 2: set
    return true;
}

Between step 1 and step 2, there is a narrow window where two concurrent requests could both read null (token not yet marked used), and both proceed to sign in. In practice this window is microseconds wide and requires two requests to arrive at essentially the same instant, but it exists.

For magic links and OTP, the secondary defence is that the underlying UserManager.VerifyUserTokenAsync() is also single-use: the token is only valid once from the Data Protection API's perspective, so a replay of the same token would fail there regardless.

For scenarios where you need true atomicity, the solution would be to use something like a Redis-backed implementation using GETSET or Lua scripting (yeah this came from Claude, I've not heard of it or looked further into it). To allow for this potential requirement the package allows for a custom implementation to be used:

builder.AddPasswordlessMagicLink(cfg => cfg
    .UseSingleUseTokenStore<AtomicRedisTokenStore>()
);

The in-memory default that I ship with should be fine for single-node deployments, and the distributed cache implementation is fine for most production cases given the secondary defences. But if your threat model includes highly coordinated simultaneous replay attacks, you should replace it.
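For comparison, on a single node the check and the set can be collapsed into one atomic call with ConcurrentDictionary.TryAdd. This is a sketch of the idea rather than the actual in-memory store that ships with the package: TryMarkUsed is a simplified, synchronous stand-in and it never evicts expired entries.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Key: token hash, value: when the marker could be evicted (eviction not shown).
var used = new ConcurrentDictionary<string, DateTimeOffset>();

// Hypothetical helper: true only for the first caller to present a given token hash.
bool TryMarkUsed(string tokenHash, TimeSpan ttl)
{
    var key = $"pwl:singleuse:{tokenHash}";
    // TryAdd is the atomic check-and-set: it fails if the key is already present.
    return used.TryAdd(key, DateTimeOffset.UtcNow.Add(ttl));
}

var winners = 0;
Parallel.For(0, 100, _ =>
{
    if (TryMarkUsed("abc123", TimeSpan.FromMinutes(10))) Interlocked.Increment(ref winners);
});

Console.WriteLine(winners); // 1
```

Out of 100 concurrent attempts with the same hash only one wins, because there's no gap between the check and the write for a second request to slip into.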

Return URL Validation

Because the redirect URL is part of the sign-in process, I discussed the risks with Claude (this was sparked by an issue raised by a pen test report relating to return URLs and headers in OAuth, possibly a conversation for another time). This is what Claude had to say:

Open redirect vulnerabilities in authentication flows are particularly dangerous. The pattern is: attacker crafts a link to your login page with a returnUrl pointing at their malicious site. Victim signs in, gets redirected to attacker's site, attacker steals session data.

The fix is strict validation to ensure the redirect URL is part of the current site before allowing it to be executed:

public static string Sanitize(string? returnUrl, string fallback = "/")
{
    if (string.IsNullOrWhiteSpace(returnUrl)) return fallback;

    // Only allow relative URLs that start with / but not // or /\
    // // would be interpreted as protocol-relative by browsers,
    // and browsers treat a backslash like a forward slash, so /\ is rejected too
    if (returnUrl.StartsWith('/')
        && !returnUrl.StartsWith("//", StringComparison.Ordinal)
        && !returnUrl.StartsWith("/\\", StringComparison.Ordinal))
        return returnUrl;

    return fallback;
}

I asked Claude why it needed the check for //. This is what it had to say:

The //evil.com pattern is the one people often forget. A URL starting with // is protocol-relative — browsers interpret it as https://evil.com (or whatever scheme the current page uses). Checking for / but not // catches this.

Because of this risk, every redirect in the library goes through ReturnUrlValidator.Sanitize() first.
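To make the behaviour concrete, here's a self-contained sketch that runs a compact copy of the check against a few inputs. Note that this copy also rejects /\, because browsers treat a backslash like a forward slash (ASP.NET Core's Url.IsLocalUrl does the same), and the URLs are just example values.

```csharp
#nullable enable
using System;

// Compact copy of the validator so the example runs on its own.
static string Sanitize(string? returnUrl, string fallback = "/")
{
    if (string.IsNullOrWhiteSpace(returnUrl)) return fallback;

    // Accept only site-relative paths: a single leading slash,
    // not // (protocol-relative) and not /\ (treated like // by browsers).
    if (returnUrl.StartsWith('/')
        && !returnUrl.StartsWith("//", StringComparison.Ordinal)
        && !returnUrl.StartsWith("/\\", StringComparison.Ordinal))
        return returnUrl;

    return fallback;
}

Console.WriteLine(Sanitize("/account/orders"));   // /account/orders
Console.WriteLine(Sanitize("//evil.com"));        // /
Console.WriteLine(Sanitize("/\\evil.com"));       // /
Console.WriteLine(Sanitize("https://evil.com"));  // /
Console.WriteLine(Sanitize(null));                // /
```

Anything that isn't clearly a local path falls back to the site root rather than being "fixed up", which keeps the rule easy to reason about.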

Token Hashing

Magic link tokens and OTP codes are stored as SHA-256 hashes. WebAuthn challenge states are stored as JSON blobs keyed by ceremony ID, but the challenge itself (a random nonce) isn't stored separately; it lives inside the state object.

The reason for this hashing is defence in depth. If the cache or distributed storage were ever compromised, an attacker would find hashes rather than usable tokens. This, combined with short TTLs (5–15 minutes), means that the window for exploiting a compromised store is small.
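As a rough illustration of the idea, here's a sketch where the raw token only ever goes to the user and the store is keyed by its hash. HashToken, the Dictionary standing in for the cache and the hex encoding are all assumptions for the example, not the library's actual storage format.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: SHA-256 of the token as an uppercase hex string.
static string HashToken(string token) =>
    Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(token)));

// The raw token is what would be emailed to the user.
var token = Convert.ToBase64String(RandomNumberGenerator.GetBytes(32));

// Only the hash is stored, alongside a short expiry.
var store = new Dictionary<string, DateTimeOffset>
{
    [HashToken(token)] = DateTimeOffset.UtcNow.AddMinutes(15)
};

// Verification hashes the presented token and looks that up.
Console.WriteLine(store.ContainsKey(HashToken(token))); // True

// Someone reading the store sees only the hash, never the token itself.
Console.WriteLine(store.ContainsKey(token));            // False
```

A plain unsalted hash is reasonable here because the token is 32 random bytes, so there's nothing to guess or look up in a table. A 6-digit OTP has far less entropy, so there the short TTL and rate limiting carry most of the weight.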

What "Secure Defaults" Actually Means

Because I was building something relating to authentication, I wanted the package to be "Secure" out of the box, or at least as secure as I could make it. This meant giving the default config options sensible starting values, reducing the amount of config that an end user needs to do.

If you look at the library from the outside, the defaults are:

Single-use enforcement: on
Rate limiting: on (10 req/min per IP, 5 req/hour per email)
Fake-work delays: on (250ms budget)
Token hashing: always
Constant-time comparison: always

I think this is the right approach for an auth library because the default behaviour should be the safe behaviour.

So thanks for sticking with me. If you've read this entire series, thank you! If you've only read this article, thank you as well.

I hope it's been useful, whether you're planning to use the packages, build something similar yourself, or just interested in what I've been working on.

Finally, I must give a big thank you to Claude. It has allowed me to push myself to build something I probably would have avoided in the past, to question things quickly, and to improve my knowledge in an area I wasn't familiar with. AI is changing the way we dev, and it continues to terrify and impress me.

If anything raised a question or you spotted something I've missed, I'd genuinely like to hear about it. Find me on the Umbraco Discord as Nik, on Mastodon, on X, or open an issue in the GitHub repository.
