The N×log(N) Limit of Multi-Factor Authentication.
2023-12-08 | Nazar Vinnichuk <nazar@depauth.com>
TL;DR: Given N authentication factors, the optimal number of authentication options within each factor approaches log2(N) as N grows. However, some problems are only solvable with small N — more factors is not always better.
Let us define a simple model of multi-factor authentication. We use locks to protect our homes from intruders. It is common to have a few locks to make it more difficult for strangers to get inside. It is also common to have a few identical keys for each lock as a backup in case some keys are lost.
The locks correspond to authentication factors and the keys correspond to authentication options within each factor. Let us denote the number of locks as $x$ and the number of identical keys for each lock as $y$. We assume that there is the same number of keys for each lock and that each key fits only one lock.
We lose our keys sometimes. If we lose all the keys of a lock, we can no longer enter our own home — at least not the normal way. Let us denote the probability of losing any one key during the next year $p$. We assume that each key is independent and losing one does not affect the probability of losing another.
Similarly, keys can be compromised by getting into wrong hands. If someone gets hold of a key for each one of the locks, they can get inside. For simplicity, we assume that the probability of a key being compromised is equal to the probability of a key being lost, $p$.
We let $a$ denote the probability of being locked out of our own home during the next year that we are comfortable with. Normally, $a \ll p$. Similarly, $b$ is the acceptable probability of someone unauthorized getting access to our home during the next year.
The constraints of the multi-factor authentication problem can now be expressed mathematically. $$ \begin{cases} 1 - (1 - p^y)^x \le a \\ (1 - (1 - p)^y)^x \le b \end{cases} $$ The subtle difference between the left sides of the inequations comes from the fact that you lose access to your home if you lose all keys for one lock, whereas unauthorized access comes from compromise of any one key for each lock.
The inequations let us express $y$ from the other variables. $$ \begin{cases} y \ge \log_p (1-(1-a)^\frac{1}{x}) \\ y \le \log_{1-p}{(1-b^\frac{1}{x}}) \end{cases} $$
If we let $x \rightarrow \infty$, we get constraints that do not depend on $a$ and $b$. $$ \begin{cases} y \ge -\log_p x \\ y \le -\log_{1-p}x \end{cases} $$ Since $\log_2{x} = -\log_{\frac{1}{2}}{x}$, it is always between $-\log_p x$ and $-\log_{1-p} x$.
What are the implications? If $p \le \frac12$, we can solve the authentication problem for any $0 \lt a \lt 1$ and $0 \lt b \lt 1$ by choosing a large enough number of locks $x$ and having $\log_2x$ keys for each lock. If $p \gt \frac12$, no amount of keys will help us if the number of locks is large enough. There can still be solutions with fewer locks, though. For example, if $a = b = p \gt \frac12$, we are satisfied only with $x = y = 1$.
You can play around with the problem on Desmos. All the $(x, y)$ points that satisfy the constraints are within the green area. The purple line is $y = \log_2x$ and it always enters the green area at some point if $p \le \frac12$.