A Dice Game

It’s no secret that I really enjoy solving interview brainteaser problems. I recently stumbled across a problem from a Jane Street Interview that I thought was interesting.

Problem Statement

The original problem is as follows: You are playing a game with a fair \(n\) sided dice, with faces 1 through \(n\) and \(T\) turns. The die starts with side 1 facing up. On each turn, you may either

  1. Receive a payment equal to the value of the current face up side of the dice
  2. Re-roll the dice.

The question is: What is the optimal strategy, and what is the most you would be willing to pay to play this game?

In this blog post we will consider a continuous version of the game and discuss how our insights would then carry over to the discrete case.

In our continuous case, instead of a discrete dice, we will play with a “dice” that acts as a uniform random number generator from \([0,1]\). We will also assume that we begin with a face up value of zero.


Solution

Before diving into calculations, we can think a about what form the optimal strategy may take. Our first insight is that on any given turn, if the current “face-up” side of the dice is very close to 1, we should probably choose to receive payment. Furthermore, if we are on our last turn, it doesn’t matter what side of the dice is up, it is optimal to choose to receive the payment. We can begin to reason that for \(k\) turns left, the optimal strategy might have some threshold function \(v_k\) where if on the \(k\)-th to last turn the face up value is \(x\) then it is optimal to receive payment only if \(x \geq v_k\) and it is optimal to re-roll otherwise.

First let’s try to show at there exists this threshold function. Let \(w(v,k)\) be the expected value of the game, under optimal play, starting with value \(v\) face up and having \(k\) turns left.

We have that \(w(v,0)= 0\) and a recurrence relation

\[\begin{equation} w(v,k)= \max\left\{\underbrace{v + w(v, k - 1)}_{\text{receive payout}}, \; \underbrace{\int_0^1 w(x, k - 1) \; dx}_{\text{re-roll}}\right\}.\label{eq: cont rec.} \end{equation}\]

We claim that \(w(v,k)\) is a continuous function in \(v\). This can be proven by induction on \(k\) and noting that taking the max of continuous functions preserves continuity.

Now, we claim that \(w(v,k)\) is a non-decreasing function in \(v\). This can again be proven by induction on \(k\) using (\ref{eq: cont rec.}). With these, If we consider the cases where \(v = 0\) and \(v = 1\), we see that

\(w(0,k-1) \leq \int_0^1 w(x, k - 1) \; dx\) and \(1 +w(1,k-1) > \int_0^1 w(x, k - 1)\). It then follows by the fact that \(w(v,k)\) is continuous and non-decreasing and the intermediate value theorem that for each \(k\) there is some smallest value of \(v \in [0,1]\) such that receiving payout is optimal if the face up side is at least this value. It remains to find the for each \(k\) what this threshold \(v_k\) is.

We can reason that \(v_k\) ought to be an increasing in \(k\). That is, if you are willing to receive a payout of \(x\) with \(k\) turns remaining, then you would also be willing to accept it with at most \(k-1\) turns remaining. With this in mind, we can begin to derive a recurrence relation for \(v_k\).

Equation (\ref{eq: cont rec.}) and the existence of \(v_k\) imply that we have the equality \(\begin{equation} v_k + w(v_k, k-1) = \int_0^1 w(x, k - 1) \; \label{eq:v_k equal}. \end{equation}\)

However, since \(v_k\) is an non-decreasing function, we have that \(v_k + w(v_k, k-1) = k v_k\) (meaning that if we we take payout, we will keep taking that payout every turn until the game ends). If we denote \(I_k := \int_0^1 w(x, k)\), this give us the relation

\[\begin{equation} \boxed{k v_k = I_{k-1}} \label{eq:I_k def} \end{equation}\]

We can then expand the RHS of (\ref{eq:v_k equal}) as

\[\begin{align*} \int_0^1 w(x, k - 1) \; dx & = \int_0^{v_{k-1} }w(x, k - 1)\; dx + \int_{v_{k-1} }^1 w(x, k - 1)\; dx\\ & = \int_0^{v_{k-1} }\left(I_{k-2}\right)\; dx+ \int_{v_{k-1} }^1 (k-1) x \; dx\\ & = v_{k-1} I_{k-2} + \frac{k-1}{2}\left(1 - v_{k-1} ^2\right)\\ & = (k-1)v_{k-1}^2 + \frac{k-1}{2}\left(1 - v_{k-1} ^2\right)\\ & = \frac{k-1}{2}\left(1 + v_{k-1}^2\right). \end{align*}\]

If we then combine this with (\ref{eq:v_k equal}) and (\ref{eq:I_k def}), we have a recurrence relation for \(v_k\)

\[\begin{equation} \boxed{v_k = \frac{k-1}{2k}\left(1 + v_{k-1}^2\right), \quad v_1 = 0} \label{eq:v_k req rel}. \end{equation}\]

Unfortunately, we cannot solve for a closed form solution to (\ref{eq:v_k req rel}). But we can solve for the numerical value of our game for a given number of turns. Starting with a current value of zero and \(T\) turns, the value of the game is given by \(w(0,T)\). Since $v_k > 0$ for all \(k > 1\), the optimal first move is to re-roll and which give us a value of \(\begin{equation} w(0,T) = \int_0^1 w(x, T-1) = I_{T-1} = T v_T \end{equation}\)

Which means that the value of our game with \(T\) turns is \(\boxed{w(0,T)= Tv_T}\).

We can say a few more things about the continuous case before concluding with a few remarks on what the optimal strategy would be in the discrete case. Recall that \(v_k\) is increasing in \(k\). It is also bound above by 1. Therefore the monotone convergence theorem implies that \(v_k\) converges. In the limit we find that \(v_k\to 1\) which matches our intuition that with many turns remaining, we are willing to wait for a value very close to 1. Below are plots of \(v_k\) for \(k\leq 50\) and the value of the game for \(T \leq 50\)

Fig 1: Plotting $v_k$ for for all $k \leq 50$.

Fig 2: Plotting $w(0,T)$ for for all $T \leq 50$.

To conclude, how might we use what we have found to tackle the discrete case? The analysis is exactly the same but now the cutoff thresholds will be discrete. The recurrence relation in (\ref{eq: cont rec.}) will now be

\(\begin{equation} w(v,k)= \max\left\{\underbrace{v + w(v, k - 1)}_{\text{receive payout}}, \; \underbrace{\frac{1}{n} \sum_{u=1}^n w(u, k - 1)}_{\text{re-roll}}\right\}.\label{eq: disc rec.} \end{equation}\) Otherwise, the insights from the continuous case still hold!