Abstract Formulation of the SEU Sensitivity Model

Foundational Report 1


A formal specification of the softmax choice model with sensitivity parameter α, including complete proofs of the three fundamental properties that characterize sensitivity to value maximization.

Published: May 12, 2026

0.1 Introduction

This report establishes the theoretical foundations for the intended interpretation of the models that are discussed in subsequent reports. We derive three properties of the softmax choice model with respect to an arbitrary value function, then show how these properties apply specifically when values are subjective expected utilities (SEU). While mathematically straightforward, these properties are essential to our intended interpretation: when value is taken to be subjective expected utility, \(\alpha\) can be interpreted as measuring a decision maker’s alignment with the normative standard of SEU rationality—how consistently their choices track the SEU ranking.

This separation clarifies an important conceptual point: the core choice-theoretic results are independent of how values are constructed—they follow from the structure of softmax choice alone. The SEU interpretation provides the substantive behavioral content and connects our model to classical decision theory (Savage 1954; von Neumann and Morgenstern 1947).

0.2 General Softmax Choice Model

0.2.1 Notation and Definitions

We begin with abstract notation for the general softmax choice model, then show how it specializes to SEU. The notation is chosen to align with our Stan implementations.

Notation Summary (Abstract Model)

  • \(\mathcal{R} = \{1, 2, \ldots, R\}\): set of distinct alternatives
  • \(N_m\): number of alternatives available in problem \(m\)
  • \(V: \mathcal{R} \to \mathbb{R}\): value function assigning utilities to alternatives
  • \(V(r)\): value of alternative \(r\)
  • \(\alpha \in \mathbb{R}_+\): sensitivity parameter (non-negative)

Notation used in proofs: We write \(\mathcal{R}^* = \{r : V(r) = V^*\}\) for the set of value-maximizing alternatives, where \(V^* = \max_r V(r)\), and \(\mathcal{R}^- = \mathcal{R} \setminus \mathcal{R}^*\) for suboptimal alternatives.

0.2.2 The Softmax Choice Rule

The probability that a decision maker selects alternative \(r\) from a choice set is given by the softmax rule:

\[ P(\text{choose } r \mid \alpha, V) = \frac{\exp(\alpha \cdot V(r))}{\sum_{j \in \mathcal{R}} \exp(\alpha \cdot V(j))} \tag{1}\]

This functional form has historical precedents in several research traditions. Luce (1959) derived a ratio-scale choice rule from axiomatic foundations, working with abstract “scale values” rather than utilities per se. McFadden (1974) arrived at the same functional form from random utility theory in econometrics. The rule also appears in statistical mechanics as the Boltzmann distribution, where the sensitivity parameter is called inverse temperature, often written as \(\beta = 1/T\). The temperature parameterization \(T \to \infty\) corresponds to \(\alpha \to 0\) (high temperature = random choice), while \(T \to 0\) corresponds to \(\alpha \to \infty\) (low temperature = deterministic optimization). We adopt \(\alpha\) rather than \(T\) throughout this report series because higher \(\alpha\) corresponds to higher sensitivity—a more intuitive direction for interpretation. The term “softmax” itself comes from machine learning, where this transformation is ubiquitous.
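The α–temperature correspondence is easy to verify numerically. The sketch below (illustrative values; a locally defined, numerically stable softmax helper) checks that sensitivity \(\alpha\) and temperature \(T = 1/\alpha\) produce identical choice probabilities.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax: shift by the max before exponentiating
    z = np.exp(x - np.max(x))
    return z / z.sum()

V = np.array([0.2, 0.5, 0.8])   # illustrative values
alpha = 4.0
T = 1.0 / alpha                  # temperature correspondence: alpha = 1/T

p_sensitivity = softmax(alpha * V)   # sensitivity parameterization
p_boltzmann = softmax(V / T)         # Boltzmann weights exp(V_j / T), normalized

assert np.allclose(p_sensitivity, p_boltzmann)
```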

The sensitivity parameter \(\alpha\) controls how deterministically choices track value differences.

Parameter Space for α

While the limiting behavior at \(\alpha = 0\) yields uniform choice (Property 3), in practice we restrict attention to \(\alpha \in (0, \infty)\). The Stan constraint real<lower=0> alpha is a hard sampler bound that technically permits \(\alpha = 0\); the substantive content comes from the prior. The choice of prior is therefore consequential: a standard-normal prior combined with the lower-bound constraint—equivalent to a half-normal prior—would place positive density at \(\alpha = 0\), making uniform choice an attainable parameter value. We instead use a lognormal prior in subsequent reports, which assigns probability zero to \(\alpha = 0\). Under the model as specified, then, uniform choice is a limiting behavior rather than an attainable parameter value.

In our Stan implementations, this corresponds to chi[m] = softmax(alpha * eta_m) where eta_m contains the expected utilities of alternatives available in problem \(m\).

This rule has several appealing properties:

  1. Probabilistic: Assigns positive probability to all alternatives
  2. Monotonic in value: Higher-value alternatives are more likely to be chosen
  3. Parameterized sensitivity: α controls how sharply choices concentrate on high-value alternatives
import numpy as np
import matplotlib.pyplot as plt
from scipy.special import softmax

# Demonstrate softmax with varying alpha
values = np.array([0.2, 0.5, 0.8])
alphas = np.linspace(0.01, 10, 100)

probs = np.array([softmax(a * values) for a in alphas])

fig, ax = plt.subplots(figsize=(8, 5))
for i, label in enumerate(['V=0.2 (low)', 'V=0.5 (medium)', 'V=0.8 (high)']):
    ax.plot(alphas, probs[:, i], label=label, linewidth=2)

ax.axhline(y=1/3, color='gray', linestyle='--', alpha=0.5, label='Uniform (α→0)')
ax.set_xlabel('Sensitivity (α)', fontsize=12)
ax.set_ylabel('Choice Probability χ', fontsize=12)
ax.set_title('Softmax Choice Probabilities', fontsize=14)
ax.legend()
ax.set_ylim(0, 1)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
Figure 1: Softmax choice probabilities for three alternatives with values V = (0.2, 0.5, 0.8) as sensitivity α varies. When values represent subjective expected utilities (SEU), V = η = ψᵀυ.

0.3 Fundamental Properties of Softmax Choice

The following three properties hold for any value function \(V: \mathcal{R} \to \mathbb{R}\). This generality is important—the properties follow from the softmax functional form alone, not from any particular theory of value.

0.3.1 Property 1: Monotonicity in Sensitivity

Theorem 1 (Monotonicity)

For any value function \(V: \mathcal{R} \to \mathbb{R}\) with \(\mathcal{R}^- \neq \emptyset\) (not all alternatives attain the maximal value), holding \(V\) fixed:

  • For \(r \in \mathcal{R}^*\) (value-maximizing): \(P(\text{choose } r \mid \alpha, V)\) is strictly increasing in \(\alpha\)
  • For suboptimal alternatives: the total probability \(P(\mathcal{R}^-) = \sum_{r \in \mathcal{R}^-} P(\text{choose } r \mid \alpha, V)\) is strictly decreasing in \(\alpha\). An individual \(r \in \mathcal{R}^-\) need not be monotonic: its probability is decreasing at \(\alpha\) exactly when \(V(r) < \mathbb{E}[V]\), where \(\mathbb{E}[V] = \sum_j P(j) \cdot V(j)\) is the mean value under the current choice distribution; this condition holds for all \(\alpha\) when \(r\) attains the minimum value, and for all sufficiently large \(\alpha\) in general.

Part A: Value-maximizing alternatives (\(r \in \mathcal{R}^*\))

Let \(r \in \mathcal{R}^*\), so that \(V(r) = V^*\). Define the partition function: \[ Z(\alpha) = \sum_{j \in \mathcal{R}} \exp(\alpha \cdot V(j)) \]

Taking the derivative of \(P(r)\) with respect to \(\alpha\): \[ \frac{\partial P(r)}{\partial \alpha} = \frac{\partial}{\partial \alpha} \left[\frac{\exp(\alpha \cdot V(r))}{Z(\alpha)}\right] \]

Using the quotient rule: \[ \frac{\partial P(r)}{\partial \alpha} = \frac{Z(\alpha) \cdot V(r) \cdot \exp(\alpha \cdot V(r)) - \exp(\alpha \cdot V(r)) \cdot Z'(\alpha)}{Z(\alpha)^2} \]

Simplifying: \[ \frac{\partial P(r)}{\partial \alpha} = P(r) \cdot \left[V(r) - \frac{Z'(\alpha)}{Z(\alpha)}\right] \]

Computing \(Z'(\alpha)\): \[ Z'(\alpha) = \sum_{j \in \mathcal{R}} V(j) \cdot \exp(\alpha \cdot V(j)) \]

Therefore: \[ \frac{Z'(\alpha)}{Z(\alpha)} = \sum_{j \in \mathcal{R}} V(j) \cdot P(j) = \mathbb{E}[V] \]

where \(\mathbb{E}[V]\) is the expected value under the current choice distribution.

Thus: \[ \frac{\partial P(r)}{\partial \alpha} = P(r) \cdot [V^* - \mathbb{E}[V]] \]

Since \(V^* = \max_j V(j)\) and \(\mathbb{E}[V]\) is a weighted average: \[ \mathbb{E}[V] = \sum_{j \in \mathcal{R}} P(j) \cdot V(j) \leq V^* \]

with equality only when all probability mass is concentrated on \(\mathcal{R}^*\) (which occurs only in the limit \(\alpha \to \infty\)).

For any finite \(\alpha\), all alternatives receive positive probability under the softmax rule (since \(\exp(x) > 0\) for all real \(x\)), ensuring that both optimal and suboptimal alternatives contribute to the expectation. Hence \(\mathbb{E}[V] < V^*\) strictly, so: \[ \frac{\partial P(r)}{\partial \alpha} = P(r) \cdot [V^* - \mathbb{E}[V]] > 0 \quad \blacksquare \]

Part B: Suboptimal alternatives (\(r \notin \mathcal{R}^*\))

For \(r \notin \mathcal{R}^*\), the same derivation gives: \[ \frac{\partial P(r)}{\partial \alpha} = P(r) \cdot [V(r) - \mathbb{E}[V]] \]

Summing the Part A result over \(\mathcal{R}^*\) shows that \(P(\mathcal{R}^*)\) is strictly increasing in \(\alpha\); hence the total suboptimal probability \(P(\mathcal{R}^-) = 1 - P(\mathcal{R}^*)\) is strictly decreasing.

For an individual suboptimal alternative, the derivative has the sign of \(V(r) - \mathbb{E}[V]\). If \(r\) attains the minimum value, then \(\mathbb{E}[V] > V(r)\) for every finite \(\alpha\), since alternatives with strictly larger values receive positive probability; \(P(r)\) is therefore strictly decreasing throughout. An intermediate alternative, by contrast, may satisfy \(V(r) > \mathbb{E}[V]\) at small \(\alpha\) (with values \((0, 0.6, 1)\), for instance, the mean under uniform choice at \(\alpha = 0\) is approximately \(0.533 < 0.6\)), so its probability can rise before it falls. Eventually it must fall: \(\mathbb{E}[V]\) is nondecreasing in \(\alpha\) (its derivative is the variance of \(V\) under the choice distribution) and converges to \(V^*\) as \(\alpha \to \infty\) (a consequence of Property 2 below), so \(V(r) < \mathbb{E}[V]\), and hence \(\frac{\partial P(r)}{\partial \alpha} < 0\), for all sufficiently large \(\alpha\). \(\blacksquare\)
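The derivative identity \(\partial P(r)/\partial \alpha = P(r) \cdot [V(r) - \mathbb{E}[V]]\) that drives the proof can be checked against a central finite difference. A minimal sketch with illustrative values (the softmax helper is defined locally):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax helper
    z = np.exp(x - np.max(x))
    return z / z.sum()

V = np.array([0.3, 0.5, 0.9])   # illustrative values
alpha, h = 1.5, 1e-6

P = softmax(alpha * V)
analytic = P * (V - P @ V)       # P(r) * [V(r) - E[V]]
numeric = (softmax((alpha + h) * V) - softmax((alpha - h) * V)) / (2 * h)

assert np.allclose(analytic, numeric, atol=1e-6)
assert analytic[np.argmax(V)] > 0   # the maximizer's probability is increasing
```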

0.3.2 Property 2: Perfect Optimization Limit

Theorem 2 (Convergence to Value Maximization)

For any value function \(V: \mathcal{R} \to \mathbb{R}\), as \(\alpha \to \infty\):

\[ \lim_{\alpha \to \infty} P(\text{choose } r \mid \alpha, V) = \begin{cases} 1/|\mathcal{R}^*| & \text{if } r \in \mathcal{R}^* \\ 0 & \text{if } r \notin \mathcal{R}^* \end{cases} \]

Remark. The theorem statement covers ties (multiple value-maximizing alternatives), but ties have measure zero under the continuous prior distributions used in subsequent reports. The single-maximizer case—where one alternative uniquely attains \(V^*\)—is therefore generic, and we work in this case throughout the rest of the report series.

Case 1: \(r \in \mathcal{R}^*\) (value-maximizing)

\[ P(r) = \frac{\exp(\alpha \cdot V^*)}{|\mathcal{R}^*| \cdot \exp(\alpha \cdot V^*) + \sum_{j \in \mathcal{R}^-} \exp(\alpha \cdot V(j))} \]

Dividing numerator and denominator by \(\exp(\alpha \cdot V^*)\): \[ P(r) = \frac{1}{|\mathcal{R}^*| + \sum_{j \in \mathcal{R}^-} \exp(\alpha \cdot [V(j) - V^*])} \]

For \(j \in \mathcal{R}^-\), we have \(V(j) < V^*\), so \(V(j) - V^* < 0\).

As \(\alpha \to \infty\): \[ \exp(\alpha \cdot [V(j) - V^*]) \to 0 \quad \text{for all } j \in \mathcal{R}^- \]

Thus: \[ \lim_{\alpha \to \infty} P(r) = \frac{1}{|\mathcal{R}^*|} \quad \blacksquare \]

Case 2: \(r \notin \mathcal{R}^*\) (suboptimal)

\[ P(r) = \frac{\exp(\alpha \cdot V(r))}{\sum_{s \in \mathcal{R}^*} \exp(\alpha \cdot V^*) + \sum_{j \in \mathcal{R}^-} \exp(\alpha \cdot V(j))} \]

Dividing by \(\exp(\alpha \cdot V^*)\): \[ P(r) = \frac{\exp(\alpha \cdot [V(r) - V^*])}{|\mathcal{R}^*| + \sum_{j \in \mathcal{R}^-} \exp(\alpha \cdot [V(j) - V^*])} \]

Since \(V(r) - V^* < 0\):

  • Numerator \(\to 0\)
  • Denominator \(\geq |\mathcal{R}^*| > 0\)

Therefore: \[ \lim_{\alpha \to \infty} P(r) = 0 \quad \blacksquare \]
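Both cases of the limit, including ties, can be illustrated numerically. In this sketch (made-up values), two alternatives tie for the maximum, so the limiting probabilities are \(1/|\mathcal{R}^*| = 1/2\) each:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax helper
    z = np.exp(x - np.max(x))
    return z / z.sum()

V = np.array([1.0, 1.0, 0.4])    # two alternatives tie for the maximum
P = softmax(200.0 * V)           # a large alpha approximates the limit

# Limit: 1/|R*| = 1/2 on each maximizer, 0 on the suboptimal alternative
assert np.allclose(P, [0.5, 0.5, 0.0], atol=1e-12)
```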

0.3.3 Property 3: Uniform Choice Limit

Note that the model is stochastic at every value of \(\alpha\) (every alternative receives positive probability under softmax for any finite \(\alpha\)). The limit at \(\alpha \to 0\) is more specific: choices become uniformly distributed over the available alternatives, independent of \(V\).

Theorem 3 (Convergence to Uniform Choice)

For any value function \(V: \mathcal{R} \to \mathbb{R}\), as \(\alpha \to 0\):

\[ \lim_{\alpha \to 0} P(\text{choose } r \mid \alpha, V) = \frac{1}{|\mathcal{R}|} \quad \text{for all } r \in \mathcal{R} \]

Using Taylor expansion \(\exp(x) = 1 + x + O(x^2)\):

\[ P(r) = \frac{1 + \alpha \cdot V(r) + O(\alpha^2)}{\sum_{j \in \mathcal{R}} (1 + \alpha \cdot V(j) + O(\alpha^2))} \]

\[ = \frac{1 + \alpha \cdot V(r) + O(\alpha^2)}{|\mathcal{R}| + \alpha \cdot \sum_j V(j) + O(\alpha^2)} \]

As \(\alpha \to 0\): \[ \lim_{\alpha \to 0} P(r) = \frac{1}{|\mathcal{R}|} \quad \blacksquare \]

Alternative proof via logarithms:

\[ \log P(r) = \alpha \cdot V(r) - \log\left[\sum_{j \in \mathcal{R}} \exp(\alpha \cdot V(j))\right] \]

Expanding the log-sum-exp: \[ \log\left[\sum_{j \in \mathcal{R}} \exp(\alpha \cdot V(j))\right] = \log|\mathcal{R}| + \frac{\alpha \cdot \sum_j V(j)}{|\mathcal{R}|} + O(\alpha^2) \]

Therefore: \[ \log P(r) = -\log|\mathcal{R}| + \alpha \cdot \left[V(r) - \frac{\sum_j V(j)}{|\mathcal{R}|}\right] + O(\alpha^2) \]

As \(\alpha \to 0\): \(\log P(r) \to -\log|\mathcal{R}|\), hence \(P(r) \to 1/|\mathcal{R}|\). \(\blacksquare\)

Remark: The alternative derivation uses the fact that \(\log \sum_j \exp(\alpha \cdot V(j)) = \log|\mathcal{R}| + K(\alpha)\), where \(K\) is the cumulant generating function of \(V\) under the uniform distribution over alternatives; its expansion near zero is well characterized, with the first two cumulants equal to the mean and variance of \(V\).

0.3.4 Summary: The Three Properties

import numpy as np
import matplotlib.pyplot as plt
from scipy.special import softmax

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

# Values for a 3-alternative problem
values = np.array([0.3, 0.5, 0.9])  # Third is optimal

# Property 1: Monotonicity
alphas = np.linspace(0.01, 8, 100)
probs = np.array([softmax(a * values) for a in alphas])

axes[0].plot(alphas, probs[:, 2], 'b-', linewidth=2.5, label='Optimal (η=0.9)')
axes[0].plot(alphas, probs[:, 1], 'orange', linewidth=2, label='Middle (η=0.5)')
axes[0].plot(alphas, probs[:, 0], 'r-', linewidth=2, label='Low (η=0.3)')
axes[0].axhline(y=1/3, color='gray', linestyle='--', alpha=0.5)
axes[0].set_xlabel('α', fontsize=12)
axes[0].set_ylabel('χ (choice probability)', fontsize=12)
axes[0].set_title('Property 1: Monotonicity', fontsize=12, fontweight='bold')
axes[0].legend(fontsize=9)
axes[0].set_ylim(0, 1)
axes[0].grid(True, alpha=0.3)

# Property 2: α → ∞
alpha_large = 20
probs_large = softmax(alpha_large * values)
axes[1].bar(['η=0.3', 'η=0.5', 'η=0.9'], probs_large, color=['red', 'orange', 'blue'])
axes[1].axhline(y=1, color='blue', linestyle='--', alpha=0.5, label='Limit')
axes[1].set_ylabel('χ (choice probability)', fontsize=12)
axes[1].set_title('Property 2: α → ∞\n(Deterministic Optimal)', fontsize=12, fontweight='bold')
axes[1].set_ylim(0, 1.1)
axes[1].grid(True, alpha=0.3, axis='y')

# Property 3: α → 0
alpha_small = 0.01
probs_small = softmax(alpha_small * values)
axes[2].bar(['η=0.3', 'η=0.5', 'η=0.9'], probs_small, color=['red', 'orange', 'blue'])
axes[2].axhline(y=1/3, color='gray', linestyle='--', alpha=0.5, label='Uniform')
axes[2].set_ylabel('χ (choice probability)', fontsize=12)
axes[2].set_title('Property 3: α → 0\n(Uniform Random)', fontsize=12, fontweight='bold')
axes[2].set_ylim(0, 0.5)
axes[2].grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.show()
Figure 2: Visual summary of the three fundamental properties. Left: Monotonicity—optimal alternative probability increases with α. Middle: Limiting behavior at α→∞ (deterministic optimal choice). Right: Limiting behavior at α→0 (uniform random choice).

0.4 Application to Subjective Expected Utility

We now specialize the general softmax framework to the case where values are subjective expected utilities (SEU).

0.4.1 SEU as a Value Function

Notation Summary (SEU Specialization)

  • \(K\): number of possible consequences (outcomes)
  • \(\boldsymbol{\upsilon} \in \mathbb{R}^K\): utility vector over consequences
  • \(\boldsymbol{\psi}_r \in \Delta^{K-1}\): subjective probability distribution over consequences for alternative \(r\)
  • \(\eta_r = \boldsymbol{\psi}_r^\top \boldsymbol{\upsilon}\): expected utility of alternative \(r\)
  • \(\boldsymbol{\chi}_m\): choice probability vector for problem \(m\)

Definition: Subjective Expected Utility

Each alternative \(r\) is associated with:

  1. Subjective probabilities \(\boldsymbol{\psi}_r \in \Delta^{K-1}\) over \(K\) consequences
  2. An expected utility computed as: \[ \eta_r = \boldsymbol{\psi}_r^\top \boldsymbol{\upsilon} = \sum_{k=1}^{K} \psi_{r,k} \cdot \upsilon_k \]

The choice probability for alternative \(r\) in problem \(m\) is then: \[ \chi_{m,r} = \frac{\exp(\alpha \cdot \eta_r)}{\sum_{j: I_{m,j}=1} \exp(\alpha \cdot \eta_j)} \]

where \(I_{m,r} \in \{0, 1\}\) is an indicator variable: \(I_{m,r} = 1\) if alternative \(r\) is available in problem \(m\), and \(I_{m,r} = 0\) otherwise. This notation allows different choice problems to present different subsets of alternatives to the decision maker.
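A sketch of this computation with illustrative numbers; the availability mask \(I_m\) zeroes out unavailable alternatives before normalization, which is equivalent to restricting the denominator sum to \(\{j : I_{m,j} = 1\}\):

```python
import numpy as np

upsilon = np.array([0.0, 0.4, 1.0])      # utilities over K = 3 consequences
psi = np.array([[0.7, 0.2, 0.1],         # psi_r: beliefs, one row per alternative
                [0.1, 0.6, 0.3],
                [0.3, 0.3, 0.4],
                [0.2, 0.1, 0.7]])
eta = psi @ upsilon                       # expected utilities eta_r

alpha = 2.0
I_m = np.array([1, 1, 0, 1])              # third alternative unavailable in problem m

weights = np.where(I_m == 1, np.exp(alpha * eta), 0.0)
chi_m = weights / weights.sum()           # choice probabilities for problem m

assert chi_m[2] == 0.0                    # unavailable alternative gets zero mass
assert np.isclose(chi_m.sum(), 1.0)
```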

Key observation: The expected utility \(\eta_r\) serves as our value function \(V(r) = \eta_r\). Therefore, all three properties proved above apply immediately.

0.4.2 Corollaries for SEU

By substituting \(V(r) = \eta_r = \boldsymbol{\psi}_r^\top \boldsymbol{\upsilon}\) into Properties 1-3:

Corollary 1 (Monotonicity for SEU)

Holding utilities \(\boldsymbol{\upsilon}\) and beliefs \(\boldsymbol{\psi}\) fixed, higher sensitivity \(\alpha\) increases the probability of choosing alternatives that maximize expected utility \(\eta\).

Corollary 2 (Perfect Rationality)

As \(\alpha \to \infty\), the decision maker chooses uniformly among SEU-maximizing alternatives (those with highest \(\eta\)) with probability 1. When there is a unique maximizer, choice becomes deterministic.

Remark on Ties

The corollaries above handle the case of ties (multiple alternatives with equal maximal expected utility \(\eta\)). However, ties have measure zero under the continuous prior distributions used in subsequent reports. When utilities and beliefs are drawn from continuous distributions, the probability that two distinct alternatives yield exactly equal expected utility is zero. Thus, the single-maximizer case—where one alternative uniquely maximizes \(\eta\)—is generic.

Corollary 3 (Random Choice)

As \(\alpha \to 0\), the decision maker chooses uniformly at random over available alternatives, independent of \(\eta\) values.

0.4.3 What SEU Adds to the Framework

While the mathematical properties of softmax choice hold for any value function, the SEU construction provides:

  1. Decomposition: Expected utilities \(\eta\) decompose into beliefs (\(\boldsymbol{\psi}\)) and utilities (\(\boldsymbol{\upsilon}\)), allowing separate analysis of epistemic and preference components

  2. Normative content: SEU maximization is a rationality criterion—Properties 1-3 characterize adherence to this normative standard (Savage 1954)

  3. Empirical predictions: The model predicts that choices will track \(\eta\), providing testable restrictions

The decomposition in (1) is conceptually attractive but raises a substantive identification question: beliefs \(\boldsymbol{\psi}\) and utilities \(\boldsymbol{\upsilon}\) enter the expected utility \(\eta_r = \boldsymbol{\psi}_r^\top \boldsymbol{\upsilon}\) multiplicatively, so choices over uncertain alternatives alone may not separately identify them. Report 4 documents this empirically through parameter recovery, Report 5 analyzes its structure and shows how adding risky choices (with objectively given probabilities) breaks the confound via an Anscombe–Aumann–style design, and Report 6 confirms the resulting calibration improvements via simulation-based calibration. The corollaries above should therefore be read with the caveat that empirical identification of the components of \(\eta\) requires more than the abstract framework guarantees.

0.5 Scale Invariance and Representation

0.5.1 The Representation Problem

A fundamental property of utility functions is that they are unique only up to positive affine transformations. This raises a critical question: how can we meaningfully interpret \(\alpha\) when the scale of utility is arbitrary?

This non-uniqueness has historical resonance with Luce’s (1959) derivation of the softmax functional form: Luce worked with abstract “scale values” determined only up to a common multiplicative constant, foreshadowing the same kind of indeterminacy we encounter when interpreting \(\alpha\) against an unfixed utility scale.

Theorem 4 (Scale Invariance)

Let \(\boldsymbol{\upsilon}\) be a utility vector and define a rescaled utility: \[ \tilde{\upsilon}_k = a \cdot \upsilon_k + b \quad \text{where } a > 0 \]

Then:

  1. \(\tilde{\eta}_r = a \cdot \eta_r + b\) for all alternatives \(r\)
  2. \(P(\text{choose } r \mid \alpha, \tilde{\boldsymbol{\upsilon}}) = P(\text{choose } r \mid \alpha \cdot a, \boldsymbol{\upsilon})\)

The pairs \((\alpha, \tilde{\boldsymbol{\upsilon}})\) and \((\alpha \cdot a, \boldsymbol{\upsilon})\) generate identical choice probabilities.

Part 1: \[ \tilde{\eta}_r = \sum_k \psi_{r,k} \cdot [a \cdot \upsilon_k + b] = a \cdot \sum_k \psi_{r,k} \cdot \upsilon_k + b \cdot \sum_k \psi_{r,k} = a \cdot \eta_r + b \]

since \(\sum_k \psi_{r,k} = 1\).

Part 2: \[ P(\text{choose } r \mid \alpha, \tilde{\boldsymbol{\upsilon}}) = \frac{\exp(\alpha \cdot [a \cdot \eta_r + b])}{\sum_j \exp(\alpha \cdot [a \cdot \eta_j + b])} \]

\[ = \frac{\exp(\alpha a \cdot \eta_r) \cdot \exp(\alpha b)}{\sum_j \exp(\alpha a \cdot \eta_j) \cdot \exp(\alpha b)} = \frac{\exp(\alpha a \cdot \eta_r)}{\sum_j \exp(\alpha a \cdot \eta_j)} = P(\text{choose } r \mid \alpha a, \boldsymbol{\upsilon}) \quad \blacksquare \]

Key Implication: Without fixing the utility scale, \(\alpha\) and the scale of utility are confounded—they cannot be separately interpreted from choice behavior alone. Scaling utilities by a factor \(a\) is equivalent to scaling sensitivity by \(1/a\).
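Theorem 4 can be verified directly. In this sketch (arbitrary illustrative \(a > 0\) and \(b\)), the pair \((\alpha, \tilde{\boldsymbol{\upsilon}})\) reproduces the probabilities of \((\alpha \cdot a, \boldsymbol{\upsilon})\):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax helper
    z = np.exp(x - np.max(x))
    return z / z.sum()

psi = np.array([[0.5, 0.5], [0.9, 0.1], [0.2, 0.8]])  # illustrative beliefs
upsilon = np.array([0.0, 1.0])
eta = psi @ upsilon

alpha, a, b = 1.7, 3.0, -2.0
eta_tilde = a * eta + b            # expected utilities after affine rescaling

# (alpha, rescaled utilities) matches (alpha * a, original utilities)
assert np.allclose(softmax(alpha * eta_tilde), softmax(alpha * a * eta))
```

The additive constant \(b\) drops out because it multiplies every term in the softmax by the same factor \(\exp(\alpha b)\), which cancels in the normalization.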

0.5.2 Resolution: Utility Standardization

To make \(\alpha\) interpretable as “sensitivity to expected utility differences,” we adopt a standardization convention.

Standardization Convention

We constrain utilities to lie in \([0,1]\) by assigning:

  • \(\upsilon_{\text{worst}} = 0\) (utility of the worst consequence)
  • \(\upsilon_{\text{best}} = 1\) (utility of the best consequence)

For \(K\) ordered consequences, this means: \[ 0 = \upsilon_1 \leq \upsilon_2 \leq \cdots \leq \upsilon_K = 1 \]

This is a standard normalization in decision theory, anchoring the utility function at the endpoints of the consequence space.

This standardization is without loss of generality—it simply fixes a representation from the equivalence class of utility functions related by positive affine transformations. Any utility function can be rescaled to satisfy this convention.
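Concretely, the rescaling is the positive affine map \(u \mapsto (u - \min u)/(\max u - \min u)\); a minimal sketch:

```python
import numpy as np

def standardize(upsilon):
    # Map a utility vector to the representative with worst = 0, best = 1
    u = np.asarray(upsilon, dtype=float)
    return (u - u.min()) / (u.max() - u.min())

u = np.array([-3.0, 1.0, 5.0])   # illustrative unstandardized utilities
u_std = standardize(u)

assert u_std.min() == 0.0 and u_std.max() == 1.0
# The map is affine with slope a = 1/(max - min) > 0, so it stays within
# the equivalence class of Theorem 4.
```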

Given this choice of representative from the affine equivalence class, the indeterminacy in \((\alpha, \boldsymbol{\upsilon})\) identified by Theorem 4 is resolved: \(\alpha\) is then, in principle, identified from choice behavior.

Interpretive Result: Given this standardization, \(\alpha\) measures sensitivity to expected utility differences on a standardized scale where the full range of possible utilities spans exactly one unit.

0.5.3 Interpretation of α Under Standardization

With utilities standardized to \([0,1]\), expected utilities satisfy \(\eta_r \in [0,1]\) for all alternatives \(r\) (since \(\eta_r\) is a convex combination of utilities). The maximum possible difference in expected utility is therefore 1.

The sensitivity parameter \(\alpha\) has a precise interpretation via the log-odds ratio:

\[ \log\left[\frac{\chi_{r}}{\chi_{s}}\right] = \alpha \cdot [\eta_r - \eta_s] \]
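A quick numerical check of this log-odds relation (illustrative values; the normalizing constant cancels in the ratio):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax helper
    z = np.exp(x - np.max(x))
    return z / z.sum()

eta = np.array([0.2, 0.9])    # illustrative standardized expected utilities
alpha = 3.0
chi = softmax(alpha * eta)

log_odds = np.log(chi[1] / chi[0])
assert np.isclose(log_odds, alpha * (eta[1] - eta[0]))
```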

import numpy as np
import pandas as pd
from scipy import stats as _stats

_ln = _stats.lognorm(s=1.0, scale=1.0)  # lognormal(0, 1) on alpha
_q05, _q95 = _ln.ppf(0.05), _ln.ppf(0.95)

def _prior_region(a):
    if a < _q05:
        return 'lower tail (<5%)'
    if a > _q95:
        return 'upper tail (>95%)'
    if abs(a - 1.0) < 1e-9:
        return 'prior median'
    return 'central 90%'

alpha_vals = [0.5, 1, 2, 3, 5, 10]
data = []
for a in alpha_vals:
    odds_ratio = np.exp(a)
    prob_better = odds_ratio / (1 + odds_ratio)
    data.append({
        'α': a,
        'Log-odds': f'{a:.1f}',
        'Odds ratio': f'{odds_ratio:.2f}',
        'P(higher η)': f'{prob_better:.1%}',
        'Prior region': _prior_region(a),
    })

df = pd.DataFrame(data)
df
Table 1: Interpretation of α for a one-unit expected utility difference (maximum possible difference with standardized utilities). The ‘Prior region’ column locates each α relative to the lognormal(0, 1) prior used in subsequent reports: this prior has median 1, and approximately 90% of its mass lies in (0.19, 5.18).
α Log-odds Odds ratio P(higher η) Prior region
0.5 0.5 1.65 62.2% central 90%
1.0 1.0 2.72 73.1% prior median
2.0 2.0 7.39 88.1% central 90%
3.0 3.0 20.09 95.3% central 90%
5.0 5.0 148.41 99.3% central 90%
10.0 10.0 22026.47 100.0% upper tail (>95%)

General interpretation: \(\alpha\) measures the log-odds change per unit of expected utility difference. Higher \(\alpha\) means choices become more deterministically aligned with \(\eta\) rankings.

0.6 Rates of Convergence

The limiting behavior established in Properties 2 and 3 occurs at different rates, which we now characterize precisely.

0.6.1 Convergence Rate for Property 2 (\(\alpha \to \infty\))

Theorem 5 (Exponential Convergence to Optimality)

Let \(\Delta = \min\{V^* - V(r) : r \notin \mathcal{R}^*\}\) be the minimum gap between optimal and suboptimal values. For any suboptimal alternative \(r \notin \mathcal{R}^*\):

\[ P(\text{choose } r \mid \alpha, V) = O\left(e^{-\alpha \Delta}\right) \quad \text{as } \alpha \to \infty \]

Convergence to the optimality limit is exponential with rate \(\Delta\).

Remark: When \(\mathcal{R}^* = \mathcal{R}\) (all alternatives have equal value), the set \(\mathcal{R}^- = \mathcal{R} \setminus \mathcal{R}^*\) is empty, so \(\Delta\) is undefined. In this trivial case, convergence is achieved instantly: \(P(r) = 1/|\mathcal{R}|\) for all \(\alpha\). This is consistent with both Properties 2 and 3 collapsing to the same uniform-over-\(\mathcal{R}\) limit when all values coincide.

For \(r \notin \mathcal{R}^*\), recall from the proof of Property 2: \[ P(\text{choose } r) = \frac{\exp(\alpha \cdot [V(r) - V^*])}{|\mathcal{R}^*| + \sum_{j \in \mathcal{R}^-} \exp(\alpha \cdot [V(j) - V^*])} \]

Since \(V(r) - V^* \leq -\Delta < 0\): \[ P(\text{choose } r) \leq \frac{\exp(-\alpha \Delta)}{|\mathcal{R}^*|} = \frac{1}{|\mathcal{R}^*|} e^{-\alpha \Delta} \]

The denominator is bounded below by \(|\mathcal{R}^*| \geq 1\), giving: \[ P(\text{choose } r) = O(e^{-\alpha \Delta}) \quad \blacksquare \]

Interpretation: Larger value gaps \(\Delta\) lead to faster concentration on optimal alternatives. When the best alternative is clearly superior (large \(\Delta\)), even moderate \(\alpha\) yields near-deterministic choice.
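The exponential rate can be read off the slope of \(\log P(r)\) against \(\alpha\). In this sketch (illustrative values with \(\Delta = 0.4\)), the slope approaches \(-\Delta\):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax helper
    z = np.exp(x - np.max(x))
    return z / z.sum()

V = np.array([0.2, 0.6, 1.0])    # unique maximizer; gap Delta = 1.0 - 0.6 = 0.4
delta = 0.4

alphas = np.array([10.0, 20.0, 30.0])
logp = np.array([np.log(softmax(a * V)[1]) for a in alphas])

# For large alpha, log P(r) falls with slope approximately -Delta
slopes = np.diff(logp) / np.diff(alphas)
assert np.allclose(slopes, -delta, atol=1e-2)
```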

0.6.2 Convergence Rate for Property 3 (\(\alpha \to 0\))

Theorem 6 (Linear Convergence to Uniformity)

For any alternative \(r \in \mathcal{R}\), let \(\bar{V} = \frac{1}{|\mathcal{R}|}\sum_j V(j)\) denote the arithmetic mean of values. Then:

\[ P(\text{choose } r \mid \alpha, V) = \frac{1}{|\mathcal{R}|} + \alpha \cdot \left[V(r) - \bar{V}\right] \cdot \frac{1}{|\mathcal{R}|} + O(\alpha^2) \]

Convergence to uniformity is first-order (linear) in \(\alpha\).

Expanding \(\exp(\alpha V(r)) = 1 + \alpha V(r) + \frac{\alpha^2 V(r)^2}{2} + O(\alpha^3)\):

\[ P(\text{choose } r) = \frac{1 + \alpha V(r) + O(\alpha^2)}{\sum_j [1 + \alpha V(j) + O(\alpha^2)]} = \frac{1 + \alpha V(r) + O(\alpha^2)}{|\mathcal{R}| + \alpha \sum_j V(j) + O(\alpha^2)} \]

Let \(S = \sum_j V(j) = |\mathcal{R}| \cdot \bar{V}\). Using the expansion \((1+x)^{-1} = 1 - x + O(x^2)\):

\[ P(\text{choose } r) = \frac{1 + \alpha V(r)}{|\mathcal{R}|} \cdot \left(1 + \frac{\alpha S}{|\mathcal{R}|}\right)^{-1} + O(\alpha^2) \]

\[ = \frac{1 + \alpha V(r)}{|\mathcal{R}|} \cdot \left(1 - \frac{\alpha S}{|\mathcal{R}|}\right) + O(\alpha^2) \]

\[ = \frac{1}{|\mathcal{R}|} + \frac{\alpha V(r)}{|\mathcal{R}|} - \frac{\alpha S}{|\mathcal{R}|^2} + O(\alpha^2) \]

\[ = \frac{1}{|\mathcal{R}|} + \frac{\alpha}{|\mathcal{R}|} \left[V(r) - \frac{S}{|\mathcal{R}|}\right] + O(\alpha^2) \]

\[ = \frac{1}{|\mathcal{R}|} + \frac{\alpha}{|\mathcal{R}|} \left[V(r) - \bar{V}\right] + O(\alpha^2) \quad \blacksquare \]

Interpretation: Near \(\alpha = 0\), deviations from uniform choice are proportional to \(\alpha\) and to how much an alternative’s value exceeds the mean. The coefficient \([V(r) - \bar{V}]/|\mathcal{R}|\) determines the direction and magnitude of the first-order effect.
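The first-order approximation can be compared with the exact probabilities at a small \(\alpha\) (illustrative values):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax helper
    z = np.exp(x - np.max(x))
    return z / z.sum()

V = np.array([0.3, 0.5, 0.9])    # illustrative values
R = len(V)
alpha = 0.01

exact = softmax(alpha * V)
first_order = 1.0 / R + alpha * (V - V.mean()) / R

assert np.allclose(exact, first_order, atol=1e-4)   # agreement to O(alpha^2)
```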

0.7 Discussion

Having established the mathematical properties of the model, we turn to its interpretation.

0.7.1 The Intended Interpretation

The intended interpretation of our model is of a decision maker who is committed to SEU maximization but has limited sensitivity to the implications of that commitment. The term “commitment” is doing substantial work here, and we elaborate it in the Conceptual Lens subsection below. The sensitivity parameter \(\alpha\) measures how reliably the agent’s choices track the SEU ranking:

  • High \(\alpha\): Choices reliably favor higher-\(\eta\) alternatives
  • Low \(\alpha\): Choices are noisy relative to the SEU ranking
  • \(\alpha \to \infty\): Near-deterministic choice of SEU-maximizing alternatives
  • \(\alpha \to 0\): Choices become independent of SEU values

0.7.2 What we do not intend the model to describe

Not a cognitive process model. We make no claim about how decision makers actually deliberate. The model is silent on whether agents compute probabilities, form expectations, or engage in any particular mental procedure. It specifies a distribution over choices, not a mechanism that generates them. Our concern is to investigate the extent to which a decision maker’s behavior can be captured by viewing that behavior as-if it came from a decision maker who is committed to subjective expected utility maximization but has limited sensitivity to the implications of that commitment.

Not bounded rationality. Bounded rationality programs, in their various formulations (Simon 1955; Gigerenzer and Goldstein 1996), typically propose alternative decision procedures—heuristics, satisficing rules, or fast-and-frugal strategies—that agents use in place of optimization. Our model posits no such alternative procedures. Nor do we invoke notions of ecological rationality or fit between heuristics and environmental structure. The model simply describes a stochastic relationship between SEU values and choice probabilities.

Not a model of fully rational decision making. SEU maximization—the classical normative standard—corresponds only to the limit \(\alpha \to \infty\). For any finite \(\alpha\), the model permits systematic departures from SEU maximization: lower-\(\eta\) alternatives are chosen with positive probability. The model is therefore not a model of rational agents in the classical sense; it is a model of agents whose choices are sensitive, to a degree parameterized by \(\alpha\), to the SEU ranking.

0.7.3 A Conceptual Lens: Commitment and Performance

A useful conceptual lens comes from Isaac Levi’s distinction between commitment and performance (Levi 1980, chap. 1). We may be committed to standards we fail to perform up to: most of us are committed to the laws of arithmetic despite occasionally making calculation errors; the errors are failures of performance, not rejections of the standard. If we take SEU theory as specifying the decision maker’s normative commitments, then \(\alpha\) measures their tendency to perform in accordance with those commitments. This framing preserves SEU as normatively fundamental while allowing systematic departures in observed behavior.

What does “commitment” itself mean? Levi, to our knowledge, does not provide an explicit further analysis of commitment and largely treats the commitment/performance distinction as primitive, using it to organize his account of doxastic and practical norms. The remainder of this subsection develops two glosses—both ours, reflecting more operationalist leanings than Levi’s own writings express—that we find useful for our purposes.

A first, instrumental gloss: commitment, like preference, is an internal state that is not directly observable but must be inferred from observable behavior via some theory connecting the two. One natural class of evidence is the agent’s deliberate adoption of tools, practices, or training that aid satisfaction of the standard. My commitment to having my arithmetic calculations conform to the laws of arithmetic is evidenced, in part, by the calculator on my desk: I purchased and use it because it helps me satisfy a standard I take seriously. Under this gloss, an agent’s commitment to a standard is inferred from their valuing aids to performance with respect to that standard.

The instrumental gloss has limits—particularly when applied to AI decision makers. In the application reports of this series we do not collect data on anything analogous to the calculator purchase: we have no record of the AI system valuing or selecting tools that would improve its capacity to satisfy a standard of rational choice. We therefore consider a second, capacity-influenced gloss that is better matched to the data we do have: an agent is committed to a standard if the influence of that standard on their decision making increases as the agent becomes more sensitive to the implications of the standard. Our model satisfies this notion by construction. The sensitivity parameter \(\alpha\) parameterizes that influence directly: as \(\alpha\) increases, the SEU ranking exerts greater influence on the agent's choice distribution (Property 1), and in the limit \(\alpha \to \infty\) choices align with SEU maximization (Property 2).

The two glosses are related: deliberately acquiring tools to enhance one's capacity (the instrumental gloss) is itself evidence that one's decision making is influenced by one's capacity to satisfy the standard—and, plausibly, that this influence would grow as that capacity grows (the capacity-influenced gloss). In this sense the capacity-influenced notion arguably generalizes the instrumental one, and it is the notion most directly operationalized by our model, including in applications to AI decision makers.

Our framework differs from Levi’s in important ways:

  • Decision maker’s vs. observer’s perspective. Levi’s framework is primarily concerned with the decision maker’s own deliberative standards—what an agent should believe and prefer from the first-person standpoint. Our framework, by contrast, adopts the observer’s perspective: we model choice behavior as it appears to an external analyst who observes decisions and infers parameters.

  • Probabilistic vs. algebraic theories. Following Luce's (1959) distinction, our framework is a probabilistic theory of choice—it specifies a probability distribution over choices given the decision problem. Levi's framework is more aligned with algebraic theories that characterize rational choice through axioms on preference orderings rather than stochastic choice rules.

  • SEU as the normative standard. Levi famously rejected subjective expected utility theory as the standard of rational choice, instead advocating for generalizations that accommodate indeterminate probabilities and utilities (see Levi 1986, for his critique of precise probability). Our framework, by contrast, takes SEU maximization as the normative standard and models departures from it through the sensitivity parameter \(\alpha\).

0.8 Summary

We have established three fundamental properties of the softmax choice model:

  1. Monotonicity: Higher \(\alpha\) increases the probability of choosing alternatives with higher value \(V(r)\)
  2. Perfect optimization limit: As \(\alpha \to \infty\), choices become deterministically concentrated on value-maximizing alternatives
  3. Uniform choice limit: As \(\alpha \to 0\), choices become uniformly distributed over the available alternatives
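
The three properties can be checked directly on a toy example. The Python sketch below (the value vector is arbitrary, chosen only for illustration; the project's implementations are in Stan) verifies each property numerically:

```python
import math

def softmax(values, alpha):
    """Softmax choice probabilities, numerically stabilized."""
    v_max = max(values)
    w = [math.exp(alpha * (v - v_max)) for v in values]
    s = sum(w)
    return [x / s for x in w]

values = [0.2, 0.5, 0.8]  # hypothetical standardized values; index 2 is the maximizer

# Property 1 (monotonicity): the maximizer's probability increases with alpha.
p_low = softmax(values, 2.0)
p_high = softmax(values, 10.0)
assert p_high[2] > p_low[2]

# Property 2 (perfect optimization limit): mass concentrates on the maximizer.
p_inf = softmax(values, 500.0)
assert p_inf[2] > 0.999

# Property 3 (uniform choice limit): probabilities approach 1/3 each.
p_zero = softmax(values, 1e-6)
assert all(abs(p - 1/3) < 1e-4 for p in p_zero)
```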

Additionally, we characterized the rates at which these limits are approached:

  • Convergence to optimality is exponential with rate determined by the value gap \(\Delta\)
  • Convergence to uniformity is linear (first-order) in \(\alpha\)
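
Both rates can be seen numerically. In the two-alternative case the probability of the better alternative is a logistic function of \(\alpha\Delta\), so \(-\log(1 - P(\text{best}))\) grows linearly in \(\alpha\) with slope \(\Delta\), while near \(\alpha = 0\) the deviation from the uniform probability \(1/2\) is first-order in \(\alpha\). A Python sketch with hypothetical numbers:

```python
import math

def softmax(values, alpha):
    v_max = max(values)
    w = [math.exp(alpha * (v - v_max)) for v in values]
    s = sum(w)
    return [x / s for x in w]

# Two alternatives separated by a value gap Delta (numbers hypothetical).
delta = 0.3
values = [0.7, 0.7 - delta]

# Exponential rate: -log(1 - P(best)) is approximately alpha * delta,
# so each step of 10 in alpha adds about delta * 10 = 3.
for alpha in [10.0, 20.0, 30.0]:
    p_best = softmax(values, alpha)[0]
    print(alpha, -math.log(1.0 - p_best))

# Linear rate near alpha = 0: the deviation from uniform scales with alpha,
# so doubling alpha roughly doubles the deviation.
d1 = softmax(values, 0.01)[0] - 0.5
d2 = softmax(values, 0.02)[0] - 0.5
print(d2 / d1)  # close to 2
```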

These properties hold for any value function \(V\). When \(V\) is taken to be subjective expected utility, the framework provides a model of decision-making that interpolates between random choice and SEU maximization, with \(\alpha\) governing the degree of sensitivity to expected utility differences.

The standardization of utilities to \([0,1]\) fixes a representation from the equivalence class of utility functions, making \(\alpha\) interpretable as sensitivity to standardized expected utility differences.
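
The role of standardization can be made concrete: under softmax, an affine transformation \(aV + b\) of the values is indistinguishable from rescaling the sensitivity to \(a\alpha\) (the shift \(b\) cancels in the ratio of exponentials, and the scale \(a\) is absorbed into \(\alpha\)), so fixing a representation is what makes \(\alpha\) itself interpretable. A small Python check with hypothetical numbers:

```python
import math

def softmax(values, alpha):
    v_max = max(values)
    w = [math.exp(alpha * (v - v_max)) for v in values]
    s = sum(w)
    return [x / s for x in w]

values = [0.1, 0.6, 0.9]                  # standardized to [0, 1] (hypothetical)
rescaled = [10 * v + 3 for v in values]   # affine transform of the same utilities

# softmax(alpha, a*V + b) equals softmax(a*alpha, V): the choice
# probabilities are identical, so alpha is not identified unless a
# representation of utility is fixed.
p1 = softmax(rescaled, 2.0)
p2 = softmax(values, 20.0)
assert all(abs(a - b) < 1e-12 for a, b in zip(p1, p2))
```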

0.8.1 Looking Ahead

This report establishes the theoretical scaffolding for the series. Subsequent reports build on these foundations:

  • Report 2 (Concrete Implementation) translates this abstract framework into Stan code, specifying priors and the full generative model.
  • Report 3 (Prior Predictive Analysis) examines the implications of our prior choices before seeing data.
  • Report 4 (Parameter Recovery) tests whether the model can reliably recover known parameter values from simulated data.
  • Reports 5–7 extend the framework to include risky alternatives, validate via simulation-based calibration, and generalize the sensitivity structure.
  • Reports 8–12 develop a hierarchical extension of the framework (model h_m01)—formal specification, Stan implementation, prior analysis, parameter recovery, and SBC validation—suitable for studies that pool information across multiple decision makers.

0.9 References

Gigerenzer, Gerd, and Daniel G. Goldstein. 1996. "Reasoning the Fast and Frugal Way: Models of Bounded Rationality." Psychological Review 103 (4): 650–69.
Levi, Isaac. 1980. The Enterprise of Knowledge: An Essay on Knowledge, Credal Probability, and Chance. MIT Press.
Levi, Isaac. 1986. Hard Choices: Decision Making Under Unresolved Conflict. Cambridge University Press.
Luce, R. Duncan. 1959. Individual Choice Behavior: A Theoretical Analysis. New York: Wiley.
McFadden, Daniel. 1974. "Conditional Logit Analysis of Qualitative Choice Behavior." In Frontiers in Econometrics, edited by Paul Zarembka, 105–42. New York: Academic Press.
Neumann, John von, and Oskar Morgenstern. 1947. Theory of Games and Economic Behavior. 2nd ed. Princeton University Press.
Savage, Leonard J. 1954. The Foundations of Statistics. New York: Wiley.
Simon, Herbert A. 1955. “A Behavioral Model of Rational Choice.” The Quarterly Journal of Economics 69 (1): 99–118.


Citation

BibTeX citation:
@online{helzner2026,
  author = {Helzner, Jeff},
  title = {Abstract {Formulation} of the {SEU} {Sensitivity} {Model}},
  date = {2026-05-12},
  url = {https://jeffhelzner.github.io/seu-sensitivity/foundations/01_abstract_formulation.html},
  langid = {en}
}
For attribution, please cite this work as:
Helzner, Jeff. 2026. “Abstract Formulation of the SEU Sensitivity Model.” SEU Sensitivity Project, May 12. https://jeffhelzner.github.io/seu-sensitivity/foundations/01_abstract_formulation.html.