Generalizing Sensitivity

Separate and Proportional Sensitivity Parameters for Risky and Uncertain Choices

Categories: foundations, validation, m_1, m_2, m_3

Extension of the SEU model to allow distinct sensitivity parameters for uncertain and risky decisions. Parameter recovery and SBC validation comparing m_1, m_2, and m_3.

Author: Jeff Helzner
Published: May 12, 2026

0.1 Introduction

Report 5 introduced model m_1, which combines uncertain and risky choice problems. A key assumption of m_1 is that a single sensitivity parameter α governs choice behavior in both contexts: uncertain decisions (where probabilities are inferred from features) and risky decisions (where probabilities are known). Report 6 confirmed via simulation-based calibration that m_1 is well-identified.

But is a shared sensitivity parameter always appropriate? Decision-makers may exhibit different degrees of sensitivity when facing uncertainty versus risk. For example:

  • A decision-maker might respond sharply to expected utility differences when probabilities are known (high risky sensitivity) but be less discriminating when probabilities must be inferred (lower uncertain sensitivity)
  • Conversely, familiarity with uncertain-feature environments might produce sharper responses there than in unfamiliar lottery settings

This report introduces two generalizations of m_1 that relax the shared-α assumption:

Model  Sensitivity Structure                    Parameters   Relationship
m_1    Shared: α for both contexts              α, β, δ      Baseline
m_2    Independent: α (uncertain), ω (risky)    α, ω, β, δ   Fully flexible
m_3    Proportional: ω = κ · α                  α, κ, β, δ   Structured

We subject m_2 and m_3 to the same parameter recovery and SBC validation analyses applied to m_1 in Reports 5 and 6, extending the diagnostics to cover the new sensitivity parameter ω (estimated directly in m_2, derived in m_3).

Note: Complete Model Family

For reference, the following table summarizes the full model family developed across the foundational series:

Model  Context            Sensitivity           Free Parameters  Key Assumption                        Introduced
m_0    Uncertain only     α                     α, β, δ          Decisions under uncertainty only      Report 2
m_1    Uncertain + Risky  α (shared)            α, β, δ          Same sensitivity across contexts      Report 5
m_2    Uncertain + Risky  α, ω (independent)    α, ω, β, δ       No constraint between sensitivities   This report
m_3    Uncertain + Risky  α, κα (proportional)  α, κ, β, δ       Proportional sensitivities            This report

All models share the same utility structure (δ → υ) and the same feature-based probability inference (β → ψ) for uncertain choices. The models differ only in how sensitivity is parameterized for risky choices.
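To make the shared structure concrete, the following minimal sketch illustrates the δ → υ mapping, under the assumption (our reading of the Report 2 parameterization) that utilities are cumulative sums of the Dirichlet increments with endpoints anchored at 0 and 1; the δ values are illustrative only:

import numpy as np

# Sketch of the shared utility structure (delta -> upsilon), assuming the
# Report 2 construction: the K-1 Dirichlet increments accumulate into
# consequence utilities anchored at upsilon_1 = 0 and upsilon_K = 1.
delta = np.array([0.3, 0.7])                         # simplex over K-1 = 2 increments
upsilon = np.concatenate([[0.0], np.cumsum(delta)])  # utilities for K = 3 consequences
print(upsilon)  # [0.  0.3 1. ]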

0.2 Model Specifications

0.2.1 Model m_2: Independent Sensitivities

Model m_2 replaces the single α with two independent sensitivity parameters:

  • α governs uncertain choices (feature-based probability inference)
  • ω governs risky choices (known-probability lotteries)

The choice probabilities become:

\[ \chi^{(u)}_{mi} = \frac{\exp(\alpha \cdot \eta^{(u)}_{mi})}{\sum_{j} \exp(\alpha \cdot \eta^{(u)}_{mj})} \quad\text{and}\quad \chi^{(r)}_{ni} = \frac{\exp(\omega \cdot \eta^{(r)}_{ni})}{\sum_{j} \exp(\omega \cdot \eta^{(r)}_{nj})} \]

The priors are:

\[ \alpha \sim \text{LogNormal}(0, 1), \quad \omega \sim \text{LogNormal}(0, 1), \quad \beta_{kd} \sim \mathcal{N}(0, 1), \quad \delta \sim \text{Dirichlet}(\mathbf{1}) \]

Since α and ω are independent, m_2 can represent any combination of uncertain and risky sensitivities. The cost is one additional parameter.
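A minimal numerical sketch of the two choice rules (all values illustrative; the variable names are ours rather than the Stan model's):

import numpy as np

def softmax(x):
    """Numerically stable softmax over expected utilities."""
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()

# Illustrative expected utilities (eta) for one 3-alternative problem of each type
eta_uncertain = np.array([0.2, 0.5, 0.1])  # from feature-based probability inference
eta_risky = np.array([0.3, 0.4, 0.6])      # from known lottery probabilities

alpha, omega = 2.0, 4.0  # m_2: independent uncertain and risky sensitivities

chi_u = softmax(alpha * eta_uncertain)  # uncertain choice probabilities
chi_r = softmax(omega * eta_risky)      # risky choice probabilities
print(chi_u.round(3), chi_r.round(3))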

0.2.2 Model m_3: Proportional Sensitivities

Model m_3 introduces a proportional relationship between the two sensitivities via an association parameter κ:

\[ \omega = \kappa \cdot \alpha \]

This nests m_1 as the special case κ = 1. The free parameters are α and κ, with ω derived:

\[ \alpha \sim \text{LogNormal}(0, 1), \quad \kappa \sim \text{LogNormal}(0, 0.5), \quad \beta_{kd} \sim \mathcal{N}(0, 1), \quad \delta \sim \text{Dirichlet}(\mathbf{1}) \]

The tighter prior on κ (σ = 0.5 vs. 1 for α) centers the model near the m_1 restriction while allowing departures. This is an intermediate model: more flexible than m_1 but more structured than m_2.

Note: Comparing the ω Priors Across Models

The prior on ω differs between m_2 and m_3. In m_2, ω receives a direct \(\text{LogNormal}(0, 1)\) prior. In m_3, ω = κα is a product of two independent lognormal random variables, so \(\log(\omega) = \log(\kappa) + \log(\alpha) \sim \mathcal{N}(0, \sigma^2)\) with \(\sigma = \sqrt{0.5^2 + 1^2} \approx 1.12\). The implied marginal prior on ω in m_3 is therefore \(\text{LogNormal}(0, 1.12)\), slightly wider than the direct prior in m_2. This means m_3 is a priori more uncertain about ω than m_2, which is appropriate given that m_3 couples ω to α through the proportional structure. In practice, this small difference in prior width has minimal impact on posteriors when the data are informative.
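A quick Monte Carlo check of this derivation (a verification sketch, not part of the reported analysis):

import numpy as np

rng = np.random.default_rng(0)
alpha = rng.lognormal(mean=0.0, sigma=1.0, size=200_000)
kappa = rng.lognormal(mean=0.0, sigma=0.5, size=200_000)
omega = kappa * alpha

# log(omega) should be approximately Normal(0, sqrt(0.5^2 + 1^2)) = Normal(0, ~1.118)
print(f"mean of log(omega): {np.log(omega).mean():.3f}")  # close to 0
print(f"std of log(omega):  {np.log(omega).std():.3f}")   # close to 1.118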

0.2.3 Substantive Interpretation of κ

The association parameter κ in m_3 has a direct behavioral interpretation: it captures the ratio of risky to uncertain sensitivity, \(\kappa = \omega / \alpha\). This ratio has clear meaning:

  • κ > 1: The decision-maker is more sensitive to expected utility differences when probabilities are known (risky choices) than when they must be inferred (uncertain choices). This could reflect the additional cognitive load of probability inference dampening choice precision.
  • κ < 1: The decision-maker is more sensitive in uncertain contexts than risky ones. This might arise from greater familiarity or engagement with feature-based environments.
  • κ ≈ 1: Sensitivity is approximately equal across contexts, corresponding to the m_1 restriction; in this case the data provide no support for context-dependent sensitivity.

The \(\text{LogNormal}(0, 0.5)\) prior on κ implies that κ ∈ [0.38, 2.66] with approximately 95% probability, so the prior allows risky sensitivity to be up to about 2.7× larger or smaller than uncertain sensitivity while centering on equal sensitivity (κ = 1).

Note: Model Nesting

The three models form a hierarchy:

  • m_1 ⊂ m_3 ⊂ m_2 (in terms of expressiveness)
  • m_1 is m_3 with κ = 1
  • m_3 constrains the m_2 relationship ω = f(α) to be linear through the origin
  • m_2 places no structural constraint between α and ω

The proportional structure of m_3 encodes the prior belief that a decision-maker who is highly sensitive in one context is likely highly sensitive in the other—but allows the scale of sensitivity to differ across contexts.
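The nesting at κ = 1 can be checked numerically (an illustrative sketch with made-up expected utilities):

import numpy as np

def softmax(x):
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()

eta_risky = np.array([0.3, 0.4, 0.6])  # illustrative risky expected utilities
alpha, kappa = 2.0, 1.0                # m_3 at the m_1 restriction

# With kappa = 1, m_3's risky choice rule collapses to m_1's shared-alpha rule
assert np.allclose(softmax(kappa * alpha * eta_risky), softmax(alpha * eta_risky))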

0.2.4 Why Separate Sensitivities?

The sensitivity parameter captures how deterministically a decision-maker selects the highest-expected-utility option. Several factors could produce context-dependent sensitivity:

  1. Cognitive load: Uncertain decisions require probability inference (via β), adding cognitive complexity that may dampen sensitivity relative to risky decisions where probabilities are given
  2. Familiarity: Decision-makers may be more experienced with one context, producing sharper discrimination
  3. Ambiguity aversion: Sensitivity to expected utility differences may itself be modulated by whether probabilities are known or uncertain

From a model comparison perspective, m_2 and m_3 provide a test of the m_1 restriction: if m_2 and m_3 recover parameters well from data generated under m_1 (shared α) without introducing pathologies, then these generalizations are safe extensions. Conversely, if data generated under m_2 or m_3 reveal the shared-α restriction to be too strong, that motivates using the more flexible models.

0.2.5 Stan Implementation

Readers who have followed the development of m_0.stan (Report 2) and m_1.stan (Report 5) will find the extensions straightforward. We show the key blocks that differ from m_1.

m_2: Independent sensitivities. The parameters and model blocks replace the single alpha with two independent sensitivity parameters:

// m_2.stan — parameters block
parameters {
  real<lower=0> alpha;           // sensitivity for uncertain choices
  real<lower=0> omega;           // sensitivity for risky choices
  matrix[K,D] beta;
  simplex[K-1] delta;
}

// m_2.stan — model block (priors only; likelihood is identical to m_1)
model {
  alpha ~ lognormal(0, 1);
  omega ~ lognormal(0, 1);    // independent prior, same family as alpha
  to_vector(beta) ~ std_normal();
  delta ~ dirichlet(rep_vector(1, K-1));
  // ... likelihood unchanged ...
}

The only structural change in transformed parameters is that uncertain choice probabilities use softmax(alpha * ...) while risky choice probabilities use softmax(omega * ...).

m_3: Proportional sensitivities. The parameters block replaces ω with κ, and ω is derived in transformed parameters:

// m_3.stan — parameters block
parameters {
  real<lower=0> alpha;           // sensitivity for uncertain choices
  real<lower=0> kappa;           // association parameter (omega = kappa * alpha)
  matrix[K,D] beta;
  simplex[K-1] delta;
}

// m_3.stan — transformed parameters (key addition)
transformed parameters {
  real<lower=0> omega = kappa * alpha;  // derived risky sensitivity
  // ... remainder identical to m_2 ...
}

// m_3.stan — model block (priors)
model {
  alpha ~ lognormal(0, 1);
  kappa ~ lognormal(0, 0.5);   // tighter prior, centered at kappa=1 (m_1)
  to_vector(beta) ~ std_normal();
  delta ~ dirichlet(rep_vector(1, K-1));
  // ... likelihood unchanged ...
}

The full Stan files (m_2.stan, m_3.stan) are available in the models/ directory alongside their simulation (m_2_sim.stan, m_3_sim.stan) and SBC (m_2_sbc.stan, m_3_sbc.stan) counterparts.

0.2.6 Prior Analysis for κ and Implied ω

Report 3 established the prior predictive methodology for the original parameters. Here we apply the same logic to the new parameters introduced in m_3. Since κ directly scales the relationship between α and ω, its prior has substantive implications that warrant explicit examination.

Show code
fig, axes = plt.subplots(1, 3, figsize=(16, 5))

# Panel 1: Prior on kappa
ax = axes[0]
kappa_grid = np.linspace(0.01, 5, 500)
kappa_pdf = stats.lognorm(s=0.5, scale=np.exp(0)).pdf(kappa_grid)
ax.plot(kappa_grid, kappa_pdf, 'darkorange', linewidth=2)
ax.fill_between(kappa_grid, kappa_pdf, alpha=0.2, color='darkorange')
ax.axvline(x=1.0, color='red', linestyle='--', linewidth=1.5, label='κ = 1 (m_1)')

# 95% prior interval
kappa_lower = stats.lognorm(s=0.5, scale=1.0).ppf(0.025)
kappa_upper = stats.lognorm(s=0.5, scale=1.0).ppf(0.975)
ax.axvspan(kappa_lower, kappa_upper, alpha=0.1, color='darkorange', 
           label=f'95% PI: [{kappa_lower:.2f}, {kappa_upper:.2f}]')
ax.set_xlabel('κ', fontsize=12)
ax.set_ylabel('Density', fontsize=12)
ax.set_title('Prior on κ: LogNormal(0, 0.5)', fontsize=12)
ax.legend(fontsize=9)
ax.grid(True, alpha=0.3)

# Panel 2: Joint prior (alpha, omega) under m_3
ax = axes[1]
n_prior_draws = 5000
alpha_draws = np.random.lognormal(0, 1, n_prior_draws)
kappa_draws = np.random.lognormal(0, 0.5, n_prior_draws)
omega_draws_m3 = kappa_draws * alpha_draws

# Clip for visualization
mask = (alpha_draws < 15) & (omega_draws_m3 < 15)
ax.scatter(alpha_draws[mask], omega_draws_m3[mask], alpha=0.15, s=10, c='darkorange')
ax.plot([0, 15], [0, 15], 'r--', linewidth=1.5, label='ω = α (κ = 1)')
ax.set_xlabel('α (uncertain sensitivity)', fontsize=11)
ax.set_ylabel('ω (risky sensitivity)', fontsize=11)
ax.set_title('Joint Prior on (α, ω) under m_3', fontsize=12)
ax.legend(fontsize=9)
ax.grid(True, alpha=0.3)
ax.set_xlim(0, 15)
ax.set_ylim(0, 15)
ax.set_aspect('equal')

# Panel 3: Marginal prior on omega — m_2 vs m_3
ax = axes[2]
omega_grid = np.linspace(0.01, 12, 500)

# m_2: omega ~ LogNormal(0, 1)
omega_pdf_m2 = stats.lognorm(s=1.0, scale=np.exp(0)).pdf(omega_grid)
ax.plot(omega_grid, omega_pdf_m2, 'forestgreen', linewidth=2, label='m_2: LogNormal(0, 1)')
ax.fill_between(omega_grid, omega_pdf_m2, alpha=0.15, color='forestgreen')

# m_3: omega = kappa * alpha ~ LogNormal(0, sqrt(1^2 + 0.5^2))
sigma_implied = np.sqrt(1.0**2 + 0.5**2)
omega_pdf_m3 = stats.lognorm(s=sigma_implied, scale=np.exp(0)).pdf(omega_grid)
ax.plot(omega_grid, omega_pdf_m3, 'darkorange', linewidth=2, 
        label=f'm_3: LogNormal(0, {sigma_implied:.2f})')
ax.fill_between(omega_grid, omega_pdf_m3, alpha=0.15, color='darkorange')

ax.set_xlabel('ω', fontsize=12)
ax.set_ylabel('Density', fontsize=12)
ax.set_title('Marginal Prior on ω: m_2 vs m_3', fontsize=12)
ax.legend(fontsize=9)
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print(f"κ prior 95% interval: [{kappa_lower:.2f}, {kappa_upper:.2f}]")
print(f"Implied σ for ω in m_3: {sigma_implied:.3f}")
print(f"m_2 ω median: {np.exp(0):.2f}, m_3 ω median: {np.exp(0):.2f} (both centered at 1)")
print(f"m_2 ω 95% PI: [{stats.lognorm(s=1.0, scale=1.0).ppf(0.025):.2f}, {stats.lognorm(s=1.0, scale=1.0).ppf(0.975):.2f}]")
print(f"m_3 ω 95% PI: [{stats.lognorm(s=sigma_implied, scale=1.0).ppf(0.025):.2f}, {stats.lognorm(s=sigma_implied, scale=1.0).ppf(0.975):.2f}]")
Figure 1: Prior analysis for the association parameter κ and the implied marginal prior on ω in m_3. Left: the LogNormal(0, 0.5) prior on κ, centered at 1 (the m_1 restriction). Center: the joint prior on (α, ω) under m_3, showing the proportional constraint. Right: comparison of the marginal prior on ω in m_2 (direct LogNormal(0, 1)) versus m_3 (implied LogNormal(0, 1.12)).
κ prior 95% interval: [0.38, 2.66]
Implied σ for ω in m_3: 1.118
m_2 ω median: 1.00, m_3 ω median: 1.00 (both centered at 1)
m_2 ω 95% PI: [0.14, 7.10]
m_3 ω 95% PI: [0.11, 8.95]

The prior analysis confirms several important properties:

  1. The \(\text{LogNormal}(0, 0.5)\) prior on κ is centered at 1 (the m_1 restriction) and assigns most mass to κ ∈ [0.38, 2.66], allowing meaningful departures without extreme values.
  2. The joint prior on (α, ω) under m_3 shows the proportional constraint clearly: draws cluster along rays from the origin, with the identity line (κ = 1) as the central tendency.
  3. The implied marginal prior on ω in m_3 is slightly wider than in m_2 (σ = 1.12 vs. 1.0), reflecting the additional uncertainty introduced by the κ parameter. Both priors are centered at the same median (ω = 1).

0.3 Study Design

We use matched study designs across all three models to enable fair comparison. The uncertain and risky problem structures are identical:

Show code
from utils.study_design_m1 import StudyDesignM1

# Shared study design configuration
config = {
    "M": 25,                    # Uncertain decision problems
    "N": 25,                    # Risky decision problems
    "K": 3,                     # Consequences
    "D": 5,                     # Feature dimensions
    "R": 15,                    # Uncertain alternatives
    "S": 15,                    # Risky alternatives
    "min_alts_per_problem": 2,
    "max_alts_per_problem": 5,
    "feature_dist": "normal",
    "feature_params": {"loc": 0, "scale": 1},
}

K = config["K"]
D = config["D"]
K_minus_1 = K - 1

print("Study Design (shared across m_1, m_2, m_3):")
print(f"  M = {config['M']} uncertain problems")
print(f"  N = {config['N']} risky problems")
print(f"  K = {config['K']} consequences")
print(f"  D = {config['D']} feature dimensions")
print(f"  R = {config['R']} uncertain alternatives")
print(f"  S = {config['S']} risky alternatives")
print(f"  Total choices per subject: ~{(config['M'] + config['N']) * 3.5:.0f}")
Study Design (shared across m_1, m_2, m_3):
  M = 25 uncertain problems
  N = 25 risky problems
  K = 3 consequences
  D = 5 feature dimensions
  R = 15 uncertain alternatives
  S = 15 risky alternatives
  Total choices per subject: ~175
Show code
# Create a single study design used for all three models
study = StudyDesignM1(
    M=config["M"],
    N=config["N"],
    K=config["K"],
    D=config["D"],
    R=config["R"],
    S=config["S"],
    min_alts_per_problem=config["min_alts_per_problem"],
    max_alts_per_problem=config["max_alts_per_problem"],
    risky_probs="random",
    feature_dist=config["feature_dist"],
    feature_params=config["feature_params"],
    design_name="m2_m3_validation"
)
study.generate()

0.4 Parameter Recovery

We perform parameter recovery for m_1, m_2, and m_3 using identical study designs and iteration counts. This extends the analysis in Report 5 by adding ω recovery diagnostics for the generalized models.

0.4.1 Running Parameter Recovery

Show code
from analysis.parameter_recovery import ParameterRecovery

# --- m_1 Recovery ---
output_dir_m1 = tempfile.mkdtemp(prefix="param_recovery_m1_compare_")
recovery_m1 = ParameterRecovery(
    inference_model_path=os.path.join(project_root, "models", "m_1.stan"),
    sim_model_path=os.path.join(project_root, "models", "m_1_sim.stan"),
    study_design=study,
    output_dir=output_dir_m1,
    n_mcmc_samples=1000,
    n_mcmc_chains=4,
    n_iterations=50
)
true_params_m1, posterior_summaries_m1 = recovery_m1.run()
print(f"m_1: {len(true_params_m1)} successful iterations")
Show code
# --- m_2 Recovery ---
output_dir_m2 = tempfile.mkdtemp(prefix="param_recovery_m2_")
recovery_m2 = ParameterRecovery(
    inference_model_path=os.path.join(project_root, "models", "m_2.stan"),
    sim_model_path=os.path.join(project_root, "models", "m_2_sim.stan"),
    study_design=study,
    output_dir=output_dir_m2,
    n_mcmc_samples=1000,
    n_mcmc_chains=4,
    n_iterations=50
)
true_params_m2, posterior_summaries_m2 = recovery_m2.run()
print(f"m_2: {len(true_params_m2)} successful iterations")
Show code
# --- m_3 Recovery ---
output_dir_m3 = tempfile.mkdtemp(prefix="param_recovery_m3_")
recovery_m3 = ParameterRecovery(
    inference_model_path=os.path.join(project_root, "models", "m_3.stan"),
    sim_model_path=os.path.join(project_root, "models", "m_3_sim.stan"),
    study_design=study,
    output_dir=output_dir_m3,
    n_mcmc_samples=1000,
    n_mcmc_chains=4,
    n_iterations=50
)
true_params_m3, posterior_summaries_m3 = recovery_m3.run()
print(f"m_3: {len(true_params_m3)} successful iterations")

0.4.2 MCMC Diagnostics

Before interpreting recovery statistics, we verify that the MCMC sampler converged properly across all three models. With 50 iterations per model and 4 chains each, we aggregate convergence diagnostics to confirm that posterior summaries reflect genuine posterior behavior.
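The aggregation follows the pattern sketched below (the 'R_hat' and 'ESS_bulk' column names are assumptions about the per-iteration summaries saved by ParameterRecovery, not a confirmed API):

import numpy as np

# Worst-case diagnostics per recovery iteration, assuming each element of
# posterior_summaries_m2 is a summary DataFrame with 'R_hat' and 'ESS_bulk'
# columns (assumed column names).
max_rhat = np.array([s['R_hat'].max() for s in posterior_summaries_m2])
min_ess = np.array([s['ESS_bulk'].min() for s in posterior_summaries_m2])

print(f"Median max R-hat:                 {np.median(max_rhat):.4f}")
print(f"Iterations with max R-hat > 1.01: {(max_rhat > 1.01).sum()}/{len(max_rhat)}")
print(f"Median min ESS:                   {np.median(min_ess):.0f}")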

=== MCMC Convergence Diagnostics ===

m_1 (50 iterations):
  Max R̂ per iteration:
    Median: 1.0030
    95th percentile: 1.0044
    Maximum: 1.0058
    Iterations with max R̂ > 1.01: 0/50
  Min ESS per iteration:
    Median: 3236
    5th percentile: 2030
    Minimum: 1744

m_2 (50 iterations):
  Max R̂ per iteration:
    Median: 1.0032
    95th percentile: 1.0079
    Maximum: 1.0717
    Iterations with max R̂ > 1.01: 2/50
  Min ESS per iteration:
    Median: 2553
    5th percentile: 832
    Minimum: 54

m_3 (50 iterations):
  Max R̂ per iteration:
    Median: 1.0030
    95th percentile: 1.0049
    Maximum: 1.0069
    Iterations with max R̂ > 1.01: 0/50
  Min ESS per iteration:
    Median: 2879
    5th percentile: 2054
    Minimum: 1338
Note: Diagnostic Assessment

The diagnostics show that m_1 and m_3 converge reliably across all 50 recovery iterations, meeting standard thresholds (\(\hat{R} < 1.01\), ESS well above recommended minimums for every parameter). Model m_2, which introduces the additional free parameter ω, is nearly as clean: 2 of 50 iterations exceed \(\hat{R} = 1.01\) (worst case 1.072), and the minimum ESS drops to 54 in one iteration, suggesting occasional slow mixing when ω is weakly constrained. The remaining iterations are unremarkable, so the posterior geometry appears well-behaved overall despite the added flexibility. Per-iteration diagnostic files (including divergence counts) are saved by the ParameterRecovery class and can be inspected individually.

0.4.3 Computing Recovery Metrics

Show code
def compute_scalar_metrics(true_params, posterior_summaries, param_name):
    """Compute bias, RMSE, coverage, and CI width for a scalar parameter."""
    true_vals = np.array([p[param_name] for p in true_params])
    mean_vals = np.array([s.loc[param_name, 'Mean'] for s in posterior_summaries])
    lower_vals = np.array([s.loc[param_name, '5%'] for s in posterior_summaries])
    upper_vals = np.array([s.loc[param_name, '95%'] for s in posterior_summaries])
    
    return {
        'true': true_vals,
        'mean': mean_vals,
        'lower': lower_vals,
        'upper': upper_vals,
        'bias': np.mean(mean_vals - true_vals),
        'rmse': np.sqrt(np.mean((mean_vals - true_vals)**2)),
        'coverage': np.mean((true_vals >= lower_vals) & (true_vals <= upper_vals)),
        'ci_width': np.mean(upper_vals - lower_vals),
    }

def compute_delta_metrics(true_params, posterior_summaries, K_minus_1):
    """Compute recovery metrics for all delta components."""
    results = []
    for k in range(K_minus_1):
        param_name = f"delta[{k+1}]"
        true_vals = np.array([p['delta'][k] for p in true_params])
        mean_vals = np.array([s.loc[param_name, 'Mean'] for s in posterior_summaries])
        lower_vals = np.array([s.loc[param_name, '5%'] for s in posterior_summaries])
        upper_vals = np.array([s.loc[param_name, '95%'] for s in posterior_summaries])
        
        results.append({
            'parameter': f'δ_{k+1}',
            'true': true_vals,
            'mean': mean_vals,
            'lower': lower_vals,
            'upper': upper_vals,
            'bias': np.mean(mean_vals - true_vals),
            'rmse': np.sqrt(np.mean((mean_vals - true_vals)**2)),
            'coverage': np.mean((true_vals >= lower_vals) & (true_vals <= upper_vals)),
            'ci_width': np.mean(upper_vals - lower_vals),
        })
    return results

# --- m_1 Metrics ---
alpha_m1 = compute_scalar_metrics(true_params_m1, posterior_summaries_m1, 'alpha')
delta_m1 = compute_delta_metrics(true_params_m1, posterior_summaries_m1, K_minus_1)

# --- m_2 Metrics ---
alpha_m2 = compute_scalar_metrics(true_params_m2, posterior_summaries_m2, 'alpha')
omega_m2 = compute_scalar_metrics(true_params_m2, posterior_summaries_m2, 'omega')
delta_m2 = compute_delta_metrics(true_params_m2, posterior_summaries_m2, K_minus_1)

# --- m_3 Metrics ---
alpha_m3 = compute_scalar_metrics(true_params_m3, posterior_summaries_m3, 'alpha')
kappa_m3 = compute_scalar_metrics(true_params_m3, posterior_summaries_m3, 'kappa')
# omega is a derived parameter in m_3; reconstruct its true values from kappa and alpha
omega_true_m3 = np.array([p['kappa'] * p['alpha'] for p in true_params_m3])
delta_m3 = compute_delta_metrics(true_params_m3, posterior_summaries_m3, K_minus_1)

0.4.4 α Recovery Across Models

The sensitivity parameter α for uncertain choices should be well-recovered in all three models, since all share the same uncertain choice structure:

Show code
fig, axes = plt.subplots(1, 3, figsize=(16, 5))

models = [('m_1', alpha_m1, 'steelblue'), ('m_2', alpha_m2, 'forestgreen'), ('m_3', alpha_m3, 'darkorange')]

for ax, (name, metrics, color) in zip(axes, models):
    ax.scatter(metrics['true'], metrics['mean'], alpha=0.7, s=60, c=color, edgecolor='white')
    ax.plot([0, metrics['true'].max() * 1.1], [0, metrics['true'].max() * 1.1], 
            'r--', linewidth=2, label='Identity')
    ax.set_xlabel('True α', fontsize=11)
    ax.set_ylabel('Estimated α', fontsize=11)
    ax.set_title(f'{name}: α Recovery (RMSE={metrics["rmse"]:.3f})', fontsize=12)
    ax.legend()
    ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
Figure 2: α recovery comparison across m_1, m_2, and m_3. All three models should recover α well, since uncertain choice structure is identical.

0.4.5 ω Recovery: The New Sensitivity Parameter

The critical new diagnostic is the recovery of ω, the risky-choice sensitivity. In m_2, ω is a free parameter estimated directly. In m_3, ω = κ · α is a derived quantity; we assess both κ recovery and the implied ω.

Show code
fig, axes = plt.subplots(1, 3, figsize=(16, 5))

# m_2: omega (free parameter)
ax = axes[0]
ax.scatter(omega_m2['true'], omega_m2['mean'], alpha=0.7, s=60, c='forestgreen', edgecolor='white')
ax.plot([0, omega_m2['true'].max() * 1.1], [0, omega_m2['true'].max() * 1.1], 
        'r--', linewidth=2, label='Identity')
ax.set_xlabel('True ω', fontsize=11)
ax.set_ylabel('Estimated ω', fontsize=11)
ax.set_title(f'm_2: ω Recovery (RMSE={omega_m2["rmse"]:.3f})', fontsize=12)
ax.legend()
ax.grid(True, alpha=0.3)

# m_3: kappa (free parameter)
ax = axes[1]
ax.scatter(kappa_m3['true'], kappa_m3['mean'], alpha=0.7, s=60, c='darkorange', edgecolor='white')
ax.plot([0, kappa_m3['true'].max() * 1.1], [0, kappa_m3['true'].max() * 1.1], 
        'r--', linewidth=2, label='Identity')
ax.set_xlabel('True κ', fontsize=11)
ax.set_ylabel('Estimated κ', fontsize=11)
ax.set_title(f'm_3: κ Recovery (RMSE={kappa_m3["rmse"]:.3f})', fontsize=12)
ax.legend()
ax.grid(True, alpha=0.3)

# m_3: derived omega
ax = axes[2]
# Approximate the derived omega by the product of posterior means
# (see the note below for the posterior-correlation caveat)
omega_mean_m3 = kappa_m3['mean'] * alpha_m3['mean']  # point estimate of derived omega
ax.scatter(omega_true_m3, omega_mean_m3, alpha=0.7, s=60, c='darkorange', edgecolor='white')
max_val = max(omega_true_m3.max(), omega_mean_m3.max()) * 1.1
ax.plot([0, max_val], [0, max_val], 'r--', linewidth=2, label='Identity')
ax.set_xlabel('True ω (= κ · α)', fontsize=11)
ax.set_ylabel('Estimated ω (= κ̂ · α̂)', fontsize=11)
ax.set_title(f'm_3: Derived ω Recovery', fontsize=12)
ax.legend()
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
Figure 3: Recovery of the risky-choice sensitivity ω. Left: m_2 estimates ω directly as a free parameter. Center: m_3 estimates κ (the proportionality factor). Right: derived ω = κ · α from m_3, computed as the product of posterior means; see the Note on Derived ω Estimation in m_3 below for the posterior-correlation caveat.
Warning: Note on Derived ω Estimation in m_3

The estimated ω plotted above for m_3 is computed as the product of posterior means: \(\hat{\omega} = \hat{\kappa} \cdot \hat{\alpha}\). This is an approximation: \(E[\kappa \cdot \alpha] = E[\kappa]\,E[\alpha] + \operatorname{Cov}(\kappa, \alpha)\), and although κ and α are independent a priori, they are generally correlated a posteriori, so the product of means differs from the posterior mean of ω. For a more precise estimate, one should compute ω = κα draw by draw from the joint posterior and then take the mean. In practice, the approximation error is small when the posteriors are concentrated, as is the case here. The m_3 Stan model computes omega = kappa * alpha in transformed parameters, so the full posterior of ω is available in the fitted model output for applications requiring exact derived-quantity summaries.
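The difference can be made concrete with a small sketch (the draw arrays below are hypothetical stand-ins for the joint posterior of a single fitted iteration):

import numpy as np

rng = np.random.default_rng(2)

# Hypothetical correlated joint posterior draws for (alpha, kappa)
alpha_draws = rng.lognormal(0.5, 0.10, size=4000)
kappa_draws = rng.lognormal(0.0, 0.05, size=4000) / alpha_draws**0.3  # induces correlation

omega_exact = np.mean(kappa_draws * alpha_draws)            # posterior mean of omega
omega_approx = np.mean(kappa_draws) * np.mean(alpha_draws)  # product of posterior means

# The gap equals the posterior covariance between kappa and alpha
print(f"exact - approx:    {omega_exact - omega_approx:.6f}")
print(f"Cov(kappa, alpha): {np.cov(kappa_draws, alpha_draws, bias=True)[0, 1]:.6f}")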

0.4.6 κ Conditional Recovery

Since κ ≈ 1 corresponds to the m_1 restriction, it is important to verify that m_3 can distinguish κ = 1 from κ ≠ 1. We examine recovery performance as a function of the true κ value:

Show code
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Panel 1: Scatter with CIs
ax = axes[0]
sort_idx = np.argsort(kappa_m3['true'])
for i in sort_idx:
    ax.plot([kappa_m3['true'][i], kappa_m3['true'][i]], 
            [kappa_m3['lower'][i], kappa_m3['upper'][i]], 
            color='darkorange', alpha=0.4, linewidth=1.5)
ax.scatter(kappa_m3['true'], kappa_m3['mean'], alpha=0.8, s=50, 
           c='darkorange', edgecolor='white', zorder=5)
max_kappa = max(kappa_m3['true'].max(), kappa_m3['mean'].max()) * 1.1
ax.plot([0, max_kappa], [0, max_kappa], 'r--', linewidth=2, label='Identity')
ax.axvline(x=1.0, color='gray', linestyle=':', linewidth=1, label='κ = 1 (m_1)')
ax.set_xlabel('True κ', fontsize=11)
ax.set_ylabel('Estimated κ', fontsize=11)
ax.set_title('m_3: κ Recovery with 90% CIs', fontsize=12)
ax.legend(fontsize=9)
ax.grid(True, alpha=0.3)

# Panel 2: Absolute error and CI width vs true kappa
ax = axes[1]
abs_error = np.abs(kappa_m3['mean'] - kappa_m3['true'])
ci_width = kappa_m3['upper'] - kappa_m3['lower']
ax.scatter(kappa_m3['true'], abs_error, alpha=0.7, s=50, c='darkorange', 
           edgecolor='white', label='|Error|')
ax.scatter(kappa_m3['true'], ci_width, alpha=0.7, s=50, c='steelblue', 
           edgecolor='white', marker='s', label='CI Width')
ax.axvline(x=1.0, color='gray', linestyle=':', linewidth=1, label='κ = 1 (m_1)')
ax.set_xlabel('True κ', fontsize=11)
ax.set_ylabel('Value', fontsize=11)
ax.set_title('Recovery Precision vs True κ', fontsize=12)
ax.legend(fontsize=9)
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
Figure 4: κ recovery in m_3 as a function of the true κ value. Left: scatter plot with error bars showing 90% credible intervals. Right: absolute error and CI width as functions of true κ. The vertical dashed line marks κ = 1 (the m_1 restriction).

0.4.7 δ Recovery Comparison

Do the generalized models maintain the δ recovery quality established for m_1 in Report 5? Adding an extra sensitivity parameter could, in principle, affect identification of other parameters if the model becomes over-parameterized:

Show code
fig, axes = plt.subplots(K_minus_1, 3, figsize=(16, 4 * K_minus_1))

models = [
    ('m_1', delta_m1, 'steelblue'),
    ('m_2', delta_m2, 'forestgreen'),
    ('m_3', delta_m3, 'darkorange'),
]

for col, (name, delta_stats, color) in enumerate(models):
    for k in range(K_minus_1):
        ax = axes[k, col] if K_minus_1 > 1 else axes[col]
        
        ax.scatter(delta_stats[k]['true'], delta_stats[k]['mean'],
                   alpha=0.7, s=60, c=color, edgecolor='white')
        ax.plot([0, 1], [0, 1], 'r--', linewidth=2, label='Identity')
        ax.set_xlim(-0.05, 1.05)
        ax.set_ylim(-0.05, 1.05)
        ax.set_xlabel(f'True δ_{k+1}', fontsize=11)
        ax.set_ylabel(f'Estimated δ_{k+1}', fontsize=11)
        ax.set_title(f'{name}: δ_{k+1} (RMSE={delta_stats[k]["rmse"]:.3f})', fontsize=11)
        ax.set_aspect('equal')
        ax.legend(fontsize=9)
        ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
Figure 5: δ recovery comparison across m_1, m_2, and m_3. The addition of ω should not substantially degrade δ recovery, since risky choices still constrain utilities directly.

0.4.8 Summary Statistics

Show code
rows = []

# Alpha
rows.append({
    'Parameter': 'α',
    'm_1 RMSE': f'{alpha_m1["rmse"]:.4f}',
    'm_1 Cov.': f'{alpha_m1["coverage"]:.0%}',
    'm_2 RMSE': f'{alpha_m2["rmse"]:.4f}',
    'm_2 Cov.': f'{alpha_m2["coverage"]:.0%}',
    'm_3 RMSE': f'{alpha_m3["rmse"]:.4f}',
    'm_3 Cov.': f'{alpha_m3["coverage"]:.0%}',
})

# Omega (m_2 only as free parameter)
rows.append({
    'Parameter': 'ω',
    'm_1 RMSE': '—',
    'm_1 Cov.': '—',
    'm_2 RMSE': f'{omega_m2["rmse"]:.4f}',
    'm_2 Cov.': f'{omega_m2["coverage"]:.0%}',
    'm_3 RMSE': '(derived)',
    'm_3 Cov.': '(derived)',
})

# Kappa (m_3 only)
rows.append({
    'Parameter': 'κ',
    'm_1 RMSE': '—',
    'm_1 Cov.': '—',
    'm_2 RMSE': '—',
    'm_2 Cov.': '—',
    'm_3 RMSE': f'{kappa_m3["rmse"]:.4f}',
    'm_3 Cov.': f'{kappa_m3["coverage"]:.0%}',
})

# Delta
for k in range(K_minus_1):
    rows.append({
        'Parameter': f'δ_{k+1}',
        'm_1 RMSE': f'{delta_m1[k]["rmse"]:.4f}',
        'm_1 Cov.': f'{delta_m1[k]["coverage"]:.0%}',
        'm_2 RMSE': f'{delta_m2[k]["rmse"]:.4f}',
        'm_2 Cov.': f'{delta_m2[k]["coverage"]:.0%}',
        'm_3 RMSE': f'{delta_m3[k]["rmse"]:.4f}',
        'm_3 Cov.': f'{delta_m3[k]["coverage"]:.0%}',
    })

summary_df = pd.DataFrame(rows)
print(summary_df.to_string(index=False))
Table 1: Parameter recovery comparison across m_1, m_2, and m_3, reporting RMSE and 90% CI coverage. The ω row applies only to m_2, where ω is a free parameter; in m_3, ω is derived as κ · α and assessed graphically in Figure 3 rather than tabulated. The κ row applies only to m_3.
Parameter m_1 RMSE m_1 Cov. m_2 RMSE m_2 Cov.  m_3 RMSE  m_3 Cov.
        α   0.6791      88%   1.2714      92%    1.2902       90%
        ω        —        —   1.1264      86% (derived) (derived)
        κ        —        —        —        —    0.5722       84%
      δ_1   0.2768      88%   0.2608      84%    0.2591       88%
      δ_2   0.2768      88%   0.2608      84%    0.2591       88%
Show code
# Collect CI widths for bar plot
params = ['α'] + [f'δ_{k+1}' for k in range(K_minus_1)]
ci_m1 = [alpha_m1['ci_width']] + [delta_m1[k]['ci_width'] for k in range(K_minus_1)]
ci_m2 = [alpha_m2['ci_width']] + [delta_m2[k]['ci_width'] for k in range(K_minus_1)]
ci_m3 = [alpha_m3['ci_width']] + [delta_m3[k]['ci_width'] for k in range(K_minus_1)]

x = np.arange(len(params))
width = 0.25

fig, ax = plt.subplots(figsize=(10, 5))
ax.bar(x - width, ci_m1, width, label='m_1', color='steelblue', alpha=0.7)
ax.bar(x, ci_m2, width, label='m_2', color='forestgreen', alpha=0.7)
ax.bar(x + width, ci_m3, width, label='m_3', color='darkorange', alpha=0.7)

ax.set_xlabel('Parameter', fontsize=12)
ax.set_ylabel('90% CI Width', fontsize=12)
ax.set_title('Credible Interval Widths: Shared Parameters', fontsize=13)
ax.set_xticks(x)
ax.set_xticklabels(params)
ax.legend()
ax.grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.show()
Figure 6: 90% credible interval widths across models and parameters. Wider intervals indicate greater posterior uncertainty. The key question is whether adding ω introduces meaningful increases in uncertainty for the shared parameters (α, δ).
Tip: Key Findings from Parameter Recovery

α (uncertain sensitivity): Should be well-recovered across all three models, since the uncertain choice likelihood is structurally identical in m_1, m_2, and m_3.

ω (risky sensitivity): In m_2, ω is a free parameter identified directly from risky choices. In m_3, ω = κ · α is identified through the joint posterior of κ and α. The m_3 parameterization may yield tighter uncertainty when the proportional assumption approximately holds.

κ (association parameter): Unique to m_3, κ captures the ratio ω/α. Its recovery depends on how well both α and ω are individually constrained by the data.

δ (utility increments): Recovery should remain strong across all models, since risky choices continue to provide direct information about utilities regardless of whether sensitivity is shared or separate.

0.4.9 Posterior Contraction for κ

To assess how much the data inform κ beyond the prior, we examine posterior contraction: the ratio of posterior width to prior width. Large contraction indicates the data are genuinely informative about the sensitivity ratio.

Show code
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Panel 1: Prior vs posterior for a few representative iterations
ax = axes[0]
kappa_grid = np.linspace(0.01, 5, 500)
prior_pdf = stats.lognorm(s=0.5, scale=1.0).pdf(kappa_grid)
ax.plot(kappa_grid, prior_pdf, 'k--', linewidth=2, label='Prior: LogNormal(0, 0.5)')

# Show a few representative posteriors using normal approximation from posterior summaries
n_show = min(5, len(posterior_summaries_m3))
colors = plt.cm.Oranges(np.linspace(0.4, 0.9, n_show))
for i in range(n_show):
    post_mean = posterior_summaries_m3[i].loc['kappa', 'Mean']
    post_std = posterior_summaries_m3[i].loc['kappa', 'StdDev']
    # Approximate posterior as lognormal with matching moments
    if post_mean > 0 and post_std > 0:
        sigma2 = np.log(1 + (post_std / post_mean)**2)
        mu = np.log(post_mean) - sigma2/2
        post_pdf = stats.lognorm(s=np.sqrt(sigma2), scale=np.exp(mu)).pdf(kappa_grid)
        true_k = true_params_m3[i]['kappa']
        ax.plot(kappa_grid, post_pdf, color=colors[i], linewidth=1.5, 
                label=f'Posterior (true κ={true_k:.2f})')

ax.set_xlabel('κ', fontsize=11)
ax.set_ylabel('Density', fontsize=11)
ax.set_title('Prior vs Posterior for κ', fontsize=12)
ax.legend(fontsize=8, loc='upper right')
ax.grid(True, alpha=0.3)
ax.set_xlim(0, 5)

# Panel 2: Contraction ratio distribution
ax = axes[1]
prior_width = stats.lognorm(s=0.5, scale=1.0).ppf(0.95) - stats.lognorm(s=0.5, scale=1.0).ppf(0.05)
posterior_widths = kappa_m3['upper'] - kappa_m3['lower']
contraction_ratios = posterior_widths / prior_width

ax.hist(contraction_ratios, bins=15, alpha=0.7, color='darkorange', edgecolor='white')
ax.axvline(x=1.0, color='red', linestyle='--', linewidth=2, label='No contraction')
ax.axvline(x=np.median(contraction_ratios), color='black', linestyle='-', linewidth=2, 
           label=f'Median: {np.median(contraction_ratios):.2f}')
ax.set_xlabel('Posterior / Prior Width Ratio', fontsize=11)
ax.set_ylabel('Count', fontsize=11)
ax.set_title('κ Posterior Contraction', fontsize=12)
ax.legend(fontsize=9)
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print(f"Prior 90% interval width: {prior_width:.3f}")
print(f"Median posterior 90% interval width: {np.median(posterior_widths):.3f}")
print(f"Median contraction ratio: {np.median(contraction_ratios):.3f}")
print(f"Iterations with contraction < 1: {(contraction_ratios < 1).sum()}/{len(contraction_ratios)}")
Figure 7: Posterior contraction for κ in m_3. Left: prior (dashed) and representative posteriors (solid) for selected recovery iterations. Right: distribution of posterior-to-prior width ratios across all 50 iterations. A ratio below 1 indicates the posterior is narrower than the prior (data are informative).
Prior 90% interval width: 1.837
Median posterior 90% interval width: 1.616
Median contraction ratio: 0.880
Iterations with contraction < 1: 44/50

0.5 Simulation-Based Calibration

We now perform SBC for m_2 and m_3, following the methodology established for m_0 and m_1 in Report 6.

0.5.1 SBC Configuration

Show code
# SBC configuration matching Report 6 for comparability
n_sbc_sims = 999            # Number of SBC iterations
n_mcmc_chains = 1           # Single chain (standard for SBC)
thin = 4                    # Thinning factor
n_mcmc_samples = thin * n_sbc_sims   # 3996 total draws → 999 post-thinned
max_rank = n_mcmc_samples // thin     # 999 (= L)
n_bins = 20                 # Histogram bins
expected_per_bin = n_sbc_sims / n_bins

print("SBC Configuration (matching Report 6):")
print(f"  Simulations (N):        {n_sbc_sims}")
print(f"  MCMC samples per sim:   {n_mcmc_samples}")
print(f"  Chains:                 {n_mcmc_chains}")
print(f"  Thinning factor:        {thin}")
print(f"  Post-thinned draws (L): {max_rank}")
print(f"  Possible rank values:   {max_rank + 1}")
print(f"  Histogram bins:         {n_bins}")
print(f"  Expected counts/bin:    {expected_per_bin:.1f}")
SBC Configuration (matching Report 6):
  Simulations (N):        999
  MCMC samples per sim:   3996
  Chains:                 1
  Thinning factor:        4
  Post-thinned draws (L): 999
  Possible rank values:   1000
  Histogram bins:         20
  Expected counts/bin:    50.0
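For orientation, the rank statistic at the heart of SBC reduces to a few lines (an illustrative sketch, not the SimulationBasedCalibration implementation):

import numpy as np

rng = np.random.default_rng(1)

# One SBC iteration for a single scalar parameter, using stand-in draws:
theta_true = rng.lognormal(0, 1)            # 1. draw the parameter from its prior
posterior_draws = rng.lognormal(0, 1, 999)  # 2. stand-in for L = 999 thinned MCMC draws
                                            #    (real draws come from fitting simulated data)
rank = int(np.sum(posterior_draws < theta_true))  # 3. rank of the truth: 0..999

# Under correct calibration, repeating steps 1-3 across simulations yields
# ranks that are uniform on {0, ..., L}: the flat histograms examined below.
print(rank)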

0.5.2 Running SBC

Show code
from utils.study_design_m1 import StudyDesignM1
from analysis.sbc import SimulationBasedCalibration

# Create SBC study design (shared across models)
study_sbc = StudyDesignM1(
    M=config["M"],
    N=config["N"],
    K=config["K"],
    D=config["D"],
    R=config["R"],
    S=config["S"],
    min_alts_per_problem=config["min_alts_per_problem"],
    max_alts_per_problem=config["max_alts_per_problem"],
    risky_probs="random",
    feature_dist=config["feature_dist"],
    feature_params=config["feature_params"],
    design_name="sbc_m1_m2_m3"
)
study_sbc.generate()

# --- SBC for m_1 ---
output_dir_sbc_m1 = tempfile.mkdtemp(prefix="sbc_m1_compare_")
sbc_m1 = SimulationBasedCalibration(
    sbc_model_path=os.path.join(project_root, "models", "m_1_sbc.stan"),
    study_design=study_sbc,
    output_dir=output_dir_sbc_m1,
    n_sbc_sims=n_sbc_sims,
    n_mcmc_samples=n_mcmc_samples,
    n_mcmc_chains=n_mcmc_chains,
    thin=thin
)
ranks_m1, true_params_sbc_m1 = sbc_m1.run()
print(f"m_1 SBC: {ranks_m1.shape[0]} simulations, {ranks_m1.shape[1]} parameters")
Show code
# --- SBC for m_2 ---
output_dir_sbc_m2 = tempfile.mkdtemp(prefix="sbc_m2_")
sbc_m2 = SimulationBasedCalibration(
    sbc_model_path=os.path.join(project_root, "models", "m_2_sbc.stan"),
    study_design=study_sbc,
    output_dir=output_dir_sbc_m2,
    n_sbc_sims=n_sbc_sims,
    n_mcmc_samples=n_mcmc_samples,
    n_mcmc_chains=n_mcmc_chains,
    thin=thin
)
ranks_m2, true_params_sbc_m2 = sbc_m2.run()
print(f"m_2 SBC: {ranks_m2.shape[0]} simulations, {ranks_m2.shape[1]} parameters")
Show code
# --- SBC for m_3 ---
output_dir_sbc_m3 = tempfile.mkdtemp(prefix="sbc_m3_")
sbc_m3 = SimulationBasedCalibration(
    sbc_model_path=os.path.join(project_root, "models", "m_3_sbc.stan"),
    study_design=study_sbc,
    output_dir=output_dir_sbc_m3,
    n_sbc_sims=n_sbc_sims,
    n_mcmc_samples=n_mcmc_samples,
    n_mcmc_chains=n_mcmc_chains,
    thin=thin
)
ranks_m3, true_params_sbc_m3 = sbc_m3.run()
print(f"m_3 SBC: {ranks_m3.shape[0]} simulations, {ranks_m3.shape[1]} parameters")

0.5.3 α Rank Distributions

Show code
fig, axes = plt.subplots(1, 3, figsize=(16, 4))

expected_count = n_sbc_sims / n_bins

models_sbc = [
    ('m_1', ranks_m1, 'steelblue'),
    ('m_2', ranks_m2, 'forestgreen'),
    ('m_3', ranks_m3, 'darkorange'),
]

for ax, (name, ranks, color) in zip(axes, models_sbc):
    ax.hist(ranks[:, 0], bins=n_bins, alpha=0.7, color=color, edgecolor='white')
    ax.axhline(y=expected_count, color='red', linestyle='--', linewidth=2,
               label=f'Uniform (E={expected_count:.1f})')
    ax.set_xlabel('Rank', fontsize=11)
    ax.set_ylabel('Count', fontsize=11)
    ax.set_title(f'{name}: α Rank Distribution', fontsize=12)
    ax.legend()
    ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Chi-square tests
for name, ranks, _ in models_sbc:
    chi2, p = stats.chisquare(np.histogram(ranks[:, 0], bins=n_bins)[0])
    print(f"  {name} α: χ² = {chi2:.2f}, p = {p:.3f}")
Figure 8: SBC rank distributions for α across all three models. Uniformity indicates correct posterior calibration for the uncertain-choice sensitivity parameter.
  m_1 α: χ² = 16.72, p = 0.609
  m_2 α: χ² = 14.23, p = 0.770
  m_3 α: χ² = 18.96, p = 0.460

0.5.4 ω Rank Distributions

This is the key new diagnostic: does the risky-choice sensitivity parameter exhibit proper calibration?

Show code
# In the SBC models, the parameter ordering in pars_ is:
# m_2: [alpha, omega, beta (K*D), delta (K-1)]
# m_3: [alpha, kappa, beta (K*D), delta (K-1)]
omega_idx = 1  # Second parameter in both m_2 and m_3

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

# m_2: omega
ax = axes[0]
ax.hist(ranks_m2[:, omega_idx], bins=n_bins, alpha=0.7, color='forestgreen', edgecolor='white')
ax.axhline(y=expected_count, color='red', linestyle='--', linewidth=2,
           label=f'Uniform (E={expected_count:.1f})')
ax.set_xlabel('Rank', fontsize=11)
ax.set_ylabel('Count', fontsize=11)
ax.set_title('m_2: ω Rank Distribution', fontsize=12)
ax.legend()
ax.grid(True, alpha=0.3)

# m_3: kappa
ax = axes[1]
ax.hist(ranks_m3[:, omega_idx], bins=n_bins, alpha=0.7, color='darkorange', edgecolor='white')
ax.axhline(y=expected_count, color='red', linestyle='--', linewidth=2,
           label=f'Uniform (E={expected_count:.1f})')
ax.set_xlabel('Rank', fontsize=11)
ax.set_ylabel('Count', fontsize=11)
ax.set_title('m_3: κ Rank Distribution', fontsize=12)
ax.legend()
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Formal tests
chi2_m2_omega, p_m2_omega = stats.chisquare(np.histogram(ranks_m2[:, omega_idx], bins=n_bins)[0])
chi2_m3_kappa, p_m3_kappa = stats.chisquare(np.histogram(ranks_m3[:, omega_idx], bins=n_bins)[0])

print(f"\nRisky Sensitivity Uniformity Tests:")
print(f"  m_2 ω: χ² = {chi2_m2_omega:.2f}, p = {p_m2_omega:.3f}")
print(f"  m_3 κ: χ² = {chi2_m3_kappa:.2f}, p = {p_m3_kappa:.3f}")
Figure 9: SBC rank distributions for the risky-choice sensitivity parameter. Left: ω in m_2 (free parameter, index 1). Right: κ in m_3 (association parameter, index 1). Uniformity indicates correct calibration.

Risky Sensitivity Uniformity Tests:
  m_2 ω: χ² = 10.91, p = 0.927
  m_3 κ: χ² = 14.63, p = 0.746
Note: Interpreting ω and κ Calibration
  • In m_2, ω is identified directly from risky choices—its calibration parallels α’s calibration from uncertain choices. Since risky choices depend on ω and δ (but not β), the identification pathway is clean.
  • In m_3, κ is identified from the joint information in risky and uncertain choices: the data constrain both α (via uncertain choices) and ω = κα (via risky choices), from which κ is inferred. This indirect pathway may result in slightly wider posteriors for κ but should not affect calibration (i.e., ranks should still be uniform if the model is correctly specified).

0.5.5 δ Rank Distributions

Show code
# Delta parameter indices differ across models:
# m_1: [alpha (1), beta (K*D), delta (K-1)] → delta starts at 1 + K*D
# m_2: [alpha (1), omega (1), beta (K*D), delta (K-1)] → delta starts at 2 + K*D
# m_3: [alpha (1), kappa (1), beta (K*D), delta (K-1)] → delta starts at 2 + K*D
delta_start_m1 = 1 + K * D
delta_start_m2 = 2 + K * D
delta_start_m3 = 2 + K * D

fig, axes = plt.subplots(3, K_minus_1, figsize=(6 * K_minus_1, 12))

model_delta_info = [
    ('m_1', ranks_m1, delta_start_m1, 'steelblue'),
    ('m_2', ranks_m2, delta_start_m2, 'forestgreen'),
    ('m_3', ranks_m3, delta_start_m3, 'darkorange'),
]

for row, (name, ranks, delta_start, color) in enumerate(model_delta_info):
    for k in range(K_minus_1):
        param_idx = delta_start + k
        ax = axes[row, k] if K_minus_1 > 1 else axes[row]
        
        ax.hist(ranks[:, param_idx], bins=n_bins, alpha=0.7, color=color, edgecolor='white')
        ax.axhline(y=expected_count, color='red', linestyle='--', linewidth=2,
                   label=f'Uniform (E={expected_count:.1f})')
        ax.set_xlabel('Rank', fontsize=11)
        ax.set_ylabel('Count', fontsize=11)
        ax.set_title(f'{name}: δ_{k+1} Rank Distribution', fontsize=12)
        ax.legend(fontsize=9)
        ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
Figure 10: SBC rank distributions for δ parameters across all three models. The delta parameters are indexed after the scalar sensitivity parameters and the K×D beta matrix.
δ Parameter Uniformity Tests:
------------------------------------------------------------
  m_1 δ_1: χ² = 17.08, p = 0.585 | KS = 0.031, p = 0.285
  m_1 δ_2: χ² = 16.32, p = 0.636 | KS = 0.031, p = 0.285
  m_2 δ_1: χ² = 15.59, p = 0.684 | KS = 0.030, p = 0.322
  m_2 δ_2: χ² = 15.15, p = 0.713 | KS = 0.030, p = 0.322
  m_3 δ_1: χ² = 15.99, p = 0.658 | KS = 0.035, p = 0.168
  m_3 δ_2: χ² = 15.95, p = 0.660 | KS = 0.035, p = 0.168

0.5.6 ECDF Comparison

Show code
fig, axes = plt.subplots(1, K_minus_1, figsize=(6 * K_minus_1, 5))
if K_minus_1 == 1:
    axes = [axes]

for k in range(K_minus_1):
    ax = axes[k]
    
    for name, ranks, delta_start, color in model_delta_info:
        param_idx = delta_start + k
        ranks_norm = np.sort(ranks[:, param_idx]) / max_rank
        ecdf = np.arange(1, len(ranks_norm) + 1) / len(ranks_norm)
        ax.step(ranks_norm, ecdf, where='post', label=name, color=color, linewidth=2)
    
    ax.plot([0, 1], [0, 1], 'r--', linewidth=2, label='Uniform')
    
    # 95% confidence band
    n = n_sbc_sims
    epsilon = np.sqrt(np.log(2/0.05) / (2 * n))
    ax.fill_between([0, 1], [0 - epsilon, 1 - epsilon], [0 + epsilon, 1 + epsilon],
                    alpha=0.15, color='red', label='95% CI')
    
    ax.set_xlabel('Normalized Rank', fontsize=11)
    ax.set_ylabel('ECDF', fontsize=11)
    ax.set_title(f'δ_{k+1} ECDF Comparison', fontsize=12)
    ax.legend(fontsize=9)
    ax.grid(True, alpha=0.3)
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)

plt.tight_layout()
plt.show()
Figure 11: ECDF plots for δ parameters across all three models. The shaded band shows the 95% Kolmogorov-Smirnov confidence region for n=999 simulations.

0.5.7 ECDF for Sensitivity Parameters

Show code
fig, axes = plt.subplots(1, 3, figsize=(16, 5))

# Alpha ECDF across all models
ax = axes[0]
for name, ranks, _, color in model_delta_info:
    ranks_norm = np.sort(ranks[:, 0]) / max_rank
    ecdf = np.arange(1, len(ranks_norm) + 1) / len(ranks_norm)
    ax.step(ranks_norm, ecdf, where='post', label=name, color=color, linewidth=2)

ax.plot([0, 1], [0, 1], 'r--', linewidth=2, label='Uniform')
epsilon = np.sqrt(np.log(2/0.05) / (2 * n_sbc_sims))
ax.fill_between([0, 1], [0 - epsilon, 1 - epsilon], [0 + epsilon, 1 + epsilon],
                alpha=0.15, color='red', label='95% CI')
ax.set_xlabel('Normalized Rank', fontsize=11)
ax.set_ylabel('ECDF', fontsize=11)
ax.set_title('α ECDF Comparison', fontsize=12)
ax.legend(fontsize=9)
ax.grid(True, alpha=0.3)
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)

# m_2 omega ECDF
ax = axes[1]
ranks_norm = np.sort(ranks_m2[:, omega_idx]) / max_rank
ecdf = np.arange(1, len(ranks_norm) + 1) / len(ranks_norm)
ax.step(ranks_norm, ecdf, where='post', label='m_2 ω', color='forestgreen', linewidth=2)
ax.plot([0, 1], [0, 1], 'r--', linewidth=2, label='Uniform')
ax.fill_between([0, 1], [0 - epsilon, 1 - epsilon], [0 + epsilon, 1 + epsilon],
                alpha=0.15, color='red', label='95% CI')
ax.set_xlabel('Normalized Rank', fontsize=11)
ax.set_ylabel('ECDF', fontsize=11)
ax.set_title('m_2: ω ECDF', fontsize=12)
ax.legend(fontsize=9)
ax.grid(True, alpha=0.3)
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)

# m_3 kappa ECDF
ax = axes[2]
ranks_norm = np.sort(ranks_m3[:, omega_idx]) / max_rank
ecdf = np.arange(1, len(ranks_norm) + 1) / len(ranks_norm)
ax.step(ranks_norm, ecdf, where='post', label='m_3 κ', color='darkorange', linewidth=2)
ax.plot([0, 1], [0, 1], 'r--', linewidth=2, label='Uniform')
ax.fill_between([0, 1], [0 - epsilon, 1 - epsilon], [0 + epsilon, 1 + epsilon],
                alpha=0.15, color='red', label='95% CI')
ax.set_xlabel('Normalized Rank', fontsize=11)
ax.set_ylabel('ECDF', fontsize=11)
ax.set_title('m_3: κ ECDF', fontsize=12)
ax.legend(fontsize=9)
ax.grid(True, alpha=0.3)
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)

plt.tight_layout()
plt.show()
Figure 12: ECDF plots for sensitivity parameters. Left: α across all models. Center: ω in m_2. Right: κ in m_3.

0.5.8 β Calibration Summary

Show code
beta_start_m1 = 1
beta_start_m2 = 2
beta_start_m3 = 2

beta_pvals = {}
for name, ranks, beta_start in [('m_1', ranks_m1, beta_start_m1),
                                  ('m_2', ranks_m2, beta_start_m2),
                                  ('m_3', ranks_m3, beta_start_m3)]:
    pvals = []
    for idx in range(beta_start, beta_start + K * D):
        counts = np.histogram(ranks[:, idx], bins=n_bins)[0]
        _, p = stats.chisquare(counts)
        pvals.append(p)
    beta_pvals[name] = pvals

fig, ax = plt.subplots(figsize=(12, 5))

x = np.arange(K * D)
width = 0.25

ax.bar(x - width, beta_pvals['m_1'], width, label='m_1', color='steelblue', alpha=0.7)
ax.bar(x, beta_pvals['m_2'], width, label='m_2', color='forestgreen', alpha=0.7)
ax.bar(x + width, beta_pvals['m_3'], width, label='m_3', color='darkorange', alpha=0.7)

ax.axhline(y=0.05, color='red', linestyle='--', linewidth=2, label='α = 0.05')
ax.set_xlabel('β Parameter Index', fontsize=11)
ax.set_ylabel('Chi-square p-value', fontsize=11)
ax.set_title('β Parameter Calibration (p-values)', fontsize=12)
ax.legend()
ax.grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.show()

for name in ['m_1', 'm_2', 'm_3']:
    n_good = np.sum(np.array(beta_pvals[name]) > 0.05)
    print(f"  {name}: {n_good}/{K*D} β parameters well-calibrated (p > 0.05)")
Figure 13: Summary of β parameter SBC calibration across all three models. Chi-square p-values for each of the K×D β parameters are shown. Values above the red line indicate acceptable calibration; with 15 tests per model at the 0.05 level, roughly one sub-threshold p-value per model is expected by chance, so the two failures for m_3 are unremarkable on their own.
  m_1: 15/15 β parameters well-calibrated (p > 0.05)
  m_2: 15/15 β parameters well-calibrated (p > 0.05)
  m_3: 13/15 β parameters well-calibrated (p > 0.05)

0.5.9 Comprehensive SBC Summary

Show code
summary_rows = []

# Helper function
def sbc_tests(ranks, param_idx, n_bins, max_rank):
    counts = np.histogram(ranks[:, param_idx], bins=n_bins)[0]
    chi2, p = stats.chisquare(counts)
    ks_stat, ks_p = stats.kstest(ranks[:, param_idx] / max_rank, 'uniform')
    return chi2, p, ks_stat, ks_p

# Alpha
for name, ranks, _ in [('m_1', ranks_m1, None), ('m_2', ranks_m2, None), ('m_3', ranks_m3, None)]:
    chi2, p, ks, ks_p = sbc_tests(ranks, 0, n_bins, max_rank)
    summary_rows.append({
        'Model': name, 'Parameter': 'α',
        'χ²': f'{chi2:.2f}', 'χ² p': f'{p:.3f}',
        'KS': f'{ks:.3f}', 'KS p': f'{ks_p:.3f}'
    })

# Omega (m_2) / Kappa (m_3)
chi2, p, ks, ks_p = sbc_tests(ranks_m2, omega_idx, n_bins, max_rank)
summary_rows.append({
    'Model': 'm_2', 'Parameter': 'ω',
    'χ²': f'{chi2:.2f}', 'χ² p': f'{p:.3f}',
    'KS': f'{ks:.3f}', 'KS p': f'{ks_p:.3f}'
})
chi2, p, ks, ks_p = sbc_tests(ranks_m3, omega_idx, n_bins, max_rank)
summary_rows.append({
    'Model': 'm_3', 'Parameter': 'κ',
    'χ²': f'{chi2:.2f}', 'χ² p': f'{p:.3f}',
    'KS': f'{ks:.3f}', 'KS p': f'{ks_p:.3f}'
})

# Beta (aggregated)
for name, pvals in beta_pvals.items():
    summary_rows.append({
        'Model': name, 'Parameter': 'β (mean p)',
        'χ²': '—', 'χ² p': f'{np.mean(pvals):.3f}',
        'KS': '—', 'KS p': '—'
    })

# Delta
for name, ranks, delta_start, _ in model_delta_info:
    for k in range(K_minus_1):
        chi2, p, ks, ks_p = sbc_tests(ranks, delta_start + k, n_bins, max_rank)
        summary_rows.append({
            'Model': name, 'Parameter': f'δ_{k+1}',
            'χ²': f'{chi2:.2f}', 'χ² p': f'{p:.3f}',
            'KS': f'{ks:.3f}', 'KS p': f'{ks_p:.3f}'
        })

sbc_df = pd.DataFrame(summary_rows)
print(sbc_df.to_string(index=False))
Table 2: SBC calibration summary across all three models. Higher p-values indicate better calibration (uniformity of rank distributions). With 999 simulations and ~50 expected counts per bin, the chi-square tests are well-powered.
Model  Parameter    χ²  χ² p    KS  KS p
  m_1          α 16.72 0.609 0.021 0.761
  m_2          α 14.23 0.770 0.021 0.761
  m_3          α 18.96 0.460 0.023 0.656
  m_2          ω 10.91 0.927 0.023 0.656
  m_3          κ 14.63 0.746 0.023 0.656
  m_1 β (mean p)     — 0.433     —     —
  m_2 β (mean p)     — 0.538     —     —
  m_3 β (mean p)     — 0.494     —     —
  m_1        δ_1 17.08 0.585 0.031 0.285
  m_1        δ_2 16.32 0.636 0.031 0.285
  m_2        δ_1 15.59 0.684 0.030 0.322
  m_2        δ_2 15.15 0.713 0.030 0.322
  m_3        δ_1 15.99 0.658 0.035 0.168
  m_3        δ_2 15.95 0.660 0.035 0.168

0.6 Discussion

0.6.1 Identification Analysis

The identification structure of m_2 and m_3 extends directly from the m_1 analysis in Report 5. The arguments below are informal extensions of the formal identification results established there; a fully rigorous treatment (e.g., formal identification theorems with explicit regularity conditions) is beyond the scope of this report but would follow the same logical structure.

Model m_2 (independent sensitivities): The parameters are \(\theta = (\alpha, \omega, \beta, \delta)\). Identification proceeds in stages:

  1. Risky choices identify ω and δ (via the same argument as α and δ in m_1’s risky component)
  2. Given δ, uncertain choices identify α and β (via the same argument as in m_1’s uncertain component)

The key observation is that α and ω appear in separate likelihood components—α in the uncertain log-likelihood and ω in the risky log-likelihood—so they are identified independently.

Model m_3 (proportional sensitivities): The parameters are \(\theta = (\alpha, \kappa, \beta, \delta)\) with \(\omega = \kappa \alpha\). Identification requires:

  1. Risky choices constrain \(\omega = \kappa \alpha\) and δ
  2. Uncertain choices constrain α and β (given δ)
  3. Given α from step 2, κ is identified from \(\kappa = \omega / \alpha\)

The proportional structure means κ is identified indirectly through the ratio of risky to uncertain sensitivities. This is well-identified when both α and ω are well-constrained, but may show wider posteriors in small samples compared to m_2’s direct estimation of ω.
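To see what this indirect identification implies for posterior width, consider a small propagation sketch (the marginal posteriors are hypothetical and treated as independent purely for illustration):

import numpy as np

rng = np.random.default_rng(3)

# Hypothetical marginal posteriors: uncertain choices pin down alpha,
# risky choices pin down omega; kappa is inferred as their ratio.
alpha_draws = rng.lognormal(np.log(2.0), 0.15, size=4000)
omega_draws = rng.lognormal(np.log(3.0), 0.15, size=4000)
kappa_draws = omega_draws / alpha_draws

# On the log scale, the two uncertainties add in quadrature under independence:
# sd(log kappa) = sqrt(0.15^2 + 0.15^2), about 0.212
print(f"sd of log(kappa): {np.log(kappa_draws).std():.3f}")
print("kappa 5/50/95%:", np.quantile(kappa_draws, [0.05, 0.5, 0.95]).round(2))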

0.6.2 Comparing the Models

The three models represent different assumptions about the relationship between uncertain and risky sensitivity:

Tip: Model Selection Guidance

Consideration                 Favors m_1  Favors m_2  Favors m_3
Parsimony                     ✓
Flexibility                               ✓
Interpretability                                      ✓
Theory (shared rationality)   ✓
Theory (context effects)                  ✓           ✓
Small sample size             ✓

m_1 is appropriate when there is no reason to expect sensitivity to differ across decision contexts, or when sample sizes are small and parsimony is important.

m_2 is appropriate when sensitivity may differ arbitrarily between contexts and sample sizes are sufficient to estimate the additional parameter.

m_3 provides a middle ground: it allows context-dependent sensitivity while encoding the structural assumption that the two sensitivities are proportionally related, aiding identification in moderate sample sizes.

0.6.3 Formal Model Comparison

This report validates that m_2 and m_3 are computationally well-behaved, but it does not perform formal model comparison (e.g., leave-one-out cross-validation via PSIS-LOO, Bayes factors, or stacking weights). The nesting hierarchy m_1 ⊂ m_3 ⊂ m_2 naturally invites such comparisons, and the availability of log_lik in each model’s generated quantities block makes PSIS-LOO straightforward to compute via ArviZ.
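When that comparison is eventually run, it might look like the sketch below (fit_m1, fit_m2, and fit_m3 are hypothetical CmdStanPy fits of the three models to the same data, each assumed to emit log_lik in generated quantities as noted above):

import arviz as az

# Hypothetical fits of all three models to shared choice data
idata = {
    name: az.from_cmdstanpy(posterior=fit, log_likelihood="log_lik")
    for name, fit in [("m_1", fit_m1), ("m_2", fit_m2), ("m_3", fit_m3)]
}

comparison = az.compare(idata)  # defaults to PSIS-LOO with stacking weights
print(comparison[["elpd_loo", "se", "weight"]])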

Formal model comparison on applied data is an important next step. With simulated data, a natural diagnostic is to generate data from each model in turn, fit all three, and verify that model comparison correctly identifies the data-generating model. This cross-model recovery exercise would directly test the distinguishability of the nested models—a practical concern for applied users deciding between m_1, m_2, and m_3. We defer this analysis to the application reports, where model comparison is motivated by substantive scientific questions rather than the methodological validation goals of the foundational series.

0.6.4 Implications for Experimental Design

The addition of ω has implications for study design:

  1. Sample size: All three models require risky choices for utility identification. Models m_2 and m_3 additionally rely on risky choices to identify ω (or κ), so adequate N is important
  2. Lottery diversity: Diverse risky alternatives—spanning the probability simplex—help identify both δ and the risky sensitivity
  3. Balance: Roughly equal numbers of uncertain and risky problems (M ≈ N) provide balanced information about both sensitivity parameters

As a rough guideline: the parameter recovery results in this report use 25 risky problems and 25 uncertain problems, which provides adequate identification for all three models. For m_2, which must identify ω independently from risky data alone, smaller N may require stronger priors or accepting wider posteriors on ω. For m_3, the proportional constraint partially regularizes the problem, so fewer risky problems may suffice—though κ estimation will be correspondingly noisier.

0.7 Conclusion

This report establishes that models m_2 and m_3 are computationally well-behaved extensions of m_1:

  1. Parameter recovery: Both m_2 and m_3 achieve good recovery of all parameters, including the new sensitivity parameter ω (estimated directly in m_2, derived from κ in m_3). The shared parameters (α, β, δ) maintain recovery quality comparable to m_1.

  2. SBC validation: Rank distributions are approximately uniform for all parameters in both models, confirming that the posteriors are well-calibrated. The new parameters (ω in m_2, κ in m_3) exhibit proper calibration, demonstrating that the Stan implementations correctly sample from the posterior.

  3. Model hierarchy: The results support the use of m_1, m_2, or m_3 depending on the research question and available data. The nesting structure (m_1 ⊂ m_3 ⊂ m_2) provides a natural framework for model comparison in applied settings.

  4. The ω parameter: The risky-choice sensitivity is well-identified in both generalized models, whether estimated directly (m_2) or through the proportional relationship (m_3). This validates the extension from a shared sensitivity to context-dependent sensitivities.

Note: Connection to Earlier Reports

This report completes the foundational validation cycle for the model family:

  • Report 4: Identified δ recovery challenges in m_0
  • Report 5: Showed that risky choices (m_1) resolve δ identification
  • Report 6: Confirmed m_1 calibration via SBC
  • Report 7 (this report): Extended validation to m_2 and m_3, confirming that relaxing the shared-α assumption does not introduce computational or identification pathologies

0.7.1 Closing the m_0 / m_1 / m_2 / m_3 Arc

This report concludes the first arc of the foundational series: the development and validation of the m_0, m_1, m_2, and m_3 family, all of which assume sensitivity parameters (single or context-split) that apply uniformly across observations. Across seven reports, we have developed and validated a family of Bayesian decision-theoretic models:

  1. Abstract Formulation (Report 1): Established the theoretical framework—subjective expected utility with softmax choice, feature-based probability inference, and the Dirichlet-increment utility parameterization.
  2. Concrete Implementation (Report 2): Translated the framework into the m_0 Stan model for decisions under uncertainty.
  3. Prior Predictive Analysis (Report 3): Verified that the chosen priors generate behaviorally plausible choice patterns.
  4. Parameter Recovery (Report 4): Demonstrated that m_0 recovers α and β well but identified the fundamental δ non-identification problem under pure uncertainty.
  5. Adding Risky Choices (Report 5): Introduced the Anscombe-Aumann insight—combining uncertain and risky choices in m_1 resolves utility identification, making δ recoverable.
  6. SBC Validation (Report 6): Confirmed via 999-simulation SBC that the m_0 and m_1 posteriors are well-calibrated.
  7. Generalizing Sensitivity (this report): Extended the model family with m_2 (independent sensitivities) and m_3 (proportional sensitivities), confirming that the generalizations are computationally well-behaved.

This arc establishes the methodology for the m_0/m_1/m_2/m_3 family. The foundational series continues in Reports 8–12, which relax the assumption of sensitivity parameters applied uniformly across observations and develop the hierarchical regression-on-log(α) model h_m01, together with its prior predictive, parameter recovery, and SBC validation. The open questions for the present arc concern application: Which model (m_1, m_2, or m_3) is appropriate for a given empirical setting? Do the data support context-dependent sensitivity? How do the models perform on real choice data? These questions are addressed in the application reports.

0.7.2 Open Directions

Several extensions are natural but beyond the scope of the foundational validation:

  • Cross-model recovery: Fitting m_2 to data generated from m_1 (or vice versa) to assess model misspecification robustness
  • Posterior predictive checks on applied data: Verifying that the models capture observed choice patterns, not just recovering their own parameters
  • Prior sensitivity analysis: Systematically varying the κ prior in m_3 to assess robustness of conclusions
  • Hierarchical extensions: Estimating subject-level sensitivity parameters in multi-participant studies

Citation

BibTeX citation:
@online{helzner2026,
  author = {Helzner, Jeff},
  title = {Generalizing {Sensitivity}},
  date = {2026-05-12},
  url = {https://jeffhelzner.github.io/seu-sensitivity/foundations/07_generalizing_sensitivity.html},
  langid = {en}
}
For attribution, please cite this work as:
Helzner, Jeff. 2026. “Generalizing Sensitivity.” SEU Sensitivity Project, May 12. https://jeffhelzner.github.io/seu-sensitivity/foundations/07_generalizing_sensitivity.html.