Replies: 2 comments 1 reply
-
Hi @ding9889, this is an interesting case, and I don't have an immediate resolution to offer. Some comments: First, there is a possibility we have a bug somewhere, and we will need to investigate that. Second, using the collapsing bound to account for a task deadline can be tricky; I recommend you look at this paper to get an idea of the kinds of pathologies that can arise. Third, following from the second point, I am seeing a few weird things happening in parameter space. Overall, my first suspicion here is that the sampler has trouble escaping a local maximum, which puts too much mass on one of the responses. One thing to test is to simply simulate some data corresponding to the group-mean parameters you estimated. Best,
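(For concreteness, this is roughly what that check could look like; a sketch assuming the hssm.simulate_data helper, with placeholder parameter values:)

```python
import hssm

# Sketch: simulate data from a single set of group-level parameter values and
# check the implied proportion of upper-boundary ("positive") responses.
# Parameter order follows the angle model's list_params: v, a, z, t, theta.
# The values below are placeholders -- substitute the posterior group means.
sim = hssm.simulate_data(model="angle", theta=[1.0, 2.0, 0.5, 0.1, 0.3], size=5000)
print((sim["response"] == 1).mean())
```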
-
I suggest you try using more informative priors, especially on *a* (also, are you using prior_settings = "safe"?). This shouldn't be a uniform but e.g.
a_Intercept ~ Gamma(mu: 0.5, sigma: 1.75, initval: 1.0)
If the value of *a* is higher than that allowable by the LAN, then the likelihood will be nonsensical -- and usually you would not need an *a* anywhere near that high, so it is likely the sampler is just getting stuck in a region that is returning a high likelihood that isn't accurate.
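For reference, a sketch of how that prior could be expressed in the dictionary-style include spec used later in this thread (the format and formula are copied from that code; PyMC's Gamma accepts the mu/sigma parameterization):

```python
a_spec = {
    "name": "a",
    "prior": {
        # More informative Gamma prior on the intercept, as suggested above
        "Intercept": {"name": "Gamma", "mu": 0.5, "sigma": 1.75, "initval": 1.0},
    },
    "formula": "a ~ 1 + C(condition) + (1|sub_no)",
    "link": "identity",
}
# ... then pass it via hssm.HSSM(..., include=[a_spec, ...])
```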
…On Wed, Feb 26, 2025 at 8:57 AM Yuyang Ding ***@***.***> wrote:
Hi @AlexanderFengler <https://github.com/AlexanderFengler>,
Thanks for the reply.
Here's the code I used to estimate the model (not exactly the same, since
I do not have access to my desktop at the moment, but this code also
produced 100% positive responses):
import warnings
warnings.filterwarnings("ignore")
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from google.colab import files, drive
import io
random_seed_sim = 7
np.random.seed(random_seed_sim)
import arviz as az # Visualization
import bambi as bmb # Model construction
import hddm_wfpt
import jax
import pytensor # Graph-based tensor library
import hssm
pytensor.config.floatX = "float32"
jax.config.update("jax_enable_x64", False)
# df is the raw data frame loaded earlier (the data-loading step is omitted here)
df = df[["sub_no", "rt", "cor", "condition", "trend"]] # select the necessary columns
df['cor'] = df['cor'].fillna(-1) # treat non-response as incorrect answers
df_cleaned = df
df_cleaned.rename(columns={"cor": "response"}, inplace=True) # rename the column to fit the HSSM package
df_cleaned.loc[df_cleaned['response'] == 0, 'response'] = -1 # HSSM requires responses to be coded as 1 vs -1
df_cleaned.loc[df_cleaned['condition'] == "trading", 'condition'] = 1
df_cleaned.loc[df_cleaned['condition'] == "perceptual", 'condition'] = -1
df_cleaned.loc[df_cleaned['trend'] == 0.6, 'direction'] = 1
df_cleaned.loc[df_cleaned['trend'] == 0.4, 'direction'] = -1
df_cleaned = df_cleaned[["sub_no", "rt", "response", "condition", "direction"]]
df_cleaned['sub_no'] = df_cleaned['sub_no'].astype('int64')
df_cleaned['condition'] = df_cleaned['condition'].astype('int64')
model_config = hssm.ModelConfig(
response=["rt", "response"],
list_params=["v", "a", "z", "t", "theta"],
bounds={
"v": (-5.0, 5.0),
"a": (1.0, 10),
"t": (0.001, 5),
"theta": (-0.1, 1.3)
},
)
data = df_cleaned
mod1 = hssm.HSSM(model = "angle", z = 0.5, t = 0.1, data = data,
include=[
{
"name": "v",
"prior": {
"Intercept": {"name": "Uniform", "lower": -3.0, "upper": 3.0, "initval": 0.5},
"condition": {"name": "Uniform", "lower": -1.0, "upper": 1.0},
"direction": {"name": "Uniform", "lower": -1.0, "upper": 1.0},
"1|sub_no": {"name": "Normal", "mu": 0, "sigma":{"name": "HalfNormal", "sigma": 0.001, "initval": 0.1}}
},
"formula": "v ~ 1 + C(condition) * C(direction) + (1|sub_no)",
"link": "identity",
},
{
"name": "a",
"prior": {
"Intercept": {"name": "Uniform", "lower": 1.0, "upper": 5.0, "initval": 2.5},
"condition": {"name": "Uniform", "lower": -1.0, "upper": 1.0, "initval": 0.5},
"1|sub_no": {"name": "Normal", "mu": 0, "sigma":{"name": "HalfNormal", "sigma": 0.001, "initval": 0.1}}
},
"formula": "a ~ 1 + C(condition) + (1|sub_no)",
"link": "identity",
},
# {
# "name": "t",
# "prior": {
# "Intercept": {"name": "HalfNormal", "sigma": 1.0},
# # "condition": {"name": "Uniform", "lower": -1.0, "upper": 1.0},
# # "1|sub_no": {"name": "Normal", "mu": 0, "sigma":{"name": "HalfNormal", "sigma": 0.5, "initval": 0.1}}
# },
# "formula": "t ~ 1 + (1|sub_no)",
# "link": "identity",
# },
],
loglik_kind = "approx_differentiable",
model_config=model_config)
infer_data_mod1 = mod1.sample(
sampler="nuts_numpyro", # type of sampler to choose, 'nuts_numpyro',
# 'nuts_blackjax' of default pymc nuts sampler
cores=2, # how many cores to use
chains=2, # how many chains to run
draws=2000, # number of draws from the markov chain
tune=2000, # number of burn-in samples
target_accept=0.95, # increasing target acceptance rate for smaller step sizes
idata_kwargs=dict(log_likelihood=True), # return log likelihood
mp_ctx="spawn",
)
az.plot_trace(
infer_data_mod1, # we exclude the log_likelihood traces here
)
plt.tight_layout()
hssm.HSSM.plot_posterior_predictive(mod1)
Then, the model again gives me the same posterior prediction as shown
above (i.e., 100% positive responses).
As per your suggestion, I tried to do some data simulation using the group
means (roughly, v = 1.1, a = 4.5, z = 0.5, t = 0.1, theta = 0.35) I got
from the model, and you're right that this set of parameters only yields
positive responses.
And yes, it seems like a way-too-high *a* is absolutely problematic - I
tried playing around with the parameters a bit, and the model does yield a few
negative responses (approx. 3%-5%) with everything kept the same but *a*
lowered to 2. I also notice that the distribution of responses seems quite
sensitive to the drift rate: by lowering v to 0.5 and keeping a = 4.5,
z = 0.5, t = 0.1, theta = 0.35, there are about 83% positive responses in
the simulated data, which is quite close to the real data.
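For reference, that comparison could be scripted roughly as follows (a sketch assuming the hssm.simulate_data helper; exact proportions will vary with the random seed):

```python
# Parameter order: v, a, z, t, theta (the angle model's list_params)
settings = {
    "group means (v=1.1, a=4.5)": [1.1, 4.5, 0.5, 0.1, 0.35],
    "lower a (a=2.0)": [1.1, 2.0, 0.5, 0.1, 0.35],
    "lower v (v=0.5)": [0.5, 4.5, 0.5, 0.1, 0.35],
}
for label, theta in settings.items():
    sim = hssm.simulate_data(model="angle", theta=theta, size=5000)
    print(label, "-> proportion positive:", round((sim["response"] == 1).mean(), 3))
```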
I am guessing it's because there are quite a few slow responses in my
dataset; the model thereby has to increase both the drift rate and the
boundary separation to fit the RT distribution, which in turn leads to an
unreasonable distribution of responses...
Maybe I shall try to set more appropriate bounds for the parameters, and I'll let you know if I figure out how to fix this.
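For example, a tighter configuration might look like the sketch below (the specific bound values are assumptions meant to keep *a* well below the region that caused trouble; they should be checked against the defaults HSSM ships for the angle model):

```python
model_config = hssm.ModelConfig(
    response=["rt", "response"],
    list_params=["v", "a", "z", "t", "theta"],
    bounds={
        "v": (-3.0, 3.0),
        "a": (0.3, 3.0),  # assumption: cap a well below the ~4.5 estimate
        "t": (0.001, 2.0),
        "theta": (-0.1, 1.3),
    },
)
```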
Anyways, thanks a lot for the message and for recommending this very
interesting paper. That really helps!
Cheers,
Yuyang
-
Greetings to the HSSM community,
Thank you for this amazing library!
I have recently been estimating a hierarchical DDM with real data collected from an online experiment (120 participants * 160 trials). Since there is a maximum duration in my experiment, I believe a DDM with collapsing bounds might make more sense.
I made the following hierarchical model:
Hierarchical Sequential Sampling Model
Model: angle
Response variable: rt,response
Likelihood: approx_differentiable
Observations: 19019
Parameters:
v:
Formula: v ~ 1 + C(condition) + (1|sub_no)
Priors:
v_Intercept ~ Uniform(lower: 0.5, upper: 1.5, initval: 1.0)
v_C(condition) ~ Normal(mu: 0.0, sigma: 0.25)
v_1|sub_no ~ Normal(mu: 0.0, sigma: HalfNormal(sigma: 1.0), initval: 0.0)
Link: identity
Explicit bounds: (-5.0, 5.0)
a:
Formula: a ~ 1 + C(condition) + (1|sub_no)
Priors:
a_Intercept ~ Gamma(alpha: 1.5, beta: 0.75, initval: 2.5)
a_C(condition) ~ Normal(mu: 0.0, sigma: 0.25)
a_1|sub_no ~ Normal(mu: 0.0, sigma: HalfNormal(sigma: 0.0010000000474974513), initval: 0.0)
Link: identity
Explicit bounds: (0.5, 10.0)
z:
Prior: 0.5
Explicit bounds: (0.1, 0.9)
t:
Prior: 0.1
Explicit bounds: (0.001, 5)
theta:
Prior: Uniform(lower: -0.10000000149011612, upper: 1.5)
Explicit bounds: (-0.1, 1.5)
Lapse probability: 0.05
Lapse distribution: Uniform(lower: 0.0, upper: 20.0)
where the initial bias was fixed to the mid-point of the boundary (the responses were coded as correct vs. incorrect), and the non-decision time was fixed to 100 ms because I kept getting convergence issues similar to the one in discussion #570 even after tuning the priors; fixing the non-decision time seems to be the only way that allows the model to converge for now.
condition is a between-subject factor and direction is within-subject: there were 60 participants in each condition, and each participant performed 80 trials in each direction; in other words, the sample was balanced.
After running 4 chains (1000 burn-in + 1000 draws), the model seems to converge okay, with all R-hats ranging from 1.00 to 1.01, except that a few participants showed slightly larger R-hats (1.02-1.04, so I guess not a very big issue) on their random intercepts for the drift rate.
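(For reference, one way to inspect those R-hat values programmatically, assuming ArviZ and the InferenceData object returned by sample():)

```python
import arviz as az

# Posterior summary table; the r_hat column flags convergence problems
summary = az.summary(idata)  # idata: the InferenceData returned by sample()
print(summary.sort_values("r_hat", ascending=False).head(10))
```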

However, when I tried to do a posterior predictive check using the plot_posterior_predictive() method, the parameters sampled from the posteriors only yielded correct responses, which is definitely not the case in the real data (the mean accuracy in the real data is approx. 85%).

I also tried to estimate a baseline model where the effects of condition and direction were not included, and that model gave a decent posterior predictive check.
I also tried to estimate vanilla DDM models with the same dataset and got similar behaviour (i.e., the baseline model recovered the real data, while the model including the treatment effects consistently yielded correct responses).
Does anyone have any idea of the possible reasons for this?
Many thanks!
Yuyang