Basic event model


Definition of EventModel class.

The EventModel class is the base model for estimating hidden multivariate pattern models.

class hmp.models.event.EventModel(*args, n_events, fixed_time_pars=None, fixed_channel_pars=None, tolerance=0.0001, max_iteration=1000.0, min_iteration=1, starting_points=1, max_scale=None, **kwargs)#

Bases: BaseModel

A model for estimating HMP events.

Parameters:
  • n_events (int) – The number of HMP events to estimate.

  • fixed_time_pars (list, optional) – List of time parameters to fix during estimation. If None, all time parameters are estimated.

  • fixed_channel_pars (list, optional) – List of channel parameters to fix during estimation. If None, all channel parameters are estimated.

  • tolerance (float, optional) – Convergence tolerance for the expectation maximization algorithm. Default is 1e-4.

  • max_iteration (int, optional) – Maximum number of iterations for the expectation maximization algorithm. Default is 1e3.

  • min_iteration (int, optional) – Minimum number of iterations for the expectation maximization algorithm. Default is 1.

  • starting_points (int, optional) – Number of random starting points to use for initialization. Default is 1.

  • max_scale (float, optional) – Maximum mean distance between events, used when generating random starting points. Default is None.
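
The interplay of tolerance, max_iteration and min_iteration can be illustrated with a minimal sketch. It is not the library's implementation: the callable `update`, the helper names, and the relative-gain stopping rule are assumptions made for illustration only.

```python
import numpy as np


def run_until_converged(update, lkh_start, tolerance=1e-4,
                        max_iteration=1000, min_iteration=1):
    """Call `update` until the relative log-likelihood gain is below `tolerance`.

    `update` is a hypothetical callable that performs one iteration and
    returns the new log-likelihood. The relative-gain rule is an assumption.
    """
    trace = [lkh_start]
    iteration = 0
    while iteration < max_iteration:
        lkh = update(iteration)
        iteration += 1
        gain = (lkh - trace[-1]) / abs(trace[-1])
        trace.append(lkh)
        # Stop only once the minimum number of iterations has been run.
        if iteration >= min_iteration and gain < tolerance:
            break
    return np.array(trace)


def toy_update(iteration):
    """Toy log-likelihood that rises geometrically towards -100."""
    return -100.0 - 50.0 * 0.5 ** (iteration + 1)


trace = run_until_converged(toy_update, lkh_start=-150.0)
```

In this toy run the loop stops well before max_iteration because the gain per iteration shrinks geometrically.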

EM(trial_data, initial_channel_pars, initial_time_pars, fixed_channel_pars=None, fixed_time_pars=None, max_iteration=1000, tolerance=0.0001, min_iteration=1, channel_map=None, time_map=None, groups=None, cpus=1, n_cor=30)#

Fit using expectation maximization.

Parameters:
  • trial_data (TrialData) – The trial data to fit the model to.

  • initial_channel_pars (np.ndarray) – 2D ndarray (n_events * n_channels) or 3D (iteration * n_events * n_channels), initial conditions for event channel contributions.

  • initial_time_pars (np.ndarray) – 2D ndarray (n_stages * n_parameters) or 3D (iteration * n_stages * n_parameters), initial conditions for time distribution parameters.

  • fixed_channel_pars (list[int], optional) – Indices of channel parameters to fix during estimation.

  • fixed_time_pars (list[int], optional) – Indices of time parameters to fix during estimation.

  • max_iteration (int, optional) – Maximum number of iterations for the expectation maximization algorithm. Default is 1000.

  • tolerance (float, optional) – Convergence tolerance for the expectation maximization algorithm. Default is 1e-4.

  • min_iteration (int, optional) – Minimum number of iterations for the expectation maximization algorithm. Default is 1.

  • channel_map (np.ndarray, optional) – 2D array mapping channel parameters to groups. Default is None.

  • time_map (np.ndarray, optional) – 2D array mapping time parameters to groups. Default is None.

  • groups (np.ndarray, optional) – Array indicating the groups for grouping modeling. Default is None.

  • cpus (int, optional) – Number of cores to use in multiprocessing functions. Default is 1.

  • n_cor (int, optional) – If the log-likelihood becomes invalid after the M step, the parameter update vector is halved up to n_cor times to try to recover a valid update. If no valid update is found, the parameter estimates from the previous iteration are used. Default is 30.

Return type:

tuple[float, ndarray, ndarray, ndarray, ndarray]

Returns:

  • lkh (float) – Summed log probabilities.

  • channel_pars (np.ndarray) – Estimated channel contributions for each event.

  • time_pars (np.ndarray) – Estimated time distribution parameters for each stage.

  • traces (np.ndarray) – Log-likelihood values for each EM iteration.

  • time_pars_dev (np.ndarray) – Time parameters for each iteration of the EM algorithm.
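
The n_cor fallback described in the parameters above can be sketched as follows. This is a hedged illustration, not the code used by EM(): the function `halve_until_valid`, the toy log-likelihood, and the use of a finite value as the validity test are all assumptions.

```python
import numpy as np


def halve_until_valid(previous, proposed, loglik, n_cor=30):
    """Shrink the update step by halving until `loglik` is finite.

    Falls back to `previous` when no valid candidate is found after
    `n_cor` halvings. All names here are hypothetical.
    """
    step = proposed - previous
    candidate = previous + step
    for _ in range(n_cor):
        if np.isfinite(loglik(candidate)):
            return candidate
        step = step / 2.0  # halve the update vector and try again
        candidate = previous + step
    return candidate if np.isfinite(loglik(candidate)) else previous


def toy_loglik(params):
    """Toy log-likelihood that is only valid for strictly positive parameters."""
    return np.sum(np.log(params)) if np.all(params > 0) else -np.inf


previous = np.array([1.0, 1.0])
proposed = np.array([2.0, -2.0])  # invalid: second parameter is negative
recovered = halve_until_valid(previous, proposed, toy_loglik)
fallback = halve_until_valid(previous, proposed, toy_loglik, n_cor=1)
```

Here two halvings are enough to recover a valid update, while n_cor=1 exhausts its budget and returns the previous estimates.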

distribution_pdf(shape, scale, max_duration)#

Return a discretized probability density function (PDF) for a provided scipy distribution.

This method computes the PDF using the given shape and scale parameters over a range from 0 to max_duration, and normalizes it to ensure the probabilities sum to 1.

Parameters:
  • shape (float) – The shape parameter of the distribution.

  • scale (float) – The scale parameter of the distribution.

  • max_duration (int) – The maximum duration (range) for which the PDF is computed.

Returns:

A 1D array representing the probability mass function for the distribution with the given shape and scale parameters, normalized to sum to 1.

Return type:

np.ndarray
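
As an illustration of the discretization and normalization described above, here is a minimal sketch for a gamma distribution. The gamma choice, the evaluation at integer samples 0 to max_duration - 1, and the helper name are assumptions; the method itself works with whichever scipy distribution the model was given.

```python
import math

import numpy as np


def discretized_gamma_pdf(shape, scale, max_duration):
    """Evaluate a gamma density at integer samples and normalize it to sum to 1."""
    samples = np.arange(max_duration)
    density = (samples ** (shape - 1) * np.exp(-samples / scale)
               / (math.gamma(shape) * scale ** shape))
    return density / density.sum()


pmf = discretized_gamma_pdf(shape=2.0, scale=10.0, max_duration=200)
```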

estim_probs(trial_data, channel_pars, time_pars, location=True, subset_epochs=None)#

Estimate probabilities for events and compute the log-likelihood.

Parameters:
  • trial_data (TrialData) – The trial data containing cross-correlation and event information.

  • channel_pars (np.ndarray) – A 2D array of shape (n_events, n_channels) or a 3D array of shape (iteration, n_events, n_channels) containing initial conditions for channel contributions to events.

  • time_pars (np.ndarray) – A 2D array of shape (n_stages, n_parameters) or a 3D array of shape (iteration, n_stages, n_parameters) containing initial conditions for the distribution parameters.

  • location (bool, optional) – Whether to add a minimum distance between events to avoid event collapse during the expectation-maximization algorithm. Default is True.

  • subset_epochs (list[int] or None, optional) – A list of trial indices to consider for the computation. If None, all trials are used. Default is None.

Return type:

tuple[float, ndarray]

Returns:

  • loglikelihood (float) – The summed log probabilities.

  • eventprobs (np.ndarray) – A 3D array of shape (n_trials, max_samples, n_events) containing the probabilities for each event.

fit(trial_data, channel_pars=None, time_pars=None, fixed_time_pars=None, fixed_channel_pars=None, verbose=True, cpus=1, channel_map=None, time_map=None, grouping_dict=None)#

Fit HMP for a single n_events model.

Parameters:
  • trial_data (TrialData) – The trial data to fit the model to.

  • channel_pars (ndarray, optional) – 3D ndarray (n_groups * n_events * n_channels) or 4D (starting_points * n_groups * n_events * n_channels), initial conditions for event channel contributions. Default is None.

  • time_pars (ndarray, optional) – 3D ndarray (n_groups * n_stages * 2) or 4D (starting_points * n_groups * n_stages * 2), initial conditions for time distribution parameters. Default is None.

  • fixed_time_pars (list, optional) – Indices of time parameters to fix during estimation. Default is None.

  • fixed_channel_pars (list, optional) – Indices of channel parameters to fix during estimation. Default is None.

  • tolerance (float, optional) – Convergence tolerance for the expectation maximization algorithm. Default is 1e-4.

  • max_iteration (int, optional) – Maximum number of iterations for the expectation maximization algorithm. Default is 1e3.

  • min_iteration (int, optional) – Minimum number of iterations for the expectation maximization algorithm. Default is 1.

  • verbose (bool, optional) – If True, displays output useful for debugging. Default is True.

  • cpus (int, optional) – Number of cores to use in multiprocessing functions. Default is 1.

  • channel_map (ndarray, optional) – 2D ndarray (n_groups * n_events) indicating which channel contributions are shared between groups. Default is None.

  • time_map (ndarray, optional) – 2D ndarray (n_groups * n_stages) indicating which time parameters are shared between groups. Default is None.

  • grouping_dict (dict, optional) – Dictionary defining groups for grouping modeling. Keys are group names, and values are lists of groups. Default is None.

Return type:

None
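
The shapes expected for channel_map and time_map can be sketched with plain NumPy. The encoding used here, where equal integers within a column mean that the parameter is shared between groups, and the relation n_stages = n_events + 1 are assumptions for illustration; check them against the library before relying on them.

```python
import numpy as np

n_groups, n_events = 2, 3
n_stages = n_events + 1  # assumption: one more stage than events

# Assumed encoding: identical values in a column mean "shared between groups".
channel_map = np.zeros((n_groups, n_events), dtype=int)
time_map = np.zeros((n_groups, n_stages), dtype=int)
time_map[1, 2] = 1  # let the third stage differ in the second group

# Stages whose time parameters are shared by all groups under this encoding.
shared_stages = (time_map == time_map[0]).all(axis=0)
```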

gen_random_stages(n_events)#

Compute random stage durations.

Generates random stage durations between 0 and the mean reaction time (RT) by iteratively drawing samples from a uniform distribution. The last stage duration is computed as 1 minus the cumulative duration of previous stages. The stages are then scaled to the mean RT.

Parameters:

n_events (int) – The number of events to generate random durations for.

Returns:

A 2D array where each row contains the shape and scale parameters for a stage.

Return type:

np.ndarray
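
The procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the method's actual code: the helper name, the fixed shape of 2, the relation scale = mean duration / shape, and n_stages = n_events + 1 are all hypothetical choices.

```python
import numpy as np


def random_stage_parameters(n_events, mean_rt, shape=2.0, seed=None):
    """Draw random stage proportions that sum to 1 and scale them to the mean RT."""
    rng = np.random.default_rng(seed)
    n_stages = n_events + 1  # assumption: one more stage than events
    proportions = np.zeros(n_stages)
    remaining = 1.0
    for stage in range(n_events):
        # Each draw takes a uniform share of what is left of the unit interval.
        proportions[stage] = rng.uniform(0.0, remaining)
        remaining -= proportions[stage]
    proportions[-1] = remaining  # last stage: 1 minus the cumulative duration
    durations = proportions * mean_rt
    # One row per stage: (shape, scale), with mean duration = shape * scale.
    return np.column_stack([np.full(n_stages, shape), durations / shape])


stage_pars = random_stage_parameters(n_events=3, mean_rt=120.0, seed=0)
```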

get_channel_time_parameters_expectation(trial_data, eventprobs, subset_epochs=None)#

Compute the channel and time parameters using the expectation step.

Parameters:
  • trial_data (TrialData) – The trial data containing cross-correlation and event information.

  • eventprobs (np.ndarray) – A 3D array of shape (n_trials, max_duration, n_events) containing the event probabilities.

  • subset_epochs (list[int], optional) – A list of trial indices to consider for the computation. If None, all trials are used.

Return type:

tuple[ndarray, ndarray]

Returns:

  • channel_pars (np.ndarray) – A 2D array of shape (n_events, n_dims) with the estimated channel parameters.

  • time_pars (np.ndarray) – A 2D array of shape (n_stages, 2) with the estimated time parameters (shape and scale).
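
One ingredient of this step, the expected position of each event given eventprobs, can be sketched as a probability-weighted average of sample indices. The helper name and the averaging over trials are assumptions; the method may compute this differently.

```python
import numpy as np


def average_event_positions(eventprobs):
    """Expected sample index of each event, averaged over trials.

    `eventprobs` has shape (n_trials, max_samples, n_events).
    """
    samples = np.arange(eventprobs.shape[1])
    return samples @ eventprobs.mean(axis=0)


# Toy data: 2 trials, 5 samples, 2 events, with all probability on one sample.
eventprobs = np.zeros((2, 5, 2))
eventprobs[0, 1, 0] = 1.0  # trial 0, event 0 at sample 1
eventprobs[0, 3, 1] = 1.0  # trial 0, event 1 at sample 3
eventprobs[1, 2, 0] = 1.0  # trial 1, event 0 at sample 2
eventprobs[1, 4, 1] = 1.0  # trial 1, event 1 at sample 4
averagepos = average_event_positions(eventprobs)
```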

group_constructor(trial_data, grouping_dict, channel_map=None, time_map=None, verbose=False)#

Adapt the model to groups by constructing group mappings and validating provided maps.

Parameters:
  • trial_data (TrialData) – The trial data containing trial-group information.

  • grouping_dict (dict) – A dictionary defining groups for grouping modeling. Keys are group names, and values are lists of groups.

  • channel_map (np.ndarray, optional) – A 2D array mapping channel parameters to groups. Default is None.

  • time_map (np.ndarray, optional) – A 2D array mapping time parameters to groups. Default is None.

  • verbose (bool, optional) – If True, prints detailed information about the group construction process. Default is False.

Return type:

tuple[int, ndarray, dict]

Returns:

  • n_groups (int) – The number of unique groups.

  • groups (np.ndarray) – An array indicating the group assignment for each trial.

  • glabels (dict) – A dictionary containing group names and their corresponding modalities.
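
The three return values can be illustrated with a minimal NumPy sketch that derives group indices from per-trial condition labels. The variable `conditions` and the exact layout of `glabels` are assumptions; the method itself reads this information from trial_data and grouping_dict.

```python
import numpy as np

# Hypothetical per-trial condition labels.
conditions = np.array(["speed", "accuracy", "speed", "accuracy", "speed"])

# Unique labels (sorted) and, for each trial, the index of its label.
labels, groups = np.unique(conditions, return_inverse=True)
n_groups = len(labels)
glabels = {"condition": labels.tolist()}
```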

scale_parameters(averagepos)#

Scale parameters from the average position of events.

This method is used during the re-estimation step in the EM procedure. It computes the likeliest location of events from eventprobs and calculates the scale parameters as the average distance between consecutive events.

Parameters:

averagepos (np.ndarray) – A 1D array containing the average positions of events.

Returns:

A 2D array where each row contains the shape and scale parameters for the corresponding event distribution.

Return type:

np.ndarray
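
The computation described above can be sketched as follows, assuming a fixed shape parameter and a distribution whose mean equals shape * scale (as for a gamma distribution). The helper name and the default shape of 2 are assumptions.

```python
import numpy as np


def scale_from_positions(averagepos, shape=2.0):
    """Turn average event positions into (shape, scale) rows.

    The mean duration of each stage is the distance between consecutive
    events, with the first event measured from sample 0.
    """
    mean_durations = np.diff(averagepos, prepend=0)
    scales = mean_durations / shape  # assumption: mean = shape * scale
    return np.column_stack([np.full(len(averagepos), shape), scales])


pars = scale_from_positions(np.array([10.0, 30.0, 60.0]))
```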

transform(trial_data)#

Transform the trial data using the fitted model.

Parameters:

trial_data (TrialData) – The trial data to transform.

Return type:

tuple[ndarray, DataArray]

Returns:

  • likelihoods (list) – List of log-likelihoods for each submodel (number of events).

  • xr_eventprobs (xr.DataArray) – Concatenated event probability arrays for all submodels, indexed by number of events.

property xrchannel_pars#

Returns the channel parameters as an xarray DataArray.

Returns:

An xarray DataArray with dimensions (“group”, “event”, “channel”) containing the channel parameters.

Return type:

xr.DataArray

property xrlikelihoods#

Returns the log-likelihoods as an xarray DataArray.

Returns:

An xarray DataArray containing the log-likelihood values.

Return type:

xr.DataArray

property xrtime_pars#

Returns the time parameters as an xarray DataArray.

Returns:

An xarray DataArray with dimensions (“group”, “stage”, “parameter”) containing the time parameters.

Return type:

xr.DataArray

property xrtime_pars_dev#

Returns the time parameters for each EM iteration as an xarray DataArray.

Returns:

An xarray DataArray with dimensions (“em_iteration”, “group”, “stage”, “time_pars”) containing the time parameters at each EM iteration.

Return type:

xr.DataArray

property xrtraces#

Returns the traces of the log-likelihood for each EM iteration as an xarray DataArray.

Returns:

An xarray DataArray with dimensions (“em_iteration”, “group”) containing the log-likelihood traces.

Return type:

xr.DataArray

Eliminative estimation method#

Estimate models for every possible number of events, starting from a base model or from the maximum possible number of events.

class hmp.models.eliminative.EliminativeMethod(*args, max_events=None, min_events=0, base_fit=None, tolerance=0.0001, max_iteration=1000, **kwargs)#

Bases: BaseModel

Initialize the EliminativeMethod.

Parameters:
  • max_events (int, optional) – Maximum number of events to be estimated. By default, it is inferred using compute_max_events() if not provided.

  • min_events (int, optional) – The minimum number of events to be estimated. Defaults to 0.

  • base_fit (EventModel, optional) – A fitted EventModel from which to start the elimination. Defaults to None.

  • tolerance (float, optional) – Tolerance for the expectation maximization algorithm. Defaults to 1e-4.

  • max_iteration (int, optional) – Maximum number of iterations for the expectation maximization algorithm. Defaults to 1000.

fit(trial_data, cpus=1)#

Perform the eliminative estimation.

First, read or estimate the max_events solution. Then estimate the max_events - 1 solution by removing each event in turn and keeping the candidate with the highest log-likelihood. The procedure is repeated down to min_events.

Parameters:
  • trial_data (TrialData) – The dataset containing the cross-correlated data and information on durations and trials.

  • cpus (int, optional) – Number of CPUs to use for parallel processing. Defaults to 1.

Return type:

None
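
The elimination loop can be sketched with a toy scoring function. This is only an illustration of the backward search: the real method re-estimates each candidate with expectation maximization, whereas `toy_loglik`, `eliminate_events` and the integer event positions below are hypothetical.

```python
def eliminate_events(events, loglik, min_events=1):
    """Repeatedly drop the event whose removal gives the highest score."""
    current = list(events)
    solutions = {len(current): (current, loglik(current))}
    while len(current) > min_events:
        # One candidate per event, each with that event removed.
        candidates = [current[:i] + current[i + 1:] for i in range(len(current))]
        current = max(candidates, key=loglik)
        solutions[len(current)] = (current, loglik(current))
    return solutions


TRUE_EVENTS = (20, 50)  # toy ground truth, for illustration only


def toy_loglik(selected):
    """Toy score: negative distance from each true event to its nearest selected one."""
    return -sum(min(abs(t - e) for e in selected) for t in TRUE_EVENTS)


solutions = eliminate_events([10, 20, 35, 50], toy_loglik)
```

The returned dictionary holds one solution per number of events, which mirrors how the method keeps a submodel for each n_events.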

get_event_model(n_events, starting_points)#

transform(trial_data)#

Apply all fitted submodels to the provided trial data.

Parameters:

trial_data (TrialData) – The dataset containing the cross-correlated data and information on durations and trials.

Returns:

  • likelihoods (list) – List of log-likelihoods for each submodel (number of events).

  • xr_eventprobs (xarray.DataArray) – Concatenated event probability arrays for all submodels, indexed by number of events.

Cumulative estimation method#

Method for estimating cumulative event models.

class hmp.models.cumulative.CumulativeMethod(*args, step=None, end=None, sequential=True, fastforward=True, tolerance=0.0001, base_fit=None, max_n_events=None, **kwargs)#

Bases: BaseModel

Initialize the CumulativeMethod.

This method initializes the model and sets up parameters for fitting a cumulative event model. The fitting process starts with a 1-event model and iteratively adds events based on the convergence of the expectation maximization algorithm.

Parameters:
  • args (tuple) – Extra arguments to be passed to the BaseModel, including at least events and distribution objects.

  • step (float, optional) – The size of the step from 0 to the mean RT. Defaults to the location defined in the pattern. Small values ensure a complete exploration of the parameter space but can be slow. Larger values speed up the estimation but risk missing events in unexplored regions.

  • end (int, optional) – The maximum number of samples to explore within each trial. Defaults to None.

  • sequential (bool) – If True (default), iteratively test all samples from 0 to end and retain the times at which the likelihood increased, regardless of whether a subsequent event was found. If False, test new starting points only from the last detected event onward.

  • fastforward (bool, optional) – If True, when a proposal is rejected, restart from the furthest time point explored by the previous proposal. Defaults to True.

  • tolerance (float, optional) – The tolerance used for convergence in the EM() function for the cumulative step. Defaults to 1e-4.

  • base_fit (EventModel) – A fitted EventModel from which to start adding events. Defaults to None.

  • max_n_events (int) – Maximum number of events to be estimated. If None (default), the minimum RT is used to estimate the maximum possible number of events.

  • kwargs (dict) – Additional keyword arguments to be passed to the BaseModel.

fit(trial_data, verbose=True, cpus=1)#

Fit the model starting with a 1-event model and iteratively add events.

This method fits the cumulative event model to the provided trial data. It begins with a single-event model and incrementally adds events based on the convergence of the expectation maximization algorithm. The process continues until the maximum number of events (given the minimum duration) is reached or the likelihood no longer improves.

Parameters:
  • trial_data (TrialData) – The trial data to fit the model on.

  • verbose (bool, optional) – If True, provides detailed output about the fitting process. Defaults to True.

  • cpus (int, optional) – The number of CPU cores to use for computation. Defaults to 1.

Return type:

None
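
The add-while-improving logic can be sketched as a greedy forward selection over candidate positions. This is a swapped-in toy technique rather than the method's procedure: the real fit proposes events along the time axis and re-estimates them with expectation maximization, and the per-event penalty in `toy_loglik` is a hypothetical device so that superfluous events lower the score.

```python
def add_events(candidates, loglik, max_n_events):
    """Add one event at a time for as long as the score keeps improving."""
    selected, best = [], float("-inf")
    while len(selected) < max_n_events:
        proposals = [sorted(selected + [c]) for c in candidates if c not in selected]
        if not proposals:
            break
        proposal = max(proposals, key=loglik)
        score = loglik(proposal)
        if score <= best:  # no improvement: keep the previous solution
            break
        selected, best = proposal, score
    return selected, best


TRUE_EVENTS = (20, 50)  # toy ground truth, for illustration only


def toy_loglik(selected):
    """Toy score: distance to the true events plus a penalty of 5 per event."""
    misfit = sum(min(abs(t - e) for e in selected) for t in TRUE_EVENTS)
    return -misfit - 5 * len(selected)


selected, best = add_events(list(range(0, 100, 10)), toy_loglik, max_n_events=5)
```

In this toy run the search stops at two events because a third one no longer improves the score, even though max_n_events would allow more.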

transform(*args, **kwargs)#

Transform the input data using the last model fitted in the cumulative method.

This method applies the transformation defined by the final model to the provided data.

Return type:

Transformed data as returned by the final model’s transform method.