Utilities

Utility modules for results handling, core numerical helpers, seed management, logging, plotting, and parallel computing.

Results Handling

class BOBE.utils.results.BOBEResults(param_names, param_labels, param_bounds, output_file='results', save_dir='./', settings=None, likelihood_name='unknown', resume_from_existing=False)[source]

Bases: object

Comprehensive results management for BOBE runs.

This class handles storing, organizing, and outputting results in formats compatible with standard nested sampling analysis tools.

__init__(param_names, param_labels, param_bounds, output_file='results', save_dir='./', settings=None, likelihood_name='unknown', resume_from_existing=False)[source]

Initialize the results manager.

Parameters:
  • output_file (str) – Base name for output files

  • param_names (List[str]) – List of parameter names

  • param_labels (List[str]) – List of parameter LaTeX labels

  • param_bounds (ndarray) – Parameter bounds array [n_params, 2]

  • settings (Optional[Dict[str, Any]]) – Dictionary of BOBE settings

  • likelihood_name (str) – Name of the likelihood function

  • resume_from_existing (bool) – If True, try to load existing results and continue from there

update_acquisition(iteration, acquisition_value, acquisition_function)[source]

Track acquisition function values throughout iterations.

Parameters:
  • iteration (int) – Current iteration number

  • acquisition_value (float) – Value of the acquisition function at the selected point

  • acquisition_function (str) – String name of the acquisition function used

update_gp_hyperparams(iteration, lengthscales, kernel_variance)[source]

Track the evolution of the GP hyperparameters.

Parameters:
  • iteration (int) – Current iteration number

  • lengthscales (list) – List of lengthscale values (can be JAX arrays)

  • kernel_variance (float) – Kernel variance value

update_best_loglike(iteration, best_loglike)[source]

Track the evolution of the best log-likelihood.

Parameters:
  • iteration (int) – Current iteration number

  • best_loglike (float) – Current best loglikelihood value

update_convergence(iteration, logz_dict, converged, threshold)[source]

Update convergence information from a nested sampling check.

Parameters:
  • iteration (int) – Current iteration number

  • logz_dict (Dict[str, float]) – Dictionary with logz information

  • converged (bool) – Whether convergence was achieved

  • threshold (float) – Convergence threshold used

update_kl_divergences(iteration, successive_kl=None)[source]

Update KL divergence tracking for convergence analysis.

Parameters:
  • iteration (int) – Current iteration number

  • successive_kl (Optional[Dict[str, float]]) – Optional KL divergence between successive iterations

get_last_iteration()[source]

Get the last iteration number from the results history.

Return type:

int

Returns:

Last iteration number, or 0 if no iterations have been recorded

is_resuming()[source]

Check if this is a resumed run (has existing data).

Return type:

bool

Returns:

True if this appears to be a resumed run

start_timing(phase_name)[source]

Start timing a specific phase.

end_timing(phase_name)[source]

End timing a specific phase and accumulate the time.
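The start/end pair describes an accumulate-per-phase pattern: each call to end_timing adds the elapsed interval to a running total for that phase. A minimal sketch of the pattern, assuming wall-clock timing with time.perf_counter. The PhaseTimer class and its attributes are hypothetical and are not taken from the BOBE source:

```python
import time


class PhaseTimer:
    """Hypothetical helper: accumulates elapsed wall-clock time per named phase."""

    def __init__(self):
        self.totals = {}   # phase name -> accumulated seconds
        self._starts = {}  # phase name -> start timestamp of the running interval

    def start_timing(self, phase_name):
        self._starts[phase_name] = time.perf_counter()

    def end_timing(self, phase_name):
        # An end call without a matching start is ignored.
        start = self._starts.pop(phase_name, None)
        if start is not None:
            elapsed = time.perf_counter() - start
            self.totals[phase_name] = self.totals.get(phase_name, 0.0) + elapsed


timer = PhaseTimer()
for _ in range(2):
    timer.start_timing("gp_fit")
    time.sleep(0.01)  # stand-in for real work
    timer.end_timing("gp_fit")
```

After the loop, timer.totals["gp_fit"] holds the sum of both intervals, which is the kind of per-phase total a timing summary could report.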

get_timing_summary()[source]

Get a summary of timing information.

Return type:

Dict[str, Any]

save_timing_data()[source]

Save timing data to JSON file.

get_gp_data()[source]

Get GP hyperparameter evolution data for plotting.

Return type:

Dict[str, list]

Returns:

Dictionary with ‘iterations’, ‘lengthscales’, and ‘kernel_variances’ keys

get_acquisition_data()[source]

Get acquisition function evolution data for plotting.

Return type:

Dict[str, list]

Returns:

Dictionary with ‘iterations’, ‘values’, and ‘functions’ keys

get_best_loglike_data()[source]

Get best loglikelihood evolution data for plotting.

Return type:

Dict[str, list]

Returns:

Dictionary with ‘iterations’ and ‘best_loglike’ keys

finalize(samples_dict={}, logz_dict=None, converged=False, termination_reason='Max iterations reached', gp_info=None, best_point=None, best_loglike=None, best_iteration=None)[source]

Finalize the results with final samples and metadata.

Parameters:
  • samples_dict (Dict[str, ndarray]) – Dictionary with ‘x’, ‘weights’, ‘logl’ keys for final samples

  • logz_dict (Optional[Dict[str, float]]) – Final evidence information

  • converged (bool) – Whether the run converged

  • termination_reason (str) – Reason for termination

  • gp_info (Optional[Dict[str, Any]]) – Dictionary containing GP and classifier information

  • best_point (Optional[ndarray]) – Best point found (physical parameter space)

  • best_loglike (Optional[float]) – Best log-likelihood value

  • best_iteration (Optional[int]) – Iteration where best point was found

get_results_dict()[source]

Get simplified results dictionary with only essential data.

Return type:

Dict[str, Any]

Returns:

Dictionary containing samples, weights, evidence evolution, and convergence info

save_all_formats()[source]

Save results in multiple formats for compatibility.

save_main_results()[source]

Save main comprehensive results file.

save_chain_files(samples_dict=None, filename=None)[source]

Save chain files in GetDist format using the MCSamples.saveAsText method.

save_minimum_files()[source]

Save best point in GetDist minimum format.

Creates two files:
  • .minimum.txt: Simple table with best point

  • .minimum: Formatted text with parameter details

save_summary_stats()[source]

Save summary statistics in JSON format.

save_intermediate(gp, filename=None)[source]

Save intermediate results for crash recovery and resuming.

get_getdist_samples(samples_dict=None)[source]

Convert results to GetDist MCSamples object.

Return type:

Optional[MCSamples]

Returns:

GetDist MCSamples object if GetDist is available, None otherwise

classmethod load_results(output_file)[source]

Load results from saved files.

Parameters:

output_file (str) – Base name of the output files

Return type:

BOBEResults

Returns:

BOBEResults object with loaded data

Core Utilities

BOBE.utils.core.is_cluster_environment()[source]

Detect if running in a cluster environment by checking common environment variables. Returns True if cluster environment is detected, False otherwise.

BOBE.utils.core.renormalise_log_weights(log_weights)[source]

BOBE.utils.core.resample_equal(samples, aux, weights=None, logwts=None)[source]

BOBE.utils.core.kl_divergence_samples(prev_loglike, curr_loglike)[source]

Compute KL divergence between successive iterations.

BOBE.utils.core.kl_divergence_gaussian(mu1, Cov1, mu2, Cov2)[source]

Computes the forward, reverse, and symmetric KL divergence between two multivariate Gaussian distributions N1=N(mu1, Cov1) and N2=N(mu2, Cov2).
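For reference, the closed form for two d-dimensional Gaussians is KL(N1 || N2) = 0.5 [tr(Cov2^-1 Cov1) + (mu2 - mu1)^T Cov2^-1 (mu2 - mu1) - d + ln(det Cov2 / det Cov1)]. The sketch below restricts this to diagonal covariances so that it needs only the standard library. It assumes "forward" means KL(N1 || N2), "reverse" means KL(N2 || N1), and "symmetric" is their average; these conventions and the function names are illustrative assumptions, not confirmed from the BOBE source:

```python
import math


def kl_gaussian_diag(mu1, var1, mu2, var2):
    """KL(N1 || N2) for Gaussians with diagonal covariances given as variance lists."""
    total = 0.0
    for m1, v1, m2, v2 in zip(mu1, var1, mu2, var2):
        # Per-dimension trace term, Mahalanobis term, -1, and log-determinant ratio.
        total += v1 / v2 + (m2 - m1) ** 2 / v2 - 1.0 + math.log(v2 / v1)
    return 0.5 * total


def kl_summary(mu1, var1, mu2, var2):
    """Forward, reverse, and averaged KL divergence (the averaging is an assumption)."""
    forward = kl_gaussian_diag(mu1, var1, mu2, var2)
    reverse = kl_gaussian_diag(mu2, var2, mu1, var1)
    return {"forward": forward, "reverse": reverse,
            "symmetric": 0.5 * (forward + reverse)}


shifted = kl_summary([0.0], [1.0], [1.0], [1.0])   # unit variances, means one apart
widened = kl_summary([0.0], [1.0], [0.0], [4.0])   # same mean, different variances
```

With equal variances the two directions coincide; once the variances differ, the forward and reverse values separate, which is why reporting all three can be informative.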

BOBE.utils.core.get_threshold_for_nsigma(nsigma, d)[source]

Compute the log-probability difference between the peak of a d-dimensional Gaussian and its nsigma contour (taken from GPry).

Parameters:
  • nsigma (float) – The number of standard deviations to consider.

  • d (int) – The dimensionality of the space.

Returns:

The threshold value.

Return type:

float
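One way to read this threshold: for a d-dimensional Gaussian, twice the drop in log-probability from the peak follows a chi-square distribution with d degrees of freedom, so the threshold is half the chi-square quantile at the probability mass enclosed by nsigma in one dimension. The sketch below covers only d = 1 and d = 2, where the quantile has a closed form. It is an interpretation under that assumption, with a hypothetical function name, and may differ from what BOBE or GPry compute:

```python
import math


def threshold_for_nsigma_sketch(nsigma, d):
    """Log-probability drop from the peak to the nsigma contour, for d = 1 or 2 only."""
    # Probability mass a 1-D Gaussian holds within +/- nsigma.
    mass = math.erf(nsigma / math.sqrt(2.0))
    if d == 1:
        # The chi-square(1) quantile at this mass is nsigma**2.
        return 0.5 * nsigma ** 2
    if d == 2:
        # The chi-square(2) CDF is 1 - exp(-x / 2), so x / 2 = -log(1 - mass).
        return -math.log(1.0 - mass)
    raise NotImplementedError("general d needs a chi-square quantile function")


one_sigma_1d = threshold_for_nsigma_sketch(1.0, 1)
one_sigma_2d = threshold_for_nsigma_sketch(1.0, 2)
two_sigma_2d = threshold_for_nsigma_sketch(2.0, 2)
```

The 2-D values correspond to the familiar delta chi-square levels of roughly 2.30 and 6.18 for one and two sigma, halved.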

BOBE.utils.core.split_vmap(func, input_arrays, batch_size=10)[source]

BOBE.utils.core.scale_to_unit(x, param_bounds)[source]

Project from the original domain to the unit hypercube. x has shape (N, d) and param_bounds has shape (2, d).

BOBE.utils.core.scale_from_unit(x, param_bounds)[source]

Project from the unit hypercube to the original domain. x has shape (N, d) and param_bounds has shape (2, d).
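The two projections are inverse affine maps per dimension. A minimal sketch using plain nested lists, assuming the first row of param_bounds holds the lower bounds and the second row the upper bounds (the row order is an assumption, and the function names are hypothetical stand-ins for the real ones):

```python
def scale_to_unit_sketch(x, param_bounds):
    """Map each row of x (N x d) from the physical box to the unit hypercube."""
    lower, upper = param_bounds  # assumed layout: row 0 = lower, row 1 = upper
    return [[(v - lo) / (hi - lo) for v, lo, hi in zip(row, lower, upper)]
            for row in x]


def scale_from_unit_sketch(u, param_bounds):
    """Inverse map: unit hypercube back to the physical box."""
    lower, upper = param_bounds
    return [[lo + v * (hi - lo) for v, lo, hi in zip(row, lower, upper)]
            for row in u]


bounds = [[0.0, -1.0], [10.0, 1.0]]     # 2 x d, here d = 2
points = [[5.0, 0.0], [10.0, -1.0]]     # N x d, here N = 2
unit = scale_to_unit_sketch(points, bounds)
restored = scale_from_unit_sketch(unit, bounds)
```

Round-tripping through both functions returns the original points, which is a quick sanity check when the bounds layout is in doubt.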

BOBE.utils.core.suppress_stdout_stderr()[source]

A context manager that redirects stdout and stderr to devnull.
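A context manager of this kind can be sketched with the standard library alone. The version below silences only Python-level writes; output produced by compiled extensions at the file-descriptor level would need a different approach (for example os.dup2), and whether BOBE does that is not stated here. The name suppress_output_sketch is hypothetical:

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def suppress_output_sketch():
    """Hypothetical stand-in: discard Python-level writes to stdout and stderr."""
    with open(os.devnull, "w") as devnull:
        with contextlib.redirect_stdout(devnull), contextlib.redirect_stderr(devnull):
            yield


with suppress_output_sketch():
    print("this line is discarded")
    print("so is this one", file=sys.stderr)
print("output is visible again")
```

Both streams are restored when the block exits, including when the body raises an exception.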

Seed Management

Utility functions for managing global random seeds across the BOBE package.

BOBE.utils.seed.set_global_seed(seed=None)[source]

Set global random seed for reproducible results.

Parameters:

seed (int | None) – The random seed to use. If None, a random seed is generated.

Return type:

int

Returns:

The seed that was used.

BOBE.utils.seed.get_global_seed()[source]

Get the current global seed value.

If the seed has not been set, it will be initialized automatically.

Return type:

int

Returns:

The current global seed.
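The "initialized automatically" behaviour described for these functions is a lazy-initialization pattern over module-level state. A minimal sketch of that pattern using the standard library's random module in place of NumPy and JAX; all names ending in _sketch, and the choice of a 32-bit random seed, are assumptions made for illustration:

```python
import random
import secrets

_global_seed = None   # module-level state, unset until first use
_global_rng = None


def set_seed_sketch(seed=None):
    """Store the seed (drawing a random one if None) and rebuild the shared RNG."""
    global _global_seed, _global_rng
    if seed is None:
        seed = secrets.randbits(32)
    _global_seed = seed
    _global_rng = random.Random(seed)
    return seed


def get_seed_sketch():
    """Return the seed, initializing it on first access."""
    if _global_seed is None:
        set_seed_sketch()
    return _global_seed


def get_rng_sketch():
    """Return the shared generator, initializing it on first access."""
    if _global_rng is None:
        set_seed_sketch()
    return _global_rng


used_seed = set_seed_sketch(42)
first_draw = get_rng_sketch().random()
set_seed_sketch(42)                      # resetting the seed replays the stream
replayed_draw = get_rng_sketch().random()
```

Returning the seed from the setter, as the documented functions do, lets a caller log the value even when it was generated randomly.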

BOBE.utils.seed.get_jax_key()[source]

Get the current JAX random key.

If the seed has not been set, it will be initialized automatically.

Return type:

Array

Returns:

The current JAX PRNGKey.

BOBE.utils.seed.split_jax_key()[source]

Split the current JAX random key and update the global key.

If the seed has not been set, it will be initialized automatically.

Return type:

tuple[Array, Array]

Returns:

A tuple containing the new global key and the key for use.

BOBE.utils.seed.get_new_jax_key()[source]

Get a new JAX random key by splitting the current global key.

If the seed has not been set, it will be initialized automatically.

Return type:

Array

Returns:

A new JAX PRNGKey for immediate use.

BOBE.utils.seed.get_numpy_rng()[source]

Get the global NumPy random number generator.

If the seed has not been set, it will be initialized automatically.

Return type:

Generator

Returns:

The global instance of numpy.random.Generator.

BOBE.utils.seed.ensure_reproducibility(seed=None)[source]

Ensure reproducibility by setting seeds and JAX configurations.

Parameters:

seed (int | None) – The seed to use. If None, a random seed will be generated.

Return type:

int

Returns:

The seed that was used.

Logging Utilities

class BOBE.utils.log.LevelFilter(levels)[source]

Bases: Filter

Filter to allow specific log levels.

__init__(levels)[source]

Initialize a filter.

Initialize with the log levels whose records should be allowed through the filter.

filter(record)[source]

Determine if the specified record is to be logged.

Returns True if the record should be logged, or False otherwise. If deemed appropriate, the record may be modified in-place.
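A level-based filter can be written as a small logging.Filter subclass. The sketch below assumes levels is an iterable of numeric levels such as logging.INFO; the real class may accept level names instead, and LevelFilterSketch is a hypothetical name:

```python
import logging
import sys


class LevelFilterSketch(logging.Filter):
    """Hypothetical filter that passes only records whose level is in `levels`."""

    def __init__(self, levels):
        super().__init__()
        self.levels = set(levels)  # assumed to hold numeric levels

    def filter(self, record):
        return record.levelno in self.levels


# Typical use: route only DEBUG and INFO records to stdout.
logger = logging.getLogger("level_filter_demo")
stdout_handler = logging.StreamHandler(sys.stdout)
stdout_handler.addFilter(LevelFilterSketch([logging.DEBUG, logging.INFO]))
logger.addHandler(stdout_handler)
```

A second handler with a complementary filter could then send warnings and errors to stderr without duplicating messages.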

BOBE.utils.log.setup_logging(verbosity='INFO', log_file=None)[source]

Configure logging for serial or MPI runs.

Parameters:
  • verbosity – Logging level as a string: one of ‘DEBUG’, ‘INFO’, ‘WARNING’, ‘ERROR’, ‘CRITICAL’, or ‘QUIET’

  • log_file – Optional file to log to. In MPI runs, the file name is suffixed with the process rank.

BOBE.utils.log.get_logger(name)[source]

Gets a logger. The root logger should be configured first via setup_logging.

BOBE.utils.log.update_verbosity(verbosity)[source]

Update the logging verbosity at runtime.

Plotting and Visualization

Summary plotting module for BOBE runtime visualization.

This module provides comprehensive plotting capabilities for analyzing BOBE runs, including evidence evolution, GP hyperparameters, timing information, and convergence diagnostics.

BOBE.utils.plot.plot_final_samples(gp, samples_dict, param_list, param_labels, plot_params=None, param_bounds=None, reference_samples=None, reference_file=None, reference_ignore_rows=0.0, reference_label='MCMC', scatter_points=False, markers=None, output_file='output', output_dir='./', **kwargs)[source]

Plot the final samples from the Bayesian optimization process.

Parameters:
  • gp (GP object) – The Gaussian process object used for the optimization.

  • samples_dict (dict) – The samples from the nested sampling or MCMC process.

  • param_list (list) – The list of parameter names.

  • param_labels (list) – The list of parameter labels for plotting.

  • plot_params (list, optional) – The list of parameters to plot. If None, all parameters will be plotted.

  • param_bounds (np.ndarray, optional) – The bounds of the parameters. If None, assumed to be [0,1] for all parameters.

  • reference_samples (MCSamples, optional) – Reference GetDist MCSamples from an MCMC or nested sampling run to compare against. If None, the samples are loaded from reference_file.

  • reference_file (str, optional) – GetDist file root containing the reference samples. If None, reference_samples is used instead. If both are None, no reference samples are plotted.

  • reference_ignore_rows (float, optional) – The fraction of rows to ignore in the reference file. Default is 0.0.

  • reference_label (str, optional) – The label for the reference samples. Default is ‘MCMC’.

  • scatter_points (bool, optional) – If True, scatter the training points on the plot. Default is False.

  • output_file (str, optional) – The output file name for the plot. Default is ‘output’.

class BOBE.utils.plot.BOBESummaryPlotter(results, figsize_scale=1.0)[source]

Bases: object

Comprehensive plotting class for BOBE run analysis and diagnostics.

__init__(results, figsize_scale=1.0)[source]

Initialize the plotter with BOBE results.

Parameters:
  • results (Union[BOBEResults, str]) – BOBEResults object or path to results file

  • figsize_scale (float) – Scale factor for figure sizes (default: 1.0)

plot_evidence_evolution(logz_data=None, ax=None, show_convergence=True)[source]

Plot the evolution of log evidence (logZ) with error bounds.

Parameters:
  • logz_data (Optional[Dict]) – Dictionary containing logZ evolution data (uses results if None)

  • ax (Optional[Axes]) – Matplotlib axes to plot on (creates new if None)

  • show_convergence (bool) – Whether to mark convergence points

Return type:

Axes

Returns:

The matplotlib axes object

plot_gp_lengthscales(gp_data=None, ax=None)[source]

Plot evolution of GP lengthscales only.

Parameters:
  • gp_data (Optional[Dict]) – Dictionary containing GP hyperparameter evolution data

  • ax (Optional[Axes]) – Matplotlib axes to plot on (creates new if None)

Return type:

Axes

Returns:

The matplotlib axes object

plot_gp_kernel_variance(gp_data=None, ax=None)[source]

Plot evolution of GP kernel variance only.

Parameters:
  • gp_data (Optional[Dict]) – Dictionary containing GP hyperparameter evolution data

  • ax (Optional[Axes]) – Matplotlib axes to plot on (creates new if None)

Return type:

Axes

Returns:

The matplotlib axes object

plot_gp_hyperparameters(gp_data=None, ax=None)[source]

Plot evolution of GP hyperparameters (backward compatibility - now plots lengthscales only).

Parameters:
  • gp_data (Optional[Dict]) – Dictionary containing GP hyperparameter evolution data

  • ax (Optional[Axes]) – Matplotlib axes to plot on (creates new if None)

Return type:

Axes

Returns:

The matplotlib axes object

plot_best_loglike_evolution(best_loglike_data=None, ax=None, scatter_improvements=False)[source]

Plot evolution of the best log-likelihood found so far.

Parameters:
  • best_loglike_data (Optional[Dict]) – Dictionary with ‘iterations’ and ‘best_loglike’ keys

  • ax (Optional[Axes]) – Matplotlib axes to plot on (creates new if None)

Return type:

Axes

Returns:

The matplotlib axes object

plot_acquisition_evolution(acquisition_data=None, ax=None)[source]

Plot the evolution of acquisition function values throughout iterations.

Parameters:
  • acquisition_data (Optional[Dict]) – Dictionary with acquisition data (gets from results if None)

  • ax (Optional[Axes]) – Matplotlib axes to plot on (creates new if None)

Return type:

Axes

Returns:

The matplotlib axes object

plot_timing_breakdown(timing_data=None, ax=None)[source]

Plot timing breakdown of different phases of the algorithm.

Parameters:
  • timing_data (Optional[Dict]) – Dictionary with timing information for different phases

  • ax (Optional[Axes]) – Matplotlib axes to plot on (creates new if None)

Return type:

Axes

Returns:

The matplotlib axes object

plot_convergence_diagnostics(convergence_data=None, ax=None)[source]

Plot convergence diagnostics including thresholds and delta evolution.

Parameters:
  • convergence_data (Optional[Dict]) – Dictionary containing convergence history data (uses results if None)

  • ax (Optional[Axes]) – Matplotlib axes to plot on (creates new if None)

Return type:

Axes

Returns:

The matplotlib axes object

plot_kl_divergences(kl_data=None, ax=None, annotate=False)[source]

Plot successive KL divergences between NS iterations (reverse, forward, symmetric).

Parameters:
  • kl_data (Optional[Dict]) – Dictionary containing KL divergence data (uses results if None)

  • ax (Optional[Axes]) – Matplotlib axes to plot on (creates new if None)

Return type:

Axes

Returns:

The matplotlib axes object

plot_parameter_evolution(param_evolution_data=None, max_params=4)[source]

Plot evolution of parameter values during optimization.

Parameters:
  • param_evolution_data (Optional[Dict]) – Dictionary with parameter evolution data

  • max_params (int) – Maximum number of parameters to plot

Return type:

Figure

Returns:

The matplotlib figure object

create_summary_dashboard(logz_data=None, convergence_data=None, kl_data=None, gp_data=None, best_loglike_data=None, acquisition_data=None, timing_data=None, save_path=None, title=None)[source]

Create a comprehensive summary dashboard with all diagnostic plots.

Parameters:
  • logz_data (Optional[Dict]) – Log evidence evolution data

  • convergence_data (Optional[Dict]) – Convergence diagnostics data

  • kl_data (Optional[Dict]) – KL divergence data

  • gp_data (Optional[Dict]) – GP hyperparameter evolution data

  • best_loglike_data (Optional[Dict]) – Best log-likelihood evolution data

  • acquisition_data (Optional[Dict]) – Acquisition function evolution data

  • timing_data (Optional[Dict]) – Timing breakdown data

  • save_path (Optional[str]) – Path to save the figure (optional)

Return type:

Figure

Returns:

The matplotlib figure object

plot_summary_stats(ax=None)[source]

Plot key summary statistics as text.

Parameters:

ax (Optional[Axes]) – Matplotlib axes to plot on (creates new if None)

Return type:

Axes

Returns:

The matplotlib axes object

save_all_plots(output_dir=None, **data_kwargs)[source]

Save all individual plots and the summary dashboard.

Parameters:
  • output_dir (Optional[str]) – Directory to save plots (uses output_file base if None)

  • **data_kwargs – Data dictionaries for different plot types

BOBE.utils.plot.create_summary_plots(results_file, gp_data=None, best_loglike_data=None, timing_data=None, param_evolution_data=None, output_dir=None, figsize_scale=1.0)[source]

Convenience function to create all summary plots for a BOBE run.

Parameters:
  • results_file (str) – Path to BOBE results file (without extension)

  • gp_data (Optional[Dict]) – GP hyperparameter evolution data

  • best_loglike_data (Optional[Dict]) – Best log-likelihood evolution data

  • timing_data (Optional[Dict]) – Timing breakdown data

  • param_evolution_data (Optional[Dict]) – Parameter evolution data

  • output_dir (Optional[str]) – Directory to save plots

  • figsize_scale (float) – Scale factor for figure sizes

Return type:

BOBESummaryPlotter

Returns:

BOBESummaryPlotter object

BOBE.utils.plot.get_data_format_examples()[source]

Return example data formats for the plotting functions.

Return type:

Dict[str, Dict]

Returns:

Dictionary with example data structures

Parallel Computing

class BOBE.pool.MPI_Pool(dynamic_dispatch=False)[source]

Bases: object

Enhanced MPI Pool with support for managing worker state and multiple task types.

This pool implements a master-worker pattern where workers enter a waiting loop and the master dispatches tasks dynamically. Workers automatically participate after initialization and don’t need explicit management in user code.

TASK_OBJECTIVE_EVAL = 0
TASK_GP_FIT = 1
TASK_ACQUISITION_OPT = 3
TASK_COBAYA_INIT = 4
TASK_CLEAR_JAX_CACHES = 5
TASK_INIT = 99
TASK_EXIT = 100

__init__(dynamic_dispatch=False)[source]

Initializes the pool based on whether MPI is available and active.

worker_wait(likelihood, gp=None, seed=None)[source]

Main loop for worker processes. Workers wait for tasks from master and execute them.

This method should be called by worker processes after initialization is complete. It enters an infinite loop waiting for tasks until TASK_EXIT is received.

Parameters:
  • likelihood (Likelihood) – The likelihood object for evaluating objective function.

  • gp (Union[GP, GPwithClassifier]) – The GP object, can be updated via state_dict broadcasts.

  • seed (Optional[int]) – Random seed for the worker. If provided, will be offset by rank.

Notes

This method only executes for worker processes (rank > 0). The master process (rank 0) immediately returns.
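The master-worker loop described above can be illustrated without MPI. The sketch below substitutes threads and queues for MPI ranks and messages, purely to show the control flow: workers block waiting for a task code, evaluate the objective when asked, and leave the loop on TASK_EXIT. The two task codes are copied from the constants listed for this class; everything else (function names, message layout) is hypothetical:

```python
import queue
import threading

# Task codes copied from the MPI_Pool constants.
TASK_OBJECTIVE_EVAL = 0
TASK_EXIT = 100


def worker_loop(tasks, results, objective):
    """Wait for (code, index, payload) messages until TASK_EXIT arrives."""
    while True:
        code, index, payload = tasks.get()
        if code == TASK_EXIT:
            break
        if code == TASK_OBJECTIVE_EVAL:
            results.put((index, objective(payload)))


def run_map_sketch(objective, points, n_workers=2):
    """Master side: dispatch points, gather results in input order, then shut down."""
    tasks, results = queue.Queue(), queue.Queue()
    workers = [threading.Thread(target=worker_loop, args=(tasks, results, objective))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    for i, point in enumerate(points):
        tasks.put((TASK_OBJECTIVE_EVAL, i, point))
    out = [None] * len(points)
    for _ in points:
        i, value = results.get()   # results may arrive in any order
        out[i] = value
    for _ in workers:
        tasks.put((TASK_EXIT, None, None))   # one exit signal per worker
    for w in workers:
        w.join()
    return out


values = run_map_sketch(lambda x: -0.5 * x * x, [0.0, 1.0, 2.0])
```

Tagging each task with its index lets the master reassemble results in input order even though workers finish at different times.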

run_map_objective(function, tasks)[source]

Maps a function over a list of tasks in parallel.

In MPI mode, distributes tasks to workers dynamically; workers must already be in the worker_wait() loop. In serial mode, evaluates locally.

Parameters:
  • function (Callable) – The objective/likelihood function to evaluate. Only used in serial mode.

  • tasks (List[Any]) – List of input points to evaluate, shape (n_tasks, ndim).

Returns:

Array of results, shape (n_tasks,) or (n_tasks, 1).

Return type:

ndarray

gp_fit(gp, maxiters=1000, n_restarts=8, rng=None, use_pool=True)[source]

Orchestrates a parallel GP hyperparameter fit across MPI processes.

Distributes multiple random restarts across workers for hyperparameter optimization and selects the best result.

Parameters:
  • gp (GP) – Gaussian Process model to fit.

  • maxiters (int, optional) – Maximum iterations for each optimization. Default is 1000.

  • n_restarts (int, optional) – Number of random restarts for optimization. Default is 8. In MPI mode, adjusted to at least one restart per process.

  • rng (np.random.Generator, optional) – Random number generator for initial points. If None, creates new one.

  • use_pool (bool, optional) – Whether to use MPI pool for parallelization. Default is True.

Returns:

Best fit result for master process, None for workers.

Return type:

dict or None
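The restart strategy itself is independent of MPI: run the same local optimization from several random starting points and keep the result with the lowest loss. A serial sketch on a toy one-dimensional loss, using a plain gradient-descent step with a finite-difference gradient; the optimizer, the loss, and all names here are illustrative assumptions and not the GP fitting code:

```python
import random


def local_fit(loss, x0, steps=200, lr=0.1, eps=1e-6):
    """Crude gradient descent from x0 using a central-difference gradient."""
    x = x0
    for _ in range(steps):
        grad = (loss(x + eps) - loss(x - eps)) / (2.0 * eps)
        x -= lr * grad
    return {"x": x, "loss": loss(x)}


def multi_restart_fit(loss, n_restarts=8, seed=0, low=-5.0, high=5.0):
    """Run one local fit per random start and return the best result."""
    rng = random.Random(seed)
    starts = [rng.uniform(low, high) for _ in range(n_restarts)]
    fits = [local_fit(loss, x0) for x0 in starts]   # the part a pool would distribute
    return min(fits, key=lambda fit: fit["loss"])


best = multi_restart_fit(lambda x: (x - 2.0) ** 2)
```

In a parallel setting the list comprehension over starts is the natural unit to distribute, with the final min taken on the master process.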

get_cobaya_initial_points(likelihood, n_points, rng=None)[source]

Gets initial points from the Cobaya reference prior in parallel.

Distributes the generation of Cobaya initial points across workers. Workers must already be in the worker_wait() loop.

Parameters:
  • likelihood (CobayaLikelihood) – Cobaya likelihood object with _get_single_valid_point method.

  • n_points (int) – Number of initial points to generate.

  • rng (np.random.Generator, optional) – Random number generator. Only used in serial mode.

Returns:

List of (point, logpost) tuples for master process, None for workers.

Return type:

List[Tuple]

clear_jax_caches()[source]

Clear JAX caches on all processes.

close()[source]

Shut down the pool by telling all workers to exit.

Sends TASK_EXIT signal to all worker processes, allowing them to exit from the worker_wait() loop gracefully.