Utilities
Utility modules providing various helper functions and classes.
Results Handling
- class BOBE.utils.results.BOBEResults(param_names, param_labels, param_bounds, output_file='results', save_dir='./', settings=None, likelihood_name='unknown', resume_from_existing=False)[source]
Bases: object
Comprehensive results management for BOBE runs.
This class handles storing, organizing, and outputting results in formats compatible with standard nested sampling analysis tools.
- __init__(param_names, param_labels, param_bounds, output_file='results', save_dir='./', settings=None, likelihood_name='unknown', resume_from_existing=False)[source]
Initialize the results manager.
- Parameters:
output_file (str) – Base name for output files
param_bounds (ndarray) – Parameter bounds array [n_params, 2]
settings (Optional[Dict[str, Any]]) – Dictionary of BOBE settings
likelihood_name (str) – Name of the likelihood function
resume_from_existing (bool) – If True, try to load existing results and continue from there
- update_acquisition(iteration, acquisition_value, acquisition_function)[source]
Track acquisition function values throughout iterations.
- update_gp_hyperparams(iteration, lengthscales, kernel_variance)[source]
Track GP hyperparameters evolution.
- update_convergence(iteration, logz_dict, converged, threshold)[source]
Update convergence information from a nested sampling check.
- update_kl_divergences(iteration, successive_kl=None)[source]
Update KL divergence tracking for convergence analysis.
- get_last_iteration()[source]
Get the last iteration number from the results history.
- Returns:
Last iteration number, or 0 if no iterations have been recorded
- is_resuming()[source]
Check if this is a resumed run (has existing data).
- Returns:
True if this appears to be a resumed run
- finalize(samples_dict={}, logz_dict=None, converged=False, termination_reason='Max iterations reached', gp_info=None, best_point=None, best_loglike=None, best_iteration=None)[source]
Finalize the results with final samples and metadata.
- Parameters:
samples_dict (Dict[str, ndarray]) – Dictionary with ‘x’, ‘weights’, ‘logl’ keys for final samples
logz_dict (Optional[Dict[str, float]]) – Final evidence information
converged (bool) – Whether the run converged
termination_reason (str) – Reason for termination
gp_info (Optional[Dict[str, Any]]) – Dictionary containing GP and classifier information
best_point (Optional[ndarray]) – Best point found (physical parameter space)
best_iteration (Optional[int]) – Iteration where best point was found
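To illustrate the samples_dict layout described above, the following sketch builds such a dictionary with NumPy and derives a weighted mean and the best-fit sample from it. The array shapes (x as N x d, weights and logl of length N) are assumptions rather than something the source confirms, and summarize_samples is a hypothetical helper, not part of BOBE.

```python
import numpy as np

def summarize_samples(samples_dict):
    """Hypothetical helper: weighted mean and best-fit point of a samples dict."""
    x = np.asarray(samples_dict["x"])              # assumed shape (N, d)
    weights = np.asarray(samples_dict["weights"])  # assumed shape (N,)
    logl = np.asarray(samples_dict["logl"])        # assumed shape (N,)
    mean = np.average(x, axis=0, weights=weights)  # weighted posterior mean
    best = int(np.argmax(logl))                    # index of highest log-likelihood
    return {
        "mean": mean,
        "best_point": x[best],
        "best_loglike": float(logl[best]),
    }

samples_dict = {
    "x": np.array([[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]]),
    "weights": np.array([1.0, 2.0, 1.0]),
    "logl": np.array([-3.0, -1.0, -2.0]),
}
summary = summarize_samples(samples_dict)
```

Quantities like these could be passed to finalize as best_point and best_loglike, although the source does not say how BOBE itself derives them.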
- save_chain_files(samples_dict=None, filename=None)[source]
Save chain files in GetDist format using the MCSamples.saveAsText method.
- save_minimum_files()[source]
Save best point in GetDist minimum format.
Creates two files:
- .minimum.txt: Simple table with best point
- .minimum: Formatted text with parameter details
- save_intermediate(gp, filename=None)[source]
Save intermediate results for crash recovery and resuming.
- get_getdist_samples(samples_dict=None)[source]
Convert results to GetDist MCSamples object.
- Return type:
Optional[MCSamples]
- Returns:
GetDist MCSamples object if GetDist is available, None otherwise
Core Utilities
- BOBE.utils.core.is_cluster_environment()[source]
Detect if running in a cluster environment by checking common environment variables. Returns True if cluster environment is detected, False otherwise.
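A minimal sketch of this kind of detection is shown below. The list of scheduler variables is an assumption (the variables BOBE actually checks are not listed in the source), and looks_like_cluster is a hypothetical name.

```python
import os

# Assumed scheduler variables (Slurm, PBS/Torque, LSF, SGE); BOBE's list may differ.
CLUSTER_ENV_VARS = ("SLURM_JOB_ID", "PBS_JOBID", "LSB_JOBID", "SGE_TASK_ID")

def looks_like_cluster(environ=None):
    """Return True if any known scheduler variable is set and non-empty."""
    env = os.environ if environ is None else environ
    return any(env.get(name) for name in CLUSTER_ENV_VARS)
```

Passing a plain dictionary instead of os.environ makes the check easy to exercise without a real scheduler.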
- BOBE.utils.core.kl_divergence_samples(prev_loglike, curr_loglike)[source]
Compute KL divergence between successive iterations.
- BOBE.utils.core.kl_divergence_gaussian(mu1, Cov1, mu2, Cov2)[source]
Computes the forward, reverse, and symmetric KL divergence between two multivariate Gaussian distributions N1=N(mu1, Cov1) and N2=N(mu2, Cov2).
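For reference, the closed-form KL divergence between two multivariate Gaussians can be sketched in NumPy as follows. This is not BOBE's implementation: the function names are hypothetical, "forward" is assumed to mean KL(N1 || N2), and the symmetric value is assumed to be the average of the two directions (other conventions use their sum).

```python
import numpy as np

def kl_gaussian(mu1, cov1, mu2, cov2):
    """KL(N1 || N2) for N1 = N(mu1, cov1) and N2 = N(mu2, cov2)."""
    mu1 = np.atleast_1d(mu1).astype(float)
    mu2 = np.atleast_1d(mu2).astype(float)
    cov1 = np.atleast_2d(cov1).astype(float)
    cov2 = np.atleast_2d(cov2).astype(float)
    d = mu1.size
    diff = mu2 - mu1
    trace_term = np.trace(np.linalg.solve(cov2, cov1))  # tr(cov2^-1 cov1)
    maha_term = diff @ np.linalg.solve(cov2, diff)      # squared Mahalanobis distance
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    return 0.5 * (trace_term + maha_term - d + logdet2 - logdet1)

def kl_all(mu1, cov1, mu2, cov2):
    """Forward, reverse, and (assumed: averaged) symmetric KL divergence."""
    forward = kl_gaussian(mu1, cov1, mu2, cov2)
    reverse = kl_gaussian(mu2, cov2, mu1, cov1)
    return {
        "forward": forward,
        "reverse": reverse,
        "symmetric": 0.5 * (forward + reverse),
    }
```

Using solve and slogdet avoids forming explicit inverses and determinants, which helps when the covariances are poorly conditioned.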
- BOBE.utils.core.get_threshold_for_nsigma(nsigma, d)[source]
Difference between the log-probability at the peak of a d-dimensional Gaussian and the log-probability level corresponding to nsigma (taken from GPry).
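One common way to compute such a threshold, sketched here under the assumption that it matches the GPry approach, is to take the probability mass enclosed by the one-dimensional nsigma interval, find the chi-squared value with d degrees of freedom that encloses the same mass, and halve it. The function name delta_logp_for_nsigma is hypothetical.

```python
import numpy as np
from scipy.special import erfc
from scipy.stats import chi2

def delta_logp_for_nsigma(nsigma, d):
    """Log-probability drop from the peak of a d-dim Gaussian to the nsigma contour."""
    # Probability mass outside the 1D nsigma interval.
    tail = erfc(nsigma / np.sqrt(2.0))
    # Squared Mahalanobis radius of the d-dim contour that leaves the same mass outside.
    r_squared = chi2.isf(tail, d)
    return 0.5 * r_squared
```

In one dimension this reduces to nsigma squared over two, so a 2-sigma level sits 2 units of log-probability below the peak.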
- BOBE.utils.core.scale_to_unit(x, param_bounds)[source]
Project points from the original domain onto the unit hypercube. x has shape N x d and param_bounds has shape 2 x d.
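The mapping is a per-dimension affine rescaling. The sketch below assumes the first row of param_bounds holds lower bounds and the second row upper bounds; to_unit and from_unit are hypothetical names, and the inverse is included only for illustration.

```python
import numpy as np

def to_unit(x, param_bounds):
    """Map points of shape (N, d) from the original domain to [0, 1]^d."""
    x = np.asarray(x, dtype=float)
    lower, upper = np.asarray(param_bounds, dtype=float)  # each of shape (d,)
    return (x - lower) / (upper - lower)

def from_unit(u, param_bounds):
    """Inverse mapping from the unit hypercube back to the original domain."""
    u = np.asarray(u, dtype=float)
    lower, upper = np.asarray(param_bounds, dtype=float)
    return lower + u * (upper - lower)

bounds = np.array([[0.0, -1.0],    # lower bounds (assumed first row)
                   [10.0, 1.0]])   # upper bounds (assumed second row)
points = np.array([[5.0, 0.0], [0.0, -1.0], [10.0, 1.0]])
unit_points = to_unit(points, bounds)
```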
Seed Management
Utility functions for managing global random seeds across the BOBE package.
- BOBE.utils.seed.set_global_seed(seed=None)[source]
Set global random seed for reproducible results.
- BOBE.utils.seed.get_global_seed()[source]
Get the current global seed value.
If the seed has not been set, it will be initialized automatically.
- Returns:
The current global seed.
- BOBE.utils.seed.get_jax_key()[source]
Get the current JAX random key.
If the seed has not been set, it will be initialized automatically.
- Returns:
The current JAX PRNGKey.
- BOBE.utils.seed.split_jax_key()[source]
Split the current JAX random key and update the global key.
If the seed has not been set, it will be initialized automatically.
- BOBE.utils.seed.get_new_jax_key()[source]
Get a new JAX random key by splitting the current global key.
If the seed has not been set, it will be initialized automatically.
- Returns:
A new JAX PRNGKey for immediate use.
Logging Utilities
- class BOBE.utils.log.LevelFilter(levels)[source]
Bases: Filter
Filter to allow specific log levels.
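A filter of this kind can be written with the standard logging module as sketched below. The class name AllowLevels is hypothetical, and the sketch assumes levels is an iterable of numeric level constants; whether BOBE's LevelFilter takes numbers or level names is not stated in the source.

```python
import logging

class AllowLevels(logging.Filter):
    """Pass only records whose level is in the given collection."""

    def __init__(self, levels):
        super().__init__()
        self.levels = set(levels)  # assumed: numeric levels such as logging.INFO

    def filter(self, record):
        return record.levelno in self.levels

# Example: a handler that emits INFO and WARNING records only.
handler = logging.StreamHandler()
handler.addFilter(AllowLevels([logging.INFO, logging.WARNING]))
```

Attaching differently filtered handlers to one logger is a common way to route, for example, errors to stderr and informational messages to stdout.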
- BOBE.utils.log.setup_logging(verbosity='INFO', log_file=None)[source]
Configure logging for serial or MPI runs.
- Parameters:
verbosity – Logging level as a string: ‘DEBUG’, ‘INFO’, ‘WARNING’, ‘ERROR’, ‘CRITICAL’, or ‘QUIET’
log_file – Optional file to log to. In MPI runs, the file name is suffixed with the process rank.
Plotting and Visualization
Summary plotting module for BOBE runtime visualization.
This module provides comprehensive plotting capabilities for analyzing BOBE runs, including evidence evolution, GP hyperparameters, timing information, and convergence diagnostics.
- BOBE.utils.plot.plot_final_samples(gp, samples_dict, param_list, param_labels, plot_params=None, param_bounds=None, reference_samples=None, reference_file=None, reference_ignore_rows=0.0, reference_label='MCMC', scatter_points=False, markers=None, output_file='output', output_dir='./', **kwargs)[source]
Plot the final samples from the Bayesian optimization process.
- Parameters:
gp (GP object) – The Gaussian process object used for the optimization.
samples_dict (dict) – The samples from the nested sampling or MCMC process.
param_list (list) – The list of parameter names.
param_labels (list) – The list of parameter labels for plotting.
plot_params (list, optional) – The list of parameters to plot. If None, all parameters will be plotted.
param_bounds (np.ndarray, optional) – The bounds of the parameters. If None, assumed to be [0,1] for all parameters.
reference_samples (MCSamples, optional) – The reference GetDist MCSamples from an MCMC or nested sampling run to compare against. If None, the samples are loaded from reference_file.
reference_file (str, optional) – The GetDist file root containing the reference samples. Used only when reference_samples is None. If both are None, no reference samples are plotted.
reference_ignore_rows (float, optional) – The fraction of rows to ignore in the reference file. Default is 0.0.
reference_label (str, optional) – The label for the reference samples. Default is ‘MCMC’.
scatter_points (bool, optional) – If True, scatter the training points on the plot. Default is False.
output_file (str, optional) – The output file name for the plot. Default is ‘output’.
- class BOBE.utils.plot.BOBESummaryPlotter(results, figsize_scale=1.0)[source]
Bases: object
Comprehensive plotting class for BOBE run analysis and diagnostics.
- __init__(results, figsize_scale=1.0)[source]
Initialize the plotter with BOBE results.
- Parameters:
results (Union[BOBEResults, str]) – BOBEResults object or path to results file
figsize_scale (float) – Scale factor for figure sizes (default: 1.0)
- plot_evidence_evolution(logz_data=None, ax=None, show_convergence=True)[source]
Plot the evolution of log evidence (logZ) with error bounds.
- plot_gp_hyperparameters(gp_data=None, ax=None)[source]
Plot the evolution of GP hyperparameters. Kept for backward compatibility; it now plots lengthscales only.
- plot_best_loglike_evolution(best_loglike_data=None, ax=None, scatter_improvements=False)[source]
Plot evolution of the best log-likelihood found so far.
- plot_acquisition_evolution(acquisition_data=None, ax=None)[source]
Plot the evolution of acquisition function values throughout iterations.
- plot_timing_breakdown(timing_data=None, ax=None)[source]
Plot timing breakdown of different phases of the algorithm.
- plot_convergence_diagnostics(convergence_data=None, ax=None)[source]
Plot convergence diagnostics including thresholds and delta evolution.
- plot_kl_divergences(kl_data=None, ax=None, annotate=False)[source]
Plot successive KL divergences between NS iterations (reverse, forward, symmetric).
- plot_parameter_evolution(param_evolution_data=None, max_params=4)[source]
Plot evolution of parameter values during optimization.
- create_summary_dashboard(logz_data=None, convergence_data=None, kl_data=None, gp_data=None, best_loglike_data=None, acquisition_data=None, timing_data=None, save_path=None, title=None)[source]
Create a comprehensive summary dashboard with all diagnostic plots.
- Return type:
Figure
- Returns:
The matplotlib figure object
- BOBE.utils.plot.create_summary_plots(results_file, gp_data=None, best_loglike_data=None, timing_data=None, param_evolution_data=None, output_dir=None, figsize_scale=1.0)[source]
Convenience function to create all summary plots for a BOBE run.
- Returns:
BOBESummaryPlotter object
Parallel Computing
- class BOBE.pool.MPI_Pool(dynamic_dispatch=False)[source]
Bases: object
Enhanced MPI Pool with support for managing worker state and multiple task types.
This pool implements a master-worker pattern where workers enter a waiting loop and the master dispatches tasks dynamically. Workers automatically participate after initialization and don’t need explicit management in user code.
- TASK_OBJECTIVE_EVAL = 0
- TASK_GP_FIT = 1
- TASK_ACQUISITION_OPT = 3
- TASK_COBAYA_INIT = 4
- TASK_CLEAR_JAX_CACHES = 5
- TASK_INIT = 99
- TASK_EXIT = 100
- __init__(dynamic_dispatch=False)[source]
Initializes the pool based on whether MPI is available and active.
- worker_wait(likelihood, gp=None, seed=None)[source]
Main loop for worker processes. Workers wait for tasks from master and execute them.
This method should be called by worker processes after initialization is complete. It enters an infinite loop waiting for tasks until TASK_EXIT is received.
- Parameters:
likelihood (Likelihood) – The likelihood object for evaluating objective function.
gp (Union[GP, GPwithClassifier]) – The GP object, can be updated via state_dict broadcasts.
seed (Optional[int]) – Random seed for the worker. If provided, will be offset by rank.
Notes
This method only executes for worker processes (rank > 0). The master process (rank 0) immediately returns.
- run_map_objective(function, tasks)[source]
Maps a function over a list of tasks in parallel.
In MPI mode, distributes tasks to workers dynamically. Workers must be in worker_wait() loop. In serial mode, evaluates locally.
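The master-worker pattern with dynamic dispatch can be illustrated without MPI. The sketch below uses threads and queues as a stand-in for MPI ranks and messages, so it shows the control flow only, not how MPI_Pool is implemented. The two task constants reuse the values documented above; worker_loop and map_objective are hypothetical names.

```python
import queue
import threading

TASK_OBJECTIVE_EVAL = 0
TASK_EXIT = 100

def worker_loop(task_queue, result_queue, objective):
    """Wait for tasks and execute them until an exit message arrives."""
    while True:
        task_type, index, payload = task_queue.get()
        if task_type == TASK_EXIT:
            break
        if task_type == TASK_OBJECTIVE_EVAL:
            result_queue.put((index, objective(payload)))

def map_objective(objective, tasks, n_workers=3):
    """Master side: dispatch tasks, collect results in task order, stop workers."""
    task_queue, result_queue = queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=worker_loop,
                         args=(task_queue, result_queue, objective))
        for _ in range(n_workers)
    ]
    for worker in workers:
        worker.start()
    # Dynamic dispatch: whichever worker is idle pulls the next task.
    for index, payload in enumerate(tasks):
        task_queue.put((TASK_OBJECTIVE_EVAL, index, payload))
    results = [None] * len(tasks)
    for _ in tasks:
        index, value = result_queue.get()
        results[index] = value
    # One exit message per worker ends the waiting loops.
    for _ in workers:
        task_queue.put((TASK_EXIT, None, None))
    for worker in workers:
        worker.join()
    return results

values = map_objective(lambda x: -0.5 * x**2, [0.0, 1.0, 2.0, 3.0])
```

Tagging each task with its index lets the master restore the original order even though workers finish at different times.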
- gp_fit(gp, maxiters=1000, n_restarts=8, rng=None, use_pool=True)[source]
Orchestrates a parallel GP hyperparameter fit across MPI processes.
Distributes multiple random restarts across workers for hyperparameter optimization and selects the best result.
- Parameters:
gp (GP) – Gaussian Process model to fit.
maxiters (int, optional) – Maximum iterations for each optimization. Default is 1000.
n_restarts (int, optional) – Number of random restarts for optimization. Default is 8. In MPI mode, adjusted to at least one restart per process.
rng (np.random.Generator, optional) – Random number generator for initial points. If None, creates new one.
use_pool (bool, optional) – Whether to use MPI pool for parallelization. Default is True.
- Returns:
Best fit result for master process, None for workers.
- Return type:
dict or None
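The multi-restart strategy described for gp_fit can be sketched in serial form as follows. The toy objective stands in for a GP's negative log marginal likelihood, the restarts run in a plain loop rather than across MPI workers, and the function names, the sampling range for starting points, the n_procs argument, and the keys of the returned dictionary are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def toy_objective(theta):
    """Toy multimodal function standing in for a negative log marginal likelihood."""
    return float(np.sum(theta**2) + 2.0 * np.sum(np.sin(3.0 * theta)))

def fit_with_restarts(objective, dim, n_restarts=8, maxiters=1000,
                      rng=None, n_procs=1):
    """Run several local optimizations from random starts and keep the best."""
    rng = np.random.default_rng() if rng is None else rng
    # Mirror the documented adjustment: at least one restart per process.
    n_restarts = max(n_restarts, n_procs)
    starts = rng.uniform(-3.0, 3.0, size=(n_restarts, dim))  # assumed range
    results = [
        minimize(objective, x0, method="L-BFGS-B",
                 options={"maxiter": maxiters})
        for x0 in starts
    ]
    best = min(results, key=lambda r: r.fun)
    return {
        "params": best.x,
        "value": float(best.fun),
        "all_values": [float(r.fun) for r in results],
    }

fit = fit_with_restarts(toy_objective, dim=2, rng=np.random.default_rng(0))
```

Because each restart is independent, the loop over starting points is the part that a pool can distribute, with only the final comparison done on the master.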
- get_cobaya_initial_points(likelihood, n_points, rng=None)[source]
Gets initial points from the Cobaya reference prior in parallel.
Distributes the generation of Cobaya initial points across workers. Workers must be in worker_wait() loop.
- Parameters:
likelihood (CobayaLikelihood) – Cobaya likelihood object with _get_single_valid_point method.
n_points (int) – Number of initial points to generate.
rng (np.random.Generator, optional) – Random number generator. Only used in serial mode.
- Returns:
List of (point, logpost) tuples for master process, None for workers.
- Return type: