Gaussian Process Models
JaxBo provides several Gaussian Process implementations optimized for different scenarios.
Base GP Class
- class BOBE.gp.GP(train_x, train_y, noise=1e-08, kernel='rbf', optimizer='scipy', optimizer_options={}, kernel_variance_bounds=[0.0001, 100000000.0], lengthscale_bounds=[0.01, 5], lengthscales=None, kernel_variance=None, kernel_variance_prior=None, lengthscale_prior=None, tausq=None, tausq_bounds=[0.0001, 10000.0], param_names=None)[source]
Bases:
object
- __init__(train_x, train_y, noise=1e-08, kernel='rbf', optimizer='scipy', optimizer_options={}, kernel_variance_bounds=[0.0001, 100000000.0], lengthscale_bounds=[0.01, 5], lengthscales=None, kernel_variance=None, kernel_variance_prior=None, lengthscale_prior=None, tausq=None, tausq_bounds=[0.0001, 10000.0], param_names=None)[source]
Initialize the Gaussian Process model.
- Parameters:
train_x (jnp.ndarray) – Training inputs, shape (N, D).
train_y (jnp.ndarray) – Objective function values at training points, shape (N, 1).
noise (float, optional) – Noise parameter added to the diagonal of the kernel. Default is 1e-8.
kernel (str, optional) – Kernel to use, either “rbf” or “matern”. Default is “rbf”.
optimizer (str, optional) – Optimizer to use for hyperparameter tuning. Default is “scipy”.
optimizer_options (dict, optional) – Keyword arguments for the optimizer. Default is {}.
kernel_variance_bounds (list, optional) – Bounds for the kernel variance. Default is [1e-4, 1e8].
lengthscale_bounds (list, optional) – Bounds for the lengthscales. Default is [0.01, 5].
lengthscales (jnp.ndarray, optional) – Initial lengthscale values. If None, defaults to ones. Default is None.
kernel_variance (float, optional) – Initial kernel variance. If None, defaults to 1.0. Default is None.
kernel_variance_prior (dict or str, optional) – Specification for the kernel variance prior. If None, defaults to {‘name’: ‘LogNormal’, ‘loc’: 0.0, ‘scale’: 1.0}. If ‘fixed’, the kernel variance will be fixed to the initial value and not optimized. Defaults to None.
lengthscale_prior (str or dict, optional) – Specification for the lengthscale prior. If ‘DSLP’ or None, uses the DSLP prior. If ‘SAAS’, uses the SAAS prior with tausq parameter. Otherwise, uses the provided distribution spec. Defaults to None.
tausq (float, optional) – Initial tausq parameter for SAAS prior. Only used when lengthscale_prior=’SAAS’. If None, defaults to 1.0. Defaults to None.
tausq_bounds (list, optional) – Bounds for the tausq parameter. Only used when lengthscale_prior=’SAAS’. Default is [1e-4, 1e4], which corresponds to [-4, 4] in log10 space.
- neg_mll(log_params)[source]
Computes the negative log marginal likelihood for the GP with given hyperparameters.
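For orientation, the negative log marginal likelihood of a zero-mean GP can be evaluated from a Cholesky factor as sketched below. This is a minimal NumPy illustration, not BOBE's implementation: the helper names `rbf` and `neg_mll_sketch`, the layout of `log_params` (log lengthscales first, log kernel variance last), and the use of natural logarithms are all assumptions, and any prior terms are omitted.

```python
import numpy as np

def rbf(xa, xb, lengthscales, kernel_variance):
    # Scale each dimension by its lengthscale, then form pairwise squared distances.
    a = xa / lengthscales
    b = xb / lengthscales
    d2 = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return kernel_variance * np.exp(-0.5 * np.maximum(d2, 0.0))

def neg_mll_sketch(log_params, train_x, train_y, noise=1e-8):
    # Assumed layout: log lengthscales first, log kernel variance last.
    lengthscales = np.exp(log_params[:-1])
    kernel_variance = np.exp(log_params[-1])
    n = train_x.shape[0]
    K = rbf(train_x, train_x, lengthscales, kernel_variance) + noise * np.eye(n)
    L = np.linalg.cholesky(K)
    # alpha = K^{-1} y via two solves against the Cholesky factor.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, train_y))
    y = train_y.ravel()
    data_fit = 0.5 * y @ alpha.ravel()
    half_logdet = np.sum(np.log(np.diag(L)))
    return float(data_fit + half_logdet + 0.5 * n * np.log(2.0 * np.pi))

rng = np.random.default_rng(0)
X = rng.uniform(size=(8, 2))
y = np.sin(3.0 * X[:, :1])
log_params = np.log(np.array([0.3, 0.3, 1.0]))
value = neg_mll_sketch(log_params, X, y, noise=1e-6)
```

The example call uses a larger noise value than the documented default only to keep the toy kernel matrix well conditioned.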
- fit(x0=None, maxiter=500)[source]
Performs a serial fit for a given batch of starting points (x0). This method is called by each MPI process on its assigned chunk.
- Parameters:
- Returns:
result – Dictionary containing the best ‘mll’ and corresponding ‘params’ (log space) found.
- Return type:
dict
- update_hyperparams(hyperparams)[source]
Update the GP hyperparameters and recompute the Cholesky and alphas.
- predict_single(x)[source]
Predicts the mean and variance of the GP at x without unstandardizing them. Intended for use with acquisition functions such as EI.
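The relationship between the stored Cholesky factor, the alphas, and a standardized prediction can be sketched as follows. This is an illustrative NumPy example under stated assumptions, not the library's code: the mean/standard-deviation standardization, the fixed hyperparameters, and the names `rbf` and `predict_single_sketch` are hypothetical.

```python
import numpy as np

def rbf(xa, xb, lengthscale=0.5, kernel_variance=1.0):
    d2 = np.sum((xa[:, None, :] - xb[None, :, :]) ** 2, axis=-1)
    return kernel_variance * np.exp(-0.5 * d2 / lengthscale**2)

train_x = np.array([[0.0], [0.5], [1.0]])
train_y = np.array([[1.0], [3.0], [2.0]])

# Assumed standardization: subtract the mean, divide by the standard deviation.
y_mean, y_scale = train_y.mean(), train_y.std()
y_std = (train_y - y_mean) / y_scale

noise = 1e-8
K = rbf(train_x, train_x) + noise * np.eye(len(train_x))
L = np.linalg.cholesky(K)
alphas = np.linalg.solve(L.T, np.linalg.solve(L, y_std))

def predict_single_sketch(x):
    k_star = rbf(train_x, x[None, :])            # shape (N, 1)
    mean = (k_star.T @ alphas).item()            # mean in standardized units
    v = np.linalg.solve(L, k_star)
    var = (rbf(x[None, :], x[None, :]) - v.T @ v).item()
    return mean, max(var, 0.0)                   # clip round-off negatives

mean_std, var_std = predict_single_sketch(np.array([0.5]))
# Map back to the original scale only when raw predictions are needed.
mean_raw = y_mean + y_scale * mean_std
```

Acquisition functions can work directly with `mean_std` and `var_std`, which avoids rescaling on every evaluation.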
- update(new_x, new_y)[source]
Updates the GP with new training points and refits the GP if refit is True.
- recompute_cholesky()[source]
Recomputes the Cholesky decomposition and alphas. Useful if hyperparameters are changed manually.
- fantasy_var(new_x, mc_points, k_train_mc)[source]
Computes the variance of the GP at the mc_points, assuming a single point new_x is added to the training set.
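The posterior variance does not depend on the observed value at new_x, so the fantasized variance can be obtained with a rank-one update instead of a refit. The sketch below shows that identity in NumPy. It is an assumption-laden illustration rather than BOBE's implementation: `rbf`, `fantasy_var_sketch`, the fixed hyperparameters, and the noise value are hypothetical.

```python
import numpy as np

def rbf(xa, xb, lengthscale=0.4, kernel_variance=1.0):
    d2 = np.sum((xa[:, None, :] - xb[None, :, :]) ** 2, axis=-1)
    return kernel_variance * np.exp(-0.5 * d2 / lengthscale**2)

def fantasy_var_sketch(train_x, new_x, mc_points, k_train_mc, noise=1e-6):
    K = rbf(train_x, train_x) + noise * np.eye(len(train_x))
    L = np.linalg.cholesky(K)
    v_mc = np.linalg.solve(L, k_train_mc)                # (N, M)
    v_new = np.linalg.solve(L, rbf(train_x, new_x))      # (N, 1)
    # Current posterior variance at the MC points.
    var_mc = np.diag(rbf(mc_points, mc_points)) - np.sum(v_mc**2, axis=0)
    # Posterior variance of a noisy observation at new_x.
    var_new = rbf(new_x, new_x)[0, 0] - np.sum(v_new**2) + noise
    # Posterior covariance between new_x and each MC point.
    cov = rbf(new_x, mc_points)[0] - (v_new.T @ v_mc)[0]
    # Rank-one reduction from conditioning on new_x.
    return var_mc - cov**2 / var_new

train_x = np.array([[0.0], [0.3], [0.9]])
new_x = np.array([[0.6]])
mc_points = np.linspace(0.0, 1.0, 5)[:, None]
k_train_mc = rbf(train_x, mc_points)      # precomputed once, reused per candidate
fantasy = fantasy_var_sketch(train_x, new_x, mc_points, k_train_mc)
```

Passing a precomputed `k_train_mc` mirrors the method signature: the train-to-MC kernel block stays the same while many candidate `new_x` values are scored.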
- state_dict()[source]
Returns a dictionary containing the complete state of the GP. This can be used for saving, loading, or copying the GP.
- Returns:
state – Dictionary containing all necessary information to reconstruct the GP
- Return type:
dict
- save(filename='gp')[source]
Save the GP state to a file using state_dict.
- Parameters:
filename (str) – The filename to save to (with or without .npz extension). Default is ‘gp’.
- copy()[source]
Creates a deep copy of the GP using state_dict.
- Returns:
gp_copy – A deep copy of the current GP
- Return type:
GP
- property npoints
Gaussian Process with Classifier
For handling constraints and invalid regions.
- class BOBE.clf_gp.GPwithClassifier(train_x=None, train_y=None, clf_type='svm', clf_settings={}, clf_use_size=10, clf_update_step=1, probability_threshold=0.5, minus_inf=-100000.0, clf_threshold=250.0, gp_threshold=500.0, noise=1e-08, kernel='rbf', optimizer='scipy', optimizer_options={}, kernel_variance_bounds=[0.0001, 100000000.0], lengthscale_bounds=[0.01, 5.0], tausq=None, tausq_bounds=[0.0001, 10000.0], kernel_variance_prior=None, lengthscale_prior=None, lengthscales=None, kernel_variance=1.0, param_names=None, train_clf_on_init=True)[source]
Bases:
GP
- __init__(train_x=None, train_y=None, clf_type='svm', clf_settings={}, clf_use_size=10, clf_update_step=1, probability_threshold=0.5, minus_inf=-100000.0, clf_threshold=250.0, gp_threshold=500.0, noise=1e-08, kernel='rbf', optimizer='scipy', optimizer_options={}, kernel_variance_bounds=[0.0001, 100000000.0], lengthscale_bounds=[0.01, 5.0], tausq=None, tausq_bounds=[0.0001, 10000.0], kernel_variance_prior=None, lengthscale_prior=None, lengthscales=None, kernel_variance=1.0, param_names=None, train_clf_on_init=True)[source]
Generic Classifier-GP class combining a GP with a classifier. The GP is trained on the data points that are within the GP threshold of the maximum value of the GP.
- Parameters:
train_x (array-like, shape (n_samples, n_dim)) – Initial training points.
train_y (array-like, shape (n_samples,)) – Initial training values.
clf_type (str, optional) – Type of classifier (‘svm’, ‘nn’, ‘ellipsoid’, etc.). Default is ‘svm’.
clf_settings (dict, optional) – Settings specific to the chosen classifier. Default is {}.
clf_use_size (int, optional) – Minimum number of points to start using the classifier. Default is 10.
clf_update_step (int, optional) – Update classifier every clf_update_step points after clf_use_size is reached. Default is 1.
probability_threshold (float, optional) – Threshold for classifier probability/score to consider a point feasible (important for nn, ellipsoid). Default is 0.5.
minus_inf (float, optional) – Value used for infeasible predictions. Default is -1e5.
clf_threshold (float, optional) – Threshold for initial classifier training labels (if used). If None, gp_threshold might be used or a default calculated. Default is 250.0.
gp_threshold (float, optional) – Threshold for adding points to the GP training set. Default is 500.0.
noise, kernel, optimizer, kernel_variance_bounds, lengthscale_bounds, lengthscale_prior, lengthscales, kernel_variance – GP parameters (see DSLP_GP/SAAS_GP). Note: bounds are now in actual space, not log10.
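One plausible reading of the two thresholds is sketched below: points within gp_threshold of the best observed value go to the GP, and points within clf_threshold of it are labelled feasible for the classifier. The labelling rule in particular is an assumption made for illustration and may differ from what GPwithClassifier actually does.

```python
import numpy as np

train_y = np.array([-3.0, -120.0, -400.0, -700.0, -1e5])
gp_threshold = 500.0
clf_threshold = 250.0

best = train_y.max()

# Points within gp_threshold of the best value are kept for GP training.
gp_mask = train_y >= best - gp_threshold

# Assumed labelling rule: 1 (feasible) if within clf_threshold of the best value.
clf_labels = (train_y >= best - clf_threshold).astype(int)

gp_train_y = train_y[gp_mask]
```

Under this reading, the value -400 is still used by the GP but is already labelled infeasible for the classifier.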
- predict_single(x)[source]
Predicts the mean and variance of the GP at x without unstandardizing them. Intended for use with acquisition functions such as EI.
- fantasy_var(new_x, mc_points, k_train_mc)[source]
Computes the fantasy variance, see gp.py for more details. Classifier logic could potentially be added here if needed.
- update(new_x, new_y)[source]
Updates the classifier and GP training sets. Retrains classifier/GP based on thresholds and steps.
- kernel(x1, x2, lengthscales, kernel_variance, noise, include_noise=True)[source]
Returns the kernel function used by the GP.
- state_dict()[source]
Returns a dictionary containing the complete state of the GPwithClassifier. This can be used for saving, loading, or copying the GPwithClassifier.
- Returns:
state – Dictionary containing all necessary information to reconstruct the GPwithClassifier
- Return type:
dict
- classmethod from_state_dict(state)[source]
Creates a GPwithClassifier instance from a state dictionary.
- Parameters:
state (dict) – State dictionary returned by state_dict()
- Returns:
gp_clf – The reconstructed GPwithClassifier object
- Return type:
GPwithClassifier
- save(filename='gp')[source]
Save the GPwithClassifier state to a file using state_dict.
- Parameters:
filename (str) – The filename to save to (with or without .npz extension). Default is ‘gp’.
- classmethod load(filename, **kwargs)[source]
Loads a GPwithClassifier from a file.
- Parameters:
filename (str) – The name of the file to load the GPwithClassifier from (with or without .npz extension)
**kwargs – Additional keyword arguments to pass to the GPwithClassifier constructor
- Returns:
gp_clf – The loaded GPwithClassifier object
- Return type:
GPwithClassifier
- copy()[source]
Creates a deep copy of the GPwithClassifier using state_dict.
- Returns:
gp_clf_copy – A deep copy of the current GPwithClassifier
- Return type:
GPwithClassifier
- property clf_data_size
Size of the classifier’s training inputs.
- property npoints
Kernel Functions
JaxBo uses object-oriented kernel implementations for GP covariance computation.
- class BOBE.kernels.Kernel(lengthscales, kernel_variance, noise=1e-08)[source]
Bases:
ABC
Abstract base class for all kernels in BOBE.
- lengthscales
Lengthscale parameters for each dimension, shape (D,)
- Type:
jnp.ndarray
- __init__(lengthscales, kernel_variance, noise=1e-08)[source]
Initialize kernel with hyperparameters.
- sq_dist(xa, xb)[source]
Compute squared Euclidean distance between two sets of points.
This utility method is used by many kernel implementations.
- Parameters:
xa (jnp.ndarray) – First set of points, shape (n1, D)
xb (jnp.ndarray) – Second set of points, shape (n2, D)
- Returns:
sq_dist – Squared distances, shape (n1, n2)
- Return type:
jnp.ndarray
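A common way to obtain all pairwise squared distances at once is the expansion ||a - b||² = ||a||² + ||b||² - 2 a·b. The NumPy sketch below shows the idea. The name `sq_dist_sketch` is hypothetical, and whether BOBE uses this expansion or direct broadcasting is not stated here.

```python
import numpy as np

def sq_dist_sketch(xa, xb):
    # Row-wise squared norms, shaped for broadcasting to (n1, n2).
    a2 = np.sum(xa**2, axis=1)[:, None]
    b2 = np.sum(xb**2, axis=1)[None, :]
    d2 = a2 + b2 - 2.0 * xa @ xb.T
    # Round-off can produce tiny negative values; clip them to zero.
    return np.maximum(d2, 0.0)

xa = np.array([[0.0, 0.0], [1.0, 1.0]])
xb = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 4.0]])
d2 = sq_dist_sketch(xa, xb)   # shape (2, 3)
```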
- abstractmethod covariance(xa, xb, include_noise=True)[source]
Compute covariance matrix between two sets of points.
- Parameters:
xa (jnp.ndarray) – First set of points, shape (n1, D)
xb (jnp.ndarray) – Second set of points, shape (n2, D)
include_noise (bool, optional) – Whether to add noise to diagonal (only when xa is xb). Default is True.
- Returns:
K – Covariance matrix of shape (n1, n2)
- Return type:
jnp.ndarray
- diagonal(x, include_noise=True)[source]
Compute only the diagonal of the kernel matrix K(x,x).
For stationary kernels, the diagonal is constant: kernel_variance (+ noise). Override this method if your kernel has a non-constant diagonal.
- Parameters:
x (jnp.ndarray) – Points at which to compute diagonal, shape (n, D)
include_noise (bool, optional) – Whether to include noise in diagonal. Default is True.
- Returns:
diag – Diagonal values, shape (n,)
- Return type:
jnp.ndarray
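Because k(x, x) equals kernel_variance for a stationary kernel, the diagonal can be filled in without computing any distances. The sketch below is a hypothetical standalone function (`stationary_diagonal_sketch`), not the library method.

```python
import numpy as np

def stationary_diagonal_sketch(x, kernel_variance, noise=1e-8, include_noise=True):
    # Every diagonal entry is the same constant for a stationary kernel.
    value = kernel_variance + (noise if include_noise else 0.0)
    return np.full(x.shape[0], value)

x = np.random.default_rng(1).normal(size=(4, 3))
diag = stationary_diagonal_sketch(x, kernel_variance=2.5)
diag_no_noise = stationary_diagonal_sketch(x, kernel_variance=2.5, include_noise=False)
```

This costs O(n) instead of the O(n²) needed to build the full matrix and extract its diagonal.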
- class BOBE.kernels.RBFKernel(lengthscales, kernel_variance, noise=1e-08)[source]
Bases:
Kernel
Radial Basis Function (RBF) / Squared Exponential kernel.
k(x, x’) = σ² * exp(-0.5 * ||x - x’||²/ℓ²)
where σ² is kernel_variance and ℓ is lengthscale.
- covariance(xa, xb, include_noise=True)[source]
Compute RBF covariance matrix.
- Parameters:
xa (jnp.ndarray) – First set of input points, shape (n1, d).
xb (jnp.ndarray) – Second set of input points, shape (n2, d).
include_noise (bool, optional) – Whether to include noise on diagonal. Default is True.
- Returns:
Kernel matrix of shape (n1, n2).
- Return type:
jnp.ndarray
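The formula above, with one lengthscale per dimension, can be written out in NumPy as in the following sketch. The function name `rbf_covariance_sketch` and the identity check `xa is xb` for adding noise are assumptions for illustration. The actual class stores its hyperparameters on the instance.

```python
import numpy as np

def rbf_covariance_sketch(xa, xb, lengthscales, kernel_variance,
                          noise=1e-8, include_noise=True):
    # Scale each dimension by its own lengthscale, then take pairwise differences.
    diff = (xa[:, None, :] - xb[None, :, :]) / lengthscales
    K = kernel_variance * np.exp(-0.5 * np.sum(diff**2, axis=-1))
    # Noise belongs only on the diagonal of K(x, x).
    if include_noise and xa is xb:
        K = K + noise * np.eye(xa.shape[0])
    return K

x = np.array([[0.0, 0.0], [1.0, 2.0]])
lengthscales = np.array([1.0, 2.0])
K = rbf_covariance_sketch(x, x, lengthscales, kernel_variance=2.0)
K_clean = rbf_covariance_sketch(x, x, lengthscales, kernel_variance=2.0,
                                include_noise=False)
```

In this example the scaled difference between the two points is (1, 1), so the off-diagonal entry is 2·exp(-1).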
- class BOBE.kernels.MaternKernel(lengthscales, kernel_variance, noise=1e-08)[source]
Bases:
Kernel
Matérn-5/2 kernel.
k(x, x’) = σ² * (1 + √5*d + 5*d²/3) * exp(-√5*d)
where d = ||x - x’||/ℓ, σ² is kernel_variance, and ℓ is lengthscale.
- covariance(xa, xb, include_noise=True)[source]
Compute Matérn-5/2 covariance matrix.
- Parameters:
xa (jnp.ndarray) – First set of input points, shape (n1, d).
xb (jnp.ndarray) – Second set of input points, shape (n2, d).
include_noise (bool, optional) – Whether to include noise on diagonal. Default is True.
- Returns:
Kernel matrix of shape (n1, n2).
- Return type:
jnp.ndarray
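The Matérn-5/2 formula above translates to NumPy as in the sketch below, again with per-dimension lengthscales. The function name `matern52_covariance_sketch` is hypothetical and the noise term is left out for brevity.

```python
import numpy as np

def matern52_covariance_sketch(xa, xb, lengthscales, kernel_variance):
    # d is the lengthscale-scaled Euclidean distance between each pair of points.
    diff = (xa[:, None, :] - xb[None, :, :]) / lengthscales
    d = np.sqrt(np.sum(diff**2, axis=-1))
    s = np.sqrt(5.0) * d
    # (1 + sqrt(5) d + 5 d^2 / 3) * exp(-sqrt(5) d), with s = sqrt(5) d.
    return kernel_variance * (1.0 + s + s**2 / 3.0) * np.exp(-s)

x = np.array([[0.0, 0.0], [0.6, 0.8]])   # the two points are distance 1 apart
K = matern52_covariance_sketch(x, x, np.array([1.0, 1.0]), kernel_variance=1.5)
```

When porting this to JAX, note that the gradient of the square root is undefined at d = 0, so implementations commonly guard the distance before differentiating.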
Classifier Module
- BOBE.clf.train_svm_classifier(X, Y, settings={}, init_params=None, **kwargs)[source]
Train SVM classifier and return parameters, metrics, and predict function.
- BOBE.clf.get_svm_predict_proba_fn(params)[source]
Get prediction function for SVM classifier from parameters (for loading from file).
- BOBE.clf.train_nn_classifier(X, Y, settings={}, init_params=None, **kwargs)[source]
Train neural network classifier and return parameters, metrics, and predict function.
- BOBE.clf.get_nn_predict_proba_fn(params, settings={}, **kwargs)[source]
Get prediction function for NN classifier from parameters (for loading from file).
- BOBE.clf.train_ellipsoid_classifier(X, Y, settings={}, init_params=None, **kwargs)[source]
Train ellipsoid classifier and return parameters, metrics, and predict function.
- BOBE.clf.get_ellipsoid_predict_proba_fn(params, settings, d, **kwargs)[source]
Get prediction function for ellipsoid classifier from parameters (for loading from file).
- BOBE.clf.svm_predict(x, support_vectors, dual_coef, intercept, gamma)[source]
Compute the decision function for SVM with RBF kernel.
- Parameters:
- Returns:
Decision function value (scalar). Sign of this value gives the predicted class.
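The standard decision function of an RBF-kernel SVM is a weighted sum of kernel evaluations against the support vectors plus an intercept. The NumPy sketch below assumes a single input point, a 1-D `dual_coef` array, and a scalar intercept. These shapes and the name `svm_decision_sketch` are assumptions and may not match BOBE.clf.svm_predict exactly.

```python
import numpy as np

def svm_decision_sketch(x, support_vectors, dual_coef, intercept, gamma):
    # RBF kernel between x and every support vector: exp(-gamma * ||sv - x||^2).
    k = np.exp(-gamma * np.sum((support_vectors - x) ** 2, axis=1))
    # Weighted sum plus intercept; the sign gives the predicted class.
    return float(dual_coef @ k + intercept)

support_vectors = np.array([[0.0, 0.0], [2.0, 0.0]])
dual_coef = np.array([1.0, -1.0])

score_a = svm_decision_sketch(np.array([0.1, 0.0]), support_vectors, dual_coef, 0.0, 1.0)
score_b = svm_decision_sketch(np.array([1.9, 0.0]), support_vectors, dual_coef, 0.0, 1.0)
score_mid = svm_decision_sketch(np.array([1.0, 0.0]), support_vectors, dual_coef, 0.0, 1.0)
```

Points near the positively weighted support vector score above zero, points near the negatively weighted one score below zero, and the midpoint sits on the decision boundary.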
- BOBE.clf.train_with_restarts(train_fn, x, y, n_restarts=2, init_params=None, **train_kwargs)[source]
Train model with multiple restarts using the entire dataset.
- BOBE.clf.train_nn(model, x_train, y_train, init_params=None, **kwargs)[source]
Simplified NN training using the entire dataset.
- BOBE.clf.train_nn_multiple_restarts(model, x, y, **kwargs)[source]
Wrapper for NN training with restarts.