Rosenbrock function with the Neighbourhood Algorithm#

Open In Colab

If you are running this notebook locally, make sure you’ve followed steps here to set up the environment. (This environment.yml file specifies a list of packages required to run the notebooks)


1. Introduction#

1.1 Rosenbrock function#

The Rosenbrock function is a classic optimisation test function given by

\[f(x,y) = (a - x)^2 + b(y - x^2)^2\]

with \(a = 1\) and \(b = 100\). The function has a global minimum at \((a, a^2) = (1, 1)\) where \(f(1,1) = 0\).

The minimum lies inside a narrow, curved valley. Finding the valley is easy, but converging to the minimum within it is challenging for many optimisers.

We work with the \(\log_{10}\)-scaled version to avoid very large/small values:

\[F(x,y) = \log_{10}\left[(a - x)^2 + b(y - x^2)^2\right]\]

1.2 Neighbourhood Algorithm#

# display theory on the Neighbourhood Algorithm
from IPython.display import display, Markdown

with open("../../theory/neighbourhood_algorithm.md", "r") as f:
    content = f.read()

display(Markdown(content))
<IPython.core.display.Markdown object>

1.3 CoFI and the NA#

The implementation of the NA that cofi wraps is called `neighpy <inlab-geo/neighpy>`__. This implementation implements both phases of the NA as described in the original papers.

Because of the multi-phase nature of the NA, cofi gives you 3 options: 1. Run both phases, using the Direct Search Phase samples as the initial ensemble for the Appraisal Phase (tool: neighpy) 2. Only run the Direct Search Phase (tool: neighpyI) 3. Only run the Appraisal Phase, using samples obtained from any method (tool: neighpyII)

We will look at all of these options in this notebook.


2. Import modules#

# -------------------------------------------------------- #
#                                                          #
#     Uncomment below to set up environment on "colab"     #
#                                                          #
# -------------------------------------------------------- #

# !pip install -U cofi
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
from scipy.spatial import Voronoi, voronoi_plot_2d

from cofi import BaseProblem, InversionOptions, Inversion

np.random.seed(42)

3. Define the problem#

def rosenbrock(params, a=1, b=100):
    """Log10-scaled Rosenbrock function."""
    return np.log10((a - params[0])**2 + b * (params[1] - params[0]**2)**2)
# Contour plot of the Rosenbrock function
X, Y = np.meshgrid(np.linspace(-2, 2, 1000), np.linspace(-1, 3, 1000))
Z = np.log10((1 - X)**2 + 100 * (Y - X**2)**2)

plt.figure(figsize=(8, 6))
plt.imshow(Z, origin="lower", extent=(-2, 2, -1, 3), aspect="auto")
plt.colorbar(label="$\\log_{10}(f)$")
plt.scatter(1, 1, c="r", marker="x", s=100, zorder=10, label="Global minimum (1, 1)")
plt.xlabel("x")
plt.ylabel("y")
plt.title("Rosenbrock function")
plt.legend(loc="upper center")
<matplotlib.legend.Legend object at 0x7efcbfc411d0>
# Define the problem in CoFI
inv_problem = BaseProblem()
inv_problem.name = "Rosenbrock Function"
inv_problem.set_objective(rosenbrock)

inv_problem.summary()
=====================================================================
Summary for inversion problem: Rosenbrock Function
=====================================================================
Model shape: Unknown
---------------------------------------------------------------------
List of functions/properties set by you:
['objective']
---------------------------------------------------------------------
List of functions/properties created based on what you have provided:
-- none --
---------------------------------------------------------------------
List of functions/properties that can be further set for the problem:
( not all of these may be relevant to your inversion workflow )
['log_posterior', 'log_posterior_with_blobs', 'log_likelihood', 'log_prior', 'gradient', 'hessian', 'hessian_times_vector', 'residual', 'jacobian', 'jacobian_times_vector', 'data_misfit', 'regularization', 'regularization_matrix', 'forward', 'data', 'data_covariance', 'data_covariance_inv', 'initial_model', 'model_shape', 'blobs_dtype', 'bounds', 'constraints']

4. Full NA (direct search + appraisal)#

Here we run both phases of the NA in one go using the neighpy tool.

bounds = [(-2, 2), (-1, 3)]
n_initial_samples = 100
n_samples_per_iteration = 70
n_cells_to_resample = 10
n_iterations = 20
n_resample = 50000
n_walkers = 10

inv_options = InversionOptions()
inv_options.set_tool("neighpy")
inv_options.set_params(
    bounds=bounds,
    n_initial_samples=n_initial_samples,
    n_samples_per_iteration=n_samples_per_iteration,
    n_cells_to_resample=n_cells_to_resample,
    n_iterations=n_iterations,
    n_resample=n_resample,
    n_walkers=n_walkers,
)
inv = Inversion(inv_problem, inv_options)
inv_result = inv.run()
inv_result.summary()
NAI - Initial Random Search

NAI - Optimisation Loop:   0%|          | 0/20 [00:00<?, ?it/s]
NAI - Optimisation Loop:   5%|▌         | 1/20 [00:00<00:04,  4.29it/s]
NAI - Optimisation Loop:  30%|███       | 6/20 [00:00<00:00, 21.00it/s]
NAI - Optimisation Loop:  55%|█████▌    | 11/20 [00:00<00:00, 30.36it/s]
NAI - Optimisation Loop:  80%|████████  | 16/20 [00:00<00:00, 36.08it/s]
NAI - Optimisation Loop: 100%|██████████| 20/20 [00:00<00:00, 31.37it/s]
============================
Summary for inversion result
============================
SUCCESS
----------------------------
model: [1.07679749 1.1598174 ]
direct_search_samples: [[1.15976313 1.2663329 ]
 [0.27603632 1.95840675]
 [1.30366831 0.88045826]
 ...
 [1.07685765 1.1598225 ]
 [1.07685438 1.15982338]
 [1.07685384 1.1598245 ]]
direct_search_objectives: [-0.19032542  2.54997852  1.82726273 ... -2.22833152 -2.22834468
 -2.22834377]
appraisal_samples: [[ 1.18791177  1.80697917]
 [ 0.18183544  0.23945759]
 [ 0.54951805  0.31364752]
 ...
 [ 0.40972822  1.48295012]
 [ 1.24207273 -0.33868541]
 [ 0.46144746 -0.07465434]]
best = inv_result.model
ds_samples = inv_result.direct_search_samples
ds_objectives = inv_result.direct_search_objectives
appraisal_samples = inv_result.appraisal_samples

print(f"Best model: x={best[0]:.4f}, y={best[1]:.4f}")
print(f"True minimum: x=1, y=1")
Best model: x=1.0768, y=1.1598
True minimum: x=1, y=1

Appraisal samples#

fig = voronoi_plot_2d(
    Voronoi(ds_samples), show_vertices=False, line_width=0.5, line_colors="w"
)
ax = fig.gca()
im = ax.imshow(Z, origin="lower", extent=(-2, 2, -1, 3), aspect="auto")
fig.colorbar(im)
_truth = ax.scatter(1, 1, c="r", marker="x", s=100, zorder=2, label="True minimum")
_best = ax.scatter(*best, c="k", marker="+", s=100, zorder=2, label="Best sample (NA-I)")
_resample = ax.scatter(
    *appraisal_samples.T, s=0.5, c="grey", zorder=0, label="Resampled points (NA-II)"
)
_voronoi = Line2D([0], [0], marker="o", label="Voronoi samples (NA-I)", markersize=5, linewidth=0)
ax.set_xlim(-2, 2)
ax.set_ylim(-1, 3)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend(
    handles=[_truth, _best, _voronoi, _resample], framealpha=1, edgecolor="black"
)
ax.set_title("NA-I and NA-II on Rosenbrock function")
Text(0.5, 1.0, 'NA-I and NA-II on Rosenbrock function')

Conditional distributions#

fig, axs = plt.subplots(
    2, 2,
    gridspec_kw=dict(height_ratios=[1, 5], width_ratios=[5, 1]),
    figsize=(7, 7),
    tight_layout=True,
)
axs[0, 1].set_visible(False)

# Conditional posterior samples p(x|y=1)
y_hist_edges = np.histogram_bin_edges(appraisal_samples[:, 1], bins=50, range=(-1, 3))
best_ind_y = np.digitize(1, y_hist_edges)
x_given_y = appraisal_samples[
    (appraisal_samples[:, 1] > y_hist_edges[best_ind_y - 1])
    & (appraisal_samples[:, 1] < y_hist_edges[best_ind_y]),
    0,
]
axs[0, 0].hist(x_given_y, bins=50, orientation="vertical", color="grey")
axs[0, 0].axvline(1, c="r", ls="--", lw=1)
axs[0, 0].set_xlim(-2, 2)
axs[0, 0].set_xticks([])
axs[0, 0].text(
    0.05, 0.9, "p(x|y=1)",
    transform=axs[0, 0].transAxes, fontsize=12, verticalalignment="top",
)

# Conditional posterior samples p(y|x=1)
x_hist_edges = np.histogram_bin_edges(appraisal_samples[:, 0], bins=50, range=(-2, 2))
best_ind_x = np.digitize(1, x_hist_edges)
y_given_x = appraisal_samples[
    (appraisal_samples[:, 0] > x_hist_edges[best_ind_x - 1])
    & (appraisal_samples[:, 0] < x_hist_edges[best_ind_x]),
    1,
]
axs[1, 1].hist(y_given_x, bins=50, orientation="horizontal", color="grey")
axs[1, 1].axhline(1, c="r", ls="--", lw=1)
axs[1, 1].set_ylim(-1, 3)
axs[1, 1].set_yticks([])
axs[1, 1].text(
    0.05, 0.9, "p(y|x=1)",
    transform=axs[1, 1].transAxes, fontsize=12, verticalalignment="bottom",
)

# Full posterior samples
im = axs[1, 0].imshow(
    Z, origin="lower", extent=(-2, 2, -1, 3), aspect="auto"
)
_truth = axs[1, 0].scatter(
    1, 1, c="r", marker="x", s=50, zorder=2, label="True minimum"
)
_best = axs[1, 0].scatter(
    *best, c="k", marker="+", s=50, zorder=2, label="Best sample (NA-I)"
)
_resample = axs[1, 0].scatter(
    *appraisal_samples.T, s=0.5, c="grey", zorder=0, label="Resampled points (NA-II)"
)
axs[1, 0].set_xlim(-2, 2)
axs[1, 0].set_ylim(-1, 3)
axs[1, 0].set_xlabel("x")
axs[1, 0].set_ylabel("y")
axs[1, 0].legend(
    handles=[_truth, _best, _resample], framealpha=1, edgecolor="black",
)

fig.suptitle("Conditional distributions from NA-II on Rosenbrock function")
Text(0.5, 0.98, 'Conditional distributions from NA-II on Rosenbrock function')

5. Direct search only#

If you only want to optimise the objective function and find a point estimate, you can run just the direct search phase using the neighpyI tool.

inv_options_ds = InversionOptions()
inv_options_ds.set_tool("neighpyI")
inv_options_ds.set_params(
    bounds=bounds,
    n_initial_samples=n_initial_samples,
    n_samples_per_iteration=n_samples_per_iteration,
    n_cells_to_resample=n_cells_to_resample,
    n_iterations=n_iterations,
)

inv_ds = Inversion(inv_problem, inv_options_ds)
inv_result_ds = inv_ds.run()
inv_result_ds.summary()
NAI - Initial Random Search

NAI - Optimisation Loop:   0%|          | 0/20 [00:00<?, ?it/s]
NAI - Optimisation Loop:  25%|██▌       | 5/20 [00:00<00:00, 44.44it/s]
NAI - Optimisation Loop:  50%|█████     | 10/20 [00:00<00:00, 45.75it/s]
NAI - Optimisation Loop:  75%|███████▌  | 15/20 [00:00<00:00, 46.40it/s]
NAI - Optimisation Loop: 100%|██████████| 20/20 [00:00<00:00, 46.74it/s]
NAI - Optimisation Loop: 100%|██████████| 20/20 [00:00<00:00, 46.35it/s]
============================
Summary for inversion result
============================
SUCCESS
----------------------------
model: [1.28686446 1.6562155 ]
samples: [[-1.67484822  2.15015736]
 [-1.03846947  1.15364817]
 [-1.95464601 -0.95437107]
 ...
 [ 1.2869385   1.6572716 ]
 [ 1.28693635  1.65727174]
 [ 1.28693689  1.65727276]]
objectives: [ 1.69942117  0.67406187  3.3596086  ... -1.08382908 -1.08382922
 -1.083828  ]
best_ds = inv_result_ds.model
ds_samples_only = inv_result_ds.samples
ds_objectives_only = inv_result_ds.objectives

print(f"Best model: x={best_ds[0]:.4f}, y={best_ds[1]:.4f}")
Best model: x=1.2869, y=1.6562
fig = voronoi_plot_2d(
    Voronoi(ds_samples_only), show_vertices=False, line_width=0.5, line_colors="w"
)
ax = fig.gca()
im = ax.imshow(Z, origin="lower", extent=(-2, 2, -1, 3), aspect="auto")
fig.colorbar(im)
_truth = ax.scatter(1, 1, c="r", marker="x", s=100, zorder=10, label="True minimum")
_best = ax.scatter(*best_ds, c="k", marker="+", s=100, zorder=10, label="Best sample (NA-I)")
_voronoi = Line2D([0], [0], marker="o", label="Voronoi samples (NA-I)", markersize=5, linewidth=0)
ax.set_xlim(-2, 2)
ax.set_ylim(-1, 3)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend(handles=[_truth, _best, _voronoi], framealpha=1, edgecolor="black")
ax.set_title("NA-I Direct Search on Rosenbrock function")
Text(0.5, 1.0, 'NA-I Direct Search on Rosenbrock function')

6. Appraisal only#

The appraisal phase is implemented in the neighpyII tool. It takes a set of samples and their corresponding log posterior probability density.

Note that the direct search minimises an objective, but the appraisal maximises a posterior. So we pass -objectives as log_ppd.

inv_options_app = InversionOptions()
inv_options_app.set_tool("neighpyII")
inv_options_app.set_params(
    bounds=bounds,
    initial_ensemble=ds_samples_only,
    log_ppd=-ds_objectives_only,
    n_resample=n_resample,
    n_walkers=n_walkers,
)

inv_app = Inversion(inv_problem, inv_options_app)
inv_result_app = inv_app.run()
inv_result_app.summary()
============================
Summary for inversion result
============================
SUCCESS
----------------------------
new_samples: [[ 1.37579996  1.71047836]
 [ 0.89175276  1.67088524]
 [ 1.48016363  2.03048887]
 ...
 [ 1.14469765  1.2165956 ]
 [ 1.01351183  1.36735017]
 [-0.8633595   1.50488675]]
appraisal_samples_only = inv_result_app.new_samples

Appraisal samples#

fig = voronoi_plot_2d(
    Voronoi(ds_samples_only), show_vertices=False, line_width=0.5, line_colors="w"
)
ax = fig.gca()
im = ax.imshow(Z, origin="lower", extent=(-2, 2, -1, 3), aspect="auto")
fig.colorbar(im)
_truth = ax.scatter(1, 1, c="r", marker="x", s=100, zorder=2, label="True minimum")
_best = ax.scatter(*best_ds, c="k", marker="+", s=100, zorder=2, label="Best sample (NA-I)")
_resample = ax.scatter(
    *appraisal_samples_only.T, s=0.5, c="grey", zorder=0, label="Resampled points (NA-II)"
)
_voronoi = Line2D([0], [0], marker="o", label="Voronoi samples (NA-I)", markersize=5, linewidth=0)
ax.set_xlim(-2, 2)
ax.set_ylim(-1, 3)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend(
    handles=[_truth, _best, _voronoi, _resample], framealpha=1, edgecolor="black"
)
ax.set_title("NA-II Appraisal on Rosenbrock function")
Text(0.5, 1.0, 'NA-II Appraisal on Rosenbrock function')

Conditional distributions#

fig, axs = plt.subplots(
    2, 2,
    gridspec_kw=dict(height_ratios=[1, 5], width_ratios=[5, 1]),
    figsize=(7, 7),
    tight_layout=True,
)
axs[0, 1].set_visible(False)

# Conditional posterior samples p(x|y=1)
y_hist_edges = np.histogram_bin_edges(appraisal_samples_only[:, 1], bins=50, range=(-1, 3))
best_ind_y = np.digitize(1, y_hist_edges)
x_given_y = appraisal_samples_only[
    (appraisal_samples_only[:, 1] > y_hist_edges[best_ind_y - 1])
    & (appraisal_samples_only[:, 1] < y_hist_edges[best_ind_y]),
    0,
]
axs[0, 0].hist(x_given_y, bins=50, orientation="vertical", color="grey")
axs[0, 0].axvline(1, c="r", ls="--", lw=1)
axs[0, 0].set_xlim(-2, 2)
axs[0, 0].set_xticks([])
axs[0, 0].text(
    0.05, 0.9, "p(x|y=1)",
    transform=axs[0, 0].transAxes, fontsize=12, verticalalignment="top",
)

# Conditional posterior samples p(y|x=1)
x_hist_edges = np.histogram_bin_edges(appraisal_samples_only[:, 0], bins=50, range=(-2, 2))
best_ind_x = np.digitize(1, x_hist_edges)
y_given_x = appraisal_samples_only[
    (appraisal_samples_only[:, 0] > x_hist_edges[best_ind_x - 1])
    & (appraisal_samples_only[:, 0] < x_hist_edges[best_ind_x]),
    1,
]
axs[1, 1].hist(y_given_x, bins=50, orientation="horizontal", color="grey")
axs[1, 1].axhline(1, c="r", ls="--", lw=1)
axs[1, 1].set_ylim(-1, 3)
axs[1, 1].set_yticks([])
axs[1, 1].text(
    0.05, 0.9, "p(y|x=1)",
    transform=axs[1, 1].transAxes, fontsize=12, verticalalignment="bottom",
)

# Full posterior samples
im = axs[1, 0].imshow(
    Z, origin="lower", extent=(-2, 2, -1, 3), aspect="auto"
)
_truth = axs[1, 0].scatter(
    1, 1, c="r", marker="x", s=50, zorder=2, label="True minimum"
)
_best = axs[1, 0].scatter(
    *best_ds, c="k", marker="+", s=50, zorder=2, label="Best sample (NA-I)"
)
_resample = axs[1, 0].scatter(
    *appraisal_samples_only.T, s=0.5, c="grey", zorder=0, label="Resampled points (NA-II)"
)
axs[1, 0].set_xlim(-2, 2)
axs[1, 0].set_ylim(-1, 3)
axs[1, 0].set_xlabel("x")
axs[1, 0].set_ylabel("y")
axs[1, 0].legend(
    handles=[_truth, _best, _resample], framealpha=1, edgecolor="black",
)

fig.suptitle("Conditional distributions from NA-II on Rosenbrock function")
Text(0.5, 0.98, 'Conditional distributions from NA-II on Rosenbrock function')

Watermark#

watermark_list = ["cofi", "numpy", "scipy", "matplotlib"]
for pkg in watermark_list:
    pkg_var = __import__(pkg)
    print(pkg, getattr(pkg_var, "__version__"))
cofi 0.2.11+68.gc319bc9
numpy 2.2.6
scipy 1.17.1
matplotlib 3.10.9

sphinx_gallery_thumbnail_number = -1

Total running time of the script: (0 minutes 34.566 seconds)

Gallery generated by Sphinx-Gallery