Real-time monitoring dashboard
The Hill Climber package includes a real-time monitoring dashboard built with Streamlit and SQLite for visualizing optimization progress as it runs. The dashboard uses a modular architecture separating data loading, UI components, and plot generation.
Features
The dashboard provides:
Replica leaderboard: Top-performing replicas with current objectives, steps, and temperatures
Progress statistics:
- Exploration rate (total perturbations per second across all replicas)
- Progress rate (accepted steps per second)
- Acceptance rate percentage
Interactive time series plots: Plotly charts for metrics over time with zoom and pan
Temperature exchange markers: Optional visualization of replica exchange events
Configurable refresh rate: Adjust polling frequency (0.5-5 minutes)
Plot options:
- Metric selection (Best vs Current history)
- Normalization toggle
- Layout control (1 or 2 columns)
- Downsampling for performance
Run information: Objective function name, dataset size, hyperparameters, and initial temperatures
Installation
The dashboard requires additional dependencies. Install them with the dashboard extra:
pip install parallel-hill-climber[dashboard]
This will install:
streamlit: Web dashboard framework
plotly: Interactive plotting library
Usage
Enabling database logging
To use the dashboard, enable database logging in your HillClimber instance:
from hill_climber import HillClimber
climber = HillClimber(
    data=data,
    objective_func=my_objective,
    db_enabled=True,              # Enable database logging
    db_path='my_optimization.db', # Optional: custom path
    db_step_interval=100,         # Optional: collect every 100th step
    checkpoint_interval=10,       # Optional: checkpoint every 10 batches
    # ... other parameters
)
best_data = climber.climb()
Launching the dashboard
While your optimization is running (or after it completes), launch the dashboard:
hill-climber-dashboard
Then navigate to http://localhost:8501 in your browser. The dashboard will automatically discover databases in common locations, or you can select a specific database file using the sidebar controls.
Dashboard configuration
Configure the dashboard using the sidebar:
Database selection: Choose directory and database file from dropdowns
Auto-refresh: Enable/disable automatic updates
Refresh interval: Set polling frequency (0.5-5 minutes)
History type: Select Best (monotonic improvement) or Current (includes exploration)
Additional metrics: Select extra metrics beyond the objective to plot
Normalize: Toggle metric normalization to [0, 1]
Exchange markers: Show vertical lines at temperature exchange events
Plot layout: Choose 1 or 2 column display
Database configuration parameters
The database logging system uses an efficient collection and write strategy:
Worker processes collect metrics at regular intervals during optimization
Main process performs all database writes after each batch, avoiding lock contention
No buffering needed: workers return collected metrics to the main process
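As a purely illustrative sketch of this pattern (the helper names and the per-step evaluate callable below are hypothetical, not the package's internals), workers could hand back sampled rows that the main process writes into the metrics_history table described under Database schema below:

import random
import sqlite3

def collect_batch(replica_id, start_step, n_steps, db_step_interval, evaluate):
    """Worker side: run one batch and return sampled metric rows (no database access)."""
    rows = []
    for i in range(1, n_steps + 1):
        value = evaluate()                    # stand-in for one perturbation/evaluation
        if i % db_step_interval == 0:         # sample every db_step_interval-th step
            rows.append((replica_id, start_step + i, 'Objective', value))
    return rows

def write_batch(db_path, rows):
    """Main-process side: the single writer, one transaction per batch."""
    with sqlite3.connect(db_path) as conn:
        conn.executemany(
            "INSERT INTO metrics_history (replica_id, step, metric_name, value) "
            "VALUES (?, ?, ?, ?)",
            rows,
        )

# Workers collect, the main process writes once per batch:
collected = [collect_batch(r, 0, 100, 100, random.random) for r in range(4)]
# write_batch('hill_climb.db', [row for rows in collected for row in rows])

Because only the main process ever writes, the SQLite file lock is never contended between workers.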
- db_enabled : bool, default=True
Enable database logging for dashboard monitoring
- db_path : str, optional
Path to SQLite database file. Defaults to '../data/hill_climb.db'.
- db_step_interval : int, optional
Collect metrics every Nth step. If not specified, a tiered default is chosen based on exchange_interval (see the sketch after this parameter list):
exchange_interval < 10: sample every step (1)
exchange_interval 10-99: sample every 10 steps
exchange_interval 100-999: sample every 100 steps
exchange_interval >= 1000: sample every 1000 steps
Must be less than or equal to exchange_interval to ensure at least one collection per batch.
- checkpoint_interval : int, default=1
Number of batches between checkpoint saves. Default is 1 (checkpoint every batch). Set to a higher value (e.g., 10) to reduce checkpoint I/O while the database provides real-time monitoring.
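As a rough sketch of the tiered defaults listed above (the function name default_db_step_interval is hypothetical, not part of the package API):

def default_db_step_interval(exchange_interval):
    """Illustrative tiered default for db_step_interval, following the rules above."""
    if exchange_interval < 10:
        return 1       # sample every step
    elif exchange_interval < 100:
        return 10      # sample every 10 steps
    elif exchange_interval < 1000:
        return 100     # sample every 100 steps
    else:
        return 1000    # sample every 1000 steps

# e.g. default_db_step_interval(100) -> 100, default_db_step_interval(5000) -> 1000

Each tier keeps the interval at or below exchange_interval, so every batch yields at least one collection.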
Performance tuning
Default settings (recommended)
climber = HillClimber(
    data=data,
    objective_func=objective,
    exchange_interval=100,
    db_enabled=True,
    # db_step_interval defaults to 100 (one sample per batch)
)
This provides a good balance between resolution and performance:
One sample collected per replica per batch (exchange_interval / db_step_interval = 100 / 100)
Main process writes all collected metrics once per batch
No worker I/O contention
Higher resolution (more database load)
climber = HillClimber(
    data=data,
    objective_func=objective,
    exchange_interval=1000,
    db_enabled=True,
    db_step_interval=100  # Collect every 100th step instead of default 1000
)
Collects 10 samples per replica per batch instead of 1 (10x higher resolution).
Database schema
The database contains four tables:
run_metadata
Stores run configuration and hyperparameters:
run_id: Always 1 (single run per database)
start_time: Unix timestamp when optimization started
n_replicas: Number of replicas
exchange_interval: Steps between exchange attempts
db_step_interval: Step collection frequency
hyperparameters: JSON-encoded hyperparameters
replica_status
Current state of each replica (updated after each batch):
replica_id: Replica identifier (0 to n_replicas-1)
step: Current step number
temperature: Current temperature
best_objective: Best objective value found
current_objective: Current objective value
timestamp: Unix timestamp of last update
metrics_history
Time series of metrics (sampled according to db_step_interval):
replica_id: Replica identifier
step: Step number when metric was recorded
metric_name: Name of the metric
value: Metric value
Indexed on (replica_id, step) for fast queries.
temperature_exchanges
Record of temperature swaps between replicas:
step: Step number when exchange occurred
replica_id: Replica that received new temperature
new_temperature: New temperature after exchange
timestamp: Unix timestamp
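Because the log is plain SQLite, it can also be queried outside the dashboard. For example, a minimal sketch that reproduces the replica leaderboard from replica_status (adjust the path to your db_path; the descending sort assumes a maximized objective):

import sqlite3

conn = sqlite3.connect('my_optimization.db')   # use your configured db_path
leaderboard = conn.execute(
    """
    SELECT replica_id, step, temperature, best_objective, current_objective
    FROM replica_status
    ORDER BY best_objective DESC
    """
).fetchall()
for replica_id, step, temperature, best, current in leaderboard:
    print(f"replica {replica_id}: step={step}, T={temperature:.3g}, "
          f"best={best:.4f}, current={current:.4f}")
conn.close()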
Checkpoint independence
Database logging and checkpointing are decoupled for flexibility:
Database: Provides real-time progress monitoring with configurable granularity
Checkpoints: Provide full state recovery with configurable frequency
This allows you to:
Monitor progress every batch while checkpointing every 10 batches
Reduce checkpoint file I/O overhead
Accept partial batch loss on crashes (database provides progress visibility, checkpoints provide recovery)
Example:
climber = HillClimber(
    data=data,
    objective_func=objective,
    checkpoint_file='optimization.pkl',
    checkpoint_interval=10,  # Checkpoint every 10 batches
    db_enabled=True,
    db_path='optimization.db'  # Monitor every batch
)
Complete example
import numpy as np
import pandas as pd
from hill_climber import HillClimber
# Generate data
np.random.seed(42)
data = pd.DataFrame({
    'x': np.random.randn(1000),
    'y': np.random.randn(1000)
})
# Define objective
def objective(x, y):
    corr = np.corrcoef(x, y)[0, 1]
    return {'Correlation': corr}, corr
# Create optimizer with database enabled
climber = HillClimber(
    data=data,
    objective_func=objective,
    max_time=30,
    n_replicas=4,
    exchange_interval=100,
    db_enabled=True,
    db_path='correlation_opt.db',
    checkpoint_file='correlation_opt.pkl',
    checkpoint_interval=5  # Checkpoint every 5 batches
)
# Run optimization
best_data = climber.climb()
Then in a separate terminal:
hill-climber-dashboard
The dashboard will automatically discover the correlation_opt.db database file, or you can select it using the sidebar controls.
Troubleshooting
Database file not found
Ensure your HillClimber instance has db_enabled=True
Check that the database path in the dashboard matches your configuration
Verify the optimization has started and completed at least one batch
No data appearing
Wait for the first batch to complete (exchange_interval steps)
Check the “Run Information” in the sidebar to verify database configuration
Ensure auto-refresh is enabled or click “Refresh Now”
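If data still does not appear, querying the database directly is a quick way to confirm whether anything has been written yet (a sketch; substitute your configured db_path):

import sqlite3

conn = sqlite3.connect('my_optimization.db')   # use your configured db_path
n_status = conn.execute("SELECT COUNT(*) FROM replica_status").fetchone()[0]
n_rows = conn.execute("SELECT COUNT(*) FROM metrics_history").fetchone()[0]
print(f"{n_status} replicas reporting, {n_rows} metric rows logged")
conn.close()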
Slow dashboard updates
Reduce the number of metrics displayed
Increase the refresh interval
Increase db_step_interval to reduce database size
Slow optimization performance
Increase db_step_interval to reduce collection overhead
Consider disabling database logging (db_enabled=False) for production runs
Use checkpoints for state recovery instead of database monitoring
Database size estimation
With default settings:
exchange_interval=100
n_replicas=8
20 metrics
db_step_interval=100 (default: one sample per batch)
Results in:
100 steps collected per replica per batch
16,000 metric rows per batch
For a 30-minute run (~1000 batches): ~16M rows → 1-2 GB database
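Spelled out, the arithmetic behind these figures looks like this (the ~100 bytes per row used for the size estimate is an assumption):

n_replicas = 8
n_metrics = 20
samples_per_replica_per_batch = 100   # figure used in the estimate above
n_batches = 1000                      # roughly a 30-minute run

rows_per_batch = n_replicas * n_metrics * samples_per_replica_per_batch  # 16,000
total_rows = rows_per_batch * n_batches                                  # 16,000,000
approx_gb = total_rows * 100 / 1e9                                       # ~1.6 GB
print(rows_per_batch, total_rows, round(approx_gb, 1))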
To reduce size, increase db_step_interval:
# With exchange_interval >= 1000, db_step_interval defaults to 1000
exchange_interval = 5000
# Metrics are now sampled every 1000th step instead of every 100th
See also
User guide: Core optimization concepts
API reference: Complete API reference
Advanced topics: Advanced features and customization