2.6.2. Data Assimilation

This page gives a summary of the functions available in the data assimilation library. Differentiation of persistence diagrams is exploited to optimize data driven model coefficients by minimizing topological differences between model the model forecast and measurements. More information on the details of the TADA algorithm can be found in, “Topological Approach for Data Assimilation.” We plan to implement more data assimilation tools here in the future.

Warning

TADA requires tensorflow for optimization features. Please install teaspoon using the command: pip install “teaspoon[full]” to install the necessary packages.

teaspoon.DAF.data_assimilation.TADA(u_obs, window_size, model_parameters, n_epochs=1, train_len=4000, opt_params=[1e-05, 0.99], window_number=1)[source]

Compute optimal model weights using data assimilation and persistent homology. Heavily modified code from https://github.com/GUDHI/TDA-tutorial.git

Parameters:
  • u_obs (array) – Array of observations (D x N) D is the dimension, N is the number of time points (INCLUDING TRAINING DATA)

  • window_size (int) – Number of points included in the sliding window

  • model_parameters (list) – List of parameters used to generate a forecast. Must contain current model weights (W_A), original model weights (W_LR), random feature map weight matrix (W_in) and random feature map bias vector (b_in).

  • n_epochs (int) – Number of optimization epochs for each assimilation window

  • train_len (int) – Number of points used for training the original model

  • opt_params (list) – List of parameters used for gradient descent optimization. Must contain a learning rate and decay rate. Decay only occurs between assimilation windows.

  • window_number (int) – Current window number. Used to determine how many points to forecast into the future for the current window.

Returns:

Optimal model weights using TADA algorithm.

Return type:

(array)

teaspoon.DAF.data_assimilation.get_initial_pds(X)[source]

Function to compute the initial persistence diagrams of the measurements and model forecast using the Vietoris Rips complex.

Parameters:

X (array) – Point cloud array for computing persistence.

Returns:

0D and 1D persistence pairs.

Return type:

(2 arrays)

Example:

import numpy as np
from teaspoon.MakeData.DynSysLib.autonomous_dissipative_flows import lorenz
from teaspoon.DAF.data_assimilation import TADA
from teaspoon.DAF.forecasting import forecast_time
from teaspoon.DAF.forecasting import random_feature_map_model
from teaspoon.DAF.forecasting import get_forecast

# Set random seed
r_seed = 48824
np.random.seed(r_seed)

# Set TADA parameters
snr = 60.0
lr = 1e-5
train_len = 4000
forecast_len = 2000
window_size = 50
max_window_number = 10

# Get training and validation data at random initial condition
ICs = list(np.random.normal(size=(3,1)).reshape(-1,))
t, ts = lorenz(L=500, fs=50, SampleSize=6001, parameters=[28,10.0,8.0/3.0],InitialConditions=ICs)
ts = np.array(ts)

# Get signal and noise amplitudes using signal-to-noise ratio
a_sig = np.sqrt(np.mean(np.square(ts),axis=1))
a_noise = a_sig * 10**(-snr/20)

# Add noise to the signal
Gamma = np.diag(a_noise)
noise = np.random.normal(size=np.shape(ts[:,0:train_len+forecast_len]))
u_obs = ts[:,0:train_len+forecast_len] + Gamma@noise

# Train model
W_LR, W_in, b_in = random_feature_map_model(u_obs[:,0:train_len],Dr=300, seed=r_seed)

# Set optimization parameters
d_rate = 0.99
opt_params = [lr, d_rate]
W_opt = W_LR

# TADA optimization loop
for window_number in range(1,max_window_number):
    model_parameters = [W_opt, W_LR, W_in, b_in]
    W_opt = TADA(u_obs, window_size, model_parameters, train_len=train_len, n_epochs=1, opt_params=opt_params, window_number=window_number)

# Set forecast parameters
window_number = 200
start = train_len
end = train_len + window_number + 1

# Forecast TADA and LR models and get measurements
X_model_tada = get_forecast(u_obs[:,train_len], W_opt, W_in, b_in,forecast_len=end-train_len)
X_model_lr =get_forecast(u_obs[:,train_len], W_LR, W_in, b_in,forecast_len=end-train_len)
X_meas = u_obs[:,start:end]

# Compute and print forecast times
tada_time = forecast_time(X_model_tada, X_meas, dt=0.02, lambda_max=0.91, threshold=0.05)
lr_time = forecast_time(X_model_lr, X_meas, dt=0.02, lambda_max=0.91, threshold=0.05)

print(f"TADA Forecast Time: {tada_time}")
print(f"LR Forecast Time: {lr_time}")

Note

Resulting forecast times may vary depending on the operating system.