Example 07: Augmentations, Auxiliary Losses, and Training Options¶
TSFast provides composable building blocks that customize training without modifying model code:
- Augmentations modify training data on-the-fly (noise, bias, sequence length variation)
- Auxiliary losses add regularization terms to the main loss (activation smoothing, gradient penalties)
- Training options control optimizer behavior (gradient clipping)
This example demonstrates each category and shows how to combine them.
Setup¶
from tsfast.tsdata.benchmark import create_dls_silverbox
from tsfast.models.rnn import RNNLearner
from tsfast.models.scaling import unwrap_model
from tsfast.training import (
    fun_rmse,
    ActivationRegularizer,
    TemporalActivationRegularizer,
    noise, vary_seq_len, truncate_sequence,
)
Load the Dataset¶
dls = create_dls_silverbox(bs=16, win_sz=500, stp_sz=10)
Data Augmentations¶
Augmentations modify training data on-the-fly. They only apply during
training (not validation or test), so your evaluation metrics stay
comparable. Pass them as augmentations=[...] when creating the Learner.
noise¶
Adds Gaussian noise to input signals. std controls the noise magnitude.
lrn_noisy = RNNLearner(
    dls, rnn_type='lstm', metrics=[fun_rmse],
    augmentations=[noise(std=0.05)],
)
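To make the effect of std concrete, here is a minimal standard-library sketch of additive Gaussian noise. The helper add_gaussian_noise is hypothetical and written only for illustration; it is not TSFast's noise implementation, which operates on batched tensors.

```python
import random


def add_gaussian_noise(signal, std, rng):
    """Return a copy of `signal` with zero-mean Gaussian noise of scale `std` added.

    Hypothetical helper for illustration; not part of TSFast.
    """
    return [x + rng.gauss(0.0, std) for x in signal]


rng = random.Random(0)
clean = [0.0] * 1000
noisy = add_gaussian_noise(clean, std=0.05, rng=rng)

# Empirical standard deviation of the perturbation; it should land near std.
mean = sum(noisy) / len(noisy)
emp_std = (sum((x - mean) ** 2 for x in noisy) / len(noisy)) ** 0.5
print(round(emp_std, 3))
```

Because the noise is redrawn on every call, the model never sees exactly the same input window twice.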
bias¶
Adds a constant offset per signal per sample. This simulates sensor drift or calibration errors, making the model more robust to such shifts.
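The idea of "one offset per signal per sample" can be sketched in plain Python as follows. The helper add_bias, its std parameter, and the choice of a Gaussian offset distribution are assumptions made for illustration; the actual TSFast bias augmentation and its arguments may differ.

```python
import random


def add_bias(sample, std, rng):
    """Add one constant offset per signal (channel) to a single sample.

    `sample` is a list of channels, each a list of values over time.
    Hypothetical helper: the Gaussian offset is an assumption, not TSFast's API.
    """
    shifted = []
    for channel in sample:
        offset = rng.gauss(0.0, std)  # drawn once, then held for the whole sequence
        shifted.append([x + offset for x in channel])
    return shifted


rng = random.Random(0)
sample = [[0.0, 1.0, 2.0], [10.0, 10.0, 10.0]]
shifted = add_bias(sample, std=0.1, rng=rng)

for original, moved in zip(sample, shifted):
    # Every time step of a channel is shifted by the same amount.
    print([round(m - o, 4) for o, m in zip(original, moved)])
```

Unlike noise, the perturbation is constant along the time axis, so the shape of each signal is preserved and only its level changes.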
Training with Augmentation¶
Train two models -- one with augmentation, one without -- to see the effect on validation performance.
lrn_base = RNNLearner(dls, rnn_type='lstm', metrics=[fun_rmse])
lrn_base.fit_flat_cos(n_epoch=5, lr=3e-3)
print(f"Without augmentation: {lrn_base.validate()}")
Training: 100% | 1500/1500 [00:24<00:00, 60.38it/s]
epoch 1 | train=0.0112 | valid=0.0075 | fun_rmse=0.0118
epoch 2 | train=0.0062 | valid=0.0053 | fun_rmse=0.0103
epoch 3 | train=0.0052 | valid=0.0063 | fun_rmse=0.0108
epoch 4 | train=0.0049 | valid=0.0044 | fun_rmse=0.0098
epoch 5 | train=0.0035 | valid=0.0029 | fun_rmse=0.0095
Without augmentation: (0.0029070964083075523, {'fun_rmse': 0.009518533013761044})
lrn_noisy.fit_flat_cos(n_epoch=5, lr=3e-3)
print(f"With noise augmentation: {lrn_noisy.validate()}")
Training: 100% | 1500/1500 [00:25<00:00, 58.66it/s]
epoch 1 | train=0.0404 | valid=0.0316 | fun_rmse=0.0396
epoch 2 | train=0.0382 | valid=0.0335 | fun_rmse=0.0419
epoch 3 | train=0.0383 | valid=0.0346 | fun_rmse=0.0431
epoch 4 | train=0.0382 | valid=0.0336 | fun_rmse=0.0419
epoch 5 | train=0.0382 | valid=0.0335 | fun_rmse=0.0419
With noise augmentation: (0.03354886919260025, {'fun_rmse': 0.041886795312166214})
Activation Regularization¶
Two regularizers can be added to the loss:
- ActivationRegularizer: L2 penalty on RNN activations -- prevents activations from growing too large.
- TemporalActivationRegularizer: L2 penalty on temporal differences of activations -- encourages smooth predictions over time.
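modules specifies which model components to regularize (typically the RNN
layers), and alpha and beta set the weight of each penalty. Pass the
regularizers as aux_losses=[...] when creating the Learner, or attach them
afterwards with add_aux_loss, as the code here does because each regularizer
needs a reference to the rnn module of the already-built model.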
lrn_reg = RNNLearner(dls, rnn_type='lstm', metrics=[fun_rmse])
lrn_reg.add_aux_loss(
    ActivationRegularizer(modules=[unwrap_model(lrn_reg.model).rnn], alpha=2.0)
)
lrn_reg.add_aux_loss(
    TemporalActivationRegularizer(modules=[unwrap_model(lrn_reg.model).rnn], beta=1.0)
)
lrn_reg.fit_flat_cos(n_epoch=5, lr=3e-3)
lrn_reg.show_results(max_n=2)
Training: 100% | 1500/1500 [00:25<00:00, 57.86it/s]
epoch 1 | train=0.0123 | valid=0.0042 | fun_rmse=0.0098
epoch 2 | train=0.0064 | valid=0.0062 | fun_rmse=0.0108
epoch 3 | train=0.0062 | valid=0.0066 | fun_rmse=0.0112
epoch 4 | train=0.0060 | valid=0.0040 | fun_rmse=0.0097
epoch 5 | train=0.0042 | valid=0.0029 | fun_rmse=0.0095
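The two penalties described in this section can be written out by hand on a tiny activation sequence. This is a sketch under assumptions: the function names are hypothetical, and the mean-of-squares reduction is one common choice that may not match how TSFast reduces over batch, time, and hidden dimensions.

```python
def activation_penalty(h, alpha):
    """alpha * mean of squared activations; `h` is indexed as [time][hidden unit]."""
    values = [v for step in h for v in step]
    return alpha * sum(v * v for v in values) / len(values)


def temporal_activation_penalty(h, beta):
    """beta * mean of squared differences between consecutive time steps."""
    diffs = [b - a for prev, cur in zip(h, h[1:]) for a, b in zip(prev, cur)]
    return beta * sum(d * d for d in diffs) / len(diffs)


# Three time steps, two hidden units; only the first unit jumps at the last step.
h = [[1.0, -1.0], [1.0, -1.0], [3.0, -1.0]]

ar = activation_penalty(h, alpha=2.0)
tar = temporal_activation_penalty(h, beta=1.0)

main_loss = 0.5  # hypothetical value standing in for the prediction loss
total_loss = main_loss + ar + tar
print(ar, tar, total_loss)
```

The first term grows with the magnitude of the activations, while the second is zero for a constant hidden state and only reacts to step-to-step changes.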
Gradient Clipping¶
Clips the gradient norm during backpropagation. This prevents exploding
gradients, which are common with RNNs on long sequences. Pass grad_clip=
when creating the Learner.
lrn_clip = RNNLearner(dls, rnn_type='lstm', metrics=[fun_rmse], grad_clip=10)
lrn_clip.fit_flat_cos(n_epoch=5, lr=3e-3)
Training: 100% | 1500/1500 [00:26<00:00, 56.61it/s]
epoch 1 | train=0.0126 | valid=0.0065 | fun_rmse=0.0114
epoch 2 | train=0.0057 | valid=0.0060 | fun_rmse=0.0108
epoch 3 | train=0.0051 | valid=0.0097 | fun_rmse=0.0139
epoch 4 | train=0.0049 | valid=0.0043 | fun_rmse=0.0099
epoch 5 | train=0.0034 | valid=0.0029 | fun_rmse=0.0096
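Norm-based clipping itself is a small operation: if the global L2 norm of the gradients exceeds the threshold, all gradients are scaled down by the same factor. The sketch below shows this on a plain list of numbers. The helper clip_grad_norm is hypothetical, and it is an assumption that grad_clip is interpreted as this maximum norm.

```python
import math


def clip_grad_norm(grads, max_norm):
    """Rescale `grads` in place so their global L2 norm is at most `max_norm`.

    Returns the norm measured before clipping. Hypothetical helper for illustration.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        for i in range(len(grads)):
            grads[i] *= scale
    return total_norm


large = [30.0, 40.0]  # L2 norm is 50, above the threshold
norm_before = clip_grad_norm(large, max_norm=10.0)
norm_after = math.sqrt(sum(g * g for g in large))

small = [0.3, 0.4]  # L2 norm is 0.5, below the threshold, so left untouched
clip_grad_norm(small, max_norm=10.0)

print(norm_before, norm_after, small)
```

Scaling every component by the same factor keeps the direction of the update and only limits its length, which is why clipping rarely changes results when gradients are already well behaved.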
vary_seq_len¶
Randomly truncates sequences to different lengths each batch. This acts as
data augmentation by preventing the model from overfitting to a fixed window
size. min_len sets the minimum allowed length.
lrn_vary = RNNLearner(
    dls, rnn_type='lstm', metrics=[fun_rmse],
    augmentations=[vary_seq_len(min_len=100)],
)
lrn_vary.fit_flat_cos(n_epoch=5, lr=3e-3)
Training: 100% | 1500/1500 [00:26<00:00, 56.10it/s]
epoch 1 | train=0.0146 | valid=0.0066 | fun_rmse=0.0113
epoch 2 | train=0.0081 | valid=0.0072 | fun_rmse=0.0116
epoch 3 | train=0.0071 | valid=0.0056 | fun_rmse=0.0106
epoch 4 | train=0.0071 | valid=0.0039 | fun_rmse=0.0097
epoch 5 | train=0.0058 | valid=0.0030 | fun_rmse=0.0096
truncate_sequence¶
Progressively increases the sequence length during training. Training starts with short sequences, which are easier for the model, and gradually moves to the full length. This is a form of curriculum learning: the model learns short-term dynamics before tackling longer dependencies.
lrn_trunc = RNNLearner(
    dls, rnn_type='lstm', metrics=[fun_rmse],
    augmentations=[truncate_sequence(truncate_length=100)],
)
lrn_trunc.fit_flat_cos(n_epoch=10, lr=3e-3)
Training: 10%|█ | 300/3000 [00:06<00:53, 50.87it/s, epoch 1 | train=0.0190 | valid=0.0050 | fun_rmse=0.0102]
Training: 20%|██ | 600/3000 [00:12<00:49, 48.01it/s, epoch 2 | train=0.0138 | valid=0.0092 | fun_rmse=0.0131]
Training: 30%|███ | 900/3000 [00:17<00:37, 56.24it/s, epoch 3 | train=0.0105 | valid=0.0067 | fun_rmse=0.0110]
Training: 40%|████ | 1200/3000 [00:22<00:29, 60.09it/s, epoch 4 | train=0.0071 | valid=0.0052 | fun_rmse=0.0102]
Training: 50%|█████ | 1500/3000 [00:28<00:26, 56.91it/s, epoch 5 | train=0.0062 | valid=0.0047 | fun_rmse=0.0100]
Training: 60%|██████ | 1800/3000 [00:33<00:20, 58.01it/s, epoch 6 | train=0.0051 | valid=0.0044 | fun_rmse=0.0098]
Training: 70%|███████ | 2100/3000 [00:38<00:15, 58.29it/s, epoch 7 | train=0.0051 | valid=0.0043 | fun_rmse=0.0099]
Training: 80%|████████ | 2400/3000 [00:44<00:11, 50.18it/s, epoch 8 | train=0.0046 | valid=0.0039 | fun_rmse=0.0097]
Training: 90%|█████████ | 2700/3000 [00:49<00:05, 57.89it/s, epoch 9 | train=0.0038 | valid=0.0032 | fun_rmse=0.0096]
Training: 100%|██████████| 3000/3000 [00:54<00:00, 55.26it/s, epoch 10 | train=0.0028 | valid=0.0029 | fun_rmse=0.0095]
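The curriculum idea behind progressive sequence lengths can be sketched in plain Python. This is an illustration only, not TSFast's implementation: curriculum_length and truncate_batch are hypothetical helpers, the linear ramp is an assumed schedule, and the starting length of 100 is an illustrative choice that says nothing about how truncate_length is interpreted by the library.

```python
def curriculum_length(step, total_steps, start_len, full_len):
    """Linearly grow the sequence length from start_len to full_len (assumed schedule)."""
    if total_steps <= 1:
        return full_len
    # Fraction of training completed, clamped to [0, 1].
    frac = min(max(step / (total_steps - 1), 0.0), 1.0)
    return round(start_len + frac * (full_len - start_len))


def truncate_batch(batch, length):
    """Keep only the first `length` samples of every sequence in the batch."""
    return [seq[:length] for seq in batch]


# Toy batch: 16 sequences of 500 samples, mirroring bs=16 and win_sz=500.
batch = [list(range(500)) for _ in range(16)]

for step in (0, 1500, 2999):
    n = curriculum_length(step, total_steps=3000, start_len=100, full_len=500)
    short_batch = truncate_batch(batch, n)
    print(f"step {step}: sequence length {len(short_batch[0])}")
```

Early steps see only the first part of each window, so gradients flow through fewer time steps; the full window is reached by the end of training.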
Combining Options¶
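Augmentations, auxiliary losses, and gradient clipping can be combined freely. Options such as grad_clip and augmentations=[...] are passed at Learner creation time. The example below combines gradient clipping with both activation regularizers; these are attached afterwards with add_aux_loss because they reference the model's rnn module, which is available once the Learner has been built.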
lrn_combined = RNNLearner(
    dls, rnn_type='lstm', metrics=[fun_rmse],
    grad_clip=10,
)
lrn_combined.add_aux_loss(
    ActivationRegularizer(modules=[unwrap_model(lrn_combined.model).rnn], alpha=2.0)
)
lrn_combined.add_aux_loss(
    TemporalActivationRegularizer(modules=[unwrap_model(lrn_combined.model).rnn], beta=1.0)
)
lrn_combined.fit_flat_cos(n_epoch=10, lr=3e-3)
lrn_combined.show_results(max_n=2)
Training: 10%|█ | 300/3000 [00:05<00:44, 60.86it/s, epoch 1 | train=0.0129 | valid=0.0057 | fun_rmse=0.0108]
Training: 20%|██ | 600/3000 [00:10<00:38, 62.40it/s, epoch 2 | train=0.0070 | valid=0.0065 | fun_rmse=0.0110]
Training: 30%|███ | 900/3000 [00:14<00:33, 62.10it/s, epoch 3 | train=0.0058 | valid=0.0053 | fun_rmse=0.0105]
Training: 40%|████ | 1200/3000 [00:19<00:30, 59.95it/s, epoch 4 | train=0.0059 | valid=0.0044 | fun_rmse=0.0100]
Training: 50%|█████ | 1500/3000 [00:24<00:24, 61.42it/s, epoch 5 | train=0.0058 | valid=0.0054 | fun_rmse=0.0104]
Training: 60%|██████ | 1800/3000 [00:29<00:19, 60.48it/s, epoch 6 | train=0.0054 | valid=0.0047 | fun_rmse=0.0100]
Training: 70%|███████ | 2100/3000 [00:34<00:14, 63.47it/s, epoch 7 | train=0.0054 | valid=0.0050 | fun_rmse=0.0100]
Training: 80%|████████ | 2400/3000 [00:39<00:10, 59.84it/s, epoch 8 | train=0.0055 | valid=0.0037 | fun_rmse=0.0097]
Training: 90%|█████████ | 2700/3000 [00:44<00:04, 62.55it/s, epoch 9 | train=0.0044 | valid=0.0031 | fun_rmse=0.0096]
Training: 100%|██████████| 3000/3000 [00:49<00:00, 60.97it/s, epoch 10 | train=0.0032 | valid=0.0029 | fun_rmse=0.0095]
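To make the combined objective concrete, here is a minimal sketch of how a main loss and two auxiliary penalties might add up. It assumes the common formulation of activation regularization (mean squared activation, scaled by alpha) and temporal activation regularization (mean squared change between consecutive time steps, scaled by beta). TSFast's ActivationRegularizer and TemporalActivationRegularizer may normalize differently, so the helper functions and the toy numbers are hypothetical; only alpha=2.0 and beta=1.0 come from the example above.

```python
def activation_penalty(hidden, alpha):
    """alpha times the mean squared activation over all time steps and units."""
    squares = [h * h for step in hidden for h in step]
    return alpha * sum(squares) / len(squares)


def temporal_activation_penalty(hidden, beta):
    """beta times the mean squared difference between consecutive time steps."""
    diffs = [
        (b - a) ** 2
        for prev, nxt in zip(hidden, hidden[1:])
        for a, b in zip(prev, nxt)
    ]
    return beta * sum(diffs) / len(diffs)


# Toy hidden states for one sequence: 3 time steps, 2 units each.
hidden = [[0.0, 1.0], [1.0, 1.0], [1.0, 3.0]]
main_loss = 0.5  # placeholder for the prediction loss

ar = activation_penalty(hidden, alpha=2.0)
tar = temporal_activation_penalty(hidden, beta=1.0)
total_loss = main_loss + ar + tar
print(f"AR={ar:.4f} TAR={tar:.4f} total={total_loss:.4f}")
```

Under this formulation, the first penalty discourages large activations and the second discourages abrupt changes over time; both terms are simply added to the main loss before backpropagation.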
Key Takeaways¶
- noise and bias augment training data for better generalization. Pass them as augmentations=[...] on the Learner.
- ActivationRegularizer and TemporalActivationRegularizer smooth predictions with activation and temporal penalties. Pass them as aux_losses=[...] or via lrn.add_aux_loss(...).
- grad_clip prevents exploding gradients on long sequences.
- vary_seq_len acts as augmentation by varying sequence length each batch.
- truncate_sequence implements curriculum learning with progressive sequence length.
- All options compose -- combine augmentations, auxiliary losses, and gradient clipping for best results.
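As a closing illustration of the gradient clipping takeaway, the sketch below rescales a gradient vector so that its L2 norm does not exceed a threshold. It assumes norm-based clipping; whether grad_clip=10 clips by norm or by value inside TSFast is not stated here, and clip_by_global_norm is a hypothetical helper written for this example.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradient values so their L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        # Small gradients pass through unchanged.
        return list(grads), total_norm
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm


# A gradient with L2 norm 50, as might occur when errors accumulate over a long sequence.
grads = [30.0, -40.0]
clipped, norm_before = clip_by_global_norm(grads, max_norm=10)
norm_after = math.sqrt(sum(g * g for g in clipped))
print(f"norm before: {norm_before:.1f}, norm after: {norm_after:.1f}")
```

Clipping keeps the direction of the update and only limits its size, which is why it can stabilize training on long sequences without changing what the model is optimizing.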