Time series with missing or irregularly sampled data are a persistent challenge in machine learning. Most frequency-domain methods rely on the Fast Fourier Transform (FFT), which assumes uniform sampling and therefore requires interpolation or imputation before the spectrum can be estimated. We propose LSCD, a novel diffusion-based imputation approach that leverages Lomb–Scargle periodograms to robustly handle missing and irregular samples, without any interpolation or imputation step prior to frequency estimation. Our method trains a score-based diffusion model conditioned on the entire signal spectrum, enabling direct use of irregularly spaced observations. Experiments on synthetic and real-world benchmarks demonstrate that LSCD recovers missing data more accurately than purely time-domain baselines, while simultaneously producing consistent frequency estimates. Crucially, our framework paves the way for broader adoption of Lomb–Scargle methods in machine learning tasks involving irregular data.
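For context, the key enabler is that the classical Lomb–Scargle periodogram is defined directly on the observed, possibly irregular, sample times. For mean-subtracted samples $(t_n, y_n)$ it reads, up to normalization:

$$
P(\omega) = \frac{1}{2}\left[
\frac{\big(\sum_n y_n \cos\omega(t_n-\tau)\big)^2}{\sum_n \cos^2\omega(t_n-\tau)}
+
\frac{\big(\sum_n y_n \sin\omega(t_n-\tau)\big)^2}{\sum_n \sin^2\omega(t_n-\tau)}
\right],
\qquad
\tan(2\omega\tau) = \frac{\sum_n \sin 2\omega t_n}{\sum_n \cos 2\omega t_n}.
$$

Because the sums run only over the observed $t_n$, no resampling onto a uniform grid is needed.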
| Miss Type | Miss rate | Metric | Mean | Lerp | BRITS | GP-VAE | US-GAN | TimesNet | CSDI | SAITS | ModernTCN | LSCD |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MCAR | 10 % | MAE | 1.380 | 1.305 | 0.943 | 1.399 | 0.933 | 1.220 | 1.336 | 0.885 | 0.973 | 0.765 |
| | | RMSE | 1.947 | 1.991 | 1.657 | 1.986 | 1.636 | 1.803 | 1.889 | 1.569 | 1.727 | 1.453 |
| | | S-MAE | 0.081 | 0.081 | 0.052 | 0.082 | 0.053 | 0.069 | 0.008 | 0.043 | 0.049 | 0.003 |
| | 50 % | MAE | 1.373 | 1.449 | 1.095 | 1.383 | 1.152 | 1.481 | 1.359 | 1.041 | 1.129 | 0.975 |
| | | RMSE | 1.930 | 2.070 | 1.759 | 1.950 | 1.845 | 2.017 | 1.922 | 1.699 | 1.817 | 1.658 |
| | | S-MAE | 0.264 | 0.324 | 0.170 | 0.266 | 0.191 | 0.239 | 0.027 | 0.159 | 0.173 | 0.014 |
| | 90 % | MAE | 1.375 | 1.586 | 1.320 | 1.377 | 1.369 | 1.579 | 1.361 | 1.292 | 1.360 | 1.271 |
| | | RMSE | 1.935 | 2.142 | 1.899 | 1.938 | 1.970 | 2.143 | 1.925 | 1.878 | 1.963 | 1.870 |
| | | S-MAE | 0.439 | 0.572 | 0.383 | 0.439 | 0.407 | 0.462 | 0.044 | 0.375 | 0.406 | 0.036 |
| Sequence | 10 % | MAE | 1.353 | 1.542 | 1.330 | 1.355 | 1.384 | 1.391 | 1.413 | 1.323 | 1.329 | 1.359 |
| | | RMSE | 1.905 | 2.092 | 1.915 | 1.908 | 1.995 | 1.959 | 1.988 | 1.890 | 1.931 | 1.962 |
| | | S-MAE | 0.055 | 0.075 | 0.056 | 0.054 | 0.061 | 0.062 | 0.006 | 0.055 | 0.056 | 0.005 |
| | 50 % | MAE | 1.374 | 1.564 | 1.347 | 1.376 | 1.393 | 1.467 | 1.378 | 1.342 | 1.354 | 1.316 |
| | | RMSE | 1.934 | 2.115 | 1.928 | 1.936 | 1.999 | 2.038 | 1.943 | 1.917 | 1.960 | 1.913 |
| | | S-MAE | 0.271 | 0.369 | 0.269 | 0.271 | 0.297 | 0.321 | 0.028 | 0.268 | 0.277 | 0.026 |
| | 90 % | MAE | 1.386 | 1.573 | 1.362 | 1.388 | 1.403 | 1.489 | 1.372 | 1.352 | 1.375 | 1.313 |
| | | RMSE | 1.946 | 2.127 | 1.941 | 1.949 | 2.007 | 2.062 | 1.943 | 1.929 | 1.982 | 1.913 |
| | | S-MAE | 0.288 | 0.389 | 0.286 | 0.288 | 0.305 | 0.338 | 0.029 | 0.283 | 0.292 | 0.027 |
| Block | 10 % | MAE | 1.306 | 1.507 | 1.255 | 1.309 | 1.334 | 1.379 | 1.304 | 1.268 | 1.275 | 1.259 |
| | | RMSE | 1.807 | 2.014 | 1.786 | 1.811 | 1.885 | 1.898 | 1.804 | 1.785 | 1.825 | 1.774 |
| | | S-MAE | 0.105 | 0.146 | 0.100 | 0.104 | 0.116 | 0.124 | 0.011 | 0.103 | 0.106 | 0.010 |
| | 50 % | MAE | 1.306 | 1.505 | 1.279 | 1.308 | 1.333 | 1.451 | 1.314 | 1.285 | 1.309 | 1.269 |
| | | RMSE | 1.815 | 2.014 | 1.806 | 1.817 | 1.881 | 1.978 | 1.835 | 1.804 | 1.852 | 1.810 |
| | | S-MAE | 0.287 | 0.383 | 0.278 | 0.286 | 0.306 | 0.344 | 0.029 | 0.285 | 0.296 | 0.027 |
| | 90 % | MAE | 1.339 | 1.523 | 1.319 | 1.340 | 1.359 | 1.506 | 1.329 | 1.320 | 1.356 | 1.320 |
| | | RMSE | 1.868 | 2.052 | 1.862 | 1.869 | 1.927 | 2.054 | 1.874 | 1.859 | 1.916 | 1.870 |
| | | S-MAE | 0.359 | 0.473 | 0.351 | 0.358 | 0.376 | 0.439 | 0.036 | 0.358 | 0.374 | 0.035 |
Table 1: Imputation performance on the Synthetic-Sines benchmark with three missing-data patterns: MCAR (pointwise random), Sequence gaps, and Block outages, each tested at 10 %, 50 % and 90 % missing rates. For every setting we report MAE↓ and RMSE↓ in the time domain, and S-MAE↓ in the spectral domain.
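The S-MAE columns compare spectra rather than raw values. As a sketch only (the precise definition is given in the paper), one plausible reading is the mean absolute difference between the power spectra of the imputed and ground-truth series, evaluated on a shared frequency grid:

```python
import torch

def spectral_mae(p_imputed: torch.Tensor, p_true: torch.Tensor) -> torch.Tensor:
    # Mean absolute difference between two power spectra evaluated on the
    # same frequency grid (hypothetical reading of the S-MAE metric).
    return (p_imputed - p_true).abs().mean()
```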
| Dataset | Miss rate | Metric | Mean | Lerp | BRITS | GP-VAE | US-GAN | TimesNet | CSDI | SAITS | ModernTCN | LSCD |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PhysioNet | 10 % | MAE | 0.714 | 0.372 | 0.278 | 0.469 | 0.323 | 0.375 | 0.219 | 0.232 | 0.351 | 0.211 |
| | | RMSE | 1.035 | 0.708 | 0.693 | 0.783 | 0.662 | 0.690 | 0.545 | 0.583 | 0.697 | 0.494 |
| | | S-MAE | 0.032 | 0.020 | 0.016 | 0.026 | 0.020 | 0.022 | 0.013 | 0.014 | 0.020 | 0.012 |
| | 50 % | MAE | 0.711 | 0.417 | 0.385 | 0.521 | 0.449 | 0.453 | 0.307 | 0.315 | 0.440 | 0.303 |
| | | RMSE | 1.091 | 0.840 | 0.833 | 0.907 | 0.852 | 0.840 | 0.672 | 0.735 | 0.803 | 0.664 |
| | | S-MAE | 0.111 | 0.087 | 0.064 | 0.083 | 0.076 | 0.076 | 0.052 | 0.055 | 0.071 | 0.052 |
| | 90 % | MAE | 0.710 | 0.565 | 0.560 | 0.642 | 0.670 | 0.642 | 0.481 | 0.565 | 0.647 | 0.479 |
| | | RMSE | 1.097 | 0.993 | 0.975 | 1.038 | 1.060 | 1.031 | 0.834 | 0.971 | 1.026 | 0.832 |
| | | S-MAE | 0.148 | 0.189 | 0.104 | 0.124 | 0.125 | 0.131 | 0.093 | 0.108 | 0.137 | 0.093 |
| PM 2.5 | 10 % | MAE | 50.685 | 15.363 | 16.519 | 23.941 | 32.999 | 22.685 | 9.670 | 15.424 | 24.089 | 9.069 |
| | | RMSE | 66.558 | 27.658 | 26.775 | 40.586 | 48.951 | 39.336 | 19.093 | 30.558 | 40.052 | 17.914 |
| | | S-MAE | 0.135 | 0.039 | 0.039 | 0.060 | 0.080 | 0.056 | 0.023 | 0.034 | 0.059 | 0.022 |
Table 2: Time- and frequency-domain imputation errors on two real-world datasets. PhysioNet is evaluated at 10 %, 50 % and 90 % missing rates, while PM 2.5 is evaluated at 10 %. Metrics are MAE↓, RMSE↓ and S-MAE↓.
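The differentiable Lomb–Scargle layer used by LSCD is available as a standalone PyTorch package and can be installed directly from GitHub: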
```bash
pip install git+https://github.com/asztr/LombScargle.git
```
```python
import math

import torch
import LombScargle

# Example time series with a single frequency component at 5
t = torch.linspace(0, 10.0, 200)      # timestamps
y = torch.sin(2 * math.pi * 5.0 * t)  # observed values

# Frequencies at which to evaluate the periodogram
freqs = torch.linspace(1e-5, 10.0, 100)

# Compute the normalized Lomb-Scargle spectrum
ls = LombScargle.LombScargle(freqs)
P = ls(t, y, fap=True, norm=True)  # [1, 100] tensor of power values
```
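Because the periodogram sums only over observed timestamps, the same call should work unchanged on irregularly sampled inputs. A minimal sketch, reusing the (assumed) call signature and the `ls` and `freqs` objects from the snippet above:

```python
# Irregular sampling: draw non-uniform timestamps and evaluate the same
# layer with no interpolation step (assumes the ls(t, y, ...) signature
# shown above; not an official example from the repo).
t_irr = torch.sort(10.0 * torch.rand(120)).values  # non-uniform timestamps in [0, 10]
y_irr = torch.sin(2 * math.pi * 5.0 * t_irr)
P_irr = ls(t_irr, y_irr, fap=True, norm=True)
print(freqs[P_irr.argmax()])                       # peak expected near frequency 5
```

To cite the paper: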
```bibtex
@inproceedings{lscd2025,
  title     = {{LSCD}: Lomb–Scargle Conditioned Diffusion for Time-Series Imputation},
  author    = {Elizabeth Fons and Alejandro Sztrajman and Yousef El-Laham and
               Luciana Ferrer and Svitlana Vyetrenko and Manuela Veloso},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025}
}
```