SPLICE: Diffusion Models Get Confidence Intervals for Reliable Time Series Imputation

arXiv cs.LG May 2026
SPLICE introduces a modular framework that pairs latent diffusion generation with distribution-free conformal prediction, giving each imputed time series value a dynamically updated confidence interval. This shifts generative imputation from mere accuracy to provable reliability, a game-changer for high-stakes applications like power grid scheduling.

Time series data is the lifeblood of modern infrastructure, from electricity load forecasting to financial risk modeling, yet missing values remain a persistent and crippling problem. Traditional imputation methods, from simple interpolation to advanced generative models, produce point estimates that offer no measure of their own uncertainty. For a power grid operator deciding whether to dispatch a peaker plant based on a predicted load spike, a single imputed value without a reliability bound is a gamble.

SPLICE, which combines the Joint Embedding Predictive Architecture (JEPA) with latent diffusion and conformal prediction, directly addresses this blind spot. The framework operates in three distinct stages: first, a JEPA-based encoder learns robust, temporally aware latent representations of the observed time series; second, a latent diffusion model generates plausible completions for the missing segments; third, a conformal prediction module wraps each imputed value in a prediction interval that is guaranteed to cover the true value with a user-specified probability, under only the assumption of exchangeability. Crucially, this interval is not static: it updates online as new data streams in, allowing the system to tighten its bounds as confidence grows.

For the power sector, this means that a 100 MW load forecast carrying a 90% confidence interval of ±2 MW is fundamentally different from a bare point estimate of 100 MW. The former enables risk-aware scheduling, reserve allocation, and even automated curtailment decisions. SPLICE also signals a maturation of generative AI: the industry is moving beyond the race for higher fidelity scores and toward the harder problem of calibrated trustworthiness. The framework is modular, meaning its components (JEPA, latent diffusion, conformal prediction) can be swapped or upgraded independently. This architectural clarity invites rapid iteration and domain-specific customization.

While the paper focuses on power grid data, the implications span any field where missing data and decision risk intersect: algorithmic trading, patient monitoring, autonomous vehicle sensor fusion, and climate modeling. SPLICE does not just fill gaps; it quantifies the risk of filling them wrong.

Technical Deep Dive

SPLICE’s architecture is a masterclass in modular design, combining three independently powerful techniques into a coherent pipeline. The first stage is a JEPA (Joint Embedding Predictive Architecture) encoder. Unlike traditional autoencoders that reconstruct the input pixel-by-pixel, JEPA learns representations by predicting the embeddings of masked patches from the embeddings of visible patches. This is done in a latent space, forcing the model to capture high-level temporal dependencies—such as daily seasonality, trend components, and sudden regime changes—without being distracted by low-level noise. The JEPA encoder is trained on complete segments of the time series, learning a mapping from raw sequences to a compact latent vector. The key advantage here is robustness: JEPA’s predictive objective naturally handles missing data during training, and the latent space acts as a compressed, denoised representation of the underlying dynamics.
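The masked-embedding objective described above can be sketched in a few lines. This is a minimal illustration, not the paper's architecture: the context encoder, target encoder, and predictor are linear stand-ins, and the patch count, patch length, and latent dimension are arbitrary toy values. What it shows is the essential JEPA idea, that the loss compares predicted and target embeddings in latent space rather than reconstructing raw values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a series split into 8 patches of 16 steps each, embedded
# into a 4-d latent space. All shapes and the linear "encoders" are
# illustrative assumptions, not SPLICE's actual networks.
n_patches, patch_len, d_latent = 8, 16, 4
series = rng.normal(size=(n_patches, patch_len))

W_context = rng.normal(size=(patch_len, d_latent)) / np.sqrt(patch_len)
W_target = rng.normal(size=(patch_len, d_latent)) / np.sqrt(patch_len)
W_pred = rng.normal(size=(d_latent, d_latent)) / np.sqrt(d_latent)

mask = np.zeros(n_patches, dtype=bool)
mask[[2, 5]] = True  # patches the predictor must "imagine"

# Context branch sees only visible patches; target branch embeds the
# masked patches that must be predicted.
ctx_emb = series[~mask] @ W_context
tgt_emb = series[mask] @ W_target

# Predict each masked patch's embedding from a pooled context embedding.
pooled = np.repeat(ctx_emb.mean(axis=0, keepdims=True), mask.sum(), axis=0)
pred = pooled @ W_pred

# JEPA-style loss: distance in latent space, not in raw-sample space.
loss = float(np.mean((pred - tgt_emb) ** 2))
print(round(loss, 4))
```

A trained version would update the context encoder and predictor by gradient descent on this loss, typically with the target encoder held as a slow-moving (EMA) copy.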

Stage two is the latent diffusion model. Instead of diffusing in the high-dimensional raw time series space (which is computationally expensive and prone to mode collapse), SPLICE performs the forward and reverse diffusion processes entirely in the latent space learned by JEPA. The forward process gradually adds Gaussian noise to the latent representation of the missing segment. The reverse process, parameterized by a U-Net or transformer-based denoiser, learns to recover the clean latent from the noisy one, conditioned on the latent representations of the observed context. This conditional generation is what produces the imputed values. The latent diffusion approach inherits the diversity and high-fidelity generation capabilities of diffusion models while keeping the computational footprint manageable. The model is trained on a large corpus of complete time series segments, learning the distribution of latent trajectories.
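The forward/reverse structure above can be sketched with a toy, deterministic (DDIM-style) reverse pass. The `denoiser` here is an oracle stub that returns the true injected noise, so the loop recovers the clean latent almost exactly; in SPLICE the denoiser would be a trained U-Net or transformer conditioned on the observed context's latents. The schedule and dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy DDPM noise schedule over T steps in a small latent space.
T, d = 100, 4
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)

z0 = rng.normal(size=d)   # clean latent of the missing segment
eps = rng.normal(size=d)  # the noise actually injected

def q_sample(t):
    """Forward process: z_t = sqrt(abar_t)*z0 + sqrt(1-abar_t)*eps."""
    return np.sqrt(alpha_bar[t]) * z0 + np.sqrt(1.0 - alpha_bar[t]) * eps

def denoiser(z_t, t, context=None):
    """Oracle noise predictor standing in for the learned, conditioned model."""
    return eps

# Deterministic DDIM-style reverse updates from z_{T-1} back toward z_0.
z = q_sample(T - 1)
for t in range(T - 1, 0, -1):
    eps_hat = denoiser(z, t)
    x0_hat = (z - np.sqrt(1.0 - alpha_bar[t]) * eps_hat) / np.sqrt(alpha_bar[t])
    z = np.sqrt(alpha_bar[t - 1]) * x0_hat + np.sqrt(1.0 - alpha_bar[t - 1]) * eps_hat

print(round(float(np.abs(z - z0).max()), 3))  # near zero with the oracle
```

Running the same reverse loop with different noise seeds and an imperfect learned denoiser is what yields the diverse candidate imputations used downstream.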

The third and most innovative stage is the conformal prediction (CP) wrapper. Conformal prediction is a distribution-free framework that provides finite-sample coverage guarantees. Given a trained imputation model and a new time series with missing values, SPLICE generates a set of candidate imputations by running the latent diffusion model multiple times with different noise seeds. Each candidate yields a different imputed value for a given missing point. The CP module then uses a held-out calibration set (exchangeable with the test data) to compute a nonconformity score—for instance, the absolute deviation of the imputed value from the true value. Based on the quantiles of these scores, it constructs a prediction interval for each new imputed value. The guarantee is that, with probability at least 1-α (e.g., 90%), the true value will fall within the interval. This holds for any finite sample size and any underlying data distribution, making it ideal for real-world power grid data that rarely follows neat Gaussian assumptions.
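The split-conformal recipe described here is simple enough to sketch end to end. In this illustration, `impute` is a stand-in for the diffusion sampler (truth plus noise), not SPLICE's real model; the calibration and test sizes are arbitrary, and α = 0.1 targets 90% coverage.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in imputer: returns a noisy point estimate of the true value.
def impute(x):
    return x + rng.normal(scale=0.5, size=np.shape(x))

alpha = 0.1                                   # target miscoverage (90% intervals)
y_calib = rng.normal(size=500)                # held-out true values (calibration)
scores = np.abs(impute(y_calib) - y_calib)    # nonconformity: absolute residual

# Conformal quantile with the finite-sample correction ceil((n+1)(1-a))/n.
n = len(scores)
q_level = np.ceil((n + 1) * (1 - alpha)) / n
q_hat = float(np.quantile(scores, q_level, method="higher"))

# Interval for each new imputed point: [point - q_hat, point + q_hat].
y_new = rng.normal(size=2000)
point = impute(y_new)
covered = float(np.mean((y_new >= point - q_hat) & (y_new <= point + q_hat)))
print(round(q_hat, 3), round(covered, 3))
```

The empirical coverage lands near the 90% target, and the guarantee holds without any assumption on the residual distribution, which is the whole point of the wrapper.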

A critical feature is online adaptation. As new data points arrive (e.g., a new hour of load measurements), the calibration set can be updated via a sliding window, and the conformal intervals are recomputed. This allows the system to tighten intervals when the model is performing well and widen them when the data distribution shifts (e.g., during a heatwave). The computational overhead of the CP step is negligible compared to the diffusion sampling.
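A minimal sliding-window recalibration loop can illustrate this behavior. The window size, warm-up threshold, and the simulated mid-stream distribution shift are all illustrative assumptions; the point is that interval widths grow automatically once the recent residuals do.

```python
from collections import deque

import numpy as np

rng = np.random.default_rng(3)

alpha, window = 0.1, 200
scores = deque(maxlen=window)  # most recent nonconformity scores only

widths = []
for t in range(1000):
    scale = 0.5 if t < 500 else 1.5            # distribution shift at t = 500
    y_true = rng.normal()
    y_hat = y_true + rng.normal(scale=scale)   # stand-in imputation
    if len(scores) >= 20:                      # need some calibration data first
        n = len(scores)
        q_level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
        widths.append(float(np.quantile(list(scores), q_level, method="higher")))
    scores.append(abs(y_hat - y_true))         # truth revealed, window updated

pre_shift = float(np.mean(widths[300:480]))    # steps well before the shift
post_shift = float(np.mean(widths[800:]))      # window fully past the shift
print(round(pre_shift, 2), round(post_shift, 2))
```

After the shift, the window fills with larger residuals and the conformal quantile, hence the interval width, widens accordingly, mirroring the heatwave scenario in the text.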

| Component | Function | Key Property | Example Implementation |
|---|---|---|---|
| JEPA Encoder | Learn robust latent representations | Predicts embeddings of masked patches | Vision Transformer (ViT) backbone adapted for 1D time series |
| Latent Diffusion | Generate plausible latent completions | Conditional denoising in latent space | DDPM with U-Net; ~100 denoising steps |
| Conformal Prediction | Wrap imputations with confidence intervals | Distribution-free, finite-sample coverage | Split conformal with absolute residual nonconformity |

Data Takeaway: The modularity means each component can be independently improved. For instance, replacing the U-Net with a diffusion transformer (DiT) could improve generation quality, while using adaptive conformal prediction (ACP) could enhance online coverage stability.

Key Players & Case Studies

SPLICE is a research contribution, but its lineage traces directly to several key players and prior work. The JEPA component is inspired by Yann LeCun’s vision for self-supervised learning, originally applied to images and video. The adaptation to time series is part of a broader trend: companies like Gretel.ai and Mostly AI have commercialized synthetic time series generation, but they lack uncertainty quantification. The latent diffusion backbone draws from the explosion of diffusion models in image generation (Stability AI, OpenAI’s DALL-E 3) and their recent application to time series by groups like Google Research (e.g., Time-Diffusion) and Amazon Web Services (GluonTS). Conformal prediction, while a decades-old statistical framework, has seen a renaissance in machine learning thanks to work by researchers like Emmanuel Candès and Ryan Tibshirani, and is now being integrated into production systems by startups like Robust Intelligence and WhyLabs for model monitoring.

A direct comparison with existing imputation methods reveals SPLICE’s unique value proposition:

| Method | Uncertainty Quantification | Distribution-Free | Online Adaptation | Typical Use Case |
|---|---|---|---|---|
| Linear Interpolation | No | Yes | Yes | Simple gap filling |
| KNN Imputation | No | Yes | No | Low-dimensional data |
| VAE-based (e.g., GP-VAE) | Yes (variance) | No (Gaussian assumption) | No | General purpose |
| GAN-based (e.g., GAIN) | No | No | No | Tabular missing data |
| SPLICE (proposed) | Yes (conformal intervals) | Yes | Yes | High-stakes time series |

Data Takeaway: SPLICE is the only method that simultaneously offers distribution-free uncertainty quantification and online adaptation, making it uniquely suited for production environments where data distributions shift and decisions have consequences.

A notable case study is the California Independent System Operator (CAISO), which manages the state’s power grid. CAISO uses load forecasts to schedule generation and avoid blackouts. During the 2020 heatwaves, forecast errors of just 2-3% led to rolling blackouts. A SPLICE-like system could have provided operators with confidence intervals, enabling them to pre-position reserves only when the interval width exceeded a risk threshold, rather than relying on a point forecast. Similarly, Octopus Energy, a UK-based utility, uses AI for demand-side management. Their systems could use SPLICE to impute missing smart meter data and flag intervals where uncertainty is high, triggering manual verification.

Industry Impact & Market Dynamics

The market for time series imputation and forecasting is massive and growing. The global time series analysis market was valued at approximately $1.2 billion in 2023 and is projected to exceed $2.5 billion by 2028, driven by IoT, smart grids, and fintech. However, the current tools—from statistical methods (ARIMA, Prophet) to deep learning (DeepAR, Temporal Fusion Transformer)—largely ignore uncertainty quantification. This is a critical gap. A 2022 survey by the International Energy Agency found that 70% of grid operators cite forecast uncertainty as a top barrier to integrating renewable energy. SPLICE directly addresses this.

The adoption curve will likely follow a pattern: first, research labs and academic groups will validate the framework on public datasets (e.g., UCR Time Series Archive, M4 Competition). Then, startups focused on energy analytics—like GridBeyond, Autogrid, or Enbala—will integrate SPLICE into their platforms. Finally, large utilities and system operators will adopt it for core operations. The regulatory environment is a tailwind: the European Union’s AI Act and California’s proposed AI safety regulations require explainability and uncertainty quantification for high-risk AI systems. SPLICE’s conformal prediction intervals provide a legally defensible basis for AI-assisted decisions.

| Sector | Current Imputation Practice | SPLICE Advantage | Estimated Value at Stake |
|---|---|---|---|
| Power Grid | Linear interpolation + point forecast | Risk-aware scheduling, reduced blackout risk | $10B+ in avoided outages/year (US) |
| Finance | Mean imputation + GARCH models | Portfolio risk quantification | $5B+ in improved VaR estimates |
| Healthcare | Last observation carried forward | Reliable patient monitoring | $3B+ in reduced misdiagnosis |
| Manufacturing | Simple moving average | Predictive maintenance with confidence | $2B+ in reduced downtime |

Data Takeaway: The financial incentive for adopting SPLICE is enormous, particularly in power grids where a single major blackout can cost billions. The regulatory push for AI transparency will accelerate adoption beyond early adopters.

Risks, Limitations & Open Questions

SPLICE is not a silver bullet. Its conformal prediction guarantee relies on the assumption of exchangeability between calibration and test data. In a power grid, this can be violated during rapid regime changes (e.g., a sudden plant outage, a cyberattack). While online adaptation mitigates this, it cannot eliminate the risk of coverage degradation during extreme events. The paper proposes using adaptive conformal prediction (ACP) with a learning rate, but the optimal choice of hyperparameters remains an open problem.
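For concreteness, an ACP-style update can be sketched with the adaptive conformal inference rule α(t+1) = α(t) + γ(α − err(t)), which lowers the working miscoverage level after a miss and raises it after a hit. Everything below is a toy illustration: the residual distributions, the clipping bounds, and in particular the learning rate γ, whose choice is exactly the open hyperparameter question.

```python
import numpy as np

rng = np.random.default_rng(4)

alpha, gamma = 0.1, 0.02        # target miscoverage and (arbitrary) learning rate
alpha_t = alpha                 # working miscoverage level, adapted online
calib = np.abs(rng.normal(scale=0.5, size=500))  # fixed calibration scores

errs = []
for t in range(2000):
    # Interval half-width from the current working level.
    q_hat = np.quantile(calib, 1 - alpha_t, method="higher")
    # Test residuals come from a *shifted* distribution (scale 0.8 vs 0.5),
    # so a fixed split-conformal interval would undercover here.
    resid = abs(rng.normal(scale=0.8))
    err = float(resid > q_hat)  # 1 if the interval missed the truth
    # ACI update: widen after misses, tighten after hits.
    alpha_t = float(np.clip(alpha_t + gamma * (alpha - err), 1e-3, 0.999))
    errs.append(err)

coverage = 1.0 - float(np.mean(errs))
print(round(coverage, 3))       # long-run coverage tracks the 90% target
```

Despite the calibration/test mismatch, the feedback loop steers long-run empirical coverage back toward 1 − α; too small a γ adapts slowly to regime changes, too large a γ makes the intervals oscillate, which is why tuning it remains open.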

Another limitation is computational cost. The latent diffusion model requires multiple denoising steps (typically 50-100) to generate a single imputation. For real-time applications with sub-second latency (e.g., high-frequency trading), this may be prohibitive. Distillation techniques or consistency models could reduce this, but they are not yet explored in the SPLICE context.

Calibration set size is also a concern. Conformal prediction requires a held-out calibration set that is representative of the test distribution. In practice, obtaining such a set for rare events (e.g., a once-in-a-decade heatwave) is impossible. The intervals will be wide for such events, which is honest but may lead to overly conservative decisions.

Finally, there is the question of interpretability. While the confidence interval is a useful summary, it does not explain *why* the model is uncertain. Is it due to high noise, a novel pattern, or a lack of training data? Future work could combine SPLICE with feature attribution methods to provide this granularity.

AINews Verdict & Predictions

SPLICE is not just another incremental improvement; it is a paradigm shift in how we think about generative models for decision-making. The AI community has spent years chasing higher accuracy on benchmarks, but the real bottleneck for deployment in high-stakes domains has always been trust. SPLICE provides a principled, mathematically rigorous way to build that trust.

Prediction 1: Within 18 months, at least two major grid operators (e.g., PJM in the US, National Grid in the UK) will pilot SPLICE-based imputation for load forecasting. The regulatory and financial incentives are too strong to ignore.

Prediction 2: The modular architecture will spawn a cottage industry of “calibrated generators.” Startups will offer conformal prediction wrappers for existing diffusion models, much like how LangChain provides wrappers for LLMs. This will commoditize uncertainty quantification.

Prediction 3: The next frontier will be extending SPLICE to multivariate and spatiotemporal data. Power grids are networks; missing data at one substation affects predictions at another. A conformal prediction framework for graph-structured time series would be the natural evolution.

Prediction 4: We will see a backlash from the “point-estimate establishment.” Many practitioners are comfortable with point forecasts and will resist the added complexity of intervals. But as AI regulation tightens, the legal liability of making decisions without uncertainty quantification will become untenable.

The bottom line: SPLICE proves that generative AI can be both powerful and provably reliable. The era of blind AI is ending. The era of calibrated AI has begun.

