OpenSynth: redefining the demand data frontier

Accelerating the Energy Transition with Synthetic Data

Synthetic data, meaning artificially generated datasets that mimic the statistical properties of real‑world data, is reshaping how the energy sector uses artificial intelligence. It offers a privacy‑preserving way to train machine learning models, test software, and study the behaviour of power grids worldwide without exposing the underlying records.

Why Synthetic Data Matters

  • Cost‑efficiency – Large, diverse training sets can now be created without a costly data‑collection campaign.
  • Privacy preservation – By replacing sensitive real‑world values with synthetic counterparts, data owners can share datasets without exposing personal or proprietary information.
  • Real‑world fidelity – Synthetic records are generated by models trained on authentic data, so they are designed to reflect genuine system behaviour.

OpenSynth: Democratising AI‑Generated Energy Datasets

OpenSynth is an open‑source community that empowers utilities and researchers to generate, refine, and share synthetic demand and network data. The project originated under the Centre for Net Zero (CNZ) – a non‑profit wing of Octopus Energy – before being contributed to the Linux Foundation’s LF Energy ecosystem.

Key Features of OpenSynth

  • Raw Smart Meter Integration – Holders of smart meter data anywhere in the world can turn real readings into synthetic datasets that follow the same distributions.
  • Algorithm Collaboration – Community members can tweak, improve, and share AI algorithms to generate richer datasets.
  • Multi‑Market Expansion – OpenSynth is actively incorporating synthetic and real energy data from international markets to fortify its modelling capabilities.
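
As a rough illustration of the first feature above, the sketch below fits a very simple statistical model to a handful of meter readings and samples synthetic days from it. This is not OpenSynth's algorithm: the readings are made up, the function names (fit_profile, sample_synthetic_day) are hypothetical, and the model (an independent Gaussian per time slot) is a deliberately naive stand‑in for the generative models the community works on.

```python
import random
import statistics

def fit_profile(real_days):
    """Estimate the mean and standard deviation of each time slot across days."""
    slots = list(zip(*real_days))  # transpose: one tuple of readings per slot
    means = [statistics.mean(s) for s in slots]
    stdevs = [statistics.stdev(s) for s in slots]
    return means, stdevs

def sample_synthetic_day(means, stdevs, rng):
    """Draw one synthetic day; clip at zero because demand cannot be negative."""
    return [max(0.0, rng.gauss(m, s)) for m, s in zip(means, stdevs)]

# Made-up stand-in for real meter readings (kWh): 3 days x 4 time slots.
real_days = [
    [0.20, 0.35, 0.80, 0.40],
    [0.25, 0.30, 0.90, 0.45],
    [0.15, 0.40, 0.70, 0.35],
]

rng = random.Random(42)  # fixed seed so the sketch is reproducible
means, stdevs = fit_profile(real_days)
synthetic_days = [sample_synthetic_day(means, stdevs, rng) for _ in range(5)]
```

Only the fitted statistics, not the original readings, are needed to produce new samples, which is the basic idea behind sharing synthetic data in place of raw records. A realistic generator would also need to capture correlations between time slots and differences between households, which this toy model ignores.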

Recent Milestone: RTE’s First Non‑Demand Contribution

RTE, France’s transmission system operator, has contributed the first set of non‑demand data to OpenSynth. The dataset combines:

  • Non‑synthetic grid topology – Detailed node arrangements for the full French transmission network.
  • All network components – Comprehensive lists of the grid’s equipment, such as transformers and other infrastructure.
  • Synthetic injection time‑series – Reconstructed load data derived from open‑source aggregated datasets.

Implications for Energy Modelling and AI Research

With full‑grid topology and injection time series now publicly available, researchers, software developers, and system modellers can run large‑scale studies and train AI models without first negotiating access to proprietary data. The contribution, named RTE7000, sets a precedent for future datasets on the platform and gives advanced energy analytics a shared foundation to build on.