Reinforcement learning for bidding strategy optimization in day-ahead energy market

Di Persio, Luca; Garbelli, Matteo
2025-01-01

Abstract

In day-ahead markets, participants submit bids specifying the quantities of energy they wish to buy or sell and the prices they are prepared to pay or accept. However, the way the bidding mechanism shapes the Market Clearing Price (MCP) is frequently overlooked in the literature on energy market modeling: forecasting models usually focus on predicting the MCP rather than on building the optimal supply and demand curves for a given price scenario. This article develops a data-driven approach for generating optimal offering curves using Deep Deterministic Policy Gradient (DDPG), a reinforcement learning algorithm capable of handling continuous action spaces. Our model processes historical Italian electricity price data to generate stepwise offering curves that maximize profit over time. In numerical experiments, the agent achieves up to 85% of the normalized reward, i.e., the ratio between the actual profit and the maximum revenue obtainable if all production capacity were sold at the highest feasible price. These results show that reinforcement learning can capture complex temporal patterns in electricity price data without requiring explicit forecast models, providing market participants with adaptive bidding strategies that improve profit margins while accounting for production constraints.
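
As an illustration of the normalized reward described in the abstract, the following minimal Python sketch evaluates a stepwise offering curve against a single clearing price. It is not the authors' implementation: the `Step` type, the function names, the pay-as-clear settlement, the constant `marginal_cost`, and the `price_cap` standing in for the "highest feasible price" are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Step:
    """One block of a stepwise offering curve (hypothetical type)."""
    price: float     # minimum price at which the block is offered, EUR/MWh
    quantity: float  # energy offered in the block, MWh


def accepted_quantity(curve: Sequence[Step], clearing_price: float) -> float:
    """Total energy of all blocks offered at or below the clearing price."""
    return sum(s.quantity for s in curve if s.price <= clearing_price)


def normalized_reward(
    curve: Sequence[Step],
    clearing_price: float,
    marginal_cost: float,  # assumed constant production cost, EUR/MWh
    capacity: float,       # production capacity, MWh
    price_cap: float,      # assumed highest feasible price, EUR/MWh
) -> float:
    """Ratio of actual profit to the revenue from selling all capacity at the cap."""
    # Sold energy cannot exceed the production capacity.
    sold = min(accepted_quantity(curve, clearing_price), capacity)
    # Assumption: accepted blocks are paid the clearing price (pay-as-clear).
    profit = (clearing_price - marginal_cost) * sold
    max_revenue = capacity * price_cap
    return profit / max_revenue


# Illustrative numbers only, not data from the article.
example_curve = [Step(40.0, 50.0), Step(80.0, 30.0), Step(120.0, 20.0)]
example_reward = normalized_reward(
    example_curve,
    clearing_price=100.0,
    marginal_cost=30.0,
    capacity=100.0,
    price_cap=200.0,
)
```

With these illustrative numbers, the first two blocks (80 MWh) are accepted at a clearing price of 100 EUR/MWh, so the profit is 70 × 80 = 5600 EUR against a maximum revenue of 20000 EUR, giving a normalized reward of 0.28. In a reinforcement learning setting such as the one the abstract describes, the step prices and quantities would be the continuous actions chosen by the agent.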
2025
Bidding strategy
Electricity auction
Euphemia
Day ahead energy market
Reinforcement learning

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1171688