2025 - Enrique Mallada

1 paper published in EJOR

Our paper on counterfactual analysis of default bid market power mitigation strategies in two-stage electricity markets [1] has been published in the European Journal of Operational Research. Congrats Rajni!

[1] R. K. Bansal, P. You, Y. Chen, and E. Mallada, “Counterfactual analysis of default bid market power mitigation strategies in two-stage electricity markets,” European Journal of Operational Research, pp. 1-18, 2025.

Market power remains a persistent challenge in many liberalized electricity markets worldwide, driving the adoption of ex-ante and ex-post mitigation measures. Despite locational mitigation tools (e.g., cost-based reference levels or default energy bids), evidence of price manipulation has motivated system-level market power mitigation (MPM) policies. However, the full implications of these rules are not well understood, and limited insight into participant behavior can lead to unintended consequences, including increased market power and welfare losses. We study sequentially cleared electricity markets and analyze a two-stage settlement structure commonly used by system operators (e.g., day-ahead and real-time markets in North America). Our focus is on MPM policies that replace noncompetitive generator offers with operator-estimated default bids, and we model competition between generators and loads with inelastic energy requirements who act strategically in allocating demand across stages under real-time, day-ahead, and simultaneous applications of MPM policies. Motivated by the loss of Nash equilibrium under conventional supply-function bidding, we adopt an alternative mechanism in which generators bid the intercept of an affine supply function. Under real-time MPM, strategic interaction in the day-ahead market drives all demand to real time, producing an undesirable outcome. To test robustness, we incorporate demand uncertainty using a variance-penalized expectation framework. Low risk aversion still leads to substantial real-time clearing, while imbalances in risk preferences further amplify market power. Overall, intercept-function bidding combined with day-ahead and simultaneous MPM policies mitigates generator market power more effectively than real-time substitution alone, although these policies shift some market power toward loads.

@article{bcym2025ejor,
  abstract = {Market power remains a persistent challenge in many liberalized electricity markets worldwide, driving the adoption of ex-ante and ex-post mitigation measures. Despite locational mitigation tools (e.g., cost-based reference levels or default energy bids), evidence of price manipulation has motivated system-level market power mitigation (MPM) policies. However, the full implications of these rules are not well understood, and limited insight into participant behavior can lead to unintended consequences, including increased market power and welfare losses. We study sequentially cleared electricity markets and analyze a two-stage settlement structure commonly used by system operators (e.g., day-ahead and real-time markets in North America). Our focus is on MPM policies that replace noncompetitive generator offers with operator-estimated default bids, and we model competition between generators and loads with inelastic energy requirements who act strategically in allocating demand across stages under real-time, day-ahead, and simultaneous applications of MPM policies. Motivated by the loss of Nash equilibrium under conventional supply-function bidding, we adopt an alternative mechanism in which generators bid the intercept of an affine supply function. Under real-time MPM, strategic interaction in the day-ahead market drives all demand to real time, producing an undesirable outcome. To test robustness, we incorporate demand uncertainty using a variance-penalized expectation framework. Low risk aversion still leads to substantial real-time clearing, while imbalances in risk preferences further amplify market power. Overall, intercept-function bidding combined with day-ahead and simultaneous MPM policies mitigates generator market power more effectively than real-time substitution alone, although these policies shift some market power toward loads.},
  author = {Bansal, Rajni Kant and You, Pengcheng and Chen, Yue and Mallada, Enrique},
  doi = {10.1016/j.ejor.2025.12.030},
  grants = {CAREER-1752362; CPS-2136324; Global-Centers-2330450},
  issn = {0377-2217},
  journal = {European Journal of Operational Research},
  month = {12},
  pages = {1-18},
  record = {online Dec 2025, accepted Dec 2025, under revision Jan 2024, submitted Aug 2023},
  title = {Counterfactual analysis of default bid market power mitigation strategies in two-stage electricity markets},
  url = {https://mallada.ece.jhu.edu/pubs/2025-EJOR-BCYM.pdf},
  year = {2025}
}

1 paper published in TAC

Oct '25 by Enrique Mallada

Our paper on the stability, economic efficiency, and incentive compatibility of electricity market dynamics [1] has been published in the IEEE Transactions on Automatic Control. Congrats Pengcheng and Yan!

[1] P. You, Y. Jiang, E. Yeung, D. Gayme, and E. Mallada, “On the Stability, Economic Efficiency and Incentive Compatibility of Electricity Market Dynamics,” IEEE Transactions on Automatic Control, vol. 70, iss. 10, pp. 6815-6830, 2025.

This paper focuses on the operation of an electricity market that accounts for participants that bid at a sub-minute timescale. To that end, we model the market-clearing process as a dynamical system, called market dynamics, which is temporally coupled with the grid frequency dynamics and is thus required to guarantee system-wide stability while meeting the system operational constraints. We characterize participants as price-takers who rationally update their bids to maximize their utility in response to real-time schedules of prices and dispatch. For two common bidding mechanisms, based on quantity and price, we identify a notion of alignment between participants’ behavior and planners’ goals that leads to a saddle-based design of the market that guarantees convergence to a point meeting all operational constraints. We further explore cases where this alignment property does not hold and observe that misaligned participants’ bidding can destabilize the closed-loop system. We thus design a regularized version of the market dynamics that recovers all the desirable stability and steady-state performance guarantees. Numerical tests validate our results on the IEEE 39-bus system.

@article{yjygm2025tac,
  abstract = {This paper focuses on the operation of an electricity market that accounts for participants that bid at a sub-minute timescale. To that end, we model the market-clearing process as a dynamical system, called market dynamics, which is temporally coupled with the grid frequency dynamics and is thus required to guarantee system-wide stability while meeting the system operational constraints. We characterize participants as price-takers who rationally update their bids to maximize their utility in response to real-time schedules of prices and dispatch. For two common bidding mechanisms, based on quantity and price, we identify a notion of alignment between participants' behavior and planners' goals that leads to a saddle-based design of the market that guarantees convergence to a point meeting all operational constraints. We further explore cases where this alignment property does not hold and observe that misaligned participants' bidding can destabilize the closed-loop system. We thus design a regularized version of the market dynamics that recovers all the desirable stability and steady-state performance guarantees. Numerical tests validate our results on the IEEE 39-bus system.},
  author = {You, Pengcheng and Jiang, Yan and Yeung, Enoch and Gayme, Dennice and Mallada, Enrique},
  doi = {10.1109/TAC.2025.3589447},
  grants = {CPS-2136324; Global-Centers-2330450},
  journal = {IEEE Transactions on Automatic Control},
  month = {10},
  number = {10},
  pages = {6815-6830},
  record = {published Oct 2025, accepted Aug 2024, revised Dec 2023, submitted Dec 2021},
  title = {On the Stability, Economic Efficiency and Incentive Compatibility of Electricity Market Dynamics},
  url = {https://mallada.ece.jhu.edu/pubs/2025-TAC-YJYGM.pdf},
  volume = {70},
  year = {2025}
}

Roy defended his dissertation

Aug '25 by Enrique Mallada

Roy Siegelmann, an AMS Ph.D. student in our lab, defended his dissertation entitled “Data-Driven Analysis and Control of Dynamical Systems via Recurrent Lyapunov Functions” on Monday, August 18th. Congratulations, Dr. Siegelmann!

1 paper accepted to CDC

Jul '25 by Enrique Mallada

Our paper on recurrent control barrier functions [1] has been accepted to the 64th IEEE Conference on Decision and Control. Congrats Jixian!

[1] J. Liu and E. Mallada, “Recurrent Control Barrier Functions: A Path Towards Nonparametric Safety Verification,” in 64th IEEE Conference on Decision and Control (CDC), 2025.

Ensuring the safety of complex dynamical systems often relies on Hamilton-Jacobi (HJ) Reachability Analysis or Control Barrier Functions (CBFs). Both methods require computing a function that characterizes a safe set that can be made (control) invariant. However, the computational burden of solving high-dimensional partial differential equations (for HJ Reachability) or large-scale semidefinite programs (for CBFs) makes finding such functions challenging. In this paper, we introduce the notion of Recurrent Control Barrier Functions (RCBFs), a novel class of CBFs that leverages a recurrent property of the trajectories, i.e., coming back to a safe set, for safety verification. Under mild assumptions, we show that the RCBF condition holds for the signed-distance function, turning function design into set identification. Notably, the resulting set need not be invariant to certify safety. We further propose a data-driven nonparametric method to compute safe sets that is massively parallelizable and trades off conservativeness against computational cost.

@inproceedings{lm2025cdc,
  abstract = {Ensuring the safety of complex dynamical systems often relies on Hamilton-Jacobi (HJ) Reachability Analysis or Control Barrier Functions (CBFs). Both methods require computing a function that characterizes a safe set that can be made (control) invariant. However, the computational burden of solving high-dimensional partial differential equations (for HJ Reachability) or large-scale semidefinite programs (for CBFs) makes finding such functions challenging. In this paper, we introduce the notion of Recurrent Control Barrier Functions (RCBFs), a novel class of CBFs that leverages a recurrent property of the trajectories, i.e., coming back to a safe set, for safety verification. Under mild assumptions, we show that the RCBF condition holds for the signed-distance function, turning function design into set identification. Notably, the resulting set need not be invariant to certify safety. We further propose a data-driven nonparametric method to compute safe sets that is massively parallelizable and trades off conservativeness against computational cost.},
  author = {Liu, Jixian and Mallada, Enrique},
  booktitle = {64th IEEE Conference on Decision and Control (CDC)},
  doi = {10.1109/CDC57313.2025.11312572},
  grants = {CPS-2136324; Global-Centers-2330450},
  month = {12},
  record = {presented Dec. 2025, accepted Jul. 2025, submitted Mar. 2025},
  title = {Recurrent Control Barrier Functions: A Path Towards Nonparametric Safety Verification},
  url = {https://mallada.ece.jhu.edu/pubs/2025-CDC-LM.pdf},
  year = {2025}
}

1 paper published in TMLR

May '25 by Enrique Mallada

Our paper on a local Polyak-Lojasiewicz condition and descent lemma of gradient descent for overparametrized linear models [1] has been published in Transactions on Machine Learning Research. Congrats Ziqing!

[1] Z. Xu, H. Min, S. Tarmoun, E. Mallada, and R. Vidal, “A Local Polyak-Łojasiewicz and Descent Lemma of Gradient Descent For Overparametrized Linear Models,” Transactions on Machine Learning Research (TMLR), 2025.

@article{xmtmv2025tmlr,
  author = {Xu, Ziqing and Min, Hancheng and Tarmoun, Salma and Mallada, Enrique and Vidal, Rene},
  grants = {Global-Centers-2330450},
  issn = {2835-8856},
  journal = {Transactions on Machine Learning Research (TMLR)},
  month = {5},
  record = {accepted May 2025, submitted Feb 2025},
  title = {A Local Polyak-Łojasiewicz and Descent Lemma of Gradient Descent For Overparametrized Linear Models},
  url = {https://mallada.ece.jhu.edu/pubs/2025-TMLR-XMTMV.pdf},
  year = {2025}
}

1 paper accepted to RLC

May '25 by Enrique Mallada

Our paper on nonparametric policy improvement in continuous action spaces via expert demonstrations [1] has been accepted to the Reinforcement Learning Conference. Congrats Agustin!

[1] A. Castellano, S. Rezaei, J. Markowitz, and E. Mallada, “Nonparametric Policy Improvement in Continuous Action Spaces via Expert Demonstrations,” in Reinforcement Learning Conference, 2025, pp. 1158-1179.

The policy improvement theorem is a fundamental building block of classical reinforcement learning for discrete action spaces. Unfortunately, the lack of an analogous result for continuous action spaces with function approximation has historically limited theoretical guarantees of policy optimization algorithms, undermining their reliability. Here, we introduce a novel nonparametric policy that relies purely on data to take actions and that admits a policy improvement theorem for deterministic Markov Decision Processes (MDPs). By imposing mild regularity assumptions on the optimal policy, we show that, when data come from expert demonstrations, one can construct a nonparametric lower bound on the value of the policy, thus enabling its robust evaluation. The constructed lower bound naturally leads to a simple improvement mechanism based on adding more demonstrations. We also provide conditions to identify regions of the state space where additional demonstrations are needed to meet specific performance goals. Finally, we propose a policy optimization algorithm that ensures a monotonic improvement of the lower bound and leads to high probability performance guarantees. These contributions provide a foundational step toward establishing a rigorous framework for policy improvement in continuous action spaces.

@inproceedings{crmm2025rlc,
  abstract = {The policy improvement theorem is a fundamental building block of classical reinforcement learning for discrete action spaces. Unfortunately, the lack of an analogous result for continuous action spaces with function approximation has historically limited theoretical guarantees of policy optimization algorithms, undermining their reliability. Here, we introduce a novel nonparametric policy that relies purely on data to take actions and that admits a policy improvement theorem for deterministic Markov Decision Processes (MDPs). By imposing mild regularity assumptions on the optimal policy, we show that, when data come from expert demonstrations, one can construct a nonparametric lower bound on the value of the policy, thus enabling its robust evaluation. The constructed lower bound naturally leads to a simple improvement mechanism based on adding more demonstrations. We also provide conditions to identify regions of the state space where additional demonstrations are needed to meet specific performance goals. Finally, we propose a policy optimization algorithm that ensures a monotonic improvement of the lower bound and leads to high probability performance guarantees. These contributions provide a foundational step toward establishing a rigorous framework for policy improvement in continuous action spaces.},
  author = {Castellano, Agustin and Rezaei, Sohrab and Markowitz, Jared and Mallada, Enrique},
  booktitle = {Reinforcement Learning Conference},
  month = {8},
  pages = {1158-1179},
  record = {presented Aug. 2025, accepted May 2025, submitted Feb. 2025},
  title = {Nonparametric Policy Improvement in Continuous Action Spaces via Expert Demonstrations},
  url = {https://mallada.ece.jhu.edu/pubs/2025-RLC-CRMM.pdf},
  year = {2025}
}

1 paper accepted to ACC

May '25 by Enrique Mallada

Our tutorial paper on safe physics-informed machine learning for dynamics and control [1] has been accepted to the American Control Conference.

[1] J. Drgona, T. X. Nghiem, T. Beckers, M. Fazlyab, E. Mallada, C. Jones, D. Vrabie, S. L. Brunton, and R. Findeisen, “Safe Physics-informed Machine Learning for Dynamics and Control,” in American Control Conference (ACC), 2025, pp. 591-606.

This tutorial paper focuses on safe physics-informed machine learning in the context of dynamics and control, providing a comprehensive overview of how to integrate physical models and safety guarantees. As machine learning techniques enhance the modeling and control of complex dynamical systems, ensuring safety and stability remains a critical challenge, especially in safety-critical applications like autonomous vehicles, robotics, medical decision-making, and energy systems. We explore various approaches for embedding and ensuring safety constraints, including structural priors, Lyapunov and Control Barrier Functions, predictive control, projections, and robust optimization techniques. Additionally, we delve into methods for uncertainty quantification and safety verification, including reachability analysis and neural network verification tools, which help validate that control policies remain within safe operating bounds even in uncertain environments. The paper includes illustrative examples demonstrating the implementation aspects of safe learning frameworks that combine the strengths of data-driven approaches with the rigor of physical principles, offering a path toward the safe control of complex dynamical systems.

@inproceedings{dnbetal2025acc,
  abstract = {This tutorial paper focuses on safe physics-informed machine learning in the context of dynamics and control, providing a comprehensive overview of how to integrate physical models and safety guarantees. As machine learning techniques enhance the modeling and control of complex dynamical systems, ensuring safety and stability remains a critical challenge, especially in safety-critical applications like autonomous vehicles, robotics, medical decision-making, and energy systems. We explore various approaches for embedding and ensuring safety constraints, including structural priors, Lyapunov and Control Barrier Functions, predictive control, projections, and robust optimization techniques. Additionally, we delve into methods for uncertainty quantification and safety verification, including reachability analysis and neural network verification tools, which help validate that control policies remain within safe operating bounds even in uncertain environments. The paper includes illustrative examples demonstrating the implementation aspects of safe learning frameworks that combine the strengths of data-driven approaches with the rigor of physical principles, offering a path toward the safe control of complex dynamical systems.},
  author = {Drgona, Jan and Nghiem, Truong X. and Beckers, Thomas and Fazlyab, Mahyar and Mallada, Enrique and Jones, Colin and Vrabie, Draguna and Brunton, Steven L. and Findeisen, Rolf},
  booktitle = {American Control Conference (ACC)},
  doi = {10.23919/ACC63710.2025.11107836},
  month = {7},
  pages = {591-606},
  record = {presented Jul. 2025, accepted May 2025, submitted Mar. 2025},
  title = {Safe Physics-informed Machine Learning for Dynamics and Control},
  url = {https://mallada.ece.jhu.edu/pubs/2025-ACC-Tutorial-DNBetal.pdf},
  year = {2025}
}

Agustin received the WSE Teaching Assistant Award

Mar '25 by Enrique Mallada

Agustin Castellano received the Whiting School of Engineering Teaching Assistant Award, recognizing his exemplary support of students and faculty and his commitment to academic excellence as a graduate teaching assistant. Congrats Agustin!

2 papers accepted to AISTATS

Jan '25 by Enrique Mallada

Our papers on variance-aware linear UCB for neural contextual bandits [1] and on the learning dynamics of LoRA [2] have been accepted to the International Conference on Artificial Intelligence and Statistics. Congrats Ziqing!

[1] H. M. Bui, E. Mallada, and A. Liu, “Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits,” in International Conference on Artificial Intelligence and Statistics (AISTATS), 2025.

By leveraging the representation power of deep neural networks, neural upper confidence bound (UCB) algorithms have shown success in contextual bandits. To further balance the exploration and exploitation, we propose Neural-$\sigma^2$-LinearUCB, a variance-aware algorithm that utilizes $\sigma_t^2$, i.e., an upper bound of the reward noise variance at round $t$, to enhance the uncertainty quantification quality of the UCB, resulting in a regret performance improvement. We provide an oracle version for our algorithm characterized by an oracle variance upper bound $\sigma_t^2$ and a practical version with a novel estimation for this variance bound. Theoretically, we provide rigorous regret analysis for both versions and prove that our oracle algorithm achieves a better regret guarantee than other neural-UCB algorithms in the neural contextual bandits setting. Empirically, our practical method enjoys a similar computational efficiency, while outperforming state-of-the-art techniques by having a better calibration and lower regret across multiple standard settings, including on the synthetic, UCI, MNIST, and CIFAR-10 datasets.

@inproceedings{bml2025aistats,
  abstract = {By leveraging the representation power of deep neural networks, neural upper confidence bound (UCB) algorithms have shown success in contextual bandits. To further balance the exploration and exploitation, we propose Neural-$\sigma^2$-LinearUCB, a variance-aware algorithm that utilizes $\sigma_t^2$, i.e., an upper bound of the reward noise variance at round $t$, to enhance the uncertainty quantification quality of the UCB, resulting in a regret performance improvement. We provide an oracle version for our algorithm characterized by an oracle variance upper bound $\sigma_t^2$ and a practical version with a novel estimation for this variance bound. Theoretically, we provide rigorous regret analysis for both versions and prove that our oracle algorithm achieves a better regret guarantee than other neural-UCB algorithms in the neural contextual bandits setting. Empirically, our practical method enjoys a similar computational efficiency, while outperforming state-of-the-art techniques by having a better calibration and lower regret across multiple standard settings, including on the synthetic, UCI, MNIST, and CIFAR-10 datasets.},
  author = {Bui, Ha Manh and Mallada, Enrique and Liu, Anqi},
  booktitle = {International Conference on Artificial Intelligence and Statistics (AISTATS)},
  grants = {No Grant},
  month = {4},
  publisher = {PMLR},
  record = {accepted Jan 2025, submitted Oct 2024},
  series = {Proceedings of Machine Learning Research},
  title = {Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits},
  url = {https://mallada.ece.jhu.edu/pubs/2025-AISTATS-BML.pdf},
  year = {2025}
}

[2] Z. Xu, H. Min, L. E. MacDonald, J. Luo, S. Tarmoun, E. Mallada, and R. Vidal, “Understanding the Learning Dynamics of LoRA: A Gradient Flow Perspective on Low-Rank Adaptation in Matrix Factorization,” in International Conference on Artificial Intelligence and Statistics (AISTATS), 2025.

Despite the empirical success of Low-Rank Adaptation (LoRA) in fine-tuning pretrained models, there is little theoretical understanding of how first-order methods with carefully crafted initialization adapt models to new tasks. In this work, we take the first step towards bridging this gap by theoretically analyzing the learning dynamics of LoRA for matrix factorization (MF) under gradient flow (GF), emphasizing the crucial role of initialization. For small initialization, we theoretically show that GF converges to a neighborhood of the optimal solution, with smaller initialization leading to lower final error. Our analysis shows that the final error is affected by the misalignment between the singular spaces of the pre-trained model and the target matrix, and reducing the initialization scale improves alignment. To address this misalignment, we propose a spectral initialization for LoRA in MF and theoretically prove that GF with small spectral initialization converges to the fine-tuning task with arbitrary precision. Numerical experiments from MF and image classification validate our findings.

@inproceedings{xmmltmv2025aistats,
  abstract = {Despite the empirical success of Low-Rank Adaptation (LoRA) in fine-tuning pretrained models, there is little theoretical understanding of how first-order methods with carefully crafted initialization adapt models to new tasks. In this work, we take the first step towards bridging this gap by theoretically analyzing the learning dynamics of LoRA for matrix factorization (MF) under gradient flow (GF), emphasizing the crucial role of initialization. For small initialization, we theoretically show that GF converges to a neighborhood of the optimal solution, with smaller initialization leading to lower final error. Our analysis shows that the final error is affected by the misalignment between the singular spaces of the pre-trained model and the target matrix, and reducing the initialization scale improves alignment. To address this misalignment, we propose a spectral initialization for LoRA in MF and theoretically prove that GF with small spectral initialization converges to the fine-tuning task with arbitrary precision. Numerical experiments from MF and image classification validate our findings.},
  author = {Xu, Ziqing and Min, Hancheng and MacDonald, Lachlan Ewen and Luo, Jinqi and Tarmoun, Salma and Mallada, Enrique and Vidal, Rene},
  booktitle = {International Conference on Artificial Intelligence and Statistics (AISTATS)},
  grants = {Global Centers},
  month = {4},
  publisher = {PMLR},
  record = {accepted Jan 2025, submitted Oct 2024},
  series = {Proceedings of Machine Learning Research},
  title = {Understanding the Learning Dynamics of LoRA: A Gradient Flow Perspective on Low-Rank Adaptation in Matrix Factorization},
  url = {https://mallada.ece.jhu.edu/pubs/2025-AISTATS-XMMLTMV.pdf},
  year = {2025}
}