Enrique Mallada's Home Page

Associate Professor • Electrical & Computer Engineering • (he/him/his)

Barton Hall 312 • 3400 N Charles St • Baltimore • MD 21218
phone: 410-516-7018 • fax: 410-516-5566 • mallada [at] jhu [dot] edu

I have been an associate professor in Electrical and Computer Engineering (ECE) at Johns Hopkins University (JHU) since July 2022. I earned my Ph.D. in ECE with a minor in Applied Mathematics from Cornell University in Jan 2014 under the supervision of an awesome advisor and person, Prof. A. Kevin Tang. Before joining JHU as an assistant professor in Jan 2016, I was a postdoctoral scholar at the Center for the Mathematics of Information (CMI) in the Computational and Mathematical Sciences (CMS) department at Caltech from 2013 to 2015, where I had the pleasure to be mentored by Prof. Steven Low and Prof. Adam Wierman.

Research Interests

Networked Systems: coupled oscillators, clock synchronization, saddle-flows, network coherence, distributed coordination, consensus
Power Systems: frequency control, inverter-based control, real-time congestion management, electricity markets, reduced-order models
Optimization: time-varying optimization, primal-dual algorithms, semidefinite programming, sum-of-squares optimization
Machine Learning: reinforcement learning, sparse recovery, subspace preserving recovery, network tomography, multi-armed bandits

Recent Talks

A complete list of talks can be found here.

2025-04-03: Grid Shaping Control for High-IBR Power Systems: Stability Analysis and Control Design, Caltech.
[BibTeX] [Abstract] [Download PDF]

The transition of power systems from conventional synchronous generation toward renewable energy sources — with little, or no, inertia — is gradually threatening classical methods for achieving grid synchronization. A widely embraced approach to mitigate this problem is to mimic inertial response using grid-connected inverters. That is, to introduce virtual inertia to restore the stiffness that the system previously maintained. In this talk, we seek to challenge this approach. We advocate leveraging the system’s low inertia to restore grid synchronism without incurring excessive control efforts. To this end, we develop an analysis and design framework for inverter-based frequency control. First, we develop novel stability analysis tools for power systems, which allow for the decentralized design of inverter-based controllers. The method requires that each inverter satisfy a standard H-infinity design requirement that depends on the dynamics of the components and inverters at each bus and the aggregate susceptance of the transmission lines connected to it. It is robust to network and delay uncertainty and, when no network information is available, this condition reduces to the standard passivity condition for stability. Then, we propose a novel grid-forming control strategy, called grid shaping control, that aims to shape the frequency response of synchronous generators (SGs) to load disturbances to efficiently arrest sudden frequency drops. The approach builds on novel analysis tools that characterize the Center of Inertia (CoI) response of a system with both IBRs and SGs and use this characterization to reshape it.

@talk{caltech25b,
  abstract = {The transition of power systems from conventional synchronous generation toward renewable energy sources --- with little, or no, inertia --- is gradually threatening classical methods for achieving grid synchronization. A widely embraced approach to mitigate this problem is to mimic inertial response using grid-connected inverters. That is, to introduce virtual inertia to restore the stiffness that the system previously maintained. In this talk, we seek to challenge this approach. We advocate leveraging the system's low inertia to restore grid synchronism without incurring excessive control efforts. To this end, we develop an analysis and design framework for inverter-based frequency control. First, we develop novel stability analysis tools for power systems, which allow for the decentralized design of inverter-based controllers. The method requires that each inverter satisfy a standard H-infinity design requirement that depends on the dynamics of the components and inverters at each bus and the aggregate susceptance of the transmission lines connected to it. It is robust to network and delay uncertainty and, when no network information is available, this condition reduces to the standard passivity condition for stability. Then, we propose a novel grid-forming control strategy, called grid shaping control, that aims to shape the frequency response of synchronous generators (SGs) to load disturbances to efficiently arrest sudden frequency drops. The approach builds on novel analysis tools that characterize the Center of Inertia (CoI) response of a system with both IBRs and SGs and use this characterization to reshape it.

},
  date = {04/3/2025},
  day = {03},
  event = {Caltech},
  host = {Steven Low},
  month = {04},
  role = {Speaker},
  title = {Grid Shaping Control for High-IBR Power Systems: Stability Analysis and Control Design},
  url = {https://mallada.ece.jhu.edu/talks/202504b-Caltech.pdf},
  year = {2025}
}

2025-04-02: Nonparametric Analysis and Control of Dynamical Systems: Stability, Safety and Policy Improvement, Caltech CDS.
[BibTeX] [Abstract] [Download PDF]

This talk presents a novel nonparametric framework for analyzing dynamical systems and synthesizing control policies that relies purely on trajectory data and is designed to exploit GPU parallelization for scalability. The key insight behind this work is to relax strict objectives, such as invariance and optimality, and replace them with weaker conditions that enable a flexible trade-off between accuracy, computational complexity, and sample efficiency. First, we introduce the concept of recurrence, a relaxation of invariance that allows trajectories to leave a set temporarily before returning within a finite time. This relaxed condition serves as a functional substitute for invariance and provides an alternative foundation for analyzing dynamical systems. By leveraging recurrence, we develop integral Lyapunov and barrier function conditions, where function values are required to be eventually monotonic over a finite time window rather than strictly increasing or decreasing. This relaxation offers a more flexible framework for stability and safety verification, enabling a trade-off between verification accuracy and computational complexity. Next, we turn to the policy optimization problem and introduce a class of nonparametric policies designed for continuous action spaces. These policies rely purely on (expert) trajectory data to construct a nonparametric lower bound, Q_lb, on the optimal action-value function Q^⋆. Crucially, we show that this policy representation admits a policy improvement theorem, overcoming a key limitation faced by function approximation methods in continuous action spaces. Building on this result, we develop a practical algorithm that drives continual policy improvement by selectively incorporating new expert demonstrations, ensuring efficient data use while achieving monotonic performance gains.

@talk{caltech25a,
  abstract = {This talk presents a novel nonparametric framework for analyzing dynamical systems and synthesizing control policies that relies purely on trajectory data and is designed to exploit GPU parallelization for scalability. The key insight behind this work is to relax strict objectives, such as invariance and optimality, and replace them with weaker conditions that enable a flexible trade-off between accuracy, computational complexity, and sample efficiency.

First, we introduce the concept of recurrence, a relaxation of invariance that allows trajectories to leave a set temporarily before returning within a finite time. This relaxed condition serves as a functional substitute for invariance and provides an alternative foundation for analyzing dynamical systems. By leveraging recurrence, we develop integral Lyapunov and barrier function conditions, where function values are required to be eventually monotonic over a finite time window rather than strictly increasing or decreasing. This relaxation offers a more flexible framework for stability and safety verification, enabling a trade-off between verification accuracy and computational complexity.

Next, we turn to the policy optimization problem and introduce a class of nonparametric policies designed for continuous action spaces. These policies rely purely on (expert) trajectory data to construct a nonparametric lower bound, Q_lb, on the optimal action-value function Q^⋆. Crucially, we show that this policy representation admits a policy improvement theorem, overcoming a key limitation faced by function approximation methods in continuous action spaces. Building on this result, we develop a practical algorithm that drives continual policy improvement by selectively incorporating new expert demonstrations, ensuring efficient data use while achieving monotonic performance gains.},
  date = {4/02/2025},
  day = {02},
  event = {Caltech CDS},
  host = {Steven Low},
  month = {04},
  role = {Speaker},
  title = {Nonparametric Analysis and Control of Dynamical Systems: Stability, Safety and Policy Improvement},
  url = {https://mallada.ece.jhu.edu/talks/202504a-Caltech.pdf},
  year = {2025}
}

2024-11-19: Nonparametric Analysis of Dynamical Systems: From Recurrent Sets to Generalized Lyapunov and Barrier Conditions, ESE Fall Colloquium, UPenn.
[BibTeX] [Abstract] [Download PDF]

This talk presents novel non-parametric methods for analyzing dynamical systems using solely trajectory data. Our critical insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed condition of recurrence. Specifically, a set is τ-recurrent if every trajectory that starts within the set returns to it after at most τ seconds. We leverage this notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. Firstly, we consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point using trajectory data. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions and develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence obtained by sampling finite-length trajectories. Secondly, we generalize Lyapunov and Barrier Function Methods to allow for non-monotonic evolution of the function values by only requiring sub-level sets to be τ-recurrent (instead of invariant). We provide conditions for stability and safety using τ-monotonic functions (functions whose value along trajectories monotonically increases or decreases after at most τ seconds) and develop a verification algorithm that leverages GPU parallel processing power to verify stability and safety using only trajectory information. We finalize by discussing future research directions and possible extensions for control.

@talk{upenn24,
  abstract = {This talk presents novel non-parametric methods for analyzing dynamical systems using solely trajectory data. Our critical insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed condition of recurrence. Specifically, a set is τ-recurrent if every trajectory that starts within the set returns to it after at most τ seconds. We leverage this notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. Firstly, we consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point using trajectory data. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions and develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence obtained by sampling finite-length trajectories. Secondly, we generalize Lyapunov and Barrier Function Methods to allow for non-monotonic evolution of the function values by only requiring sub-level sets to be τ-recurrent (instead of invariant). We provide conditions for stability and safety using τ-monotonic functions (functions whose value along trajectories monotonically increases or decreases after at most τ seconds) and develop a verification algorithm that leverages GPU parallel processing power to verify stability and safety using only trajectory information. We finalize by discussing future research directions and possible extensions for control.},
  date = {11/20/2024},
  day = {19},
  event = {ESE Fall Colloquium, UPenn},
  host = {Rene Vidal},
  month = {11},
  role = {Speaker},
  title = {Nonparametric Analysis of Dynamical Systems: From Recurrent Sets to Generalized Lyapunov and Barrier Conditions},
  url = {https://mallada.ece.jhu.edu/talks/202411-UPenn.pdf},
  year = {2024}
}

2024-09-25: Generalized Barrier Functions: Integral Conditions and Recurrent Relaxations, 60th Allerton Conference on Communication, Control, and Computing.
[BibTeX] [Abstract] [Download PDF]

Integrating Reinforcement Learning (RL) in safety-critical applications, such as autonomous vehicles, healthcare, and industrial automation, necessitates an increased focus on safety and reliability. In this talk, we consider two complementary mechanisms to augment RL’s suitability for safety-critical systems. Firstly, we consider a constrained reinforcement learning (C-RL) setting, wherein agents aim to maximize rewards while adhering to required constraints on secondary specifications. Several algorithms rooted in sampled-based primal-dual methods have been recently proposed to solve this problem in policy space. However, such methods exhibit a discrepancy between the behavioral and optimal policies due to their reliance on stochastic gradient descent-ascent algorithms. We propose a novel algorithm for constrained RL that does not suffer from these limitations. Leveraging recent results on regularized saddle-flow dynamics, we develop a novel stochastic gradient descent-ascent algorithm whose trajectories almost surely converge to the optimal policy. Secondly, we study the problem of incorporating safety-critical constraints to RL that allow an agent to avoid (unsafe) regions of the state space. Though such a safety goal can be captured by an action-value-like function, a.k.a. safety critics, the associated operator lacks the desired contraction and uniqueness properties that the classical Bellman operator enjoys. In this work, we overcome the non-contractiveness of safety critic operators by leveraging that safety is a binary property. To that end, we study the properties of the binary safety critic associated with a deterministic dynamical system that seeks to avoid reaching an unsafe region. We formulate the corresponding binary Bellman equation (B2E) for safety and study its properties. While the resulting operator is still non-contractive, we fully characterize its fixed points representing–except for a spurious solution–maximal persistently safe regions of the state space that can always avoid failure. We provide an algorithm that, by design, leverages axiomatic knowledge of safe data to avoid spurious fixed points.

@talk{allerton24,
  abstract = {Integrating Reinforcement Learning (RL) in safety-critical applications, such as autonomous vehicles, healthcare, and industrial automation, necessitates an increased focus on safety and reliability. In this talk, we consider two complementary mechanisms to augment RL's suitability for safety-critical systems.

Firstly, we consider a constrained reinforcement learning (C-RL) setting, wherein agents aim to maximize rewards while adhering to required constraints on secondary specifications. Several algorithms rooted in sampled-based primal-dual methods have been recently proposed to solve this problem in policy space. However, such methods exhibit a discrepancy between the behavioral and optimal policies due to their reliance on stochastic gradient descent-ascent algorithms. We propose a novel algorithm for constrained RL that does not suffer from these limitations. Leveraging recent results on regularized saddle-flow dynamics, we develop a novel stochastic gradient descent-ascent algorithm whose trajectories almost surely converge to the optimal policy.

Secondly, we study the problem of incorporating safety-critical constraints to RL that allow an agent to avoid (unsafe) regions of the state space. Though such a safety goal can be captured by an action-value-like function, a.k.a. safety critics, the associated operator lacks the desired contraction and uniqueness properties that the classical Bellman operator enjoys. In this work, we overcome the non-contractiveness of safety critic operators by leveraging that safety is a binary property. To that end, we study the properties of the binary safety critic associated with a deterministic dynamical system that seeks to avoid reaching an unsafe region. We formulate the corresponding binary Bellman equation (B2E) for safety and study its properties. While the resulting operator is still non-contractive, we fully characterize its fixed points representing--except for a spurious solution--maximal persistently safe regions of the state space that can always avoid failure. We provide an algorithm that, by design, leverages axiomatic knowledge of safe data to avoid spurious fixed points.},
  date = {06/12/2024},
  day = {25},
  event = {60th Allerton Conference on Communication, Control, and Computing},
  host = {N/A},
  month = {09},
  role = {Speaker},
  title = {Generalized Barrier Functions: Integral Conditions and Recurrent Relaxations},
  url = {https://mallada.ece.jhu.edu/talks/202409-Allerton.pdf},
  year = {2024}
}

2024-06-12: Reinforcement Learning for Safety Critical Applications, Tercera Conferencia Colombiana de Matematicas Aplicadas e Industriales.
[BibTeX] [Abstract] [Download PDF]

Integrating Reinforcement Learning (RL) in safety-critical applications, such as autonomous vehicles, healthcare, and industrial automation, necessitates an increased focus on safety and reliability. In this talk, we consider two complementary mechanisms to augment RL’s suitability for safety-critical systems. Firstly, we consider a constrained reinforcement learning (C-RL) setting, wherein agents aim to maximize rewards while adhering to required constraints on secondary specifications. Several algorithms rooted in sampled-based primal-dual methods have been recently proposed to solve this problem in policy space. However, such methods exhibit a discrepancy between the behavioral and optimal policies due to their reliance on stochastic gradient descent-ascent algorithms. We propose a novel algorithm for constrained RL that does not suffer from these limitations. Leveraging recent results on regularized saddle-flow dynamics, we develop a novel stochastic gradient descent-ascent algorithm whose trajectories almost surely converge to the optimal policy. Secondly, we study the problem of incorporating safety-critical constraints to RL that allow an agent to avoid (unsafe) regions of the state space. Though such a safety goal can be captured by an action-value-like function, a.k.a. safety critics, the associated operator lacks the desired contraction and uniqueness properties that the classical Bellman operator enjoys. In this work, we overcome the non-contractiveness of safety critic operators by leveraging that safety is a binary property. To that end, we study the properties of the binary safety critic associated with a deterministic dynamical system that seeks to avoid reaching an unsafe region. We formulate the corresponding binary Bellman equation (B2E) for safety and study its properties. While the resulting operator is still non-contractive, we fully characterize its fixed points representing–except for a spurious solution–maximal persistently safe regions of the state space that can always avoid failure. We provide an algorithm that, by design, leverages axiomatic knowledge of safe data to avoid spurious fixed points.

@talk{mapi24,
  abstract = {Integrating Reinforcement Learning (RL) in safety-critical applications, such as autonomous vehicles, healthcare, and industrial automation, necessitates an increased focus on safety and reliability. In this talk, we consider two complementary mechanisms to augment RL's suitability for safety-critical systems.

Firstly, we consider a constrained reinforcement learning (C-RL) setting, wherein agents aim to maximize rewards while adhering to required constraints on secondary specifications. Several algorithms rooted in sampled-based primal-dual methods have been recently proposed to solve this problem in policy space. However, such methods exhibit a discrepancy between the behavioral and optimal policies due to their reliance on stochastic gradient descent-ascent algorithms. We propose a novel algorithm for constrained RL that does not suffer from these limitations. Leveraging recent results on regularized saddle-flow dynamics, we develop a novel stochastic gradient descent-ascent algorithm whose trajectories almost surely converge to the optimal policy.

Secondly, we study the problem of incorporating safety-critical constraints to RL that allow an agent to avoid (unsafe) regions of the state space. Though such a safety goal can be captured by an action-value-like function, a.k.a. safety critics, the associated operator lacks the desired contraction and uniqueness properties that the classical Bellman operator enjoys. In this work, we overcome the non-contractiveness of safety critic operators by leveraging that safety is a binary property. To that end, we study the properties of the binary safety critic associated with a deterministic dynamical system that seeks to avoid reaching an unsafe region. We formulate the corresponding binary Bellman equation (B2E) for safety and study its properties. While the resulting operator is still non-contractive, we fully characterize its fixed points representing--except for a spurious solution--maximal persistently safe regions of the state space that can always avoid failure. We provide an algorithm that, by design, leverages axiomatic knowledge of safe data to avoid spurious fixed points.},
  date = {06/12/2024},
  day = {12},
  event = {Tercera Conferencia Colombiana de Matematicas Aplicadas e Industriales},
  host = {Javier Peña (CMU), Mateo Diaz (JHU)},
  month = {06},
  role = {Speaker},
  title = {Reinforcement Learning for Safety Critical Applications},
  url = {https://mallada.ece.jhu.edu/talks/202406-MAPI.pdf},
  year = {2024}
}

2024-06-18: Data-driven Analysis of Dynamical Systems Using Recurrent Sets, INFORMS International Conference.
[BibTeX] [Abstract] [Download PDF]

In this talk, we develop model-free methods for analyzing dynamical systems using trajectory data. Our critical insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed notion of recurrence. Specifically, a set is τ-recurrent (resp. k-recurrent) if every trajectory that starts within the set returns to it after at most τ seconds (resp. k steps). We leverage this notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. Firstly, we consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point using trajectory data. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions. We then develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence that are obtained by sampling finite-length trajectories. Secondly, we generalize Lyapunov’s Direct Method to allow for non-monotonic evolution of the function values by only requiring sub-level sets to be τ-recurrent (instead of invariant). We provide conditions for stability, asymptotic stability, and exponential stability of an equilibrium using τ-decreasing functions (functions whose value along trajectories decrease after at most τ seconds) and develop a verification algorithm that leverages GPU parallel processing to verify such conditions using trajectories. We finalize by discussing future research directions and possible extensions for control.

@talk{informs2024,
  abstract = {In this talk, we develop model-free methods for analyzing dynamical systems using trajectory data. Our critical insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed notion of recurrence. Specifically, a set is τ-recurrent (resp. k-recurrent) if every trajectory that starts within the set returns to it after at most τ seconds (resp. k steps). We leverage this notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. Firstly, we consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point using trajectory data. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions. We then develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence that are obtained by sampling finite-length trajectories. Secondly, we generalize Lyapunov's Direct Method to allow for non-monotonic evolution of the function values by only requiring sub-level sets to be τ-recurrent (instead of invariant). We provide conditions for stability, asymptotic stability, and exponential stability of an equilibrium using τ-decreasing functions (functions whose value along trajectories decrease after at most τ seconds) and develop a verification algorithm that leverages GPU parallel processing to verify such conditions using trajectories. We finalize by discussing future research directions and possible extensions for control.},
  date = {06/18/2024},
  day = {18},
  event = {INFORMS International Conference},
  host = {Luis Zuluaga (Lehigh), Mateo Diaz (JHU)},
  month = {06},
  role = {Speaker},
  title = {Data-driven Analysis of Dynamical Systems Using Recurrent Sets},
  url = {https://mallada.ece.jhu.edu/talks/202406-Informs.pdf},
  year = {2024}
}

2024-06-05: Data-driven Analysis of Dynamical Systems Using Recurrent Sets, Department of Automatic Control, Lund University.
[BibTeX] [Abstract] [Download PDF]

In this talk, we develop model-free methods for analyzing dynamical systems using trajectory data. Our critical insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed notion of recurrence. Specifically, a set is τ-recurrent (resp. k-recurrent) if every trajectory that starts within the set returns to it after at most τ seconds (resp. k steps). We leverage this notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. Firstly, we consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point using trajectory data. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions. We then develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence that are obtained by sampling finite-length trajectories. Secondly, we generalize Lyapunov’s Direct Method to allow for non-monotonic evolution of the function values by only requiring sub-level sets to be τ-recurrent (instead of invariant). We provide conditions for stability, asymptotic stability, and exponential stability of an equilibrium using τ-decreasing functions (functions whose value along trajectories decrease after at most τ seconds) and develop a verification algorithm that leverages GPU parallel processing to verify such conditions using trajectories. We finalize by discussing future research directions and possible extensions for control.

@talk{lund2024,
  abstract = {In this talk, we develop model-free methods for analyzing dynamical systems using trajectory data. Our critical insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed notion of recurrence. Specifically, a set is τ-recurrent (resp. k-recurrent) if every trajectory that starts within the set returns to it after at most τ seconds (resp. k steps). We leverage this notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. Firstly, we consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point using trajectory data. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions. We then develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence that are obtained by sampling finite-length trajectories. Secondly, we generalize Lyapunov's Direct Method to allow for non-monotonic evolution of the function values by only requiring sub-level sets to be τ-recurrent (instead of invariant). We provide conditions for stability, asymptotic stability, and exponential stability of an equilibrium using τ-decreasing functions (functions whose value along trajectories decrease after at most τ seconds) and develop a verification algorithm that leverages GPU parallel processing to verify such conditions using trajectories. We finalize by discussing future research directions and possible extensions for control.},
  date = {06/05/2024},
  day = {05},
  event = {Department of Automatic Control, Lund University},
  host = {Richard Pates (Lund)},
  month = {06},
  role = {Lecture},
  title = {Data-driven Analysis of Dynamical Systems Using Recurrent Sets},
  url = {https://mallada.ece.jhu.edu/talks/202406-Lund.pdf},
  year = {2024}
}

2024-06-06: Data-driven Analysis of Dynamical Systems Using Recurrent Sets, Cyber-Physical Systems Lab, Université catholique de Louvain.
[BibTeX] [Abstract] [Download PDF]

In this talk, we develop model-free methods for analyzing dynamical systems using trajectory data. Our critical insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed notion of recurrence. Specifically, a set is τ-recurrent (resp. k-recurrent) if every trajectory that starts within the set returns to it after at most τ seconds (resp. k steps). We leverage this notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. Firstly, we consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point using trajectory data. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions. We then develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence that are obtained by sampling finite-length trajectories. Secondly, we generalize Lyapunov’s Direct Method to allow for non-monotonic evolution of the function values by only requiring sub-level sets to be τ-recurrent (instead of invariant). We provide conditions for stability, asymptotic stability, and exponential stability of an equilibrium using τ-decreasing functions (functions whose value along trajectories decrease after at most τ seconds) and develop a verification algorithm that leverages GPU parallel processing to verify such conditions using trajectories. We finalize by discussing future research directions and possible extensions for control.

@talk{ucl2024,
  abstract = {In this talk, we develop model-free methods for analyzing dynamical systems using trajectory data. Our critical insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed notion of recurrence. Specifically, a set is τ-recurrent (resp. k-recurrent) if every trajectory that starts within the set returns to it after at most τ seconds (resp. k steps). We leverage this notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. Firstly, we consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point using trajectory data. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions. We then develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence that are obtained by sampling finite-length trajectories. Secondly, we generalize Lyapunov's Direct Method to allow for non-monotonic evolution of the function values by only requiring sub-level sets to be τ-recurrent (instead of invariant). We provide conditions for stability, asymptotic stability, and exponential stability of an equilibrium using τ-decreasing functions (functions whose value along trajectories decrease after at most τ seconds) and develop a verification algorithm that leverages GPU parallel processing to verify such conditions using trajectories. We finalize by discussing future research directions and possible extensions for control.},
  date = {06/06/2024},
  day = {06},
  event = {Cyber-Physical Systems Lab, Université catholique de Louvain},
  host = {Raphael Jungers (UCL)},
  month = {06},
  role = {Lecture},
  title = {Data-driven Analysis of Dynamical Systems Using Recurrent Sets},
  url = {https://mallada.ece.jhu.edu/talks/202406-UCL.pdf},
  year = {2024}
}

Publications Snapshot

A complete list of publications can be found here or on my Google Scholar profile.

Preprints

R. Siegelmann, Y. Shen, F. Paganini, and E. Mallada, Stability Analysis and Data-driven Verification via Recurrent Lyapunov Functions, 2025, submitted.
[BibTeX] [Abstract] [Download PDF]

Lyapunov’s direct method is an instrumental tool that provides a rigorous framework for stability analysis and control design for dynamical systems. A critical step that enables the application of the method is the availability of a Lyapunov function $V$—a function whose value monotonically decreases along the trajectories of the dynamical system. Unfortunately, finding a Lyapunov function is often tricky and requires ingenuity, domain knowledge, or significant computational power. At the core of this challenge is the fact that the method requires every sub-level set of $V$ ($V_łeq c$) to be forward invariant, thus implicitly coupling the geometry of $V_łeq c$ and the trajectories of the system. In this paper, we seek to disentangle this dependence by developing a direct method that substitutes the concept of invariance with the more flexible notion of recurrence. A set is ($τ$-)recurrent if every trajectory that starts in the set returns to it (within $τ$ seconds). We show that, under mild conditions, the recurrence of sub-level sets $V_łeq c$ is sufficient to guarantee stability and introduce the appropriate stronger notions to obtain asymptotic stability and exponential stability. Most notably, we provide norm-agnostic converse theorems showing that, under mild conditions, any norm satisfies our relaxed stability conditions, provided one is willing to certify a slightly weaker stability condition. We further develop GPU-based algorithms that can verify (practical) stability notions using purely trajectory data, and without the need of computing a Lyapunov function. Our analysis and methods further highlight an intrinsic trade-off between the sample/computational complexity and the certified performance that our algorithms navigate.

@unpublished{sspm2025a-preprint,
  abstract = {Lyapunov's direct method is an instrumental tool that provides a rigorous framework for stability analysis and control design for dynamical systems. A critical step that enables the application of the method is the availability of a Lyapunov function $V$---a function whose value monotonically decreases along the trajectories of the dynamical system. Unfortunately, finding a Lyapunov function is often tricky and requires ingenuity, domain knowledge, or significant computational power. At the core of this challenge is the fact that the method requires every sub-level set of $V$ ($V_łeq c$) to be forward invariant, thus implicitly coupling the geometry of $V_łeq c$ and the trajectories of the system. In this paper, we seek to disentangle this dependence by developing a direct method that substitutes the concept of invariance with the more flexible notion of recurrence. A set is ($τ$-)recurrent if every trajectory that starts in the set returns to it (within $τ$ seconds). We show that, under mild conditions,  the recurrence of sub-level sets $V_łeq c$ is sufficient to guarantee stability and introduce the appropriate stronger notions to obtain asymptotic stability and exponential stability. 
Most notably, we provide norm-agnostic converse theorems showing that, under mild conditions, any norm satisfies our relaxed stability conditions, provided one is willing to certify a slightly weaker stability condition.
We further develop GPU-based algorithms that can verify (practical) stability notions using purely trajectory data, and without the need of computing a Lyapunov function. 
Our analysis and methods further highlight an intrinsic trade-off between the sample/computational complexity and the certified performance that our algorithms navigate.},
  author = {Siegelmann, Roy and Shen, Yue and Paganini, Fernando and Mallada, Enrique},
  grants = {Global Centers-2330450},
  month = {07},
  title = {Stability Analysis and Data-driven Verification via Recurrent Lyapunov Functions},
  url = {https://mallada.ece.jhu.edu/pubs/2025-Preprint-SSPM.pdf},
  year = {2025, submitted}
}

H. Sibai and E. Mallada, Recurrence of Nonlinear Control Systems: Entropy, Bit Rates, and Finite Alphabets, 2025, submitted.
[BibTeX] [Abstract] [Download PDF]

In this paper, we introduce the notion of recurrence entropy in the context of nonlinear control systems. A set is said to be ($τ$-)recurrent if every trajectory that starts in the set returns to it (within at most $τ$ units of time). The recurrence entropy of a control system quantifies the complexity of making a set $τ$-recurrent measured by the average rate of growth, as time increases, of the number of control signals required to achieve this goal. Our analysis reveals that, compared to invariance, recurrence is quantitatively less complex, meaning that the recurrence entropy of a set is no larger than, and often strictly smaller than, the invariance entropy. We provide upper and lower bounds on recurrence entropy and show that they converge to the bounds on invariance entropy as $τ$ decreases to zero. Further, our results show that recurrence entropy lower bounds the minimum data rate between the sensor and controller required for achieving recurrence. We present an algorithm according to which the sensor can send state estimates to the controller over a limited-bandwidth channel to achieve recurrence asymptotically at an exponential rate. Finally, we show that, under mild stricter conditions on the set and dynamics, the control signals that enforce the $τ$-recurrence of a set can be generated by a finite alphabet of control signals of durations of at most $τ$ units of time, which allows us to store them for quick online execution.

@unpublished{sm2025a-preprint,
  abstract = {In this paper, we introduce the notion of recurrence entropy in the context of nonlinear control systems. A set is said to be ($τ$-)recurrent if every trajectory that starts in the set returns to it (within at most $τ$ units of time). The recurrence entropy of a control system quantifies the complexity of making a set $τ$-recurrent measured by the average rate of growth, as time increases, of the number of control signals required to achieve this goal. Our analysis reveals that, compared to invariance, recurrence is quantitatively less complex, meaning that the recurrence entropy of a set is no larger than, and often strictly smaller than, the invariance entropy. We provide upper and lower bounds on recurrence entropy and show that they converge to the bounds on invariance entropy as $τ$ decreases to zero. Further, our results show that recurrence entropy lower bounds the minimum data rate between the sensor and controller required for achieving recurrence. We present an algorithm according to which the sensor can send state estimates to the controller over a limited-bandwidth channel to achieve recurrence asymptotically at an exponential rate. Finally, we show that, under mild stricter conditions on the set and dynamics, the control signals that enforce the $τ$-recurrence of a set can be generated by a finite alphabet of control signals of durations of at most $τ$ units of time, which allows us to store them for quick online execution.},
  author = {Sibai, Hussein and Mallada, Enrique},
  grants = {CPS-2136324, Global-Centers-2330450},
  month = {02},
  pages = {1--16},
  record = {submitted Feb 2025},
  title = {Recurrence of Nonlinear Control Systems: Entropy, Bit Rates, and Finite Alphabets},
  url = {https://mallada.ece.jhu.edu/pubs/2025-Preprint-SM.pdf},
  year = {2025, submitted}
}

J. Drgona, T. X. Nghiem, T. Beckers, M. Fazlyab, E. Mallada, C. Jones, D. Vrabie, Brunton, S. L., and R. Findeisen, Safe Physics-informed Machine Learning for Dynamics and Control, 2025, submitted.
[BibTeX] [Download PDF]

@unpublished{dnbetal2025a-preprint,
  author = {Drgona, Jan and Nghiem, Truong X. and Beckers, Thomas and Fazlyab, Mahyar and Mallada, Enrique and Jones, Colin and Vrabie, Draguna and Brunton, and Steven L. and Findeisen, Rolf},
  month = {4},
  title = {Safe Physics-informed Machine Learning for Dynamics and Control},
  url = {https://mallada.ece.jhu.edu/pubs/2025-Preprint-DNBetal.pdf},
  year = {2025, submitted}
}

R. K. Bansal, E. Mallada, and P. Hidalgo-Gonzalez, A Market Mechanism for a Two-stage Settlement Electricity Market with Energy Storage, 2024, submitted.
[BibTeX] [Abstract] [Download PDF]

Electricity markets typically clear in two stages: a day-ahead market and a real-time market. In this paper, we propose market mechanisms for a two-stage multi-interval electricity market with energy storage, generators, and demand uncertainties. We consider two possible mixed bidding strategies: storage first bids cycle depths in the day ahead and then charge-discharge power bids in the real-time market for any last-minute adjustments. While the first strategy only considers day-ahead decisions from an individual participant’s perspective as part of their individual optimization formulation, the second strategy accounts for both the market operator’s and participants’ perspectives. We demonstrate that the competitive equilibrium exists uniquely for both mechanisms. However, accounting for the day-ahead decisions in the bidding function has several advantages. Numerical experiments using New York ISO data provide bounds on the proposed market mechanism.

@unpublished{bmh2024a-preprint,
  abstract = {Electricity markets typically clear in two stages: a day-ahead market and a real-time market. In this paper, we propose market mechanisms for a two-stage multi-interval electricity market with energy storage, generators, and demand uncertainties. We consider two possible mixed bidding strategies: storage first bids cycle depths in the day ahead and then charge-discharge power bids in the real-time market for any last-minute adjustments. While the first strategy only considers day-ahead decisions from an individual participant's perspective as part of their individual optimization formulation, the second strategy accounts for both the market operator's and participants' perspectives. We demonstrate that the competitive equilibrium exists uniquely for both mechanisms. However, accounting for the day-ahead decisions in the bidding function has several advantages. Numerical experiments using New York ISO data provide bounds on the proposed market mechanism.},
  author = {Bansal, Rajni Kant and Mallada, Enrique and Hidalgo-Gonzalez, Patricia},
  grants = {CPS-2136324, Global-Centers-2330450},
  month = {03},
  title = {A Market Mechanism for a Two-stage Settlement Electricity Market with Energy Storage},
  url = {https://mallada.ece.jhu.edu/pubs/2024-Preprint-BMH.pdf},
  year = {2024, submitted}
}

P. You, Y. Liu, and E. Mallada, A Unified Analysis of Saddle Flow Dynamics: Stability and Algorithm Design, 2024, submitted.
[BibTeX] [Abstract] [Download PDF]

This work examines the conditions for asymptotic and exponential convergence of saddle flow dynamics of convex-concave functions. First, we propose an observability-based certificate for asymptotic convergence, directly bridging the gap between the invariant set in a LaSalle argument and the equilibrium set of saddle flows. This certificate generalizes conventional conditions for convergence, e.g., strict convexity-concavity, and leads to a novel state-augmentation method that requires minimal assumptions for asymptotic convergence. We also show that global exponential stability follows from strong convexity-strong concavity, providing a lower-bound estimate of the convergence rate. This insight also explains the convergence of proximal saddle flows for strongly convex-concave objective functions. Our results generalize to dynamics with projections on the vector field and have applications in solving constrained convex optimization via primal-dual methods. Based on these insights, we study four algorithms built upon different Lagrangian function transformations. We validate our work by applying these methods to solve a network flow optimization and a Lasso regression problem.

@unpublished{ylm2024a-preprint,
  abstract = {This work examines the conditions for asymptotic and exponential convergence of saddle flow dynamics of convex-concave functions. First, we propose an observability-based certificate for asymptotic convergence, directly bridging the gap between the invariant set in a LaSalle argument and the equilibrium set of saddle flows. This certificate generalizes conventional conditions for convergence, e.g., strict convexity-concavity, and leads to a novel state-augmentation method that requires minimal assumptions for asymptotic convergence. We also show that global exponential stability follows from strong convexity-strong concavity, providing a lower-bound estimate of the convergence rate. This insight also explains the convergence of proximal saddle flows for strongly convex-concave objective functions. Our results generalize to dynamics with projections on the vector field and have applications in solving constrained convex optimization via primal-dual methods. Based on these insights, we study four algorithms built upon different Lagrangian function transformations. We validate our work by applying these methods to solve a network flow optimization and a Lasso regression problem.},
  author = {You, Pengcheng and Liu, Yingzhu and Mallada, Enrique},
  bdsk-url-3 = {https://mallada.ece.jhu.edu/pubs/2024-Preprint-YLM.pdf},
  grants = {CAREER-1752362, CPS-2136324, Global-Centers-2330450},
  month = {9},
  pages = {1-16},
  title = {A Unified Analysis of Saddle Flow Dynamics: Stability and Algorithm Design},
  url = {https://mallada.ece.jhu.edu/pubs/2024-Preprint-YLM.pdf},
  year = {2024, submitted}
}

P. You, M. Fernandez, D. F. Gayme, and E. Mallada, Mixed Supply Function and Quantity Bidding in Two-Stage Settlement Markets, 2023, under revision, submitted Mar 2023.
[BibTeX] [Abstract] [Download PDF]

Motivated by electricity markets, we study the incentives of heterogeneous participants (firms and consumers) in a two-stage settlement market with a mixed bidding mechanism, in which firms participate using supply function bids and consumers use quantity bids. We carry out an equilibrium analysis of the market outcome and obtain closed-form solutions. The characterization of the equilibria allows us to gain insights into the market-power implications of mixed bidding and uncover the importance of accounting for consumers’ strategic behavior in a two-stage market, even when their demand is completely inelastic with respect to price. We show that strategic consumers are able to exploit firms’ strategic behavior to maintain a systematic difference between the forward and spot prices, with the latter being higher. Notably, such a strategy does bring down consumer payment and undermines the supply-side market power. However, it is only effective when firms are behaving strategically. We also observe situations where firms lose profit by behaving strategically, a sign of overturn of the conventional supply-side market power. Our results further suggest that market competition has a heterogeneous impact across consumer sizes, particularly benefiting small consumers. Our analysis can accommodate other market policies, and we demonstrate this versatility by examining the impact of some example policies, including virtual bidding, on the market outcome.

@unpublished{yfgm2023a-preprint,
  abstract = {Motivated by electricity markets, we study the incentives of heterogeneous participants (firms and consumers) in a two-stage settlement market with a mixed bidding mechanism, in which firms participate using supply function bids and consumers use quantity bids. We carry out an equilibrium analysis of the market outcome and obtain closed-form solutions.  The characterization of the equilibria allows us to gain insights into the market-power implications of mixed bidding and uncover the importance of accounting for consumers' strategic behavior in a two-stage market, even when their demand is completely inelastic with respect to price. We show that strategic consumers are able to exploit firms' strategic behavior to maintain a systematic difference between the forward and spot prices, with the latter being higher. Notably, such a strategy does bring down consumer payment and undermines the supply-side market power. However, it is only effective when firms are behaving strategically. We also observe situations where firms lose profit by behaving strategically, a sign of overturn of the conventional supply-side market power. Our results further suggest that market competition has a heterogeneous impact across consumer sizes, particularly benefiting small consumers. Our analysis can accommodate other market policies, and we demonstrate this versatility by examining the impact of some example policies, including virtual bidding, on the market outcome.},
  author = {You, Pengcheng and Fernandez, Marcelo and Gayme, Dennice F. and Mallada, Enrique},
  bdsk-url-3 = {https://mallada.ece.jhu.edu/pubs/2023-Preprint-YFGM.pdf},
  grants = {CAREER-1752362;TRIPODS-1934979;CPS-2136324},
  month = {8},
  pages = {1-45},
  title = {Mixed Supply Function and Quantity Bidding in Two-Stage Settlement Markets},
  url = {https://mallada.ece.jhu.edu/pubs/2023-Preprint-YFGM.pdf},
  year = {2023, under revision, submitted Mar 2023}
}

R. K. Bansal, P. You, Y. Chen, and E. Mallada, Intercept Function and Quantity Bidding in Two-stage Electricity Market with Market Power Mitigation, 2023, under revision, submitted Aug 2023.
[BibTeX] [Abstract] [Download PDF]

Electricity markets typically operate in two stages, day-ahead and real-time. Despite best efforts striving efficiency, evidence of price manipulation has called for system-level market power mitigation (MPM) initiatives that substitute noncompetitive bids with default bids. Implementing these policies with a limited understanding of participant behavior may lead to unintended economic losses. In this paper, we model the competition between generators and inelastic loads in a two-stage market with stage-wise MPM policies. The loss of Nash equilibrium and lack of guarantee of stable market outcome in the case of conventional supply function bidding motivates the use of an alternative market mechanism where generators bid an intercept function. A Nash equilibrium analysis for a day-ahead MPM policy leads to a Stackelberg-Nash game with loads exercising market power at the expense of generators. A comparison of the resulting equilibrium with the standard market (not implementing any MPM policy) shows that a day-ahead policy completely mitigates the market power of generators. On the other hand, the real-time MPM policy increases demand allocation to real-time, contrary to current market practice with most electricity trades in the day-ahead market. Numerical studies illustrate the impact of the slope of the intercept function on the standard market.

@unpublished{bcym2023a-preprint,
  abstract = {Electricity markets typically operate in two stages, day-ahead and real-time. Despite best efforts striving efficiency, evidence of price manipulation has called for system-level market power mitigation (MPM) initiatives that substitute noncompetitive bids with default bids. Implementing these policies with a limited understanding of participant behavior may lead to unintended economic losses. In this paper, we model the competition between generators and inelastic loads in a two-stage market with stage-wise MPM policies. The loss of Nash equilibrium and lack of guarantee of stable market outcome in the case of conventional supply function bidding motivates the use of an alternative market mechanism where generators bid an intercept function. A Nash equilibrium analysis for a day-ahead MPM policy leads to a Stackelberg-Nash game with loads exercising market power at the expense of generators. A comparison of the resulting equilibrium with the standard market (not implementing any MPM policy) shows that a day-ahead policy completely mitigates the market power of generators. On the other hand, the real-time MPM policy increases demand allocation to real-time, contrary to current market practice with most electricity trades in the day-ahead market. Numerical studies illustrate the impact of the slope of the intercept function on the standard market.},
  author = {Bansal, Rajni Kant and You, Pengcheng and Chen, Yue and Mallada, Enrique},
  month = {11},
  pages = {1-14},
  title = {Intercept Function and Quantity Bidding in Two-stage Electricity Market with Market Power Mitigation},
  url = {https://mallada.ece.jhu.edu/pubs/2023-Preprint-BCYMb.pdf},
  year = {2023, under revision, submitted Aug 2023}
}

T. Zheng, J. W. Simpson-Porco, and E. Mallada, Closed-Loop Motion Planning for Differentially Flat Systems: A Time-Varying Optimization Framework, 2023, submitted.
[BibTeX] [Abstract] [Download PDF]

Motion planning and control are two core components of the robotic systems autonomy stack. The standard approach to combine these methodologies comprises an offline/open-loop stage, planning, that designs a feasible and safe trajectory to follow, and an online/closed-loop stage, tracking, that corrects for unmodeled dynamics and disturbances. Such an approach generally introduces conservativeness into the planning stage, which becomes difficult to overcome as the model complexity increases and real-time decisions need to be made in a changing environment. This work addresses these challenges for the class of differentially flat nonlinear systems by integrating planning and control into a cohesive closed-loop task. Precisely, we develop an optimization-based framework that aims to steer a differentially flat system to a trajectory implicitly defined via a constrained time-varying optimization problem. To that end, we generalize the notion of feedback linearization, which makes non-linear systems behave as linear systems, and develop controllers that effectively transform a differentially flat system into an optimization algorithm that seeks to find the optimal solution of a (possibly time-varying) optimization problem. Under sufficient regularity assumptions, we prove global asymptotic convergence for the optimization dynamics to the minimizer of the time-varying optimization problem. We illustrate the effectiveness of our method with two numerical examples: a multi-robot tracking problem and a robot obstacle avoidance problem.

@unpublished{zsm2023a-preprint,
  abstract = {Motion planning and control are two core components of the robotic systems autonomy stack. The standard approach to combine these methodologies comprises an offline/open-loop stage, planning, that designs a feasible and safe trajectory to follow, and an online/closed-loop stage, tracking, that corrects for unmodeled dynamics and disturbances. Such an approach generally introduces conservativeness into the planning stage, which becomes difficult to overcome as the model complexity increases and real-time decisions need to be made in a changing environment. This work addresses these challenges for the class of differentially flat nonlinear systems by integrating planning and control into a cohesive closed-loop task. Precisely, we develop an optimization-based framework that aims to steer a differentially flat system to a trajectory implicitly defined via a constrained time-varying optimization problem. To that end, we generalize the notion of feedback linearization, which makes non-linear systems behave as linear systems, and develop controllers that effectively transform a differentially flat system into an optimization algorithm that seeks to find the optimal solution of a (possibly time-varying) optimization problem. Under sufficient regularity assumptions, we prove global asymptotic convergence for the optimization dynamics to the minimizer of the time-varying optimization problem. We illustrate the effectiveness of our method with two numerical examples: a multi-robot tracking problem and a robot obstacle avoidance problem.},
  author = {Zheng, Tianqi and Simpson-Porco, John W. and Mallada, Enrique},
  bdsk-url-3 = {https://mallada.ece.jhu.edu/pubs/2023-Preprint-ZSM.pdf},
  grants = {CAREER-1752362,CPS-2136324,EPICS-2330450},
  month = {10},
  pages = {1-14},
  title = {Closed-Loop Motion Planning for Differentially Flat Systems: A Time-Varying Optimization Framework},
  url = {https://mallada.ece.jhu.edu/pubs/2023-Preprint-ZSM.pdf},
  year = {2023, submitted}
}

H. Min, S. Tarmoun, R. Vidal, and E. Mallada, Convergence and Implicit Bias of Gradient Flow on Overparametrized Linear Networks, 2023, submitted.
[BibTeX] [Abstract] [Download PDF]

Neural networks trained via gradient descent with random initialization and without any regularization enjoy good generalization performance in practice despite being highly overparametrized. A promising direction to explain this phenomenon is to study how initialization and overparametrization affect convergence and implicit bias of training algorithms. In this paper, we present a novel analysis of single-hidden-layer linear networks trained under gradient flow, which connects initialization, optimization, and overparametrization. Firstly, we show that the squared loss converges exponentially to its optimum at a rate that depends on the level of imbalance of the initialization. Secondly, we show that proper initialization constrains the dynamics of the network parameters to lie within an invariant set. In turn, minimizing the loss over this set leads to the min-norm solution. Finally, we show that large hidden layer width, together with (properly scaled) random initialization, ensures proximity to such an invariant set during training, allowing us to derive a novel non-asymptotic upper-bound on the distance between the trained network and the min-norm solution.

@unpublished{mtvm2023a-preprint,
  abstract = {Neural networks trained via gradient descent with random initialization and without any regularization enjoy good generalization performance in practice despite being highly overparametrized. A promising direction to explain this phenomenon is to study how initialization and overparametrization affect convergence and implicit bias of training algorithms. In this paper, we present a novel analysis of single-hidden-layer linear networks trained under gradient flow, which connects initialization, optimization, and overparametrization. Firstly, we show that the squared loss converges exponentially to its optimum at a rate that depends on the level of imbalance of the initialization. Secondly, we show that proper initialization constrains the dynamics of the network parameters to lie within an invariant set. In turn, minimizing the loss over this set leads to the min-norm solution. Finally, we show that large hidden layer width, together with (properly scaled) random initialization, ensures proximity to such an invariant set during training, allowing us to derive a novel non-asymptotic upper-bound on the distance between the trained network and the min-norm solution.        },
  author = {Min, Hancheng and Tarmoun, Salma and Vidal, Rene and Mallada, Enrique},
  grants = {CAREER-1752362, TRIPODS-1934979, CPS-2136324},
  month = {02},
  title = {Convergence and Implicit Bias of Gradient Flow on Overparametrized Linear Networks},
  url = {https://mallada.ece.jhu.edu/pubs/2022-Preprint-MTVM.pdf},
  year = {2023, submitted}
}

Recent Journals

H. Min, R. Pates, and E. Mallada, “A Frequency Domain Analysis of Slow Coherency in Networked Systems,” Automatica, vol. 74, pp. 1-13, 2025. doi:https://doi.org/10.1016/j.automatica.2025.112184
[BibTeX] [Abstract] [Download PDF]

Network coherence generally refers to the emergence of simple aggregated dynamical behaviors, despite heterogeneity in the dynamics of the network’s subsystems. In this paper, we develop a general frequency domain framework to analyze and quantify the level of network coherence that a system exhibits by relating coherence with a low-rank property of the system’s input-output response. More precisely, for a networked system with linear dynamics and coupling, we show that, as the network’s effective algebraic connectivity grows, the system transfer matrix converges to a rank-one transfer matrix representing the coherent behavior. Interestingly, the non-zero eigenvalue of such a rank-one matrix is given by the harmonic mean of individual nodal dynamics, and we refer to it as coherent dynamics. Our analysis unveils the frequency-dependent nature of coherence and a non-trivial interplay between dynamics and network topology. We further show that many networked systems can exhibit similar coherent behavior by establishing a concentration result in a setting with randomly chosen individual nodal dynamics.

@article{mpm2025automatica,
  abstract = {Network coherence generally refers to the emergence of simple aggregated dynamical behaviors, despite heterogeneity in the dynamics of the network's subsystems. In this paper, we develop a general frequency domain framework to analyze and quantify the level of network coherence that a system exhibits by relating coherence with a low-rank property of the system's input-output response. More precisely, for a networked system with linear dynamics and coupling, we show that, as the network's effective algebraic connectivity grows, the system transfer matrix converges to a rank-one transfer matrix representing the coherent behavior. Interestingly, the non-zero eigenvalue of such a rank-one matrix is given by the harmonic mean of individual nodal dynamics, and we refer to it as coherent dynamics. Our analysis unveils the frequency-dependent nature of coherence and a non-trivial interplay between dynamics and network topology. We further show that many networked systems can exhibit similar coherent behavior by establishing a concentration result in a setting with randomly chosen individual nodal dynamics.},
  author = {Min, Hancheng and Pates, Richard and Mallada, Enrique},
  bdsk-url-3 = {https://mallada.ece.jhu.edu/pubs/2025-Automatica-MPM.pdf},
  bdsk-url-4 = {https://doi.org/10.1016/j.automatica.2025.112184},
  doi = {https://doi.org/10.1016/j.automatica.2025.112184},
  grants = {CAREER-1752362, TRIPODS-1934979, CPS-2136324},
  journal = {Automatica},
  month = {2},
  pages = {1-13},
  record = {published, available online Dec 2024, accepted Oct 2024, revised Feb 2024, submitted Feb 2022},
  title = {A Frequency Domain Analysis of Slow Coherency in Networked Systems},
  url = {https://mallada.ece.jhu.edu/pubs/2025-Automatica-MPM.pdf},
  volume = {74},
  year = {2025}
}

Z. Xu, H. Min, S. Tarmoun, E. Mallada, and R. Vidal, “A Local Polyak-Łojasiewicz and Descent Lemma of Gradient Descent For Overparametrized Linear Models,” Transaction on Machine Learning Research (TMLR), 2025.
[BibTeX] [Download PDF]

@article{xmtmv2025tmlr,
  author = {Xu, Ziqing and Min, Hancheng and Tarmoun, Salma and Mallada, Enrique and Vidal, Rene},
  journal = {Transaction on Machine Learning Research (TMLR)},
  month = {5},
  pubstate = {accepted},
  record = {submitted Feb 2025},
  title = {A Local Polyak-Łojasiewicz and Descent Lemma of Gradient Descent For Overparametrized Linear Models},
  url = {https://mallada.ece.jhu.edu/pubs/2025-TMLR-XMTMV.pdf},
  year = {2025}
}

P. You, Y. Jiang, E. Yeung, D. Gayme, and E. Mallada, “On the Stability, Economic Efficiency and Incentive Compatibility of Electricity Market Dynamics,” IEEE Transactions on Automatic Control, pp. 1-16, 2024.
[BibTeX] [Abstract] [Download PDF]

This paper focuses on the operation of an electricity market that accounts for participants that bid at a sub-minute timescale. To that end, we model the market-clearing process as a dynamical system, called market dynamics, which is temporally coupled with the grid frequency dynamics and is thus required to guarantee system-wide stability while meeting the system operational constraints. We characterize participants as price-takers who rationally update their bids to maximize their utility in response to real-time schedules of prices and dispatch. For two common bidding mechanisms, based on quantity and price, we identify a notion of alignment between participants’ behavior and planners’ goals that leads to a saddle-based design of the market that guarantees convergence to a point meeting all operational constraints. We further explore cases where this alignment property does not hold and observe that misaligned participants’ bidding can destabilize the closed-loop system. We thus design a regularized version of the market dynamics that recovers all the desirable stability and steady-state performance guarantees. Numerical tests validate our results on the IEEE 39-bus system.

@article{yjygm2024tac,
  abstract = {This paper focuses on the operation of an electricity market that accounts for participants that bid at a sub-minute timescale. To that end, we model the market-clearing process as a dynamical system, called market dynamics, which is temporally coupled with the grid frequency dynamics and is thus required to guarantee system-wide stability while meeting the system operational constraints. We characterize participants as price-takers who rationally update their bids to maximize their utility in response to real-time schedules of prices and dispatch. For two common bidding mechanisms, based on quantity and price, we identify a notion of alignment between participants' behavior and planners' goals that leads to a saddle-based design of the market that guarantees convergence to a point meeting all operational constraints. We further explore cases where this alignment property does not hold and observe that misaligned participants' bidding can destabilize the closed-loop system.  We thus design a regularized version of the market dynamics that recovers all the desirable stability and steady-state performance guarantees. Numerical tests validate our results on the IEEE 39-bus system.},
  author = {You, Pengcheng and Jiang, Yan and Yeung, Enoch and Gayme, Dennice and Mallada, Enrique},
  bdsk-url-3 = {https://mallada.ece.jhu.edu/pubs/2024-TAC-YJYGM.pdf},
  grants = {CAREER-1752362, CPS-2136324, Global Centers-2330450},
  journal = {IEEE Transactions on Automatic Control},
  month = {12},
  pages = {1-16},
  pubstate = {available online},
  record = {accepted Aug 2024, revised Dec 2023, submitted Dec 2021},
  title = {On the Stability, Economic Efficiency and Incentive Compatibility of Electricity Market Dynamics},
  url = {https://mallada.ece.jhu.edu/pubs/2024-TAC-YJYGM.pdf},
  year = {2024}
}

T. Zheng, N. Loizou, P. You, and E. Mallada, “Dissipative Gradient Descent Ascent Method: A Control Theory Inspired Algorithm for Min-max Optimization,” IEEE Control Systems Letters (L-CSS), vol. 8, pp. 2009-2014, 2024. doi:10.1109/LCSYS.2024.3413004
[BibTeX] [Abstract] [Download PDF]

Gradient Descent Ascent (GDA) methods for min-max optimization problems typically produce oscillatory behavior that can lead to instability, e.g., in bilinear settings. To address this problem, we introduce a dissipation term into the GDA updates to dampen these oscillations. The proposed Dissipative GDA (DGDA) method can be seen as performing standard GDA on a state-augmented and regularized saddle function that does not strictly introduce additional convexity/concavity. We theoretically show the linear convergence of DGDA in the bilinear and strongly convex-strongly concave settings and assess its performance by comparing DGDA with other methods such as GDA, Extra-Gradient (EG), and Optimistic GDA. Our findings demonstrate that DGDA surpasses these methods, achieving superior convergence rates. We support our claims with two numerical examples that showcase DGDA’s effectiveness in solving saddle point problems.

@article{zlym2024lcss,
  abstract = {Gradient Descent Ascent (GDA) methods for min-max optimization problems typically produce oscillatory behavior that can lead to instability, e.g., in bilinear settings.
To address this problem, we introduce a dissipation term into the GDA updates to dampen these oscillations. The proposed Dissipative GDA (DGDA) method can be seen as performing standard GDA on a state-augmented and regularized saddle function that does not strictly introduce additional convexity/concavity. We theoretically show the linear convergence of DGDA in the bilinear and strongly convex-strongly concave settings and assess its performance by comparing DGDA with other methods such as GDA, Extra-Gradient (EG), and Optimistic GDA.
Our findings demonstrate that DGDA surpasses these methods, achieving superior convergence rates. We support our claims with two numerical examples that showcase DGDA's effectiveness in solving saddle point problems.},
  author = {Zheng, Tianqi and Loizou, Nicolas and You, Pengcheng and Mallada, Enrique},
  bdsk-url-3 = {https://doi.org/10.1109/LCSYS.2024.3413004},
  doi = {10.1109/LCSYS.2024.3413004},
  grants = {CPS-2136324, Global-Centers-2330450},
  journal = {IEEE Control Systems Letters (L-CSS)},
  month = {06},
  pages = {2009-2014},
  record = {published, accepted May 2024, submitted Mar 2024},
  title = {Dissipative Gradient Descent Ascent Method: A Control Theory Inspired Algorithm for Min-max Optimization},
  url = {https://mallada.ece.jhu.edu/pubs/2024-LCSS-ZLYM.pdf},
  volume = {8},
  year = {2024}
}

R. K. Bansal, Y. Chen, P. You, and E. Mallada, “Market Power Mitigation in Two-stage Electricity Market with Supply Function and Quantity Bidding,” IEEE Transactions on Energy Markets, Policy and Regulation, vol. 1, iss. 4, pp. 512-522, 2023. doi:10.1109/TEMPR.2023.3318149
[BibTeX] [Abstract] [Download PDF]

The main goal of a sequential two-stage electricity market—e.g., day-ahead and real-time markets—is to operate efficiently. However, the price difference across stages due to inadequate competition and unforeseen circumstances leads to undesirable price manipulation. To mitigate this, some Inde- pendent System Operators (ISOs) proposed system-level market power mitigation (MPM) policies in addition to existing local policies. These policies aim to substitute noncompetitive bids with a default bid based on estimated generator costs. However, these policies may lead to unintended consequences when implemented without accounting for the conflicting interest of participants. In this paper, we model the competition between generators (bidding supply functions) and loads (bidding quantity) in a two-stage market with a stage-wise MPM policy. An equilibrium analysis shows that a real-time MPM policy leads to equilibrium loss, meaning no stable market outcome (Nash equilibrium) exists. A day-ahead MPM policy, besides, leads to a Stackelberg-Nash game with loads acting as leaders and generators as followers. In this setting, loads become winners, i.e., their aggregate payment is always less than competitive payments. Moreover, comparison with standard market equilibrium highlights that markets are better off without such policies. Finally, numerical studies highlight the impact of heterogeneity and load size on market equilibrium.

@article{bcym2023tempr,
  abstract = {The main goal of a sequential two-stage electricity market---e.g., day-ahead and real-time markets---is to operate efficiently. However, the price difference across stages due to inadequate competition and unforeseen circumstances leads to undesirable price manipulation. To mitigate this, some Inde- pendent System Operators (ISOs) proposed system-level market power mitigation (MPM) policies in addition to existing local policies. These policies aim to substitute noncompetitive bids with a default bid based on estimated generator costs. However, these policies may lead to unintended consequences when implemented without accounting for the conflicting interest of participants. In this paper, we model the competition between generators (bidding supply functions) and loads (bidding quantity) in a two-stage market with a stage-wise MPM policy. An equilibrium analysis shows that a real-time MPM policy leads to equilibrium loss, meaning no stable market outcome (Nash equilibrium) exists. A day-ahead MPM policy, besides, leads to a Stackelberg-Nash game with loads acting as leaders and generators as followers. In this setting, loads become winners, i.e., their aggregate payment is always less than competitive payments. Moreover, comparison with standard market equilibrium highlights that markets are better off without such policies. Finally, numerical studies highlight the impact of heterogeneity and load size on market equilibrium.},
  author = {Bansal, Rajni Kant and Chen, Yue and You, Pengcheng and Mallada, Enrique},
  bdsk-url-3 = {https://doi.org/10.1109/TEMPR.2023.3318149},
  doi = {10.1109/TEMPR.2023.3318149},
  grants = {CAREER-1752362, CPS-2136324, EPICS-2330450},
  journal = {IEEE Transactions on Energy Markets, Policy and Regulation},
  month = {12},
  number = {4},
  pages = {512-522},
  record = {published, online Sep 2023, revised July 2023, under revision May 2023, submitted Jan 2023},
  title = {Market Power Mitigation in Two-stage Electricity Market with Supply Function and Quantity Bidding},
  url = {https://mallada.ece.jhu.edu/pubs/2023-TEMPR-BCYM.pdf},
  volume = {1},
  year = {2023}
}

A. Castellano, H. Min, J. Bazerque, and E. Mallada, “Learning to Act Safely with Limited Exposure and Almost Sure Certainty,” IEEE Transactions on Automatic Control, vol. 68, iss. 5, pp. 2979-2994, 2023. doi:10.1109/TAC.2023.3240925
[BibTeX] [Abstract] [Download PDF]

This paper aims to put forward the concept that learning to take safe actions in unknown environments, even with probability one guarantees, can be achieved without the need for an unbounded number of exploratory trials, provided that one is willing to navigate trade-offs between optimality, level of exposure to unsafe events, and the maximum detection time of unsafe actions. We illustrate this concept in two complementary settings. We first focus on the canonical multi-armed bandit problem and seek to study the intrinsic trade-offs of learning safety in the presence of uncertainty. Under mild assumptions on sufficient exploration, we provide an algorithm that provably detects all unsafe machines in an (expected) finite number of rounds. The analysis also unveils a trade-off between the number of rounds needed to secure the environment and the probability of discarding safe machines. We then consider the problem of finding optimal policies for a Markov Decision Process (MDP) with almost sure constraints. We show that the (action) value function satisfies a barrier-based decomposition which allows for the identification of feasible policies independently of the reward process. Using this decomposition, we develop a Barrier-learning algorithm, that identifies such unsafe state-action pairs in a finite expected number of steps. Our analysis further highlights a trade-off between the time lag for the underlying MDP necessary to detect unsafe actions, and the level of exposure to unsafe events. Simulations corroborate our theoretical findings, further illustrating the aforementioned trade-offs, and suggesting that safety constraints can further speed up the learning process.

@article{cmbm2023tac,
  abstract = {This paper aims to put forward the concept that learning to take safe actions in unknown environments, even with probability one guarantees, can be achieved without the need for an unbounded number of exploratory trials, provided that one is willing to navigate trade-offs between optimality, level of exposure to unsafe events, and the maximum detection time of unsafe actions. We illustrate this concept in two complementary settings. We first focus on the canonical multi-armed bandit problem and seek to study the intrinsic trade-offs of learning safety in the presence of uncertainty.  Under mild assumptions on sufficient exploration, we provide an algorithm that provably detects all unsafe machines in an (expected) finite number of rounds. The analysis also unveils a trade-off between the number of rounds needed to secure the environment and the probability of discarding safe machines.  We then consider the problem of finding optimal policies for a Markov Decision Process (MDP) with almost sure constraints. 
We show that the (action) value function satisfies a barrier-based decomposition which allows for the identification of feasible policies independently of the reward process. Using this decomposition, we develop a Barrier-learning algorithm, that identifies such unsafe state-action pairs in a finite expected number of steps. Our analysis further highlights a trade-off between the time lag for the underlying MDP necessary to detect unsafe actions, and the level of exposure to unsafe events. Simulations corroborate our theoretical findings, further illustrating the aforementioned trade-offs, and suggesting that safety constraints can further speed up the learning process.},
  author = {Castellano, Agustin and Min, Hancheng and Bazerque, Juan and Mallada, Enrique},
  bdsk-url-3 = {https://mallada.ece.jhu.edu/pubs/2023-TAC-CMBM.pdf},
  bdsk-url-4 = {http://dx.doi.org/10.1109/TAC.2023.3240925},
  doi = {10.1109/TAC.2023.3240925},
  grants = {CAREER-1752362, TRIPODS-1934979, CPS-2136324},
  journal = {IEEE Transactions on Automatic Control},
  month = {5},
  number = {5},
  pages = {2979-2994},
  record = {published, online May 2023, accepted Jan 2023, revised Oct 2022, submitted May 2021},
  title = {Learning to Act Safely with Limited Exposure and Almost Sure Certainty},
  url = {https://mallada.ece.jhu.edu/pubs/2023-TAC-CMBM.pdf},
  volume = {68},
  year = {2023}
}

B. K. Poolla, Y. Lin, A. Bernstein, E. Mallada, and D. Groß, “Frequency shaping control for weakly-coupled grid-forming IBRs,” IEEE Control Systems Letters (L-CSS), pp. 937-942, 2022. doi:10.1109/LCSYS.2022.3228855
[BibTeX] [Abstract] [Download PDF]

We introduce a novel framework to approximate the aggregate frequency dynamics of coherent synchronous generators. By leveraging recent results on dynamics concentration of tightly connected networks, we develop a hierarchy of reduced order models –based on frequency weighted balanced truncation– that accurately approximate the aggregate system response. Our results outperform existing aggregation techniques and can be shown to monotonically improve the approximation as the hierarchy order increases.

@article{plbmg2023lcss,
  abstract = {We introduce a novel framework to approximate the aggregate frequency dynamics of coherent synchronous generators. By leveraging recent results on dynamics concentration of tightly connected networks, we develop a hierarchy of reduced order models --based on frequency weighted balanced truncation-- that accurately approximate the aggregate system response. Our results outperform existing aggregation techniques and can be shown to monotonically improve the approximation as the hierarchy order increases.},
  author = {Poolla, Bala Kameshwar and Lin, Yashen and Bernstein, Andrey and Mallada, Enrique and Groß, Dominic},
  bdsk-url-3 = {https://doi.org/10.1109/LCSYS.2022.3228855},
  doi = {10.1109/LCSYS.2022.3228855},
  grants = {CAREER-1752362, CPS-2136324},
  journal = {IEEE Control Systems Letters (L-CSS)},
  month = {12},
  pages = {937-942},
  record = {published, online Dec 2022, accepted Nov 2022, submitted Sep 2022.},
  title = {Frequency shaping control for weakly-coupled grid-forming IBRs},
  url = {https://mallada.ece.jhu.edu/pubs/2022-LCSS-PLBMG.pdf},
  year = {2022}
}

G. H. Oral, E. Mallada, and D. Gayme, “On the Role of Interconnection Directionality in the Quadratic Performance of Double-Integrator Networks,” IEEE Transactions on Automatic Control, vol. 67, iss. 11, pp. 6211-6218, 2022. doi:10.1109/TAC.2021.3135358
[BibTeX] [Abstract] [Download PDF]

This paper provides a framework to evaluate the performance of single and double integrator networks over arbitrary directed graphs. Adopting vehicular network terminology, we consider quadratic performance metrics defined by the L2-norm of position and velocity based response functions given impulsive inputs to each vehicle. We exploit the spectral properties of weighted graph Laplacians and output performance matrices to derive a novel method of computing the closed-form solutions for this general class of performance metrics, which include H2-norm based quantities as special cases. We then explore the effect of the interplay between network properties (such as edge directionality and connectivity) and the control strategy on the overall network performance. More precisely, for systems whose interconnection is described by graphs with normal Laplacian L, we characterize the role of directionality by comparing their performance with that of their undirected counterparts, represented by the Hermitian part of L. We show that, for single-integrator networks, directed and undirected graphs perform identically. However, for double-integrator networks, graph directionality -expressed by the eigenvalues of L with nonzero imaginary part- can significantly degrade performance. Interestingly, in many cases, well-designed feedback can also exploit directionality to mitigate degradation or even improve the performance to exceed that of the undirected case. Finally we focus on a system coherence metric -aggregate deviation from the state average- to investigate the relationship between performance and degree of connectivity, leading to somewhat surprising findings. For example, increasing the number of neighbors on a ω-nearest neighbor directed graph does not necessarily improve performance. Similarly, we demonstrate equivalence in performance between all-to-one and all-to-all communication graphs.

@article{omg2022tac,
  abstract = {This paper provides a framework to evaluate the performance of single and double integrator networks over arbitrary directed graphs. Adopting vehicular network terminology, we consider quadratic performance metrics defined by the L2-norm of position and velocity based response functions given impulsive inputs to each vehicle. We exploit the spectral properties of weighted graph Laplacians and output performance matrices to derive a novel method of computing the closed-form solutions for this general class of performance metrics, which include H2-norm based quantities as special cases. We then explore the effect of the interplay between network properties (such as edge directionality and connectivity) and the control strategy on the overall network performance. More precisely, for systems whose interconnection is described by graphs with normal Laplacian L, we characterize the role of directionality by comparing their performance with that of their undirected counterparts, represented by the Hermitian part of L. We show that, for single-integrator networks, directed and undirected graphs perform identically. However, for double-integrator networks, graph directionality -expressed by the eigenvalues of L with nonzero imaginary part- can significantly degrade performance. Interestingly, in many cases, well-designed feedback can also exploit directionality to mitigate degradation or even improve the performance to exceed that of the undirected case. Finally we focus on a system coherence metric -aggregate deviation from the state average- to investigate the relationship between performance and degree of connectivity, leading to somewhat surprising findings. For example, increasing the number of neighbors on a ω-nearest neighbor directed graph does not necessarily improve performance. Similarly, we demonstrate equivalence in performance between all-to-one and all-to-all communication graphs.},
  author = {Oral, H. Giray and Mallada, Enrique and Gayme, Dennice},
  bdsk-url-3 = {https://doi.org/10.1109/TAC.2021.3135358},
  doi = {10.1109/TAC.2021.3135358},
  grants = {ENERGISE-DE-EE0008006, EPCN-1711188,AMPS-1736448, CPS-1544771, CAREER-1752362, AMPS-1736448, ARO-W911NF-17-1-0092},
  journal = {IEEE Transactions on Automatic Control},
  month = {11},
  number = {11},
  pages = {6211-6218},
  record = {published, online Dec. 2021, accepted Nov. 2021, conditionally accepted Apr 2021, revised Nov. 2020, submitted Nov. 2019},
  title = {On the Role of Interconnection Directionality in the Quadratic Performance of Double-Integrator Networks},
  url = {https://mallada.ece.jhu.edu/pubs/2019-Preprint-OMG.pdf},
  volume = {67},
  year = {2022}
}

Recent Conference Papers

H. M. Bui, E. Mallada, and A. Liu, “Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits,” in International Conference on Artificial Intelligence and Statistics (AISTATS), 2025.
[BibTeX] [Abstract] [Download PDF]

Despite the empirical success of Low-Rank Adaptation (LoRA) in fine-tuning pretrained models, there is little theoretical understanding of how first-order methods with carefully crafted initialization adapt models to new tasks. In this work, we take the first step towards bridging this gap by theoretically analyzing the learning dynamics of LoRA for matrix factorization (MF) under gradient flow (GF), emphasizing the crucial role of initialization. For small initialization, we theoretically show that GF converges to a neighborhood of the optimal solution, with smaller initialization leading to lower final error. Our analysis shows that the final error is affected by the misalignment between the singular spaces of the pre-trained model and the target matrix, and reducing the initialization scale improves alignment. To address this misalignment, we propose a spectral initialization for LoRA in MF and theoretically prove that GF with small spectral initialization converges to the fine-tuning task with arbitrary precision. Numerical experiments from MF and image classification validate our findings.

@inproceedings{bml2025aistats,
  abstract = {Despite the empirical success of Low-Rank Adaptation (LoRA) in fine-tuning pretrained models, there is little theoretical understanding of how first-order methods with carefully crafted initialization adapt models to new tasks. In this work, we take the first step towards bridging this gap by theoretically analyzing the learning dynamics of LoRA for matrix factorization (MF) under gradient flow (GF), emphasizing the crucial role of initialization. For small initialization, we theoretically show that GF converges to a neighborhood of the optimal solution, with smaller initialization leading to lower final error. Our analysis shows that the final error is affected by the misalignment between the singular spaces of the pre-trained model and the target matrix, and reducing the initialization scale improves alignment. To address this misalignment, we propose a spectral initialization for LoRA in MF and theoretically prove that GF with small spectral initialization converges to the fine-tuning task with arbitrary precision. Numerical experiments from MF and image classification validate our findings.},
  author = {Bui, Ha Manh and Mallada, Enrique and Liu, Anqi},
  booktitle = {International Conference on Artificial Intelligence and Statistics (AISTATS)},
  grants = {No Grant},
  month = {4},
  publisher = {PMLR},
  record = {accepted Jan 2025, submitted Oct 2024},
  series = {Proceedings of Machine Learning Research},
  title = {Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits},
  url = {https://mallada.ece.jhu.edu/pubs/2025-AISTATS-BML.pdf},
  year = {2025}
}

A. Castellano, S. Rezaei, J. Markovitz, and E. Mallada, “Nonparametric Policy Improvement for Continuous Aciton Spaces via Expert Trajectories,” in Reinforcement Learning Conference, 2025.
[BibTeX] [Abstract] [Download PDF]

The policy improvement theorem is a fundamental building block of classical reinforcement learning for discrete action spaces. Unfortunately, the lack of an analogous result for continuous action spaces with function approximation has historically limited theoretical guarantees of policy optimization algorithms, undermining their reliability. Here, we introduce a novel nonparametric policy that relies purely on data to take actions and that admits a policy improvement theorem for deterministic Markov Decision Processes (MDPs). By imposing mild regularity assumptions on the optimal policy, we show that, when data come from expert demonstrations, one can construct a nonparametric lower bound on the value of the policy, thus enabling its robust evaluation. The constructed lower bound naturally leads to a simple improvement mechanism based on adding more demonstrations. We also provide conditions to identify regions of the state space where additional demonstrations are needed to meet specific performance goals. Finally, we propose a policy optimization algorithm that ensures a monotonic improvement of the lower bound and leads to high probability performance guarantees. These contributions provide a foundational step toward establishing a rigorous framework for policy improvement in continuous action spaces.

@inproceedings{crmm2025rlc,
  abstract = {The policy improvement theorem is a fundamental building block of classical reinforcement learning for discrete action spaces. Unfortunately, the lack of an analogous result for continuous action spaces with function approximation has historically limited theoretical guarantees of policy optimization algorithms, undermining their reliability. Here, we introduce a novel nonparametric policy that relies purely on data to take actions and that admits a 
policy improvement theorem for deterministic Markov Decision Processes (MDPs). By imposing mild regularity assumptions on the optimal policy, we show that, when data come from expert demonstrations, one can construct a nonparametric lower bound on the value of the policy, thus enabling its robust evaluation. The constructed lower bound naturally leads to a simple improvement mechanism based on adding more demonstrations. We also provide conditions to identify regions of the state space where additional demonstrations are needed to meet specific performance goals. Finally, we propose a policy optimization algorithm that ensures a monotonic improvement of the lower bound and leads to high probability performance guarantees. These contributions provide a foundational step toward establishing a rigorous framework for policy improvement in continuous action spaces.},
  author = {Castellano, Agustin and Rezaei, Sohrab and Markovitz, Jared and Mallada, Enrique},
  booktitle = {Reinforcement Learning Conference},
  month = {5},
  pubstate = {accepted},
  record = {submitted Feb 2025},
  title = {Nonparametric Policy Improvement for Continuous Aciton Spaces via Expert Trajectories},
  url = {https://mallada.ece.jhu.edu/pubs/2025-Preprint-CRMM.pdf},
  year = {2025}
}

Z. Xu, H. Min, L. E. MacDonald, J. Luo, S. Tarmoun, E. Mallada, and R. Vidal, “Understanding the Learning Dynamics of LoRA: A Gradient Flow Perspective on Low-Rank Adaptation in Matrix Factorization,” in International Conference on Artificial Intelligence and Statistics (AISTATS), 2025.
[BibTeX] [Abstract] [Download PDF]

Despite the empirical success of Low-Rank Adaptation (LoRA) in fine-tuning pretrained models, there is little theoretical understanding of how first-order methods with carefully crafted initialization adapt models to new tasks. In this work, we take the first step towards bridging this gap by theoretically analyzing the learning dynamics of LoRA for matrix factorization (MF) under gradient flow (GF), emphasizing the crucial role of initialization. For small initialization, we theoretically show that GF converges to a neighborhood of the optimal solution, with smaller initialization leading to lower final error. Our analysis shows that the final error is affected by the misalignment between the singular spaces of the pre-trained model and the target matrix, and reducing the initialization scale improves alignment. To address this misalignment, we propose a spectral initialization for LoRA in MF and theoretically prove that GF with small spectral initialization converges to the fine-tuning task with arbitrary precision. Numerical experiments from MF and image classification validate our findings.

@inproceedings{xmmltmv2025aistats,
  abstract = {Despite the empirical success of Low-Rank Adaptation (LoRA) in fine-tuning pretrained models, there is little theoretical understanding of how first-order methods with carefully crafted initialization adapt models to new tasks. In this work, we take the first step towards bridging this gap by theoretically analyzing the learning dynamics of LoRA for matrix factorization (MF) under gradient flow (GF), emphasizing the crucial role of initialization. For small initialization, we theoretically show that GF converges to a neighborhood of the optimal solution, with smaller initialization leading to lower final error. Our analysis shows that the final error is affected by the misalignment between the singular spaces of the pre-trained model and the target matrix, and reducing the initialization scale improves alignment. To address this misalignment, we propose a spectral initialization for LoRA in MF and theoretically prove that GF with small spectral initialization converges to the fine-tuning task with arbitrary precision. Numerical experiments from MF and image classification validate our findings.},
  author = {Xu, Ziqing and Min, Hancheng and MacDonald, Lachlan Ewen and Luo, Jinqi and Tarmoun, Salma and Mallada, Enrique and Vidal, Rene},
  booktitle = {International Conference on Artificial Intelligence and Statistics (AISTATS)},
  grants = {No Grant},
  month = {4},
  publisher = {PMLR},
  record = {accepted Jan 2024, submitted Oct 2024},
  series = {Proceedings of Machine Learning Research},
  title = {Understanding the Learning Dynamics of LoRA: A Gradient Flow Perspective on Low-Rank Adaptation in Matrix Factorization},
  url = {https://mallada.ece.jhu.edu/pubs/2025-AISTATS-XMMLTMV.pdf},
  year = {2025}
}

K. Poe, E. Mallada, and R. Vidal, “Invertibility of Discrete-Time Linear Systems with Sparse Inputs,” in 63rd IEEE Conference on Decision and Control (CDC), 2024.
[BibTeX] [Abstract] [Download PDF]

One of the fundamental problems of interest for discrete-time linear systems is whether its input sequence may be recovered given its output sequence, a.k.a. the left inversion problem. Many conditions on the state space geometry, dynamics, and spectral structure of a system have been used to characterize the well-posedness of this problem, without assumptions on the inputs. However, certain structural assumptions, such as input sparsity, have been shown to translate to practical gains in the performance of inversion algorithms, surpassing classical guarantees. Establishing necessary and sufficient conditions for left invertibility of systems with sparse inputs is therefore a crucial step toward understanding the performance limits of system inversion under structured input assumptions. In this work, we provide the first necessary and sufficient characterizations of left invertibility for linear systems with sparse inputs, echoing classic characterizations for standard linear systems. The key insight in deriving these results is in establishing the existence of two novel geometric invariants unique to the sparse-input setting, the weakly unobservable and strongly reachable subspace arrangements. By means of a concrete example, we demonstrate the utility of these characterizations. We conclude by discussing extensions and applications of this framework to several related problems in sparse control.

@inproceedings{pmv2024cdc,
  abstract = {One of the fundamental problems of interest for discrete-time linear systems is whether its input sequence may be recovered given its output sequence, a.k.a. the left inversion problem. Many conditions on the state space geometry, dynamics, and spectral structure of a system have been used to characterize the well-posedness of this problem, without assumptions on the inputs. However, certain structural assumptions, such as input sparsity, have been shown to translate to practical gains in the performance of inversion algorithms, surpassing classical guarantees. Establishing necessary and sufficient conditions for left invertibility of systems with sparse inputs is therefore a crucial step toward understanding the performance limits of system inversion under structured input assumptions. In this work, we provide the first necessary and sufficient characterizations of left invertibility for linear systems with sparse inputs,  echoing classic characterizations for standard linear systems. The key insight in deriving these results is in establishing the existence of two novel geometric invariants unique to the sparse-input setting, the weakly unobservable and strongly reachable subspace arrangements. By means of a concrete example, we demonstrate the utility of these characterizations. We conclude by discussing extensions and applications of this framework to several related problems in sparse control.},
  author = {Poe, Kyle and Mallada, Enrique and Vidal, Rene},
  booktitle = {63rd IEEE Conference on Decision and Control (CDC)},
  grants = {CPS-2136324, Global-Centers-2330450},
  month = {12},
  pubstate = {presented},
  record = {accepted Jul 2024, submitted Mar 2024},
  title = {Invertibility of Discrete-Time Linear Systems with Sparse Inputs},
  url = {https://mallada.ece.jhu.edu/pubs/2024-CDC-PMV.pdf},
  year = {2024}
}

T. Zheng, N. Loizou, P. You, and E. Mallada, “Dissipative Gradient Descent Ascent Method: A Control Theory Inspired Algorithm for Min-max Optimization,” in 63rd IEEE Conference on Decision and Control (CDC), 2024.
[BibTeX] [Abstract] [Download PDF]

Gradient Descent Ascent (GDA) methods for min-max optimization problems typically produce oscillatory behavior that can lead to instability, e.g., in bilinear settings. To address this problem, we introduce a dissipation term into the GDA updates to dampen these oscillations. The proposed Dissipative GDA (DGDA) method can be seen as performing standard GDA on a state-augmented and regularized saddle function that does not strictly introduce additional convexity/concavity. We theoretically show the linear convergence of DGDA in the bilinear and strongly convex-strongly concave settings and assess its performance by comparing DGDA with other methods such as GDA, Extra-Gradient (EG), and Optimistic GDA. Our findings demonstrate that DGDA surpasses these methods, achieving superior convergence rates. We support our claims with two numerical examples that showcase DGDA’s effectiveness in solving saddle point problems.

@inproceedings{zlym2024cdc,
  abstract = {Gradient Descent Ascent (GDA) methods for min-max optimization problems typically produce oscillatory behavior that can lead to instability, e.g., in bilinear settings.
To address this problem, we introduce a dissipation term into the GDA updates to dampen these oscillations. The proposed Dissipative GDA (DGDA) method can be seen as performing standard GDA on a state-augmented and regularized saddle function that does not strictly introduce additional convexity/concavity. We theoretically show the linear convergence of DGDA in the bilinear and strongly convex-strongly concave settings and assess its performance by comparing DGDA with other methods such as GDA, Extra-Gradient (EG), and Optimistic GDA.
Our findings demonstrate that DGDA surpasses these methods, achieving superior convergence rates. We support our claims with two numerical examples that showcase DGDA's effectiveness in solving saddle point problems.},
  author = {Zheng, Tianqi and Loizou, Nicolas and You, Pengcheng and Mallada, Enrique},
  booktitle = {63rd IEEE Conference on Decision and Control (CDC)},
  grants = {CPS-2136324, Global-Centers-2330450},
  month = {12},
  note = {also in L-CSS},
  pubstate = {presented},
  record = {accepted Jul 2024, submitted Mar 2024},
  title = {Dissipative Gradient Descent Ascent Method: A Control Theory Inspired Algorithm for Min-max Optimization},
  url = {https://mallada.ece.jhu.edu/pubs/2024-CDC-ZLYM.pdf},
  year = {2024}
}

Y. Shen, H. Sibai, and E. Mallada, “Generalized Barrier Functions: Integral Conditions & Recurrent Relaxations,” in 60th Allerton Conference on Communication, Control, and Computing, 2024, pp. 1-8.
[BibTeX] [Abstract] [Download PDF]

Barrier functions constitute an effective tool for assessing and enforcing safety-critical constraints on dynamical systems. To this end, one is required to find a function $h$ that satisfies a Lyapunov-like differential condition, thereby ensuring the invariance of its zero super-level set $h_\ge 0$. This methodology, however, does not prescribe a general method for finding the function $h$ that satisfies such differential conditions, which, in general, can be a daunting task. In this paper, we seek to overcome this limitation by developing a generalized barrier condition that makes the search for $h$ easier. We do this in two steps. First, we develop integral barrier conditions that reveal equivalent asymptotic behavior to the differential ones, but without requiring differentiability of $h$. Subsequently, we further replace the stringent invariance requirement on $h≥0$ with a more flexible concept known as recurrence. A set is ($τ$-)recurrent if every trajectory that starts in the set returns to it (within $τ$ seconds) infinitely often. We show that, under mild conditions, a simple sign distance function can satisfy our relaxed condition and that the ($τ$-)recurrence of the super-level set $h_≥ 0$ is sufficient to guarantee the system’s safety.

@inproceedings{ssm2024allerton,
  abstract = {Barrier functions constitute an effective tool for assessing and enforcing safety-critical constraints on dynamical systems.  To this end, one is required to find a function $h$ that satisfies a Lyapunov-like differential condition, thereby ensuring the invariance of its zero super-level set $h_\ge 0$.  This methodology, however, does not prescribe a general method for finding the function $h$ that satisfies such differential conditions, which, in general, can be a daunting task. In this paper, we seek to overcome this limitation by developing a generalized barrier condition that makes the search for $h$ easier. We do this in two steps. First, we develop integral barrier conditions that reveal equivalent asymptotic behavior to the differential ones, but without requiring differentiability of $h$. Subsequently, we further replace the stringent invariance requirement on $h≥0$ with a more flexible concept known as recurrence. A set is ($τ$-)recurrent if every trajectory that starts in the set returns to it (within $τ$ seconds) infinitely often. We show that, under mild conditions, a simple sign distance function can satisfy our relaxed condition and that the ($τ$-)recurrence of the super-level set $h_≥ 0$ is sufficient to guarantee the system's safety.},
  author = {Shen, Yue and Sibai, Hussein and Mallada, Enrique},
  booktitle = {60th Allerton Conference on Communication, Control, and Computing},
  grants = {CPS-2136324, Global-Centers-2330450},
  keywords = {Barrier Functions},
  month = {09},
  pages = {1-8},
  pubstate = {presented},
  record = {accepted Jul 2024, submitted Jul 2024},
  title = {Generalized Barrier Functions: Integral Conditions & Recurrent Relaxations},
  url = {https://mallada.ece.jhu.edu/pubs/2024-Allerton-SSM.pdf},
  year = {2024}
}

Y. Liu, E. Mallada, Z. Li, and P. You, “Accelerated Saddle Flow Dynamics for Bilinearly Coupled Minimax Problems,” in 63rd IEEE Conference on Decision and Control (CDC), 2024.
[BibTeX] [Abstract] [Download PDF]

Minimax problems have attracted much attention due to various applications in constrained optimization problems and zero-sum games. Identifying saddle points within these problems is crucial, and saddle flow dynamics offer a straightforward yet useful approach. This study focuses on a class of bilinearly coupled minimax problems and designs an accelerated algorithm based on saddle flow dynamics that achieves a convergence rate beyond stereotype limits. The algorithm is derived based on a sequential two-step transformation of the objective function. First, a change of variable is aimed at a better-conditioned saddle function. Second, a proximal regularization, when staggered with the first step, guarantees strong convexity-strong concavity of the objective function that can be tuned for accelerated exponential convergence. Besides, such an approach can be extended to a class of weakly convex-weakly concave functions and still achieve exponential convergence to one stationary point. The theory is verified by a numerical test on an affine equality-constrained convex optimization problem.

@inproceedings{lmly2024cdc,
  abstract = {Minimax problems have attracted much attention due to various applications in constrained optimization problems and zero-sum games. Identifying saddle points within these problems is crucial, and saddle flow dynamics offer a straightforward yet useful approach. This study focuses on a class of bilinearly coupled minimax problems and designs an accelerated algorithm based on saddle flow dynamics that achieves a convergence rate beyond stereotype limits.  The algorithm is derived based on a sequential two-step transformation of the objective function. First, a change of variable is aimed at a better-conditioned saddle function. Second, a proximal regularization, when staggered with the first step, guarantees strong convexity-strong concavity of the objective function that can be tuned for accelerated exponential convergence.
Besides, such an approach can be extended to a class of weakly convex-weakly concave functions and still achieve exponential convergence to one stationary point. The theory is verified by a numerical test on an affine equality-constrained convex optimization problem.},
  author = {Liu, Yingzhu and Mallada, Enrique and Li, Zhongkui and You, Pengcheng},
  booktitle = {63rd IEEE Conference on Decision and Control (CDC)},
  grants = {CPS-2136324, Global-Centers-2330450},
  month = {07},
  pubstate = {accepted, submitted Mar 2024},
  title = {Accelerated Saddle Flow Dynamics for Bilinearly Coupled Minimax Problems},
  url = {https://mallada.ece.jhu.edu/pubs/2024-Preprint-LMLY.pdf},
  year = {2024}
}

B. K. Poolla, Y. Lin, A. Bernstein, E. Mallada, and D. Groß, “Dynamic Shaping of Grid Response of Multi-Machine Multi-Inverter Systems Through Grid-Forming IBRs,” in PES General Meeting, 2024, pp. 1-5. doi:10.1109/PESGM51994.2024.10688717
[BibTeX] [Abstract] [Download PDF]

We consider the problem of controlling the frequency response of weakly-coupled multi-machine multi-inverter low-inertia power systems via grid-forming inverter-based resources (IBRs). In contrast to existing methods, our approach relies on dividing the larger system into multiple strongly-coupled subsystems, without ignoring either the underlying network or approximating the subsystem response as an aggregate harmonic mean model. Rather, through a structured clustering and recursive dynamic shaping approach, the frequency response of the overall system to load perturbations is shaped appropriately. We demonstrate the proposed approach for a three-node triangular configuration and a small-scale radial network. Furthermore, previous synchronization analysis for heterogeneous systems requires the machines to satisfy certain proportionality property. In our approach, the effective transfer functions for each cluster can be tuned by the IBRs to satisfy such property, enabling us to apply the shaping control to systems with a wider range of heterogeneous machines.

@inproceedings{plbmg2024pesgm,
  abstract = {We consider the problem of controlling the frequency response of weakly-coupled multi-machine multi-inverter low-inertia power systems via grid-forming inverter-based resources (IBRs). In contrast to existing methods, our approach relies on dividing the larger system into multiple strongly-coupled subsystems, without ignoring either the underlying network or approximating the subsystem response as an aggregate harmonic mean model. Rather, through a structured clustering and recursive dynamic shaping approach, the frequency response of the overall system to load perturbations is shaped appropriately. We demonstrate the proposed approach for a three-node triangular configuration and a small-scale radial network. Furthermore, previous synchronization analysis for heterogeneous systems requires the machines to satisfy certain proportionality property. In our approach, the effective transfer functions for each cluster can be tuned by the IBRs to satisfy such property, enabling us to apply the shaping control to systems with a wider range of heterogeneous machines.},
  author = {Poolla, Bala Kameshwar and Lin, Yashen and Bernstein, Andrey and Mallada, Enrique and Groß, Dominic},
  bdsk-url-3 = {https://doi.org/10.1109/PESGM51994.2024.10688717},
  booktitle = {PES General Meeting},
  doi = {10.1109/PESGM51994.2024.10688717},
  grants = {CPS-2136324, CAREER-1752362, Global Centers-2330450},
  month = {06},
  pages = {1-5},
  record = {presented Jun. 2024, accepted Mar. 2024, submitted Nov. 2023},
  title = {Dynamic Shaping of Grid Response of Multi-Machine Multi-Inverter Systems Through Grid-Forming IBRs},
  url = {https://mallada.ece.jhu.edu/pubs/2024-PESGM-PLBMG.pdf},
  year = {2024}
}