Talks and Seminars
This is a list of recent talks and seminars.
2024
- 2024-11-19: Nonparametric Analysis of Dynamical Systems: From Recurrent Sets to Generalized Lyapunov and Barrier Conditions, ESE Fall Colloquium, University of Pennsylvania.
[BibTeX] [Abstract] [Download PDF]
Integrating Reinforcement Learning (RL) in safety-critical applications, such as autonomous vehicles, healthcare, and industrial automation, necessitates an increased focus on safety and reliability. In this talk, we consider two complementary mechanisms to augment RL’s suitability for safety-critical systems. Firstly, we consider a constrained reinforcement learning (C-RL) setting, wherein agents aim to maximize rewards while adhering to required constraints on secondary specifications. Several algorithms rooted in sample-based primal-dual methods have recently been proposed to solve this problem in policy space. However, such methods exhibit a discrepancy between the behavioral and optimal policies due to their reliance on stochastic gradient descent-ascent algorithms. We propose a novel algorithm for constrained RL that does not suffer from these limitations. Leveraging recent results on regularized saddle-flow dynamics, we develop a novel stochastic gradient descent-ascent algorithm whose trajectories almost surely converge to the optimal policy. Secondly, we study the problem of incorporating safety-critical constraints into RL that allow an agent to avoid (unsafe) regions of the state space. Though such a safety goal can be captured by an action-value-like function, a.k.a. a safety critic, the associated operator lacks the desired contraction and uniqueness properties that the classical Bellman operator enjoys. In this work, we overcome the non-contractiveness of safety critic operators by leveraging the fact that safety is a binary property. To that end, we study the properties of the binary safety critic associated with a deterministic dynamical system that seeks to avoid reaching an unsafe region. We formulate the corresponding binary Bellman equation (B2E) for safety and study its properties.
While the resulting operator is still non-contractive, we fully characterize its fixed points, which, except for a spurious solution, represent maximal persistently safe regions of the state space that can always avoid failure. We provide an algorithm that, by design, leverages axiomatic knowledge of safe data to avoid spurious fixed points.
@talk{upenn24, abstract = {Integrating Reinforcement Learning (RL) in safety-critical applications, such as autonomous vehicles, healthcare, and industrial automation, necessitates an increased focus on safety and reliability. In this talk, we consider two complementary mechanisms to augment RL's suitability for safety-critical systems. Firstly, we consider a constrained reinforcement learning (C-RL) setting, wherein agents aim to maximize rewards while adhering to required constraints on secondary specifications. Several algorithms rooted in sample-based primal-dual methods have recently been proposed to solve this problem in policy space. However, such methods exhibit a discrepancy between the behavioral and optimal policies due to their reliance on stochastic gradient descent-ascent algorithms. We propose a novel algorithm for constrained RL that does not suffer from these limitations. Leveraging recent results on regularized saddle-flow dynamics, we develop a novel stochastic gradient descent-ascent algorithm whose trajectories almost surely converge to the optimal policy. Secondly, we study the problem of incorporating safety-critical constraints into RL that allow an agent to avoid (unsafe) regions of the state space. Though such a safety goal can be captured by an action-value-like function, a.k.a. a safety critic, the associated operator lacks the desired contraction and uniqueness properties that the classical Bellman operator enjoys. In this work, we overcome the non-contractiveness of safety critic operators by leveraging the fact that safety is a binary property. To that end, we study the properties of the binary safety critic associated with a deterministic dynamical system that seeks to avoid reaching an unsafe region. We formulate the corresponding binary Bellman equation (B2E) for safety and study its properties.
While the resulting operator is still non-contractive, we fully characterize its fixed points, which, except for a spurious solution, represent maximal persistently safe regions of the state space that can always avoid failure. We provide an algorithm that, by design, leverages axiomatic knowledge of safe data to avoid spurious fixed points.}, date = {11/19/2024}, day = {19}, event = {ESE Fall Colloquium}, host = {Rene Vidal}, month = {11}, role = {Speaker}, title = {Nonparametric Analysis of Dynamical Systems: From Recurrent Sets to Generalized Lyapunov and Barrier Conditions}, url = {https://mallada.ece.jhu.edu/talks/202411-UPenn.pdf}, year = {2024} }
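To make the binary safety critic concrete, here is a minimal sketch under assumed toy dynamics (this is an illustration of a binary Bellman backup on a known model, not the talk's data-driven algorithm, which must additionally rule out spurious fixed points using axiomatic safe data):

```python
# Toy illustration (assumed 5-state chain, not the talk's algorithm): a binary
# safety critic B(s, a) in {True, False}. B(s, a) = True should mean: s is safe
# and, after taking a, some follow-up action sequence stays safe forever.
STATES = range(5)
ACTIONS = (-1, +1)          # move left / move right, clipped to the chain
UNSAFE = {4}

def step(s, a):
    """Deterministic transition s' = f(s, a)."""
    return min(max(s + a, 0), 4)

# Greatest-fixed-point iteration of the binary Bellman backup
#   B(s, a) = [s is safe] AND max_{a'} B(f(s, a), a'),
# started from the all-safe guess so it shrinks onto the maximal solution.
B = {(s, a): True for s in STATES for a in ACTIONS}
changed = True
while changed:
    changed = False
    for (s, a), val in list(B.items()):
        new = (s not in UNSAFE) and any(B[(step(s, a), a2)] for a2 in ACTIONS)
        if new != val:
            B[(s, a)], changed = new, True

# States from which some action keeps the agent persistently safe.
safe_region = sorted(s for s in STATES if any(B[(s, a)] for a in ACTIONS))
print(safe_region)  # → [0, 1, 2, 3]
```

State 3 is safe only via the left action (moving right hits the unsafe state), which is exactly the kind of distinction an action-value-like critic captures.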
- 2024-09-25: Generalized Barrier Functions: Integral Conditions and Recurrent Relaxations, 60th Allerton Conference on Communication, Control, and Computing.
[BibTeX] [Abstract] [Download PDF]
Integrating Reinforcement Learning (RL) in safety-critical applications, such as autonomous vehicles, healthcare, and industrial automation, necessitates an increased focus on safety and reliability. In this talk, we consider two complementary mechanisms to augment RL’s suitability for safety-critical systems. Firstly, we consider a constrained reinforcement learning (C-RL) setting, wherein agents aim to maximize rewards while adhering to required constraints on secondary specifications. Several algorithms rooted in sample-based primal-dual methods have recently been proposed to solve this problem in policy space. However, such methods exhibit a discrepancy between the behavioral and optimal policies due to their reliance on stochastic gradient descent-ascent algorithms. We propose a novel algorithm for constrained RL that does not suffer from these limitations. Leveraging recent results on regularized saddle-flow dynamics, we develop a novel stochastic gradient descent-ascent algorithm whose trajectories almost surely converge to the optimal policy. Secondly, we study the problem of incorporating safety-critical constraints into RL that allow an agent to avoid (unsafe) regions of the state space. Though such a safety goal can be captured by an action-value-like function, a.k.a. a safety critic, the associated operator lacks the desired contraction and uniqueness properties that the classical Bellman operator enjoys. In this work, we overcome the non-contractiveness of safety critic operators by leveraging the fact that safety is a binary property. To that end, we study the properties of the binary safety critic associated with a deterministic dynamical system that seeks to avoid reaching an unsafe region. We formulate the corresponding binary Bellman equation (B2E) for safety and study its properties.
While the resulting operator is still non-contractive, we fully characterize its fixed points, which, except for a spurious solution, represent maximal persistently safe regions of the state space that can always avoid failure. We provide an algorithm that, by design, leverages axiomatic knowledge of safe data to avoid spurious fixed points.
@talk{allerton24, abstract = {Integrating Reinforcement Learning (RL) in safety-critical applications, such as autonomous vehicles, healthcare, and industrial automation, necessitates an increased focus on safety and reliability. In this talk, we consider two complementary mechanisms to augment RL's suitability for safety-critical systems. Firstly, we consider a constrained reinforcement learning (C-RL) setting, wherein agents aim to maximize rewards while adhering to required constraints on secondary specifications. Several algorithms rooted in sample-based primal-dual methods have recently been proposed to solve this problem in policy space. However, such methods exhibit a discrepancy between the behavioral and optimal policies due to their reliance on stochastic gradient descent-ascent algorithms. We propose a novel algorithm for constrained RL that does not suffer from these limitations. Leveraging recent results on regularized saddle-flow dynamics, we develop a novel stochastic gradient descent-ascent algorithm whose trajectories almost surely converge to the optimal policy. Secondly, we study the problem of incorporating safety-critical constraints into RL that allow an agent to avoid (unsafe) regions of the state space. Though such a safety goal can be captured by an action-value-like function, a.k.a. a safety critic, the associated operator lacks the desired contraction and uniqueness properties that the classical Bellman operator enjoys. In this work, we overcome the non-contractiveness of safety critic operators by leveraging the fact that safety is a binary property. To that end, we study the properties of the binary safety critic associated with a deterministic dynamical system that seeks to avoid reaching an unsafe region. We formulate the corresponding binary Bellman equation (B2E) for safety and study its properties.
While the resulting operator is still non-contractive, we fully characterize its fixed points, which, except for a spurious solution, represent maximal persistently safe regions of the state space that can always avoid failure. We provide an algorithm that, by design, leverages axiomatic knowledge of safe data to avoid spurious fixed points.}, date = {09/25/2024}, day = {25}, event = {60th Allerton Conference on Communication, Control, and Computing}, host = {N/A}, month = {09}, role = {Speaker}, title = {Generalized Barrier Functions: Integral Conditions and Recurrent Relaxations}, url = {https://mallada.ece.jhu.edu/talks/202409-Allerton.pdf}, year = {2024} }
- 2024-06-12: Reinforcement Learning for Safety Critical Applications, Tercera Conferencia Colombiana de Matematicas Aplicadas e Industriales.
[BibTeX] [Abstract] [Download PDF]
Integrating Reinforcement Learning (RL) in safety-critical applications, such as autonomous vehicles, healthcare, and industrial automation, necessitates an increased focus on safety and reliability. In this talk, we consider two complementary mechanisms to augment RL’s suitability for safety-critical systems. Firstly, we consider a constrained reinforcement learning (C-RL) setting, wherein agents aim to maximize rewards while adhering to required constraints on secondary specifications. Several algorithms rooted in sample-based primal-dual methods have recently been proposed to solve this problem in policy space. However, such methods exhibit a discrepancy between the behavioral and optimal policies due to their reliance on stochastic gradient descent-ascent algorithms. We propose a novel algorithm for constrained RL that does not suffer from these limitations. Leveraging recent results on regularized saddle-flow dynamics, we develop a novel stochastic gradient descent-ascent algorithm whose trajectories almost surely converge to the optimal policy. Secondly, we study the problem of incorporating safety-critical constraints into RL that allow an agent to avoid (unsafe) regions of the state space. Though such a safety goal can be captured by an action-value-like function, a.k.a. a safety critic, the associated operator lacks the desired contraction and uniqueness properties that the classical Bellman operator enjoys. In this work, we overcome the non-contractiveness of safety critic operators by leveraging the fact that safety is a binary property. To that end, we study the properties of the binary safety critic associated with a deterministic dynamical system that seeks to avoid reaching an unsafe region. We formulate the corresponding binary Bellman equation (B2E) for safety and study its properties.
While the resulting operator is still non-contractive, we fully characterize its fixed points, which, except for a spurious solution, represent maximal persistently safe regions of the state space that can always avoid failure. We provide an algorithm that, by design, leverages axiomatic knowledge of safe data to avoid spurious fixed points.
@talk{mapi24, abstract = {Integrating Reinforcement Learning (RL) in safety-critical applications, such as autonomous vehicles, healthcare, and industrial automation, necessitates an increased focus on safety and reliability. In this talk, we consider two complementary mechanisms to augment RL's suitability for safety-critical systems. Firstly, we consider a constrained reinforcement learning (C-RL) setting, wherein agents aim to maximize rewards while adhering to required constraints on secondary specifications. Several algorithms rooted in sample-based primal-dual methods have recently been proposed to solve this problem in policy space. However, such methods exhibit a discrepancy between the behavioral and optimal policies due to their reliance on stochastic gradient descent-ascent algorithms. We propose a novel algorithm for constrained RL that does not suffer from these limitations. Leveraging recent results on regularized saddle-flow dynamics, we develop a novel stochastic gradient descent-ascent algorithm whose trajectories almost surely converge to the optimal policy. Secondly, we study the problem of incorporating safety-critical constraints into RL that allow an agent to avoid (unsafe) regions of the state space. Though such a safety goal can be captured by an action-value-like function, a.k.a. a safety critic, the associated operator lacks the desired contraction and uniqueness properties that the classical Bellman operator enjoys. In this work, we overcome the non-contractiveness of safety critic operators by leveraging the fact that safety is a binary property. To that end, we study the properties of the binary safety critic associated with a deterministic dynamical system that seeks to avoid reaching an unsafe region. We formulate the corresponding binary Bellman equation (B2E) for safety and study its properties.
While the resulting operator is still non-contractive, we fully characterize its fixed points, which, except for a spurious solution, represent maximal persistently safe regions of the state space that can always avoid failure. We provide an algorithm that, by design, leverages axiomatic knowledge of safe data to avoid spurious fixed points.}, date = {06/12/2024}, day = {12}, event = {Tercera Conferencia Colombiana de Matematicas Aplicadas e Industriales}, host = {Javier Peña (CMU), Mateo Diaz (JHU)}, month = {06}, role = {Speaker}, title = {Reinforcement Learning for Safety Critical Applications}, url = {https://mallada.ece.jhu.edu/talks/202406-MAPI.pdf}, year = {2024} }
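The primal-dual mechanics behind the constrained-RL part can be illustrated on a toy convex problem (this is plain projected gradient descent-ascent on an assumed Lagrangian, chosen for illustration; the talk's contribution is a regularized saddle-flow scheme with almost-sure convergence, which this sketch does not implement):

```python
# Toy primal-dual sketch (illustrative only, not the talk's algorithm). Solve
#   minimize (x - 2)^2  subject to  x <= 1
# by gradient descent-ascent on the Lagrangian L(x, lam) = (x - 2)^2 + lam*(x - 1),
# projecting the multiplier onto lam >= 0. KKT conditions give x* = 1, lam* = 2.
eta = 0.05                   # step size
x, lam = 0.0, 0.0            # primal and dual initial conditions
for _ in range(2000):
    grad_x = 2.0 * (x - 2.0) + lam        # dL/dx
    grad_lam = x - 1.0                    # dL/dlam (constraint violation)
    x -= eta * grad_x                     # descend in the primal variable
    lam = max(0.0, lam + eta * grad_lam)  # ascend in the dual, then project
print(round(x, 3), round(lam, 3))  # → 1.0 2.0
```

Here the iterates settle onto the constrained optimum; in the sampled RL setting the analogous stochastic iterates only hover around it, which is the behavioral-vs-optimal-policy discrepancy the abstract refers to.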
- 2024-06-18: Data-driven Analysis of Dynamical Systems Using Recurrent Sets, INFORMS International Conference.
[BibTeX] [Abstract] [Download PDF]
In this talk, we develop model-free methods for analyzing dynamical systems using trajectory data. Our critical insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed notion of recurrence. Specifically, a set is τ-recurrent (resp. k-recurrent) if every trajectory that starts within the set returns to it after at most τ seconds (resp. k steps). We leverage this notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. Firstly, we consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point using trajectory data. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions. We then develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence that are obtained by sampling finite-length trajectories. Secondly, we generalize Lyapunov’s Direct Method to allow for non-monotonic evolution of the function values by only requiring sub-level sets to be τ-recurrent (instead of invariant). We provide conditions for stability, asymptotic stability, and exponential stability of an equilibrium using τ-decreasing functions (functions whose value along trajectories decreases after at most τ seconds) and develop a verification algorithm that leverages GPU parallel processing to verify such conditions using trajectories. We conclude by discussing future research directions and possible extensions to control.
@talk{informs2024, abstract = {In this talk, we develop model-free methods for analyzing dynamical systems using trajectory data. Our critical insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed notion of recurrence. Specifically, a set is τ-recurrent (resp. k-recurrent) if every trajectory that starts within the set returns to it after at most τ seconds (resp. k steps). We leverage this notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. Firstly, we consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point using trajectory data. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions. We then develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence that are obtained by sampling finite-length trajectories. Secondly, we generalize Lyapunov's Direct Method to allow for non-monotonic evolution of the function values by only requiring sub-level sets to be τ-recurrent (instead of invariant). We provide conditions for stability, asymptotic stability, and exponential stability of an equilibrium using τ-decreasing functions (functions whose value along trajectories decrease after at most τ seconds) and develop a verification algorithm that leverages GPU parallel processing to verify such conditions using trajectories. We finalize by discussing future research directions and possible extensions for control.}, date = {06/18/2024}, day = {18}, event = {INFORMS International Conference}, host = {Luis Zuluaga (Lehigh), Mateo Diaz (JHU)}, month = {06}, role = {Speaker}, title = {Data-driven Analysis of Dynamical Systems Using Recurrent Sets}, url = {https://mallada.ece.jhu.edu/talks/202406-Informs.pdf}, year = {2024} }
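A sampled k-recurrence check of the kind the abstract describes can be sketched in a few lines (the linear map below is an assumed toy system, not one from the talk; it is chosen so that A @ A = -0.48·I, which makes the unit ball provably 2-recurrent but not invariant):

```python
import numpy as np

# Sketch of a data-driven k-recurrence check (assumed toy system). For
# x+ = A x with this A, the unit ball is NOT invariant: (0, 1) maps to
# (-1.2, 0), which lies outside. But A @ A = -0.48 * I, so every trajectory
# re-enters the ball by step 2: the ball is 2-recurrent but not 1-recurrent.
A = np.array([[0.0, -1.2],
              [0.4,  0.0]])

def is_k_recurrent(k, n_samples=1000, seed=0):
    """Test, on sampled initial conditions, whether every trajectory starting
    in the closed unit ball re-enters it within k steps.
    Returns (ok, counterexample)."""
    rng = np.random.default_rng(seed)
    pts = rng.normal(size=(n_samples, 2))
    pts /= np.linalg.norm(pts, axis=1, keepdims=True)
    pts *= rng.uniform(0.0, 1.0, size=(n_samples, 1)) ** 0.5   # uniform in ball
    pts = np.vstack([pts, [[0.0, 1.0], [0.0, -1.0]]])          # boundary probes
    for x in pts:
        y, returned = x.copy(), False
        for _ in range(k):
            y = A @ y
            if np.linalg.norm(y) <= 1.0:
                returned = True
                break
        if not returned:
            return False, x    # counterexample: no return within k steps
    return True, None

print(is_k_recurrent(1)[0], is_k_recurrent(2)[0])  # → False True
```

The counterexamples returned for too-small k are exactly what the abstract's algorithms use to refine inner approximations of the region of attraction.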
- 2024-06-05: Data-driven Analysis of Dynamical Systems Using Recurrent Sets, Department of Automatic Control, Lund University.
[BibTeX] [Abstract] [Download PDF]
In this talk, we develop model-free methods for analyzing dynamical systems using trajectory data. Our critical insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed notion of recurrence. Specifically, a set is τ-recurrent (resp. k-recurrent) if every trajectory that starts within the set returns to it after at most τ seconds (resp. k steps). We leverage this notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. Firstly, we consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point using trajectory data. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions. We then develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence that are obtained by sampling finite-length trajectories. Secondly, we generalize Lyapunov’s Direct Method to allow for non-monotonic evolution of the function values by only requiring sub-level sets to be τ-recurrent (instead of invariant). We provide conditions for stability, asymptotic stability, and exponential stability of an equilibrium using τ-decreasing functions (functions whose value along trajectories decreases after at most τ seconds) and develop a verification algorithm that leverages GPU parallel processing to verify such conditions using trajectories. We conclude by discussing future research directions and possible extensions to control.
@talk{lund2024, abstract = {In this talk, we develop model-free methods for analyzing dynamical systems using trajectory data. Our critical insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed notion of recurrence. Specifically, a set is τ-recurrent (resp. k-recurrent) if every trajectory that starts within the set returns to it after at most τ seconds (resp. k steps). We leverage this notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. Firstly, we consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point using trajectory data. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions. We then develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence that are obtained by sampling finite-length trajectories. Secondly, we generalize Lyapunov's Direct Method to allow for non-monotonic evolution of the function values by only requiring sub-level sets to be τ-recurrent (instead of invariant). We provide conditions for stability, asymptotic stability, and exponential stability of an equilibrium using τ-decreasing functions (functions whose value along trajectories decrease after at most τ seconds) and develop a verification algorithm that leverages GPU parallel processing to verify such conditions using trajectories. We finalize by discussing future research directions and possible extensions for control.}, date = {06/05/2024}, day = {05}, event = {Department of Automatic Control, Lund University}, host = {Richard Pates (Lund)}, month = {06}, role = {Lecture}, title = {Data-driven Analysis of Dynamical Systems Using Recurrent Sets}, url = {https://mallada.ece.jhu.edu/talks/202406-Lund.pdf}, year = {2024} }
- 2024-06-06: Data-driven Analysis of Dynamical Systems Using Recurrent Sets, Cyber-Physical Systems Lab, Université catholique de Louvain.
[BibTeX] [Abstract] [Download PDF]
In this talk, we develop model-free methods for analyzing dynamical systems using trajectory data. Our critical insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed notion of recurrence. Specifically, a set is τ-recurrent (resp. k-recurrent) if every trajectory that starts within the set returns to it after at most τ seconds (resp. k steps). We leverage this notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. Firstly, we consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point using trajectory data. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions. We then develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence that are obtained by sampling finite-length trajectories. Secondly, we generalize Lyapunov’s Direct Method to allow for non-monotonic evolution of the function values by only requiring sub-level sets to be τ-recurrent (instead of invariant). We provide conditions for stability, asymptotic stability, and exponential stability of an equilibrium using τ-decreasing functions (functions whose value along trajectories decreases after at most τ seconds) and develop a verification algorithm that leverages GPU parallel processing to verify such conditions using trajectories. We conclude by discussing future research directions and possible extensions to control.
@talk{ucl2024, abstract = {In this talk, we develop model-free methods for analyzing dynamical systems using trajectory data. Our critical insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed notion of recurrence. Specifically, a set is τ-recurrent (resp. k-recurrent) if every trajectory that starts within the set returns to it after at most τ seconds (resp. k steps). We leverage this notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. Firstly, we consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point using trajectory data. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions. We then develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence that are obtained by sampling finite-length trajectories. Secondly, we generalize Lyapunov's Direct Method to allow for non-monotonic evolution of the function values by only requiring sub-level sets to be τ-recurrent (instead of invariant). We provide conditions for stability, asymptotic stability, and exponential stability of an equilibrium using τ-decreasing functions (functions whose value along trajectories decrease after at most τ seconds) and develop a verification algorithm that leverages GPU parallel processing to verify such conditions using trajectories. We finalize by discussing future research directions and possible extensions for control.}, date = {06/06/2024}, day = {06}, event = {Cyber-Physical Systems Lab, Université catholique de Louvain}, host = {Raphael Jungers (UCL)}, month = {06}, role = {Lecture}, title = {Data-driven Analysis of Dynamical Systems Using Recurrent Sets}, url = {https://mallada.ece.jhu.edu/talks/202406-UCL.pdf}, year = {2024} }
- 2024-05-16: Recurrence of Nonlinear Control Systems: Entropy and Bit Rates, Hybrid Systems: Computation and Control (HSCC).
[BibTeX] [Abstract] [Download PDF]
In this paper, we introduce the notion of recurrence entropy in the context of nonlinear control systems. A set is said to be (τ-)recurrent if every trajectory that starts in the set returns to it (within at most τ units of time). Recurrence entropy quantifies the complexity of making a set τ-recurrent, measured by the average rate of growth, as time increases, of the number of control signals required to achieve this goal. Our analysis reveals that, compared to invariance, recurrence is quantitatively less complex, meaning that the recurrence entropy of a set is no larger than, and often strictly smaller than, the invariance entropy. We also present an algorithm for achieving recurrence asymptotically, and our results further offer insights into the minimum data rate required for achieving recurrence.
@talk{hscc2024, abstract = {In this paper, we introduce the notion of recurrence entropy in the context of nonlinear control systems. A set is said to be (τ-)recurrent if every trajectory that starts in the set returns to it (within at most τ units of time). Recurrence entropy quantifies the complexity of making a set τ-recurrent, measured by the average rate of growth, as time increases, of the number of control signals required to achieve this goal. Our analysis reveals that, compared to invariance, recurrence is quantitatively less complex, meaning that the recurrence entropy of a set is no larger than, and often strictly smaller than, the invariance entropy. Our results further offer insights into the minimum data rate required for achieving recurrence. We also present an algorithm for achieving recurrence asymptotically.}, date = {05/16/2024}, day = {16}, event = {Hybrid Systems: Computation and Control (HSCC)}, month = {05}, role = {Lecture}, title = {Recurrence of Nonlinear Control Systems: Entropy and Bit Rates}, url = {https://mallada.ece.jhu.edu/talks/202405-HSCC.pdf}, year = {2024} }
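The inequality between the two entropies has a short intuition worth recording. The notation below is schematic, following the standard invariance-entropy template rather than the talk's exact definitions:

```latex
% Schematic definitions (assumed notation, not verbatim from the talk).
% Let r_inv(T, Q) be the minimal number of control signals needed so that
% every initial state in Q can be kept inside Q up to time T, and let
% r_rec(T, Q) be the analogous count when trajectories need only revisit Q
% at least once every \tau units of time.
\[
  h_{\mathrm{inv}}(Q) = \limsup_{T\to\infty} \tfrac{1}{T}\log r_{\mathrm{inv}}(T,Q),
  \qquad
  h_{\mathrm{rec}}^{\tau}(Q) = \limsup_{T\to\infty} \tfrac{1}{T}\log r_{\mathrm{rec}}^{\tau}(T,Q).
\]
% Any family of controls rendering Q invariant in particular makes every
% trajectory revisit Q, so r_rec(T, Q) <= r_inv(T, Q) for every T, whence
\[
  h_{\mathrm{rec}}^{\tau}(Q) \;\le\; h_{\mathrm{inv}}(Q).
\]
```

The strict inequality in many cases, and the corresponding bit-rate statements, are the substance of the talk and are not captured by this one-line comparison.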
- 2024-03-28: Options for Mitigation Measures: Avenues for new Research, ESIG/G-PST Special Topic Workshop on Oscillations.
[BibTeX] [Download PDF]
@talk{esig24, date = {03/28/2024}, day = {28}, event = {ESIG/G-PST Special Topic Workshop on Oscillations}, host = {Mark O'Malley (Imperial)}, month = {03}, role = {Lecture}, title = {Options for Mitigation Measures: Avenues for new Research}, url = {https://mallada.ece.jhu.edu/talks/202403-ESIG.pdf}, year = {2024} }
- 2024-03-20: Model-Free Analysis of Dynamical Systems Using Recurrent Sets, ECE Colloquium, Rutgers University.
[BibTeX] [Abstract] [Download PDF]
In this talk, we develop model-free methods for analyzing dynamical systems using trajectory data. Our critical insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed notion of recurrence. Specifically, a set is τ-recurrent (resp. k-recurrent) if every trajectory that starts within the set returns to it after at most τ seconds (resp. k steps). We leverage this notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. Firstly, we consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point using trajectory data. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions. We then develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence that are obtained by sampling finite-length trajectories. Secondly, we generalize Lyapunov’s Direct Method to allow for non-monotonic evolution of the function values by only requiring sub-level sets to be τ-recurrent (instead of invariant). We provide conditions for stability, asymptotic stability, and exponential stability of an equilibrium using τ-decreasing functions (functions whose value along trajectories decreases after at most τ seconds) and develop a verification algorithm that leverages GPU parallel processing to verify such conditions using trajectories. We conclude by discussing future research directions and possible extensions to control.
@talk{rutgers24, abstract = {In this talk, we develop model-free methods for analyzing dynamical systems using trajectory data. Our critical insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed notion of recurrence. Specifically, a set is τ-recurrent (resp. k-recurrent) if every trajectory that starts within the set returns to it after at most τ seconds (resp. k steps). We leverage this notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. Firstly, we consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point using trajectory data. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions. We then develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence that are obtained by sampling finite-length trajectories. Secondly, we generalize Lyapunov's Direct Method to allow for non-monotonic evolution of the function values by only requiring sub-level sets to be τ-recurrent (instead of invariant). We provide conditions for stability, asymptotic stability, and exponential stability of an equilibrium using τ-decreasing functions (functions whose value along trajectories decrease after at most τ seconds) and develop a verification algorithm that leverages GPU parallel processing to verify such conditions using trajectories. We finalize by discussing future research directions and possible extensions for control.}, date = {03/20/2024}, day = {20}, event = {ECE Colloquium, Rutgers University}, host = {Daniel Burbano (Rutgers)}, month = {03}, role = {Lecture}, title = {Model-Free Analysis of Dynamical Systems Using Recurrent Sets}, url = {https://mallada.ece.jhu.edu/talks/202403-Rutgers.pdf}, year = {2024} }
- 2024-02-16: Reinforcement Learning for Safety Critical Applications, George Mason University.
[BibTeX] [Abstract] [Download PDF]
Integrating Reinforcement Learning (RL) in safety-critical applications, such as autonomous vehicles, healthcare, and industrial automation, necessitates an increased focus on safety and reliability. In this talk, we consider two complementary mechanisms to augment RL’s suitability for safety-critical systems. Firstly, we consider a constrained reinforcement learning (C-RL) setting, wherein agents aim to maximize rewards while adhering to required constraints on secondary specifications. Several algorithms rooted in sampled-based primal-dual methods have been recently proposed to solve this problem in policy space. However, such methods exhibit a discrepancy between the behavioral and optimal policies due to their reliance on stochastic gradient descent-ascent algorithms. We propose a novel algorithm for constrained RL that does not suffer from these limitations. Leveraging recent results on regularized saddle-flow dynamics, we develop a novel stochastic gradient descent-ascent algorithm whose trajectories almost surely converge to the optimal policy. Secondly, we study the problem of incorporating safety-critical constraints to RL that allow an agent to avoid (unsafe) regions of the state space. Though such a safety goal can be captured by an action-value-like function, a.k.a. safety critics, the associated operator lacks the desired contraction and uniqueness properties that the classical Bellman operator enjoys. In this work, we overcome the non-contractiveness of safety critic operators by leveraging that safety is a binary property. To that end, we study the properties of the binary safety critic associated with a deterministic dynamical system that seeks to avoid reaching an unsafe region. We formulate the corresponding binary Bellman equation (B2E) for safety and study its properties. 
While the resulting operator is still non-contractive, we fully characterize its fixed points, which represent (except for a spurious solution) the maximal persistently safe regions of the state space from which failure can always be avoided. We provide an algorithm that, by design, leverages axiomatic knowledge of safe data to avoid spurious fixed points.
@talk{gmu24, date = {02/2024}, day = {16}, event = {George Mason University}, host = {Ningshi Yao (GMU)}, month = {02}, role = {Lecture}, title = {Reinforcement Learning for Safety Critical Applications}, url = {https://mallada.ece.jhu.edu/talks/202402-GMU.pdf}, year = {2024} }
- 2024-01-11: Reinforcement Learning for Safety Critical Applications, Applied Physics Laboratory, JHU.
[BibTeX] [Abstract] [Download PDF]
Integrating Reinforcement Learning (RL) in safety-critical applications, such as autonomous vehicles, healthcare, and industrial automation, necessitates an increased focus on safety and reliability. In this talk, we consider two complementary mechanisms to augment RL’s suitability for safety-critical systems. Firstly, we consider a constrained reinforcement learning (C-RL) setting, wherein agents aim to maximize rewards while adhering to required constraints on secondary specifications. Several algorithms rooted in sample-based primal-dual methods have been recently proposed to solve this problem in policy space. However, such methods exhibit a discrepancy between the behavioral and optimal policies due to their reliance on stochastic gradient descent-ascent algorithms. We propose a novel algorithm for constrained RL that does not suffer from these limitations. Leveraging recent results on regularized saddle-flow dynamics, we develop a novel stochastic gradient descent-ascent algorithm whose trajectories almost surely converge to the optimal policy. Secondly, we study the problem of incorporating safety-critical constraints into RL that allow an agent to avoid (unsafe) regions of the state space. Though such a safety goal can be captured by an action-value-like function, a.k.a. a safety critic, the associated operator lacks the contraction and uniqueness properties that the classical Bellman operator enjoys. In this work, we overcome the non-contractiveness of safety critic operators by leveraging the fact that safety is a binary property. To that end, we study the properties of the binary safety critic associated with a deterministic dynamical system that seeks to avoid reaching an unsafe region. We formulate the corresponding binary Bellman equation (B2E) for safety and study its properties.
While the resulting operator is still non-contractive, we fully characterize its fixed points, which represent (except for a spurious solution) the maximal persistently safe regions of the state space from which failure can always be avoided. We provide an algorithm that, by design, leverages axiomatic knowledge of safe data to avoid spurious fixed points.
@talk{apl24, date = {01/2024}, day = {11}, event = {Applied Physics Laboratory, JHU}, host = {Jared Markowitz}, month = {01}, role = {Lecture}, title = {Reinforcement Learning for Safety Critical Applications}, url = {https://mallada.ece.jhu.edu/talks/202401-JHUAPL.pdf}, year = {2024} }
2023
- 2023-12-11: Unintended Consequences of Market Designs, IHPC’s Workshop of Power and Energy Systems of the (near) Future, ASTAR.
[BibTeX] [Abstract] [Download PDF]
In this talk, we seek to highlight the importance of accounting for the incentives of *all* market participants when designing market mechanisms for electricity. To this end, we perform a Nash equilibrium analysis of two different market mechanisms that illustrates the critical role that the incentives of consumers, and of new types of participants such as storage, play in the equilibrium outcome. Firstly, we study the incentives of heterogeneous participants (generators and consumers) in a two-stage settlement market, where generators participate using a supply function bid and consumers use a quantity bid. We show that strategic consumers are able to exploit generators’ strategic behavior to maintain a systematic difference between the forward and spot prices, with the latter being higher. Notably, such a strategy brings down consumer payments and undermines supply-side market power. We further observe situations where generators lose profit by behaving strategically, a sign that conventional supply-side market power has been overturned. Secondly, we study a market mechanism for multi-interval electricity markets with generator and storage participants. Drawing ideas from supply function bidding, we introduce a novel bid structure for storage participation that allows storage units to communicate their cost to the market using energy-cycling functions that map prices to cycle depths. The resulting market-clearing process, implemented via convex programming, yields corresponding schedules and payments based on traditional energy prices for power supply and per-cycle prices for storage utilization. Our solution shows several advantages over the standard prosumer-based approach that prices energy per slot. In particular, it does not require a priori estimation of future prices and leads to an efficient, competitive equilibrium.
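To make the clearing mechanics concrete, here is a minimal sketch of a single-interval economic dispatch with two quadratic-cost generators, the textbook setting the talk's supply-function and storage-cycling bids generalize. The cost coefficients and demand are hypothetical numbers, and the closed-form KKT solution stands in for the convex program mentioned in the abstract.

```python
# Toy dispatch: minimize sum_i (a_i / 2) * g_i^2 subject to g1 + g2 = d.
# At the optimum the marginal costs a_i * g_i are equal, and that common
# value is the market-clearing price (illustrative, not the talk's model).
def clear(a1, a2, d):
    # KKT conditions: a1*g1 = a2*g2 = price, with g1 + g2 = d.
    price = d * a1 * a2 / (a1 + a2)
    return price / a1, price / a2, price

g1, g2, price = clear(2.0, 4.0, 9.0)
print(g1, g2, price)  # -> 6.0 3.0 12.0
```

The cheaper generator supplies more, and both face the same per-unit price; storage bids via energy-cycling functions plug into the same kind of clearing problem with per-cycle prices.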
@talk{astar23, date = {12/11/2023}, day = {11}, event = {IHPC's Workshop of Power and Energy Systems of the (near) Future, ASTAR}, host = {John Pang (ASTAR)}, month = {12}, role = {Speaker}, title = {Unintended Consequences of Market Designs}, url = {https://mallada.ece.jhu.edu/talks/202312-ASTAR.pdf}, year = {2023} }
- 2023-11-04: Model-Free Analysis of Dynamical Systems Using Recurrent Sets, FIND Seminar, Cornell University.
[BibTeX] [Abstract] [Download PDF]
In this talk, we develop model-free methods for analyzing dynamical systems using trajectory data. Our critical insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed notion of recurrence. Specifically, a set is τ-recurrent (resp. k-recurrent) if every trajectory that starts within the set returns to it after at most τ seconds (resp. k steps). We leverage this notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. Firstly, we consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point using trajectory data. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions. We then develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence that are obtained by sampling finite-length trajectories. Secondly, we generalize Lyapunov’s Direct Method to allow for non-monotonic evolution of the function values by only requiring sub-level sets to be τ-recurrent (instead of invariant). We provide conditions for stability, asymptotic stability, and exponential stability of an equilibrium using τ-decreasing functions (functions whose value along trajectories decrease after at most τ seconds) and develop a verification algorithm that leverages GPU parallel processing to verify such conditions using trajectories. We finalize by discussing future research directions and possible extensions for control.
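The sampling-based idea behind these recurrence certificates can be sketched in a few lines: draw starting points inside a candidate set, simulate finite-length trajectories, and look for counter-examples of k-recurrence. The one-dimensional dynamics and the interval-shaped candidate set below are hypothetical stand-ins, not the talk's algorithm or examples.

```python
import random

def is_k_recurrent(step, radius, k, n_samples=1000, seed=0):
    """Empirically test k-recurrence of the set {|x| <= radius}: every
    sampled trajectory starting in the set must re-enter it within at
    most k steps. A single trajectory that fails to return is a
    counter-example disproving recurrence."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        x = rng.uniform(-radius, radius)  # start inside the candidate set
        if not any(abs(x := step(x)) <= radius for _ in range(k)):
            return False  # counter-example found
    return True

stable = lambda x: 0.9 * x    # contraction: the set maps into itself
unstable = lambda x: 1.1 * x  # expansion: starts near the boundary never return
print(is_k_recurrent(stable, 1.0, 3))    # True
print(is_k_recurrent(unstable, 1.0, 3))  # False
```

For the stable dynamics every sampled start returns immediately, consistent with the result that a τ-recurrent set around a stable equilibrium sits inside its ROA; for the expanding dynamics, boundary samples supply the counter-examples.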
@talk{cornell23, date = {11/04/2023}, day = {04}, event = {FIND Seminar, Cornell University}, host = {Kevin A. Tang (Cornell)}, month = {11}, role = {Lecture}, title = {Model-Free Analysis of Dynamical Systems Using Recurrent Sets}, url = {https://mallada.ece.jhu.edu/talks/202311-Cornell.pdf}, year = {2023} }
- 2023-10-12: Reinforcement Learning with Almost Sure Constraints, MURI Workshop.
[BibTeX] [Abstract] [Download PDF]
In this work, we study how to tackle decision-making for safety-critical systems under uncertainty. To that end, we formulate a Reinforcement Learning problem with Almost Sure constraints, in which one seeks a policy that allows no more than Δ ∈ ℕ unsafe events in any trajectory, with probability one. We argue that this type of constraint might be better suited for safety-critical systems as opposed to the usual average constraint employed in Constrained Markov Decision Processes and that, moreover, having constraints of this kind makes feasible policies much easier to find. The talk is didactically split into two parts, first considering Δ = 0 and then the Δ ≥ 0 case. At the core of our theory is a barrier-based decomposition of the Q-function that decouples the problems of optimality and feasibility and allows them to be learned either independently or in conjunction. We develop an algorithm for characterizing the set of all feasible policies that provably converges in expected finite time. We further develop sample-complexity bounds for learning this set with high probability. Simulations corroborate our theoretical findings and showcase how our algorithm can be wrapped around other learning algorithms to hasten the search for first feasible and then optimal policies.
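For the Δ = 0 case with known deterministic dynamics, the feasible region can be computed by a shrinking fixed-point iteration: keep only the states from which some action stays inside the current safe set. This is a sketch of the underlying feasibility computation on a hypothetical toy model, not the authors' sample-based learning algorithm.

```python
def feasible_states(states, unsafe, actions, step):
    """Largest set S of safe states such that from every s in S some
    action keeps the next state in S: the maximal persistently safe
    region (the Δ = 0 feasible set), via shrinking fixed-point iteration."""
    S = {s for s in states if s not in unsafe}
    while True:
        S_next = {s for s in S if any(step(s, a) in S for a in actions)}
        if S_next == S:
            return S
        S = S_next

# Hypothetical 1-D grid world: positions 0..5, position 5 is unsafe,
# actions move left / stay / right with saturation at the boundaries.
states = range(6)
step = lambda s, a: min(max(s + a, 0), 5)
print(sorted(feasible_states(states, {5}, (-1, 0, 1), step)))  # -> [0, 1, 2, 3, 4]
```

Here every safe state can simply stay put, so the whole safe region is persistently safe; with drift or forced motion the iteration would prune states that are safe now but doomed later.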
@talk{muri23, date = {10/2023}, day = {12}, event = {MURI Workshop}, host = {Mario Sznaier (Northeastern)}, month = {10}, role = {Speaker}, title = {Reinforcement Learning with Almost Sure Constraints}, url = {https://mallada.ece.jhu.edu/talks/202310-MURI.pdf}, year = {2023} }
- 2023-09-07: Grid Shaping Control for High-IBR Power Systems: Stability Analysis and Control Design, GE EDGE Symposium.
[BibTeX] [Abstract] [Download PDF]
The transition of power systems from conventional synchronous generation towards renewable energy sources (with little or no inertia) is gradually threatening classical methods for achieving grid synchronization. A widely embraced approach to mitigate this problem is to mimic inertial response using grid-connected inverters. That is, to introduce virtual inertia to restore the stiffness that the system used to enjoy. In this talk, we seek to challenge this approach. We advocate taking advantage of the system’s low inertia to restore grid synchronism without incurring excessive control efforts. To this end, we develop an analysis and design framework for inverter-based frequency control. First, we develop novel stability analysis tools for power systems, which allow for the decentralized design of inverter-based controllers. The method requires that each inverter satisfies a standard H-infinity design requirement that depends on the dynamics of the components and inverters at each bus and the aggregate susceptance of the transmission lines connected to it. It is robust to network and delay uncertainty and, when no network information is available, reduces to the standard passivity condition for stability. Then, we propose a novel grid-forming control strategy, so-called grid shaping control, that aims to shape the frequency response of synchronous generators (SGs) to load perturbations so as to efficiently arrest sudden frequency drops. The approach builds on novel analysis tools that can characterize the Center of Inertia (CoI) response of a system with both IBRs and SGs and use this characterization to reshape it.
@talk{ge-edge23, date = {09/07/2023}, day = {07}, event = {GE EDGE Symposium}, host = {Aditya Kumar (GE)}, month = {09}, role = {Speaker}, title = {Grid Shaping Control for High-IBR Power Systems: Stability Analysis and Control Design}, url = {https://mallada.ece.jhu.edu/talks/202309-GE-EDGE.pdf}, year = {2023} }
- 2023-09-07: Learning Coherent Clusters in Weakly Connected Power Networks, 6th Workshop on Autonomous Energy Systems.
[BibTeX] [Abstract] [Download PDF]
Network coherence generally refers to the emergence of a simple aggregated dynamic response of generator units, despite heterogeneity in the units’ locations and dynamic constitutions. In this talk, we develop a general frequency-domain framework to analyze and quantify the level of network coherence that a system exhibits by relating coherence with a low-rank property of the system’s input-output response. Our analysis unveils the frequency-dependent nature of coherence and a non-trivial interplay between dynamics, network topology, and the type of disturbance. We further leverage this framework to build a structure-preserving model-reduction methodology for large-scale dynamic networks with tightly-connected components and provide time-domain bounds on the approximation error of our model. Our work provides new avenues for analysis and control design of IBR-rich power systems.
@talk{nrel23, date = {09/07/2023}, day = {07}, event = {6th Workshop on Autonomous Energy Systems}, host = {Andrey Bernstein (NREL), Guido Cavraro (NREL)}, month = {09}, role = {Speaker}, title = {Learning Coherent Clusters in Weakly Connected Power Networks}, url = {https://mallada.ece.jhu.edu/talks/202309-NREL.pdf}, year = {2023} }
- 2023-07-06: Model-Free Analysis of Dynamical Systems Using Recurrent Sets, Workshop on Uncertain Dynamical Systems.
[BibTeX] [Abstract] [Download PDF]
In this talk, we develop model-free methods for analyzing dynamical systems using data. Our key insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed notion of recurrence. A set is τ-recurrent (resp. k-recurrent) if every trajectory that starts within the set returns to it after at most τ seconds (resp. k steps). We then leverage the notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. We first consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point without an explicit model of the dynamics. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions. We then leverage this property to develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence that are obtained by sampling finite-length trajectories. Our algorithms process samples sequentially, which allows them to continue being executed even after an initial offline training stage. We finalize by presenting recent extensions of this work that generalize Lyapunov’s Direct Method to allow functions that do not decrease monotonically to certify stability, and by illustrating future research directions.
@talk{wuds23, date = {07/06/2023}, day = {06}, event = {Workshop on Uncertain Dynamical Systems}, host = {Mario Sznaier (Northeastern), Fabrizio Dabbene (PoliTo), Constantino Lagoa (Penn State)}, month = {07}, role = {Speaker}, title = {Model-Free Analysis of Dynamical Systems Using Recurrent Sets}, url = {https://mallada.ece.jhu.edu/talks/202307-WUDS.pdf}, year = {2023} }
- 2023-07-19: Grid Shaping Control for High-IBR Power Systems, Panel on Future electricity systems: How to handle millions of power electronic-based devices and other emerging technologies, IEEE PES General Meeting.
[BibTeX] [Abstract] [Download PDF]
The transition of power systems from conventional synchronous generation towards renewable energy sources (with little or no inertia) is gradually threatening classical methods for achieving grid synchronization. A widely embraced approach to mitigate this problem is to mimic inertial response using grid-connected inverters. That is, to introduce virtual inertia to restore the stiffness that the system used to enjoy. In this talk, we seek to challenge this approach. We advocate taking advantage of the system’s low inertia to restore grid synchronism without incurring excessive control efforts. To this end, we develop an analysis and design framework for inverter-based frequency control. First, we develop novel stability analysis tools for power systems, which allow for the decentralized design of inverter-based controllers. The method requires that each inverter satisfies a standard H-infinity design requirement that depends on the dynamics of the components and inverters at each bus and the aggregate susceptance of the transmission lines connected to it. It is robust to network and delay uncertainty and, when no network information is available, reduces to the standard passivity condition for stability. Then, we propose a novel grid-forming control strategy, so-called grid shaping control, that aims to shape the frequency response of synchronous generators (SGs) to load perturbations so as to efficiently arrest sudden frequency drops. The approach builds on novel analysis tools that can characterize the Center of Inertia (CoI) response of a system with both IBRs and SGs and use this characterization to reshape it.
@talk{pesgm23, date = {07/19/2023}, day = {19}, event = {Panel on Future electricity systems: How to handle millions of power electronic-based devices and other emerging technologies, IEEE PES General Meeting}, host = {Claudia Andrea Rahmann (UChile), Amarsagar Reddy Ramapuram Matavalam (ASU)}, month = {07}, role = {Panelist}, title = {Grid Shaping Control for High-IBR Power Systems}, url = {https://mallada.ece.jhu.edu/talks/202307-PESGM.pdf}, year = {2023} }
- 2023-05-30: Iterative Policy Learning for Constrained RL via Dissipative Gradient Descent-Ascent, Workshop on Online Optimization Methods for Data-Driven Feedback Control, American Control Conference.
[BibTeX] [Abstract] [Download PDF]
In constrained reinforcement learning (C-RL), an agent seeks to learn from the environment a policy that maximizes the expected cumulative reward while satisfying minimum requirements in secondary cumulative reward constraints. Several algorithms rooted in sample-based primal-dual methods have been recently proposed to solve this problem in policy space. However, such methods are based on stochastic gradient descent-ascent algorithms whose trajectories are connected to the optimal policy only after a mixing output stage that depends on the algorithm’s history. As a result, there is a mismatch between the behavioral policy and the optimal one. In this talk, we propose a novel algorithm for constrained RL that does not suffer from these limitations. Leveraging recent results on regularized saddle-flow dynamics, we develop a novel stochastic gradient descent-ascent algorithm whose trajectories converge to the optimal policy almost surely.
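The role of dissipation can be seen on a toy saddle problem: plain discrete gradient descent-ascent on the bilinear function L(x, y) = x·y spirals away from the saddle point (0, 0), while adding a damping term to the dual ascent makes the iterates converge. This is a minimal sketch of the regularization idea behind the saddle-flow results the talk builds on, not the talk's actual C-RL algorithm; the step size and damping coefficient are hypothetical.

```python
# Gradient descent-ascent on L(x, y) = x * y, optionally with a
# dissipation term -rho * y in the ascent direction.
def gda(eta=0.1, rho=0.0, steps=2000):
    x, y = 1.0, 1.0
    for _ in range(steps):
        gx = y            # dL/dx
        gy = x - rho * y  # dL/dy, damped when rho > 0
        x, y = x - eta * gx, y + eta * gy
    return x, y

x_plain, y_plain = gda(rho=0.0)  # undamped: the iterates spiral outward
x_damp, y_damp = gda(rho=0.5)    # damped: the iterates spiral into (0, 0)
print(x_plain**2 + y_plain**2 > 2.0, x_damp**2 + y_damp**2 < 1e-12)  # True True
```

The same mechanism, applied in policy space, is what lets the behavioral trajectories themselves converge to the optimum instead of requiring a separate averaging stage.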
@talk{acc23, abstract = {In constrained reinforcement learning (C-RL), an agent seeks to learn from the environment a policy that maximizes the expected cumulative reward while satisfying minimum requirements in secondary cumulative reward constraints. Several algorithms rooted in sample-based primal-dual methods have been recently proposed to solve this problem in policy space. However, such methods are based on stochastic gradient descent-ascent algorithms whose trajectories are connected to the optimal policy only after a mixing output stage that depends on the algorithm's history. As a result, there is a mismatch between the behavioral policy and the optimal one. In this talk, we propose a novel algorithm for constrained RL that does not suffer from these limitations. Leveraging recent results on regularized saddle-flow dynamics, we develop a novel stochastic gradient descent-ascent algorithm whose trajectories converge to the optimal policy almost surely.}, date = {05/30/2023}, day = {30}, event = {Workshop on Online Optimization Methods for Data-Driven Feedback Control, American Control Conference}, host = {Gianluca Bianchin (UCLouvain), Emiliano Dall'Anese (UC Boulder), Jorge Cortés (UCSD), Miguel Vaquero (IE University)}, month = {05}, role = {Speaker}, title = {Iterative Policy Learning for Constrained RL via Dissipative Gradient Descent-Ascent}, url = {https://mallada.ece.jhu.edu/talks/202305-ACC-Workshop.pdf}, year = {2023} }
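The saddle-point structure behind this line of work can be illustrated with a deterministic toy (a hypothetical one-variable example, not the talk's stochastic, regularized algorithm): projected gradient descent-ascent on a constrained problem whose KKT point is known in closed form.

```python
# Hypothetical toy problem: maximize r(theta) = -(theta - 2)^2
# subject to theta <= 1. The Lagrangian is
#   L(theta, lam) = r(theta) + lam * (1 - theta),  lam >= 0,
# with KKT point theta* = 1, lam* = 2.

def primal_dual(eta=0.01, iters=20000):
    theta, lam = 0.0, 0.0
    for _ in range(iters):
        # Gradient ascent on L in the primal variable theta ...
        theta += eta * (-2.0 * (theta - 2.0) - lam)
        # ... and projected gradient descent in the multiplier lam.
        lam = max(0.0, lam - eta * (1.0 - theta))
    return theta, lam

theta_hat, lam_hat = primal_dual()
```

Here strong concavity of the toy objective is enough for the last iterate to converge; in the general C-RL setting this is exactly what fails for plain stochastic GDA, motivating the regularized saddle-flow dynamics of the talk.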
- 2023-01-05: Learning Dynamics and Implicit Bias of Gradient Flow in Overparameterized Linear Models, Joint Mathematics Meeting, Special Session.
[BibTeX] [Abstract] [Download PDF]
Contrary to the common belief that overparameterization may hurt generalization and optimization, recent work suggests that overparameterization may bias the optimization algorithm towards solutions that generalize well — a phenomenon known as implicit regularization or implicit bias — and may also accelerate convergence — a phenomenon known as implicit acceleration. This talk will provide a detailed analysis of the dynamics of gradient flow in overparameterized linear models showing that convergence to equilibrium depends on the imbalance between input and output weights (which is fixed at initialization) and the margin of the initial solution. The talk will also provide an analysis of the implicit bias, showing that large hidden layer width, together with (properly scaled) random initialization, constrains the network parameters to converge to a solution which is close to the min-norm solution.
@talk{jmm23, abstract = {Contrary to the common belief that overparameterization may hurt generalization and optimization, recent work suggests that overparameterization may bias the optimization algorithm towards solutions that generalize well --- a phenomenon known as implicit regularization or implicit bias --- and may also accelerate convergence --- a phenomenon known as implicit acceleration. This talk will provide a detailed analysis of the dynamics of gradient flow in overparameterized linear models showing that convergence to equilibrium depends on the imbalance between input and output weights (which is fixed at initialization) and the margin of the initial solution. The talk will also provide an analysis of the implicit bias, showing that large hidden layer width, together with (properly scaled) random initialization, constrains the network parameters to converge to a solution which is close to the min-norm solution.}, date = {01/05/2023}, day = {05}, event = {Joint Mathematics Meeting, Special Session}, host = {Josué Tonelli Cueto, Hitesh Gakhar, Harlin Lee}, month = {01}, role = {Speaker}, title = {Learning Dynamics and Implicit Bias of Gradient Flow in Overparameterized Linear Models}, url = {https://mallada.ece.jhu.edu/talks/202301-JMM.pdf}, year = {2023} }
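For intuition on the min-norm claim, the baseline fact is easy to check numerically: gradient descent on an underdetermined least-squares problem, initialized at zero, converges to the minimum-norm interpolating solution (the wide overparameterized network in the talk is shown to land close to this point). A sketch with made-up random data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 20))   # 5 equations, 20 unknowns: underdetermined
y = rng.standard_normal(5)

# Plain gradient descent on the squared loss, started from zero.
w = np.zeros(20)
eta = 0.01
for _ in range(20000):
    w -= eta * X.T @ (X @ w - y)

# The iterates stay in the row space of X, so the limit is the
# minimum-norm interpolator, i.e. the pseudoinverse solution.
w_min_norm = np.linalg.pinv(X) @ y
```

Zero initialization is what pins down the row-space constraint; the talk's contribution is showing when the *nonlinear* overparameterized dynamics inherit (approximately) this same bias.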
- 2023-01-18: Frequency Shaping Control for Low Inertia Power Systems, 2023 ROSEI Summit, Johns Hopkins University.
[BibTeX] [Abstract] [Download PDF]
The transition of power systems from conventional synchronous generation towards renewable energy sources -with little or no inertia- is gradually threatening classical methods for achieving grid synchronization. A widely embraced approach to mitigate this problem is to mimic inertial response using grid-connected inverters. That is, to introduce virtual inertia to restore the stiffness that the system used to enjoy. In this talk, we seek to challenge this approach. We advocate taking advantage of the system’s low inertia to restore grid synchronism without incurring excessive control efforts. To this end, we develop an analysis and design framework for inverter-based frequency control. We define system-level performance metrics that are of practical relevance for power systems and systematically evaluate the performance of standard control strategies, such as virtual inertia and droop control, in the presence of power disturbances. Our analysis unveils the relatively limited role of inertia in improving performance and the inability of droop control to enhance performance without incurring considerable steady-state control effort. To overcome these limitations, we propose a novel frequency shaping control for grid-connected inverters -exploiting classical lead/lag compensation and model matching techniques from control theory- that can significantly outperform existing solutions while using comparable control effort.
@talk{rosei23, abstract = {The transition of power systems from conventional synchronous generation towards renewable energy sources -with little or no inertia- is gradually threatening classical methods for achieving grid synchronization. A widely embraced approach to mitigate this problem is to mimic inertial response using grid-connected inverters. That is, to introduce virtual inertia to restore the stiffness that the system used to enjoy. In this talk, we seek to challenge this approach. We advocate taking advantage of the system's low inertia to restore grid synchronism without incurring excessive control efforts. To this end, we develop an analysis and design framework for inverter-based frequency control. We define system-level performance metrics that are of practical relevance for power systems and systematically evaluate the performance of standard control strategies, such as virtual inertia and droop control, in the presence of power disturbances. Our analysis unveils the relatively limited role of inertia in improving performance and the inability of droop control to enhance performance without incurring considerable steady-state control effort. To overcome these limitations, we propose a novel frequency shaping control for grid-connected inverters -exploiting classical lead/lag compensation and model matching techniques from control theory- that can significantly outperform existing solutions while using comparable control effort.}, date = {01/18/2023}, day = {18}, event = {2023 ROSEI Summit, Johns Hopkins University}, host = {Ben Schaffer, Ben Link}, month = {01}, role = {Speaker}, title = {Frequency Shaping Control for Low Inertia Power Systems}, url = {https://mallada.ece.jhu.edu/talks/202301-ROSEI.pdf}, year = {2023} }
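The "limited role of inertia" point can be glimpsed even in a drastically simplified first-order frequency surrogate (a hypothetical single-bus model, not the framework of the talk): the inertia constant M only slows the transient, while both the peak and the steady-state frequency deviation are set by the droop/damping gain D.

```python
# First-order surrogate: M * dw/dt = dp - D * w, with step disturbance dp.
# w is the frequency deviation, M the (virtual) inertia, D the droop gain.

def peak_deviation(M, D, dp=1.0, h=0.01, T=50.0):
    """Largest |w| along a forward-Euler simulation of the step response."""
    w, peak = 0.0, 0.0
    for _ in range(int(T / h)):
        w += (h / M) * (dp - D * w)   # forward-Euler step
        peak = max(peak, abs(w))
    return peak
```

Raising M from 1 to 5 leaves the peak deviation (which tends to dp/D) essentially unchanged, while raising D shrinks it, at the price of a steady-state control effort proportional to D. A full swing-equation model adds oscillatory rotor dynamics, but the trade-off the talk targets is already visible here.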
2022
- 2022-12-19: Reinforcement Learning with Almost Sure Constraints, Topología y Probabilidad en análisis de datos, Universidad de la República.
[BibTeX] [Abstract] [Download PDF]
In this work, we study how to tackle decision-making for safety-critical systems under uncertainty. To that end, we formulate a Reinforcement Learning problem with Almost Sure constraints, in which one seeks a policy that allows no more than $\Delta \in \mathbb{N}$ unsafe events in any trajectory, with probability one. We argue that this type of constraint might be better suited for safety-critical systems as opposed to the usual average constraint employed in Constrained Markov Decision Processes and that, moreover, having constraints of this kind makes feasible policies much easier to find. The talk is didactically split into two parts, first considering $\Delta = 0$ and then the $\Delta \geq 0$ case. At the core of our theory is a barrier-based decomposition of the Q-function that decouples the problems of optimality and feasibility and allows them to be learned either independently or in conjunction. We develop an algorithm for characterizing the set of all feasible policies that provably converges in expected finite time. We further develop sample-complexity bounds for learning this set with high probability. Simulations corroborate our theoretical findings and showcase how our algorithm can be wrapped around other learning algorithms to hasten the search for first feasible and then optimal policies.
@talk{udelar22, abstract = {In this work, we study how to tackle decision-making for safety-critical systems under uncertainty. To that end, we formulate a Reinforcement Learning problem with Almost Sure constraints, in which one seeks a policy that allows no more than $\Delta \in \mathbb{N}$ unsafe events in any trajectory, with probability one. We argue that this type of constraint might be better suited for safety-critical systems as opposed to the usual average constraint employed in Constrained Markov Decision Processes and that, moreover, having constraints of this kind makes feasible policies much easier to find. The talk is didactically split into two parts, first considering $\Delta = 0$ and then the $\Delta \geq 0$ case. At the core of our theory is a barrier-based decomposition of the Q-function that decouples the problems of optimality and feasibility and allows them to be learned either independently or in conjunction. We develop an algorithm for characterizing the set of all feasible policies that provably converges in expected finite time. We further develop sample-complexity bounds for learning this set with high probability. Simulations corroborate our theoretical findings and showcase how our algorithm can be wrapped around other learning algorithms to hasten the search for first feasible and then optimal policies.}, date = {12/19/2022}, day = {19}, event = {Topología y Probabilidad en análisis de datos, Universidad de la República}, host = {Nicolas Frevenza (UdelaR), Soledad Villar (JHU)}, month = {12}, role = {Speaker}, title = {Reinforcement Learning with Almost Sure Constraints}, url = {https://mallada.ece.jhu.edu/talks/202212-UdelaR.pdf}, year = {2022} }
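For the no-unsafe-events case, the set of feasible states of a toy deterministic MDP can be computed by a simple fixed-point iteration (a hypothetical chain example; the talk's barrier-based decomposition learns such sets from data rather than from a known model).

```python
# States 0..4 on a chain; state 4 is unsafe. TRANSITIONS[s] lists the
# successor states reachable from s via some action. At state 3 every
# action is forced into the unsafe state, so 3 is infeasible as well.
TRANSITIONS = {0: {0, 1}, 1: {1, 2}, 2: {2, 3}, 3: {4}, 4: {4}}
UNSAFE = {4}

def feasible_states(transitions, unsafe):
    """Largest set of safe states from which some action stays in the set."""
    feasible = set(transitions) - unsafe
    while True:
        keep = {s for s in feasible if transitions[s] & feasible}
        if keep == feasible:
            return feasible
        feasible = keep
```

A state is feasible exactly when some policy keeps the trajectory inside the returned set forever; the iteration removes states whose every action exits the set, and terminates at the largest such (viable) set.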
- 2022-11-02: Model-free Analysis of Dynamical Systems Using Recurrence, Data Science Seminar, Johns Hopkins University.
[BibTeX] [Download PDF]@talk{dss22, annote = {In this talk, we develop model-free methods for analyzing dynamical systems using data. Our key insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed notion of recurrence. A set is τ-recurrent (resp. k-recurrent) if every trajectory that starts within the set returns to it after at most τ seconds (resp. k steps). We then leverage the notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. We first consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point without an explicit model of the dynamics. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions. We then leverage this property to develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence that are obtained by sampling finite-length trajectories. Our algorithms process samples sequentially, which allows them to continue being executed even after an initial offline training stage. We will conclude by presenting some recent extensions of this work that generalize Lyapunov's Direct Method to allow for non-decreasing functions to certify stability and illustrate future research directions.}, date = {11/02/2022}, day = {02}, event = {Data Science Seminar, Johns Hopkins University}, host = {Fei Lu (JHU), Mauro Maggioni (JHU)}, month = {11}, role = {Lecture}, title = {Model-free Analysis of Dynamical Systems Using Recurrence}, url = {https://mallada.ece.jhu.edu/talks/202211-DSS-JHU.pdf}, year = {2022} }
- 2022-09-07: Unintended Consequences of Market Designs, Workshop on Human Dimension of Energy Systems, NREL.
[BibTeX] [Abstract] [Download PDF]
In this talk, we seek to highlight the importance of accounting for the incentives of *all* market participants when designing market mechanisms for electricity. To this end, we perform a Nash equilibrium analysis of two different market mechanisms that aim to illustrate the critical role that the incentives of consumers and other new types of participants, such as storage, play in the equilibrium outcome. Firstly, we study the incentives of heterogeneous participants (generators and consumers) in a two-stage settlement market, where generators participate using a supply function bid and consumers use a quantity bid. We show that strategic consumers are able to exploit generators’ strategic behavior to maintain a systematic difference between the forward and spot prices, with the latter being higher. Notably, such a strategy does bring down consumer payments and undermines the supply-side market power. We further observe situations where generators lose profit by behaving strategically, a sign of overturn of the conventional supply-side market power. Secondly, we study a market mechanism for multi-interval electricity markets with generator and storage participants. Drawing ideas from supply function bidding, we introduce a novel bid structure for storage participation that allows storage units to communicate their cost to the market using energy-cycling functions that map prices to cycle depths. The resulting market-clearing process — implemented via convex programming — yields corresponding schedules and payments based on traditional energy prices for power supply and per-cycle prices for storage utilization. Our solution shows several advantages over the standard prosumer-based approach that prices energy per slot. In particular, it does not require a priori estimation of future prices and leads to an efficient, competitive equilibrium.
@talk{nrel-hd22, abstract = {In this talk, we seek to highlight the importance of accounting for the incentives of *all* market participants when designing market mechanisms for electricity. To this end, we perform a Nash equilibrium analysis of two different market mechanisms that aim to illustrate the critical role that the incentives of consumers and other new types of participants, such as storage, play in the equilibrium outcome. Firstly, we study the incentives of heterogeneous participants (generators and consumers) in a two-stage settlement market, where generators participate using a supply function bid and consumers use a quantity bid. We show that strategic consumers are able to exploit generators' strategic behavior to maintain a systematic difference between the forward and spot prices, with the latter being higher. Notably, such a strategy does bring down consumer payments and undermines the supply-side market power. We further observe situations where generators lose profit by behaving strategically, a sign of overturn of the conventional supply-side market power. Secondly, we study a market mechanism for multi-interval electricity markets with generator and storage participants. Drawing ideas from supply function bidding, we introduce a novel bid structure for storage participation that allows storage units to communicate their cost to the market using energy-cycling functions that map prices to cycle depths. The resulting market-clearing process -- implemented via convex programming -- yields corresponding schedules and payments based on traditional energy prices for power supply and per-cycle prices for storage utilization. Our solution shows several advantages over the standard prosumer-based approach that prices energy per slot. 
In particular, it does not require a priori estimation of future prices and leads to an efficient, competitive equilibrium.}, date = {09/07/2022}, day = {07}, event = {Workshop on Human Dimension of Energy Systems, NREL}, host = {Andrey Bernstein (NREL)}, month = {09}, role = {Speaker}, title = {Unintended Consequences of Market Designs}, url = {https://mallada.ece.jhu.edu/talks/202209-NREL-HD.pdf}, year = {2022} }
- 2022-08-25: Reinforcement Learning with Almost Sure Constraints, Massachusetts Institute of Technology.
[BibTeX] [Abstract] [Download PDF]
In this work, we study how to tackle decision-making for safety-critical systems under uncertainty. To that end, we formulate a Reinforcement Learning problem with Almost Sure constraints, in which one seeks a policy that allows no more than $\Delta \in \mathbb{N}$ unsafe events in any trajectory, with probability one. We argue that this type of constraint might be better suited for safety-critical systems as opposed to the usual average constraint employed in Constrained Markov Decision Processes and that, moreover, having constraints of this kind makes feasible policies much easier to find. The talk is didactically split into two parts, first considering $\Delta = 0$ and then the $\Delta \geq 0$ case. At the core of our theory is a barrier-based decomposition of the Q-function that decouples the problems of optimality and feasibility and allows them to be learned either independently or in conjunction. We develop an algorithm for characterizing the set of all feasible policies that provably converges in expected finite time. We further develop sample-complexity bounds for learning this set with high probability. Simulations corroborate our theoretical findings and showcase how our algorithm can be wrapped around other learning algorithms to hasten the search for first feasible and then optimal policies.
@talk{mit-rl22, abstract = {In this work, we study how to tackle decision-making for safety-critical systems under uncertainty. To that end, we formulate a Reinforcement Learning problem with Almost Sure constraints, in which one seeks a policy that allows no more than $\Delta \in \mathbb{N}$ unsafe events in any trajectory, with probability one. We argue that this type of constraint might be better suited for safety-critical systems as opposed to the usual average constraint employed in Constrained Markov Decision Processes and that, moreover, having constraints of this kind makes feasible policies much easier to find. The talk is didactically split into two parts, first considering $\Delta = 0$ and then the $\Delta \geq 0$ case. At the core of our theory is a barrier-based decomposition of the Q-function that decouples the problems of optimality and feasibility and allows them to be learned either independently or in conjunction. We develop an algorithm for characterizing the set of all feasible policies that provably converges in expected finite time. We further develop sample-complexity bounds for learning this set with high probability. Simulations corroborate our theoretical findings and showcase how our algorithm can be wrapped around other learning algorithms to hasten the search for first feasible and then optimal policies.}, date = {08/25/2022}, day = {25}, event = {Massachusetts Institute of Technology}, host = {Ali Jadbabaie (MIT)}, month = {08}, role = {Lecture}, title = {Reinforcement Learning with Almost Sure Constraints}, url = {https://mallada.ece.jhu.edu/talks/202208-MIT-DL.pdf}, year = {2022} }
- 2022-08-26: On the Convergence of Gradient Flow on Multi-layer Linear Models, Massachusetts Institute of Technology.
[BibTeX] [Abstract] [Download PDF]
The mysterious ability of gradient-based optimization algorithms to solve the non-convex neural network training problem is one of the many unexplained puzzles behind the success of deep learning in various applications. A promising direction to explain this phenomenon is to study how initialization and overparametrization affect the convergence of training algorithms. In this talk, we analyze the convergence of gradient flow on a multi-layer linear model with a loss function of the form $f(W_1 W_2 \cdots W_L)$. We show that when $f$ satisfies the gradient dominance property, proper weight initialization leads to exponential convergence of the gradient flow to a global minimum of the loss. Moreover, the convergence rate depends on two trajectory-specific quantities that are controlled by the weight initialization: the imbalance matrices, which measure the difference between the weights of adjacent layers, and the least singular value of the weight product $W = W_1 W_2 \cdots W_L$. Our analysis provides improved rate bounds for several multi-layer network models studied in the literature, leading to novel characterizations of the effect of weight imbalance on the rate of convergence. Our results apply to most regression losses and extend to classification ones.
@talk{mit-dl22, abstract = {The mysterious ability of gradient-based optimization algorithms to solve the non-convex neural network training problem is one of the many unexplained puzzles behind the success of deep learning in various applications. A promising direction to explain this phenomenon is to study how initialization and overparametrization affect the convergence of training algorithms. In this talk, we analyze the convergence of gradient flow on a multi-layer linear model with a loss function of the form $f(W_1 W_2 \cdots W_L)$. We show that when $f$ satisfies the gradient dominance property, proper weight initialization leads to exponential convergence of the gradient flow to a global minimum of the loss. Moreover, the convergence rate depends on two trajectory-specific quantities that are controlled by the weight initialization: the \emph{imbalance} matrices, which measure the difference between the weights of adjacent layers, and the least singular value of the \emph{weight product} $W = W_1 W_2 \cdots W_L$. Our analysis provides improved rate bounds for several multi-layer network models studied in the literature, leading to novel characterizations of the effect of weight imbalance on the rate of convergence. Our results apply to most regression losses and extend to classification ones.}, date = {08/26/2022}, day = {26}, event = {Massachusetts Institute of Technology}, host = {Navid Azizan (MIT)}, month = {08}, role = {Lecture}, title = {On the Convergence of Gradient Flow on Multi-layer Linear Models}, url = {https://mallada.ece.jhu.edu/talks/202208-MIT-DL.pdf}, year = {2022} }
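A scalar special case makes the two quantities above concrete: for the hypothetical loss $(w_1 w_2 - 1)^2$ (an illustration, not a model from the talk), the imbalance $w_1^2 - w_2^2$ is exactly conserved under gradient flow, and small-step gradient descent preserves it approximately while driving the product to its optimum.

```python
# Two-layer scalar linear model with loss (w1*w2 - 1)^2. Gradient flow
# conserves the imbalance w1^2 - w2^2; discrete gradient descent with a
# small step keeps it constant up to O(eta^2) drift per iteration.

def train(w1, w2, eta=1e-3, iters=20000):
    for _ in range(iters):
        e = w1 * w2 - 1.0                     # residual of the product
        # Simultaneous gradient step on both layers.
        w1, w2 = w1 - eta * 2.0 * e * w2, w2 - eta * 2.0 * e * w1
    return w1, w2

w1, w2 = train(2.0, 0.1)   # initial imbalance 2^2 - 0.1^2 = 3.99
```

The conservation law follows from $w_1 \dot{w}_1 = w_2 \dot{w}_2$ along the flow; larger initial imbalance increases the effective curvature $2(w_1^2 + w_2^2)$ and hence the convergence rate, a toy version of the rate bounds in the talk.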
- 2022-07-14: Learning-based Analysis and Control of Safety-Critical Systems, Workshop on Autonomous Energy Systems, National Renewable Energy Laboratory.
[BibTeX] [Download PDF]@talk{nrel-aes22, date = {07/14/2022}, day = {14}, event = {Workshop on Autonomous Energy Systems, National Renewable Energy Laboratory}, host = {Andrey Bernstein (NREL), Ahmed Zamzam (NREL), Bai Cui (NREL)}, month = {07}, role = {Speaker}, title = {Learning-based Analysis and Control of Safety-Critical Systems}, url = {https://mallada.ece.jhu.edu/talks/202207-NREL.pdf}, year = {2022} }
- 2022-05-26: Learning-based Analysis and Control of Safety-Critical Systems, University of California San Diego.
[BibTeX] [Download PDF]@talk{ucsd22, date = {05/26/2022}, day = {26}, event = {University of California San Diego}, host = {Jorge Cortés (UCSD)}, month = {05}, role = {Lecture}, title = {Learning-based Analysis and Control of Safety-Critical Systems}, url = {https://mallada.ece.jhu.edu/talks/202205-UCSD.pdf}, year = {2022} }
- 2022-05-27: Reinforcement Learning with Almost Sure Constraints, Information Theory and Applications Workshop.
[BibTeX] [Abstract] [Download PDF]
In this work, we study how to tackle decision-making for safety-critical systems under uncertainty. To that end, we formulate a Reinforcement Learning problem with Almost Sure constraints, in which one seeks a policy that allows no more than $\Delta \in \mathbb{N}$ unsafe events in any trajectory, with probability one. We argue that this type of constraint might be better suited for safety-critical systems as opposed to the usual average constraint employed in Constrained Markov Decision Processes and that, moreover, having constraints of this kind makes feasible policies much easier to find. The talk is didactically split into two parts, first considering $\Delta = 0$ and then the $\Delta \geq 0$ case. At the core of our theory is a barrier-based decomposition of the Q-function that decouples the problems of optimality and feasibility and allows them to be learned either independently or in conjunction. We develop an algorithm for characterizing the set of all feasible policies that provably converges in expected finite time. We further develop sample-complexity bounds for learning this set with high probability. Simulations corroborate our theoretical findings and showcase how our algorithm can be wrapped around other learning algorithms to hasten the search for first feasible and then optimal policies.
@talk{ita22, abstract = {In this work, we study how to tackle decision-making for safety-critical systems under uncertainty. To that end, we formulate a Reinforcement Learning problem with Almost Sure constraints, in which one seeks a policy that allows no more than $\Delta \in \mathbb{N}$ unsafe events in any trajectory, with probability one. We argue that this type of constraint might be better suited for safety-critical systems as opposed to the usual average constraint employed in Constrained Markov Decision Processes and that, moreover, having constraints of this kind makes feasible policies much easier to find. The talk is didactically split into two parts, first considering $\Delta = 0$ and then the $\Delta \geq 0$ case. At the core of our theory is a barrier-based decomposition of the Q-function that decouples the problems of optimality and feasibility and allows them to be learned either independently or in conjunction. We develop an algorithm for characterizing the set of all feasible policies that provably converges in expected finite time. We further develop sample-complexity bounds for learning this set with high probability. Simulations corroborate our theoretical findings and showcase how our algorithm can be wrapped around other learning algorithms to hasten the search for first feasible and then optimal policies.}, date = {05/27/2022}, day = {27}, event = {Information Theory and Applications Workshop}, host = {Christina Yu (Cornell)}, month = {05}, role = {Speaker}, title = {Reinforcement Learning with Almost Sure Constraints}, url = {https://mallada.ece.jhu.edu/talks/202205-ITA.pdf}, year = {2022} }
- 2022-05-04: Model Free Learning of Regions of Attraction via Recurrent Sets, MURI Workshop.
[BibTeX] [Abstract] [Download PDF]
In this talk, we develop model-free methods for analyzing dynamical systems using data. Our key insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed notion of recurrence. A set is τ-recurrent (resp. k-recurrent) if every trajectory that starts within the set returns to it after at most τ seconds (resp. k steps). We then leverage the notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. We first consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point without an explicit model of the dynamics. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions. We then leverage this property to develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence that are obtained by sampling finite-length trajectories. Our algorithms process samples sequentially, which allows them to continue being executed even after an initial offline training stage. We will conclude by presenting some recent extensions of this work that generalize Lyapunov’s Direct Method to allow for non-decreasing functions to certify stability and illustrate future research directions.
@talk{muri22, abstract = {In this talk, we develop model-free methods for analyzing dynamical systems using data. Our key insight is to replace the notion of invariance, a core concept in Lyapunov Theory, with the more relaxed notion of recurrence. A set is τ-recurrent (resp. k-recurrent) if every trajectory that starts within the set returns to it after at most τ seconds (resp. k steps). We then leverage the notion of recurrence to develop several analysis tools and algorithms to study dynamical systems. We first consider the problem of learning an inner approximation of the region of attraction (ROA) of an asymptotically stable equilibrium point without an explicit model of the dynamics. We show that a τ-recurrent set containing a stable equilibrium must be a subset of its ROA under mild assumptions. We then leverage this property to develop algorithms that compute inner approximations of the ROA using counter-examples of recurrence that are obtained by sampling finite-length trajectories. Our algorithms process samples sequentially, which allows them to continue being executed even after an initial offline training stage. We will conclude by presenting some recent extensions of this work that generalize Lyapunov's Direct Method to allow for non-decreasing functions to certify stability and illustrate future research directions.}, date = {05/04/2022}, day = {04}, event = {MURI Workshop}, host = {Mario Sznaier (Northeastern), Necmiye Ozay (UMich)}, month = {05}, role = {Panelist}, title = {Model Free Learning of Regions of Attraction via Recurrent Sets}, url = {https://mallada.ece.jhu.edu/talks/202205-MURI.pdf}, year = {2022} }
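The recurrence test itself is easy to try on a hand-picked scalar system (an illustration only, not the sequential algorithms of the talk): for $\dot{x} = -x + x^3$ the region of attraction of the origin is $(-1, 1)$, and a sampled k-recurrence check accepts candidate sets inside it while rejecting ones that reach beyond.

```python
# Discretized scalar system x_{t+1} = x_t + h * (-x_t + x_t^3); the ROA of
# the origin is (-1, 1). A candidate set [-c, c] is declared k-recurrent if
# every sampled trajectory starting in it re-enters it within k steps.

def returns_within(x0, c, k=1000, h=0.01):
    x = x0
    for _ in range(k):
        x = x + h * (-x + x ** 3)
        if abs(x) <= c:
            return True
        if abs(x) > 1e6:      # escaped far away: certainly not returning
            return False
    return False

def is_k_recurrent(c, n_samples=51):
    samples = [-c + 2.0 * c * i / (n_samples - 1) for i in range(n_samples)]
    return all(returns_within(x0, c) for x0 in samples)
```

A finite grid of samples can only falsify recurrence, which is exactly how the counter-example-driven algorithms of the talk use it: any sampled trajectory that fails to return shrinks the candidate set.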
- 2022-04-25: Embracing Low-Inertia in Power Systems: A Frequency Shaping Approach, University of California Berkeley.
[BibTeX] [Abstract] [Download PDF]
The transition of power systems from conventional synchronous generation towards renewable energy sources -with little or no inertia- is gradually threatening classical methods for achieving grid synchronization. A widely embraced approach to mitigate this problem is to mimic inertial response using grid-connected inverters. That is, to introduce virtual inertia to restore the stiffness that the system used to enjoy. In this talk, we seek to challenge this approach. We advocate taking advantage of the system’s low inertia to restore grid synchronism without incurring excessive control efforts. To this end, we develop an analysis and design framework for inverter-based frequency control. We define system-level performance metrics that are of practical relevance for power systems and systematically evaluate the performance of standard control strategies, such as virtual inertia and droop control, in the presence of power disturbances. Our analysis unveils the relatively limited role of inertia in improving performance and the inability of droop control to enhance performance without incurring considerable steady-state control effort. To overcome these limitations, we propose a novel frequency shaping control for grid-connected inverters -exploiting classical lead/lag compensation and model matching techniques from control theory- that can significantly outperform existing solutions while using comparable control effort.
@talk{berkeley22, abstract = {The transition of power systems from conventional synchronous generation towards renewable energy sources -with little or no inertia- is gradually threatening classical methods for achieving grid synchronization. A widely embraced approach to mitigate this problem is to mimic inertial response using grid-connected inverters. That is, to introduce virtual inertia to restore the stiffness that the system used to enjoy. In this talk, we seek to challenge this approach. We advocate taking advantage of the system's low inertia to restore grid synchronism without incurring excessive control efforts. To this end, we develop an analysis and design framework for inverter-based frequency control. We define system-level performance metrics that are of practical relevance for power systems and systematically evaluate the performance of standard control strategies, such as virtual inertia and droop control, in the presence of power disturbances. Our analysis unveils the relatively limited role of inertia in improving performance and the inability of droop control to enhance performance without incurring considerable steady-state control effort. To overcome these limitations, we propose a novel frequency shaping control for grid-connected inverters -exploiting classical lead/lag compensation and model matching techniques from control theory- that can significantly outperform existing solutions while using comparable control effort.}, date = {04/25/2022}, day = {25}, event = {University of California Berkeley}, host = {Murat Arcak (Berkeley)}, month = {04}, role = {Lecture}, title = {Embracing Low-Inertia in Power Systems: A Frequency Shaping Approach}, url = {https://mallada.ece.jhu.edu/talks/202204-Berkeley.pdf}, year = {2022} }
- 2022-04-11: Embracing Low Inertia for Power System Frequency Control: A Frequency Shaping Approach, ECE Seminar, University of Michigan.
[BibTeX] [Abstract] [Download PDF]
The transition of power systems from conventional synchronous generation towards renewable energy sources (with little or no inertia) is gradually threatening classical methods for achieving grid synchronization. A widely embraced approach to mitigating this problem is to mimic inertial response using grid-connected inverters, that is, to introduce virtual inertia to restore the stiffness that the system used to enjoy. In this talk, we seek to challenge this approach. We advocate taking advantage of the system’s low inertia to restore grid synchronism without incurring excessive control effort. To this end, we develop an analysis and design framework for inverter-based frequency control. We define system-level performance metrics that are of practical relevance for power systems and systematically evaluate the performance of standard control strategies, such as virtual inertia and droop control, in the presence of power disturbances. Our analysis unveils the relatively limited role of inertia in improving performance and the inability of droop control to enhance performance without incurring considerable steady-state control effort. To overcome these limitations, we propose a novel frequency-shaping control for grid-connected inverters (exploiting classical lead/lag compensation and model-matching techniques from control theory) that can significantly outperform existing solutions while using comparable control effort.
@talk{umich22, date = {04/11/2022}, day = {11}, event = {ECE Seminar, University of Michigan}, host = {Johanna Mathieu}, month = {04}, role = {Lecture}, title = {Embracing Low Inertia for Power System Frequency Control: A Frequency Shaping Approach}, url = {https://mallada.ece.jhu.edu/talks/202204-UMich.pdf}, year = {2022} }
- 2022-03-30: Coherence and Concentration in Tightly-Connected Networks, Workshop on Synchronization in Complex Systems, Army Research Office.
[BibTeX] [Abstract] [Download PDF]
Achieving coordinated behavior, engineered or emergent, on networked systems has attracted widespread interest in several fields. This interest has led to remarkable advances in developing a theoretical understanding of the conditions under which agents within a network can reach an agreement (consensus) or develop coordinated behavior, such as synchronization. However, much less understood is the phenomenon of network coherence. Network coherence broadly refers to the ability of the nodes in a network to exhibit a similar dynamic response despite heterogeneity in their individual behavior. In this talk, we present a general framework to analyze and quantify the level of network coherence that a system exhibits by relating coherence with a low-rank property. More precisely, for a networked system with linear dynamics and coupling, we show that the system transfer matrix converges to a rank-one transfer matrix representing the coherent behavior as the network connectivity grows. Interestingly, the non-zero eigenvalue of such a rank-one matrix is given by the harmonic mean of the individual nodal dynamics, and we refer to it as the coherent dynamics. Our analysis unveils the frequency-dependent nature of coherence and a non-trivial interplay between dynamics and network topology. We further illustrate how this framework can be leveraged to obtain accurate reduced-order models of coherent generators and to tune grid-forming inverters to shape the coherent response of a power grid.
@talk{aro22, date = {03/30/2022}, day = {30}, event = {Workshop on Synchronization in Complex Systems, Army Research Office}, host = {Derya Cansever (ARO), Jorge Cortés (UCSD), Fabio Pasqualetti (UCR)}, month = {03}, role = {Speaker}, title = {Coherence and Concentration in Tightly-Connected Networks}, url = {https://mallada.ece.jhu.edu/talks/202203-ARO-Workshop.pdf}, year = {2022} }
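The rank-one limit described in the abstract can be checked numerically in a toy setting, assuming first-order nodal dynamics g_i(s) = 1/(s + a_i) and complete-graph coupling (assumptions made here for illustration; the talk's framework is more general): as the coupling gain grows, the network transfer matrix approaches (1/n) 11^T times the harmonic mean of the nodal transfer functions.

```python
import numpy as np

n = 4
a = np.array([0.5, 1.0, 2.0, 4.0])       # heterogeneous nodal poles (illustrative)
A = np.diag(a)
L = n * np.eye(n) - np.ones((n, n))      # Laplacian of the complete graph K_n
s = 1j * 0.3                             # evaluation frequency

# Coherent dynamics: harmonic mean of the nodal transfer functions
# g_i(s) = 1/(s + a_i), i.e. g_bar(s) = n / sum_i (s + a_i).
g_bar = n / np.sum(s + a)
T_coh = (np.ones((n, n)) / n) * g_bar    # rank-one coherent response

def gap(k):
    """Spectral-norm distance between the network transfer matrix
    (s I + A + k L)^{-1} and its rank-one coherent approximation."""
    T = np.linalg.inv(s * np.eye(n) + A + k * L)
    return np.linalg.norm(T - T_coh, 2)

# The gap shrinks as the coupling strength (connectivity) grows.
gaps = [gap(k) for k in (1.0, 10.0, 100.0)]
```

The decreasing sequence of gaps illustrates the concentration toward the rank-one coherent dynamics as connectivity increases.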
2021
- 2021-11-03: Reinforcement Learning with Almost Sure Constraints, NSF TRIPODS PI Meeting.
[BibTeX] [Abstract] [Download PDF]
This talk aims to put forward the idea that learning to take safe actions in unknown environments (even with probability-one guarantees) can be achieved without the need for an unbounded number of exploratory trials, provided that one is willing to mildly relax one's optimality requirements. To this aim, we look at two settings that illustrate the feasibility of this approach. We first focus on the canonical multi-armed bandit problem and seek to study the exploration-preservation trade-off intrinsic to safe learning. By defining a handicap metric that counts the number of unsafe actions, we provide an algorithm for discarding unsafe machines (or actions), with probability one, that achieves constant handicap. Our algorithm is rooted in the classical sequential probability ratio test, redefined here for continuing tasks. Under standard assumptions on sufficient exploration, our rule provably detects all unsafe machines in an (expected) finite number of rounds. The analysis also unveils a trade-off between the number of rounds needed to secure the environment and the probability of discarding safe machines. We then study the problem of learning safe policies in the context of model-free constrained Markov decision processes. We propose the use of hard penalties (damage information), as a complement to rewards, that can be used to learn which actions lead to constraint violations. We show that such penalties naturally arise from a separation principle that decomposes the value and action-value functions into a reward component and a feasibility component, the latter represented by a hard barrier function. We further develop an adaptive algorithm for learning this barrier function, which incorporates the damage information and gradually reveals the safety constraints. In the process of learning such a barrier function, the policy is adapted so as to avoid “bumping into the same rock twice”.
Both algorithms can wrap around any other algorithm to optimize a specific auxiliary goal as they provide a safe environment to search for (approximately) optimal policies.
@talk{tripods21, date = {11/03/2021}, day = {03}, event = {NSF TRIPODS PI Meeting}, host = {Maryam Fazel (UW), Rene Vidal (JHU)}, month = {11}, role = {Speaker}, title = {Reinforcement Learning with Almost Sure Constraints}, url = {https://mallada.ece.jhu.edu/talks/202111-TRIPODS.pdf}, year = {2021} }
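The sequential probability ratio test underlying the bandit result can be sketched as follows, for hypothetical Bernoulli "damage" observations (the rates and thresholds below are illustrative choices, not those of the talk): an arm is declared unsafe once the log-likelihood ratio favors a high damage rate p1 over a low rate p0.

```python
import math
import random

def sprt_classify(pull, p0=0.1, p1=0.4, alpha=1e-3, beta=1e-3, max_rounds=10_000):
    """Wald-style sequential probability ratio test on binary damage observations.
    Returns 'unsafe' if the damage rate looks >= p1, 'safe' if it looks <= p0."""
    up = math.log((1 - beta) / alpha)   # accept H1 (unsafe) above this threshold
    lo = math.log(beta / (1 - alpha))   # accept H0 (safe) below this threshold
    llr = 0.0
    for _ in range(max_rounds):
        x = pull()                      # 1 = damage observed, 0 = no damage
        llr += math.log(p1 / p0) if x else math.log((1 - p1) / (1 - p0))
        if llr >= up:
            return "unsafe"
        if llr <= lo:
            return "safe"
    return "undecided"

rng = random.Random(0)
verdict_bad = sprt_classify(lambda: int(rng.random() < 0.5))    # truly unsafe arm
verdict_ok = sprt_classify(lambda: int(rng.random() < 0.02))    # truly safe arm
```

The thresholds trade off the time to secure the environment against the chance of discarding a safe machine, mirroring the trade-off discussed in the abstract.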
- 2021-10-27: Coherence and Concentration in Tightly Connected Networks, Data-based Diagnosis of Networked Dynamical Systems, CCS 2021 Satellite Symposium.
[BibTeX] [Download PDF]@talk{css21, date = {10/27/2021}, day = {27}, event = {Data-based Diagnosis of Networked Dynamical Systems, CCS 2021 Satellite Symposium}, host = {Melvyn Tyloo, Laurent Pagnier, Robinn Delabays}, month = {10}, role = {Speaker}, title = {Coherence and Concentration in Tightly Connected Networks}, url = {https://mallada.ece.jhu.edu/talks/202110-CSS.pdf}, year = {2021} }
- 2021-09-09: Coherence and Concentration in Tightly Connected Networks, Resilient Autonomous Energy Systems Workshop, National Renewable Energy Laboratory.
[BibTeX] [Download PDF]@talk{nrel21, date = {09/09/2021}, day = {09}, event = {Resilient Autonomous Energy Systems Workshop, National Renewable Energy Laboratory}, host = {Andrey Berstein (NREL), Bai Cui (NREL)}, month = {09}, role = {Speaker}, title = {Coherence and Concentration in Tightly Connected Networks}, url = {https://mallada.ece.jhu.edu/talks/202109-NREL.pdf}, year = {2021} }
- 2021-04-13: Incentive Analysis and Coordination Design for Multi-Timescale Electricity Markets, Epstein Institute Seminar, University of Southern California.
[BibTeX] [Abstract] [Download PDF]
This talk discusses incentives and coordination requirements that arise when heterogeneous participants bid in electricity markets that operate at different timescales. First, we consider the conventional timescales of market clearing, spanning 5 minutes to several hours ahead, and investigate the incentives for price manipulation that market participants (generators and loads) have in a two-stage settlement market. Our analysis unveils the importance of accounting for both generators’ and loads’ strategic behavior in two-stage markets, even when the consumers’ demand is inelastic! Precisely, we show that loads can exploit generators’ strategic bidding and maintain a systematic difference between the forward and spot prices, the latter being higher than the former. Such a strategy does bring down demand-side payments and undermines supply-side market power. Second, we consider the problem of co-optimizing generation resources with different timescale characteristics. To that end, we frame and study a joint problem that optimizes both slow-timescale economic dispatch resources and fast-timescale frequency regulation resources. We provide sufficient conditions to optimally decompose the joint problem into slow and fast timescale problems. These slow and fast timescale problems have appealing interpretations as the economic dispatch and frequency regulation problems, respectively. We further provide a market implementation for the fast-timescale problem. In this implementation, participants receive prices and dispatch and dynamically update their bids according to either a dynamic gradient play or best response. Under price-taking assumptions, our market implementation is guaranteed to converge to the optimal (efficient) allocation even in the presence of generator dynamics. A by-product of this solution is that frequency restoration and thermal limits are automatically guaranteed.
@talk{epstein21, date = {04/13/2021}, day = {13}, event = {Epstein Institute Seminar, University of Southern California}, host = {Jong-Shi Pang (USC), Suvrajeet Sen (USC)}, month = {04}, role = {Speaker}, title = {Incentive Analysis and Coordination Design for Multi-Timescale Electricity Markets}, url = {https://mallada.ece.jhu.edu/talks/202104-Epstein.pdf}, year = {2021} }
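The dynamic gradient-play bid update mentioned in the abstract can be sketched in a hypothetical single-interval market with quadratic costs and price-taking participants (a toy model; the talk treats generator dynamics and constraints that this sketch omits): the price rises while demand exceeds supply, and each supplier best-responds to the posted price.

```python
import numpy as np

c = np.array([1.0, 2.0, 4.0])   # quadratic cost coefficients: cost_i(q) = c_i q^2 / 2
d = 7.0                         # inelastic demand (illustrative)
gamma = 0.2                     # price (dual gradient) step size

lam, q = 0.0, np.zeros_like(c)
for _ in range(500):
    q = lam / c                      # price-taking best response: q_i = lam / c_i
    lam += gamma * (d - q.sum())     # price rises while demand exceeds supply

lam_star = d / np.sum(1.0 / c)       # efficient clearing price of this toy market
```

Under these assumptions the iteration contracts, and the price converges to the efficient allocation's clearing price, consistent with the convergence claim in the abstract.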