Many chemical reactions and molecular processes occur on time scales that are significantly longer than those accessible by direct simulations. One successful approach to estimating dynamical statistics for such processes is to use many short time series of observations of the system to construct a Markov state model, which approximates the dynamics of the system as memoryless transitions between a set of discrete states. The dynamical Galerkin approximation (DGA) is a closely related framework for estimating dynamical statistics, such as committors and mean first passage times, by approximating solutions to their equations with a projection onto a basis. Because the projected dynamics are generally not memoryless, the Markov approximation can result in significant systematic errors. Inspired by quasi-Markov state models, which employ the generalized master equation to encode memory resulting from the projection, we reformulate DGA to account for memory and analyze its performance on two systems: a two-dimensional triple well and the AIB9 peptide. We demonstrate that our method is robust to the choice of basis and can decrease the time series length required to obtain accurate kinetics by an order of magnitude.
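As a point of reference for the memoryless baseline described above, the following is a minimal sketch of how a Markov state model might be estimated from many short discretized trajectories and then used to compute a committor. It is not the memory-corrected method of this work: it illustrates only the plain Markov approximation. The function names (`count_matrix`, `transition_matrix`, `committor`), the assumption that trajectories are already assigned to integer state labels, and the simple row normalization are all illustrative choices, not taken from the paper.

```python
import numpy as np


def count_matrix(trajs, n_states, lag):
    """Count transitions i -> j separated by `lag` steps over all short trajectories."""
    counts = np.zeros((n_states, n_states))
    for traj in trajs:
        traj = np.asarray(traj, dtype=int)
        # Pair each frame with the frame `lag` steps later; np.add.at
        # accumulates correctly when the same (i, j) pair repeats.
        np.add.at(counts, (traj[:-lag], traj[lag:]), 1.0)
    return counts


def transition_matrix(counts):
    """Row-normalize a count matrix into a transition matrix (assumed estimator)."""
    row_sums = counts.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every state needs at least one outgoing transition")
    return counts / row_sums


def committor(T, A, B):
    """Probability of reaching set B before set A under the Markov model T."""
    n = T.shape[0]
    q = np.zeros(n)
    q[B] = 1.0  # boundary conditions: q = 0 on A, q = 1 on B
    interior = np.setdiff1d(np.arange(n), np.concatenate([A, B]))
    # On interior states q satisfies q_I = T_II q_I + T_IB 1,
    # i.e. (I - T_II) q_I = T_IB 1.
    lhs = np.eye(len(interior)) - T[np.ix_(interior, interior)]
    rhs = T[np.ix_(interior, B)].sum(axis=1)
    q[interior] = np.linalg.solve(lhs, rhs)
    return q
```

In practice one would call `count_matrix` on the set of short trajectories at a chosen lag time, pass the result to `transition_matrix`, and then solve for the committor between two metastable sets. The systematic error of this estimate when the projected dynamics retain memory is what motivates the reformulation discussed in the abstract.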
