Binding Affinity Calculator

Application description

The Binding Affinity Calculator (BAC), developed by the team of Prof Peter Coveney at University College London (UK), is a workflow tool that runs and analyses simulations designed to assess how well drugs bind to their target proteins and how changes to those proteins affect binding. It is a collection of scripts that wrap around common molecular dynamics codes to facilitate free energy calculations. It uses ensemble simulations to deliver robust, accurate and precise free energy computations from both alchemical and end-point analysis methodologies.

BAC encompasses two closely related methodologies: Thermodynamic Integration with Enhanced Sampling (TIES) and Enhanced Sampling of Molecular dynamics with Approximation of Continuum Solvent (ESMACS). Both methods use ensembles of molecular dynamics (MD) simulations to control for the inherently chaotic nature of the trajectories that MD produces. The collected trajectories are used to calculate the binding affinity between a ligand and a protein, expressed as a free energy of binding, which is a critical quantity in drug design and personalized medicine. The free energy difference between two states is an important quantity in biology because it determines their relative populations at equilibrium, and the binding free energy of an inhibitor often correlates with its effectiveness. The accurate prediction of binding free energies has therefore been a longstanding goal of computational methods. The two protocols are complementary: ESMACS is an absolute free energy method for ranking the binding affinities of highly diverse compounds, whereas TIES is a relative free energy method for estimating free energy differences between pairs of similar (congeneric) compounds and/or mutated protein sequences.
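As an illustration of the relative (alchemical) route, the sketch below integrates an ensemble-averaged dU/dlambda over a set of lambda windows and combines two such legs into a relative binding free energy. It is a simplified sketch under stated assumptions, not the TIES implementation: the function names, the trapezoid rule, the input layout and the simple error propagation are all choices made for this example.

```python
import math
import statistics


def ensemble_ti(lambdas, dudl_by_replica):
    """Integrate ensemble-averaged dU/dlambda over lambda (trapezoid rule).

    lambdas: increasing lambda values, one per window.
    dudl_by_replica: one list per replica (at least two replicas), each
        holding that replica's time-averaged dU/dlambda at every window.
    Returns (delta_g, standard_error).
    Hypothetical helper for illustration; not part of any TIES API.
    """
    n_rep = len(dudl_by_replica)
    columns = list(zip(*dudl_by_replica))  # replica values grouped per window
    means = [statistics.fmean(col) for col in columns]
    sems = [statistics.stdev(col) / math.sqrt(n_rep) for col in columns]

    # Trapezoid weights: each window gets half of each neighbouring interval.
    weights = [0.0] * len(lambdas)
    for i in range(len(lambdas) - 1):
        half = 0.5 * (lambdas[i + 1] - lambdas[i])
        weights[i] += half
        weights[i + 1] += half

    delta_g = sum(w * m for w, m in zip(weights, means))
    # Assumes the per-window errors are independent of each other.
    error = math.sqrt(sum((w * s) ** 2 for w, s in zip(weights, sems)))
    return delta_g, error


def relative_binding_free_energy(complex_leg, ligand_leg):
    """Combine two alchemical legs, each given as (delta_g, error).

    ddG = dG(transformation in complex) - dG(transformation in solvent).
    """
    dg_complex, err_complex = complex_leg
    dg_ligand, err_ligand = ligand_leg
    return dg_complex - dg_ligand, math.hypot(err_complex, err_ligand)
```

With real data, `dudl_by_replica` would hold one row per replica for each leg of the thermodynamic cycle; the spread across replicas is what supplies the error estimate.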

TIES aims to alleviate many of the bottlenecks in relative binding free energy calculations. However, because of the complexity and large computational cost of these calculations, TIES is targeted at expert users experienced in alchemical free energy calculations. Tutorials on how to run calculations with TIES are provided in the project's online documentation. BAC is a fairly complex tool to use, so at present the development team at UCL makes it available as part of consulting services or research collaborations.

 

Technical specifications

TIES is written in Python and as such is relatively hardware agnostic. The Python code calls external programs such as AmberTools [1] or OpenMM [2], which are less hardware agnostic. These external programs can be interchanged: for example, the MD engine OpenMM can be swapped for NAMD [3], which gives TIES the flexibility to exploit Intel/AMD/IBM CPUs as well as NVIDIA/AMD GPUs. The main dependencies of TIES are therefore the external MD engines, OpenMM and NAMD.
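The idea of an interchangeable MD engine can be sketched as a small interface that the workflow code depends on, with concrete engines plugged in behind it. Everything below is hypothetical and for illustration only: `MDEngine`, `EchoEngine` and `run_windows` are names invented for this sketch, not the TIES API, and the stand-in engine only reports what it would simulate rather than calling OpenMM or NAMD.

```python
from dataclasses import dataclass
from typing import Protocol


class MDEngine(Protocol):
    """Hypothetical engine interface (an assumption, not the TIES API)."""

    name: str

    def run(self, lam: float, replica: int) -> str:
        ...


@dataclass
class EchoEngine:
    """Stand-in engine: describes the simulation instead of running it."""

    name: str

    def run(self, lam: float, replica: int) -> str:
        return f"{self.name}: lambda={lam:.2f} replica={replica}"


def run_windows(engine: MDEngine, lambdas, n_replicas):
    """Workflow code sees only the interface, never a specific engine."""
    return [engine.run(lam, rep) for lam in lambdas for rep in range(n_replicas)]


# Swapping the engine does not change the workflow code.
openmm_jobs = run_windows(EchoEngine("openmm"), [0.0, 0.5, 1.0], n_replicas=2)
namd_jobs = run_windows(EchoEngine("namd"), [0.0, 0.5, 1.0], n_replicas=2)
```

Keeping the engine behind a narrow interface is one way a Python workflow layer can stay hardware agnostic while the engine underneath determines which CPUs or GPUs can be used.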

For ESMACS, AmberTools [1] is required for the post-processing analysis to extract the parameters of interest.
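A rough sketch of how an end-point estimate could be assembled from post-processed energies is shown below. It assumes that per-frame free energies of the complex, receptor and ligand have already been extracted for every replica by the post-processing step; the function name, the input layout and the bootstrap over replica means are assumptions made for this example, not the ESMACS implementation.

```python
import random
import statistics


def endpoint_binding_estimate(g_complex, g_receptor, g_ligand,
                              n_boot=1000, seed=0):
    """Ensemble end-point estimate: dG = <G_complex - G_receptor - G_ligand>.

    Each argument is a list of replicas; each replica is a list of per-frame
    energies (same number of frames for the three components of a replica).
    Returns (mean_dg, bootstrap_standard_error).
    Hypothetical helper for illustration only.
    """
    per_replica = []
    for gc, gr, gl in zip(g_complex, g_receptor, g_ligand):
        frame_dg = [c - r - l for c, r, l in zip(gc, gr, gl)]
        per_replica.append(statistics.fmean(frame_dg))

    # Bootstrap: resample replica means with replacement, n_boot times.
    rng = random.Random(seed)
    boot_means = [
        statistics.fmean(rng.choices(per_replica, k=len(per_replica)))
        for _ in range(n_boot)
    ]
    return statistics.fmean(per_replica), statistics.stdev(boot_means)
```

Treating each replica mean as one independent sample is the design choice here: the error bar then reflects replica-to-replica variation rather than the correlated frame-to-frame noise within a single trajectory.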

Typically, the trajectories resulting from a TIES or ESMACS calculation are inspected using visualization software such as PyMOL [4] or VMD [5]. No other external tools are strictly necessary to run TIES.

 

Cited references:

[1] Salomon-Ferrer, R., Case, D.A. and Walker, R.C., 2013. An overview of the Amber biomolecular simulation package. Wiley Interdisciplinary Reviews: Computational Molecular Science, 3(2), pp.198-210.

[2] P. Eastman, J. Swails, J. D. Chodera, R. T. McGibbon, Y. Zhao, K. A. Beauchamp, L.-P. Wang, A. C. Simmonett, M. P. Harrigan, C. D. Stern, R. P. Wiewiora, B. R. Brooks, and V. S. Pande. “OpenMM 7: Rapid development of high performance algorithms for molecular dynamics.” PLOS Comp. Biol. 13(7): e1005659. (2017)

[3] James C. Phillips, David J. Hardy, Julio D. C. Maia, John E. Stone, Joao V. Ribeiro, Rafael C. Bernardi, Ronak Buch, Giacomo Fiorin, Jerome Henin, Wei Jiang, Ryan McGreevy, Marcelo C. R. Melo, Brian K. Radak, Robert D. Skeel, Abhishek Singharoy, Yi Wang, Benoit Roux, Aleksei Aksimentiev, Zaida Luthey-Schulten, Laxmikant V. Kale, Klaus Schulten, Christophe Chipot, and Emad Tajkhorshid. Scalable molecular dynamics on CPU and GPU architectures with NAMD. Journal of Chemical Physics, 153:044130, 2020. doi:10.1063/5.0014475

[4] DeLano, W.L., 2002. Pymol: An open-source molecular graphics tool. CCP4 Newsletter on protein crystallography, 40(1), pp.82-92.

[5] Humphrey, W., Dalke, A. and Schulten, K., 1996. VMD: visual molecular dynamics. Journal of molecular graphics, 14(1), pp.33-38.

 

HPC usage and parallel performance

TIES and ESMACS have been deployed on numerous HPC systems, including Summit (OLCF), ThetaGPU (ALCF) and Longhorn (TACC). Both use a large ensemble of runs to distribute the computation across a large number of CPU cores or GPUs, with no communication between individual replicas in the ensemble. One caveat is that a single CPU-based instance of TIES/ESMACS using NAMD can itself be run in parallel using a hybrid OpenMP and MPI approach; this, however, relies on the parallel performance of NAMD, which is outside the scope of this project. Linear scaling has been measured up to 240 GPUs on the Summit HPC system; testing beyond this number of GPUs was not performed.
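Because replicas do not communicate, the ensemble behaves as a set of independent jobs. The sketch below shows that pattern with a toy stand-in for an MD replica; it is an illustration under assumptions, not how TIES or ESMACS schedule work. The names `run_replica`, `run_ensemble` and `ideal_wall_time` are hypothetical, and a thread pool is used only to keep the example portable, whereas on an HPC system each replica would be its own job step on a CPU core group or GPU.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def run_replica(job):
    """Toy stand-in for one MD replica: returns a noisy 'energy'.

    Each job carries its own seed, so no state is shared between replicas.
    """
    window, replica = job
    rng = random.Random(1000 * window + replica)
    return window, replica, rng.gauss(-10.0, 1.0)


def run_ensemble(n_windows, n_replicas, n_workers):
    """Run every (window, replica) pair independently; order is preserved."""
    jobs = [(w, r) for w in range(n_windows) for r in range(n_replicas)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(run_replica, jobs))


def ideal_wall_time(n_jobs, n_devices, hours_per_job):
    """Ideal wall time when each job occupies one device and jobs run in waves."""
    return math.ceil(n_jobs / n_devices) * hours_per_job
```

Under this idealized model, 240 independent jobs of 2 hours each finish in 2 hours on 240 devices and in 4 hours on 120 devices, which is the linear scaling expected when there is no inter-replica communication.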

HPC infrastructures are critical to the application of TIES and ESMACS. Calculating one binding affinity with TIES typically uses 120 NVIDIA V100 GPUs for a wall time of 1-3 hours, and one study may calculate tens to hundreds of binding affinities. ESMACS is typically performed with an ensemble of 25 replicas, which uses 25 or 50 compute nodes for a wall-clock time of 3-6 hours, depending on the HPC architecture and the size of the molecular systems. As such, TIES/ESMACS can be applied almost exclusively on large-scale supercomputers. The motivation for running calculations in this massively parallel way is twofold. First, many replicas of the simulation must be run to appropriately control for the aleatoric error in the MD trajectories. Second, binding affinities must be calculated in a timely fashion, on the order of hours, to be actionable in a clinical setting.
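The figures quoted above translate into resource budgets by simple multiplication, as the short sketch below shows. The helper names are hypothetical, and the study size of 100 binding affinities is an illustrative value chosen from within the "tens to hundreds" range, not a number from a specific study.

```python
def device_hours(n_devices, wall_hours_range):
    """Return (low, high) device-hours for one calculation."""
    low, high = wall_hours_range
    return n_devices * low, n_devices * high


def study_cost(per_calculation_range, n_calculations):
    """Scale a per-calculation (low, high) range to a whole study."""
    low, high = per_calculation_range
    return low * n_calculations, high * n_calculations


# TIES: 120 GPUs for 1-3 hours per binding affinity.
ties_one = device_hours(120, (1, 3))        # (120, 360) GPU-hours
# Illustrative study of 100 binding affinities (assumed size).
ties_study = study_cost(ties_one, 100)      # (12000, 36000) GPU-hours

# ESMACS: 25 or 50 nodes for 3-6 hours per calculation.
esmacs_25 = device_hours(25, (3, 6))        # (75, 150) node-hours
esmacs_50 = device_hours(50, (3, 6))        # (150, 300) node-hours
```

These are upper-level estimates only: they ignore queue time, system preparation and analysis, and assume every device is fully used for the whole wall time.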

The exploitation of alternative hardware and/or more efficient use of hardware through optimization would come from ongoing improvements to the MD engines that underpin the TIES and ESMACS methodologies. These improvements would most likely be made by the respective developers of the MD engines. No co-design activities are currently planned for TIES or ESMACS. However, co-design could be a route to better performance: for example, joint efforts with NAMD or OpenMM developers to speed up binding free energy calculations would translate directly into performance improvements for TIES and ESMACS.

Related articles

  • Wan S et al. 2020, Rapid, accurate, precise and reproducible ligand–protein binding free energy prediction.
  • Zasada SJ et al. 2020, Large-scale binding affinity calculations on commodity compute clouds.
  • Sadiq SK et al. 2008, Automated Molecular Simulation Based Binding Affinity Calculator for Ligand-Bound HIV-1 Proteases.

For more information about the applications supported in CompBioMed, you can contact us at "software at compbiomed.eu".