HIV protease

This guide focuses on using Amber to simulate HIV-1 protease, a crucial enzyme in the life cycle of the human immunodeficiency virus (HIV) and a prime target for antiretroviral therapy. The following sections will cover the practical aspects of using Amber for HIV-1 protease simulations, including system preparation, force field selection, simulation protocols, and data analysis techniques.

Background

HIV-1 protease plays a critical role in the maturation of HIV particles, making it essential to the virus's replication. By cleaving viral polyproteins into functional proteins, it enables the formation of infectious viral particles. Understanding this enzyme's structure, dynamics, and function is central to developing effective inhibitors and advancing HIV treatment strategies.

The significance of HIV-1 protease extends beyond its immediate role in viral replication:

  1. Public health impact: HIV/AIDS remains a global health challenge, affecting millions of people worldwide. Insights gained from studying HIV-1 protease contribute to the ongoing efforts to combat this pandemic.
  2. Drug design: HIV-1 protease is a prime target for antiretroviral drugs, so detailed knowledge of its structure and dynamics is crucial for rational drug design and for developing new therapeutic strategies.
  3. Resistance mechanisms: HIV's rapid mutation rate often leads to drug resistance. MD simulations can help elucidate the molecular basis of these resistance mechanisms, informing the design of more robust inhibitors.

From a biophysical perspective, HIV-1 protease presents several exciting characteristics that make it an excellent subject for MD simulations:

  1. Structural flexibility: The enzyme exhibits significant conformational changes during its catalytic cycle, particularly in the "flap" regions that control access to the active site. MD simulations can capture these dynamic processes, providing insights into the enzyme's mechanism.
  2. Homodimeric nature: HIV-1 protease functions as a homodimer, with the active site formed at the interface between two identical subunits. This symmetry adds an interesting dimension to the simulation setup and analysis.
  3. Substrate specificity: The enzyme recognizes and cleaves specific sequences in viral polyproteins. MD simulations can help elucidate the molecular basis of this specificity and how mutations might affect it.
  4. Water-mediated interactions: Water molecules play a crucial role in the enzyme's catalytic mechanism and in mediating protein-ligand interactions. Explicit solvent MD simulations are particularly valuable for studying these effects.
  5. Allosteric effects: Recent studies have suggested the presence of allosteric sites in HIV-1 protease, opening new avenues for drug design. MD simulations can help identify and characterize these sites.

By running Amber simulations on HIV-1 protease, researchers can gain valuable insights into these biophysical characteristics, contributing to fundamental science and applied research in drug discovery.

Protein preparation

System preparation is a crucial step in molecular dynamics (MD) simulations that significantly impacts the quality and reliability of results. This process involves selecting an appropriate starting structure and setting up the system to accurately represent physiological conditions. Proper preparation minimizes artifacts and ensures that simulations reflect the true behavior of the protein in its biological context.

Protein selection

The starting structure is the basis for every subsequent step in the simulation process, including system setup, energy minimization, and production runs. Selecting it carefully increases the likelihood of obtaining biologically relevant and computationally robust results, which matter both for understanding the protein's behavior and for applications such as drug design and the study of resistance mechanisms. Time invested here can save considerable effort and computational resources later.

We will use the Protein Data Bank (PDB) as our primary source of structural data. When selecting a structure, consider the following criteria:

  1. Resolution: Prioritize high-resolution structures (typically < 2.0 Å) for more accurate atomic positions. Lower resolution (i.e., > 2.0 Å) structures may lead to simulation inaccuracies due to less precise atomic coordinates.
  2. Validation scores: Use the PDB validation reports to assess structure quality. Key metrics include:
    • Clashscore: Lower values indicate fewer steric clashes between atoms.
    • Ramachandran outliers: Fewer outliers suggest better backbone geometry.
    • Rotamer outliers: Fewer outliers indicate more reliable side-chain conformations.
    • Overall quality at a glance: Provides a quick assessment of the structure's quality.
  3. Completeness: Choose structures with minimal missing atoms or residues to reduce uncertainties in the simulation.
  4. Apo structure: Use a ligand-free (apo) structure to study the intrinsic dynamics of HIV-1 protease without bias from bound ligands or inhibitors.
  5. HIV variant: Select a structure representing the most common HIV variant, such as HIV-1 subtype B, which is prevalent in North America, Western Europe, and Australia.
  6. Experimental method: While X-ray crystallography is common, consider high-quality structures from other methods like cryo-electron microscopy (cryo-EM) or NMR spectroscopy if they offer advantages in physiological relevance or completeness.
  7. Publication date and citations: Balance recent structures benefiting from improved techniques with well-established, highly cited structures that allow comparison with previous studies.
  8. Physiological relevance: Prefer structures determined under conditions mimicking physiological environments, such as appropriate pH and temperature.
  9. Authors and laboratory reputation: Structures from laboratories with expertise in HIV protease can provide additional confidence in data quality.
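
The resolution criterion can be checked directly from a downloaded PDB file, because the format stores it in the `REMARK   2 RESOLUTION.` record. The following is a minimal sketch, assuming a legacy-format PDB file; the sample header and its 1.80 Å value are made up for illustration and do not describe any particular entry.

```shell
# Hypothetical sample file standing in for a downloaded PDB entry.
PDB_FILE="$(mktemp)"
cat > "$PDB_FILE" <<'EOF'
HEADER    HYDROLASE
REMARK   2
REMARK   2 RESOLUTION.    1.80 ANGSTROMS.
EOF

# The resolution is the fourth whitespace-separated field of the record.
resolution="$(awk '/^REMARK   2 RESOLUTION\./ {print $4; exit}' "$PDB_FILE")"

# Compare against the 2.0 A cutoff used in this guide (awk handles the float math).
verdict="$(awk -v r="$resolution" 'BEGIN { print ((r + 0 < 2.0) ? "ok" : "too coarse") }')"
echo "resolution: $resolution A -> $verdict"
```

The other criteria (validation scores, completeness, variant) still need to be judged from the validation report and the entry page.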

To begin the search, we used the following parameters in the PDB:

  • Full Text: HIV Protease
  • Polymer Entity Type is Protein
  • Refinement Resolution is between 0.5 and 2 Å (upper bound included)
  • Enzyme Classification Name is Hydrolases
  • Scientific Name of the Source Organism is Human immunodeficiency virus 1

Note

It's worth noting that while we aim for an apo structure to avoid bias, in some cases, the highest quality available structures may be ligand-bound. If this is the case, we'll need to carefully remove the ligand and consider running a short equilibration simulation to allow the protein to relax into its unbound state.

At the time of writing, this search returned 302 results, but only a few contain no drug-like ligands: 1TW7, 2PC0, and 2G69. Both 1TW7 and 2G69 carry one or more mutations introduced to study drug-resistance mechanisms, so we will set them aside for now. This leaves 2PC0 as our protein.
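
One way to obtain the selected entry is to fetch it from the RCSB file server. The sketch below assumes the `https://files.rcsb.org/download/<ID>.pdb` URL pattern and that `curl` is installed; the helper names `pdb_url` and `fetch_pdb` are hypothetical and exist only for this example.

```shell
# Build the download URL for a PDB entry ID (hypothetical helper).
pdb_url() {
    printf 'https://files.rcsb.org/download/%s.pdb\n' "$1"
}

# Download an entry into the current directory as <ID>.pdb (hypothetical helper).
# -f: fail on HTTP errors, -sS: quiet but still report errors, -L: follow redirects.
fetch_pdb() {
    curl -fsSL -o "$1.pdb" "$(pdb_url "$1")"
}

# Print the URL that would be requested for our protein.
pdb_url 2PC0
```

Calling `fetch_pdb 2PC0` would then write `2PC0.pdb` to the working directory.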

Trimming PDB file

Prelude

HEADER    HYDROLASE                               29-MAR-07   XXXX
TITLE     APO WILD-TYPE HIV PROTEASE IN THE OPEN CONFORMATION
KEYWDS    HIV PROTEASE, HYDROLASE
EXPDTA    X-RAY DIFFRACTION
AUTHOR    H.HEASLET,R.ROSENFELD,M.J.GIFFIN,J.H.ELDER,D.E.MCREE,C.D.STOUT
JRNL        AUTH   H.HEASLET,R.ROSENFELD,M.GIFFIN,Y.C.LIN,K.TAM,B.E.TORBETT,
JRNL        AUTH 2 J.H.ELDER,D.E.MCREE,C.D.STOUT
JRNL        TITL   CONFORMATIONAL FLEXIBILITY IN THE FLAP DOMAINS OF
JRNL        TITL 2 LIGAND-FREE HIV PROTEASE.
JRNL        REF    ACTA CRYSTALLOGR.,SECT.D      V.  63   866 2007
JRNL        REFN                   ISSN 0907-4449
JRNL        PMID   17642513
JRNL        DOI    10.1107/S0907444907029125
SEQRES   1 A   99  PRO GLN ILE THR LEU TRP LYS ARG PRO LEU VAL THR ILE
SEQRES   2 A   99  LYS ILE GLY GLY GLN LEU LYS GLU ALA LEU LEU ASP THR
SEQRES   3 A   99  GLY ALA ASP ASP THR VAL LEU GLU GLU MET ASN LEU PRO
SEQRES   4 A   99  GLY ARG TRP LYS PRO LYS MET ILE GLY GLY ILE GLY GLY
SEQRES   5 A   99  PHE ILE LYS VAL ARG GLN TYR ASP GLN ILE LEU ILE GLU
SEQRES   6 A   99  ILE CYS GLY HIS LYS ALA ILE GLY THR VAL LEU VAL GLY
SEQRES   7 A   99  PRO THR PRO VAL ASN ILE ILE GLY ARG ASN LEU LEU THR
SEQRES   8 A   99  GLN ILE GLY CYS THR LEU ASN PHE
HETNAM      MG MAGNESIUM ION
HETNAM     PGR R-1,2-PROPANEDIOL
FORMUL   2   MG    MG 2+
FORMUL   3  PGR    C3 H8 O2
FORMUL   4  HOH   *112(H2 O)
HELIX    1   1 GLY A 1086  ILE A 1093  1                                   8
SHEET    1   A 8 TRP A1042  GLY A1048  0
SHEET    2   A 8 PHE A1053  ILE A1066 -1  O  GLN A1058   N  LYS A1043
SHEET    3   A 8 HIS A1069  VAL A1077 -1  O  VAL A1077   N  ARG A1057
SHEET    4   A 8 VAL A1032  LEU A1033  1  N  LEU A1033   O  LEU A1076
SHEET    5   A 8 ILE A1084  ILE A1085 -1  O  ILE A1084   N  VAL A1032
SHEET    6   A 8 GLN A1018  LEU A1024  1  N  LEU A1023   O  ILE A1085
SHEET    7   A 8 LEU A1010  ILE A1015 -1  N  ILE A1013   O  LYS A1020
SHEET    8   A 8 PHE A1053  ILE A1066 -1  O  GLU A1065   N  LYS A1014
LINK         O   HOH A   1                MG    MG A 201     1555   1555  2.04
LINK         O   HOH A   1                MG    MG A 201     7555   1555  2.23
LINK         O   HOH A  50                MG    MG A 201     1555   1555  2.69
LINK         O   HOH A  50                MG    MG A 201     7555   1555  1.53
LINK         O   HOH A  62                MG    MG A 201     1555   1555  1.94
LINK         O   HOH A  62                MG    MG A 201     7555   1555  2.31
LINK         O   HOH A  85                MG    MG A 201     1555   1555  2.15
LINK         O   HOH A  85                MG    MG A 201     7555   1555  2.26
SITE     1 AC1  4 HOH A   1  HOH A  50  HOH A  62  HOH A  85
SITE     1 AC2  6 HOH A   7  HOH A 100  GLY A1052  PHE A1053
SITE     2 AC2  6 LEU A1063  ILE A1072
CRYST1   46.416   46.416  101.370  90.00  90.00  90.00 P 41 21 2     8
ORIGX1      1.000000  0.000000  0.000000        0.00000
ORIGX2      0.000000  1.000000  0.000000        0.00000
ORIGX3      0.000000  0.000000  1.000000        0.00000
SCALE1      0.021544  0.000000  0.000000        0.00000
SCALE2      0.000000  0.021544  0.000000        0.00000
SCALE3      0.000000  0.000000  0.009865        0.00000

Temperature factors

ATOM      1  N   PRO A1001      -4.475 -10.469  -7.839  1.00 23.99           N
ANISOU    1  N   PRO A1001     3081   2999   3032     54    -47    -99       N
ATOM      2  CA  PRO A1001      -3.664 -11.208  -6.887  1.00 23.50           C
ANISOU    2  CA  PRO A1001     3061   2870   2998     26    -18    -31       C
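
The ANISOU records hold anisotropic temperature factors, which duplicate the atom identity of the preceding ATOM line and are not needed as input coordinates. A minimal sketch for dropping them is shown below, assuming GNU sed (for `-i`) and reusing the `OUTPUT_PATH` variable name from the other commands in this guide; here it points at a temporary sample file rather than the real working copy.

```shell
# Hypothetical working copy; in practice OUTPUT_PATH is the trimmed PDB file.
OUTPUT_PATH="$(mktemp)"

# Two atoms from 2PC0, each followed by its ANISOU record.
cat > "$OUTPUT_PATH" <<'EOF'
ATOM      1  N   PRO A1001      -4.475 -10.469  -7.839  1.00 23.99           N
ANISOU    1  N   PRO A1001     3081   2999   3032     54    -47    -99       N
ATOM      2  CA  PRO A1001      -3.664 -11.208  -6.887  1.00 23.50           C
ANISOU    2  CA  PRO A1001     3061   2870   2998     26    -18    -31       C
EOF

# Delete every line whose record name is ANISOU (anchored at line start).
sed -i "/^ANISOU/d" "$OUTPUT_PATH"

cat "$OUTPUT_PATH"
```

Anchoring the pattern with `^` keeps the match restricted to the record name, so ATOM lines are left untouched.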

Removing water molecules

HETATM  842  O   HOH A   1       5.851   5.385  -0.099  0.50 18.43           O
ANISOU  842  O   HOH A   1     2309   3264   1429   -227    183    401       O
HETATM  843  O   HOH A   2      -5.279   8.960 -13.040  1.00  7.16           O
ANISOU  843  O   HOH A   2      525   1875    320    308    -23     -5       O
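
# Delete every line containing HOH: the water HETATM/ANISOU records and any header lines that reference them.
sed -i "/HOH/d" "$OUTPUT_PATH"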

Removing non-protein molecules

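# Delete the remaining non-protein records: the magnesium ion (MG) and R-1,2-propanediol (PGR).
sed -i -e "/ MG /d" -e "/PGR/d" "$OUTPUT_PATH"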
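
After trimming, it can help to confirm what is actually left in the file. The sketch below tallies record names (columns 1-6 of each line) and counts any surviving HETATM records; the sample file is a stand-in for the real working copy, with one water deliberately left in so the check has something to report.

```shell
# Hypothetical stand-in for the trimmed file.
OUTPUT_PATH="$(mktemp)"
cat > "$OUTPUT_PATH" <<'EOF'
ATOM      1  N   PRO A1001      -4.475 -10.469  -7.839  1.00 23.99           N
ATOM      2  CA  PRO A1001      -3.664 -11.208  -6.887  1.00 23.50           C
HETATM  843  O   HOH A   2      -5.279   8.960 -13.040  1.00  7.16           O
EOF

# The record name occupies columns 1-6; tally how often each one occurs.
cut -c1-6 "$OUTPUT_PATH" | sort | uniq -c

# Count leftover hetero-atom records (grep -c exits non-zero on zero matches).
hetatm_left="$(grep -c '^HETATM' "$OUTPUT_PATH" || true)"
if [ "$hetatm_left" -gt 0 ]; then
    echo "$hetatm_left HETATM record(s) still present"
fi
```

A count of zero HETATM records would indicate that only protein atoms remain.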

Result

System preparation

TODO: Introduce tleap and amber