Loop Modelling
Welcome to the Examples page of ProDA (Protein Design Assistant)

Example 1 (β-hairpin in the rhinovirus coat protein 2 (PDBid 4rhv))

We pretend that the six-residue tip of this hairpin, between residues 16 and 23, is not known, and we try to recover it from a database from which all virus structures were removed. When loops of length 10 are extracted that fit with an RMSd better than 0.25 Å on the anchor residues 15-16 and 23-24, we obtain 27 hits (Figure 2). Unfortunately, more than half of these loops clash with the rest of the 4rhv coat protein 2. Clearly, LoopFinder is a useful first step, but it cannot be the whole solution on its own. When we ask WHAT IF to remove the loops that clash with the rest of the protein, only nine hits are left: eight from cellular retinoic acid binding proteins and one from a tyrosine-protein phosphatase delta.

Figure 2. Loops of length 10 in the PDB that fit with RMSd < 0.25 Å on residues 15-16 and 23-24. After optimal superposition on the four anchor residues, many loops clash with the main body of the protein. The inset shows how WHAT IF’s loop search option selected six clash-free loops from those 27.
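The clash-removal step in this example can be illustrated with a minimal Python sketch. It is not WHAT IF's actual algorithm: the function names, the 3.0 Å cutoff, and the toy coordinates are all assumptions made for illustration, and the candidate loops are assumed to be already superposed on the anchor residues.

```python
import math

# Assumed cutoff (in Å): atoms closer than this count as a clash.
# Illustrative value only, not WHAT IF's actual criterion.
CLASH_CUTOFF = 3.0


def has_clash(loop_atoms, body_atoms, cutoff=CLASH_CUTOFF):
    """Return True if any loop atom lies within `cutoff` Å of any body atom."""
    return any(
        math.dist(a, b) < cutoff
        for a in loop_atoms
        for b in body_atoms
    )


def clash_free(candidate_loops, body_atoms, cutoff=CLASH_CUTOFF):
    """Keep only the candidate loops that do not clash with the protein body.

    `candidate_loops` maps a hit name to a list of (x, y, z) coordinates that
    are assumed to be already superposed on the anchor residues.
    """
    return {
        name: atoms
        for name, atoms in candidate_loops.items()
        if not has_clash(atoms, body_atoms, cutoff)
    }


if __name__ == "__main__":
    # Toy coordinates, not taken from 4rhv.
    body = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
    hits = {
        "hit_a": [(0.0, 6.0, 0.0), (3.8, 6.0, 0.0)],  # well away from the body
        "hit_b": [(0.0, 1.5, 0.0), (3.8, 6.0, 0.0)],  # first atom too close
    }
    print(sorted(clash_free(hits, body)))  # prints ['hit_a']
```

The all-against-all distance check is quadratic in the number of atoms, which is acceptable for a few dozen short loops; a production tool would more likely use a spatial grid or neighbour list.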

Example 2 (ProDA can search for very specific patterns)

A ProDA search for a 9-residue amphipathic helix with the surface accessibility pattern BIIIBIIIB and the polarity pattern NPPPNPPPN (B = buried, I = intermediate exposure; P = polar, N = non-polar) found two such helices in the PDB. This example illustrates that ProDA can search for very specific patterns. A helix one turn longer with similar constraints gives zero hits. For searches like these, at the edge of what the size of the PDB allows, the speed of ProDA comes in very handy.
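The idea behind such a combined pattern search can be sketched in a few lines of Python. This is only an illustration, not ProDA's implementation: the function names, the per-residue string inputs, and in particular the set of residues treated as polar are assumptions.

```python
# Assumed polarity classes (one-letter codes); ProDA's own classification
# may differ.
POLAR = set("DEHKNQRSTY")


def polarity(sequence):
    """Translate a one-letter sequence into P (polar) / N (non-polar)."""
    return "".join("P" if aa in POLAR else "N" for aa in sequence)


def find_pattern(sequence, accessibility, secondary,
                 acc_pattern="BIIIBIIIB", pol_pattern="NPPPNPPPN"):
    """Return 0-based start indices of helical windows matching both patterns.

    `accessibility` holds one letter per residue (B = buried,
    I = intermediate exposure) and `secondary` uses H for helix.
    """
    if not (len(sequence) == len(accessibility) == len(secondary)):
        raise ValueError("per-residue strings must have equal length")
    if len(acc_pattern) != len(pol_pattern):
        raise ValueError("patterns must have equal length")
    n = len(acc_pattern)
    pol = polarity(sequence)
    hits = []
    for i in range(len(sequence) - n + 1):
        if (secondary[i:i + n] == "H" * n
                and accessibility[i:i + n] == acc_pattern
                and pol[i:i + n] == pol_pattern):
            hits.append(i)
    return hits


if __name__ == "__main__":
    # Toy input, not a real PDB entry.
    seq = "GSLKEELKKELGS"
    acc = "IIBIIIBIIIBII"
    sec = "CCHHHHHHHHHCC"
    print(find_pattern(seq, acc, sec))  # prints [2]
```

Every added constraint can only shrink the hit list, which is why lengthening the helix by one turn can take the result from two hits to none.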

Example 3 (A therapeutic approach for angiogenesis)

Ghavamipour et al. (2014) suggested a therapeutic approach for angiogenesis by blocking a vascular endothelial growth factor (VEGF) binding site. An anti-angiogenic VEGF8–109 heterodimer was created in which two loops were replaced by loops from ureidoglycolate dehydrogenase (PDB ID: 1XRH; see Figure 4) and trichomaglin (PDB ID: 1SGL), respectively. ProDA was used to search for these loops. Searches for loops of 8 amino acids without regular secondary structure, with tight end-to-end distance cut-offs of 0.5 and 1.0 Å, respectively, returned Gly167-Thr174 of chain B in 1XRH and His43-Ser50 of chain A in 1SGL. Neither 1XRH nor 1SGL is in the list of high-quality PDB entries, so LoopFinder’s ‘allow-low-quality-structures’ flag had to be used, and the allowed misfit on the anchor stretches had to be relaxed, to get these two loops included in the very long list of hits.

Example 4 (characterizing the complete biological assembly of HIV integrase)

Roberts (2015) wanted to characterize the complete biological assembly of HIV integrase using its similarity to the structure of PFV-IN (prototype foamy virus integrase). Part of the approach included a loop transplant of residues A128-A135 of PDBid 2GHS for the PFV-IN loop 327-334 (PDBid 3OY9). The loop searched for should have eight residues, a distance of 5.6-5.8 Å between the initial and final Cα atoms, and a proline at the fourth position. ProDA does not find the exact 2GHS loop because 2GHS is not in its database, but it does find the homologous loop in several PDB files that closely resemble 2GHS.
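The three search criteria of this example (loop length, Cα-Cα end-point distance, and a required residue at a given position) can be combined as in the following Python sketch. The function name, its parameters, and the toy data are hypothetical; ProDA's real interface and database format are not shown here.

```python
import math


def find_loops(sequence, ca_coords, length=8, dist_range=(5.6, 5.8),
               required=((4, "P"),)):
    """Return 0-based start indices of windows that meet all criteria.

    sequence   -- one-letter amino acid sequence
    ca_coords  -- one (x, y, z) Cα coordinate per residue, in Å
    length     -- number of residues in the loop
    dist_range -- allowed distance between first and last Cα atom, in Å
    required   -- (position, residue) pairs, positions counted from 1
                  within the loop
    """
    lo, hi = dist_range
    hits = []
    for start in range(len(sequence) - length + 1):
        end = start + length - 1
        d = math.dist(ca_coords[start], ca_coords[end])
        if not lo <= d <= hi:
            continue
        if all(sequence[start + pos - 1] == aa for pos, aa in required):
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Toy chain with made-up coordinates, not taken from 2GHS or 3OY9.
    seq = "AGSTPLKDEA"
    coords = [
        (50.0, 0.0, 0.0), (0.0, 0.0, 0.0), (20.0, 0.0, 0.0), (3.0, 3.0, 0.0),
        (4.0, 4.0, 0.0), (5.0, 3.0, 0.0), (6.0, 2.0, 0.0), (60.0, 0.0, 0.0),
        (5.7, 0.0, 0.0), (40.0, 0.0, 0.0),
    ]
    print(find_loops(seq, coords))  # prints [1]
```

Checking the cheap distance criterion first and the sequence criterion second keeps the loop simple; the order does not change the result.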

Example 5 (study the effect of conformational transitions in the ATP hydrolysis of myosin)

Yang et al. (2008) studied the effect of conformational transitions in the ATP hydrolysis of myosin. They used quantum mechanical and molecular mechanical simulations on two structures of the D. discoideum myosin II motor domain (1FMW and 1VOM). 1FMW lacks residues 203–208, 500–508, and 622–626, while 1VOM lacks 205–208, 716–719, and 724–730. Yang et al. modelled these missing loops using the program ModLoop (Sali, 2003). We analysed the loops with LoopFinder and found a couple of dozen examples of nicely fitting loops for all missing stretches in 1FMW and 1VOM. ProDA also finds good examples for each missing loop when the searches involve tight restrictions on the Cα-Cα distance between the endpoints, on the secondary structure, and on the amino acid sequence. In 1FMW, the missing residues 203–208 were found with sequence NQANGS and secondary structure CCCCCC, 500–508 with sequence NWTFIDFGL and an end-point distance of 21-22 Å, and 622–626 with KKGAN and 8-10 Å. In 1VOM, the missing residues 205–208 were found with AEDS and 7-9 Å, 716–719 with AEDS, CCCC and 9-10 Å, and 724–730 with DAVLKHL and 9-10 Å. This illustrates that sometimes several sets of parameters should be tried to obtain the best ProDA results. This is not a big problem because ProDA performs its searches very fast.

Example 6 (Two peptides for disrupting oligomers of α-synuclein)

Parkinson's disease is characterized by the accumulation, in dopaminergic neurons, of inclusion bodies that consist mainly of fibrillar α-synuclein. Rezaeian et al. (2017) used ProDA to search for sequences that adopt a β-structure and that are identical or very similar to the regions of α-synuclein that are involved in aggregation. They obtained two peptides that could block aggregation in vitro, one of which even disrupted oligomers of α-synuclein.

Example 7 (Modelling antibody hypervariable loops)

Modelling antibody hypervariable loops is a difficult step in the design of therapeutic humanized antibodies. According to the model of Chothia et al. (1989) there are four possible canonical L1 loops for Vκ and VH antibodies. In Kabat numbering (Kabat, 1991) these loops stretch from position 26 to 32. Figure 5 shows loops fitted on two stretches of three amino acids that have either a Val (canonical model 1) or an Ile (canonical model 2) at position 29. These loops are superposed on the L1 loop of HyHEL-5 (PDBid 1yqv). Loop L1 in HyHEL-5 has canonical model 1.

Example 8 (The study of the binding of Viagra to phosphodiesterase 5A)

Zagrovic and Van Gunsteren (2007) studied the binding of Viagra to phosphodiesterase 5A with the molecular dynamics software GROMOS. For this study, it was crucial to model the missing part of the H loop (residues 665-675 in PDB file 1UDT). We used ProDA to search for this H loop with the IQRSEHPLAQL sequence motif and an endpoint distance of ~8.5 Å. A segment was found in PDB file 3BJC (which was not yet available to Zagrovic and Van Gunsteren at the time of their research) that would make an ideal starting point for the loop completion.