by
Diana Drennan, Frederic M. Richards and Peter C. Kahn
Note: This is the abstract from my dissertation. Four papers using these data are currently in preparation.
Although most methods used to analyze proteins of known structure agree on the general placement of secondary structures, the end points of secondary structures and the perturbations within them are often ambiguous. It is therefore not surprising that prediction methods fail to locate secondary structure end points accurately. Protein structural geometry was analyzed using an in-house program (GASP). The 95% confidence limits were calculated for five parameters (rise, radius, angle of rotation, residues per turn, and pitch) for all helical and strand-like tetrads (four consecutive α-carbons) in 112 independently solved, well-resolved proteins. These statistics were used to describe secondary structure geometry and were compared with traditional descriptions to ascertain where geometry and hydrogen bonding differed. Some tetrads within hydrogen-bonded secondary structures did not have helical or strand-like geometry (perturbations). Conversely, some tetrads with helical or strand-like geometry did not have helical or sheet hydrogen bonding. Some of these tetrads corresponded to turns, some were in extended conformation, and others appeared at the ends of secondary structures (extensions).
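As an illustration of how the five parameters can be obtained from a single tetrad, the following is a minimal sketch in Python. It uses a standard geometric construction (second differences of consecutive points on a helix are perpendicular to the helix axis) and is not necessarily the algorithm implemented in GASP, which the abstract does not describe. The function name `tetrad_parameters` and the ideal α-helix values used in the example (radius 2.3 Å, 100° rotation, 1.5 Å rise) are assumptions chosen for illustration.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def tetrad_parameters(p0, p1, p2, p3):
    """Estimate helical parameters from four consecutive alpha-carbon positions.

    Assumes the four points lie on one regular helix. Hypothetical helper,
    not the GASP implementation.
    """
    # Second differences point from the middle atom toward the helix axis
    # and are perpendicular to the axis.
    v1 = tuple(a - 2 * b + c for a, b, c in zip(p0, p1, p2))
    v2 = tuple(a - 2 * b + c for a, b, c in zip(p1, p2, p3))
    n1, n2 = _norm(v1), _norm(v2)
    axis = _cross(v1, v2)
    na = _norm(axis)
    if min(n1, n2, na) < 1e-9:
        raise ValueError("degenerate tetrad: points are collinear or coincident")
    axis = tuple(x / na for x in axis)

    # Rotation per residue is the angle between the two inward vectors.
    cos_t = max(-1.0, min(1.0, _dot(v1, v2) / (n1 * n2)))
    theta = math.acos(cos_t)

    # Rise is the projection of the middle bond onto the axis. With this
    # axis convention a negative rise indicates a left-handed helix.
    rise = _dot(tuple(b - a for a, b in zip(p1, p2)), axis)

    # |second difference| = 2 r (1 - cos theta) for a regular helix.
    radius = math.sqrt(n1 * n2) / (2.0 * (1.0 - cos_t))

    residues_per_turn = 2.0 * math.pi / theta
    return {
        "rise": rise,
        "radius": radius,
        "rotation_deg": math.degrees(theta),
        "residues_per_turn": residues_per_turn,
        "pitch": rise * residues_per_turn,
    }

# Example: four points generated on an idealized alpha-helix (assumed values).
r, t, d = 2.3, math.radians(100.0), 1.5
ideal = [(r * math.cos(k * t), r * math.sin(k * t), k * d) for k in range(4)]
params = tetrad_parameters(*ideal)
```

For real coordinates the four points will not lie exactly on one helix, so each value is a local estimate. Confidence limits such as those reported above would come from the distribution of these estimates over many tetrads.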
Several ambiguous areas were studied further and appear to be conserved in homologous proteins. This indicates that the phenomena are probably not artifacts of crystallography and that there is a reason for their presence. In some cases the reasons are not apparent; in others, readily identifiable forces appear to be at work. The ambiguous areas may be residual evidence of the folding pathway, and their further study may lead to a greater understanding of folding.
Positional frequencies were calculated for amino acids in ambiguous areas and compared to those for secondary structures. Significant differences were found, which may have an impact on the prediction of secondary structure end points from sequence data.
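A positional frequency calculation of this kind can be sketched as follows. This is only an illustration of the general idea, assuming each region has already been aligned so that index 0 is the same structural position in every segment. The function name `positional_frequencies` and the toy sequences are hypothetical and are not data from the dissertation.

```python
from collections import Counter, defaultdict

def positional_frequencies(segments):
    """Return {position: {amino_acid: frequency}} for aligned segments.

    Each segment is a string of one-letter amino acid codes whose index 0
    is assumed to be the same structural position in every segment.
    """
    counts = defaultdict(Counter)
    for seg in segments:
        for pos, aa in enumerate(seg):
            counts[pos][aa] += 1
    freqs = {}
    for pos, counter in counts.items():
        total = sum(counter.values())
        freqs[pos] = {aa: n / total for aa, n in counter.items()}
    return freqs

# Toy, made-up segments used only to exercise the function.
toy_segments = ["SPEE", "DPAE", "SAEE"]
freqs = positional_frequencies(toy_segments)
```

Tables built this way for ambiguous areas and for regular secondary structures could then be compared position by position with a suitable statistical test. The abstract does not say which test was used.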
Proteins solved by crystallography were compared with proteins solved by NMR, and several systematic differences were found. Three pairs of proteins solved by both crystallography and NMR were studied in depth, and further differences were found. These findings could have an impact on the use of crystal and NMR structures in drug design, ligand docking, and the modeling of protein interfaces.