With the explosive growth of the Internet, large amounts of biological data and genomic information have become available for the development of computational methods. A wide range of information is available to elucidate peptide and protein structures obtained by experimental approaches such as CD, EPR, FTIR, NMR, and X-ray crystallography. These data reveal the enormous structural diversity of proteins, which begins with the different amino acid sequences (primary structure) of polypeptide chains that fold into their final secondary and tertiary structures. It has been proposed that the information needed for folding to the native conformation is contained in the amino acid sequence1. Although experimental determination of protein three-dimensional structure has become more efficient, the gap between the number of known sequences and the number of known structures is rapidly increasing. Protein structure prediction aims at reducing this sequence-structure gap. Methods for deriving peptide/protein structures from their sequences can target either secondary structures or 3D structures.
There are four approaches to secondary structure prediction2:
1. Empirical statistical methods that use parameters derived from known 3D structures3.
2. Methods based on physicochemical properties of amino acid residues4 such as volume, exposure, hydrophobicity/hydrophilicity, charge, hydrogen bonding potential, and so on.
3. Methods based on prediction algorithms that use known structures of homologous proteins to assign secondary structures5-6.
4. Molecular mechanical methods that use force field parameters to model and assign secondary structures7.
In biologically active peptides, the dominant factor determining binding characteristics, biological activity, and physical properties is the secondary structure rather than the tertiary structure. The conformational space accessible to a peptide is defined by those regions of the 3N-dimensional configurational space (N being the number of atoms) that have a significant probability of being populated. The size and characteristics of this subspace are determined by the length and amino acid sequence of the peptide, as well as by the environment. For an adequate understanding of the macroscopic properties of a peptide, characterization of its accessible conformational space, rather than just of its most stable conformation, is required, since these properties can only be interpreted as weighted averages over the entire ensemble of accessible conformers. Such characterization is presently possible only by computer simulation and has already been used to complement experimental data for a number of peptides8-13.
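The weighted average over conformers can be written explicitly. As a sketch, assuming a canonical ensemble of discrete conformers i with energies E_i at temperature T (the symbols A, A_i, E_i, p_i, k_B and T are introduced here for illustration only and are not taken from the cited work):

```latex
\langle A \rangle = \sum_{i} p_i \, A_i ,
\qquad
p_i = \frac{\exp\left(-E_i / k_B T\right)}{\sum_{j} \exp\left(-E_j / k_B T\right)}
```

Here A_i is the value of the observable in conformer i and p_i is its population. Under these assumptions, any conformer whose p_i is not negligible contributes to the measured value, so the most stable conformation alone does not determine the observable.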
Peptides are short and therefore have fewer possibilities for self-stabilization than proteins, so several structures may have the same degree of stability. Experimental data indicate that the solvent may have a major influence on peptide structure. Some peptides are surface-active: they bind to amphiphilic surfaces such as phospholipid surfaces, membranes, and receptors, and contain regions comprising amphiphilic secondary structures complementary to those of the surfaces14. When peptides possess amphiphilic structures, these represent an active conformation induced either by an amphiphilic surface or by amphiphilic self-association to form a peptide micelle15. It is therefore quite likely that all membrane-directed peptides (hormones and regulators) contain at least one region that is susceptible to amphiphilic induction of secondary structure15.
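One common way to put a number on the amphiphilicity of a periodic structure is a hydrophobic moment. Neither this measure nor a hydrophobicity scale is named in the text, so the Python sketch below rests on two assumptions: the Kyte-Doolittle hydropathy scale, and an angle of 100 degrees between successive side chains, as in an ideal α-helix. The function name and the example sequences are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values (assumption: the text does not name a scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(seq, delta_deg=100.0):
    """Mean hydrophobic moment per residue.

    Each residue contributes a vector whose length is its hydropathy and
    whose direction advances by delta_deg per residue (100 degrees is
    assumed here for an ideal alpha-helix).
    """
    delta = math.radians(delta_deg)
    sin_sum = sum(KYTE_DOOLITTLE[aa] * math.sin(i * delta) for i, aa in enumerate(seq))
    cos_sum = sum(KYTE_DOOLITTLE[aa] * math.cos(i * delta) for i, aa in enumerate(seq))
    return math.hypot(sin_sum, cos_sum) / len(seq)


if __name__ == "__main__":
    # Hypothetical 18-residue sequences: one uniform, one with alternating
    # hydrophobic (L) and charged (K) patches.
    for name, seq in [("uniform", "L" * 18), ("patterned", "LKKLLKKLLKKLLKKLLK")]:
        print(name, round(hydrophobic_moment(seq), 3))
```

A sequence with the same residue at every position has essentially no moment over five full helical turns, whereas a sequence that places hydrophobic and charged residues on opposite faces gives a clearly non-zero value. The absolute numbers depend on the chosen scale and should not be over-interpreted.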
The regular secondary structure components of proteins, such as α-helices, β-sheets, and β-turns, are stabilized both by hydrogen bonding and by hydrophobic interactions of side chains16-17. Highly specific patterns of complementary intra- and intermolecular hydrogen bonds are created in such secondary structures. Other helices, such as the 3₁₀ helix and the π helix, are calculated to have energetically favourable hydrogen-bonding patterns but are rarely if ever observed in natural proteins except at the ends of α-helices, owing to unfavourable backbone packing in the center of the helix. Other extended structures, such as the polyproline helix and the alpha sheet, are rare in native-state proteins but are often hypothesized to be important protein folding intermediates. Tight turns and loose, flexible loops link the more "regular" secondary structure elements. The random coil is not a true secondary structure, but is the class of conformations that indicate an absence of regular secondary structure.
Amino acids vary in their ability to form the various secondary structure elements. Proline and glycine are sometimes known as "helix breakers" because they disrupt the regularity of the α-helical backbone conformation; however, both have unusual conformational abilities and are commonly found in turns. Amino acids that prefer to adopt helical conformations in proteins include methionine, alanine, leucine, glutamate and lysine ("MALEK" in amino-acid 1-letter codes); by contrast, the large aromatic residues (tryptophan, tyrosine and phenylalanine) and Cβ-branched amino acids (isoleucine, valine, and threonine) prefer to adopt β-strand conformations. However, these preferences are not strong enough to produce a reliable method of predicting secondary structure from sequence alone.
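The residue classes listed in this paragraph can be written down as a simple lookup. The Python sketch below is an illustration only: the function names, the one-letter labels (H, E, T, -) and the example peptide are hypothetical, and, as noted in the text, preferences of this kind are too weak to serve as a reliable predictor.

```python
# Residue classes exactly as listed in the text (1-letter codes).
HELIX_PREFERRING = set("MALEK")    # Met, Ala, Leu, Glu, Lys
STRAND_PREFERRING = set("WYFIVT")  # aromatic and C-beta-branched residues
HELIX_BREAKERS = set("PG")         # Pro and Gly, commonly found in turns


def preference(residue):
    """Return a one-letter label for the stated preference of a residue.

    H = helix-preferring, E = strand-preferring, T = helix breaker / turn,
    - = no preference stated in the text.
    """
    if residue in HELIX_PREFERRING:
        return "H"
    if residue in STRAND_PREFERRING:
        return "E"
    if residue in HELIX_BREAKERS:
        return "T"
    return "-"


def annotate(sequence):
    """Label every residue of a sequence; this is a lookup, not a prediction."""
    return "".join(preference(res) for res in sequence.upper())


if __name__ == "__main__":
    peptide = "MALEKGPVIY"  # hypothetical example peptide
    print(peptide)
    print(annotate(peptide))
```

The labels describe single residues in isolation. They ignore neighbouring residues and the environment, which is one reason such preferences alone do not yield a reliable prediction.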
IAPP
Human islet amyloid polypeptide (hIAPP, also called amylin) is a 37-residue pancreatic endocrine hormone that is stored, along with insulin, C-peptide, and high concentrations of calcium and zinc18, in the secretory granules of pancreatic β-cells at a concentration of ~1-4 mM. Insulin is present in molar excess over IAPP in the secretory granules, and it has been proposed that the binding of insulin to IAPP prevents the fibrillation of IAPP in the absence of diabetes18-20. Under normal conditions IAPP stimulates glycogen breakdown in skeletal muscle and liver, acting as an insulin antagonist. Additionally, IAPP is involved in the regulation of satiety with respect to food intake, and in maintenance processes of bone, renal proximal tubular cells, and islet β-cells21-26. In type 2 diabetes, extracellular hIAPP aggregates into fibrillar amyloid deposits27-28, which damage the membranes of insulin-producing β-cells and cause cell death29. Over 95% of diagnosed individuals stain positive postmortem for pancreatic amyloid deposits30-33 composed of mature, fibrillar IAPP. Both synthetic and endogenous IAPP have been shown to display significant cytotoxicity to cultured islet cells34-36. This cytotoxicity is believed to be related to the islet cell death that occurs in diabetes, leading to loss of blood glucose homeostasis and subsequent insulin dependence37. Mouse IAPP does not form amyloid, although it differs from the human peptide in only six of 37 residues38.
Like other amyloidogenic peptides and proteins, hIAPP misfolds via a nucleation-dependent aggregation pathway in which small oligomeric assemblies precede the formation of mature amyloid fibrils39. The formation of smaller oligomeric species, rather than of mature amyloid fibers, either on or off the cell membrane, has been identified as a critical step in amyloid-induced cell death40-48. Although the process has been studied intensively, little is known about the equilibrium between intermediate states and the formation of early oligomers of IAPP. The initial trigger for IAPP aggregation is particularly mysterious, as pathological IAPP aggregation is not associated with any common mutations49-50. Genetic evidence that amylin is directly involved in pathology includes a familial S20G mutation that leads to early onset of the disease51 and produces an amylin variant that aggregates more readily52. Cross-species comparison of IAPP sequences has been helpful in this regard, in particular comparison of human IAPP (hIAPP) with its nonamyloidogenic and noncytotoxic rat variant (rIAPP), which differs in only 6 of 37 residues. Significantly, although rats do not ordinarily suffer from type 2 diabetes, transgenic mice and rats expressing hIAPP form amyloid deposits and exhibit signs of diabetes, especially when expression occurs in a background of obesity53-57, which supports the relationship between amyloid formation and type 2 diabetes58.
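The human-rat comparison can be reproduced with a position-by-position scan of the two sequences. In the Python sketch below, the two 37-residue sequences are the commonly reported mature peptide sequences, not taken from this text, and should be checked against a curated database entry before being relied on. The function name is hypothetical.

```python
# Commonly reported mature IAPP sequences (assumption: verify against a
# curated database entry before use).
HUMAN_IAPP = "KCNTATCATQRLANFLVHSSNNFGAILSSTNVGSNTY"
RAT_IAPP = "KCNTATCATQRLANFLVRSSNNLGPVLPPTNVGSNTY"


def substitutions(reference, other):
    """List (position, reference_residue, other_residue) for every mismatch.

    Positions are 1-based, following the residue numbering used in the text.
    Only equal-length sequences are handled; no alignment is performed.
    """
    if len(reference) != len(other):
        raise ValueError("sequences must have the same length")
    return [
        (pos, ref, alt)
        for pos, (ref, alt) in enumerate(zip(reference, other), start=1)
        if ref != alt
    ]


if __name__ == "__main__":
    diffs = substitutions(HUMAN_IAPP, RAT_IAPP)
    print(len(diffs), "differing residues out of", len(HUMAN_IAPP))
    for pos, human_res, rat_res in diffs:
        print(f"{human_res}{pos}{rat_res}")
```

With these sequences the scan finds six differing positions, five of them between residues 20 and 29, and three of those are substitutions to proline in the rat peptide.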
Using coordinated X-ray fiber diffraction, electron diffraction, and cryo-EM, Sumner Makin and Serpell59 examined the ultrastructure of IAPP fibrils and concluded that they are made up of extended strands running perpendicular to the fibril axis, 4.7 Å apart, as found in the "cross-β" structure of other amyloid-like fibrils60. Langen and coworkers61 used electron paramagnetic resonance spectroscopy to show that these β-strands are in a parallel orientation. Recent studies have reported a structural motif common to short microcrystalline segments derived from a variety of amyloid-forming proteins. It consists of two β-sheets with interdigitated side chains and is called a steric zipper62-63.
The main amyloidogenic region of IAPP, comprising residues 20 to 29, has been proposed to be the key determinant of hIAPP fibril formation64-66. This proposal was based on the species-specific proline substitutions in rodent IAPP 20-29, which prevent fibril formation. This single β-strand fragment was shown to form intermolecular hydrogen bonds and β-sheets67. However, recent studies have identified a second potential amyloidogenic region within residues 30 to 37, which forms amyloid-like fibrils in aqueous media68. With more than one β-strand region present, the intermolecular and intramolecular interactions involved in fibril formation are likely to be more complex than those proposed67 for the single-strand region IAPP 20-29. Together, these two segments (residues 20-29 and 30-37)68 make up an amyloidogenic C-terminal domain of IAPP that aggregates through hydrophobic interactions69.
Experimental studies69-70 have also reported aggregation of fragments 8-20 and 8-37 into ordered fibrillar structures. Within these larger sequences, the fragments IAPP 15-19 (FLVHS) and IAPP 14-18 (NFLVH) and the possible importance of aromatic residues (and thus π-π interactions) for amyloid fibril formation have also been discussed71-72, while the N-terminal region comprising residues 1-19 is considered essential for the interaction with membranes73-75. Region 1-13, however, has been reported not to form fibrils, whereas IAPP 8-20 was found capable of self-assembly in vitro68.
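Because the discussion refers to many overlapping residue ranges, it can help to map them back onto the sequence. The Python sketch below slices the commonly reported hIAPP sequence (an assumption, not given in this text) using 1-based, inclusive residue numbers. The region labels are informal summaries of the text, and the function name is hypothetical.

```python
# Commonly reported mature hIAPP sequence (assumption: verify against a
# curated database entry before use).
HUMAN_IAPP = "KCNTATCATQRLANFLVHSSNNFGAILSSTNVGSNTY"

# Residue ranges mentioned in the text; labels are informal summaries.
REGIONS = {
    "membrane-interacting N-terminal region": (1, 19),
    "self-assembling fragment": (8, 20),
    "short fragment 14-18": (14, 18),
    "short fragment 15-19": (15, 19),
    "main amyloidogenic region": (20, 29),
    "second amyloidogenic region": (30, 37),
}


def fragment(sequence, start, end):
    """Return residues start..end, using 1-based inclusive numbering."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(f"range {start}-{end} is outside the sequence")
    return sequence[start - 1:end]


if __name__ == "__main__":
    for label, (start, end) in REGIONS.items():
        print(f"{start:>2}-{end:<2} {fragment(HUMAN_IAPP, start, end):<20} {label}")
```

Keeping the numbering 1-based and inclusive in one helper avoids off-by-one errors when translating residue ranges from the literature into string slices.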
Natively disordered when free in solution76-77, IAPP adopts an amphipathic α-helical conformation when lipid or an aqueous solution of hexafluoroisopropanol (HFIP) is added, conditions that also accelerate the rate of IAPP fibrillation29,78-81. An important driving force for folding arises from the lower energetic cost of partitioning H-bonded peptide bonds compared to free peptide bonds82-84. However, Miranker and coworkers have shown by NMR that mouse IAPP is capable of adopting a transient helical structure in solution85. Circular dichroism (CD), infrared reflection absorption spectroscopy (IRRAS), two-dimensional infrared spectroscopy (2D IR), and other NMR studies show an increase in the helical content of IAPP prior to its conversion to the β-sheet-rich fibrillar form70,73,76,86-87. Micelle-bound hIAPP forms an α-helix88 spanning approximately residues 5-28, whereas Wiltzius et al.89, by chaperoning IAPP through fusion to maltose-binding protein, found that hIAPP can adopt an α-helical structure at residues 8-18 and 22-27 and that IAPP molecules dimerize on the pathway to fibrillation.
Evidence indicates that IAPP-lipid interactions may play an important role in the pathogenesis of type 2 diabetes by catalyzing misfolding, accelerating the formation of amyloid fibrils and toxic oligomers, and disrupting membrane integrity and permeability29,90-92. Recent findings indicate that the fibrillogenic propensity of membrane-bound IAPP is largely determined by the chemical nature of the membrane lipids. Polar and electrostatic interactions can be stabilized by the phospholipid head groups, whereas hydrophobic interactions can occur in the lipid chain region. For instance, it has been demonstrated that IAPP aggregation is enhanced in the presence of membranes containing anionic lipids such as phosphatidylglycerol (PG) or phosphatidylserine (PS), and a mechanism of interaction has been proposed29,73-74,77,80,91,93-95. It has been suggested that IAPP inserts into lipid mono- and bilayers as a monomer via its positively charged N-terminus in vitro, possibly representing an essential first step in IAPP-induced membrane damage in type 2 diabetes in vivo74. At a surface charge corresponding to 70 mol% PG, the rate of fibrillogenesis is maximal, and the rate of fiber formation is limited by the self-assembly of the peptide at the lipid-water interface.
The amino acid sequence of IAPP is highly conserved between species, with a few variations. The 20-29 region shows the greatest species diversity96. Only a few animal species are known to develop islet amyloid. Besides humans, these include non-human primates97, cats98, raccoons98, and the degu (Octodon degus)99.