Peptide#
- class peptides.Peptide(object)#
A sequence of amino acids.
- classmethod sample(length: int, frequencies: str = 'SwissProt2021') Peptide#
Generate a peptide with the given amino-acid frequencies.
This method is useful for testing, but generating a peptide from amino-acid frequencies alone is not biologically accurate. Consider instead sampling based on dipeptide frequencies in a particular organism, or using k-mer shuffling.
- Parameters:
length (int) – The desired length for the generated peptide.
frequencies (str) – The name of the amino-acid frequency table to use: either KingJukes to use the amino-acid frequencies for vertebrate organisms reported in King & Jukes (1969), or SwissProt2021 to use the amino-acid frequencies in all the proteins from the January 2021 release of SwissProt.
- Returns:
Peptide – A new peptide. The first amino-acid will always be a Methionine for biological accuracy.
References
King, J. L., and T. H. Jukes. Non-Darwinian Evolution. Science. May 1969;164(3881):788–98. doi:10.1126/science.164.3881.788. PMID:5767777.
The UniProt Consortium. UniProt: The Universal Protein Knowledgebase in 2021. Nucleic Acids Research. Jan 2021;49(D1):D480–89. doi:10.1093/nar/gkaa1100. PMID:33237286.
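As a rough illustration of frequency-based sampling, the sketch below draws residues with `random.choices`. The uniform `UNIFORM` table and the `sample_sequence` helper are assumptions made for this example; they are not the library's KingJukes or SwissProt2021 data, nor its implementation.

```python
import random

# Placeholder frequency table (uniform over the 20 standard residues);
# a real table such as SwissProt2021 would hold observed frequencies.
UNIFORM = {aa: 1 / 20 for aa in "ACDEFGHIKLMNPQRSTVWY"}

def sample_sequence(length, frequencies=UNIFORM, seed=None):
    """Draw a sequence of `length` residues, starting with Methionine."""
    if length <= 0:
        return ""
    rng = random.Random(seed)
    residues = list(frequencies)
    weights = [frequencies[aa] for aa in residues]
    # The first residue is fixed to M; the rest are drawn independently.
    tail = rng.choices(residues, weights=weights, k=length - 1)
    return "M" + "".join(tail)
```

Because each position is drawn independently, this kind of sampling ignores neighbour effects, which is the limitation noted above.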
- __init__(sequence: str) None#
Create a new peptide object with the given sequence.
- Parameters:
sequence (str) – A sequence of amino acids encoded with the IUPAC one-letter code. Non-standard (O, U), ambiguous (B, Z, J) and unknown (X) residues are supported in some methods, but not all of them.
- aliphatic_index() float#
Compute the aliphatic index of the peptide.
The aliphatic index of a protein was proposed in Ikai (1980). It is defined as the relative volume occupied by aliphatic side chains (Alanine, Valine, Isoleucine, and Leucine):
\[\text{aliphatic index} = A + 2.9 V + 3.9 (I + L + J)\]
It may be regarded as a positive factor for the increase of thermostability of globular proteins.
- Returns:
float – The computed aliphatic index for the peptide sequence, between 0.0 and 390.0.
Example
>>> peptide = Peptide("SDKEVDEVDAALSDLEITLE")
>>> peptide.aliphatic_index()
117.0
References
Ikai, A. Thermostability and Aliphatic Index of Globular Proteins. Journal of Biochemistry. Dec 1980;88(6):1895–98. PMID:7462208.
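The formula above can be sketched in plain Python, assuming that A, V, I, L and J stand for the mole percentages of the corresponding residues (which is consistent with the documented 0.0 to 390.0 range). The standalone `aliphatic_index` function is illustrative only and is not the library's implementation.

```python
def aliphatic_index(sequence):
    """Aliphatic index from mole percentages of A, V, I, L (and J)."""
    n = len(sequence)
    if n == 0:
        return 0.0

    def pct(aa):
        # Mole percentage of a residue: 100 * count / length.
        return 100 * sequence.count(aa) / n

    return pct("A") + 2.9 * pct("V") + 3.9 * (pct("I") + pct("L") + pct("J"))
```

For the example sequence `SDKEVDEVDAALSDLEITLE` (2 A, 2 V, 1 I and 3 L out of 20 residues), this gives 10 + 2.9 × 10 + 3.9 × 20 = 117, matching the documented value up to floating-point rounding.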
- atchley_factors() AtchleyFactors#
Compute the Atchley factors of the peptide.
See AtchleyFactors for more information.
- Returns:
peptides.AtchleyFactors – The computed average Atchley factors for all the amino acids in the peptide.
Example
>>> peptide = Peptide("KLKLLLLLKLK")
>>> for i, kf in enumerate(peptide.atchley_factors()):
...     print(f"AF{i+1:<3} {kf: .4f}")
AF1    0.0176
AF2   -0.8321
AF3   -0.7636
AF4    0.7048
AF5    0.0189
- auto_correlation(table: Dict[str, float], lag: int = 1, center: bool = True) float#
Compute the auto-correlation index of a peptide sequence.
Example
>>> peptide = Peptide("SDKEVDEVDAALSDLEITLE")
>>> table = peptides.tables.HYDROPHOBICITY["KyteDoolittle"]
>>> peptide.auto_correlation(table=table)
-0.3519908...
>>> peptide.auto_correlation(table=table, lag=5)
0.00113355...
- auto_covariance(table: Dict[str, float], lag: int = 1, center: bool = True) float#
Compute the auto-covariance index of a peptide sequence.
Example
>>> peptide = Peptide("SDKEVDEVDAALSDLEITLE")
>>> table = peptides.tables.HYDROPHOBICITY["KyteDoolittle"]
>>> peptide.auto_covariance(table)
-0.414005...
>>> peptide.auto_covariance(table, lag=5)
0.0010003...
- blosum_indices() BLOSUMIndices#
Compute the BLOSUM62-derived indices of the peptide.
See BLOSUMIndices for more information.
- Returns:
peptides.BLOSUMIndices – The computed average BLOSUM indices for all the amino acids in the peptide.
Example
>>> peptide = Peptide("KLKLLLLLKLK")
>>> for i, b in enumerate(peptide.blosum_indices()):
...     print(f"BLOSUM{i+1:<3} {b: .4f}")
BLOSUM1   -0.4827
BLOSUM2   -0.5618
BLOSUM3   -0.8509
BLOSUM4   -0.4173
BLOSUM5    0.3173
BLOSUM6    0.2527
BLOSUM7    0.1464
BLOSUM8    0.1427
BLOSUM9   -0.2145
BLOSUM10  -0.3218
- boman() float#
Compute the Boman (potential peptide interaction) index.
The potential interaction index proposed by Boman (2003) is an index computed by averaging the solubility values for all residues in a sequence. It can be used to give an overall estimate of the potential of a peptide to bind to membranes or other proteins.
- Returns:
float – The Boman index for the peptide. A value greater than 2.48 indicates that a protein has high binding potential.
Example
>>> peptide = Peptide("FLPVLAGLTPSIVPKLVCLLTKKC")
>>> peptide.boman()
-1.2358...
Note
The potential protein interaction index was originally proposed as an easy way to differentiate between the action mechanism of hormones (protein/protein) and antimicrobial peptides (protein/membrane).
References
Boman, H. G. Antibacterial Peptides: Basic Facts and Emerging Concepts. Journal of Internal Medicine. 2003 Sep;254(3):197–215. doi:10.1046/j.1365-2796.2003.01228.x. PMID:12930229.
- charge(pH: float = 7, pKscale: str = 'Lehninger') float#
Compute the theoretical net charge of a peptide sequence.
This function computes the theoretical net charge of a peptide sequence, based on the Henderson-Hasselbalch equation described by Dexter S. Moore (1985). The net charge can be computed at a given pH using one of the 9 pKa scales available.
- Parameters:
- Returns:
float – The net charge of the peptide.
Example
>>> peptide = Peptide("FLPVLAGLTPSIVPKLVCLLTKKC")
>>> peptide.charge(pKscale="Bjellqvist")
2.7373...
>>> peptide.charge(pKscale="EMBOSS")
2.9141...
>>> peptide.charge(pKscale="Murray")
2.9075...
>>> peptide.charge(pKscale="Sillero")
2.9198...
>>> peptide.charge(pKscale="Solomon")
2.8444...
>>> peptide.charge(pKscale="Stryer")
2.8765...
>>> peptide.charge(pKscale="Lehninger")
2.8731...
>>> peptide.charge(pKscale="Dawson")
2.8444...
>>> peptide.charge(pKscale="Rodwell")
2.8197...
References
Bjellqvist, B., G. J. Hughes, C. Pasquali, N. Paquet, F. Ravier, J. C. Sanchez, S. Frutiger, and D. Hochstrasser. The Focusing Positions of Polypeptides in Immobilized pH Gradients Can Be Predicted from Their Amino Acid Sequences. Electrophoresis. 1993 Oct;14(10):1023–31. doi:10.1002/elps.11501401163. PMID:8125050.
Dawson, R. M. C. and D. C. Elliott. Data for Biochemical Research. Oxford: Clarendon Press. 2002;3:592. ISBN:978-0-19-855299-4.
Kiraga, J. Analysis and computer simulations of variability of isoelectric point of proteins in the proteomes. PhD thesis, University of Wroclaw, Poland. 2008.
Lehninger, A. L., D. L. Nelson, and M. M. Cox. Lehninger Principles of Biochemistry. 4th ed. New York: W.H. Freeman. 2005;4:1100. ISBN:978-0-7167-4339-2.
Murray, R. K. Harper’s Illustrated Biochemistry. New York: Lange Medical Books/McGraw-Hill. 2006;27. ISBN:978-0-07-146197-9.
Rodwell, J.D. Heterogeneity of Component Bands in Isoelectric Focusing Patterns. Analytical Biochemistry. 1982 Jan;119(2):440-49. doi:10.1016/0003-2697(82)90611-x. PMID:7072964.
Sillero, A., and A. Maldonado. Isoelectric Point Determination of Proteins and Other Macromolecules: Oscillating Method. Computers in Biology and Medicine. 2006 Feb;36(2): 157–66. doi:10.1016/j.compbiomed.2004.09.006. PMID:16389075.
Solomons, T. W. G. Fundamentals of Organic Chemistry. New York: Wiley. 1997;5. ISBN:978-0-471-28298-3.
Stryer, L., J. Augustyniak, and J. Michejda. Biochemia. Warszawa: Wydawnictwo Naukowe PWN. 2000. ISBN:978-83-01-12044-3.
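To make the Henderson-Hasselbalch step concrete, here is a minimal sketch of a net charge calculation. The `net_charge` helper, the `nTer`/`cTer` key names, the choice of ionizable residues and the pKa values in `EXAMPLE_PK` are all assumptions for illustration: they do not correspond to any of the library's nine scales and will not reproduce the values shown above.

```python
def net_charge(sequence, pH, pk):
    """Net charge from Henderson-Hasselbalch, given a pKa table `pk`."""
    charge = 0.0
    # Basic groups carry +1 when protonated: fraction = 1 / (1 + 10^(pH - pKa)).
    for group in ["nTer"] + [aa for aa in sequence if aa in "KRH"]:
        charge += 1.0 / (1.0 + 10 ** (pH - pk[group]))
    # Acidic groups carry -1 when deprotonated: fraction = 1 / (1 + 10^(pKa - pH)).
    for group in ["cTer"] + [aa for aa in sequence if aa in "DECY"]:
        charge -= 1.0 / (1.0 + 10 ** (pk[group] - pH))
    return charge

# Illustrative pKa values only (hypothetical, not one of the library's scales).
EXAMPLE_PK = {"nTer": 9.0, "cTer": 2.0, "K": 10.5, "R": 12.5, "H": 6.0,
              "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
```

Each ionizable group contributes a fractional charge that depends on the distance between the pH and its pKa, which is why different pKa scales give slightly different results for the same peptide.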
- counts() Dict[str, int]#
Return a table of amino-acid counts in the peptide.
- Returns:
dict – A dictionary mapping each amino-acid code to the number of times it occurs in the peptide sequence.
Example
>>> p = Peptide("SDKEVDEVDAALS")
>>> {k:v for k,v in p.counts().items() if v != 0}
{'A': 2, 'D': 3, 'E': 2, 'L': 1, 'K': 1, 'S': 2, 'V': 2}
- cross_covariance(table1: Dict[str, float], table2: Dict[str, float], lag: int = 1, center: bool = True) float#
Compute the cross-covariance index of a peptide sequence.
Example
>>> peptide = Peptide("SDKEVDEVDAALSDLEITLE")
>>> table1 = peptides.tables.HYDROPHOBICITY["KyteDoolittle"]
>>> table2 = peptides.tables.HYDROPHOBICITY["Eisenberg"]
>>> peptide.cross_covariance(table1, table2)
-0.3026609...
>>> peptide.cross_covariance(table1, table2, lag=5)
0.0259803...
- cruciani_properties() CrucianiProperties#
Compute the Cruciani properties of the peptide.
See CrucianiProperties for more information.
- Returns:
peptides.CrucianiProperties – The computed average Cruciani properties of all the amino acids in the corresponding peptide sequence.
Example
>>> peptide = Peptide("QWGRRCCGWGPGRRYCVRWC")
>>> for i, b in enumerate(peptide.cruciani_properties()):
...     print(f"PP{i+1:<3} {b: .4f}")
PP1   -0.1130
PP2   -0.0220
PP3    0.2735
- descriptors() Dict[str, float]#
Create a dictionary containing every protein descriptor available.
Example
>>> peptide = Peptide("SDKEVDEVDAALSDLEITLE")
>>> sorted(peptide.descriptors().keys())
['AF1', ..., 'F1', ..., 'KF1', ..., 'MSWHIM1', ..., 'PP1', ...]
Hint
Use this method to create a DataFrame containing the descriptors for several sequences.
- detect_outlier() OutlierResult#
Detect if this sequence is an outlier based on SwissProt distributions.
This method analyzes the sequence using the vetting metrics (entropy, max_frequency, longest_run) and compares them against established distributions from SwissProt proteins to identify potential outliers, artifacts, or unusual sequences. It provides an automated way to flag sequences that may require further investigation.
See OutlierResult for more information.
- Returns:
OutlierResult – The outlier detection results, as a named tuple.
Example
For a real peptide, the large ribosomal subunit protein bL32 of Escherichia coli (UniProt:P0A7N4):
>>> peptide = Peptide(
...     "MAVQQNKPTRSKRGMRRSHDALTAVTSLSVDKT"
...     "SGEKHLRHHITADGYYRGRKVIAK"
... )
>>> result = peptide.detect_outlier()
>>> result.is_outlier
False
For a problematic sequence:
>>> peptide = Peptide("AAAA")
>>> result = peptide.detect_outlier()
>>> result.is_outlier
True
>>> result.issues[0]
'Entropy (0.000) below 5th percentile (3.714)'
Note
Thresholds are based on SwissProt analysis and are hardcoded for efficiency. No external files are required.
Added in version 0.5.0.
- energy_cost(scale: str = 'Akashi', *, mode: str | None = None)#
Estimate the energy cost required to biosynthesize a peptide.
- Parameters:
scale (str) – The name of the energy estimation scale to use. Supports the following values:
- Akashi
The energetic cost computed by Akashi & Gojobori (2002) based on major codon usage values in Escherichia coli and Bacillus subtilis.
- Craig
The energetic cost computed by Craig & Weber () from amino-acid substitution probabilities in Escherichia coli.
- Heizer
The energetic cost computed by Heizer et al. (2006), derived from Akashi & Gojobori (2002) for photoautotrophs (capable of the Calvin cycle reactions).
- Wagner
The energetic cost computed by Wagner (2005) based on expression data in Saccharomyces cerevisiae.
- Keyword Arguments:
mode (str) – For the Wagner scale, the mode of growth of the source organism, either respiration (the default) or fermentation.
Example
>>> peptide = Peptide("SDKEVDEVDAALSDLEITLEYLKW")
>>> peptide.energy_cost()
550.5...
>>> peptide.energy_cost("Heizer")
554.5...
References
Akashi, H., & Gojobori, T. (2002). Metabolic efficiency and amino acid composition in the proteomes of Escherichia coli and Bacillus subtilis. Proceedings of the National Academy of Sciences of the United States of America, 99(6), 3695–3700. PMID:11904428. doi:10.1073/pnas.062526999.
Heizer, E. M., Jr, Raiford, D. W., Raymer, M. L., Doom, T. E., Miller, R. V., & Krane, D. E. (2006). Amino acid cost and codon-usage biases in 6 prokaryotic genomes: a whole-genome analysis. Molecular biology and evolution, 23(9), 1670–1680. PMID:16754641. doi:10.1093/molbev/msl029.
Wagner A. (2005). Energy constraints on the evolution of gene expression. Molecular biology and evolution, 22(6), 1365–1374. PMID:15758206. doi:10.1093/molbev/msi126.
- entropy() float#
Compute the Shannon entropy of the amino acid sequence.
The Shannon entropy measures the diversity of amino acids in the peptide sequence. It is calculated using the formula:
\[H = -\sum_{i=1}^{n} p_i \log_2(p_i)\]
where \(p_i\) is the frequency of amino acid \(i\) and \(n\) is the number of possible amino acids (26, including ambiguous codes).
The entropy is maximized when all amino acids are equally frequent, and minimized when the sequence consists of only one amino acid type.
- Returns:
float – The Shannon entropy of the peptide sequence in bits. The maximum possible value is log₂(26) ≈ 4.70 bits.
Example
>>> peptide = Peptide("AALS")
>>> peptide.entropy()
1.5
>>> peptide = Peptide("AAAA")
>>> peptide.entropy()
0.0
>>> peptide = Peptide("ACDEFGHIKLMNPQRSTVWY")
>>> peptide.entropy()
4.3219...
References
Shannon, C. E. A Mathematical Theory of Communication. Bell System Technical Journal. 1948;27(3):379-423. doi:10.1002/j.1538-7305.1948.tb01338.x.
Added in version 0.5.0.
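The entropy formula can be reproduced with the standard library alone. The `shannon_entropy` function below is an illustrative sketch rather than the library's code; it sums only over residues that actually occur, since absent residues contribute nothing.

```python
from collections import Counter
from math import log2

def shannon_entropy(sequence):
    """Shannon entropy (in bits) of the residue composition."""
    n = len(sequence)
    # Residues that never occur contribute nothing to the sum.
    return sum(-(c / n) * log2(c / n) for c in Counter(sequence).values())
```

For `AALS`, the frequencies are 0.5, 0.25 and 0.25, so the entropy is 0.5 + 0.5 + 0.5 = 1.5 bits, in line with the example above.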
- fasgai_vectors() FasgaiVectors#
Compute the FASGAI vectors of the peptide.
See FasgaiVectors for more information.
- Returns:
peptides.FasgaiVectors – The computed average FASGAI vectors for all the amino acids in the peptide.
Example
>>> peptide = Peptide("QWGRRCCGWGPGRRYCVRWC")
>>> for i, b in enumerate(peptide.fasgai_vectors()):
...     print(f"F{i+1:<3} {b: .5f}")
F1   -0.13675
F2   -0.45485
F3   -0.11695
F4   -0.45800
F5   -0.38015
F6    0.52740
- frequencies() Dict[str, float]#
Return a table of amino-acid frequencies in the peptide.
- Returns:
dict – A dictionary mapping each amino-acid code to its frequency in the peptide sequence.
Example
>>> p = Peptide("AALS")
>>> {k:v for k,v in p.frequencies().items() if v != 0}
{'A': 0.5, 'L': 0.25, 'S': 0.25}
- hydrophobic_moment(window: int = 11, angle: int = 100) float#
Compute the maximal hydrophobic moment of a protein sequence.
This function computes the hydrophobic moment based on Eisenberg et al (1984). Hydrophobic moment is a quantitative measure of the amphiphilicity perpendicular to the axis of any periodic peptide structure, such as the α-helix or β-sheet.
- Parameters:
- Returns:
float – The maximal hydrophobic moment of the peptide.
Example
>>> peptide = Peptide("FLPVLAGLTPSIVPKLVCLLTKKC")
>>> peptide.hydrophobic_moment(angle=100)
0.519922...
>>> peptide.hydrophobic_moment(angle=160)
0.270590...
See also
The hydrophobic_moment_profile method, which builds a profile for each amino acid position instead of simply extracting the global maximum.
References
Eisenberg, D., R. M. Weiss, and T. C. Terwilliger. The Hydrophobic Moment Detects Periodicity in Protein Hydrophobicity. Proceedings of the National Academy of Sciences of the United States of America. 1984 Jan;81(1):140–44. doi:10.1073/pnas.81.1.140. PMID:6582470.
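A hydrophobic moment in the sense of Eisenberg et al. can be sketched as the length of the vector sum of per-residue hydrophobicities, each rotated by a fixed angle per position. In the sketch below, the function names, the normalisation by window length and the use of a plain list of hydrophobicity values are assumptions; the library's scale and normalisation may differ, so the numbers will not necessarily match the example above.

```python
from math import cos, sin, radians, hypot

def window_moment(values, angle=100):
    """Hydrophobic moment of one window of per-residue hydrophobicity values."""
    sum_cos = sum(h * cos(radians(angle * i)) for i, h in enumerate(values))
    sum_sin = sum(h * sin(radians(angle * i)) for i, h in enumerate(values))
    # Length of the resultant vector, normalised by the window length.
    return hypot(sum_cos, sum_sin) / len(values)

def max_moment(values, window=11, angle=100):
    """Largest window moment over all sliding windows."""
    window = min(window, len(values))
    return max(
        window_moment(values[i:i + window], angle)
        for i in range(len(values) - window + 1)
    )
```

An angle of 100 degrees corresponds to the residue-to-residue rotation of an α-helix, while an angle around 160 to 180 degrees is typical of a β-strand, which is why the two angles in the example give different values.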
- hydrophobic_moment_profile(window: int = 11, angle: int = 100) Sequence[float]#
Build a hydrophobic moment profile of a sliding window.
This function builds a profile by computing the hydrophobic moment of each section of the peptide, based on the primary sequence.
- Parameters:
Example
>>> peptide = Peptide("ARQQNLFINFCLILIFLLLI")
>>> uH = peptide.hydrophobic_moment_profile(window=12, angle=100)
>>> [round(x, 3) for x in uH]
[0.353, 0.317, 0.274, 0.274, 0.253, 0.113, 0.113, 0.108, 0.132]
See also
The hydrophobic_moment method, which computes the maximal hydrophobic moment instead of building a profile.
- hydrophobicity(scale: str = 'KyteDoolittle') float#
Compute the hydrophobicity index of a protein sequence.
This function calculates the hydrophobicity index of an amino acid sequence by averaging the hydrophobicity values of each residue using one of the 39 scales from different sources.
- Parameters:
scale (str) – The name of the hydrophobicity scale to be used. A list of all the allowed values can be retrieved from the keys of the peptides.tables.HYDROPHOBICITY dictionary.
- Returns:
float – The hydrophobicity index of the peptide.
Example
>>> peptide = Peptide("QWGRRCCGWGPGRRYCVRWC")
>>> peptide.hydrophobicity(scale="Aboderin")
3.84...
>>> peptide.hydrophobicity(scale="AbrahamLeo")
0.092...
Note
The hydrophobicity is an important stabilization force in protein folding; this force changes depending on the solvent in which the protein is found.
References
Aboderin, A. A. An Empirical Hydrophobicity Scale for α-Amino-Acids and Some of Its Applications. International Journal of Biochemistry. 1971 Oct;2(11):537–44. doi:10.1016/0020-711X(71)90023-1.
Abraham, D.J., and A. J. Leo. Extension of the Fragment Method to Calculate Amino Acid Zwitterion and Side Chain Partition Coefficients. Proteins: Structure, Function, and Genetics. 1987;2(2):130–52. doi:10.1002/prot.340020207.
Argos, P., J. K. Rao, and P. A. Hargrave. Structural Prediction of Membrane-Bound Proteins. European Journal of Biochemistry. Nov 1982;128(2–3):565–75. doi:10.1111/j.1432-1033.1982.tb07002.x. PMID:7151796.
Barley, M. H., N. J. Turner, and R. Goodacre. Improved Descriptors for the Quantitative Structure–Activity Relationship Modeling of Peptides and Proteins. Journal of Chemical Information and Modeling. Feb 2018;58(2):234–43. doi:10.1021/acs.jcim.7b00488. PMID:29338232.
Black, S. D., and D. R. Mould. Development of Hydrophobicity Parameters to Analyze Proteins Which Bear Post- or Cotranslational Modifications. Analytical Biochemistry. Feb 1991;193(1):72–82. doi:10.1016/0003-2697(91)90045-u. PMID:2042744.
Bull, H. B., and K. Breese. Surface Tension of Amino Acid Solutions: A Hydrophobicity Scale of the Amino Acid Residues. Archives of Biochemistry and Biophysics. Apr 1974;161(2):665–70. doi:10.1016/0003-9861(74)90352-x. PMID:4839053.
Casari, G., and M. J. Sippl. Structure-Derived Hydrophobic Potential. Hydrophobic Potential Derived from X-Ray Structures of Globular Proteins Is Able to Identify Native Folds. Journal of Molecular Biology. Apr 1992;224(3):725–32. doi:10.1016/0022-2836(92)90556-y. PMID:1569551.
Chothia, C. The Nature of the Accessible and Buried Surfaces in Proteins. Journal of Molecular Biology. Jul 1976;105(1):1–12. doi:10.1016/0022-2836(76)90191-1. PMID:994183.
Cid, H., M. Bunster, M. Canales, and F. Gazitúa. Hydrophobicity and Structural Classes in Proteins. Protein Engineering. Jul 1992;5(5):373–75. doi:10.1093/protein/5.5.373. PMID:1518784.
Cowan, R., and R. G. Whittaker. Hydrophobicity Indices for Amino Acid Residues as Determined by High-Performance Liquid Chromatography. Peptide Research. Apr 1990;3(2):75–80. PMID:2134053.
Eisenberg, D., E. Schwarz, M. Komaromy, and R. Wall. Analysis of Membrane and Surface Protein Sequences with the Hydrophobic Moment Plot. Journal of Molecular Biology. Oct 1984;179(1):125–42. doi:10.1016/0022-2836(84)90309-7. PMID:6502707.
Engelman, D. M., T. A. Steitz, and A. Goldman. Identifying Nonpolar Transbilayer Helices in Amino Acid Sequences of Membrane Proteins. Annual Review of Biophysics and Biophysical Chemistry. 1986;15:321–53. doi:10.1146/annurev.bb.15.060186.001541. PMID:3521657.
Fasman, G. D. Prediction of Protein Structure and the Principles of Protein Conformation. Springer US. 1989. doi:10.1007/978-1-4613-1571-1. ISBN:978-0-306-43131-9.
Fauchère, J-L., and Pliska V. Hydrophobic Parameters π of Amino-Acid Side Chains from the Partitioning of N-Acetyl-Amino-Acid Amides. European Journal of Medicinal Chemistry. 1983;18(4):369–75.
Goldsack, D. E., and R. C. Chalifoux. Contribution of the Free Energy of Mixing of Hydrophobic Side Chains to the Stability of the Tertiary Structure of Proteins. Journal of Theoretical Biology. Jun 1973;39(3):645–51. doi:10.1016/0022-5193(73)90075-1. PMID:4354159.
Guy, H. R. Amino Acid Side-Chain Partition Energies and Distribution of Residues in Soluble Proteins. Biophysical Journal. Jan 1985;47(1):61–70. doi:10.1016/S0006-3495(85)83877-7. PMID:3978191.
Hopp, T. P., and K. R. Woods. Prediction of Protein Antigenic Determinants from Amino Acid Sequences. Proceedings of the National Academy of Sciences of the United States of America. Jun 1981;78(6):3824–28. doi:10.1073/pnas.78.6.3824. PMID:6167991.
Janin, J. Surface and inside Volumes in Globular Proteins. Nature. Feb 1979;277(5696):491–92. doi:10.1038/277491a0. PMID:763335.
Jones, D. D. Amino Acid Properties and Side-Chain Orientation in Proteins: A Cross Correlation Approach. Journal of Theoretical Biology. Mar 1975;50(1):167–83. doi:10.1016/0022-5193(75)90031-4. PMID:1127956.
Juretić, D., D. Zucić, B. Lucić, and N. Trinajstić. Preference Functions for Prediction of Membrane-Buried Helices in Integral Membrane Proteins. Computers & Chemistry. Jun 1998;22(4):279–94. doi:10.1016/s0097-8485(97)00070-3. PMID:9680689.
Kawashima, S., H. Ogata, and M. Kanehisa. AAindex: Amino Acid Index Database. Nucleic Acids Research. Jan 1999;27(1):368–69. doi:10.1093/nar/27.1.368. PMID:9847231.
Kawashima, S., and M. Kanehisa. AAindex: Amino Acid Index Database. Nucleic Acids Research. Jan 2000;28(1):374. doi:10.1093/nar/28.1.374. PMID:10592278.
Kawashima, S., P. Pokarowski, M. Pokarowska, A. Kolinski, T. Katayama, and M. Kanehisa. AAindex: Amino Acid Index Database, Progress Report 2008. Nucleic Acids Research. Jan 2008;36:D202-205. doi:10.1093/nar/gkm998. PMID:17998252.
Kidera, A., Y. Konishi, M. Oka, T. Ooi, and H. A. Scheraga. Statistical Analysis of the Physical Properties of the 20 Naturally Occurring Amino Acids. Journal of Protein Chemistry. Feb 1985;4(1):23-55. doi:10.1007/BF01025492.
Kuhn, L. A., C. A. Swanson, M. E. Pique, J. A. Tainer, and E. D. Getzoff. Atomic and Residue Hydrophilicity in the Context of Folded Protein Structures. Proteins. Dec 1995;23(4):536–47. doi:10.1002/prot.340230408. PMID:8749849.
Kyte, J., and R. F. Doolittle. A Simple Method for Displaying the Hydropathic Character of a Protein. Journal of Molecular Biology. May 1982;157(1):105–32. doi:10.1016/0022-2836(82)90515-0. PMID:7108955.
Levitt, M. A Simplified Representation of Protein Conformations for Rapid Simulation of Protein Folding. Journal of Molecular Biology. Jun 1976;104(1):59–107. doi:10.1016/0022-2836(76)90004-8. PMID:957439.
Manavalan, P., and P. K. Ponnuswamy. Hydrophobic Character of Amino Acid Residues in Globular Proteins. Nature. Oct 1978;275(5681):673–74. doi:10.1038/275673a0. PMID:703834.
Miyazawa, S., and R. L. Jernigan. Estimation of Effective Interresidue Contact Energies from Protein Crystal Structures: Quasi-Chemical Approximation. Macromolecules. Mar 1985;18(3):534–52. doi:10.1021/ma00145a039.
Nakai, K., A. Kidera, and M. Kanehisa. Cluster Analysis of Amino Acid Indices for Prediction of Protein Structure and Function. Protein Engineering. Jul 1988;2(2):93–100. doi:10.1093/protein/2.2.93. PMID:3244698.
Nozaki, Y., and C. Tanford. The Solubility of Amino Acids and Two Glycine Peptides in Aqueous Ethanol and Dioxane Solutions. Establishment of a Hydrophobicity Scale. The Journal of Biological Chemistry. Apr 1971;246(7):2211–17. PMID:5555568.
Parker, J. M., D. Guo, and R. S. Hodges. New Hydrophilicity Scale Derived from High-Performance Liquid Chromatography Peptide Retention Data: Correlation of Predicted Surface Residues with Antigenicity and X-Ray-Derived Accessible Sites. Biochemistry. 1986;25(19):5425–32. doi:10.1021/bi00367a013. PMID:2430611.
Ponnuswamy, P. K. Hydrophobic Characteristics of Folded Proteins. Progress in Biophysics and Molecular Biology. 1993;59(1):57–103. doi:10.1016/0079-6107(93)90007-7. PMID:8419986.
Prabhakaran, M. The Distribution of Physical, Chemical and Conformational Properties in Signal and Nascent Peptides. The Biochemical Journal. Aug 1990;269(3):691–96. doi:10.1042/bj2690691. PMID:2390062.
Rao, J. K. M., and P. Argos. A Conformational Preference Parameter to Predict Helices in Integral Membrane Proteins. Biochimica Et Biophysica Acta. Jan 1986;869(2):197–214. doi:10.1016/0167-4838(86)90295-5. PMID:2935194.
Rose, G. D., A. R. Geselowitz, G. J. Lesser, R. H. Lee, and M. H. Zehfus. Hydrophobicity of Amino Acid Residues in Globular Proteins. Science (New York, N.Y.). Aug 1985;229(4716):834–38. doi:10.1126/science.4023714. PMID:4023714.
Roseman, M. A. Hydrophilicity of Polar Amino Acid Side-Chains Is Markedly Reduced by Flanking Peptide Bonds. Journal of Molecular Biology. Apr 1988;200(3):513–22. doi:10.1016/0022-2836(88)90540-2. PMID:3398047.
Sweet, R. M., and D. Eisenberg. Correlation of Sequence Hydrophobicities Measures Similarity in Three-Dimensional Protein Structure. Journal of Molecular Biology. Dec 1983;171(4):479-88. doi:10.1016/0022-2836(83)90041-4. PMID:6663622.
Tomii, K., and M. Kanehisa. Analysis of Amino Acid Indices and Mutation Matrices for Sequence Comparison and Structure Prediction of Proteins. Protein Engineering. Jan 1996;9(1):27–36. doi:10.1093/protein/9.1.27. PMID:9053899.
Welling, G. W., W. J. Weijer, R. van der Zee, and S. Welling-Wester. Prediction of Sequential Antigenic Regions in Proteins. FEBS Letters. Feb 1985;188(2):215-8. doi:10.1016/0014-5793(85)80374-4. PMID:2411595.
White, S. H., and W. C. Wimley. Membrane Protein Folding and Stability: Physical Principles. Annual Review of Biophysics and Biomolecular Structure. 1999;28:319–65. doi:10.1146/annurev.biophys.28.1.319. PMID:10410805.
White, S. H., and W. C. Wimley. Hydrophobic Interactions of Peptides with Membrane Interfaces. Biochimica Et Biophysica Acta. Nov 1998;1376(3):339-52. doi:10.1016/s0304-4157(98)00021-5. PMID:9804985.
Wilson, K. J., A. Honegger, R. P. Stötzel, and G. J. Hughes. The Behaviour of Peptides on Reverse-Phase Supports during High-Pressure Liquid Chromatography. The Biochemical Journal. Oct 1981;199(1):31-41. doi:10.1042/bj1990031. PMID:7337711.
Wimley, W. C., and S. H. White. Experimentally Determined Hydrophobicity Scale for Proteins at Membrane Interfaces. Nature Structural Biology. Oct 1996;3(10):842–48. doi:10.1038/nsb1096-842. PMID:8836100.
Wimley, W. C., T. P. Creamer, and S. H. White. Solvation Energies of Amino Acid Side Chains and Backbone in a Family of Host-Guest Pentapeptides. Biochemistry. Apr 1996;35(16):5109–24. doi:10.1021/bi9600153. PMID:8611495.
Wolfenden, R., L. Andersson, P. M. Cullis, and C. C. Southgate. Affinities of Amino Acid Side Chains for Solvent Water. Biochemistry. Feb 1981;20(4):849–55. doi:10.1021/bi00507a030. PMID:7213619.
Zimmerman, J. M., N. Eliezer, and R. Simha. The Characterization of Amino Acid Sequences in Proteins by Statistical Methods. Journal of Theoretical Biology. Nov 1968;21(2):170–201. doi:10.1016/0022-5193(68)90069-6. PMID:5700434.
- hydrophobicity_profile(window: int = 11, scale: str = 'KyteDoolittle') Sequence[float]#
Build a hydrophobicity profile of a sliding window.
Example
>>> peptide = Peptide("ARQQNLFINFCLILIFLLLI")
>>> h = peptide.hydrophobicity_profile(window=12, scale="Eisenberg")
>>> [round(x, 3) for x in h]
[0.083, 0.147, 0.446, 0.632, 0.802, 0.955, 0.955, 0.944, 0.944]
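A sliding-window profile of this kind can be sketched as the mean of the per-residue values in each full-length window. The `sliding_mean` helper and the assumption that each window is simply averaged are illustrative; they are consistent with the example above, where a 20-residue peptide and a window of 12 yield 20 - 12 + 1 = 9 values, but the library's exact computation is not specified here.

```python
def sliding_mean(values, window=11):
    """Mean of each full-length window, one value per start position."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

The input would be the list of hydrophobicity values looked up for each residue in the chosen scale.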
- instability_index() float#
Compute the instability index of a protein sequence.
This function calculates the instability index proposed by Guruprasad et al (1990). This index predicts the stability of a protein based on its dipeptide composition.
- Returns:
float – The instability index of the peptide. A protein whose instability index is smaller than 40 is predicted to be stable; a value above 40 predicts that the protein may be unstable.
Example
>>> peptide = Peptide("QWGRRCCGWGPGRRYCVRWC")
>>> round(peptide.instability_index(), 2)
83.68
References
Guruprasad, K., B.V. Bhasker Reddy, and M. W. Pandit. Correlation between Stability of a Protein and Its Dipeptide Composition: A Novel Approach for Predicting in Vivo Stability of a Protein from Its Primary Sequence. Protein Engineering, Design and Selection. 1990 Dec;4(2):155–61. doi:10.1093/protein/4.2.155. PMID:2075190.
- isoelectric_point(pKscale: str = 'EMBOSS') float#
Compute the isoelectric point of a protein sequence.
The isoelectric point (pI) is the pH at which a particular molecule or surface carries no net electrical charge.
- Parameters:
pKscale (str) – The name of the pKa scale to be used. A list of all the allowed values can be retrieved from the keys of the peptides.tables.PK dictionary.
- Returns:
float – The pH at which the peptide has a neutral net charge.
Example
>>> peptide = Peptide("QWGRRCCGWGPGRRYCVRWC")
>>> peptide.isoelectric_point(pKscale="EMBOSS")
9.71...
>>> peptide.isoelectric_point(pKscale="Murray")
9.81...
>>> peptide.isoelectric_point(pKscale="Sillero")
9.89...
>>> peptide.isoelectric_point(pKscale="Solomon")
9.58...
>>> peptide.isoelectric_point(pKscale="Stryer")
9.62...
>>> peptide.isoelectric_point(pKscale="Lehninger")
9.93...
>>> peptide.isoelectric_point(pKscale="Dawson")
9.56...
>>> peptide.isoelectric_point(pKscale="Rodwell")
9.71...
Note
The pI is a variable that affects the solubility of the peptides under certain conditions of pH. When the pH of the solvent is equal to the pI of the protein, it tends to precipitate and lose its biological function.
References
Rice, P., I. Longden, and A. Bleasby. EMBOSS: The European Molecular Biology Open Software Suite. Trends in Genetics. June 2000;16(6):276–77. doi:10.1016/s0168-9525(00)02024-2. PMID:10827456.
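Since the net charge decreases monotonically as the pH rises, the pI can be located numerically as the root of the charge function. The sketch below uses bisection, which is an assumption chosen for simplicity and not necessarily the method the library uses; `find_pi` and `toy_charge` are hypothetical names, and the toy molecule has a single basic group (pKa 10) and a single acidic group (pKa 4).

```python
def find_pi(charge, low=0.0, high=14.0, tol=1e-6):
    """Bisect for the pH where a decreasing charge function crosses zero."""
    while high - low > tol:
        mid = (low + high) / 2
        if charge(mid) > 0:
            low = mid   # still positively charged: pI lies at higher pH
        else:
            high = mid  # neutral or negative: pI lies at lower pH
    return (low + high) / 2

# Toy molecule with one basic group (pKa 10) and one acidic group (pKa 4).
def toy_charge(pH):
    return 1 / (1 + 10 ** (pH - 10)) - 1 / (1 + 10 ** (4 - pH))
```

For the toy molecule, the two groups are symmetric around pH 7, so the search converges to a pI of about 7.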
- kidera_factors() KideraFactors#
Compute the Kidera factors of the peptide.
See KideraFactors for more information.
- Returns:
peptides.KideraFactors – The computed average Kidera factors for all the amino acids in the peptide.
Example
>>> peptide = Peptide("KLKLLLLLKLK")
>>> for i, kf in enumerate(peptide.kidera_factors()):
...     print(f"KF{i+1:<3} {kf: .4f}")
KF1   -0.7855
KF2    0.2982
KF3   -0.2364
KF4   -0.0818
KF5    0.2100
KF6   -1.8936
KF7    1.0291
KF8   -0.5127
KF9    0.1118
KF10   0.8100
- linker_preference_profile(window: int = 15) Sequence[float]#
Compute the linker preference profile of a protein sequence.
The linker preference profile is a measure used as a basis for the DomCut method in Suyama & Ohara (2003). The resulting profile can then be used to identify putative domain boundaries in the input protein, either:
- Using prior knowledge of the estimated domain count \(D\), in which case the \(D-1\) global minima in the sequence can be used as cutting points.
- Without prior knowledge of the domain count, using a fixed threshold to estimate the number of domains and linkers. A cutoff value of \(-0.09\) was selected by the authors optimizing on the specificity / selectivity tradeoff.
References
Suyama, M., and O. Ohara. DomCut: prediction of inter-domain linker regions in amino acid sequences. Bioinformatics. Mar 2003;19(5):673-4. doi:10.1093/bioinformatics/btg031. PMID:12651735.
Added in version 0.3.0.
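The threshold-based approach can be sketched as follows, assuming that positions whose profile value falls below the cutoff are treated as putative linkers. The `linker_regions` helper is hypothetical and takes any list of profile values, such as the output of `linker_preference_profile`; it is one possible reading of the method, not the DomCut reference implementation.

```python
def linker_regions(profile, cutoff=-0.09):
    """Return (start, end) index pairs of runs where profile < cutoff."""
    regions, start = [], None
    for i, value in enumerate(profile):
        if value < cutoff and start is None:
            start = i  # a new run below the cutoff begins here
        elif value >= cutoff and start is not None:
            regions.append((start, i - 1))  # the run ended at the previous index
            start = None
    if start is not None:
        regions.append((start, len(profile) - 1))
    return regions
```

Note that the returned indices refer to positions in the profile, which may be offset from sequence positions depending on how windows are aligned.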
- longest_run() int#
Return the length of the longest consecutive run of the same amino acid.
This method finds the longest stretch of consecutive identical amino acids in the sequence. Long runs of the same amino acid can indicate repetitive regions or low complexity sequences.
- Returns:
int – The length of the longest consecutive run, between 1 and len(sequence).
Example
>>> peptide = Peptide("AALS")
>>> peptide.longest_run()
2
>>> peptide = Peptide("AAAA")
>>> peptide.longest_run()
4
>>> peptide = Peptide("ACDEFGHIKLMNPQRSTVWY")
>>> peptide.longest_run()
1
>>> peptide = Peptide("AALLLSSS")
>>> peptide.longest_run()
3
Added in version 0.5.0.
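For reference, the same quantity can be computed with `itertools.groupby`, which groups consecutive identical characters. This standalone `longest_run` function is an illustrative sketch, not the library's implementation; returning 0 for an empty string is a choice made here.

```python
from itertools import groupby

def longest_run(sequence):
    """Length of the longest stretch of consecutive identical residues."""
    return max((len(list(group)) for _, group in groupby(sequence)), default=0)
```

In `AALLLSSS`, the runs are `AA`, `LLL` and `SSS`, so the result is 3, as in the example above.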
- mass_shift(aa_shift: str | Dict[str, float] | None = 'silac_13c', monoisotopic: bool = True) float#
Compute the mass difference of modified peptides.
This function calculates the mass difference of peptides introduced by chemical modifications or heavy isotope labelling.
- Parameters:
aa_shift (str or dict) – Either the key to a pre-defined isotope label (see peptides.tables.MASS_SHIFT), or a dictionary mapping each amino acid to its mass difference in Dalton (use nTer and cTer keys for N-terminal and C-terminal modifications).
monoisotopic (bool) – Flag whether monoisotopic weights of amino-acids should be used.
- Returns:
float – The mass difference of the modified peptide.
Example
>>> peptide = Peptide("EGVNDNECEGFFSAR")
>>> peptide.mass_shift(aa_shift="silac_13c")
6.020129...
>>> peptide.mass_shift(aa_shift=dict(R=10.00827))
10.00827...
References
Ong, S-E., I. Kratchmarova, and M. Mann. Properties of 13C-Substituted Arginine in Stable Isotope Labeling by Amino Acids in Cell Culture (SILAC). Journal of Proteome Research. Apr 2003;2(2):173–81. doi:10.1021/pr0255708. PMID:12716131.
Picotti, P., B. Bodenmiller, L. N. Mueller, B. Domon, and R. Aebersold. Full Dynamic Range Proteome Analysis of S. Cerevisiae by Targeted Proteomics. Cell. Aug 2009;138(4):795–806. doi:10.1016/j.cell.2009.05.051. PMID:19664813.
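As a rough sketch of how a dictionary-based shift could be applied, the snippet below sums one shift per labelled residue plus optional terminal shifts. The function is a hypothetical stand-in, not the library's code, and the SILAC_13C table is an assumption inferred from the example values on this page rather than a copy of peptides.tables.MASS_SHIFT.

```python
from typing import Mapping

# Assumed label table: +6.020129 Da on K and R, inferred from the examples.
SILAC_13C = {"K": 6.020129, "R": 6.020129}


def mass_shift(sequence: str, shifts: Mapping[str, float]) -> float:
    """Sum the per-residue shifts, plus optional nTer / cTer shifts."""
    # Residues missing from the mapping contribute no shift.
    total = sum(shifts.get(residue, 0.0) for residue in sequence)
    # Terminal modifications are counted once per peptide.
    total += shifts.get("nTer", 0.0) + shifts.get("cTer", 0.0)
    return total


print(mass_shift("EGVNDNECEGFFSAR", SILAC_13C))
print(mass_shift("EGVNDNECEGFFSAR", {"R": 10.00827}))
```

The peptide contains a single arginine and no lysine, which is why both calls return exactly one residue's worth of shift.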
- max_frequency() float#
Return the maximum frequency of any amino acid in the peptide.
This method identifies the amino acid that appears most frequently in the sequence and returns its frequency. A high maximum frequency indicates low sequence diversity.
- Returns:
float – The maximum frequency of any amino acid in the peptide, between 1/len(sequence) and 1.0.
Example
>>> peptide = Peptide("AALS")
>>> peptide.max_frequency()
0.5
>>> peptide = Peptide("AAAA")
>>> peptide.max_frequency()
1.0
>>> peptide = Peptide("ACDEFGHIKLMNPQRSTVWY")
>>> peptide.max_frequency()
0.05
Added in version 0.5.0.
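The same quantity can be sketched with the standard library alone. The standalone function below is hypothetical and only illustrates the definition (count of the most common residue divided by the sequence length); it is not taken from the library.

```python
from collections import Counter


def max_frequency(sequence: str) -> float:
    """Return the relative frequency of the most common residue."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence)
    # The most frequent residue sets the maximum frequency.
    return max(counts.values()) / len(sequence)


for seq in ("AALS", "AAAA", "ACDEFGHIKLMNPQRSTVWY"):
    print(seq, max_frequency(seq))
```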
- membrane_position_profile(window: int = 11, angle: int = 100) List[str]#
Compute the theoretical class of a protein sequence.
This function builds a profile predicting the theoretical class of a section of the peptide based on the relationship between the hydrophobic moment and hydrophobicity scale as proposed by Eisenberg (1984).
- Parameters:
- Returns:
list of str – A list containing a one-character code for each window starting position: either 'G' for globular, 'T' for transmembrane, or 'S' for surface.
Example
>>> peptide = Peptide("ARQQNLFINFCLILIFLLLI")
>>> peptide.membrane_position_profile(window=12, angle=100)
['G', 'G', 'G', 'T', 'S', 'T', 'T', 'T', 'T']
>>> peptide.membrane_position_profile(window=12, angle=160)
['G', 'G', 'G', 'S', 'S', 'S', 'S', 'S', 'S']
References
Eisenberg, D. Three-Dimensional Structure of Membrane and Surface Proteins. Annual Review of Biochemistry. July 1984;53:595–623. doi:10.1146/annurev.bi.53.070184.003115. PMID:6383201.
Eisenberg, D., E. Schwarz, M. Komaromy, and R. Wall. Analysis of Membrane and Surface Protein Sequences with the Hydrophobic Moment Plot. Journal of Molecular Biology. Oct 1984;179(1):125–42. doi:10.1016/0022-2836(84)90309-7. PMID:6502707.
Eisenberg, D., R. M. Weiss, and T. C. Terwilliger. The Helical Hydrophobic Moment: A Measure of the Amphiphilicity of a Helix. Nature. Sep 1982;299(5881):371–74. doi:10.1038/299371a0. PMID:7110359.
- molecular_weight(average: str = 'expasy', aa_shift: str | Dict[str, float] | None = None) float#
Compute the molecular weight of a protein sequence.
This function calculates the molecular weight of a protein sequence. It is calculated as the sum of the mass of each amino acid using one of the 3 available scales. It also supports mass calculation of proteins with predefined or custom stable isotope mass labels.
- Parameters:
- Returns:
float – The molecular weight of the peptide, in Dalton.
Example
>>> peptide = Peptide("QWGRRCCGWGPGRRYCVRWC")
>>> peptide.molecular_weight()
2485.91...
>>> peptide.molecular_weight(average="mascot")
2485.89...
>>> peptide.molecular_weight(average="monoisotopic")
2484.11...
References
Wilkins, M. R., E. Gasteiger, A. Bairoch, J. C. Sanchez, K. L. Williams, R. D. Appel, and D. F. Hochstrasser. Protein Identification and Analysis Tools in the ExPASy Server. Methods in Molecular Biology. 1999;112:531–52. doi:10.1385/1-59259-584-7:531. PMID:10027275.
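The "sum of residue masses" calculation can be sketched for the monoisotopic case. The table below holds commonly quoted monoisotopic residue masses rounded to five decimals, and one water molecule is added for the free termini; both are assumptions of this sketch and are not copied from the library's own tables. The "expasy" and "mascot" scales are not covered here.

```python
# Monoisotopic residue masses in Dalton (assumed, rounded to 5 decimals).
MONOISOTOPIC = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one H2O accounts for the N-terminal H and C-terminal OH


def monoisotopic_mass(sequence: str) -> float:
    """Sum the residue masses and add one water molecule."""
    return sum(MONOISOTOPIC[residue] for residue in sequence) + WATER


print(round(monoisotopic_mass("QWGRRCCGWGPGRRYCVRWC"), 2))
```

With these assumed masses the result lands close to the 2484.11... shown for average="monoisotopic" in the example.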
- ms_whim_scores() MSWHIMScores#
Compute the MS-WHIM scores of the peptide.
See MSWHIMScores for more information.
- Returns:
peptides.MSWHIMScores – The computed average of MS-WHIM scores of all the amino acids in the peptide.
Example
>>> peptide = Peptide("KLKLLLLLKLK")
>>> for i, mw in enumerate(peptide.ms_whim_scores()):
...     print(f"MSWHIM{i+1:<3} {mw: .4f}")
MSWHIM1   -0.6564
MSWHIM2    0.4873
MSWHIM3    0.1164
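The descriptor methods of this class are each documented as returning the average of per-residue descriptor values. A minimal sketch of that averaging step follows; TOY_TABLE holds made-up two-component values chosen purely for illustration, not real MS-WHIM scores, and the function name is hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical per-residue descriptor vectors (NOT real MS-WHIM values).
TOY_TABLE: Dict[str, Sequence[float]] = {
    "K": (1.0, -2.0),
    "L": (-1.0, 0.5),
}


def average_descriptors(sequence: str, table: Dict[str, Sequence[float]]) -> List[float]:
    """Average each descriptor component over all residues of the sequence."""
    vectors = [table[residue] for residue in sequence]
    # zip(*vectors) regroups the values component by component.
    return [sum(component) / len(vectors) for component in zip(*vectors)]


print(average_descriptors("KLKL", TOY_TABLE))
```

This sketch raises KeyError for residues missing from the table; how the library treats non-standard residues is not covered here.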
- mz(charge: int = 2, aa_shift: str | Dict[str, float] | None = None, cysteins: float = 57.021464) float#
Compute the m/z (mass over charge) ratio for a peptide.
This function calculates the (monoisotopic) mass over charge ratio (m/z) for peptides, as measured in mass spectrometry.
- Parameters:
charge (int) – The net charge for which the m/z should be computed.
aa_shift (str, dict or None) – Either an appropriate shift value to pass to Peptide.mass_shift, or None to get the unmodified weight.
cysteins (float) – The mass shift (in Dalton) of blocked cysteines. The default corresponds to cysteines blocked by iodoacetamide.
- Returns:
float – The m/z ratio of the peptide.
Example
>>> peptide = Peptide("EGVNDNECEGFFSAR")
>>> peptide.mz()
865.857...
>>> peptide.mz(aa_shift=dict(K=6.020129, R=6.020129))
868.867...
>>> peptide.mz(aa_shift="silac_13c", cysteins=58.005479)
869.359...
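The arithmetic behind an m/z value can be sketched as follows, assuming the usual definition (neutral mass plus one proton per charge, divided by the charge). The proton mass, the hand-computed unmodified mass of EGVNDNECEGFFSAR, and the function signature are all assumptions of this sketch, not the library's implementation.

```python
PROTON_MASS = 1.007276  # Da; assumed value for this sketch

# Approximate unmodified monoisotopic mass of EGVNDNECEGFFSAR,
# worked out by hand from standard residue masses (an assumption).
UNMODIFIED_MASS = 1672.67872


def mz(unmodified_mass: float, charge: int = 2, n_cysteines: int = 0,
       cysteine_shift: float = 57.021464, label_shift: float = 0.0) -> float:
    """Return (M + z * proton) / z, where M includes the modifications."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    # Each blocked cysteine adds cysteine_shift; isotope labels add label_shift.
    neutral = unmodified_mass + n_cysteines * cysteine_shift + label_shift
    return (neutral + charge * PROTON_MASS) / charge


print(round(mz(UNMODIFIED_MASS, n_cysteines=1), 3))
print(round(mz(UNMODIFIED_MASS, n_cysteines=1, label_shift=6.020129), 3))
```

Under these assumptions both values land close to the 865.857... and 868.867... shown in the example.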
- nutrient_cost(nutrient: str = 'glucose', organism: str = 'yeast', relative: bool = False)#
Estimate the nutrient cost to biosynthesize a peptide.
The nutrient cost was proposed by Barton et al. (2010) to estimate the cost required to biosynthesize a peptide based on genome-scale metabolic modeling in Saccharomyces cerevisiae and Escherichia coli. Compared to energy-based methods, this approach offers advantages for estimating costs in nutrient-limited environments.
- Parameters:
Example
>>> peptide = Peptide("SDKEVDEVDAALSDLEITLE")
>>> peptide.nutrient_cost("glucose", "ecoli")
15.763...
>>> peptide.nutrient_cost("ammonia", "yeast", relative=True)
10.839...
References
Barton, M. D., D. Delneri, S. G. Oliver, M. Rattray, and C. M. Bergman. Evolutionary Systems Biology of Amino Acid Biosynthetic Cost in Yeast. PLoS ONE. 2010;5(8):e11935. doi:10.1371/journal.pone.0011935. PMID:20808905.
- pcp_descriptors() PCPDescriptors#
Compute the Physical-Chemical Properties descriptors of the peptide.
See PCPDescriptors for more information.
- Returns:
peptides.PCPDescriptors – The computed average of PCP descriptors of all the amino acids in the peptide.
Example
>>> peptide = Peptide("QWGRRCCGWGPGRRYCVRWC")
>>> for i, pcp in enumerate(peptide.pcp_descriptors()):
...     print(f"E{i+1:<3} {pcp: .5f}")
E1    0.01090
E2    0.03810
E3    0.12505
E4    0.04095
E5   -0.10595
- physical_descriptors() PhysicalDescriptors#
Compute the Physical Descriptors of the peptide.
See PhysicalDescriptors for more information.
- Returns:
peptides.PhysicalDescriptors – The computed average of Physical Descriptors of all the amino acids in the peptide. PD1 is related to volume while PD2 is related to hydrophilicity.
Example
>>> peptide = Peptide("QWGRRCCGWGPGRRYCVRWC")
>>> for i, pd in enumerate(peptide.physical_descriptors()):
...     print(f"PD{i+1:<3} {pd: .4f}")
PD1    0.1190
PD2    0.2825
- prin_components() PRINComponents#
Compute the PRIN components of the peptide.
See PRINComponents for more information.
- Returns:
peptides.PRINComponents – The computed average of PRIN components of all the amino acids in the peptide.
- profile(table: Dict[str, float], window: int = 1, default: float = 0.0) Sequence[float]#
Compute a generic per-residue profile from per-residue indices.
- Parameters:
- Returns:
collections.abc.Sequence of float – The per-residue profile values, averaged in the given window size. When window is larger than the available number of residues, an empty sequence is returned.
Example
>>> peptide = Peptide("PKLVCLKKC")
>>> peptide.profile(peptides.tables.CHARGE['sign'])
[0.0, 1.0, 0.0, 0.0, -1.0, 0.0, 1.0, 1.0, -1.0]
>>> peptide.profile(peptides.tables.MOLECULAR_WEIGHT['expasy'], 5)
[108..., 111..., 111..., 114..., 115...]
Added in version 0.3.0.
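A sliding-window average of this kind can be sketched in a few lines. The function and the TOY_CHARGE table below are hypothetical and only mirror the documented behaviour (one averaged value per window start, empty output when the window exceeds the sequence length); they are not the library's implementation or data.

```python
from typing import List, Mapping

# Hypothetical per-residue index, for illustration only.
TOY_CHARGE = {"K": 1.0, "D": -1.0}


def window_profile(sequence: str, table: Mapping[str, float],
                   window: int = 1, default: float = 0.0) -> List[float]:
    """Average per-residue values over every window of the given size."""
    if window < 1:
        raise ValueError("window must be at least 1")
    # Residues absent from the table fall back to the default value.
    values = [table.get(residue, default) for residue in sequence]
    # There are len(values) - window + 1 window start positions (none if negative).
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


print(window_profile("KADK", TOY_CHARGE, window=2))
print(window_profile("KADK", TOY_CHARGE, window=5))
```

A sequence of 9 residues with a window of 5 gives 9 - 5 + 1 = 5 values under this scheme, which agrees with the length of the second output in the example.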
- protfp_descriptors() ProtFPDescriptors#
Compute the ProtFP descriptors of the peptide.
See ProtFPDescriptors for more information.
- Returns:
peptides.ProtFPDescriptors – The computed average of ProtFP descriptors of all the amino acids in the peptide.
Example
>>> peptide = Peptide("QWGRRCCGWGPGRRYCVRWC")
>>> for i, fp in enumerate(peptide.protfp_descriptors()):
...     print(f"ProtFP{i+1:<3} {fp: .4f}")
ProtFP1    0.2065
ProtFP2   -0.0565
ProtFP3    1.9930
ProtFP4   -0.2845
ProtFP5    0.7315
ProtFP6    0.7000
ProtFP7    0.1715
ProtFP8    0.1135
- sneath_vectors() SneathVectors#
Compute the Sneath vectors for the peptide.
See SneathVectors for more information.
- Returns:
peptides.SneathVectors – The computed average of Sneath vectors of all the amino acids in the peptide.
Example
>>> peptide = Peptide("QWGRRCCGWGPGRRYCVRWC")
>>> for i, fp in enumerate(peptide.sneath_vectors()):
...     print(f"SV{i+1:<3} {fp: .5f}")
SV1    0.19620
SV2    0.04655
SV3    0.04050
SV4    0.02775
- st_scales() STScales#
Compute the ST-scales of the peptide.
See STScales for more information.
- Returns:
peptides.STScales – The computed average of ST-scales of all the amino acids in the peptide.
Example
>>> peptide = Peptide("QWGRRCCGWGPGRRYCVRWC")
>>> for i, st in enumerate(peptide.st_scales()):
...     print(f"ST{i+1:<3} {st: .5f}")
ST1   -0.63760
ST2    0.07965
ST3    0.05150
ST4    0.07135
ST5   -0.27905
ST6   -0.80995
ST7    0.58020
ST8    0.54400
- structural_class(frequencies: str = 'Nakashima', distance: str = 'mahalanobis') str#
Predict the structural class of the peptide from its sequence.
The structural class of a protein, as defined in Levitt and Chothia (1976), can be either α, β, α+β, or α/β, with ζ being later defined for irregular proteins. It depends on the secondary structure of the protein. Several methods have been proposed to predict the structural class from the amino acid sequence, all based on similarity with proteins whose structures have been elucidated.
- Parameters:
frequencies (str) – The frequencies of the amino acids in proteins of different structural classes to use as reference centroids. Use "Chou" to load the frequencies of the 64 proteins analyzed in Chou (1989), "Nakashima" to use the normalized frequencies of the 135 proteins analyzed in Nakashima et al. (1986) and Zhang & Chou (1995), or "ChouZhang" to load the frequencies of the 120 proteins used in Chou & Zhang (1995).
distance (str) – The distance metric to use in the 20-D space formed by the 20 usual amino acids to find the nearest structural class for the peptide. Use "cityblock" to use the Manhattan distance like in Chou (1989), "euclidean" to use the Euclidean distance like in Nakashima et al. (1986), "correlation" to use the correlation distance like in Chou & Zhang (1992), "mahalanobis" to use the Mahalanobis distance like in Chou & Zhang (1995), or "discriminant" to use the Bayes discriminant like in Chou et al. (1998).
- Returns:
str – The structural class the protein most likely belongs to. Note that some classes may not be predictable, depending on the reference frequencies being used (at the moment, the ζ class can only be predicted from the Nakashima frequencies with the Euclidean or Manhattan distances).
Example
Predict the structural class of the skipjack tuna Cytochrome C (P0025), an α protein:
>>> p = Peptide(
...     "MGDVAKGKKTFVQKCAQCHTVENGGKHKVGPNLWGLFGRKTGQAEGYSYT"
...     "DANKSKGIVWNENTLMEYLENPKKYIPGTKMIFAGIKKKGERQDLVAYLK"
...     "SATS"
... )
>>> p.structural_class("Nakashima", distance="mahalanobis")
'alpha'
>>> p.structural_class("ChouZhang", distance="mahalanobis")
'beta'
>>> p.structural_class("Chou", distance="correlation")
'alpha'
>>> p.structural_class("Nakashima", distance="euclidean")
'alpha'
>>> p.structural_class("Chou", distance="cityblock")
'alpha+beta'
Predict the structural class of the sea krait Erabutoxin B (Q90VW1), a β protein:
>>> p = Peptide(
...     "MKTLLLTLVVVTIVCLDLGYTRICFNHQSSQPQTTKTCSPGESSCYHKQW"
...     "SDFRGTIIERGCGCPTVKPGIKLSCCESEVCNN"
... )
>>> p.structural_class("Nakashima", distance="mahalanobis")
'beta'
>>> p.structural_class("ChouZhang", distance="mahalanobis")
'alpha+beta'
>>> p.structural_class("Chou", distance="correlation")
'beta'
>>> p.structural_class("Nakashima", distance="euclidean")
'zeta'
>>> p.structural_class("Chou", distance="cityblock")
'beta'
Predict the structural class of the Arthrospira platensis Ferredoxin (P00246), a ζ protein:
>>> p = Peptide(
...     "MATYKVTLINEAEGINETIDCDDDTYILDAAEEAGLDLPYSCRAGACSTC"
...     "AGTITSGTIDQSDQSFLDDDQIEAGYVLTCVAYPTSDCTIKTHQEEGLY"
... )
>>> p.structural_class("Nakashima", distance="euclidean")
'zeta'
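The nearest-centroid idea behind these predictions can be sketched with the two simplest metrics. Everything below is illustrative: the centroids are derived from two made-up toy sequences, not from the Chou, Nakashima, or ChouZhang reference tables, and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence: str) -> List[float]:
    """Return the 20-D vector of amino acid frequencies."""
    counts = Counter(sequence)
    return [counts[aa] / len(sequence) for aa in AMINO_ACIDS]


def cityblock(u: List[float], v: List[float]) -> float:
    """Manhattan distance between two composition vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def euclidean(u: List[float], v: List[float]) -> float:
    """Euclidean distance between two composition vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_class(sequence: str, centroids: Dict[str, List[float]],
                  distance: Callable[[List[float], List[float]], float] = cityblock) -> str:
    """Pick the class whose centroid is closest to the composition."""
    freqs = composition(sequence)
    return min(centroids, key=lambda name: distance(freqs, centroids[name]))


# Hypothetical centroids built from toy sequences (NOT published data).
TOY_CENTROIDS = {
    "toy_a": composition("AEELLKK"),
    "toy_b": composition("VTVYTIS"),
}

print(nearest_class("AELKAELK", TOY_CENTROIDS))
print(nearest_class("VTIYVTSV", TOY_CENTROIDS, distance=euclidean))
```

The correlation, Mahalanobis, and discriminant options additionally need statistics of the reference sets, which this sketch does not attempt to reproduce.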
References
Chou, K-C., W-M. Liu, G. M. Maggiora, and C-T. Zhang. Prediction and Classification of Domain Structural Classes. Proteins: Structure, Function, and Genetics. Apr 1998;31(1):97–103. PMID:9552161.
Chou, K-C., and C-T. Zhang. Prediction of Protein Structural Classes. Critical Reviews in Biochemistry and Molecular Biology. Feb 1995;30:275–349. doi:10.3109/10409239509083488. PMID:7587280.
Chou, K-C., and C-T. Zhang. A Correlation-Coefficient Method to Predicting Protein-Structural Classes from Amino Acid Compositions. European Journal of Biochemistry. 1992;207(2):429–33. doi:10.1111/j.1432-1033.1992.tb17067.x. PMID:1633801.
Chou, P. Y. Prediction of Protein Structural Classes from Amino Acid Compositions. In Prediction of Protein Structure and the Principles of Protein Conformation, edited by G. D. Fasman. Springer US. 1989:549–86. doi:10.1007/978-1-4613-1571-1. ISBN:978-0-306-43131-9.
Nakashima, H., K. Nishikawa, and T. Ooi. The Folding Type of a Protein Is Relevant to the Amino Acid Composition. Journal of Biochemistry. Jan 1986;99(1):153–62. doi:10.1093/oxfordjournals.jbchem.a135454. PMID:3957893.
Zhang, Chun-Ting, and Kuo-Chen Chou. An Eigenvalue-Eigenvector Approach to Predicting Protein Folding Types. Journal of Protein Chemistry. Jul 1995;14(5):309–26. doi:10.1007/BF01886788. PMID:8590599.
Zhou, G.P., and N. Assa-Munt. Some Insights into Protein Structural Class Prediction. Proteins: Structure, Function, and Bioinformatics. 2001;44(1):57–59. doi:10.1002/prot.1071. PMID:11354006.
- svger_descriptors() SVGERDescriptors#
Compute the SVGER descriptors of the peptide.
See SVGERDescriptors for more information.
- Returns:
peptides.SVGERDescriptors – The computed average of SVGER descriptors of all the amino acids in the peptide.
Added in version 0.3.2.
- t_scales() TScales#
Compute the T-scales of the peptide.
See TScales for more information.
- Returns:
peptides.TScales – The computed average of T-scales of all the amino acids in the peptide.
Example
>>> peptide = Peptide("QWGRRCCGWGPGRRYCVRWC")
>>> for i, t in enumerate(peptide.t_scales()):
...     print(f"T{i+1:<3} {t: .4f}")
T1   -3.2700
T2   -0.0035
T3   -0.3855
T4   -0.1475
T5    0.7585
- vhse_scales() VHSEScales#
Compute the VHSE-scales of the peptide.
See VHSEScales for more information.
- Returns:
peptides.VHSEScales – The computed average of VHSE-scales of the amino acids in the peptide. VHSE1 and VHSE2 represent hydrophobic properties, VHSE3 and VHSE4 represent steric properties, while VHSE5, VHSE6, VHSE7 and VHSE8 represent electronic properties.
Example
>>> peptide = Peptide("QWGRRCCGWGPGRRYCVRWC")
>>> for i, vhse in enumerate(peptide.vhse_scales()):
...     print(f"VHSE{i+1:<3} {vhse: .4f}")
VHSE1   -0.1150
VHSE2    0.0630
VHSE3   -0.0055
VHSE4    0.7955
VHSE5    0.4355
VHSE6    0.2485
VHSE7    0.1740
VHSE8   -0.0960
- vstpv_descriptors() VSTPVDescriptors#
Compute the VSTPV descriptors of the peptide.
See VSTPVDescriptors for more information.
- Returns:
peptides.VSTPVDescriptors – The computed VSTPV descriptors for the peptide.
- z_scales() ZScales#
Compute the Z-scales of the peptide.
See ZScales for more information.
- Returns:
peptides.ZScales – The computed average of Z-scales of all the amino acids in the peptide.
Example
>>> peptide = Peptide("QWGRRCCGWGPGRRYCVRWC")
>>> for i, z in enumerate(peptide.z_scales()):
...     print(f"Z{i+1:<3} {0.0+round(z,5): .4f}")
Z1    0.5520
Z2    0.0985
Z3    0.0000
Z4    0.8130
Z5   -0.8285