▼ Public Database integrated in EPSD
(1) Public phosphorylation resource
1. dbPPT: A database that contains 82,175 known p-sites in 31,012 proteins for 20 plant species (Cheng, et al., 2014).
2. dbPAF: A database that contains 54,148 phosphoproteins with 483,001 phosphorylation sites in H. sapiens, M. musculus, R. norvegicus, D. melanogaster, C. elegans, S. pombe and S. cerevisiae (Ullah, et al., 2016).
3. PhosphoSitePlus: Provides comprehensive information and tools for the study of protein post-translational modifications (PTMs), including phosphorylation, acetylation, and more. Web use is free for everyone, including commercial users (Hornbeck, et al., 2015).
4. Phospho.ELM: A relational database designed to store in vivo and in vitro phosphorylation data extracted from the scientific literature and phosphoproteomic analyses (Dinkel, et al., 2011).
5. PhosphoPep: A project to support systems biology signaling research by providing interactive interrogation of MS-derived phosphorylation data from 4 different organisms (Bodenmiller, et al., 2011).
6. BioGRID: An interaction repository with data compiled through comprehensive curation efforts (Oughtred, et al., 2019).
7. dbPTM: An integrated resource for protein post-translational modifications (Huang, et al., 2019).
8. FPD: Includes 62,272 non-redundant phosphorylation sites in 11,222 proteins across eight fungi (Bai, et al., 2017).
9. HPRD: Represents a centralized platform to visually depict and integrate information pertaining to domain architecture, post-translational modifications, interaction networks and disease association for each protein in the human proteome (Keshava, et al., 2009).
10. MPPD: A repository for Medicago truncatula phosphoprotein data (Grimsrud, et al., 2010).
11. P3DB: One of the most significant in vivo data resources for studying plant phosphoproteomics (Yao, et al., 2014).
12. PHOSIDA: A phosphorylation site database that integrates thousands of high-confidence in vivo phosphosites identified by mass spectrometry-based proteomics in various species (Gnad, et al., 2011).
13. PhosPhAt: The Arabidopsis thaliana phosphorylation site database (Durek, et al., 2010).
14. SysPTM: A systematic resource for proteomic research on post-translational modifications (Li, et al., 2014).
15. UniProt: The universal protein knowledgebase (The UniProt Consortium, 2019).
16. iPTMnet: An integrated database of protein PTMs in the context of systems biology (Huang, et al., 2018).
17. CPTAC: The CPTAC Data Portal, a resource for cancer proteomics research (Edwards, et al., 2015).
18. Pf-phospho: A machine learning-based phosphorylation site prediction tool for Plasmodium proteins (Gupta, et al., 2022).
19. Plant_PTM_Viewer: A central resource for exploring plant protein modifications (Willems, et al., 2019).
20. qPTM: A new version of the qPhos database; a web-based database for 6 types of PTMs (acetylation, glycosylation, methylation, phosphorylation, SUMOylation and ubiquitylation) in 4 organisms (human, mouse, rat and yeast). Matched proteome datasets were integrated where available. In total, 11,482,533 quantification events for 660,030 sites on 40,728 proteins under 2,596 conditions are collected and integrated in qPTM (Yu, et al., 2022).
(2) Phosphorylation regulator
1. PhosphoSitePlus: A knowledgebase dedicated to mammalian post-translational modifications (PTMs), contains over 330,000 non-redundant PTMs, including phospho, acetyl, ubiquityl and methyl groups (Hornbeck, et al., 2015).
2. Phospho.ELM: A relational database designed to store in vivo and in vitro phosphorylation data extracted from the scientific literature and phosphoproteomic analyses (Dinkel, et al., 2011).
3. PostMod: A prediction server for phosphorylation sites (Jung, et al., 2010).
4. PSEA: Kinase-specific prediction and analysis of human phosphorylation substrates (Suo, et al., 2014).
5. PhosphoNetworks: A database for human phosphorylation networks (Hu, et al., 2014).
6. HuPHO: The human phosphatase portal (Liberti, et al., 2013).
7. DEPOD: A manually curated open access database providing human phosphatases (Duan, et al., 2015).
8. GPS: A tool to predict kinase-specific phosphorylation sites in hierarchy (Xue, et al., 2011).
9. NetworKIN: An algorithm that integrates cellular context information and motif-based predictions (Horn, et al., 2014).
10. PKIS: Computational identification of protein kinases for experimentally discovered protein phosphorylation sites (Zou, et al., 2013).
11. PhosphoPICK: A method for predicting kinase substrates using an integrated system of cellular context and protein sequence information (Patrick, et al., 2015).
12. RegPhos: A resource to explore the protein kinase-substrate phosphorylation networks in human and mouse (Huang, et al., 2014).
(3) Genetic variation & mutation
1. TCGA: A publicly funded project that aims to catalogue and discover major cancer-causing genomic alterations to create a comprehensive "atlas" of cancer genomic profiles (Hutter, et al., 2018).
2. ICGC: Aims to obtain a comprehensive description of genomic, transcriptomic and epigenomic changes in 50 different tumor types and/or subtypes that are of clinical and societal importance across the globe (Zhang, et al., 2019).
3. COSMIC: The world's largest and most comprehensive resource for exploring the impact of somatic mutations in human cancer (Tate, et al., 2019).
4. dbSNP: The NCBI database of genetic variation (Sherry, et al., 2001).
5. IntOGen: Integration and data mining of multidimensional oncogenomic data (Rubio-Perez, et al., 2015).
6. MIMP: Characterizes genetic variants such as cancer mutations that specifically alter kinase-binding sites in proteins (Wagih, et al., 2015).
(4) Functional annotation
1. iEKPD: Contains 197,348 phosphorylation regulators, including 109,912 protein kinases, 23,294 protein phosphatases and 68,748 PPBD-containing proteins in 164 eukaryotic species (Guo, et al., 2019).
2. iUUCD: Contains 136,512 UB/UBL regulators, including 1,230 E1s, 5,636 E2s, 93,343 E3s, 9,548 DUBs, 30,173 UBDs and 11,099 ULDs in 148 eukaryotic species (Zhou, et al., 2018).
3. WERAM: Collected over 580 experimentally identified histone regulators from eight model organisms (Xu, et al., 2017).
4. AnimalTFDB 3.0: A comprehensive resource for annotation and prediction of animal transcription factors (Hu, et al., 2019).
5. PlantTFDB 4.0: Plant Transcription Factor Database (Jin, et al., 2017).
6. AmyPro: A database of proteins with validated amyloidogenic regions (Varadi, et al., 2018).
7. HAMAP: High-quality Automated and Manual Annotation of Proteins (Pedruzzi, et al., 2015).
8. Membranome: Provides structural and functional data about single-spanning (bitopic) transmembrane proteins of six organisms (Lomize, et al., 2018).
9. neXtProt: Now has proteomics data for over 85% of the human proteins, as well as new tools tailored to the proteomics community (Gaudet, et al., 2017).
10. THANATOS: Contains 191,543 proteins potentially associated with autophagy and cell death pathways in 164 eukaryotes (Deng, et al., 2018).
11. CGDB: Contains ∼73,000 circadian-related genes in 68 animals, 39 plants and 41 fungi (Li, et al., 2017).
12. MultitaskProtDB-II: A database of multitasking/moonlighting proteins (Franco-Serrano, et al., 2018).
13. MoonDB 2.0: An updated database of extreme multifunctional and moonlighting proteins (Ribeiro, et al., 2019).
14. CORUM: The comprehensive resource of mammalian protein complexes (Giurgiu, et al., 2019).
15. CellMarker: A manually curated resource of cell markers in human and mouse (Zhang, et al., 2019).
16. GPCRdb: A database of G protein-coupled receptors (GPCRs) (Pándy-Szekeres, et al., 2018).
17. PTMCode: A resource for functional associations of post-translational modifications within and between proteins (Minguez, et al., 2015).
(5) Structural annotation
1. PDB: Contains 41,599 distinct protein sequences, 36,830 structures of human sequences and 9,465 nucleic acid containing structures (Burley, et al., 2019).
2. MMDB: Close to 60% of protein sequences tracked in comprehensive databases can be mapped to a known three-dimensional (3D) structure by standard sequence similarity searches (Madej, et al., 2014).
3. SCOP2: A prototype of a new structural classification of proteins (Andreeva, et al., 2014).
4. IUPred: Web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content (Dosztányi, et al., 2005).
5. MobiDB 3.0: Annotations of intrinsic protein disorder and its function (Piovesan, et al., 2018).
(6) Physicochemical property
1. AAindex: A database of numerical indices representing various physicochemical and biochemical properties of amino acids and pairs of amino acids (Kawashima, et al., 2008).
2. Compute pI/Mw: A tool that computes the theoretical pI (isoelectric point) and Mw (molecular weight) for a list of UniProt Knowledgebase (Swiss-Prot or TrEMBL) entries or for user-entered sequences (Wilkins, et al., 1999).
(7) Functional domain
1. Pfam: A widely used database of protein families, containing 14,831 manually curated entries in the current release (El-Gebali, et al., 2019).
2. PROSITE: Consists of documentation entries describing protein domains, families and functional sites, as well as associated patterns and profiles to identify them (Sigrist, et al., 2013).
3. InterPro: Provides functional analysis of proteins by classifying them into families and predicting domains and important sites (Mitchell, et al., 2019).
4. Gene3D: A database of globular domain annotations for millions of available protein sequences (Lewis, et al., 2018).
5. PIRSF: Reflects evolutionary relationships of full-length proteins and domains (Nikolskaya, et al., 2007).
6. PRINTS: A collection of diagnostic protein family 'fingerprints' (Attwood, et al., 2012).
7. SMART: A web resource for the identification and annotation of protein domains and the analysis of protein domain architectures (Letunic, et al., 2018).
(8) Disease-associated information
1. ClinVar: A public archive of reports of the relationships among human variations and phenotypes with supporting evidence (Landrum, et al., 2018).
2. GWASdb: Collected 2,479 unique publications from PubMed and other resources, generated a total of 252,530 unique TASs, and mapped 1,610 GWAS traits to 501 Human Phenotype Ontology (HPO) terms, 435 Disease Ontology (DO) terms and 228 Disease Ontology Lite (DOLite) terms (Li, et al., 2016).
3. SNPdbe: A database and web interface designed to fill the annotation gap left by the high cost of experimental testing for functional significance of protein variants (Schaefer, et al., 2012).
4. PMD: Contains over 81,000 mutants, including artificial as well as natural mutants of various proteins extracted from about 10,000 articles (Kawabata, et al., 1999).
5. MSV3d: Provides full annotation of missense variants of all human proteins with multi-level characterization including details of the physico-chemical changes induced by the amino acid modification (Luu, et al., 2012).
6. ActiveDriverDB: Human disease mutations and genome variation in post-translational modification sites of proteins (Krassowski, et al., 2018).
7. BioMuta: An integrated sequence feature database, provides a framework for automated and manual curation and integration of cancer-related sequence features so that they can be used in NGS analysis pipelines (Dingerdissen, et al., 2018).
8. Kin-Driver: A human kinase database with driver mutations (Simonetti, et al., 2014).
9. NECTAR: A database and web application to annotate disease-related and functionally important amino acids in human proteins (Gong, et al., 2014).
10. OMIM: A comprehensive, authoritative and timely research resource of curated descriptions of human genes and phenotypes and the relationships between them (Amberger, et al., 2019).
11. PTMD: A Database of Human Disease-associated Post-translational Modifications (Xu, et al., 2018).
12. MSDD: miRNA SNP Disease Database (Yue, et al., 2018).
13. DiseaseEnhancer: A resource of human disease-associated enhancer catalog (Zhang, et al., 2018).
(9) Protein-protein interaction
1. IID: A major replacement of the I2D interaction database, with larger PPI networks (a total of 1,566,043 PPIs among 68,831 proteins) (Kotlyar, et al., 2019).
2. iRefIndex: A consolidated protein interaction database with provenance (Razick, et al., 2008).
3. PINA: Includes multiple collections of interaction modules identified by different clustering approaches from the whole network of protein interactions ('interactome') for six model organisms (Cowley, et al., 2012).
4. HINT: High-quality protein interactomes and their applications in understanding human disease (Das, et al., 2012).
5. Mentha: A resource for browsing integrated protein-interaction networks (Calderone, et al., 2013).
6. inBio Map™: Contains >500,000 interactions and enables functional interpretation of >4,700 cancer genomes and genes involved in autism (Li, et al., 2017).
7. STRING: A database of known and predicted protein-protein interactions, covers 9,643,763 proteins from 2,031 organisms (Szklarczyk, et al., 2019).
8. TIMBAL: A database holding molecules of molecular weight <1200 Daltons that modulate protein–protein interactions (Higueruelo, et al., 2013).
(10) Drug-target relation
1. TTD: Contains 2,025 targets (364 successful, 286 clinical trial, 44 discontinued and 1,331 research targets) and 17,816 drugs (1,540 approved, 1,423 clinical trial and 14,853 experimental drugs), plus 3,681 multi-target agents (Li, et al., 2018).
2. DrugBank: Contains 9,591 drug entries including 2,037 FDA-approved small molecule drugs, 241 FDA-approved biotech (protein/peptide) drugs, 96 nutraceuticals and over 6,000 experimental drugs (Wishart, et al., 2018).
3. GtoPdb: Provides pharmacological, chemical, genetic, functional and pathophysiological data on the targets of approved and experimental drugs (Harding, et al., 2018).
4. ADReCS-Target: Provides comprehensive information for illustrating ADRs caused by drug interactions with protein, gene and genetic variation (Zhang, et al., 2007).
5. ECOdrug: A database connecting drugs and conservation of their targets across species (Verbruggen, et al., 2018).
6. DGIdb: Consolidates, organizes and presents drug-gene interactions and gene druggability information from papers, databases and web resources (Cotto, et al., 2018).
7. CTD: A free resource that provides manually curated information on chemical, gene, phenotype, and disease relationships to advance understanding of the effect of environmental exposures on human health (Grondin, et al., 2019).
8. DrugCentral: A drug information resource (Ursu, et al., 2019).
(11) Orthologous information
1. InParanoid: Provides a user interface to orthologs inferred by the InParanoid algorithm (Sonnhammer, et al., 2015).
2. OMA: A leading resource to relate genes across many species from all of life (Altenhoff, et al., 2018).
3. OrthoDB: A comprehensive catalog of orthologs, genes inherited by extant species from a single gene in their last common ancestor (Kriventseva, et al., 2019).
4. HOGENOM: Database of complete genome homologous genes families (Penel, et al., 2009).
(12) Biological pathway
1. KEGG: A database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism and the ecosystem, from molecular-level information, especially large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies (Kanehisa, et al., 2019).
2. SignaLink: An integrated resource to analyze signaling pathway cross-talks, transcription factors, miRNAs and regulatory enzymes (Fazekas, et al., 2013).
3. Reactome: Provides molecular details of signal transduction, transport, DNA replication, metabolism, and other cellular processes as an ordered network of molecular transformations (an extended version of a classic metabolic map) in a single consistent data model (Fabregat, et al., 2018).
(13) Transcriptional regulator
1. TRRUST: An expanded reference database of human and mouse transcriptional regulatory interactions (Han, et al., 2018).
2. HEDD: An integrated human enhancer disease database (Wang, et al., 2018).
3. DroID: A comprehensive, integrated resource for protein, transcription factor, RNA and gene interactions for Drosophila (Murali, et al., 2011).
4. YTRP: Aimed to find the TRP information for the TFPE-identified TF-gene regulatory pairs (Yang, et al., 2014).
5. RegNetwork: Gene regulatory networks for human and mouse, built by collecting the documented regulatory interactions among TFs, miRNAs and target genes (Liu, et al., 2015).
(14) mRNA expression
1. TCGA: The Cancer Genome Atlas (TCGA) has generated comprehensive, multi-dimensional maps of the key genomic changes in 33 types of cancer. The TCGA dataset, 2.5 petabytes of data describing tumor tissue and matched normal tissues from more than 11,000 patients, is publicly available and has been used widely by the research community (Hutter, et al., 2018).
2. ICGC: The Data Portal currently contains data from 24 cancer projects, and consists of 3,478 genomes and 13 cancer types and subtypes (Zhang, et al., 2019).
3. ArrayExpress: An archive of functional genomics data (Athar, et al., 2019).
4. GXD: The mouse Gene Expression Database (Smith, et al., 2019).
5. COSMIC: Describes 2,002,811 coding point mutations in over one million tumor samples and across most human genes (Tate, et al., 2019).
6. BioXpress: A curated gene expression and disease association database (Dingerdissen, et al., 2018).
7. TissGDB: Tissue-specific Gene DataBase in cancer (Kim, et al., 2018).
8. FFGED: The filamentous fungal gene expression database (Zhang, et al., 2010).
9. SZDB: A comprehensive resource for schizophrenia research (Wu, et al., 2017).
10. TISSUES 2.0: An integrative web resource on mammalian tissue expression (Palasca, et al., 2018).
11. The Human Protein Atlas (HPA): 11,200 unique proteins corresponding to over 50% of all human protein-encoding genes have been analysed (Uhlen, et al., 2017).
12. Human Proteome Map (HPM): Analysis of 30 histologically normal human samples resulted in the identification of proteins encoded by 17,294 genes (Kim, et al., 2014).
(15) Protein expression/Proteomics
1. The Human Protein Atlas (HPA): 11,200 unique proteins corresponding to over 50% of all human protein-encoding genes have been analysed (Uhlen, et al., 2017).
2. Human Proteome Map (HPM): Analysis of 30 histologically normal human samples resulted in the identification of proteins encoded by 17,294 genes (Kim, et al., 2014).
(16) Subcellular localization
1. NLSdb: Nuclear Localization Signals (Bernhofer, et al., 2018).
2. COMPARTMENTS: Unification and visualization of protein subcellular localization evidence (Binder, et al., 2014).
3. Translocatome: Translocating human proteins (Mendik, et al., 2019).