UniProtKB query fields
Supported query fields for searching specific data in UniProtKB (see also query syntax).
rest.uniprot.org field | rest.uniprot.org example | Description |
---|---|---|
accession | accession:P62988 | The old behaviour was to list all entries with primary or secondary accession number P62988. The new behaviour will list all primary / canonical isoform accessions P62988. To search over secondary accessions, we have introduced the sec_acc field. |
active | active:false | Lists all obsolete entries. |
Refer to the page: Sequence Annotations | Lists all entries with:
|
|
lit_author | lit_author:ashburner | Lists all entries with at least one reference co-authored by Michael Ashburner. |
protein_name | protein_name:CD233 | Lists all entries whose cluster of differentiation number is CD233 (see cdlist.txt). |
chebi | chebi:18420 | Lists all entries which are associated with the small molecule corresponding to ChEBI identifier 18420, Mg(2+) (see How can I search UniProt for chemical or reaction data?). |
uniprot_id (/uniref), then uniref_cluster_90 (/uniprotkb) |
|
Find all entries in the UniRef 90% identity cluster whose representative sequence is UniProtKB entry A5YMT3 (about UniRef). |
xrefcount_pdb (or xref_count) | xref_count_pdb:[20 TO *] | Lists all entries with 20 or more cross-references to PDB |
date_created | date_created:[2012-10-01 TO *] | Lists all entries created since October 1st 2012. |
database, xref |
|
Lists all entries with:
|
ec | ec:3.2.1.23 | Lists all beta-galactosidases (Enzyme nomenclature database). |
Refer to the pages: Comments or Sequence Annotations | Lists all entries with:
|
|
existence | existence:3 | See Protein existence criteria. |
family | family:serpin | Lists all entries belonging to the Serpin family of proteins (Index of protein domains and families). |
fragment | fragment:true | Lists all entries with an incomplete sequence. |
gene | gene:HPSE | Lists all entries for proteins encoded by gene HPSE, but also by HPSE2. |
gene_exact | gene_exact:HPSE | Lists all entries for proteins encoded by gene HPSE, but excluding variations like HPSE2 or HPSE_0. |
go | go:0015629) | Lists all entries associated with the GO term Actin cytoskeleton and any subclasses |
virus_host_name, virus_host_id | virus_host_id:10090 | Lists all entries for viruses infecting Mus musculus (Mouse) |
accession_id | accession_id:P00750 | Returns the entry with the primary accession number P00750. |
inchikey | inchikey:WQZGKKKJIJFFOK-GASJEMHNSA-N | Returns entries associated with the small molecule identified by the InChIKey WQZGKKKJIJFFOK-GASJEMHNSA-N, i.e. D-glucopyranose (see How can I search UniProt for chemical or reaction data?). To get the CHEBI identifier for an Inchikey value, one can now use the advanced search builder. |
protein_name | protein_name:Anakinra | Lists all entries whose protein name includes the "International Nonproprietary Name" is Anakinra. |
interactor | interactor:P00520 | Lists all entries describing interactions with the protein described by entry P00520. |
keyword |
|
|
length | length:[500 TO 700] | Lists all entries describing sequences of length between 500 and 700 residues. |
mass | mass:[500000 TO *] | Lists all entries describing sequences with a mass of at least 500,000 Da. |
cc_mass_spectrometry | cc_mass_spectrometry:maldi | Lists all entries for proteins identified by: matrix-assisted laser desorption/ionization (MALDI), crystallography (X-Ray). The method field searches names of physico-chemical identification methods in the 'Biophysicochemical properties' subsection of the 'Function' section, the 'Publications' and 'Cross-references' sections. |
date_modified | modified:[2012-01-01 TO 2019-03-01] AND active:true | Lists all active entries that were last modified between January and March 2019. |
protein_name | protein_name:"prion protein" | Lists all entries for prion proteins. |
organelle | organelle:Mitochondrion | Lists all entries for proteins encoded by a gene of the mitochondrial chromosome. |
organism_name, organism_id |
|
Lists all entries for proteins expressed in sheep (first 2 examples) and organisms whose name contains the term "sheep" (UniProt taxonomy). |
plasmid | plasmid:ColE1 | Lists all entries for proteins encoded by a gene of plasmid ColE1 (Controlled vocabulary of plasmids). |
proteome | proteome:UP000005640 | Lists all entries from the human proteome. |
proteomecomponent | proteomecomponent:"chromosome 1" AND organism_id:9606 | Lists all entries from the human chromosome 1. |
sec_acc | sec_acc:P02023 | Lists all entries that were created from a merge with entry P02023 (see FAQ). |
reviewed | reviewed:true | Lists all UniProtKB/Swiss-Prot entries (about UniProtKB). |
scope | scope:mutagenesis | Lists all entries containing a reference that was used to gather information about mutagenesis (Entry view: "Cited for", See 'Publications' section of the user manual). |
sec_acc | sec_acc:P62988 | Lists all entries containing a secondary accession P62988. |
sequence | accession:P05067-9 AND is_isoform:true | Lists all entries containing a link to isoform 9 of the sequence described in entry P05067. Allows searching by specific sequence identifier. |
date_sequence_modified |
|
|
strain | strain:wistar | Lists all entries containing a reference relevant to strain wistar (Lists of strains in reference comments and Taxonomy help: organism strains). |
taxonomy_name, taxonomy_id |
|
Lists all entries for proteins expressed in Mammals. This field is used to retrieve entries for all organisms classified below a given taxonomic node (taxonomy classification). |
tissue | tissue:liver | Lists all entries containing a reference describing the protein sequence obtained from a clone isolated from liver (Controlled vocabulary of tissues). |
cc_webresource | cc_webresource:wikipedia | Lists all entries for proteins that are described in Wikipedia. |