4. Query functions
4.1. Before you query
4.1.1. 1. You can use % as a wildcard.
import pyuniprot
query = pyuniprot.query()
# exact search
query.entry(recommended_full_name='Amyloid beta A4 protein')
# starts with 'Amyloid beta'
query.entry(recommended_full_name='Amyloid beta%')
# ends with 'A4 protein'
query.entry(recommended_full_name='%A4 protein')
# contains 'beta A4'
query.entry(recommended_full_name='%beta A4%')
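The % notation suggests SQL LIKE semantics. The following is a minimal sketch of how such patterns match, assuming LIKE-style behaviour. It uses the standard-library sqlite3 module and a hypothetical table entry_demo with made-up names; it is not the pyuniprot schema or implementation.

```python
import sqlite3

# Stand-in table with toy names; NOT the pyuniprot schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry_demo (name TEXT)")
conn.executemany(
    "INSERT INTO entry_demo VALUES (?)",
    [
        ("Amyloid beta A4 protein",),
        ("Amyloid beta demo protein",),
        ("Demo A4 protein",),
    ],
)


def like(pattern):
    """Return the names matching a LIKE pattern, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM entry_demo WHERE name LIKE ? ORDER BY name",
        (pattern,),
    )
    return [name for (name,) in rows]


exact = like("Amyloid beta A4 protein")   # no wildcard: exact match
starts_with = like("Amyloid beta%")       # trailing %: prefix match
ends_with = like("%A4 protein")           # leading %: suffix match
contains = like("%beta A4%")              # % on both sides: substring match
```

A pattern without % only matches the full string, so an exact search and a wildcard search can share the same parameter.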
4.1.2. 2. Use limit to restrict the number of results
import pyuniprot
query = pyuniprot.query()
query.entry(limit=10)
To page through results, pass a tuple (page_number, number_of_results_per_page) to the parameter limit.
Note that page_number starts at 0.
import pyuniprot
query = pyuniprot.query()
# first page (page_number 0) with 3 results per page
query.entry(limit=(0,3))
# fifth page (page_number 4, counting from 0) with 10 results per page
query.entry(limit=(4,10))
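The paging tuple can be read as an offset into the full result list. The sketch below assumes the offset is page_number * number_of_results_per_page; the helper page_to_offset and the records list are hypothetical stand-ins for illustration, not part of pyuniprot.

```python
def page_to_offset(limit):
    """Translate a limit value into (offset, count).

    An int means "the first `limit` results"; a tuple is read as
    (page_number, results_per_page) with page_number starting at 0.
    """
    if isinstance(limit, int):
        return 0, limit
    page_number, per_page = limit
    return page_number * per_page, per_page


# Stand-in for a result set of 100 rows.
records = list(range(100))


def fetch(limit):
    """Return the slice of records selected by limit."""
    offset, count = page_to_offset(limit)
    return records[offset:offset + count]


first_page = fetch((0, 3))    # rows 0..2
fifth_page = fetch((4, 10))   # rows 40..49
first_ten = fetch(10)         # rows 0..9
```

Under this assumption, limit=(4,10) skips 40 results and returns the next 10.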
4.1.3. 3. Return pandas.DataFrame as result
This is useful if you want to process the results further with pandas functions.
import pyuniprot
query = pyuniprot.query()
query.entry(as_df=True)
4.1.4. 4. Show all columns as dict
import pyuniprot
query = pyuniprot.query()
first_entry = query.entry(limit=1)[0]
first_entry.__dict__
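If the models are SQLAlchemy-mapped classes (an assumption; it is not stated above), __dict__ may also contain internal bookkeeping keys such as _sa_instance_state. A small helper can keep only the public attributes. The function public_columns and the class FakeEntry below are hypothetical and only stand in for a real query result.

```python
def public_columns(obj):
    """Return the instance attributes of obj whose names do not start with '_'."""
    return {key: value for key, value in vars(obj).items()
            if not key.startswith("_")}


class FakeEntry:
    """Stand-in for a mapped model instance; not a pyuniprot class."""

    def __init__(self):
        self.name = "A4_HUMAN"
        self.dataset = "Swiss-Prot"
        self._sa_instance_state = object()  # mimics an internal attribute


columns = public_columns(FakeEntry())
```

With a real result, the same call would be public_columns(first_entry).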
4.1.5. 5. Return single values with key name
import pyuniprot
query = pyuniprot.query()
query.entry(recommended_full_name='%kinase')[0].recommended_full_name
4.1.6. 6. Access to the linked data models (1-n, n-m)
For example, an entry can access:
- sequence
- accessions
- organism_hosts
- features
- functions
- ec_numbers
- db_references
- alternative_full_names
- alternative_short_names
- disease_comments
- tissue_specificities
- other_gene_names
import pyuniprot
query = pyuniprot.query()
r = query.entry(limit=1)[0]
r.sequence
r.accessions
r.organism_hosts
r.features
r.functions
r.ec_numbers
r.db_references
r.alternative_full_names
r.alternative_short_names
r.disease_comments
r.tissue_specificities
r.other_gene_names
Conversely, from an EC number you can navigate back to the entry:
import pyuniprot
query = pyuniprot.query()
r = query.ec_number(ec_number='1.1.1.1')
[x.entry for x in r]
# unusual but possible: from each entry, navigate back to its EC numbers
[x.entry.ec_numbers for x in r]
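The round trip above works because each side of a 1-n link holds a reference to the other. The following sketch reproduces that navigation pattern with plain dataclasses; DemoEntry and DemoECNumber are hypothetical stand-ins with toy data, not the pyuniprot models.

```python
from dataclasses import dataclass, field


@dataclass
class DemoEntry:
    name: str
    # "n" side of the link; excluded from repr/compare to avoid recursion
    ec_numbers: list = field(default_factory=list, repr=False, compare=False)


@dataclass
class DemoECNumber:
    ec_number: str
    # back-reference to the owning entry ("1" side of the link)
    entry: DemoEntry = field(repr=False, compare=False)


entry = DemoEntry("ENTRY_A")
for number in ("1.1.1.1", "1.1.1.2"):
    entry.ec_numbers.append(DemoECNumber(number, entry))

# Start from the EC number side, as in the query above.
hits = [ec for ec in entry.ec_numbers if ec.ec_number == "1.1.1.1"]

# EC number -> entry
entry_names = [ec.entry.name for ec in hits]

# EC number -> entry -> all EC numbers of that entry
sibling_numbers = [[e.ec_number for e in ec.entry.ec_numbers] for ec in hits]
```

Going back through the entry returns every EC number attached to it, not only the one that was searched for.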
4.1.7. 7. Entry name is available in almost all methods
Almost all functions accept the parameter entry_name (primary key for UniProt entries), even if it is not part of the model.
import pyuniprot
query = pyuniprot.query()
query.other_gene_name(entry_name='A4_HUMAN')
4.2. entry
import pyuniprot
query = pyuniprot.query()
query.entry(name='1433E_HUMAN', recommended_full_name='14-3-3 protein epsilon', gene_name='YWHAE')
Check documentation of pyuniprot.manager.query.QueryManager.entry() for all available parameters.
4.3. disease
import pyuniprot
query = pyuniprot.query()
query.disease(acronym='AD')
Check documentation of pyuniprot.manager.query.QueryManager.disease() for all available parameters.
4.4. disease_comment
import pyuniprot
query = pyuniprot.query()
query.disease_comment(comment='%Alzheimer%')
Check documentation of pyuniprot.manager.query.QueryManager.disease_comment() for all available parameters.
4.5. other_gene_name
import pyuniprot
query = pyuniprot.query()
query.other_gene_name(entry_name='A4_HUMAN')
Check documentation of pyuniprot.manager.query.QueryManager.other_gene_name() for all available parameters.
4.6. alternative_full_name
import pyuniprot
query = pyuniprot.query()
query.alternative_full_name(name='Alzheimer disease amyloid protein')
Check documentation of pyuniprot.manager.query.QueryManager.alternative_full_name() for all available parameters.
4.7. alternative_short_name
import pyuniprot
query = pyuniprot.query()
query.alternative_short_name(name='Alzheimer disease amyloid protein', entry_name='A4_HUMAN')
Check documentation of pyuniprot.manager.query.QueryManager.alternative_short_name() for all available parameters.
4.8. accession
import pyuniprot
query = pyuniprot.query()
query.accession(accession='P05067', entry_name='A4_HUMAN')
Check documentation of pyuniprot.manager.query.QueryManager.accession() for all available parameters.
4.9. pmid
import pyuniprot
query = pyuniprot.query()
query.pmid(pmid=7644510)
Check documentation of pyuniprot.manager.query.QueryManager.pmid() for all available parameters.
4.10. organism_host
import pyuniprot
query = pyuniprot.query()
query.organism_host(taxid=9606)
# returns 0 results if you have only installed human entries
Check documentation of pyuniprot.manager.query.QueryManager.organism_host() for all available parameters.
4.11. db_reference
import pyuniprot
query = pyuniprot.query()
query.db_reference(type_='EMBL', identifier='U20972')
Check documentation of pyuniprot.manager.query.QueryManager.db_reference() for all available parameters.
4.12. feature
import pyuniprot
query = pyuniprot.query()
query.feature(type_='sequence variant', limit=1)
Check documentation of pyuniprot.manager.query.QueryManager.feature() for all available parameters.
4.13. function
import pyuniprot
query = pyuniprot.query()
query.function(text='%Alzheimer%')
Check documentation of pyuniprot.manager.query.QueryManager.function() for all available parameters.
4.14. keyword
import pyuniprot
query = pyuniprot.query()
r = query.keyword(name='Phagocytosis')[0]  # first matching keyword
# all proteins linked to keyword Phagocytosis
# (assumes the keyword model exposes its n-m link as `entries`)
r.entries
Check documentation of pyuniprot.manager.query.QueryManager.keyword() for all available parameters.
4.15. ec_number
import pyuniprot
query = pyuniprot.query()
query.ec_number(ec_number='1.1.1.1')
Check documentation of pyuniprot.manager.query.QueryManager.ec_number() for all available parameters.
4.16. subcellular_location
import pyuniprot
query = pyuniprot.query()
query.subcellular_location(location='Autophagosome lumen')
Check documentation of pyuniprot.manager.query.QueryManager.subcellular_location() for all available parameters.
4.17. tissue_specificity
import pyuniprot
query = pyuniprot.query()
query.tissue_specificity(comment='%brain%', limit=1)
Check documentation of pyuniprot.manager.query.QueryManager.tissue_specificity() for all available parameters.
4.18. tissue_in_reference
import pyuniprot
query = pyuniprot.query()
query.tissue_in_reference(tissue='Substantia nigra')
Check documentation of pyuniprot.manager.query.QueryManager.tissue_in_reference() for all available parameters.
5. Query properties
5.1. dbreference_types
import pyuniprot
query = pyuniprot.query()
query.dbreference_types
5.2. taxids
import pyuniprot
query = pyuniprot.query()
query.taxids
5.3. datasets
import pyuniprot
query = pyuniprot.query()
query.datasets
5.4. feature_types
import pyuniprot
query = pyuniprot.query()
query.feature_types
5.5. subcellular_locations
import pyuniprot
query = pyuniprot.query()
query.subcellular_locations
5.6. tissues_in_references
import pyuniprot
query = pyuniprot.query()
query.tissues_in_references
5.7. keywords
import pyuniprot
query = pyuniprot.query()
query.keywords
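One possible use of these properties is to check a search value before running a query, for example a feature type before calling query.feature(type_=...). The sketch below assumes a property yields an iterable of strings, which is not confirmed here; check_choice and demo_feature_types are hypothetical names with made-up values.

```python
import difflib


def check_choice(value, allowed):
    """Return value if it is in allowed; otherwise raise ValueError with hints."""
    allowed = list(allowed)
    if value in allowed:
        return value
    hints = difflib.get_close_matches(value, allowed, n=3)
    raise ValueError(f"{value!r} is not a known value; close matches: {hints}")


# Stand-in for the values a property such as query.feature_types might hold.
demo_feature_types = ["sequence variant", "signal peptide", "chain"]

valid = check_choice("chain", demo_feature_types)

try:
    check_choice("sequence varient", demo_feature_types)  # misspelled on purpose
except ValueError as error:
    message = str(error)
```

This catches misspelled values early instead of returning an empty result list.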