The SignalP 5.0 server predicts the presence of signal peptides and the location of their cleavage sites in proteins from Archaea, Gram-positive Bacteria, Gram-negative Bacteria and Eukarya.

    get_signalp5(data, ...)

    # S3 method for character
    get_signalp5(
      data,
      org_type = c("euk", "gram-", "gram+", "archea"),
      splitter = 2500L,
      attempts = 2,
      progress = FALSE,
      ...
    )

    # S3 method for data.frame
    get_signalp5(data, sequence, id, ...)

    # S3 method for list
    get_signalp5(data, ...)

    # S3 method for default
    get_signalp5(data = NULL, sequence, id, ...)

    # S3 method for AAStringSet
    get_signalp5(data, ...)

Argument | Description |
---|---|
data | A data frame with protein amino acid sequences as strings in one column and the corresponding ids in another. Alternatively, a path to a .fasta file with protein sequences. Alternatively, a list with elements of class "SeqFastaAA". |
... | Currently no additional arguments are accepted apart from the ones documented below. |
org_type | One of "euk", "gram-", "gram+" or "archea". Default is "euk". Specifies whether the protein sequences are from Eukarya, Gram-negative Bacteria, Gram-positive Bacteria or Archaea. |
splitter | An integer indicating the number of sequences in each .fasta file sent to the server. Default is 2500. Change only in case of a server-side error. Accepted values are in the range 1 to 5000. |
attempts | Integer, number of attempts if the server is unresponsive. Default is 2. |
progress | Boolean, whether to show messages of the job id for each batch. Default is FALSE. |
sequence | A vector of strings representing protein amino acid sequences, or the appropriate column name if a data.frame is supplied to the data argument. If a .fasta file path or a list with elements of class "SeqFastaAA" is provided to data, this should be left blank. |
id | A vector of strings representing protein identifiers, or the appropriate column name if a data.frame is supplied to the data argument. If a .fasta file path or a list with elements of class "SeqFastaAA" is provided to data, this should be left blank. |
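
To make the effect of `splitter` concrete, the following is a minimal sketch in base R of how a set of sequences could be divided into batches of at most `splitter` entries. It assumes batching is done simply by position; `split_into_batches` is a hypothetical helper for illustration and is not part of the package.

```r
# Hypothetical helper (not part of the package): group sequence indices
# into batches of at most `splitter`, within the documented 1 to 5000 range.
split_into_batches <- function(n_seq, splitter = 2500L) {
  stopifnot(n_seq >= 1, splitter >= 1, splitter <= 5000)
  idx <- seq_len(n_seq)
  # ceiling(idx / splitter) is the batch number of each sequence
  unname(split(idx, ceiling(idx / splitter)))
}

# 6000 sequences with the default splitter give three batches
batches <- split_into_batches(6000L, splitter = 2500L)
lengths(batches)  # 2500 2500 1000
```

Under this assumption, lowering `splitter` only increases the number of files sent to the server; it does not change which sequences are analysed.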
https://services.healthtech.dtu.dk/service.php?SignalP-5.0
If org_type is "euk", a data frame with columns:

Column | Description |
---|---|
id | Character, as from input. |
Prediction | Character, the type of signal peptide predicted: Sec/SPI, Tat/SPI, Sec/SPII or Other if no signal peptide is predicted. |
SP.Sec.SPI | Numeric, marginal probability that the protein contains a Sec N-terminal signal peptide (Sec/SPI). |
Other | Numeric, the probability that the sequence does not have any kind of signal peptide. |
CS_pos | Character, cleavage site position. |
Pr | Numeric, probability of the predicted cleavage site position. |
cleave.site | Character, local amino acid sequence around the predicted cleavage site. |
is.signalp | Logical, did SignalP5 predict the presence of a signal peptide. |
sp.length | Integer, length of the predicted signal peptide. |

If org_type is one of "gram-", "gram+" or "archea", the returned data frame will have two additional columns between `SP.Sec.SPI` and `Other`:
Numeric, marginal probability that the protein contains a Tat N-terminal signal peptide (Tat/SPI).
Numeric, marginal probability that the protein contains a Lipoprotein N-terminal signal peptide (Sec/SPII).
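
As an illustration of how the returned data frame might be used, the sketch below rebuilds three rows of the "euk" example output printed at the end of this page and filters them in base R. The rows are typed in by hand rather than fetched from the server, and the empty strings for `CS_pos` and `cleave.site` in the row without a signal peptide are an assumption about how blanks are stored.

```r
# Three rows copied by hand from the example output (org_type = "euk").
# Empty strings for the row without a signal peptide are an assumption.
pred <- data.frame(
  id = c("ATCG00660.1", "AT2G43600.1", "AT2G19690.2"),
  Prediction = c("OTHER", "SP(Sec/SPI)", "SP(Sec/SPI)"),
  SP.Sec.SPI = c(0.000375, 0.999802, 0.989540),
  Other = c(0.999625, 0.000198, 0.010460),
  CS_pos = c("", "22-23", "28-29"),
  Pr = c(NA, 0.9639, 0.9030),
  cleave.site = c("", "VFS-QN", "ARS-EE"),
  is.signalp = c(FALSE, TRUE, TRUE),
  sp.length = c(NA, 22L, 28L),
  stringsAsFactors = FALSE
)

# Keep only proteins with a predicted signal peptide
with_sp <- pred[pred$is.signalp, ]

# In these rows the signal peptide length equals the first number
# in CS_pos ("22-23" -> 22), i.e. the last residue before the cleavage site
cs_first <- as.integer(sub("-.*$", "", with_sp$CS_pos))
all(cs_first == with_sp$sp.length)  # TRUE
```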
This function creates temporary files in the working directory. If something goes wrong during communication with the server and progress was set to TRUE, the predictions can still be retrieved from `file.path("http://www.cbs.dtu.dk/services/SignalP-5.0/tmp", jobid, "output_protein_type.txt")`, where jobid is the job id shown in the progress messages, e.g. with `read.delim(file.path(...), header = TRUE, skip = 1)`.
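
The recovery step described in this note can be sketched as follows. The job id used here is a hypothetical placeholder, `fetch_results` is an illustrative wrapper that is not part of the package, and the actual download is left commented out because it needs network access and a job that still exists on the server.

```r
# Hypothetical job id, standing in for one printed when progress = TRUE
jobid <- "EXAMPLE_JOB_ID"

# Base URL as given in the note above
base_url <- "http://www.cbs.dtu.dk/services/SignalP-5.0/tmp"

# file.path() joins its arguments with "/", which also works for URLs
result_url <- file.path(base_url, jobid, "output_protein_type.txt")
result_url

# Illustrative wrapper around the read.delim() call from the note
fetch_results <- function(url) {
  read.delim(url, header = TRUE, skip = 1)
}

# Requires network access and a valid job id:
# pred_raw <- fetch_results(result_url)
```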
Almagro Armenteros JJ, Tsirigos KD, Sønderby CK, Petersen TN, Winther O, Brunak S, von Heijne G, Nielsen H. (2019) SignalP 5.0 improves signal peptide predictions using deep neural networks. Nature Biotechnology, 37:420-423, doi:10.1038/s41587-019-0036-z

    #>             id  Prediction SP.Sec.SPI    Other CS_pos     Pr cleave.site
    #> 1  ATCG00660.1       OTHER   0.000375 0.999625            NA
    #> 2  AT2G43600.1 SP(Sec/SPI)   0.999802 0.000198  22-23 0.9639      VFS-QN
    #> 3  AT2G28410.1 SP(Sec/SPI)   0.990424 0.009576  22-23 0.8897      ALA-QD
    #> 4  AT2G22960.1 SP(Sec/SPI)   0.998142 0.001858  22-23 0.9424      AES-GS
    #> 5  AT2G19580.1       OTHER   0.264792 0.735208            NA
    #> 6  AT2G19690.2 SP(Sec/SPI)   0.989540 0.010460  28-29 0.9030      ARS-EE
    #> 7  AT2G19690.1 SP(Sec/SPI)   0.989540 0.010460  28-29 0.9030      ARS-EE
    #> 8  AT2G33130.1 SP(Sec/SPI)   0.959278 0.040722  26-27 0.5119      VVG-SR
    #> 9  AT2G05520.1 SP(Sec/SPI)   0.999493 0.000507  23-24 0.5666      VAA-AS
    #> 10 AT2G05520.2 SP(Sec/SPI)   0.999493 0.000507  23-24 0.5666      VAA-AS
    #>    is.signalp sp.length
    #> 1       FALSE        NA
    #> 2        TRUE        22
    #> 3        TRUE        22
    #> 4        TRUE        22
    #> 5       FALSE        NA
    #> 6        TRUE        28
    #> 7        TRUE        28
    #> 8        TRUE        26
    #> 9        TRUE        23
    #> 10       TRUE        23
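
The call that produced the output above is not shown, so the following is only a sketch of how a request of this kind might be made with the default method, which takes `sequence` and `id` as vectors. It assumes the function is provided by the ragp package; the two sequences and their ids are made-up placeholders, not real proteins, and `run_remote` is a hypothetical switch that keeps the server request off unless enabled.

```r
# Made-up placeholder input (not real proteins): ids and sequences
toy <- data.frame(
  id = c("toy_protein_1", "toy_protein_2"),
  sequence = c(
    "MKKLLFAALLLSLAVSASAQDTPEKVTLHVGDSATLKCEGSG",
    "MSDNGTPEQKLRAEIEALRKENEELRGHHSDYKDDDDKAAAG"
  ),
  stringsAsFactors = FALSE
)

# The request needs network access and the package that provides
# get_signalp5 (assumed here to be ragp), so it is off by default.
run_remote <- FALSE
pred <- NULL
if (run_remote && requireNamespace("ragp", quietly = TRUE)) {
  pred <- ragp::get_signalp5(
    data = NULL,              # default method: vectors are passed directly
    sequence = toy$sequence,
    id = toy$id,
    org_type = "euk",
    splitter = 2500L,
    attempts = 2,
    progress = TRUE           # show the job id for each batch
  )
}
```

Setting `progress = TRUE` is a deliberate choice in this sketch: the job ids it prints are what the note on recovering predictions after a failed request relies on.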