The SignalP 5.0 server predicts the presence of signal peptides and the location of their cleavage sites in proteins from Archaea, Gram-positive Bacteria, Gram-negative Bacteria and Eukarya.

get_signalp5(data, ...)

# S3 method for character
get_signalp5(
  data,
  org_type = c("euk", "gram-", "gram+", "archea"),
  splitter = 2500L,
  attempts = 2,
  progress = FALSE,
  ...
)

# S3 method for data.frame
get_signalp5(data, sequence, id, ...)

# S3 method for list
get_signalp5(data, ...)

# S3 method for default
get_signalp5(data = NULL, sequence, id, ...)

# S3 method for AAStringSet
get_signalp5(data, ...)

Arguments

data

A data frame with protein amino acid sequences as strings in one column and the corresponding ids in another. Alternatively, a path to a .fasta file with protein sequences, a list with elements of class SeqFastaAA resulting from a read.fasta call, or an AAStringSet object. Should be left blank if vectors are provided to the sequence and id arguments.

...

Currently no additional arguments are accepted apart from the ones documented below.

org_type

One of "euk", "gram-", "gram+" or "archea". Default is "euk". Indicates whether the protein sequences are from Eukarya, Gram-negative Bacteria, Gram-positive Bacteria or Archaea, respectively.

splitter

An integer indicating the number of sequences in each .fasta file sent to the server. Default is 2500. Change only in case of a server-side error. Accepted values are in the range of 1 to 5000.
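
The effect of splitter can be pictured with a minimal sketch in base R. It only illustrates how a set of sequences would fall into batches of at most splitter sequences; `batch_indices` is a hypothetical helper written for this illustration and is not claimed to be how the package splits its input internally.

```r
# Sketch only: group sequence indices into batches of at most `splitter`.
# `batch_indices` is a hypothetical helper, not a function from ragp.
batch_indices <- function(n_seq, splitter = 2500L) {
  # the documented accepted range for splitter is 1 to 5000
  stopifnot(splitter >= 1L, splitter <= 5000L)
  idx <- seq_len(n_seq)
  # indices 1..splitter form batch 1, the next `splitter` form batch 2, and so on
  split(idx, ceiling(idx / splitter))
}

# 6000 sequences with the default splitter fall into three batches
batches <- batch_indices(n_seq = 6000L, splitter = 2500L)
lengths(batches)
```

With the default of 2500, only the last batch is smaller than the others. Lowering splitter produces more, smaller submissions.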

attempts

Integer, number of attempts made if the server is unresponsive. Default is 2.

progress

Boolean, whether to show messages of the job id for each batch. Default is FALSE.

sequence

A vector of strings representing protein amino acid sequences, or the appropriate column name if a data.frame is supplied to the data argument. If a .fasta file path or a list with elements of class "SeqFastaAA" is provided to data, this should be left blank.

id

A vector of strings representing protein identifiers, or the appropriate column name if a data.frame is supplied to the data argument. If a .fasta file path or a list with elements of class "SeqFastaAA" is provided to data, this should be left blank.
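
As a sketch of the two input styles described above, the snippet below builds a small data frame and shows the corresponding calls. The data frame `toy`, its column names and its two sequences are made up for illustration and are not real proteins. The calls need the ragp package and a reachable SignalP 5.0 server, so they are guarded and run only in an interactive session.

```r
# Hypothetical toy input: made-up identifiers and sequences, for illustration only.
toy <- data.frame(
  protein_id = c("toy_1", "toy_2"),
  aa_seq = c("MKTLLLTLVVVTIVCLDLGYT", "MSDNGELEDKPPAPPVRMSS"),
  stringsAsFactors = FALSE
)

# Guarded: requires ragp and network access to the SignalP 5.0 server.
if (interactive() && requireNamespace("ragp", quietly = TRUE)) {
  # data frame input: sequence and id are given as column names
  pred_df <- ragp::get_signalp5(data = toy, sequence = aa_seq, id = protein_id)

  # vector input: data is left blank, sequence and id are character vectors
  pred_vec <- ragp::get_signalp5(sequence = toy$aa_seq, id = toy$protein_id)
}
```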

Source

https://services.healthtech.dtu.dk/service.php?SignalP-5.0

Value

If org_type is "euk", a data frame with columns:

id

Character, as from input

Prediction

Character, the type of signal peptide predicted: Sec/SPI, Tat/SPI, Sec/SPII or Other if no signal peptide is predicted.

SP.Sec.SPI

Numeric, marginal probability that the protein contains a Sec N-terminal signal peptide (Sec/SPI).

Other

Numeric, the probability that the sequence does not have any kind of signal peptide.

CS_pos

Character, cleavage site position.

Pr

Numeric, probability of the predicted cleavage site position.

cleave.site

Character, local amino acid sequence around the predicted cleavage site.

is.signalp

Logical, whether SignalP 5.0 predicted the presence of a signal peptide.

sp.length

Integer, length of the predicted signal peptide.

If org_type is one of "gram-", "gram+" or "archea", the returned data frame will have two additional columns between `SP.Sec.SPI` and `Other`:

TAT.Tat.SPI

Numeric, marginal probability that the protein contains a Tat N-terminal signal peptide (Tat/SPI).

LIPO.Sec.SPII

Numeric, marginal probability that the protein contains a Lipoprotein N-terminal signal peptide (Sec/SPII).
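
A short sketch of working with the returned columns, assuming a result shaped like the "euk" output described above. The mock data frame reuses three rows from the Examples output on this page so that no server call is needed. The empty strings for missing `CS_pos` and `cleave.site` values and the 0.9 cutoff on `Pr` are assumptions made for illustration.

```r
# Mock "euk" result: three rows copied from the Examples output on this page.
# Empty strings for missing CS_pos / cleave.site are an assumption.
sp5_pred <- data.frame(
  id = c("ATCG00660.1", "AT2G43600.1", "AT2G33130.1"),
  Prediction = c("OTHER", "SP(Sec/SPI)", "SP(Sec/SPI)"),
  SP.Sec.SPI = c(0.000375, 0.999802, 0.959278),
  Other = c(0.999625, 0.000198, 0.040722),
  CS_pos = c("", "22-23", "26-27"),
  Pr = c(NA, 0.9639, 0.5119),
  cleave.site = c("", "VFS-QN", "VVG-SR"),
  is.signalp = c(FALSE, TRUE, TRUE),
  sp.length = c(NA, 22L, 26L),
  stringsAsFactors = FALSE
)

# Keep only proteins with a predicted signal peptide
secreted <- sp5_pred[sp5_pred$is.signalp, ]

# Optionally require a confident cleavage site (0.9 is an arbitrary cutoff)
confident <- secreted[secreted$Pr >= 0.9, ]
confident$id
```

Filtering on `is.signalp` first drops the rows where `Pr` is NA, so the second subset does not produce NA rows.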

Note

This function creates temporary files in the working directory. If something goes wrong during communication with the server and progress was set to TRUE, predictions can still be obtained from `file.path("http://www.cbs.dtu.dk/services/SignalP-5.0/tmp", jobid, "output_protein_type.txt")`, e.g. with `read.delim(file.path(...), header = TRUE, skip = 1)`.
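
The recovery step from this note can be sketched as follows. The value of `jobid` is a hypothetical placeholder; in practice it would be the job id printed for the failed batch when progress = TRUE. The read step is left commented out because it only works while the job output still exists on the server.

```r
# `jobid` is a hypothetical placeholder, not a real job id.
jobid <- "EXAMPLE_JOB_ID"

# Build the result location exactly as described in the note
result_url <- file.path(
  "http://www.cbs.dtu.dk/services/SignalP-5.0/tmp",
  jobid,
  "output_protein_type.txt"
)
result_url

# Requires network access and an existing job on the server:
# pred <- read.delim(result_url, header = TRUE, skip = 1)
```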

References

Almagro Armenteros JJ, Tsirigos KD, Sønderby CK, Petersen TN, Winther O, Brunak S, von Heijne G, Nielsen H. (2019) SignalP 5.0 improves signal peptide predictions using deep neural networks. Nature Biotechnology, 37:420-423, doi:10.1038/s41587-019-0036-z

Examples

library(ragp)
sp5_pred <- get_signalp5(data = at_nsp[1:10,], sequence, Transcript.id)
sp5_pred
#>             id  Prediction SP.Sec.SPI    Other CS_pos     Pr cleave.site
#> 1  ATCG00660.1       OTHER   0.000375 0.999625            NA
#> 2  AT2G43600.1 SP(Sec/SPI)   0.999802 0.000198  22-23 0.9639      VFS-QN
#> 3  AT2G28410.1 SP(Sec/SPI)   0.990424 0.009576  22-23 0.8897      ALA-QD
#> 4  AT2G22960.1 SP(Sec/SPI)   0.998142 0.001858  22-23 0.9424      AES-GS
#> 5  AT2G19580.1       OTHER   0.264792 0.735208            NA
#> 6  AT2G19690.2 SP(Sec/SPI)   0.989540 0.010460  28-29 0.9030      ARS-EE
#> 7  AT2G19690.1 SP(Sec/SPI)   0.989540 0.010460  28-29 0.9030      ARS-EE
#> 8  AT2G33130.1 SP(Sec/SPI)   0.959278 0.040722  26-27 0.5119      VVG-SR
#> 9  AT2G05520.1 SP(Sec/SPI)   0.999493 0.000507  23-24 0.5666      VAA-AS
#> 10 AT2G05520.2 SP(Sec/SPI)   0.999493 0.000507  23-24 0.5666      VAA-AS
#>    is.signalp sp.length
#> 1       FALSE        NA
#> 2        TRUE        22
#> 3        TRUE        22
#> 4        TRUE        22
#> 5       FALSE        NA
#> 6        TRUE        28
#> 7        TRUE        28
#> 8        TRUE        26
#> 9        TRUE        23
#> 10       TRUE        23