The TMHMM server predicts transmembrane helices in proteins.

get_tmhmm(data, ...)

# S3 method for character
get_tmhmm(data, splitter = 2500L, attempts = 2, progress = FALSE, ...)

# S3 method for data.frame
get_tmhmm(data, sequence, id, ...)

# S3 method for list
get_tmhmm(data, ...)

# S3 method for default
get_tmhmm(data = NULL, sequence, id, ...)

# S3 method for AAStringSet
get_tmhmm(data, ...)

Arguments

data

A data frame with protein amino acid sequences as strings in one column and the corresponding ids in another. Alternatively, a path to a .fasta file with protein sequences, a list with elements of class SeqFastaAA resulting from a read.fasta call, or an AAStringSet object. Should be left blank if vectors are provided to the sequence and id arguments.

...

Currently no additional arguments are accepted apart from the ones documented below.

splitter

An integer giving the number of sequences in each .fasta file sent to the server. Default is 2500. Change only in case of a server-side error. Accepted values range from 1 to 10000.

attempts

Integer, number of attempts to make if the server is unresponsive. Default is 2.

progress

Boolean, whether to show messages with the job id for each batch. Default is FALSE.

sequence

A vector of strings representing protein amino acid sequences, or the appropriate column name if a data.frame is supplied to the data argument. If a .fasta file path or a list with elements of class "SeqFastaAA" is provided to data, this should be left blank.

id

A vector of strings representing protein identifiers, or the appropriate column name if a data.frame is supplied to the data argument. If a .fasta file path or a list with elements of class "SeqFastaAA" is provided to data, this should be left blank.
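
To illustrate what the splitter argument controls, the following base R sketch divides a vector of sequences into batches of at most splitter elements. The helper split_into_batches is hypothetical and is not part of ragp; how get_tmhmm batches its input internally is not shown in this documentation.

```r
# Hypothetical helper (not part of ragp): split sequences into batches
# of at most `splitter` elements, mirroring the documented 1 to 10000 range.
split_into_batches <- function(sequences, splitter = 2500L) {
  stopifnot(splitter >= 1L, splitter <= 10000L)
  # Batch index for every sequence: 1, 1, ..., 2, 2, ...
  batch_index <- ceiling(seq_along(sequences) / splitter)
  unname(split(sequences, batch_index))
}

# Made-up input: 6000 copies of a short dummy sequence
seqs <- rep("MKTAYIAKQR", 6000)
batches <- split_into_batches(seqs, splitter = 2500L)
lengths(batches)
#> [1] 2500 2500 1000
```

With the default of 2500, an input of 6000 sequences would therefore be sent as three files under this scheme.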

Source

https://services.healthtech.dtu.dk/service.php?TMHMM-2.0

Value

A data frame with columns:

id

Character, name of the submitted sequence.

length

Integer, length of the protein sequence.

ExpAA

Numeric, the expected number of amino acids in transmembrane helices.

First60

Numeric, the expected number of amino acids in transmembrane helices in the first 60 amino acids of the protein.

tm

Integer, the number of predicted transmembrane segments.

prediction

Character string, predicted topology of the protein.
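
The prediction column can be unpacked further. The sketch below assumes the usual TMHMM notation, in which i and o mark loops inside and outside the membrane and each start-end pair is one predicted helix. The helper parse_topology is hypothetical and not part of ragp.

```r
# Hypothetical helper (not part of ragp): extract predicted helix
# positions from a single TMHMM topology string.
parse_topology <- function(prediction) {
  # Every "start-end" pair in the string is assumed to be one helix
  ranges <- regmatches(prediction, gregexpr("[0-9]+-[0-9]+", prediction))[[1]]
  if (length(ranges) == 0) {
    # Strings such as "i" or "o" contain no helices
    return(data.frame(start = integer(0), end = integer(0)))
  }
  bounds <- strsplit(ranges, "-", fixed = TRUE)
  data.frame(
    start = as.integer(vapply(bounds, `[`, character(1), 1)),
    end   = as.integer(vapply(bounds, `[`, character(1), 2))
  )
}

parse_topology("i9-31o41-63i70-92o231-253i")
#>   start end
#> 1     9  31
#> 2    41  63
#> 3    70  92
#> 4   231 253
```

The number of rows returned should agree with the tm column for the same protein.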

Note

This function creates temporary files in the working directory. If something goes wrong during communication with the server and progress was set to TRUE, the predictions can still be retrieved from the web address `paste("https://services.healthtech.dtu.dk/cgi-bin/webface2.cgi?jobid=", jobid, "&wait=20", sep = "")`, where jobid is the job id shown in the progress messages for the affected batch.
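
As a small sketch, the address above can be wrapped in a helper. The function name tmhmm_result_url and the job id "EXAMPLEJOBID" are placeholders for illustration; a real job id would come from the progress messages, and whether the server still answers at this address is not guaranteed.

```r
# Hypothetical helper (not part of ragp): build the result URL for a job id
tmhmm_result_url <- function(jobid) {
  paste("https://services.healthtech.dtu.dk/cgi-bin/webface2.cgi?jobid=",
        jobid, "&wait=20", sep = "")
}

# "EXAMPLEJOBID" is a placeholder, not a real job id
tmhmm_result_url("EXAMPLEJOBID")
#> [1] "https://services.healthtech.dtu.dk/cgi-bin/webface2.cgi?jobid=EXAMPLEJOBID&wait=20"
```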

References

Krogh A, Larsson B, von Heijne G, Sonnhammer EL (2001) Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol 305(3):567-80.

Examples

library(ragp)
tmhmm_pred <- get_tmhmm(data = at_nsp[1:10,], sequence, Transcript.id)
tmhmm_pred
#>             id length ExpAA First60 tm                 prediction
#> 1  ATCG00660.1    117 12.93    0.59  0                          i
#> 2  AT2G43600.1    273  6.17    6.13  0                          o
#> 3  AT2G28410.1    115 12.94    9.53  0                          o
#> 4  AT2G22960.1    184  0.01    0.01  0                          o
#> 5  AT2G19580.1    270 90.00   41.19  4 i9-31o41-63i70-92o231-253i
#> 6  AT2G19690.2    148 21.89   19.81  1                     i7-26o
#> 7  AT2G19690.1    147 20.76   19.79  1                     i7-26o
#> 8  AT2G33130.1    103 20.43   20.43  1                     i7-26o
#> 9  AT2G05520.1    145 20.20   20.20  1                     i7-29o
#> 10 AT2G05520.2    138 20.28   20.27  1                     i7-29o