Plots a diagram of protein structure based on several types of annotations and predictions.

plot_prot(
  sequence,
  id,
  hyp_col = "#868686FF",
  gpi_col = "#0073C2FF",
  nsp_col = "#CD534CFF",
  ag_col = "#E5E5E5FF",
  tm_col = "#EFC000FF",
  hyp = TRUE,
  gpi = c("bigpi", "predgpi", "netgpi", "none"),
  nsp = c("signalp", "signalp5", "none"),
  ag = TRUE,
  tm = c("phobius", "tmhmm", "none"),
  domain = c("cdd", "hmm", "none"),
  disorder = FALSE,
  hyp_scan = if (ag == TRUE && hyp == TRUE) TRUE else FALSE,
  dom_sort = c("ievalue", "abc", "cba"),
  progress = FALSE,
  gpi_size = 4,
  gpi_shape = 18,
  ...
)

Arguments

sequence

String representing a protein amino acid sequence.

id

String representing a protein identifier. Will be converted using make.names.
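
Because identifiers are converted with make.names, any identifier that is not a syntactically valid R name will be altered. A minimal base R sketch of that conversion, using made-up identifiers (all three are hypothetical and not taken from the package data):

```r
# Hypothetical identifiers, used only to illustrate make.names()
ids <- c("AT1G01010.1", "1abc", "sp|P12345|NAME")

# make.names() prepends "X" to names that start with a digit and
# replaces invalid characters such as "|" with "."
converted <- make.names(ids)
print(converted)
# [1] "AT1G01010.1"    "X1abc"          "sp.P12345.NAME"
```

If identifiers must match between the plot and other tables, applying make.names to those tables as well is one way to keep them aligned.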

hyp_col

Plotting color of predicted hydroxyproline positions. Defaults to '#868686FF'.

gpi_col

Plotting color of the predicted omega site (glycosylphosphatidylinositol attachment). Defaults to '#0073C2FF'.

nsp_col

Plotting color of the N-terminal signal peptide. Defaults to '#CD534CFF'.

ag_col

Plotting color of the AG glycomodule spans. Defaults to '#E5E5E5FF'.

tm_col

Plotting color of the transmembrane regions. Defaults to '#EFC000FF'.

hyp

Boolean, whether hydroxyprolines should be plotted.

gpi

A string indicating whether get_big_pi (gpi = "bigpi"), get_pred_gpi (gpi = "predgpi") or get_netGPI (gpi = "netgpi") should be called to predict omega sites. To turn off omega site prediction, use gpi = "none". Defaults to "netgpi". Alternatively, the output data frame of one of these functions (called with simplify = TRUE) can be supplied.

nsp

A string indicating whether get_signalp5 (nsp = "signalp5") or get_signalp (nsp = "signalp") should be used to obtain N-terminal signal peptide (N-sp) predictions. Alternatively, a data frame containing three columns: a character column "id" indicating the protein id as from input, a logical column "is.signalp" and an integer column "sp.length". To turn off N-sp prediction, use nsp = "none". See get_signalp5 or get_signalp for details.
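
A minimal sketch of a data frame with the three columns described above. The identifier and the signal peptide length are made-up values, and the type checks are only an illustration, not the validation that plot_prot itself performs:

```r
# Hypothetical N-sp prediction for a single protein
nsp_df <- data.frame(
  id = "prot1",          # must match the id passed to plot_prot
  is.signalp = TRUE,     # logical: is a signal peptide predicted
  sp.length = 24L,       # integer: length of the signal peptide
  stringsAsFactors = FALSE
)

# Check that the column names and types follow the description
nsp_ok <- identical(names(nsp_df), c("id", "is.signalp", "sp.length")) &&
  is.character(nsp_df$id) &&
  is.logical(nsp_df$is.signalp) &&
  is.integer(nsp_df$sp.length)
```

A data frame shaped like this would be supplied as nsp = nsp_df in place of the "signalp" or "signalp5" strings.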

ag

Boolean, whether the AG glycomodule spans should be plotted.

tm

A string indicating whether get_phobius (tm = "phobius") or get_tmhmm (tm = "tmhmm") should be used to obtain transmembrane region predictions. Alternatively, a data frame with two columns: a character column "id" indicating the protein id as from input and a "prediction" column containing the topology of the transmembrane regions (for example, "42o81-101i108-126o"). To turn off tm prediction, use tm = "none".
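
To make the topology format concrete, the following base R sketch builds such a two-column data frame from the example string above and extracts the start-end ranges it contains. The identifier "prot1" is made up, and the regular expression is an assumption about how the string can be read, not the parser used inside plot_prot:

```r
# Hypothetical tm input using the topology string from the description
tm_df <- data.frame(
  id = "prot1",
  prediction = "42o81-101i108-126o",
  stringsAsFactors = FALSE
)

# Pull out every "start-end" range from the topology string
spans <- regmatches(
  tm_df$prediction,
  gregexpr("[0-9]+-[0-9]+", tm_df$prediction)
)[[1]]

# Split each range into integer start and end positions
bounds <- do.call(rbind, strsplit(spans, "-", fixed = TRUE))
tm_regions <- data.frame(
  start = as.integer(bounds[, 1]),
  end = as.integer(bounds[, 2])
)
print(tm_regions)
#   start end
# 1    81 101
# 2   108 126
```

Here the two ranges, 81-101 and 108-126, are read as the transmembrane regions, with the "o" and "i" letters marking the side of the membrane between them.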

domain

A string indicating whether get_cdd (domain = "cdd") or get_hmm (domain = "hmm") should be used to obtain domain annotation. Alternatively, a data frame with five columns: a character column "id" indicating the protein id as from input, a character column "acc" indicating the accession of the domain hit, a character column "desc" indicating the description of the domain hit, a numeric column "align_start" indicating the start of the domain hit, and a numeric column "align_end" indicating the end of the domain hit.
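
A minimal sketch of a hand-built domain data frame with the five columns listed above. All values, including the accessions and descriptions, are placeholders invented for illustration, and the checks are an assumption about what a sensible input looks like rather than something plot_prot is documented to enforce:

```r
# Hypothetical domain hits for one protein; all values are placeholders
domain_df <- data.frame(
  id = c("prot1", "prot1"),
  acc = c("DOM0001", "DOM0002"),
  desc = c("hypothetical domain A", "hypothetical domain B"),
  align_start = c(30, 150),
  align_end = c(120, 210),
  stringsAsFactors = FALSE
)

# All five documented columns should be present
required <- c("id", "acc", "desc", "align_start", "align_end")
missing_cols <- setdiff(required, names(domain_df))

# Each hit should start no later than it ends
coords_ok <- all(domain_df$align_start <= domain_df$align_end)
```

A data frame shaped like this would be supplied as domain = domain_df.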

disorder

Boolean, should disordered region predictions obtained using get_espritz be plotted. Alternatively the output data frame from get_espritz (called with simplify = TRUE) can be supplied.

hyp_scan

Boolean. If ag = TRUE, whether scan_ag should be run on predict_hyp output, so that only arabinogalactan motifs which contain predicted hydroxyprolines are scanned.

dom_sort

One of c("ievalue", "abc", "cba"), defaults to "ievalue". Domain plotting order. If "ievalue", domains with the lowest ievalue, as determined by hmmscan, are plotted above the others. If "abc" or "cba", the order is determined by domain names.

progress

Boolean, whether to show the progress bar, at default set to FALSE.

gpi_size

Integer, the size of the gpi symbol. Appropriate values are 1 - 10.

gpi_shape

Integer, the shape of the gpi symbol. Appropriate values are 0 - 25.

...

Appropriate arguments passed to get_signalp5, get_signalp, get_espritz, predict_hyp, get_hmm and scan_ag.

Value

A ggplot2 plot object.

Examples

library(ragp)
library(ggplot2)
ind <- c(23, 5, 80, 81, 345)
pred <- plot_prot(sequence = at_nsp$sequence[ind],
                  id = at_nsp$Transcript.id[ind])
#> sequence vector contains O, O will be considered instead of P
pred + theme(legend.position = "bottom", legend.direction = "vertical")
#alternatively:
nsp <- get_signalp(data = at_nsp[ind,],
                   id = Transcript.id,
                   sequence = sequence)

hmm <- get_hmm(data = at_nsp[ind,], #default is to use get_cdd()
               id = Transcript.id,
               sequence = sequence)

gpi <- get_netGPI(data = at_nsp[ind,],
                  id = Transcript.id,
                  sequence = sequence)

tm <- get_phobius(data = at_nsp[ind,],
                  id = Transcript.id,
                  sequence = sequence)

disorder <- get_espritz(data = at_nsp[ind,],
                        id = Transcript.id,
                        sequence = sequence)

pred2 <- plot_prot(sequence = at_nsp$sequence[ind],
                   id = at_nsp$Transcript.id[ind],
                   tm = tm,
                   nsp = nsp,
                   gpi = gpi,
                   domain = hmm,
                   disorder = disorder)
#> sequence vector contains O, O will be considered instead of P
pred2 + theme(legend.position = "bottom", legend.direction = "vertical")
#mixing both methods is also a possibility