1 Introduction

A sequence logo, based on information theory, is widely used as a graphical representation of sequence conservation (a motif) in multiple amino acid or nucleic acid sequences. A sequence motif represents conserved characteristics such as DNA binding sites, where transcription factors bind, and catalytic sites in enzymes. Although many tools, such as seqlogo [1], have been developed to create sequence motifs and to represent each one as an individual sequence logo, software tools for depicting the relationships among multiple sequence motifs are still lacking. We developed a flexible and powerful open-source R/Bioconductor package, motifStack, for visualizing the alignment of multiple sequence motifs.

2 Prepare environment

You will need Ghostscript. The full path to the executable can be set using the environment variable R_GSCMD. If this is unset, R looks for an executable named gs on your path (Unix, Linux or Mac) or uses the value of the environment variable GSC (Windows). If GSC is also unset on Windows, the Ghostscript executables "gswin64c.exe" and "gswin32c.exe" are tried in turn.

Here is an example for Windows, assuming that gswin32c.exe is installed in C:\Program Files\gs\gs9.06\bin. In an R session, run:

Sys.setenv(R_GSCMD=file.path("C:", "Program Files", "gs", 
                             "gs9.06", "bin", "gswin32c.exe"))
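
Before plotting, it can help to confirm that R can actually locate Ghostscript. The following is a minimal sketch, not part of the original instructions, that relies on find_gs_cmd() from R's bundled tools package; that function appears to follow the same lookup order as described above and returns an empty string when nothing is found. The variable name gs_path is only illustrative.

```r
# Ask R which Ghostscript executable it would use.
# With no argument, find_gs_cmd() consults R_GSCMD first and then
# falls back to the platform defaults.
gs_path <- tools::find_gs_cmd()

if (nzchar(gs_path)) {
  message("Ghostscript found at: ", gs_path)
} else {
  message("Ghostscript not found; set R_GSCMD to the full path of the executable.")
}
```

If the second message appears, setting R_GSCMD as in the example above and rerunning the check should show whether the path is correct.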

3 Examples of using motifStack

3.1 Plot a DNA sequence logo with different fonts and colors

Users can select different fonts and colors to draw the sequence logo.

library(motifStack)
#read the example position count matrix shipped with the package
pcm <- read.table(file.path(find.package("motifStack"), 
                            "extdata", "bin_SOLEXA.pcm"))
pcm <- pcm[,3:ncol(pcm)]
rownames(pcm) <- c("A","C","G","T")
motif <- new("pcm", mat=as.matrix(pcm), name="bin_SOLEXA")
##pfm object
#motif <- pcm2pfm(pcm)
#motif <- new("pfm", mat=motif, name="bin_SOLEXA")
#plot the logo with same height
plot(motif, ic.scale=FALSE, ylab="probability")
#try a different font
plot(motif, font="mono,Courier")
#try a different font and a different color group
motif@color <- colorset(colorScheme='basepairing')
plot(motif, font="Times")

Figure 1: Plot a DNA sequence logo with different fonts and colors
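
The first plot above uses ic.scale=FALSE to draw every position at the same height with a probability axis, whereas the default scales each position by its information content. The following base R sketch illustrates that difference on a toy count matrix. The data and variable names are hypothetical, and the calculation omits any small-sample correction, so it may not match motifStack's internal computation exactly.

```r
# Toy position count matrix (hypothetical data):
# rows are bases, columns are motif positions.
counts <- matrix(c(12, 0, 0, 0,   # position 1: only A observed
                    6, 6, 0, 0,   # position 2: A and C equally likely
                    3, 3, 3, 3),  # position 3: all bases equally likely
                 nrow = 4,
                 dimnames = list(c("A", "C", "G", "T"), NULL))

# Convert counts to frequencies so that each column sums to 1.
pfm <- sweep(counts, 2, colSums(counts), "/")

# Information content per position in bits:
# log2(4) minus the Shannon entropy of the column (zero frequencies are skipped).
ic <- apply(pfm, 2, function(p) {
  p <- p[p > 0]
  log2(4) + sum(p * log2(p))
})

# Letter heights when positions are scaled by information content.
heights <- sweep(pfm, 2, ic, "*")

print(round(ic, 3))
```

With these counts the fully conserved position carries 2 bits, the two-base position carries 1 bit, and the uniform position carries 0 bits, which is why conserved positions appear taller in an information-scaled logo.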