


# Genes with Zelda motif

**Author**: Jeff Johnston

**Date**: 2013-07-19 23:30:06

## Background

The transcription factor Zelda binds to "TAGteam" sites and plays a role in zygotic genome activation in *Drosophila* (<a href="http://dx.doi.org/10.1038/nature07388">Liang et al. 2008</a>).

## Overview

In this analysis, we use Bioconductor (<a href="http://dx.doi.org/10.1186/gb-2004-5-10-r80">Gentleman et al. 2004</a>) tools to identify genes with multiple TAGteam motifs in their promoters and then perform a Gene Ontology (GO) enrichment analysis on these genes.
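
The motif-counting step can be illustrated with a minimal Python sketch. It is not the Bioconductor code behind this report. The heptamer `CAGGTAG` is assumed here as the TAGteam motif, counting on both strands is an assumption, and the promoter sequence is made up for illustration.

```python
# Illustrative sketch only: the motif, the both-strand counting rule, and the
# sequence below are assumptions, not values taken from this analysis.
ASSUMED_MOTIF = "CAGGTAG"  # assumed TAGteam heptamer

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_motif(seq: str, motif: str = ASSUMED_MOTIF) -> int:
    """Count motif occurrences on both strands, allowing overlapping hits."""
    seq = seq.upper()
    # A set is used so that a palindromic motif would not be counted twice.
    targets = {motif, reverse_complement(motif)}
    width = len(motif)
    return sum(seq[i:i + width] in targets for i in range(len(seq) - width + 1))


toy_promoter = "TTCAGGTAGAACTACCTGAA"  # made-up sequence, one hit per strand
print(count_motif(toy_promoter))  # 2
```

Applying such a function to each transcript's promoter sequence would give one motif count per transcript.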

## Distribution of Zelda motifs in promoters










The annotation contains 23017 transcripts for 14869 genes, so we keep a single transcript per gene: the one whose promoter has the highest Zelda motif count.
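
As a rough illustration of this selection step, the Python sketch below keeps the transcript with the highest motif count for each gene. The records and identifiers are hypothetical, and the tie-breaking rule (keep the first transcript seen) is an assumption, since the report does not say how ties are resolved.

```python
# Hypothetical (transcript_id, gene_id, motif_count) records; all values are made up.
records = [
    ("tx1", "geneA", 0),
    ("tx2", "geneA", 3),
    ("tx3", "geneB", 1),
    ("tx4", "geneC", 2),
    ("tx5", "geneC", 2),
]


def best_transcript_per_gene(rows):
    """Map each gene to the (transcript_id, motif_count) pair with the highest count."""
    best = {}
    for transcript_id, gene_id, count in rows:
        # Strict ">" keeps the first transcript seen when counts tie (an assumption).
        if gene_id not in best or count > best[gene_id][1]:
            best[gene_id] = (transcript_id, count)
    return best


selected = best_transcript_per_gene(records)
print(selected)
```

After this step every gene contributes exactly one motif count, which is what the distribution is built from.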




The distribution of genes by Zelda motif count is plotted below:

![Figure 1.](zelda_analysis_figures/plot_histogram.png) 


<!-- html table generated in R 3.0.1 by xtable 1.7-1 package -->
<!-- Fri Jul 19 23:30:24 2013 -->
<TABLE border=1>
<TR> <TH> Zelda motif count </TH> <TH> Number of genes </TH>  </TR>
  <TR> <TD> 0 </TD> <TD align="right"> 7024 </TD> </TR>
  <TR> <TD> 1 </TD> <TD align="right"> 5075 </TD> </TR>
  <TR> <TD> 2 </TD> <TD align="right"> 1992 </TD> </TR>
  <TR> <TD> 3 </TD> <TD align="right"> 547 </TD> </TR>
  <TR> <TD> 4 </TD> <TD align="right"> 185 </TD> </TR>
  <TR> <TD> 5 </TD> <TD align="right"> 31 </TD> </TR>
  <TR> <TD> 6 </TD> <TD align="right"> 12 </TD> </TR>
  <TR> <TD> 7 </TD> <TD align="right"> 3 </TD> </TR>
   </TABLE>
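
The counts in this table can be checked with a few lines of Python. The numbers are copied from the table; the function name is only illustrative.

```python
# Number of genes per Zelda motif count, copied from the table above.
genes_by_motif_count = {0: 7024, 1: 5075, 2: 1992, 3: 547, 4: 185, 5: 31, 6: 12, 7: 3}

total_genes = sum(genes_by_motif_count.values())


def genes_with_at_least(threshold: int) -> int:
    """Number of genes whose motif count is at least `threshold`."""
    return sum(n for count, n in genes_by_motif_count.items() if count >= threshold)


print(total_genes)             # 14869
print(genes_with_at_least(4))  # 231
```

The total matches the 14869 genes in the annotation, and 231 genes carry four or more motifs.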


## GO analysis of genes with multiple Zelda motifs

Based on the distribution above, we select the genes with at least 4 Zelda motifs and test them for enriched GO categories using GOstats (<a href="http://dx.doi.org/10.1093/bioinformatics/btl567">Falcon & Gentleman, 2006</a>).
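
GOstats tests over-representation with a hypergeometric test. The Python sketch below shows the plain, unconditional form of that test together with the expected count. It is a simplified illustration, not the GOstats implementation, and the numbers in the example are toy values, because the size of the gene universe used in this analysis is not stated in the report.

```python
from math import comb


def expected_count(universe: int, category_size: int, selected: int) -> float:
    """Expected number of selected genes in a category under random sampling."""
    return selected * category_size / universe


def hypergeom_upper_tail(universe: int, category_size: int,
                         selected: int, observed: int) -> float:
    """P(X >= observed) when `selected` genes are drawn without replacement
    from `universe` genes, of which `category_size` belong to the category."""
    total = comb(universe, selected)
    upper = min(category_size, selected)
    hits = sum(
        comb(category_size, k) * comb(universe - category_size, selected - k)
        for k in range(observed, upper + 1)
    )
    return hits / total


# Toy values (assumptions): 1000 genes in the universe, 50 in the category,
# 20 genes selected, 5 of which fall in the category.
print(expected_count(1000, 50, 20))  # 1.0
print(hypergeom_upper_tail(1000, 50, 20, 5))
```

A small upper-tail probability means the category contains more of the selected genes than expected by chance.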




Below are the over-represented GO biological process categories among these 231 genes (p < 0.001). P-values are rounded to five decimal places, so 0.00000 indicates p < 0.000005:

<!-- html table generated in R 3.0.1 by xtable 1.7-1 package -->
<!-- Fri Jul 19 23:31:14 2013 -->
<TABLE border=1>
<TR> <TH> GOBPID </TH> <TH> Pvalue </TH> <TH> OddsRatio </TH> <TH> ExpCount </TH> <TH> Count </TH> <TH> Size </TH> <TH> Term </TH>  </TR>
  <TR> <TD> GO:0006334 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 12.40 </TD> <TD align="right"> 2.4 </TD> <TD align="right"> 22 </TD> <TD align="right"> 130 </TD> <TD> nucleosome assembly </TD> </TR>
  <TR> <TD> GO:0031497 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 11.43 </TD> <TD align="right"> 2.5 </TD> <TD align="right"> 22 </TD> <TD align="right"> 139 </TD> <TD> chromatin assembly </TD> </TR>
  <TR> <TD> GO:0034728 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 11.33 </TD> <TD align="right"> 2.6 </TD> <TD align="right"> 22 </TD> <TD align="right"> 140 </TD> <TD> nucleosome organization </TD> </TR>
  <TR> <TD> GO:0006333 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 10.96 </TD> <TD align="right"> 2.6 </TD> <TD align="right"> 22 </TD> <TD align="right"> 144 </TD> <TD> chromatin assembly or disassembly </TD> </TR>
  <TR> <TD> GO:0065004 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 10.36 </TD> <TD align="right"> 2.8 </TD> <TD align="right"> 22 </TD> <TD align="right"> 151 </TD> <TD> protein-DNA complex assembly </TD> </TR>
  <TR> <TD> GO:0071824 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 9.26 </TD> <TD align="right"> 3.0 </TD> <TD align="right"> 22 </TD> <TD align="right"> 166 </TD> <TD> protein-DNA complex subunit organization </TD> </TR>
  <TR> <TD> GO:0006323 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 8.54 </TD> <TD align="right"> 3.2 </TD> <TD align="right"> 22 </TD> <TD align="right"> 178 </TD> <TD> DNA packaging </TD> </TR>
  <TR> <TD> GO:0071103 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 8.06 </TD> <TD align="right"> 3.4 </TD> <TD align="right"> 22 </TD> <TD align="right"> 187 </TD> <TD> DNA conformation change </TD> </TR>
  <TR> <TD> GO:0034622 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 5.83 </TD> <TD align="right"> 5.0 </TD> <TD align="right"> 24 </TD> <TD align="right"> 274 </TD> <TD> cellular macromolecular complex assembly </TD> </TR>
  <TR> <TD> GO:0065003 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 5.25 </TD> <TD align="right"> 5.5 </TD> <TD align="right"> 24 </TD> <TD align="right"> 301 </TD> <TD> macromolecular complex assembly </TD> </TR>
  <TR> <TD> GO:0006325 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 5.35 </TD> <TD align="right"> 5.1 </TD> <TD align="right"> 23 </TD> <TD align="right"> 282 </TD> <TD> chromatin organization </TD> </TR>
  <TR> <TD> GO:0044260 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 2.37 </TD> <TD align="right"> 56.3 </TD> <TD align="right"> 92 </TD> <TD align="right"> 3088 </TD> <TD> cellular macromolecule metabolic process </TD> </TR>
  <TR> <TD> GO:0043933 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 3.84 </TD> <TD align="right"> 8.3 </TD> <TD align="right"> 27 </TD> <TD align="right"> 454 </TD> <TD> macromolecular complex subunit organization </TD> </TR>
  <TR> <TD> GO:0051276 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 3.83 </TD> <TD align="right"> 7.6 </TD> <TD align="right"> 25 </TD> <TD align="right"> 418 </TD> <TD> chromosome organization </TD> </TR>
  <TR> <TD> GO:0006259 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 3.80 </TD> <TD align="right"> 7.0 </TD> <TD align="right"> 23 </TD> <TD align="right"> 384 </TD> <TD> DNA metabolic process </TD> </TR>
  <TR> <TD> GO:0044237 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 2.07 </TD> <TD align="right"> 72.9 </TD> <TD align="right"> 104 </TD> <TD align="right"> 4002 </TD> <TD> cellular metabolic process </TD> </TR>
  <TR> <TD> GO:0090304 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 2.30 </TD> <TD align="right"> 28.2 </TD> <TD align="right"> 53 </TD> <TD align="right"> 1547 </TD> <TD> nucleic acid metabolic process </TD> </TR>
  <TR> <TD> GO:0043170 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 2.02 </TD> <TD align="right"> 69.0 </TD> <TD align="right"> 99 </TD> <TD align="right"> 3787 </TD> <TD> macromolecule metabolic process </TD> </TR>
  <TR> <TD> GO:1901360 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 2.15 </TD> <TD align="right"> 35.3 </TD> <TD align="right"> 61 </TD> <TD align="right"> 1938 </TD> <TD> organic cyclic compound metabolic process </TD> </TR>
  <TR> <TD> GO:0046483 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 2.15 </TD> <TD align="right"> 34.5 </TD> <TD align="right"> 60 </TD> <TD align="right"> 1895 </TD> <TD> heterocycle metabolic process </TD> </TR>
  <TR> <TD> GO:0006139 </TD> <TD align="right"> 0.00000 </TD> <TD align="right"> 2.17 </TD> <TD align="right"> 32.9 </TD> <TD align="right"> 58 </TD> <TD align="right"> 1806 </TD> <TD> nucleobase-containing compound metabolic process </TD> </TR>
  <TR> <TD> GO:0006725 </TD> <TD align="right"> 0.00001 </TD> <TD align="right"> 2.12 </TD> <TD align="right"> 34.2 </TD> <TD align="right"> 59 </TD> <TD align="right"> 1878 </TD> <TD> cellular aromatic compound metabolic process </TD> </TR>
  <TR> <TD> GO:0044238 </TD> <TD align="right"> 0.00002 </TD> <TD align="right"> 1.91 </TD> <TD align="right"> 80.3 </TD> <TD align="right"> 108 </TD> <TD align="right"> 4406 </TD> <TD> primary metabolic process </TD> </TR>
  <TR> <TD> GO:0034641 </TD> <TD align="right"> 0.00002 </TD> <TD align="right"> 2.01 </TD> <TD align="right"> 35.7 </TD> <TD align="right"> 59 </TD> <TD align="right"> 1957 </TD> <TD> cellular nitrogen compound metabolic process </TD> </TR>
  <TR> <TD> GO:0022607 </TD> <TD align="right"> 0.00002 </TD> <TD align="right"> 2.68 </TD> <TD align="right"> 11.4 </TD> <TD align="right"> 27 </TD> <TD align="right"> 627 </TD> <TD> cellular component assembly </TD> </TR>
  <TR> <TD> GO:0044085 </TD> <TD align="right"> 0.00004 </TD> <TD align="right"> 2.54 </TD> <TD align="right"> 12.5 </TD> <TD align="right"> 28 </TD> <TD align="right"> 684 </TD> <TD> cellular component biogenesis </TD> </TR>
  <TR> <TD> GO:0007540 </TD> <TD align="right"> 0.00006 </TD> <TD align="right"> 82.18 </TD> <TD align="right"> 0.1 </TD> <TD align="right"> 3 </TD> <TD align="right"> 5 </TD> <TD> sex determination, establishment of X:A ratio </TD> </TR>
  <TR> <TD> GO:0006807 </TD> <TD align="right"> 0.00007 </TD> <TD align="right"> 1.90 </TD> <TD align="right"> 40.4 </TD> <TD align="right"> 63 </TD> <TD align="right"> 2215 </TD> <TD> nitrogen compound metabolic process </TD> </TR>
  <TR> <TD> GO:0071704 </TD> <TD align="right"> 0.00007 </TD> <TD align="right"> 1.82 </TD> <TD align="right"> 85.7 </TD> <TD align="right"> 111 </TD> <TD align="right"> 4701 </TD> <TD> organic substance metabolic process </TD> </TR>
  <TR> <TD> GO:0048598 </TD> <TD align="right"> 0.00010 </TD> <TD align="right"> 3.82 </TD> <TD align="right"> 3.8 </TD> <TD align="right"> 13 </TD> <TD align="right"> 207 </TD> <TD> embryonic morphogenesis </TD> </TR>
  <TR> <TD> GO:0007419 </TD> <TD align="right"> 0.00026 </TD> <TD align="right"> 10.24 </TD> <TD align="right"> 0.6 </TD> <TD align="right"> 5 </TD> <TD align="right"> 32 </TD> <TD> ventral cord development </TD> </TR>
  <TR> <TD> GO:0007538 </TD> <TD align="right"> 0.00027 </TD> <TD align="right"> 15.72 </TD> <TD align="right"> 0.3 </TD> <TD align="right"> 4 </TD> <TD align="right"> 18 </TD> <TD> primary sex determination </TD> </TR>
  <TR> <TD> GO:0010565 </TD> <TD align="right"> 0.00033 </TD> <TD align="right"> Inf </TD> <TD align="right"> 0.0 </TD> <TD align="right"> 2 </TD> <TD align="right"> 2 </TD> <TD> regulation of cellular ketone metabolic process </TD> </TR>
  <TR> <TD> GO:0016331 </TD> <TD align="right"> 0.00060 </TD> <TD align="right"> 4.65 </TD> <TD align="right"> 1.9 </TD> <TD align="right"> 8 </TD> <TD align="right"> 104 </TD> <TD> morphogenesis of embryonic epithelium </TD> </TR>
  <TR> <TD> GO:0045893 </TD> <TD align="right"> 0.00060 </TD> <TD align="right"> 3.31 </TD> <TD align="right"> 4.0 </TD> <TD align="right"> 12 </TD> <TD align="right"> 217 </TD> <TD> positive regulation of transcription, DNA-dependent </TD> </TR>
  <TR> <TD> GO:0045944 </TD> <TD align="right"> 0.00064 </TD> <TD align="right"> 3.77 </TD> <TD align="right"> 2.9 </TD> <TD align="right"> 10 </TD> <TD align="right"> 159 </TD> <TD> positive regulation of transcription from RNA polymerase II promoter </TD> </TR>
  <TR> <TD> GO:0016348 </TD> <TD align="right"> 0.00065 </TD> <TD align="right"> 23.47 </TD> <TD align="right"> 0.2 </TD> <TD align="right"> 3 </TD> <TD align="right"> 10 </TD> <TD> imaginal disc-derived leg joint morphogenesis </TD> </TR>
  <TR> <TD> GO:0036022 </TD> <TD align="right"> 0.00065 </TD> <TD align="right"> 23.47 </TD> <TD align="right"> 0.2 </TD> <TD align="right"> 3 </TD> <TD align="right"> 10 </TD> <TD> limb joint morphogenesis </TD> </TR>
  <TR> <TD> GO:0031325 </TD> <TD align="right"> 0.00072 </TD> <TD align="right"> 2.81 </TD> <TD align="right"> 5.8 </TD> <TD align="right"> 15 </TD> <TD align="right"> 319 </TD> <TD> positive regulation of cellular metabolic process </TD> </TR>
  <TR> <TD> GO:0010628 </TD> <TD align="right"> 0.00077 </TD> <TD align="right"> 3.21 </TD> <TD align="right"> 4.1 </TD> <TD align="right"> 12 </TD> <TD align="right"> 223 </TD> <TD> positive regulation of gene expression </TD> </TR>
  <TR> <TD> GO:0010604 </TD> <TD align="right"> 0.00083 </TD> <TD align="right"> 2.88 </TD> <TD align="right"> 5.3 </TD> <TD align="right"> 14 </TD> <TD align="right"> 290 </TD> <TD> positive regulation of macromolecule metabolic process </TD> </TR>
  <TR> <TD> GO:0040034 </TD> <TD align="right"> 0.00088 </TD> <TD align="right"> 20.53 </TD> <TD align="right"> 0.2 </TD> <TD align="right"> 3 </TD> <TD align="right"> 11 </TD> <TD> regulation of development, heterochronic </TD> </TR>
  <TR> <TD> GO:0009893 </TD> <TD align="right"> 0.00089 </TD> <TD align="right"> 2.75 </TD> <TD align="right"> 5.9 </TD> <TD align="right"> 15 </TD> <TD align="right"> 326 </TD> <TD> positive regulation of metabolic process </TD> </TR>
  <TR> <TD> GO:0048522 </TD> <TD align="right"> 0.00094 </TD> <TD align="right"> 2.26 </TD> <TD align="right"> 10.7 </TD> <TD align="right"> 22 </TD> <TD align="right"> 585 </TD> <TD> positive regulation of cellular process </TD> </TR>
  <TR> <TD> GO:0051254 </TD> <TD align="right"> 0.00097 </TD> <TD align="right"> 3.12 </TD> <TD align="right"> 4.2 </TD> <TD align="right"> 12 </TD> <TD align="right"> 229 </TD> <TD> positive regulation of RNA metabolic process </TD> </TR>
   </TABLE>


## References


- S. Falcon, R. Gentleman (2006) Using GOstats to test gene lists for GO term association. *Bioinformatics* **23** 257-258 [10.1093/bioinformatics/btl567](http://dx.doi.org/10.1093/bioinformatics/btl567)
- Robert C Gentleman, Vincent J Carey, Douglas M Bates, Ben Bolstad, Marcel Dettling, Sandrine Dudoit, Byron Ellis, Laurent Gautier, Yongchao Ge, Jeff Gentry, Kurt Hornik, Torsten Hothorn, Wolfgang Huber, Stefano Iacus, Rafael Irizarry, Friedrich Leisch, Cheng Li, Martin Maechler, Anthony J Rossini, Gunther Sawitzki, Colin Smith, Gordon Smyth, Luke Tierney, Jean YH Yang, Jianhua Zhang (2004) Bioconductor: open software development for computational biology and bioinformatics. *Genome Biology* **5** R80 [10.1186/gb-2004-5-10-r80](http://dx.doi.org/10.1186/gb-2004-5-10-r80)
- Hsiao-Lan Liang, Chung-Yi Nien, Hsiao-Yun Liu, Mark M. Metzstein, Nikolai Kirov, Christine Rushlow (2008) The zinc-finger protein Zelda is a key activator of the early zygotic genome in *Drosophila*. *Nature* **456** 400-403 [10.1038/nature07388](http://dx.doi.org/10.1038/nature07388)


## Session info


```
## R version 3.0.1 Patched (2013-07-10 r63263)
## Platform: x86_64-unknown-linux-gnu (64-bit)
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C         LC_TIME=C           
##  [4] LC_COLLATE=C         LC_MONETARY=C        LC_MESSAGES=C       
##  [7] LC_PAPER=C           LC_NAME=C            LC_ADDRESS=C        
## [10] LC_TELEPHONE=C       LC_MEASUREMENT=C     LC_IDENTIFICATION=C 
## 
## attached base packages:
## [1] parallel  stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] knitcitations_0.4-7                      
##  [2] bibtex_0.3-5                             
##  [3] xtable_1.7-1                             
##  [4] org.Dm.eg.db_2.9.0                       
##  [5] GOstats_2.27.0                           
##  [6] graph_1.39.3                             
##  [7] Category_2.27.2                          
##  [8] GO.db_2.9.0                              
##  [9] RSQLite_0.11.4                           
## [10] DBI_0.2-7                                
## [11] Matrix_1.0-12                            
## [12] lattice_0.20-15                          
## [13] ggplot2_0.9.3.1                          
## [14] plyr_1.8                                 
## [15] BSgenome.Dmelanogaster.UCSC.dm3_1.3.19   
## [16] BSgenome_1.29.0                          
## [17] Biostrings_2.29.13                       
## [18] TxDb.Dmelanogaster.UCSC.dm3.ensGene_2.9.0
## [19] GenomicFeatures_1.13.19                  
## [20] AnnotationDbi_1.23.17                    
## [21] Biobase_2.21.6                           
## [22] GenomicRanges_1.13.34                    
## [23] XVector_0.1.0                            
## [24] IRanges_1.19.19                          
## [25] BiocGenerics_0.7.3                       
## [26] knitr_1.3                                
## [27] BiocInstaller_1.11.3                     
## 
## loaded via a namespace (and not attached):
##  [1] AnnotationForge_1.3.10 GSEABase_1.23.0        MASS_7.3-27           
##  [4] RBGL_1.37.2            RColorBrewer_1.0-5     RCurl_1.95-4.1        
##  [7] Rsamtools_1.13.24      XML_3.98-1.1           annotate_1.39.0       
## [10] biomaRt_2.17.2         bitops_1.0-5           colorspace_1.2-2      
## [13] dichromat_2.0-0        digest_0.6.3           evaluate_0.4.4        
## [16] formatR_0.8            genefilter_1.43.0      grid_3.0.1            
## [19] gtable_0.1.2           httr_0.2               labeling_0.2          
## [22] munsell_0.4.2          proto_0.3-10           reshape2_1.2.2        
## [25] rtracklayer_1.21.8     scales_0.2.3           splines_3.0.1         
## [28] stats4_3.0.1           stringr_0.6.2          survival_2.37-4       
## [31] tools_3.0.1            zlibbioc_1.7.0
```

