Time Incorporated miR-mRNA Generation of Networks (TimiRGeN) is aimed at researchers who wish to explore interactions in time series microRNA-mRNA expression data. This package integrates, functionally analyses and generates small networks for hypothesis generation.
To achieve data reduction without reducing biological signal, the TimiRGeN package utilises several published packages and employs their functions in a synergistic fashion for time series multi-omic analysis. The following packages have been built upon for several functions in the TimiRGeN package:
TimiRGeN is selective: it only uses miR-mRNA interaction data from databases curated within the last 2 years. To reduce the number of false positives, TimiRGeN also only uses predictive databases which use seed site specificity as their main input.
TimiRGeN can generate networks in R. However, the package is deliberately open ended: its output can easily be exported to Cytoscape or PathVisio for better visualisation options.
TimiRGeN solely uses WikiPathways for functional pathway analysis, and is the first tool to allow this for time series data. WikiPathways is a user-curated pathway database that contains thousands of mechanistic signalling pathways from multiple species. Furthermore, WikiPathways works very well with PathVisio, which is our recommended tool for GRN (gene regulatory network) design. Please read TimiRGeN/inst/Pathvisio_GRN_guide.pdf for a step-by-step tutorial on our GRN creation process.
The TimiRGeN package has several options for miR-mRNA analysis. Currently the package can analyse human or mouse data, perform analysis of miR and mRNA data combined or separately, and can use entrez or ensembl gene IDs. This is because most wikipathways are annotated with either entrez IDs or ensembl gene IDs. This tool can be best used after differential expression (DE) analysis, and has potential to become a staple part of any miR-mRNA expression data study.
In this section the combined method will be used to analyse a mouse kidney fibrosis data set. The mRNA data was published in Craciun et al (2016) and was downloaded from GSE65267. The associated miR data was published in Pellegrini et al (2016) and was downloaded from GSE61328.
Notice the standard nomenclature used in the column names. Please follow this standard for your own input data. The time point should come first and is followed by a ".". The time point should consist of alphabetical characters followed by numerical characters e.g. D1, H6, TP3. After the ".", the column name should state the specific result type from differential expression analysis.
Note. There should only be one "." in each column name and no "_" characters. Having more than one "." or any "_" characters will confuse some functions.
Note. There should be no NAs in your miR and mRNA data files.
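The two notes above can be checked before starting the pipeline. The following is a minimal base R sketch, not part of TimiRGeN: `check_input` is a hypothetical helper, and the example column names (`log2fc`, `adjPVal`) and gene names are assumptions chosen only for illustration.

```r
# Hypothetical helper (not a TimiRGeN function): flag column names that
# break the naming standard and report whether any NAs are present.
check_input <- function(df) {
  nms <- colnames(df)
  # letters, then digits, then exactly one ".", then a result type
  # that contains no further "." and no "_"
  ok_names <- grepl("^[A-Za-z]+[0-9]+\\.[^._]+$", nms)
  list(bad_names = nms[!ok_names], has_na = any(is.na(df)))
}

# Toy DE results: the third column breaks both rules on purpose
# (an "_" in its name and an NA among its values).
mrna <- data.frame(
  D1.log2fc  = c(1.2, -0.4),
  D1.adjPVal = c(0.01, 0.30),
  D3_log2fc  = c(0.8, NA),
  row.names  = c("Acta2", "Col1a1")
)

res <- check_input(mrna)
print(res)
```

Any name listed in `bad_names` should be renamed, and NAs removed or imputed, before the data is passed on.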
TimiRGeN uses a MultiAssayExperiment (MAE) object to contain information. The dataframes and matrices will be stored as assays and S4 objects will be stored as Experiments; lists are also stored within the MAE object.
If you are unfamiliar with MultiAssayExperiment objects, please read through the vignette to understand how data can be accessed, or go through the user guide which can be found on the MultiAssayExperiment Bioconductor page.
Use the getIds functions (getIdsMir and getIdsMrna) to produce dataframes containing entrezgene and ensembl ID annotations for genes. This is useful for downstream analysis.
Many wikipathways use either entrezgene IDs or ensembl gene IDs for annotation. Having both formats available can be useful.
Due to the nature of miRs, many NAs may be found in the output of the getIdsMir function. Entrezgene IDs and ensembl IDs do not distinguish between the -3p and -5p strands of a miR. Therefore, adjusted entrezgene IDs and ensembl IDs are also created.
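To see why gene-level IDs cannot separate the two strands, the base R sketch below strips the strand suffix from a few mature miR names. It only illustrates the problem; it does not reproduce how TimiRGeN builds its adjusted IDs, and the miR names are example values.

```r
# Example mature miR names (illustrative values only).
mirs <- c("mmu-miR-21a-5p", "mmu-miR-21a-3p", "mmu-miR-29b-3p")

# Remove a trailing -3p or -5p to get the name shared by both strands.
stem <- sub("-[35]p$", "", mirs)

# Both miR-21a strands collapse onto one stem, so an ID assigned at the
# gene level is identical for the -3p and -5p forms.
strands_per_stem <- table(stem)
print(strands_per_stem)
```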
When using the getIdsMrna function, if a connection timeout error occurs or downloads are very slow, try another mirror e.g. mirror = "useast".
mRNA and miR data can then be combined into one large dataframe. The genesList function will transform the large dataframe into multiple nested dataframes within a list. The data will be separated by the timeString parameter. In this example the data is separated by D (days), because it was the non-numeric character before the "." in the column names.
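The separation by time point can be pictured with a small base R sketch. This is an illustration of the idea only, not the genesList implementation: `split_by_time` is a hypothetical helper, and the column and gene names are made up for the example.

```r
# Hypothetical helper (not a TimiRGeN function): split the columns of a
# combined DE dataframe into one nested dataframe per time point.
split_by_time <- function(df, timeString) {
  # the time point is the text before the "." in each column name, e.g. "D1"
  tp <- sub("\\..*$", "", colnames(df))
  # keep only columns whose time point is timeString followed by digits
  keep <- grepl(paste0("^", timeString, "[0-9]+$"), tp)
  lapply(split(colnames(df)[keep], tp[keep]),
         function(cols) df[, cols, drop = FALSE])
}

# Toy combined miR-mRNA table with two time points.
de <- data.frame(
  D1.log2fc  = c(1.5, -0.2),
  D1.adjPVal = c(0.01, 0.40),
  D3.log2fc  = c(2.1, 0.9),
  D3.adjPVal = c(0.001, 0.03),
  row.names  = c("Acta2", "mmu-miR-21a-5p")
)

nested <- split_by_time(de, timeString = "D")
print(names(nested))
```

Each element of `nested` keeps all genes as rows but only the columns belonging to one time point.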
Significantly differentially expressed genes can be retrieved from each nested dataframe using the significantVals function. In this example, only genes with an adjusted P value of less than 0.05 remain in the list.
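The filtering step amounts to a row filter applied to every nested dataframe. The sketch below shows that idea in base R under stated assumptions: `filter_significant`, `threshold` and `p_pattern` are hypothetical names, not the significantVals interface, and the data is invented.

```r
# Hypothetical helper (not a TimiRGeN function): keep rows whose adjusted
# P value is below a threshold in every nested dataframe.
filter_significant <- function(nested, threshold = 0.05, p_pattern = "adjPVal") {
  lapply(nested, function(df) {
    # first column whose name contains the P value label
    pcol <- grep(p_pattern, colnames(df), fixed = TRUE, value = TRUE)[1]
    df[df[[pcol]] < threshold, , drop = FALSE]
  })
}

# Toy nested list, one dataframe per time point.
genes <- c("Acta2", "Col1a1")
nested_de <- list(
  D1 = data.frame(D1.log2fc = c(1.5, -0.2), D1.adjPVal = c(0.01, 0.40),
                  row.names = genes),
  D3 = data.frame(D3.log2fc = c(2.1, 0.9), D3.adjPVal = c(0.001, 0.03),
                  row.names = genes)
)

sig <- filter_significant(nested_de)
print(lapply(sig, rownames))
```

Because each time point is filtered independently, a gene can be retained at one time point and dropped at another.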
The entrezgene IDs or ensembl IDs created earlier can now be integrated into the filtered dataframes of genes using addIds. In this example entrezgene IDs were added.
Lists of entrez IDs or ensembl IDs can then be extracted for further analysis.
Once we have a list of significant genes per time point, we can put it through gene set enrichment analysis (GSEA) to find the enriched pathways at each time point in the data. TimiRGeN uses WikiPathways for GSEA.
This is standard GSEA. Here, the enrichWiki function wraps around enrichment functions from DOSE and clusterProfiler [2,3], but applies these functions to time series analysis with WikiPathways.
Note. Making multiple separate MAE objects makes it easier to work with all the generated data files.
path_name can be found as output from the
For a more stringent check, a unique universe can be used e.g. all possible genes found in a microarray or all known genes expressed in a cell type.
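The effect of the universe can be illustrated with a hypergeometric over-representation test written in base R. This is a sketch of the underlying statistic only, assuming a hypergeometric model; it is not the enrichWiki code, and `ora_pvalue` and the toy gene IDs are hypothetical.

```r
# Hypothetical helper: P(overlap >= observed) under a hypergeometric model.
ora_pvalue <- function(sig_genes, pathway_genes, universe) {
  # only genes present in the universe can be counted
  sig_genes     <- intersect(sig_genes, universe)
  pathway_genes <- intersect(pathway_genes, universe)
  overlap <- length(intersect(sig_genes, pathway_genes))
  phyper(overlap - 1,
         m = length(pathway_genes),                     # pathway genes in universe
         n = length(universe) - length(pathway_genes),  # remaining genes
         k = length(sig_genes),                         # significant genes drawn
         lower.tail = FALSE)
}

# Toy IDs: 10 significant genes, 5 of which fall in a 10-gene pathway.
pathway   <- as.character(1:10)
sig_genes <- as.character(c(1:5, 50:54))

# Broad background of 100 genes versus a restricted background of 60 genes.
p_broad      <- ora_pvalue(sig_genes, pathway, universe = as.character(1:100))
p_restricted <- ora_pvalue(sig_genes, pathway, universe = as.character(1:60))
print(c(broad = p_broad, restricted = p_restricted))
```

In this toy case the restricted background gives the larger P value, because the pathway makes up a bigger share of the genes that could have been detected.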
To plot results from GSEA, the savePlots function can save all plots in the current working directory. Either bar plots or dot plots can be generated (dot plots with quickDot), and the plots can be saved to file in a variety of formats.