COBI Lab - TimesVector

TimesVector-web tutorial

TimesVector-web provides analysis of time-series RNA sequencing data:

Recommends K for K-means clustering based on the elbow point
Clustering based on cosine similarity
Enrichment and pathway information from g:Profiler results
Additional gene ID mapping to the CISBP and miRDB databases for TFs and miRNAs

Follow the steps below

TimesVector-web workflow

TimesVector-web input interface

TimesVector-web input interface is divided into four sections.

① Input files info

Upload the gene expression files to analyze
Check whether the information on the uploaded file is correct through the info table.

② Options for input file

'Use only protein coding genes'

Users can restrict the analysis to the protein-coding genes in the gene list.
If the user clicks 'yes', only genes that are labeled as protein coding are selected from the input file.

'Data type'

This option is to select whether the data type of the input file is microarray type or RNA-seq type.

'Do you need Normalization?'

TimesVector-web supports an option for normalizing the input data.
If the user clicks 'yes', quantile normalization is performed on the data.

③ Characteristics of input file

K is the number of clusters to detect (integer).

If K is unknown, you can get a recommended K by clicking "K test".
K is recommended from the elbow method graph.

Max K is a parameter that sets the range of K values the user wants to choose from.
Example: elbow method graph produced by the K test (Max K: 800)

④ Organism for biological downstream analysis

Based on the selected organism, the gene list of each cluster is converted to RefSeq IDs and ENSEMBL IDs using Biomart.
The selected organism and the converted ENSEMBL ID list are the parameters for g:Profiler.
TimesVector-web supports organisms such as Homo_sapiens, Mus_musculus, Oryza_sativa_Japonica_Group and Saccharomyces_cerevisiae.

⑤ Run TimesVector by clicking "RUN"

Wait until the process is finished

If you do not want to keep watching the monitor, you can use the shortcut code.
Copy the shortcut code after starting the run and paste it on the main page.
All results are removed every second day.

STEP ONE : Data pre-processing

This step covers the requirements for pre-processing the data to be analyzed on TimesVector-web. First, our web service only supports time-series RNA sequencing data with multiple conditions. The user must prepare one input file per condition. For example, if the data contain three conditions, three input files are required, as shown in Figure 1. Such data can be downloaded from the Gene Expression Omnibus (GEO). Second, before uploading your data, you need to convert the files to the format described below.

The input file should be a TAB delimited gene expression matrix.
It is recommended that all uploaded files contain the same number of genes.
Our web service recognizes the first row of the input file as a header.
The first column 'GeneID' of the header is mandatory and must be used as is.
Except for the first column, the column names must follow the syntax below.

'Condition'_'TimePoint' (e.g., DV10_Day2).
The condition and the time point are separated by an underscore character ("_").
If the input file contains experimental replicates, use 'Condition'_'TimePoint'_'Replicate' (e.g., DV10_Day2_rep1).
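The header rules above can be checked before uploading. The following is a minimal sketch, not part of TimesVector-web itself: the function name `parse_header` is hypothetical, and it assumes that condition and time point names contain no underscore of their own.

```python
def parse_header(header_line):
    """Parse a TAB-delimited header into (condition, time_point, replicate) tuples.

    Assumption: condition, time point and replicate names contain no "_",
    so splitting on "_" yields exactly two or three parts.
    """
    columns = header_line.rstrip("\n").split("\t")
    if columns[0] != "GeneID":
        raise ValueError("the first column must be named 'GeneID'")

    parsed = []
    for name in columns[1:]:
        parts = name.split("_")
        if len(parts) not in (2, 3) or not all(parts):
            raise ValueError(f"unexpected column name: {name!r}")
        condition, time_point = parts[0], parts[1]
        # The replicate label is optional (e.g. DV10_Day2_rep1).
        replicate = parts[2] if len(parts) == 3 else None
        parsed.append((condition, time_point, replicate))
    return parsed
```

For example, `parse_header("GeneID\tDV10_Day2\tDV10_Day4_rep1")` yields one tuple per sample column, with `None` as the replicate when no replicate label is given.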

Figure 1

Figure 2

* If you are uncomfortable with programming, we recommend using a spreadsheet tool.

STEP TWO : Upload input files

This step is for the user to upload the data obtained through STEP ONE.

First, click the browse button.

Second, please select the files to upload as shown in the picture below.

Third, check whether the information on the uploaded file is correct through the info table.

STEP THREE : Select options for input files

This step offers several options that help the user with the analysis. TimesVector-web offers three options.

'Use only protein coding genes'

This option selects only protein-coding genes from the input files uploaded by the user.
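As an illustration of what such a filter does, here is a minimal sketch. How TimesVector-web labels genes internally is not described here, so the `gene_types` lookup table and the `"protein_coding"` label are assumptions made for the example.

```python
def keep_protein_coding(rows, gene_types):
    """Keep only rows whose gene ID is annotated as protein coding.

    rows:       list of rows, each starting with the gene ID
    gene_types: hypothetical dict mapping gene ID -> gene type label
    """
    return [row for row in rows if gene_types.get(row[0]) == "protein_coding"]


# Toy expression rows: gene ID followed by expression values.
rows = [
    ["geneA", 1.0, 2.0],
    ["geneB", 3.0, 4.0],
    ["geneC", 5.0, 6.0],
]
gene_types = {"geneA": "protein_coding", "geneB": "lncRNA"}

filtered = keep_protein_coding(rows, gene_types)
```

Genes without any annotation (`geneC` here) are dropped together with the non-coding ones.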

'Data type'

This option is to select whether the data type of the input file is microarray type or RNA-seq type.

'Do you need Normalization?'

This option performs log2 and quantile normalization on the user's input file.
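To make the two transformations concrete, the sketch below applies a log2 transform and then a basic quantile normalization to a small gene-by-sample matrix. It is an illustration only, not the service's code: the pseudocount of 1 and the handling of ties (broken by row order rather than averaged) are assumptions.

```python
import math


def log2_transform(matrix, pseudocount=1.0):
    """Apply log2(value + pseudocount) to every entry (pseudocount is an assumption)."""
    return [[math.log2(v + pseudocount) for v in row] for row in matrix]


def quantile_normalize(matrix):
    """Quantile-normalize the columns (samples) of a gene-by-sample matrix.

    Every column is forced onto the same distribution: the mean of the
    sorted columns. Ties are broken by row order, not averaged.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    sorted_cols = [sorted(row[j] for row in matrix) for j in range(n_cols)]
    # Mean value across samples at each rank.
    rank_means = [sum(col[i] for col in sorted_cols) / n_cols for i in range(n_rows)]

    result = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        order = sorted(range(n_rows), key=lambda i: matrix[i][j])
        for rank, i in enumerate(order):
            result[i][j] = rank_means[rank]
    return result
```

After normalization, every sample column contains the same set of values, only assigned to different genes according to their original rank.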

STEP FOUR : Select the number of clusters on input data

This step is to recommend the appropriate number of clusters for input data.
To analyze the gene expression patterns in the data, it is important to select an appropriate number of clusters (K). Here, the appropriate number of clusters is the smallest K for which the distance between genes within the same cluster is small and the distance between genes in different clusters is large. Our web service recommends an appropriate number of clusters through the 'K-test' button and the 'maxK' parameter. The 'maxK' parameter sets the range of K that the user wants to choose from, since the result of the K test can vary depending on the size or characteristics of the input data.

Figure 1


If the user clicks the K-test button, the service recommends an appropriate number of clusters (K) for the input data, as shown in Figure 2. The K test can take several minutes depending on the size of the data and maxK.

Figure 2
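The exact criterion used by the K test to locate the elbow is not described here. As one common way to do it, the sketch below picks the K whose point on the curve lies farthest below the straight line joining the first and last points. The function name `elbow_point` and the input values are hypothetical. The within-cluster sum of squares for each K is assumed to have been computed already, for example by running K-means for every K up to maxK.

```python
def elbow_point(ks, wss):
    """Return the K at the elbow of a decreasing within-cluster error curve.

    ks:  increasing list of candidate K values
    wss: within-cluster sum of squares for each K (assumed precomputed)
    """
    if len(ks) != len(wss) or len(ks) < 3:
        raise ValueError("need at least three (K, error) pairs")
    k_span = ks[-1] - ks[0]
    w_span = wss[0] - wss[-1]
    if k_span <= 0 or w_span <= 0:
        raise ValueError("K must increase and the error must decrease overall")

    best_k, best_gap = ks[0], float("-inf")
    for k, w in zip(ks, wss):
        # Rescale both axes to [0, 1]; the chord then runs from (0, 1) to (1, 0).
        x = (k - ks[0]) / k_span
        y = (w - wss[-1]) / w_span
        gap = 1.0 - x - y  # how far the point lies below the chord
        if gap > best_gap:
            best_k, best_gap = k, gap
    return best_k


recommended_k = elbow_point([1, 2, 3, 4, 5, 6], [100.0, 60.0, 30.0, 25.0, 22.0, 20.0])
```

In this toy curve the error drops sharply up to K = 3 and flattens afterwards, so K = 3 is selected.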

STEP FIVE : Select an organism for biological downstream analysis.

This step is to select an organism corresponding to the user's input data. The organism is the parameter for g:Profiler and our web service supports organisms such as Homo_sapiens, Mus_musculus, Oryza_sativa_Japonica_Group and Saccharomyces_cerevisiae.
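For readers who want to reproduce the enrichment step outside the web service, the sketch below prepares a g:Profiler query from an organism name and a list of ENSEMBL IDs. It is not the service's code. The short organism codes follow g:Profiler's usual naming (first letter of the genus plus the species name) and should be checked against the g:Profiler organism list. The function name and the gene IDs are illustrative only.

```python
# Assumed mapping from the organism names used on this page to
# g:Profiler organism codes; verify against the g:Profiler organism list.
GPROFILER_ORGANISM = {
    "Homo_sapiens": "hsapiens",
    "Mus_musculus": "mmusculus",
    "Oryza_sativa_Japonica_Group": "osativa",
    "Saccharomyces_cerevisiae": "scerevisiae",
}


def build_gprofiler_query(organism, ensembl_ids):
    """Build the organism/query pair for a g:Profiler enrichment request."""
    if organism not in GPROFILER_ORGANISM:
        raise ValueError(f"unsupported organism: {organism}")
    # Drop duplicate IDs while keeping their original order.
    unique_ids = list(dict.fromkeys(ensembl_ids))
    return {"organism": GPROFILER_ORGANISM[organism], "query": unique_ids}


query = build_gprofiler_query(
    "Homo_sapiens",
    ["ENSG00000141510", "ENSG00000141510", "ENSG00000012048"],
)
```

The resulting dictionary holds the two parameters named above, the organism and the gene list, and can be passed to whichever g:Profiler client the user prefers.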

Figure 1

STEP SIX: Run the TimesVector-web

After uploading the data and selecting the parameters, the user can run the analysis.
If the program runs successfully, a shortcut code is generated.

The user can wait on this page until the program finishes.

Alternatively, the user can copy the shortcut code, leave this page, and later find the result page by pasting the shortcut code on the Home page.
Depending on the data size, the program may take up to 30 minutes to finish.

STEP SEVEN : Interpret your Results

Summary of the result

Thumbnail of Clusters

The visualized plots of DEP, ODEP and SEP clusters are buttons for selecting the patterns

Genes in cluster

The left plot shows only the representative genes of the cluster.
The right plot shows every gene in the cluster.

Gene List

List of genes in the cluster
Gene description
Gene type
Matched RefSeq ID
Matched ENSEMBL ID

cisBP

Result of mapping the gene set to the transcription factor (TF) database cisBP

miRDB

Result from miRDB, a microRNA (miRNA) target prediction and functional annotation database
The number of microRNAs counted in the cluster is shown above.
The list of microRNAs in the cluster is shown below.

g:Profiler result

g:Profiler result with gene set and selected organism
To run g:Profiler, every gene in every cluster of each of DEP, ODEP and SEP is used.

TimesVector-web Trial tutorial

TimesVector-web workflow

STEP ONE: choose case study

STEP TWO: upload data

STEP THREE: run the TimesVector-web

STEP FOUR: result
