WGCNA: an R package for weighted correlation network analysis

Peter Langfelder and Steve Horvath

Abstract

Weighted gene co-expression network analysis (WGCNA) is a method for describing the correlation patterns among genes across samples.
WGCNA can find clusters (modules) of highly correlated genes.
Each module can be summarized by its module eigengene or by an intramodular hub gene.
WGCNA can also measure how modules relate to one another and to external sample traits.
It can be used to identify candidate biomarkers or therapeutic targets.

Background

Functions of WGCNA:

  1. Cluster co-expressed genes into modules
  2. Relate modules to traits
  3. Identify significant modules
  4. Annotate modules
  5. Define the network neighborhood
  6. Screen for nodes
  7. Contrast one network with another

Results

Overview of typical analysis steps and the rationale behind them:

Figure: overview of typical analysis steps. © Peter Langfelder 2008

  1. Construct a gene co-expression network.
  2. Use Dynamic Tree Cut to identify modules.
  3. Relate modules to external information:
    • Clinical data, SNPs, proteomics
    • Gene ontology, functional enrichment
    • Find biologically interesting modules
  4. Study module relationships.
  5. Find the key drivers in interesting modules.

1. Gene Cluster

Question: how should the soft-thresholding power be chosen?
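
On the soft-thresholding power: in an unsigned weighted network, the adjacency between two genes is the absolute correlation raised to a power β, a_ij = |cor(x_i, x_j)|^β, so weak correlations are pushed toward 0 while strong ones stay close to 1. The package offers `pickSoftThreshold` to help choose β with a scale-free topology criterion. Below is a minimal pure-Python sketch of the adjacency step only. The function names, the toy data, and β = 6 are assumptions for illustration, and the diagonal is left at 0 as a simplification.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta.

    expr: one expression profile (list of sample values) per gene.
    beta: soft-thresholding power; 6 is only an illustrative value.
    """
    n = len(expr)
    adj = [[0.0] * n for _ in range(n)]  # diagonal stays 0 (simplification)
    for i in range(n):
        for j in range(i + 1, n):
            a = abs(pearson(expr[i], expr[j])) ** beta
            adj[i][j] = adj[j][i] = a
    return adj


# Toy data (made up): genes 0 and 1 co-vary, gene 2 does not follow them.
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.1, 2.3, 2.9, 4.2, 4.8],
    [3.0, 1.0, 4.0, 1.0, 5.0],
]
adj = soft_threshold_adjacency(expr, beta=6)
```

With this toy input, the pair (0, 1) keeps a high adjacency while the pair (0, 2), whose correlation is only moderate, is shrunk to almost 0. That contrast is the point of a soft threshold compared with a hard cutoff.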

2. Module Detection

2.1 Algorithm

Modules are defined as clusters of densely interconnected genes.
Hierarchical clustering is used to cluster the genes.
Shortcoming of this algorithm:
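
As a rough illustration of the clustering idea, the sketch below merges genes bottom-up with average linkage on a dissimilarity matrix and stops at a fixed height. This is a simplification: the workflow above uses Dynamic Tree Cut rather than a fixed cut, and the function name, the cut height, and the toy matrix are assumptions made for this example.

```python
def average_linkage_modules(dissim, cut_height):
    """Agglomerative clustering with average linkage.

    Merging stops once the closest pair of clusters is farther apart
    than cut_height, which equals cutting the dendrogram at that height.
    """
    clusters = [[i] for i in range(len(dissim))]

    def avg_dist(a, b):
        # Average pairwise dissimilarity between two clusters.
        return sum(dissim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = avg_dist(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        d, p, q = best
        if d > cut_height:
            break
        merged = clusters[p] + clusters[q]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (p, q)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Toy dissimilarity matrix (made up), e.g. 1 - topological overlap.
dissim = [
    [0.00, 0.20, 0.30, 0.90, 0.80],
    [0.20, 0.00, 0.25, 0.85, 0.90],
    [0.30, 0.25, 0.00, 0.80, 0.95],
    [0.90, 0.85, 0.80, 0.00, 0.10],
    [0.80, 0.90, 0.95, 0.10, 0.00],
]
modules = average_linkage_modules(dissim, cut_height=0.5)
```

Here genes 0 to 2 end up in one module and genes 3 and 4 in another. The result depends directly on `cut_height`, which has to be chosen by the user.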

2.2 Biological Meaning

A detected module could reflect:

  1. Biological signal
  2. Noise

So gene ontology information can be used to check whether a module is biologically meaningful.

Module detection algorithms

  • Fuzzy measure of module membership
  • Automatic block-wise module detection
  • Consensus module detection

3. Module and Gene Selection

ummmm…
= =

4. Topological Properties

Network concepts that can be studied:

  1. Whole Network Connectivity (degree)
  2. Intramodular Connectivity
  3. Topological Overlap
  4. Clustering Coefficient
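
The four concepts above can be sketched for a small weighted network. The code assumes a symmetric adjacency matrix with entries in [0, 1] and a zero diagonal, and uses the weighted-network definitions: connectivity k_i = Σ_{j≠i} a_ij, topological overlap ω_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 − a_ij) with l_ij = Σ_{u≠i,j} a_iu a_uj, and a weighted clustering coefficient. Function names and the toy matrix are assumptions for illustration, not the package's API.

```python
def connectivity(adj):
    """Whole-network connectivity: k_i = sum over j != i of a_ij."""
    n = len(adj)
    return [sum(adj[i][j] for j in range(n) if j != i) for i in range(n)]


def intramodular_connectivity(adj, module):
    """Connectivity of each module gene, counting only genes in the same module."""
    return {i: sum(adj[i][j] for j in module if j != i) for i in module}


def topological_overlap(adj):
    """w_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n = len(adj)
    k = connectivity(adj)
    tom = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # l_ij: connection strength shared through third genes u.
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            w = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = w
    return tom


def clustering_coefficient(adj, i):
    """Weighted clustering coefficient of gene i."""
    others = [j for j in range(len(adj)) if j != i]
    num = sum(adj[i][l] * adj[l][m] * adj[m][i]
              for l in others for m in others if m != l)
    den = sum(adj[i][l] for l in others) ** 2 - sum(adj[i][l] ** 2 for l in others)
    return num / den if den > 0 else 0.0


# Toy adjacency (made up): genes 0-2 are tightly connected, gene 3 is peripheral.
adj = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.1],
    [0.8, 0.7, 0.0, 0.2],
    [0.1, 0.1, 0.2, 0.0],
]
k = connectivity(adj)
k_in = intramodular_connectivity(adj, [0, 1, 2])
tom = topological_overlap(adj)
cc0 = clustering_coefficient(adj, 0)
```

In this toy network the topological overlap between genes 0 and 1 is much higher than between genes 0 and 3, because 0 and 1 are directly connected and also share the strong neighbor 2.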

…Skip

Mouse Data Application

For computational reasons, only the 3600 most connected genes were selected.
18 modules…
Figure: mouse data application. © Peter Langfelder 2008

As shown in panel D above, body weight is most strongly correlated with the brown, red, and salmon modules.
The GO enrichment results show that the brown module is significantly enriched in the categories “glycoprotein” and “signal”, the red module in “cell cycle”, and the salmon module in “chromosome”. Overall, the modules are biologically meaningful.
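
One way to quantify how strongly a module relates to a trait such as body weight is gene significance, GS_i = |cor(x_i, T)|, averaged over the genes of a module to give a module significance. The sketch below is a minimal pure-Python illustration. The function names, module labels, and all numbers are made up for the example and are not the mouse data.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def gene_significance(expr, trait):
    """GS_i = |cor(x_i, T)| for every gene profile x_i."""
    return [abs(pearson(profile, trait)) for profile in expr]


def module_significance(gs, modules):
    """Average gene significance over the genes assigned to each module."""
    return {name: sum(gs[i] for i in genes) / len(genes)
            for name, genes in modules.items()}


# Toy data (made up): 4 genes measured in 5 samples, plus one trait value per sample.
trait = [20.0, 25.0, 30.0, 35.0, 40.0]
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.1, 2.3, 2.9, 4.2, 4.8],
    [3.0, 1.0, 4.0, 1.0, 5.0],
    [2.0, 5.0, 1.0, 4.0, 3.0],
]
# Hypothetical module assignment (gene indices per module).
modules = {"module_a": [0, 1], "module_b": [2, 3]}

gs = gene_significance(expr, trait)
ms = module_significance(gs, modules)
```

In this toy example `module_a` tracks the trait closely and gets a module significance near 1, while `module_b` gets a much lower value. Ranking modules this way gives a shortlist for follow-up checks such as enrichment analysis.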

Figure E shows body weight between genes significant…


  • WGCNA: an R package for weighted correlation network analysis

    https://karobben.github.io/2020/07/07/LearnNotes/paper_WGCNA/

    Author

    Karobben

    Posted on

    2020-07-07

    Updated on

    2024-01-11
