Transcriptomic Knowledge-Graph Omics Integration for Human Pathway Analysis
The tkoi package provides an integrative framework that combines
transcriptomic data with a human-specific biological knowledge graph.
This enables network-aware enrichment, functional interpretation, and
gene prioritization via personalized PageRank and ontology-aware
annotation.
For non-power users, please use the web application of
tKOI at tkoi.org.
For follow-up analysis after obtaining network enrichment statistics, please use tKOIAgent for contextualization and network traversal.
Please refer to the Documentation for a detailed reference to this R package.
To install the development version from GitHub:

    # Install devtools if necessary
    if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
    # Install tkoi
    devtools::install_github("Broccolito/tkoi")

- Personalized PageRank propagation using transcriptomic weights
- Permutation-based enrichment scoring for network nodes
- Functional annotation using Gene Ontology, Disease Ontology, Cell Ontology, Reactome, and more
- Modular and extensible S4 object design (tKOIList)
- Export and visualization tools for enriched subnetworks
- Seamless compatibility with clusterProfiler, enrichplot, and ggplot2

## Getting Started

This section walks you through a complete example using the tkoi
package, from reading expression data and running the core network
analysis to visualizing enrichment results.
The tkoi package includes a small example CSV file containing
simulated gene expression results. We’ll read it using data.table for
performance.

    library(tkoi)
    library(data.table)
    # Get the file path of the example expression data
    file_path = system.file("extdata", "example_data.csv", package = "tkoi")
    # Read the CSV file
    expression_data = fread(file_path)
    head(expression_data)

The file includes columns:

- gene_name: Ensembl gene identifiers
- logfc: log2 fold-change values
- pvalue: associated p-values for differential expression

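
To run the workflow on your own results, the input presumably needs the same three columns. The sketch below assembles such a table from hypothetical differential-expression output and checks it before analysis. The values are made up for illustration, and both the acceptance of a plain data.frame by run_tkoi() and the exact form of the filter (pvalue below the threshold, absolute logfc above it) are assumptions to verify against the package documentation.

```r
# Hypothetical differential-expression results (illustrative values only)
my_results <- data.frame(
  gene_name = c("ENSG00000141510", "ENSG00000012048", "ENSG00000146648"),
  logfc     = c(1.8, -0.9, 0.1),
  pvalue    = c(0.001, 0.03, 0.6),
  stringsAsFactors = FALSE
)

# Check that the columns described above are present and well-formed
required_columns <- c("gene_name", "logfc", "pvalue")
missing_columns <- setdiff(required_columns, names(my_results))
stopifnot(
  length(missing_columns) == 0,
  is.numeric(my_results$logfc),
  all(my_results$pvalue >= 0 & my_results$pvalue <= 1),
  !anyDuplicated(my_results$gene_name)
)

# Preview which genes would pass a p-value cutoff of 0.05 and an
# absolute log fold-change cutoff of 0.25 (assumed filter form)
passes_filter <- my_results$pvalue < 0.05 & abs(my_results$logfc) > 0.25
my_results[passes_filter, ]
```

Validating the table up front makes it easier to tell input problems apart from issues in the network analysis itself.
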
tKOI integrates transcriptomic changes with a biological knowledge
graph using a personalized PageRank algorithm. It also performs
permutations to assess statistical enrichment.

    tkoi_result = run_tkoi(
      expression_data = expression_data,
      subnetwork = tkoi::tkoi_net,     # Predefined igraph network included with the package
      pvalue_threshold = 0.05,         # p-value filter for differential expression
      logfc_threshold = 0.25,          # Minimum log fold change
      indirect_link_threshold = 3,     # Required indirect connectivity for downstream inclusion
      topology_similarity = 0.9,       # Similarity for selecting matched genes in permutations
      n_permutation = 100,             # Number of random permutations
      damping_factor = 0.85,           # PageRank damping factor
      maximum_iteration = 500          # Max iterations for convergence
    )

The result is an S4 object (tKOIList) that stores PageRank scores,
permutation statistics, and network annotations.

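
To make the two ideas behind this step more concrete, here is a minimal base-R sketch of personalized PageRank followed by a permutation test on a toy five-gene network. It is not the package's implementation: the network, the weights, the helper personalized_pagerank(), and the simple weight-shuffling null are all illustrative assumptions, and the topology-matched gene selection and indirect-link filtering that run_tkoi() exposes as parameters are not reproduced.

```r
# Toy undirected network of five genes (hypothetical; not tkoi::tkoi_net)
genes <- c("A", "B", "C", "D", "E")
adjacency <- matrix(0, 5, 5, dimnames = list(genes, genes))
edges <- rbind(c("A", "B"), c("B", "C"), c("C", "D"), c("D", "E"), c("A", "C"))
for (i in seq_len(nrow(edges))) {
  adjacency[edges[i, 1], edges[i, 2]] <- 1
  adjacency[edges[i, 2], edges[i, 1]] <- 1
}

# Personalized PageRank by power iteration.
# `weights` defines the restart distribution; every node is assumed to have
# at least one edge, so no column of the transition matrix is empty.
personalized_pagerank <- function(adjacency, weights, damping = 0.85,
                                  max_iter = 500, tol = 1e-10) {
  restart <- as.vector(weights / sum(weights))
  # Column-stochastic transition matrix: column j spreads node j's score
  transition <- sweep(adjacency, 2, colSums(adjacency), "/")
  scores <- restart
  for (iter in seq_len(max_iter)) {
    updated <- damping * as.vector(transition %*% scores) + (1 - damping) * restart
    converged <- sum(abs(updated - scores)) < tol
    scores <- updated
    if (converged) break
  }
  setNames(scores, rownames(adjacency))
}

# Hypothetical absolute log fold changes used as personalization weights
abs_logfc <- c(A = 2.0, B = 0.1, C = 0.1, D = 0.1, E = 0.1)
observed <- personalized_pagerank(adjacency, abs_logfc)

# Permutation null: shuffle the weights across genes and recompute the scores
set.seed(1)
n_permutation <- 100
null_scores <- replicate(n_permutation, {
  shuffled <- setNames(sample(abs_logfc), names(abs_logfc))
  personalized_pagerank(adjacency, shuffled)
})

# Empirical p-value per gene: how often a permuted score reaches the observed one
empirical_p <- (rowSums(null_scores >= observed) + 1) / (n_permutation + 1)
round(cbind(pagerank = observed, empirical_p = empirical_p), 3)
```

The restart vector is what makes the ranking "personalized": random walks are repeatedly pulled back toward genes with large expression changes, so nodes near those genes accumulate score. Adding one to both the numerator and the denominator of the empirical p-value avoids reporting a p-value of exactly zero from a finite number of permutations.
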
You can extend the analysis by integrating GO term enrichment using
clusterProfiler. This allows for side-by-side comparisons of
ontology-based and graph-based enrichment.

    tkoi_result = run_gene_enrichment(tkoi_result)

This adds a gene_enrichment_comparison slot containing GO enrichment
tables and visual summaries.

Two visualizations are automatically generated:

    tkoi_result@gene_enrichment_comparison$comparison_scatter1
    tkoi_result@gene_enrichment_comparison$comparison_scatter2

These plots compare tKOI network enrichment (beta) with Gene Ontology
q-values.

The make_gene_exploration_plot() function highlights upregulated and
downregulated genes in a scatter plot based on both experimental and
network evidence.

    plt1 = make_gene_exploration_plot(
      tkoi_list = tkoi_result,
      sig_color = "#F39B7FB2",
      non_sig_color = "gray"
    )
    plt1

The export_gene_exploration_data() function returns a data frame containing logFC, p-values, PageRank scores, and FDRs for each gene.

    gene_data = export_gene_exploration_data(tkoi_result)
    head(gene_data)

Use visualize_topn() to highlight the most significantly enriched
genes, pathways, or biological concepts based on network-level
statistics.

    plt2 = visualize_topn(
      tkoi_list = tkoi_result,
      category = "Gene",        # Can also be "Pathway", "BiologicalProcess", etc.
      top_n = 25,
      high_color = "#FF5733",   # Strong enrichment
      low_color = "#154360"     # Moderate enrichment
    )
    plt2

Save your full analysis object for future use:

    save(tkoi_result, file = "tkoi_result.rda")

tKOIList is an S4 object returned by run_tkoi() with the following
slots:

- expression_data: Input transcriptomic measurements
- pagerank_data: Personalized PageRank vectors
- network_summary_statistics: Node-level enrichment results
- gene_enrichment_comparison: GO enrichment overlay and plots

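
For readers less familiar with S4, the following sketch shows how slots are typically inspected and how an object saved with save() comes back through load(). It uses a hypothetical stand-in class, ToyTKOIList, that only mimics the slot names listed above; the slot types and contents are assumptions, and the real tKOIList definition in the package may differ.

```r
library(methods)

# Hypothetical stand-in class; slot types are guesses for illustration only
setClass("ToyTKOIList", slots = c(
  expression_data = "data.frame",
  pagerank_data = "numeric",
  network_summary_statistics = "data.frame",
  gene_enrichment_comparison = "list"
))

toy_result <- new("ToyTKOIList",
  expression_data = data.frame(gene_name = "G1", logfc = 1.2, pvalue = 0.01),
  pagerank_data = c(G1 = 0.4),
  network_summary_statistics = data.frame(node = "G1", beta = 0.8),
  gene_enrichment_comparison = list()
)

slotNames(toy_result)                # list the available slots
toy_result@pagerank_data             # direct slot access with @
slot(toy_result, "expression_data")  # programmatic access by slot name

# Round trip through an .rda file: load() restores the object under the
# name it had when it was saved and returns that name as a character vector
rda_path <- file.path(tempdir(), "toy_result.rda")
save(toy_result, file = rda_path)
restored_env <- new.env()
loaded_names <- load(rda_path, envir = restored_env)
loaded_names
```

Loading into a separate environment, as done here, avoids silently overwriting an object of the same name in the current session.
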
Built-in annotation tables support functional interpretation of the knowledge graph:

- go_annotation, disease_annotation, celltype_annotation, anatomy_annotation
- compound_annotation, protein_annotation, complex_annotation
- reaction_annotation, pathway_annotation, pwgroup_annotation, etc.

Inspect them like so:

    data(go_annotation)
    head(go_annotation)

License: MIT + file LICENSE

Gu, W., Bellucci, G., Peetoom, B., McDonagh, M., & Baranzini, S. (in preparation). Integrating Large-Scale Knowledge Graphs to Enhance Transcriptomics Analysis.
Wanjun Gu wanjun.gu@ucsf.edu ORCID: 0000-0002-7342-7000
