Cell Visualization
In this tutorial, we will make visualizations of the graph components that constitute cells in MPX data. The data we will use comes from an experiment where T cells have been stimulated with a chemokine to induce them to attain a migratory phenotype. The migratory T cells form a uropod; a bulge on the cell that helps to propel the cell forward.
Upon uropod formation some proteins on the surface of the cell migrate to one pole of the cell surface to form the uropod, which we will try to visualize here. After completing this tutorial, you will be able to:
- Select components of interest based on colocalization scores or other metrics, choosing cells with potential uropods or other features of interest.
- Load cell graphs of the selected component IDs from a PXL file.
- Visualize individual cell graphs and color nodes based on pixel type (A or B) or other attributes.
- Visualize multiple components and markers, and their abundance in each node (UPIA) using color scales.
- Visualize marker detection by coloring nodes based on whether specific markers were detected or not.
Setup
library(pixelatorR)
library(SeuratObject)
library(dplyr)
library(tidyr)
library(stringr)
library(ggplot2)
library(here)
library(patchwork)
Load data
In this tutorial we will continue with the stimulated T cell data that we analyzed in the previous tutorial Spatial Analysis (Karlsson et al, Nature Methods 2024). There is no need to load the object again if you still have it in your workspace after running the previous tutorial.
DATA_DIR <- "path/to/local/folder"
Sys.setenv("DATA_DIR" = DATA_DIR)
pg_data_combined <- readRDS(file.path(DATA_DIR, "uropod_spatial.rds"))
pg_data_combined
An object of class Seurat
1325 features across 786 samples within 3 assays
Active assay: colocalization (1225 features, 406 variable features)
2 layers present: data, scale.data
2 other assays present: mpxCells, polarization
2 dimensional reductions calculated: pca, umap
The data we are using constists of two samples; one control sample of unstimulated T cells, and one sample of T cells stimulated by CD54 immobilization and a chemokine to encourage cells to attain a migratory phenotype. In the previous tutorial, we found that many of the stimulated cells formed uropods, a bulge that helps the cell to migrate, through the increased colocalization of CD37, CD50, and CD162, along with the polarization of these and several other proteins.
# Make sure that we are using the correct assay
DefaultAssay(pg_data_combined) <- "mpxCells"
# Fetch colocalization scores from Seurat object
colocalization_scores <- ColocalizationScores(pg_data_combined, meta_data_columns = "sample") %>%
# Remove self-correlation
filter(marker_1 != marker_2)
colocalization_scores %>%
filter(marker_1 == "CD162",
marker_2 == "CD37") %>%
ggplot(aes(sample, pearson_z, fill = sample)) +
geom_violin(draw_quantiles = 0.5) +
theme_minimal() +
labs(x = "Sample",
y = "MPX colocalization score",
title = "Colocalization between CD162 and CD37")
In this tutorial, we will select some components that have formed the uropod, as well as some that haven’t, and visualize these. Although many cells seem to have formed uropods in the stimulated samples, there are also a few in the control sample. To get some examples, let’s select the cells with the highest colocalization of CD37 and CD162 from both the control and stimulated samples. Let’s also pick some cells with low colocalization between CD37 and CD162 to compare with. We will call the cells with high colocalization of CD37 and CD162 “Uropod”, while the ones with low colocalization score “No uropod”.
plot_components <-
colocalization_scores %>%
unite(contrast, marker_1, marker_2, sep = "/") %>%
filter(contrast == "CD162/CD37") %>%
arrange(-pearson_z) %>%
filter(pearson_z > 0) %>%
group_by(sample) %>%
# Keep the 2 with highest score per sample
filter(row_number() %in% 1:2 | rev(row_number()) %in% 1:2) %>%
select(sample, component, contrast, pearson_z) %>%
# Add a column of whether the cell is "uropod-like"
mutate(uropod_type = ifelse(pearson_z > 10,
"Uropod",
"No uropod"),
component_number = str_extract(component, "[0-9]+") %>% as.numeric())
plot_components
# A tibble: 8 × 6
# Groups: sample [2]
sample component contrast pearson_z uropod_type component_number
<chr> <chr> <chr> <dbl> <chr> <dbl>
1 Stimulated Stimulated_RCVCMP0… CD162/C… 41.2 Uropod 1141
2 Stimulated Stimulated_RCVCMP0… CD162/C… 40.2 Uropod 1214
3 Control Control_RCVCMP0000… CD162/C… 32.1 Uropod 175
4 Control Control_RCVCMP0000… CD162/C… 31.9 Uropod 736
5 Stimulated Stimulated_RCVCMP0… CD162/C… 0.0854 No uropod 416
6 Control Control_RCVCMP0000… CD162/C… 0.0248 No uropod 431
7 Stimulated Stimulated_RCVCMP0… CD162/C… 0.0173 No uropod 392
8 Control Control_RCVCMP0000… CD162/C… 0.00870 No uropod 261
Visualize single cell graph components in 2D
To start with, let’s visualize a single component graph. Each component graph corresponds to a cell, and is represented by a bipartite graph where nodes are UPIAs and UPIBs. The edges linking a UPIA node and a UPIB node correspond to antibody molecules that have been detected in the intersection of a specific pair of one A pixel (UPIA) and one B pixel (UPIB). MPX graphs can be converted to 2D or 3D layouts using various graph drawing algorithms. Here we will try out two different graph drawing algorithms to create 2D layouts.
Load graphs
Before we can compute and visualize the
layouts for
our component graphs, we need to load the graphs into our Seurat object
with the LoadCellGraphs
method.
Note that
LoadCellGraphs
creates the component graphs from the edgelist stored in the PXL file(s). Thus, it is crucial that the paths to the PXL files are correct and that the PXL files contain the correct data. If you have loaded the Seurat object from an RDS file, now is the time to ensure that the PXL files are in the right location. We can use theFSMap
getter function to check the paths to the PXL files:
FSMap(pg_data_combined[["mpxCells"]])
If the paths are incorrect, we can use the
FSMap<-
setter method to update them.
FSMap(pg_data_combined[["mpxCells"]]) <- UpdatedFSMap
When we call LoadCellGraphs
, we can decide exactly what components to
load (cells = ...
). Here, we will load a single component graph:
pg_data_combined
An object of class Seurat
1325 features across 786 samples within 3 assays
Active assay: mpxCells (50 features, 50 variable features)
2 layers present: counts.1, counts.2
2 other assays present: polarization, colocalization
2 dimensional reductions calculated: pca, umap
pg_data_combined <- pg_data_combined %>%
LoadCellGraphs(cells = "Control_RCVCMP0000318")
Once component graphs have been loaded, we can access the graph data
using the CellGraphs
method. CellGraphs
returns a list with one
element per component. Components that have not been loaded yet will
have a NULL
value in the list. Each loaded component graph is stored
in a CellGraph
object that contains a graph, a node count matrix and
an empty list of layouts:
CellGraphs(pg_data_combined)[["Control_RCVCMP0000318"]]
A CellGraph object containing a bipartite graph with 1682 nodes and 3517 edges
Number of markers: 75
Pre-computed layouts
If we have access to layouts in the PXL files, we can load them by
setting load_layouts = TRUE
when calling the LoadCellGraphs
method
on our Seurat object. If we do this, you will now see that we have one
layout available for this component: wpmds_3d
.
You can find more information about the pmds layout methods here.
pg_data_combined <- pg_data_combined %>%
LoadCellGraphs(cells = "Control_RCVCMP0000318", load_layouts = TRUE, force = TRUE)
# Show content of the CellGraph object
CellGraphs(pg_data_combined)[["Control_RCVCMP0000318"]]
A CellGraph object containing a bipartite graph with 1682 nodes and 3517 edges
Number of markers: 75
Layouts: wpmds_3d
As wpmds_3d
is a 3D layout, we will not use it for this tutorial in 2D
cell visualization.
As mentioned before, LoadCellGraphs
fetches the graph data from the
edgelist(s) in the PXL file(s) stored on disk, and returns a graph
representation of our selected component stored in memory in the Seurat
object. If we were to export this Seurat object to an RDS file, the
component graph that we just loaded would also be saved and we no longer
need the PXL file to access it. We can of course load all component
graphs in the Seurat object, but they are quite large and we rarely need
to load more than a few for visualization. For this reason, we recommend
carefully selecting components of interest to load.
Compute layouts
If we do not have access to pre-computed layouts, we can generate them
using the ComputeLayout
method. ComputeLayout
will check what
components have been loaded and compute a layout for each of these
components. All we need to do is to specify the layout method we want to
use. Here, we will test two methods: wpmds
and fr
.
pg_data_combined <- pg_data_combined %>%
ComputeLayout(layout_method = "wpmds") %>%
ComputeLayout(layout_method = "fr")
# Plot component with fr layout
Plot2DGraph(pg_data_combined, cells = "Control_RCVCMP0000318", layout_method = "wpmds")
# Plot component with kk layout
Plot2DGraph(pg_data_combined, cells = "Control_RCVCMP0000318", layout_method = "fr")
Plot2DGraph
can also color the nodes by marker counts with the
marker
parameter. For bipartite graphs, it is also possible to color
the nodes by their types: A or B.
Plot2DGraph(pg_data_combined, cells = "Control_RCVCMP0000318", layout_method = "wpmds",
marker = "node_type", show_Bnodes = TRUE)
If we now add edges to the plot, we see how in the bipartite graph always connects one UPIA to one UPIB.
Plot2DGraph(pg_data_combined, cells = "Control_RCVCMP0000318", layout_method = "wpmds",
marker = "node_type", show_Bnodes = TRUE, map_edges = TRUE)
Now, let’s calculate layouts for all components we picked. Here we only display the UPIA nodes.
pg_data_combined <- pg_data_combined %>%
LoadCellGraphs(cells = plot_components$component) %>%
ComputeLayout(layout_method = "wpmds", seed = 1)
Plot2DGraph(pg_data_combined, cells = plot_components$component, layout_method = "wpmds") +
plot_layout(ncol = 4) &
theme(plot.title = element_text(size = 8))
Now, let’s visualize the abundance of some markers within the
components. We will visualize the uropod markers CD37, CD50, and CD162,
as well as some background markers CD45, CD3E, and B2M. Here we simply
visualize the antibody count (log10(n + 1)
) by coloring each UPIA by
the number of antibody molecules detected, scaled from minimum to
maximum per marker.
plot_markers <-
c("CD50", "CD37", "CD162", "CD45", "CD3E", "B2M")
plot_titles <- paste(plot_components$sample,
plot_components$uropod_type,
plot_components$component_number,
sep = "\n") %>%
setNames(nm = plot_components$component)
Plot2DGraphM(pg_data_combined,
layout_method = "wpmds",
cells = plot_components$component,
markers = plot_markers,
colors = viridis::inferno(n = 11),
titles = plot_titles) +
plot_layout(heights = c(1, rep(2, 6)))
We can also visualize the distribution of a combination of markers. Here, we simply color UPIAs by whether CD37 and/or CD162 are detected.
detection_palette <-
c("Not detected" = "gray",
"CD162 detected" = "#0AA3B8",
"CD37 detected" = "#DB1048",
"Both detected" = "#F4B301")
# Fetch layout coordinates
plot_data <- lapply(plot_components$component, function(nm) {
cg <- CellGraphs(pg_data_combined)[[nm]]
counts <- CellGraphData(cg, slot = "counts")[, c("CD162", "CD37")] %>%
as.matrix() %>% as_tibble()
node_type <- CellGraphData(cg, slot = "cellgraph") %>%
pull(node_type)
CellGraphData(cg, slot = "layout")$wpmds %>%
mutate(component = nm) %>%
mutate(counts, node_type) %>%
filter(node_type == "A")
}) %>% do.call(bind_rows, .) %>%
left_join(plot_components, by = "component")
# Create categories depending on marker counts
plot_data_detection <-
plot_data %>%
mutate(category = case_when(CD37 == 0 &
CD162 == 0 ~ "Not detected",
CD37 == 0 ~ "CD162 detected",
CD162 == 0 ~ "CD37 detected",
TRUE ~ "Both detected")) %>%
arrange(CD37 + CD162)
plot_data_detection %>%
ggplot(aes(x, y, color = category)) +
geom_point(size = 0.5) +
scale_color_manual(values = detection_palette) +
facet_wrap( ~ uropod_type + sample + component_number, ncol = 4, scales = "free") +
theme_void() +
ggtitle("Colocalization of CD37 and CD162")
In this tutorial, we chose interesting cell components from our MPX data
and created layouts, depicting their spatial structure and marker
abundance in 2D. While 2D visualizations give us valuable insights into
the structure and composition of individual cell graphs, they are
simplified projections of
structures that are actually three-dimensional.
Those who are familiar with dimensionality reduction techniques such as UMAP and t-SNE have probably noticed that adding a third dimension can reveal patterns and relationships that are hidden in 2D. Still, 3D plots are rarely suitable to include in printed material such as publications and presentation since they can be much more difficult to interpret and requires a fixed perspective. MPX components are clearly 3D structures, but 2D plots can still be useful to summarize patterns across many components. We need to be mindful about the fact that these plots are simplified representations.
With that being said, if we want to show the full complexity of the intricate patterns of proteins on the cell membrane, we need to visualize the component graphs in 3D. If you are looking for that, jump to the 3D cell visualization tutorial to learn how to explore cell graphs in 3D.