do_DimPlot() | Dimensional Reduction scatter plots

Dimensional reduction Plots (DimPlots) are a highly recognizable visualization in single-cell experiments. They allow users to visualize cells in a dimensional reduction embedding, such as PCA or UMAP. Users can color cells according to any desired groups, enabling visualization of any kind of categorical data on the cells in the dimensional reduction embedding.

Basic usage

DimPlots can be generated in SCpubr using the function SCpubr::do_DimPlot():

p <- SCpubr::do_DimPlot(sample = sample)

p

Modifying axes behavior

Axes are removed from the plot by default, together with the axis titles. This behavior can be overridden with plot.axes = TRUE:

# Bring back the axes.
p <- SCpubr::do_DimPlot(sample = sample,
                        plot.axes = TRUE)
p

Label the clusters

In some cases, we might want to remove the legend entirely, and instead plot labels on top of each cluster. This can be achieved by using label = TRUE.

# Put labels on top of the clusters.
p <- SCpubr::do_DimPlot(sample = sample,
                        label = TRUE)
p

These labels are, in essence, the result of applying ggplot2::geom_label() to the plot. However, we might prefer plain text instead of boxed labels. We can achieve that by providing label.box = FALSE.

# Labels as text
p <- SCpubr::do_DimPlot(sample = sample, 
                        label = TRUE,
                        label.box = FALSE)
p

We can also adjust other parameters of the function: label.color sets the color of the text inside the labels, and label.fill sets the background of the labels:

# Change the text color and the background fill of the labels.
p1 <- SCpubr::do_DimPlot(sample = sample, 
                         label = TRUE, 
                         label.color = "white",
                         label.fill = "black")

# Change the color of the text.
p2 <- SCpubr::do_DimPlot(sample = sample, 
                         label = TRUE, 
                         label.color = "black",
                         label.box = FALSE)
p <- p1 | p2
p

We can also modify the size of the labels or text with the label.size parameter:

# Change the size of the label text.
p1 <- SCpubr::do_DimPlot(sample = sample, 
                         label = TRUE, 
                         label.size = 6)

# Change the size of the text.
p2 <- SCpubr::do_DimPlot(sample = sample, 
                         label = TRUE, 
                         label.box = FALSE,
                         label.size = 6)
p <- p1 | p2
p

Finally, if the labels or text overlap one another, we can try to fix that by repelling them from each other with repel = TRUE:

# Repel the labels.
p1 <- SCpubr::do_DimPlot(sample = sample, 
                         label = TRUE, 
                         label.color = "black",
                         repel = TRUE)

# Repel the text.
p2 <- SCpubr::do_DimPlot(sample = sample, 
                         label = TRUE, 
                         label.color = "black",
                         label.box = FALSE,
                         repel = TRUE)
p <- p1 | p2
p

Changing the order of plotting

By default, SCpubr::do_DimPlot() plots cells in random order using shuffle = TRUE. This differs from the default behavior of Seurat::DimPlot(), which plots cells based on their identity factor levels.

p1 <- SCpubr::do_DimPlot(sample = sample,
                         reduction = "pca",
                         shuffle = TRUE)

p2 <- SCpubr::do_DimPlot(sample = sample,
                         reduction = "pca",
                         shuffle = FALSE)

p <- p1 | p2
p
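To see why the drawing order matters, here is a minimal base R sketch that is independent of SCpubr and Seurat. The data frame cells and its columns are made up for illustration, and the code only mimics the two orderings; it is not a description of how either package implements them. When rows are sorted by factor level, the cells of the last level are always drawn last and can hide the others, whereas a random permutation removes that bias.

```r
# Hypothetical toy data: six cells in three clusters (illustrative only).
set.seed(42)
cells <- data.frame(id = paste0("cell", 1:6),
                    cluster = factor(c("0", "1", "2", "0", "1", "2")))

# Drawing by factor level: all cells of cluster "2" end up at the bottom
# of the table, so they would be drawn last (on top of the others).
by_level <- cells[order(cells$cluster), ]

# Drawing in random order: a random permutation of the row indices.
shuffled <- cells[sample(nrow(cells)), ]

by_level$cluster
shuffled$cluster
```

Both tables contain exactly the same cells; only the order in which the points would be drawn differs.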

Highlighting cells

We can highlight a certain group of cells in the plot. This is achieved by using the cells.highlight parameter.

cells.use <- sample(x = colnames(sample), 
                    size = 1500)

p <- SCpubr::do_DimPlot(sample = sample,
                        cells.highlight = cells.use)

p

One can also change the color of the highlighted cells by providing a single color to colors.use, and the color of the non-selected cells with na.value:

# Change color of highlighted and non-highlighted cells.
p <- SCpubr::do_DimPlot(sample = sample, 
                        cells.highlight = cells.use,
                        colors.use = "dodgerblue",
                        na.value = "grey90")
p

The size of the highlighted dots can be modified with the parameter sizes.highlight.

# Increase the size of the highlighted cells.
p <- SCpubr::do_DimPlot(sample = sample, 
                        cells.highlight = cells.use, 
                        sizes.highlight = 2)
p

We can also highlight whole identities with the idents.highlight parameter. For this, just provide the identities to be selected. It also works in combination with cells.highlight.

# Using cells.highlight.
p1 <- SCpubr::do_DimPlot(sample = sample, 
                         cells.highlight = cells.use)

# Using idents.highlight.
p2 <- SCpubr::do_DimPlot(sample = sample, 
                         idents.highlight = c("6"))

# Using both.
p3 <- SCpubr::do_DimPlot(sample = sample, 
                         cells.highlight = cells.use, 
                         idents.highlight = c("6"))

p <- p1 | p2 | p3
p
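One plausible reading of the combined case is that the two selections are merged, so that a cell is highlighted if it appears in either of them. This is an assumption rather than something the text above states. The base R sketch below illustrates that idea with a made-up named factor standing in for the cell identities; idents, cells.from.idents and cells.both are hypothetical names.

```r
# Hypothetical identities: names are cell barcodes, values are clusters.
idents <- factor(c(cell1 = "6", cell2 = "0", cell3 = "6", cell4 = "1"))

# Cells selected by name (the role played by cells.highlight).
cells.use <- c("cell2", "cell3")

# Cells selected by identity (the role played by idents.highlight).
cells.from.idents <- names(idents)[idents %in% c("6")]

# Assumed combination: the union of both selections, without duplicates.
cells.both <- union(cells.use, cells.from.idents)
cells.both
```

If you need full control over which cells are highlighted, building the vector yourself in this way and passing it to cells.highlight alone avoids relying on how the two parameters interact.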

Restrict the identities displayed

Sometimes, we are interested in showing only some of the identities or groups in our sample. Instead of highlighting cells, we still want to keep the original colors and legend. For this use case, one might be tempted to just subset the sample as follows:

# Subset desired identities in a DimPlot.
p <- SCpubr::do_DimPlot(sample = sample[, sample$seurat_clusters %in% c("0", "5", "2", "4")])

p

However, we end up losing the UMAP silhouette. For this use case, SCpubr::do_DimPlot() introduces the idents.keep parameter, which takes a vector with the identities you want to keep. The rest of the cells are assigned a value of NA and are colored according to the na.value parameter:

# Select identities with idents.keep.
p1 <- SCpubr::do_DimPlot(sample = sample,
                         idents.keep = c("0", "5", "2",  "4"))

# The color of the non-selected cells can also be modified.
p2 <- SCpubr::do_DimPlot(sample = sample,
                         idents.keep = c("0", "5", "2",  "4"),
                         na.value = "grey50")
p <- p1 | p2
p
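The masking idea described above can be sketched in base R. This is only an illustration of assigning NA to the non-selected identities, using a made-up factor; it is not taken from the SCpubr source, and idents, keep and masked are hypothetical names.

```r
# Hypothetical identities for eight cells (illustrative only).
idents <- factor(c("0", "1", "2", "4", "5", "6", "0", "7"))
keep <- c("0", "5", "2", "4")

# Identities outside `keep` become NA. No cell is removed, so every
# cell would still be drawn and the overall silhouette is preserved.
masked <- idents
masked[!(masked %in% keep)] <- NA

# Drop the now-unused levels so they would not appear in a legend.
masked <- droplevels(masked)

table(masked, useNA = "ifany")
```

In contrast to subsetting the object, the number of cells stays the same; only their grouping value changes.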

Group by another metadata variable

So far, all the DimPlots have shown the identities that are currently set in the object, which can be checked with Seurat::Idents(sample). However, we might naturally want to display other metadata variables. This can be easily achieved with the group.by parameter:

# Generate another metadata variable to group the cells by.
sample$annotation <- sample(c("A", "B", "C"), ncol(sample), replace = TRUE)

# Group by another metadata variable.
p1 <- SCpubr::do_DimPlot(sample, 
                         group.by = "seurat_clusters")

p2 <- SCpubr::do_DimPlot(sample, 
                         group.by = "annotation")

p <- p1 | p2
p

Splitting by a category

Another useful parameter is split.by, which splits your DimPlot into multiple panels, each one containing a different unique value of the metadata variable provided to the argument. One can think of this as using the group.by parameter and then splitting the resulting DimPlot into different panels. In this example, we split by the different clusters. This is how it looks by default:

# SCpubr's DimPlot using split.by
p <- SCpubr::do_DimPlot(sample, 
                        split.by = "seurat_clusters", 
                        ncol = 5, 
                        legend.position = "none",
                        font.size = 24)

p

This will produce a plot with each group in a different panel, together with a summary plot containing all groups. If we are interested only in a subset of the possible values, we can provide idents.keep with a vector of the values to keep from the unique values in split.by:

# Using split.by and restricting the number of output plots with idents.keep.
p <- SCpubr::do_DimPlot(sample, 
                        split.by = "seurat_clusters", 
                        ncol = 3, 
                        idents.keep = c("0", "1", "7"),
                        legend.position = "none",
                        font.size = 24)

p

Group by a variable but split by another

Finally, users might want to split the UMAP using split.by while also grouping (coloring) the cells by another variable using group.by. Using these two parameters in combination yields the following:

# Mock an unbalanced orig.ident variable, then combine split.by and group.by.
sample$orig.ident <- sample(c("A", "B", "C"), 
                            ncol(sample), 
                            replace = TRUE, 
                            prob = c(0.05, 0.1, 0.85))

p <- SCpubr::do_DimPlot(sample, 
                        group.by = "seurat_clusters",
                        split.by = "orig.ident", 
                        font.size = 24)

p

And more!

There are many synergies across functions. Have a look at the section Shared features across functions!