Conversion Vignette • scanalysis

In this tutorial, we demonstrate the conversion utilities in scanalysis, which streamline analysis by letting you use functions from Bioconductor and Seurat interchangeably. In the current implementations of Seurat::as.SingleCellExperiment and Seurat::as.Seurat, a substantial amount of information is lost (for example, the variable features and any additional assays), which can prevent downstream analysis or cause errors once an object has been converted. We show some examples of this below before turning to the functionality in this package.

In this example, we aim to integrate two Seurat objects with the overall analysis plan as below:

1. Normalize using SCTransform (includes identifying highly variable genes)
2. Convert to SingleCellExperiment and back to Seurat objects (many possible reasons for this in practice, but we keep it general by just converting back and forth)
3. Integrate objects

We will note some roadblocks encountered with the Seurat conversion implementation and then introduce our own. Note that our implementation uses the Seurat implementation and builds on it.

First, we load scanalysis and our example Seurat object.

library(scanalysis)
library(magrittr)  # provides the %>% pipe used below

seurat_original = Seurat::pbmc_small

Next we normalize using SCTransform, which also identifies the highly variable genes.

seurat_normalized = seurat_original %>%
  Seurat::SCTransform(verbose = FALSE)

Next we convert to a SingleCellExperiment object, using the Seurat implementation.

sce_converted = seurat_normalized %>%
  Seurat::as.SingleCellExperiment()

We now convert this back into a Seurat object and note the information lost in the conversion process:

seurat_converted = sce_converted %>%
  Seurat::as.Seurat()

First we try to look for the highly variable genes in the converted object.

# Variable Features post-conversion:
head(Seurat::VariableFeatures(seurat_converted))
## logical(0)

# Variable Features pre-conversion:
head(Seurat::VariableFeatures(seurat_normalized))
## [1] "NKG7"  "PPBP"  "GNLY"  "PF4"   "GNG11" "GZMB"

In the converted object, the highly variable genes (HVGs) are lost. This makes sense, as they were never stored in the SCE object in the first place.
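To make the gap concrete: a SingleCellExperiment has no dedicated field for variable features, so any converter has to put them somewhere explicitly. The following is a minimal sketch on a toy object, assuming the metadata list is used for this purpose; the count matrix and gene names are made up for illustration, and the key name variable_features follows the convention described at the end of this vignette. It is not the package's actual code.

```r
library(SingleCellExperiment)

# Toy count matrix: 5 made-up genes x 4 made-up cells (illustrative only).
toy_counts = matrix(
  1:20, nrow = 5,
  dimnames = list(paste0("gene", 1:5), paste0("cell", 1:4))
)
toy_sce = SingleCellExperiment(assays = list(counts = toy_counts))

# Pretend these genes were selected as highly variable upstream.
toy_hvgs = c("gene2", "gene4")

# Store them in the metadata list so they travel with the object.
metadata(toy_sce)$variable_features = toy_hvgs

# Read them back.
metadata(toy_sce)$variable_features
## [1] "gene2" "gene4"
```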

Next, let’s look at some assay-specific information. When SCTransform is run, it creates a new assay named SCT (an alternate experiment, in SingleCellExperiment terminology). The details shown below generalize to any other assay, such as hashtags (commonly named HTO) or antibodies in a CITE-seq experiment (commonly named ADT).

# Alternate Experiment (SingleCellExperiment) names:
SingleCellExperiment::altExps(sce_converted)
## List of length 0
## names(0):

# Assay (Seurat) names post-conversion:
Seurat::Assays(seurat_converted)
## [1] "RNA"

# Assay (Seurat) names pre-conversion:
Seurat::Assays(seurat_normalized)
## [1] "RNA" "SCT"

Here, we see that the SCT assay was not converted into an alternate experiment in the SingleCellExperiment object, and that the converted Seurat object consequently lost that information. This means that downstream analysis of the normalized data is not possible after converting to SingleCellExperiment and back. If the Seurat object contained other assays, such as ADT or HTO, these would likewise be lost in the SingleCellExperiment and, in turn, in the converted Seurat object.
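A quick way to spot this kind of loss is to compare the assay names before and after a conversion. The helper below is a hypothetical convenience function written for this illustration (it is not part of Seurat or scanalysis); it works on plain character vectors, so it runs without any single-cell packages.

```r
# Hypothetical helper: report assay names that were present before a
# conversion but are missing afterwards.
lost_assays = function(before, after) {
  setdiff(before, after)
}

# Using the assay names printed above:
lost_assays(before = c("RNA", "SCT"), after = "RNA")
## [1] "SCT"
```

With the objects from this tutorial, the call would be lost_assays(Seurat::Assays(seurat_normalized), Seurat::Assays(seurat_converted)).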

We now illustrate a case where this causes a problem in an integration workflow. For simplicity, and to stay within the scope of this tutorial, we start integrating an object with itself, following the Seurat integration vignette (under the SCTransform tab). In practice, one would not integrate an object with itself, but the example still illustrates the need for comprehensive conversion between object types.

integration_list = list(seurat_converted, seurat_converted)

features = Seurat::SelectIntegrationFeatures(object.list = integration_list, nfeatures = 3000)

integration_list = Seurat::PrepSCTIntegration(integration_list, anchor.features = features)
## Error: The following assays have not been processed with SCTransform:
##  object: 1 - assay: RNA
##  object: 2 - assay: RNA

This fails because the SCT assay information was not carried over. In fact, even if the assay had been converted into an alternate experiment and back into an assay, it would still fail: the information in each slot of the Assay object must be converted into the SingleCellExperiment and back again. In this particular case, the data in the misc slot of each assay must be transferred to the metadata of the SingleCellExperiment. Other information, such as VariableFeatures in the Seurat object, requires adding new slots to the SingleCellExperiment object, which we detail below. For now, we illustrate how our implementation avoids all of the above problems.

We start with the normalized object, and convert into SCE and immediately back into Seurat.

sce_converted_new = seurat_normalized %>%
  seurat_to_sce()

seurat_converted_new = sce_converted_new %>%
  sce_to_seurat()

Now we try to look for the highly variable genes in the converted object.

# Variable Features post-conversion:
head(Seurat::VariableFeatures(seurat_converted_new))
## [1] "NKG7"  "PPBP"  "GNLY"  "PF4"   "GNG11" "GZMB"

# Variable Features pre-conversion:
head(Seurat::VariableFeatures(seurat_normalized))
## [1] "NKG7"  "PPBP"  "GNLY"  "PF4"   "GNG11" "GZMB"

The highly variable genes were successfully retained during the conversion process.

Now we look at the assay information:

# Alternate Experiment (SingleCellExperiment) names:
SingleCellExperiment::altExps(sce_converted_new)
## List of length 1
## names(1): RNA

# Assay (Seurat) names post-conversion:
Seurat::Assays(seurat_converted_new)
## [1] "SCT" "RNA"

# Assay (Seurat) names pre-conversion:
Seurat::Assays(seurat_normalized)
## [1] "RNA" "SCT"

# Default Assay post-conversion:
Seurat::DefaultAssay(seurat_converted_new)
## [1] "SCT"

# Default Assay pre-conversion:
Seurat::DefaultAssay(seurat_normalized)
## [1] "SCT"

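The assays were also retained during the conversion process, as was the default assay. Only the order of the assays differs: after conversion, the default assay (SCT) is listed first.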
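The manual comparisons above can be bundled into a single check. The function below is a hypothetical helper written for this illustration, not part of scanalysis; it relies only on the Seurat accessors already used in this tutorial. It compares assay names as sets because, as seen above, their order may change during conversion.

```r
# Hypothetical helper: stop with an error if two Seurat objects differ in
# their variable features, their set of assays, or their default assay.
check_round_trip = function(before, after) {
  stopifnot(
    identical(Seurat::VariableFeatures(before), Seurat::VariableFeatures(after)),
    setequal(Seurat::Assays(before), Seurat::Assays(after)),
    identical(Seurat::DefaultAssay(before), Seurat::DefaultAssay(after))
  )
  invisible(TRUE)
}

# Sanity check: an object always matches itself.
check_round_trip(Seurat::pbmc_small, Seurat::pbmc_small)
```

In this tutorial, check_round_trip(seurat_normalized, seurat_converted_new) would be the corresponding call; it should pass if the round trip preserved these three pieces of information in full.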

And the integration preparation step no longer fails:

integration_list = list(seurat_converted_new, seurat_converted_new)

features = Seurat::SelectIntegrationFeatures(object.list = integration_list, nfeatures = 3000)

integration_list = Seurat::PrepSCTIntegration(integration_list, anchor.features = features)

We don’t finish the integration, as this was just to illustrate an error caused by the Seurat conversion implementation.

We now briefly describe how the conversion is implemented and where each piece of information lives in the Seurat and SingleCellExperiment objects. Both data types may evolve over time, so these conversion implementations will need to evolve with them. If you notice any loss of data that this implementation does not cover, please create an issue so that we are aware of it and can implement a solution.

Each Assay in the Seurat object is stored as an altExp in the SingleCellExperiment object. Note the structural difference: a SingleCellExperiment stores its altExps as SingleCellExperiments nested under one main SingleCellExperiment, whereas a Seurat object keeps all Assays at the same level and marks one of them through the DefaultAssay attribute. When converting, the Assay to treat as the main experiment can be specified with the default_assay argument to seurat_to_sce. Interactively swapping the main experiment of the SingleCellExperiment, as one would with the DefaultAssay function in Seurat, is not currently supported. However, all functions in this package take an altExp argument where it makes sense; if altExp = NULL, the main (top-level) experiment is used.
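The nesting on the SingleCellExperiment side can be seen on a toy object. This sketch uses only the SingleCellExperiment package; the matrices, feature names, and the choice of "ADT" as the alternate experiment name are made up for illustration.

```r
library(SingleCellExperiment)

cells = paste0("cell", 1:4)

# Main experiment: 3 made-up genes; alternate experiment: 2 made-up tags.
main_counts = matrix(1:12, nrow = 3,
                     dimnames = list(paste0("gene", 1:3), cells))
alt_counts = matrix(1:8, nrow = 2,
                    dimnames = list(paste0("tag", 1:2), cells))

sce = SingleCellExperiment(assays = list(counts = main_counts))

# Nest a second SingleCellExperiment under the main one.
altExp(sce, "ADT") = SingleCellExperiment(assays = list(counts = alt_counts))

altExpNames(sce)          # names of the nested experiments
dim(sce)                  # dimensions of the main experiment
dim(altExp(sce, "ADT"))   # dimensions of the nested experiment
```

The main and nested experiments share the same cells but can have different features, which is why each Seurat Assay maps naturally onto one experiment.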

Within each object, the information from each Assay is added to the corresponding SingleCellExperiment (main or altExp). Information from the misc slot of the Seurat object is transferred to the metadata attribute of the SingleCellExperiment. A SingleCellExperiment has no dedicated slot for scaled data, so it is stored in metadata(sce)$scaled. The same is true of the variable features, which are stored in metadata(sce)$variable_features.
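Under this layout, reading a converted field back means looking in the metadata of either the main experiment or one of the altExps. The accessor below is a hypothetical sketch (its name and arguments are not part of scanalysis) that mimics the altExp = NULL convention described above; the toy objects and gene names are made up for illustration.

```r
library(SingleCellExperiment)

# Hypothetical accessor: fetch a metadata field from the main experiment
# (alt_exp = NULL) or from a named alternate experiment.
get_converted_field = function(sce, field, alt_exp = NULL) {
  target = if (is.null(alt_exp)) sce else altExp(sce, alt_exp)
  metadata(target)[[field]]
}

# Toy objects standing in for a converted SCE with one alternate experiment.
toy_mat = matrix(1:6, nrow = 2,
                 dimnames = list(c("g1", "g2"), paste0("cell", 1:3)))
toy_main = SingleCellExperiment(assays = list(counts = toy_mat))
toy_alt = SingleCellExperiment(assays = list(counts = toy_mat))
metadata(toy_main)$variable_features = "g1"
metadata(toy_alt)$variable_features = "g2"
altExp(toy_main, "RNA") = toy_alt

get_converted_field(toy_main, "variable_features")          # main experiment
get_converted_field(toy_main, "variable_features", "RNA")   # altExp "RNA"
```

The same accessor would retrieve the scaled data by passing "scaled" as the field, assuming the object was produced as described above.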