This vignette will walk you through the process of setting up a dataset in scExploreR. This vignette is based on the Small Cell Lung Carcinoma dataset analyzed by Chan et al. 2021. To download the datasets for this tutorial, go here and download the following datasets:
- Combined samples
- SCLC epithelial cells
Chan et al. use all three datasets in their analysis, and the same will be done in this example. The “Epithelial cells” dataset is a subset of the “Combined samples” dataset that has been re-clustered. Likewise, the SCLC epithelial cells dataset is a subset of the epithelial cells dataset. See Figure 1A of Chan et al. for a visual representation.
You may choose either the Anndata or the Seurat formats for download, though we recommend the Anndata format due to improved performance and decreased RAM requirements. This article will cover operations on the Anndata object.
Note
While the anndata format is exclusive to python, anndata objects can be explored in R Studio with R-style commands using the Reticulate package. The example in this vignette uses Reticulate, along with the anndata R package, and SCUBA, which provides utility functions for common exploration/data accession operations in single cell objects.
By the end of this example, you will be able to set up an scExploreR instance with both datasets.
Process to host a dataset in scExploreR
- Explore object to determine appropriate data to display in app.
- Use the config app to select relevant data and generate a config file
- Launch scExploreR using the created config file
Dataset Exploration
To host a dataset in scExploreR, you must indicate which metadata variables, reductions, and modalities to include in the app. Whether you are setting up the app for end users or yourself, careful selection of the most relevant data is key to maximizing the effectiveness of scExploreR for your dataset.
To understand what aspects to include, it is best to explore the object to identify relevant modalities, metadata, and reductions. When hosting objects you have preprocesssed, you will likely be familiar enough with the object to skip this step.
In this section, we will load the “Combined Samples” and “SCLC epithelial cells” objects and explore relevant data to include in the scExploreR instance.
Note
The example below is one of many ways this step can be approached. With other objects, the process of understanding relevant data may also involve communicating with staff involved in sequencing of your samples and preprocessing of the data.
Note
For objects that you preprocess, it is good practice to include
details on the metadata in your object. This is helpful for when you
share your object upon publication, or for sharing your object
internally or with collaborators. There are many ways to do this, though
anndata formats make this easy using the unstructured (uns
)
data slot. One suggestion is to add a python dictionary to
uns
with relevant variable names as keys, and information
on each variable as values.
Run the code below to load the Combined Samples dataset, replacing the path with the path to the downloaded object.
library(anndata)
full_dataset <-
anndata::read_h5ad(
# Replace the path below with the path to the downloaded dataset
"~/datasets/small_cell_lung_cancer/SCLC_Full.h5ad"
)
Metadata
Metadata variables are chosen when setting up an object to populate dropdowns with options for grouping (coloring) cells on plots, or for splitting plots. Chosen metadata also populates the choices avaliable in subsetting, determining how the user can select sub-populations for further exploration.
scExploreR allows you to select as many metadata variables as you like, even all of the ones in the object, but the app functions best when included metadata is limited to a handful of relevant variables. Adding variables less relevant to the end user provides limited benefit, and may even overwhelm users.
The object in this example contains 35 metadata variables. Let’s explore the variables to narrow down which are the most relevant.
SCUBA::meta_varnames(full_dataset)
## [1] "ngenes"
## [2] "libsize"
## [3] "mito_frac"
## [4] "RBP_frac"
## [5] "batch"
## [6] "donor_id"
## [7] "treatment"
## [8] "procedure"
## [9] "histo"
## [10] "cell_type_coarse"
## [11] "cell_type_fine"
## [12] "cell_type_general"
## [13] "clusters"
## [14] "cell_type_med"
## [15] "H_knn"
## [16] "development_stage_ontology_term_id"
## [17] "sex_ontology_term_id"
## [18] "self_reported_ethnicity_ontology_term_id"
## [19] "assay_ontology_term_id"
## [20] "is_primary_data"
## [21] "tissue_ontology_term_id"
## [22] "disease_ontology_term_id"
## [23] "organism_ontology_term_id"
## [24] "cell_type_ontology_term_id"
## [25] "suspension_type"
## [26] "HTAN_Biospecimen_ID"
## [27] "HTAN_Participant_ID"
## [28] "cell_type"
## [29] "assay"
## [30] "disease"
## [31] "organism"
## [32] "sex"
## [33] "tissue"
## [34] "self_reported_ethnicity"
## [35] "development_stage"
Only categorical metadata variables may be selected for use in the app. Below we identify categorical metadata variables.
full_unique_vars <- SCUBA::meta_varnames(full_dataset)
# Iterate through metadata variables and return a vector specifying whether
# the variable is categorical (character, factor), or numeric (numeric,
# double, integer)
var_classes <-
sapply(
full_unique_vars,
# Compute the class of the variable
function(meta_var){
class(sclc_dataset$obs[[meta_var]])
}
)
# Sort to group classes together
sort(var_classes)
## batch
## "factor"
## donor_id
## "factor"
## treatment
## "factor"
## procedure
## "factor"
## histo
## "factor"
## clusters
## "factor"
## development_stage_ontology_term_id
## "factor"
## sex_ontology_term_id
## "factor"
## self_reported_ethnicity_ontology_term_id
## "factor"
## assay_ontology_term_id
## "factor"
## tissue_ontology_term_id
## "factor"
## disease_ontology_term_id
## "factor"
## organism_ontology_term_id
## "factor"
## cell_type_ontology_term_id
## "factor"
## suspension_type
## "factor"
## HTAN_Biospecimen_ID
## "factor"
## HTAN_Participant_ID
## "factor"
## cell_type
## "factor"
## assay
## "factor"
## disease
## "factor"
## organism
## "factor"
## sex
## "factor"
## tissue
## "factor"
## self_reported_ethnicity
## "factor"
## development_stage
## "factor"
## is_primary_data
## "logical"
## ngenes
## "NULL"
## libsize
## "NULL"
## cell_type_coarse
## "NULL"
## cell_type_fine
## "NULL"
## cell_type_general
## "NULL"
## cell_type_med
## "NULL"
## H_knn
## "NULL"
## mito_frac
## "numeric"
## RBP_frac
## "numeric"
This example will identify relevant metadata following the guiding questions below. The questions apply to other objects as well.
Guiding Questions for Metadata Selection
- Which variables represent cell type annotation, or clustering?
- If there are multiple variables for cell type, which are most relevant to the end user?
- Do the different variables reflect resolutions for cell type, and if so, which resolution is most appropriate?
- Is there a variable for patient (or sample)?
- Are there clinically relevant variables, or variables reflecting research goals of the study?
- Are there any other variables that delineate subsets of cells users may find interesting?
Note
While Numeric metadata can’t be used for subsetting or choices on plots, it may be plotted like features, if allowed via the config app. See the config app section for instructions on how to enable/disable this usage.
In the Chan et al. dataset, there are six metadata variables related to cell type: cell_type_coarse, cell_type, cell_type_fine, cell_type_general, clusters, and cell_type_med.
To identify which are relevant, one approach is to look at the unique
values for each variable. You can get this information in a single
function call using the unique_values()
function from the
SCUBA
package.
SCUBA::unique_values(
object = full_dataset,
var = "cell_type_fine"
)
## [1] T cell SCLC-N Endothelial SCLC-A SCLC-P
## [6] AE1 Basal Macrophage DC NSCLC
## [11] Ciliated Neuroendocrine Plasma cell Ionocyte Hepatocyte
## [16] Mast B cell Fibroblast AEP Neutrophil
## [21] Mucinous Erythrocyte Club Tuft
## 24 Levels: AE1 AEP B cell Basal Ciliated Club DC Endothelial ... Tuft
It may also be helpful to look at the number of unique values in the object for each variable.
cell_type_vars <-
c("cell_type_coarse",
"cell_type_fine",
"cell_type_general",
"cell_type",
"clusters",
"cell_type_med"
)
for (meta_var in cell_type_vars){
print(meta_var)
# Pass results of unique_values to length() to get the number of values
n_unique <-
SCUBA::unique_values(
object = full_dataset,
var = meta_var
) |>
length()
print(n_unique)
}
## [1] "cell_type_coarse"
## [1] 4
## [1] "cell_type_fine"
## [1] 24
## [1] "cell_type_general"
## [1] 3
## [1] "cell_type"
## [1] 11
## [1] "clusters"
## [1] 56
## [1] "cell_type_med"
## [1] 16
We see that cell_type_general
and
cell_type_coarse
have the lowest resolution, and that there
are very many clusters in clusters
.
cell_type_fine
and cell_type_med
have a
middling resolution.
Which resolution is most appropriate? Let’s plot the each one find out.
library(patchwork)
cell_type_vars <-
c("cell_type_general",
"cell_type_coarse",
"cell_type",
"cell_type_med",
"cell_type_fine",
"clusters"
)
plots <- list()
for (meta_var in cell_type_vars){
plots[[meta_var]] <-
SCUBA::plot_reduction(
full_dataset,
group_by = meta_var,
label = TRUE
)
}
wrap_plots(plots, ncol = 2)
From the plots, we see that the resolution in
cell_type_coarse
matches the resolution in Figure
1A.
Next, we look at patient or sample metadata. There are also multiple
variables for sample metadata in this object: donor_id
,
HTAN_Biospecimen_ID
, HTAN_Participant_ID
, and
batch
. The levels in the donor_id
variable
most closely match the patient IDs specified by Chan et al. 2021 in
Supplementary Table S1.
SCUBA::unique_values(
object = full_dataset,
var = "donor_id"
) |>
stringr::str_sort(numeric = TRUE)
## [1] "PleuralEffusion" "RU255" "RU426" "RU653"
## [5] "RU661" "RU666" "RU675" "RU676"
## [9] "RU681" "RU682" "RU684" "RU685"
## [13] "RU699" "RU701" "RU779" "RU1038"
## [17] "RU1057" "RU1061" "RU1065" "RU1066"
## [21] "RU1080" "RU1108" "RU1124" "RU1128"
## [25] "RU1134" "RU1135" "RU1137" "RU1138"
## [29] "RU1144" "RU1145" "RU1152" "RU1170"
## [33] "RU1181" "RU1195" "RU1215" "RU1229"
## [37] "RU1231" "RU1262" "RU1271" "RU1293"
## [41] "RU1311" "RU1322"
treatment
, tissue
, and
self_reported_ethnicity
are also of interest, since they
allow for comparison of cells in the app based on clinical metadata.
treatment
and tissue
appear in Figure
1B and Figure 1C of Chan et al.,
respectively.
Modalities and Reductions
The features included in the app for plotting are based on the
included modalities. For anndata objects, expression data for features
in the primary modality is stored in the X
slot, and data
for additional modalities is stored in obsm
. Dimensional
reduction coordinates for each cell are also stored in
obsm
. Let’s use the obsm_keys()
method to see
what alternate modalities and reductions exist.
full_dataset$obsm_keys()
## [1] "X_pca" "X_umap"
We observe see two matrices in obsm
: X_umap
and X_pca
. These correspond to the UMAP and PCA reductions,
respectively.
The UMAP reduction is preferred as it is the current standard for scRNA-seq visualization. The PCA reduction is of limited utility, partially due to the number of dimensions. Plotting in the app is best suited for reductions with exactly two dimensions, so only the UMAP reduction should be included.
# Number of dimensions in each reduction
full_dataset$obsm$X_umap |> dim()
## [1] 147137 2
full_dataset$obsm$X_pca |> dim()
## [1] 147137 50
No alternate modalities are observed in the obsm
slot
for this object. There is only the main modality, gene expression, in
the X
slot.
Note
Other datasets may include surface protein (ADT) data, computed gene
signatures per cell, or other data. For anndata objects,
alternate modalities must be stored in obsm
to be
accessible to scExploreR (with the exception of X
,
of course). Seurat and SingleCellExperiment objects have defined slots
for modalities: @assays
in Seurat objects, and “alternate
experiments” for SingleCellExperiment objects. For these object types,
scExploreR will use these slots to access modality information. As long
as feature expression data for each modality is stored using the
convention for Seurat and SingleCellExperiment objects, no additional
action is needed for scExploreR to find this data.
SCLC Epithelial Cells Object
Next, we explore the SCLC Epithelial Cells dataset to see if the characteristics exposed differ from the combined dataset.
library(anndata)
sclc_dataset <-
anndata::read_h5ad(
# Replace the code below with the path to the downloaded dataset
"~/datasets/small_cell_lung_cancer/Epithelial_SCLC.h5ad"
)
To determine if there are metadata variables that are in the SCLC
object that are not in the full object, run setdiff()
on
the variable names from each. If you want to see variables that are in
the full object that are not in the SCLC object, switch the order of the
parameters to setdiff()
.
# Identify metadata variables that are specific to the SCLC object
sclc_unique_vars <-
setdiff(
SCUBA::meta_varnames(sclc_dataset),
SCUBA::meta_varnames(full_dataset)
)
sclc_unique_vars
## [1] "n_genes_by_counts" "log1p_n_genes_by_counts"
## [3] "total_counts" "log1p_total_counts"
## [5] "SCLC_subtype" "SCLC_subtype_plus_TP53_RB1_wt"
## [7] "pval_SCLC-A" "pval_SCLC-N"
## [9] "pval_SCLC-P" "NEUROD genes"
## [11] "MHCI_PRESENTATION" "MHCII_PRESENTATION"
## [13] "SCLC_major_subtype_of_sample" "treatment_categorized"
## [15] "recurrent_cluster" "clusters_fine"
From this we see 16 variables that are not in the full object. Let’s view which variables are categorical.
# Iterate through metadata variables and return a vector specifying whether
# the variable is categorical (character, factor), or numeric (numeric,
# double, integer)
var_classes <-
sapply(
sclc_unique_vars,
# Compute the class of the variable
function(meta_var){
class(sclc_dataset$obs[[meta_var]])
}
)
# Sort to group classes together
sort(var_classes)
## SCLC_subtype SCLC_subtype_plus_TP53_RB1_wt
## "factor" "factor"
## SCLC_major_subtype_of_sample treatment_categorized
## "factor" "factor"
## recurrent_cluster clusters_fine
## "factor" "factor"
## n_genes_by_counts log1p_n_genes_by_counts
## "numeric" "numeric"
## total_counts log1p_total_counts
## "numeric" "numeric"
## pval_SCLC-A pval_SCLC-N
## "numeric" "numeric"
## pval_SCLC-P NEUROD genes
## "numeric" "numeric"
## MHCI_PRESENTATION MHCII_PRESENTATION
## "numeric" "numeric"
We see that SCLC_subtype, SCLC_subtype_plus_TP53_RB1_wt,
SCLC_major_subtype_of_sample, treatment_categorized, recurrent_cluster,
and clusters_fine are categorical variables. Let’s plot them below to
better understand them. clusters
and treatment
are added for comparison with clusters_fine
and
treatment_categorized
, respectively.
cell_type_vars <-
c("SCLC_subtype",
"SCLC_subtype_plus_TP53_RB1_wt",
"SCLC_major_subtype_of_sample",
"treatment_categorized",
"treatment",
"recurrent_cluster",
"clusters_fine",
"clusters"
)
plots <- list()
for (meta_var in cell_type_vars){
plots[[meta_var]] <-
SCUBA::plot_reduction(
sclc_dataset,
group_by = meta_var,
label = TRUE
)
}
patchwork::wrap_plots(plots, ncol = 2)
From the plots, we see that SCLC_major_subtype_of_sample
is identical to SCLC_subtype
, and
SCLC_subtype_plus_TP53_RB1_wt
has the the same information
as the former variables, but includes “TP53/RB1-wt SCLC”. Figure
1G of Chan et al. Includes this designation, so we will use
SCLC_subtype_plus_TP53_RB1_wt
.
For clustering, the clusters in clusters
appear to align
with the representation of cluster 22 in Figure 2C, and
clusters_fine
does not, so we will use
clusters
.
The recurrent_cluster
variable appears to align with
cluster 22 in clusters
. Below we test if the cell IDs in
cluster 22 are identical to the IDs for cells where
recurrent_cluster
is TRUE.
# Use SCUBA::get_all_cells to return cell IDs for cells with a value
# of TRUE for the "recurrent cluster" column
recurrent_cluster_cell_names <-
SCUBA::get_all_cells(
# All cells from the subset below will be returned
sclc_dataset[sclc_dataset$obs$recurrent_cluster == "True", ]
)
# Return cell IDs for cluster 22
cluster_22_cell_names <-
SCUBA::get_all_cells(
sclc_dataset[sclc_dataset$obs$clusters == "22", ]
)
# Test if ID vectors are identical
all.equal(
recurrent_cluster_cell_names,
cluster_22_cell_names
)
## [1] TRUE
The cells where recurrent_cluster
is TRUE are the same
cells as those in cluster 22. Since cluster 22 recurs across patients
(Figures 3 A-C of Chan et al.), and is shown to be
associated with a malignant phenotype (Figure S3B),
including a binary designation for this cluster would create a useful
shortcut for end users wishing to examine cells in cluster 22 versus
other cells. We will therefore include this variable for the SCLC
object.
For treatment_categorized
, the values in this variable
align with the information in Figure 1B and
Table S1 of Chan et al. We will use
treatment_categorized
instead of
treatment
.
Modalities and Reductions
There are no alternate modalities in obsm
. The
reductions in obsm
are the same as for the main object,
with the addition of X_diffmap
. This reduction does not
appear in the figures of Chan et al. and is not explicitly mentioned in
the methods section, so it is best to exclude it to keep things simple
for the end user.
Configure Dataset-specific scExploreR Settings in the Config App
Now that we have identified metadata, reductions, and modalities to include for the dataset, the next step is to apply these settings in the config app.
Anndata-specific Preparation for the Config App
Before loading anndata objects into the config app, the keys of
alternate modalities and reductions must be defined, since both types of
data are stored in obsm
, and there is no exclusive storage
slot for reductions or modalities in anndata objects.
Alternate modalities and reductions should be stored in
uns
, under scExploreR_assays
and
scExploreR_reductions
, respectively. This is done below in
python, via reticulate, for both the full object and the SCLC
object.
Note
This step is not required for Seurat or SingleCellExperiment objects.
# If you are working in RStudio, make sure to replace the `r` in the
# code chunk with `python`
# Full object
# Use `r.` to access the full dataset in python using Reticulate
# No non-RNA modalities
r.full_dataset.uns["scExploreR_assays"] = []
r.full_dataset.uns["scExploreR_reductions"] = ["X_umap", "X_pca"]
# SCLC object
r.sclc_dataset.uns["scExploreR_assays"] = []
# Indicate that X_diffmap is a reduction, even though it will not be used
r.sclc_dataset.uns["scExploreR_reductions"] = ["X_umap", "X_pca", "X_diffmap"]
After adding the data to uns
, make sure to save both
objects. They can now be loaded into the config app for analysis.
anndata::write_h5ad(
full_dataset,
filename = "~/datasets/small_cell_lung_cancer/SCLC_Full.h5ad"
)
anndata::write_h5ad(
sclc_dataset,
filename = "~/datasets/small_cell_lung_cancer/Epithelial_SCLC.h5ad"
)
Launching the Config App
To launch the config app, run the code below replacing
object_path
with the path to the downloaded dataset.
scExploreR::run_config(
object_path = "~/datasets/small_cell_lung_cancer/SCLC_Full.h5ad"
)
Upon loading the config app, you will notice several tabs. The general tab is selected by default. This tab is used to enter basic information on the dataset, which is displayed in the app on the menu that appears when switching between datasets. This is useful when an app is created with multiple datasets, which is the case here.
Use the “Label for Dataset” field to add a title for the dataset. In this case, we will use “Small Cell Lung Carcinoma Atlas”.
In the “Description of Dataset” field, add a summary of the dataset. This should be a high-level description, which can include information like the number of cells, the number of samples, the types of samples included, and other relevant details on the experimental design.
147k cells from 19 patients. Twenty-one small cell lung cancer samples are included, along with 24 lung adenocarcinoma samples, and four tumor-adjacent normal lung samples included as controls. See Chan et al. Cancer Cell 2021 for more details.
Next, use the “Content for Dataset Preview” menu to add a plot thumbnail used to preview the dataset in the selection menu. In this example, the plot will serves as an additional guide when switching between the datasets, since they can be compared against the figures in the paper. You can use the settings below to determine which metadata to show, and which reduction to use (in this example, this should be “X_umap”).
Assays (Modalities) Tab
Switch to the assays tab to enter information about the modalities in
the object. We identified previously that there is only one modality in
this object (gene expression, in X
).
To include a modality in scExploreR, select it under “Available Assays”. Once selected, it will appear under “Selected Assays”, and a card will appear on the right side of the screen with modality-specific options. Use the text entry to provide a display name for the modality. The display name should be singular instead of plural since the name will display in the context of a single feature from that assay, so in this case, it should be “Gene”.
If you want the name to display in plots, click the checkbox. This is helpful when you have features in multiple modalities with the same name, though in this example there is only one modality, so the box can be left unchecked.
Note
When configuring an object with multiple modalities, the checkbox for the “Gene” assay should generally not be checked. Usually most of the features in an object will be genes, so it is not necessary to specify that a feature is a gene. For other modalities, it generally does make sense to add the name of modality to the plot. It is especially helpful to add a label when feature names overlap with names in the gene assay. This can be the case for ADT data, where several surface proteins overlap with human gene symbols (i.e. CD34 and CD36).
Metadata Tab
The metadata tab is used to identify the metadata variables to use in the app, and specify settings for each variable.
Drag a metadata variable from the “Available Metadata” column to the “Selected Metadata” column to include it in the app. Use the text box to enter a display name for the variable. Drag “clusters” to the “Selected Metadata” column and add the label “Clusters”.
You can drag the variables in the “Selected Metadata” column to change their order. Menus in the app will reflect the order of the variables.
Drag the metadata columns listed below to the “Selected Metadata” column, and add the suggested display names.
Suggested Metadata Options
Reductions Tab
Next, set settings for reductions in the reductions tab. All reductions are included by default, though the PCA in this example is not useful for plotting, so there is no reason to include it. Select the PCA reduction to remove it.
The labels for each reduction default to the keys assigned to them in the object. Let’s rename the label for the UMAP reduction to “UMAP”.
Save Config file
Now that we’ve filled out all the information in the config file, let’s save the file to disk. To do this, navigate to the orange button, select “Save Config File”, and use the system prompt that appears to save the file. The file will save with a “.yaml” extension.
Note
The recommended filename is the the name of your object plus “_config”, though you can use any name you wish.
Create Config File for Epithelial SCLC Object
Next, we’ll create the config file for the SCLC object. Run the config app and repeat the process above, using the suggested settings below as a guide.
scExploreR::run_config(
object_path = "~/datasets/small_cell_lung_cancer/Epithelial_SCLC.h5ad"
)
Suggested Settings for Epithelial SCLC Object
General Tab
Label for dataset: SCLC Epithelial Cells
Description: 54k cells from epithelial tissue, designated as SCLC tumor cells. This is a subset of the combined samples dataset that has been re-clustered. For more information, see Chan et al. Cancer Cell 2021.
Dataset Preview:
- DimPlot
- Metadata to Group By: SCLC_subtype_plus_TP53_RB1_wt
- Metadata to Split By: none
- Choose Reduction: X_umap
- Label Groups: Checked
- Show Legend: Checked
Create an App Config File
The final step before launching scExploreR is to create an app config file. The app config file specifies the paths of all objects used in the app and their respective config files, and it also defines app-wide settings. An example app config file is given below.
datasets
is a list of all the datasets in the app. Each
key under datasets
represents a dataset to be loaded in the
app instance (in this case, the dataset keys are
combined_sclc
and epithelial_sclc
). The keys
under datasets
can be any name, as long as spaces are not
used (use underscores instead). Any descriptive name for the dataset
will work.
Under the dataset keys, there are two keys: object
and
config
. These should be the path to the object and config
file, respectively.
The RSTUDIO_PANDOC
key is used to define a path to an
RStudio pandoc installation. This should only be necessary when running
on a linux server.
To determine if this is necessary, run
rmarkdown::pandoc_available()
in your R console. If
TRUE
is returned, RSTUDIO_PANDOC
does not need
to be specified. In place of a path, enter a YAML NULL
(~
) for this key. If you see FALSE
, you will
need to install pandoc manually and provide the path to the
installation. rmarkdown::find_pandoc()
may be able to find
the path.
# YAML version
%YAML 1.1
---
datasets:
combined_sclc:
# Replace path with path to object
object: "~/datasets/small_cell_lung_cancer/SCLC_Full.h5ad"
# Replace with path to config file
config: "~/datasets/small_cell_lung_cancer/SCLC_Full_config.yaml"
epithelial_sclc:
object: "~/datasets/small_cell_lung_cancer/Epithelial_SCLC.h5ad"
config: "~/datasets/small_cell_lung_cancer/Epithelial_SCLC_config.yaml"
RSTUDIO_PANDOC: ~
You can create a config file from R Studio by going to File >> New File >> Text File. After creating the text file, save it in the same directory as the object and config files, and be sure to add “.yaml” to the filename. After saving initially, use the structure above and define the paths to the object and config files.
Note
Don’t forget the YAML version at the top. Version 1.1 is the latest
version supported by the yaml
package used to read the data
into R.
Run scExploreR
The app can now be ran by providing the path to the app config file.
Use the browser_config
parameter to specify the path.
scExploreR::run_scExploreR(
browser_config = "~/datasets/small_cell_lung_cancer/sclc_app_config.yaml"
)
Note
If you only have one object, you don’t need to make an app config
file. You can simply run run_scExploreR
, using the
object_path
and config_path
parameters instead
of browser_config
.
scExploreR::run_scExploreR(
object_path = "~/datasets/small_cell_lung_cancer/SCLC_Full.h5ad",
config_path = "~/datasets/small_cell_lung_cancer/SCLC_Full_config.yaml"
)