| Title: | Adaptive Machine Learning-Powered, Context-Matching Tool for Single-Cell and Spatial Transcriptomics Annotation |
|---|---|
| Description: | Annotates single-cell and spatial-transcriptomic (ST) data using context-matching marker datasets. It creates a unified marker list (`Markers_list`) from multiple sources: built-in curated databases ('Cellmarker2', 'PanglaoDB', 'ScType', 'scIBD', 'TCellSI', 'PCTIT', 'PCTAM'), Seurat objects with cell labels, or user-provided Excel tables. SlimR first uses adaptive machine learning for parameter optimization, and then offers two automated annotation approaches: 'cluster-based' and 'per-cell'. Cluster-based annotation assigns one label per cluster using expression-based probability calculation and AUC validation. Per-cell annotation assigns labels to individual cells using three scoring methods with adaptive thresholds and ratio-based confidence filtering, plus optional UMAP spatial smoothing, making it well suited to heterogeneous clusters and rare cell types. The package also supports semi-automated workflows with heatmaps, feature plots, and combined visualizations for manual annotation. For more information, see the package documentation at <https://github.com/zhaoqing-wang/SlimR>. |
| Authors: | Zhaoqing Wang [aut, cre] (ORCID: <https://orcid.org/0000-0001-8348-7245>) |
| Maintainer: | Zhaoqing Wang <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.1.5 |
| Built: | 2026-06-04 11:06:08 UTC |
| Source: | https://github.com/zhaoqing-wang/slimr |
Measures the degree of separation between different cell clusters based on expression patterns.
calculate_cluster_variability(data.features, features)
data.features |
Data frame containing expression data and cluster labels |
features |
Feature names to include in analysis |
Numeric value representing cluster separation strength
Other Section_1_Functions_Use_in_Package:
calculate_expression(),
calculate_expression_skewness(),
calculate_probability(),
compute_adaptive_parameters(),
estimate_batch_effect(),
extract_dataset_features()
Calculates the average expression of a gene set (used within the package)
calculate_expression(object, features, assay = NULL, cluster_col = NULL, colour_low = "white", colour_high = "navy")
object |
Enter a Seurat object. |
features |
Enter one or a set of markers. |
assay |
Enter the assay used by the Seurat object, such as "RNA". Default parameters use "assay = NULL". |
cluster_col |
Enter the meta.data column in the Seurat object to be annotated, such as "seurat_cluster". Default parameters use "cluster_col = NULL". |
colour_low |
Color for lowest expression level. (default = "white") |
colour_high |
Color for highest expression level. (default = "navy") |
Average gene expression and related information for the input "Seurat" object, given "cluster_col" and "features".
Other Section_1_Functions_Use_in_Package:
calculate_cluster_variability(),
calculate_expression_skewness(),
calculate_probability(),
compute_adaptive_parameters(),
estimate_batch_effect(),
extract_dataset_features()
Computes the average skewness of gene expression distributions across all features.
calculate_expression_skewness(expression_matrix)
expression_matrix |
Matrix of expression values |
Mean absolute skewness across all genes
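To make "mean absolute skewness" concrete, the following base-R sketch computes a moment-based skewness per gene and averages the absolute values. It is an illustration under assumptions, not the package's implementation: the helper names `skewness_one()` and `mean_abs_skewness()` are hypothetical, genes are assumed to be in rows, and the package may use a different skewness estimator.

```r
# Moment-based sample skewness of one numeric vector (assumed definition).
skewness_one <- function(x) {
  x <- x[!is.na(x)]
  s <- sqrt(mean((x - mean(x))^2))
  # Skewness is undefined for very short or constant vectors.
  if (length(x) < 3 || s == 0) return(NA_real_)
  mean((x - mean(x))^3) / s^3
}

# Mean absolute skewness across genes; genes are assumed to be rows.
mean_abs_skewness <- function(expression_matrix) {
  per_gene <- apply(expression_matrix, 1, skewness_one)
  mean(abs(per_gene), na.rm = TRUE)
}

# Toy matrix: 3 genes x 6 cells (made-up values).
toy <- rbind(
  GeneA = c(0, 0, 0, 0, 1, 5),  # right-skewed
  GeneB = c(1, 2, 3, 4, 5, 6),  # symmetric
  GeneC = c(2, 2, 2, 2, 2, 2)   # constant, dropped from the average
)
result <- mean_abs_skewness(toy)
print(result)
```

Constant genes are excluded here because their skewness is undefined; how the package treats them is not stated in this manual.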
Other Section_1_Functions_Use_in_Package:
calculate_cluster_variability(),
calculate_expression(),
calculate_probability(),
compute_adaptive_parameters(),
estimate_batch_effect(),
extract_dataset_features()
Calculates gene set expression and infers probabilities with control datasets (used within the package)
calculate_probability(object, features, assay = NULL, cluster_col = NULL, min_expression = 0.1, specificity_weight = 3)
object |
Enter a Seurat object. |
features |
Enter one or a set of markers. |
assay |
Enter the assay used by the Seurat object, such as "RNA". Default parameters use "assay = NULL". |
cluster_col |
Enter the meta.data column in the Seurat object to be annotated, such as "seurat_cluster". Default parameters use "cluster_col = NULL". |
min_expression |
The min_expression parameter defines a threshold value to determine whether a cell's expression of a feature is considered "expressed" or not. It is used to filter out low-expression cells that may contribute noise to the analysis. Default parameters use "min_expression = 0.1". |
specificity_weight |
The specificity_weight parameter controls how much the expression variability (standard deviation) of a feature within a cluster contributes to its "specificity score." It amplifies or suppresses the impact of variability in the final score calculation. Default parameters use "specificity_weight = 3". |
Average expression of genes in the input "Seurat" object given "cluster_col" and given "features".
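As an illustration of how a `min_expression` threshold can separate expressing from non-expressing cells, here is a minimal base-R sketch on made-up values. It is not the package's probability formula: the vectors `expr` and `cluster` are hypothetical stand-ins for one marker's per-cell values and the `cluster_col` labels, and `specificity_weight` is not modelled.

```r
# Toy expression values for one marker across 8 cells (made-up numbers).
expr    <- c(0.0, 0.05, 1.2, 2.0, 0.0, 0.0, 0.3, 0.08)
cluster <- c("0", "0", "0", "0", "1", "1", "1", "1")
min_expression <- 0.1

# A cell counts as "expressed" only when its value exceeds the threshold.
is_expressed <- expr > min_expression

# Per-cluster fraction of expressing cells and per-cluster mean expression.
frac_expressed  <- tapply(is_expressed, cluster, mean)
mean_expression <- tapply(expr, cluster, mean)

summary_table <- data.frame(
  cluster         = names(frac_expressed),
  frac_expressed  = as.numeric(frac_expressed),
  mean_expression = as.numeric(mean_expression)
)
print(summary_table)
```

Raising `min_expression` in this sketch lowers the expressing fraction, which is the noise-filtering effect the parameter description refers to.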
Other Section_1_Functions_Use_in_Package:
calculate_cluster_variability(),
calculate_expression(),
calculate_expression_skewness(),
compute_adaptive_parameters(),
estimate_batch_effect(),
extract_dataset_features()
A dataset containing marker genes for different cell types from Cellmarker2
Cellmarker2
A data frame with 8 columns:
This dataset is used to filter and create a standardized marker list. The dataset can be filtered based on species, tissue class, tissue type, cancer type, and cell type to generate a list of marker genes for specific cell types.
http://117.50.127.228/CellMarker/
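The filtering described above can be sketched in base R on a toy table. This is only an illustration: the column names (`species`, `tissue_type`, `cell_type`, `marker`) and the rows are hypothetical and are not taken from the actual Cellmarker2 data frame, whose columns are not listed in this manual.

```r
# Toy marker table; columns and rows are made up for illustration.
toy_db <- data.frame(
  species     = c("Human", "Human", "Human", "Mouse", "Human"),
  tissue_type = c("Blood", "Blood", "Blood", "Blood", "Liver"),
  cell_type   = c("T cell", "T cell", "B cell", "T cell", "Hepatocyte"),
  marker      = c("CD3D", "CD3E", "CD79A", "Cd3d", "ALB"),
  stringsAsFactors = FALSE
)

# Keep only rows matching the chosen filters.
kept <- subset(toy_db, species == "Human" & tissue_type == "Blood")

# Split into a named list: one data frame of markers per cell type.
markers_by_type <- lapply(
  split(kept$marker, kept$cell_type),
  function(genes) data.frame(marker = unique(genes), stringsAsFactors = FALSE)
)
str(markers_by_type)
```

The result is a list named by cell type whose first column holds the markers, which matches the general shape of the `gene_list` argument described elsewhere in this manual.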
Other Section_0_Database:
Cellmarker2_raw,
Cellmarker2_table,
Markers_list_PCTAM,
Markers_list_PCTIT,
Markers_list_TCellSI,
Markers_list_scIBD,
PanglaoDB,
PanglaoDB_raw,
PanglaoDB_table,
ScType,
ScType_raw,
ScType_table
A dataset containing marker genes for different cell types from Cellmarker2
Cellmarker2_raw
A data frame with 20 columns contained in the Cellmarker2 database:
This dataset is used to filter and create a standardized marker list. The dataset can be filtered based on species, tissue class, tissue type, cancer type, and cell type to generate a list of marker genes for specific cell types.
http://117.50.127.228/CellMarker/
Other Section_0_Database:
Cellmarker2,
Cellmarker2_table,
Markers_list_PCTAM,
Markers_list_PCTIT,
Markers_list_TCellSI,
Markers_list_scIBD,
PanglaoDB,
PanglaoDB_raw,
PanglaoDB_table,
ScType,
ScType_raw,
ScType_table
A dataset containing marker genes for different cell types from Cellmarker2
Cellmarker2_table
A list containing different filter types, such as species, tissue_class, tissue_type, cancer_type, and cell_type.
This list is used to choose filters for creating a standardized marker list.
http://117.50.127.228/CellMarker/
Other Section_0_Database:
Cellmarker2,
Cellmarker2_raw,
Markers_list_PCTAM,
Markers_list_PCTIT,
Markers_list_TCellSI,
Markers_list_scIBD,
PanglaoDB,
PanglaoDB_raw,
PanglaoDB_table,
ScType,
ScType_raw,
ScType_table
This function assigns SlimR predicted cell types to a Seurat object based on cluster annotations, and stores the results in the meta.data slot.
Celltype_Annotation(seurat_obj, cluster_col, SlimR_anno_result, plot_UMAP = TRUE, annotation_col = "Cell_type_SlimR")
seurat_obj |
A Seurat object containing cluster information in meta.data. |
cluster_col |
Character string indicating the column name in meta.data that contains cluster IDs. |
SlimR_anno_result |
List generated by the function Celltype_Calculate(), containing a data.frame in $Prediction_results with two columns: (1) cluster_col, the cluster identifiers (these should match cluster_col in meta.data); (2) Predicted_cell_type, the predicted cell type for each cluster. |
plot_UMAP |
logical(1); if TRUE, plot the UMAP with cell type annotations. |
annotation_col |
Name of the 'meta.data' column in which to write the predicted cell type. (default = "Cell_type_SlimR") |
A Seurat object with updated meta.data containing the predicted cell types.
If plot_UMAP = TRUE, this function will print a UMAP plot as a side effect.
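The cluster-to-label assignment described above amounts to a table lookup. The base-R sketch below shows the idea on plain data frames; it is not the package's code. `meta` is a hypothetical stand-in for the Seurat object's meta.data and `prediction_results` for `SlimR_anno_result$Prediction_results`, with made-up values.

```r
# Stand-in for seurat_obj@meta.data: one row per cell (toy values).
meta <- data.frame(
  seurat_clusters = c("0", "1", "0", "2", "1"),
  row.names = paste0("cell_", 1:5),
  stringsAsFactors = FALSE
)

# Stand-in for SlimR_anno_result$Prediction_results (toy values).
prediction_results <- data.frame(
  cluster_col         = c("0", "1", "2"),
  Predicted_cell_type = c("T cell", "B cell", "Monocyte"),
  stringsAsFactors = FALSE
)

# Look up each cell's cluster in the prediction table.
idx <- match(meta$seurat_clusters, prediction_results$cluster_col)

# Write the label into the column named by annotation_col.
annotation_col <- "Cell_type_SlimR"
meta[[annotation_col]] <- prediction_results$Predicted_cell_type[idx]
print(meta)
```

Cells whose cluster is missing from the prediction table would receive `NA` in this sketch, which is why the cluster identifiers in both tables should match.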
Other Section_3_Automated_Annotation:
Celltype_Annotation_PerCell(),
Celltype_Calculate(),
Celltype_Calculate_PerCell(),
Celltype_Verification(),
Celltype_Verification_PerCell(),
Parameter_Calculate(),
percell_workflow
## Not run:
sce <- Celltype_Annotation(seurat_obj = sce,
  cluster_col = "seurat_clusters",
  SlimR_anno_result = SlimR_anno_result,
  plot_UMAP = TRUE,
  annotation_col = "Cell_type_SlimR"
)
## End(Not run)
Uses "marker_list" from Cellmarker2 for cell annotation
Celltype_annotation_Cellmarker2(seurat_obj, gene_list, species, cluster_col = "seurat_clusters", assay = "RNA", save_path = NULL, min_counts = 1, colour_low = "white", colour_high = "navy", colour_low_mertic = "white", colour_high_mertic = "navy")
seurat_obj |
Enter the Seurat object with annotation columns such as "seurat_cluster" in meta.data to be annotated. |
gene_list |
Enter the standard "Marker_list" generated by the Cellmarker2 database for the SlimR package, generated by the "Markers_filter_Cellmarker2 ()" function. |
species |
This parameter selects the species "Human" or "Mouse" for standard gene format correction of markers entered by "Marker_list". |
cluster_col |
Enter annotation columns such as "seurat_cluster" in meta.data of the Seurat object to be annotated. Default parameters use "cluster_col = 'seurat_clusters'". |
assay |
Enter the assay used by the Seurat object, such as "RNA". Default parameters use "assay = 'RNA'". |
save_path |
The output path of the cell annotation picture. Example parameters use "save_path = './SlimR/Celltype_annotation_Cellmarker2/'". |
min_counts |
The minimum count a gene must have in the entered "Marker_list" to be used. The count is the number of times the same gene is recorded for the same species and the same location in the Cellmarker2 database as a marker for this cell type. Default parameters use "min_counts = 1". |
colour_low |
Color for lowest expression level. (default = "white") |
colour_high |
Color for highest expression level. (default = "navy") |
colour_low_mertic |
Color for lowest metric level. (default = "white") |
colour_high_mertic |
Color for highest metric level. (default = "navy") |
The cell annotation picture is saved in "save_path".
Other Section_5_Other_Functions_Provided:
Celltype_Compare(),
Celltype_annotation_Excel(),
Celltype_annotation_PanglaoDB(),
Celltype_annotation_Seurat()
## Not run:
Celltype_annotation_Cellmarker2(seurat_obj = sce,
  gene_list = Markers_list_Cellmarker2,
  species = "Human",
  cluster_col = "seurat_clusters",
  assay = "RNA",
  save_path = file.path(tempdir(), "SlimR_Celltype_annotation_Cellmarker2"),
  colour_low = "white",
  colour_high = "navy",
  colour_low_mertic = "white",
  colour_high_mertic = "navy"
)
## End(Not run)
Uses "marker_list" to generate combined plot for cell annotation
Celltype_Annotation_Combined(seurat_obj, gene_list, species, cluster_col = "seurat_clusters", assay = "RNA", save_path = NULL, colour_low = "white", colour_high = "navy")
seurat_obj |
Enter the Seurat object with annotation columns such as "seurat_cluster" in meta.data to be annotated. |
gene_list |
A list of cells and corresponding gene controls, the name of the list is cell type, and the first column of the list corresponds to markers. Lists can be generated using functions such as "Markers_filter_Cellmarker2 ()", "Markers_filter_PanglaoDB ()", "read_excel_markers ()", "read_seurat_markers ()", etc. |
species |
This parameter selects the species "Human" or "Mouse" for standard gene format correction of markers entered by "Marker_list". |
cluster_col |
Enter annotation columns such as "seurat_cluster" in meta.data of the Seurat object to be annotated. Default parameters use "cluster_col = 'seurat_clusters'". |
assay |
Enter the assay used by the Seurat object, such as "RNA". Default parameters use "assay = 'RNA'". |
save_path |
The output path of the cell annotation picture. Example parameters use "save_path = './SlimR/Celltype_annotation_Bar/'". |
colour_low |
Color for lowest expression level. (default = "white") |
colour_high |
Color for highest expression level. (default = "navy") |
The cell annotation picture is saved in "save_path".
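The `gene_list` shape described in the arguments (names are cell types, first column holds the markers) and the idea of species-specific gene format correction can be sketched in base R. This is a hedged illustration only: `format_genes()` is a hypothetical helper, and the casing rule it applies (all uppercase for "Human", first letter uppercase for "Mouse") is a common naming convention assumed here, not a rule confirmed by this manual.

```r
# Hypothetical helper: normalize symbol case by species (assumed convention).
format_genes <- function(genes, species = c("Human", "Mouse")) {
  species <- match.arg(species)
  if (species == "Human") {
    toupper(genes)
  } else {
    paste0(toupper(substr(genes, 1, 1)), tolower(substring(genes, 2)))
  }
}

# Hand-built gene_list: names are cell types, first column holds the markers.
gene_list <- list(
  "T cell" = data.frame(marker = c("cd3d", "CD3E"), stringsAsFactors = FALSE),
  "B cell" = data.frame(marker = c("Cd79a", "MS4A1"), stringsAsFactors = FALSE)
)

# Pull the first column of each entry and normalize it for each species.
human_markers <- lapply(gene_list, function(df) format_genes(df[[1]], "Human"))
mouse_markers <- lapply(gene_list, function(df) format_genes(df[[1]], "Mouse"))
str(human_markers)
str(mouse_markers)
```

In practice the list would come from functions such as Markers_filter_Cellmarker2() or read_excel_markers() rather than being built by hand.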
Other Section_4_Semi_Automated_Annotation:
Celltype_Annotation_Features(),
Celltype_Annotation_Heatmap()
## Not run:
Celltype_Annotation_Combined(seurat_obj = sce,
  gene_list = Markers_list,
  species = "Human",
  cluster_col = "seurat_clusters",
  assay = "RNA",
  save_path = file.path(tempdir(), "SlimR_Celltype_Annotation_Combined"),
  colour_low = "white",
  colour_high = "navy"
)
## End(Not run)
Uses "marker_list" from Excel input for cell annotation
Celltype_annotation_Excel(seurat_obj, gene_list, species, cluster_col = "seurat_clusters", assay = "RNA", save_path = NULL, metric_names = NULL, colour_low = "white", colour_high = "navy", colour_low_mertic = "white", colour_high_mertic = "navy")
seurat_obj |
Enter the Seurat object with annotation columns such as "seurat_cluster" in meta.data to be annotated. |
gene_list |
Enter the standard "Marker_list" generated by the Excel files database for the SlimR package, generated by the "read_excel_markers()" function. |
species |
This parameter selects the species "Human" or "Mouse" for standard gene format correction of markers entered by "Marker_list". |
cluster_col |
Enter annotation columns such as "seurat_cluster" in meta.data of the Seurat object to be annotated. Default parameters use "cluster_col = 'seurat_clusters'". |
assay |
Enter the assay used by the Seurat object, such as "RNA". Default parameters use "assay = 'RNA'". |
save_path |
The output path of the cell annotation picture. Example parameters use "save_path = './SlimR/Celltype_annotation_Excel/'". |
metric_names |
Change the row names for the input metrics; not recommended unless necessary. (NULL is used as default parameter) |
colour_low |
Color for lowest expression level. (default = "white") |
colour_high |
Color for highest expression level. (default = "navy") |
colour_low_mertic |
Color for lowest metric level. (default = "white") |
colour_high_mertic |
Color for highest metric level. (default = "navy") |
The cell annotation picture is saved in "save_path".
Other Section_5_Other_Functions_Provided:
Celltype_Compare(),
Celltype_annotation_Cellmarker2(),
Celltype_annotation_PanglaoDB(),
Celltype_annotation_Seurat()
## Not run:
Celltype_annotation_Excel(seurat_obj = sce,
  gene_list = Markers_list_Excel,
  species = "Human",
  cluster_col = "seurat_clusters",
  assay = "RNA",
  save_path = file.path(tempdir(), "SlimR_Celltype_annotation_Excel"),
  colour_low = "white",
  colour_high = "navy",
  colour_low_mertic = "white",
  colour_high_mertic = "navy"
)
## End(Not run)
This function dynamically selects the appropriate annotation method
based on the gene_list_type parameter. It supports marker databases from
Cellmarker2, PanglaoDB, Seurat (via FindAllMarkers), or Excel files.
Celltype_Annotation_Features(seurat_obj, gene_list, gene_list_type = "Default", species = NULL, cluster_col = "seurat_clusters", assay = "RNA", save_path = NULL, min_counts = 1, metric_names = NULL, colour_low = "white", colour_high = "navy", colour_low_mertic = "white", colour_high_mertic = "navy", ...)
seurat_obj |
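A valid Seurat object with cluster annotations in meta.data. |
gene_list |
A list of data frames containing marker genes and metrics. Format depends on gene_list_type. |
gene_list_type |
Type of marker database to use (default: "Default"). The supported sources named in the description are Cellmarker2, PanglaoDB, Seurat, and Excel. |
species |
Species of the dataset, such as "Human" (default: NULL). |
cluster_col |
Column name in meta.data that holds the cluster labels (default: "seurat_clusters"). |
assay |
Assay layer in the Seurat object (default: "RNA"). |
save_path |
Directory to save output PNGs. Must be explicitly specified. |
min_counts |
Minimum number of counts for Cellmarker2 annotations (default: 1). |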
metric_names |
Optional. Change the row names for the input metrics; not recommended unless necessary. (NULL is used as default parameter; used in "Seurat"/"Excel"). |
colour_low |
Color for lowest expression level. (default = "white") |
colour_high |
Color for highest expression level. (default = "navy") |
colour_low_mertic |
Color for lowest metric level. (default = "white") |
colour_high_mertic |
Color for highest metric level. (default = "navy") |
... |
Additional parameters passed to the specific annotation function. |
Saves cell type annotation PNGs in save_path. Returns invisibly.
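Selecting an annotation method from `gene_list_type` is a dispatch pattern, which can be sketched in base R with `switch()`. This is an illustration under assumptions, not the package's internals: the `annotate_*` stubs and `dispatch_annotation()` are hypothetical names, the stubs only return a label instead of drawing plots, and the handling of the "Default" value is not modelled because this manual does not describe it.

```r
# Stub annotators standing in for the source-specific functions.
annotate_cellmarker2 <- function(gene_list, ...) "Cellmarker2 path"
annotate_panglaodb   <- function(gene_list, ...) "PanglaoDB path"
annotate_seurat      <- function(gene_list, ...) "Seurat path"
annotate_excel       <- function(gene_list, ...) "Excel path"

# Pick a handler by type and forward the remaining arguments to it.
dispatch_annotation <- function(gene_list, gene_list_type, ...) {
  handler <- switch(
    gene_list_type,
    Cellmarker2 = annotate_cellmarker2,
    PanglaoDB   = annotate_panglaodb,
    Seurat      = annotate_seurat,
    Excel       = annotate_excel,
    stop("Unsupported gene_list_type: ", gene_list_type)
  )
  handler(gene_list, ...)
}

print(dispatch_annotation(list(), "PanglaoDB"))
```

Forwarding `...` to the chosen handler mirrors the documented `...` argument, which passes additional parameters to the specific annotation function.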
Other Section_4_Semi_Automated_Annotation:
Celltype_Annotation_Combined(),
Celltype_Annotation_Heatmap()
## Not run:
# Example for Cellmarker2
Celltype_Annotation_Features(seurat_obj = sce,
  gene_list = Markers_list_Cellmarker2,
  species = "Human",
  cluster_col = "seurat_clusters",
  assay = "RNA",
  save_path = file.path(tempdir(), "SlimR_Celltype_annotation_Cellmarker2"),
  colour_low = "white",
  colour_high = "navy",
  colour_low_mertic = "white",
  colour_high_mertic = "navy"
)
# Example for PanglaoDB
Celltype_Annotation_Features(seurat_obj = sce,
  gene_list = Markers_list_panglaoDB,
  species = "Human",
  cluster_col = "seurat_clusters",
  assay = "RNA",
  save_path = file.path(tempdir(), "SlimR_Celltype_annotation_PanglaoDB"),
  colour_low = "white",
  colour_high = "navy",
  colour_low_mertic = "white",
  colour_high_mertic = "navy"
)
# Example for Seurat marker list
Celltype_Annotation_Features(seurat_obj = sce,
  gene_list = Markers_list_Seurat,
  species = "Human",
  cluster_col = "seurat_clusters",
  assay = "RNA",
  save_path = file.path(tempdir(), "SlimR_Celltype_annotation_Seurat"),
  colour_low = "white",
  colour_high = "navy",
  colour_low_mertic = "white",
  colour_high_mertic = "navy"
)
# Example for Excel marker list
Celltype_Annotation_Features(seurat_obj = sce,
  gene_list = Markers_list_Excel,
  species = "Human",
  cluster_col = "seurat_clusters",
  assay = "RNA",
  save_path = file.path(tempdir(), "SlimR_Celltype_annotation_Excel"),
  colour_low = "white",
  colour_high = "navy",
  colour_low_mertic = "white",
  colour_high_mertic = "navy"
)
## End(Not run)
Uses "marker_list" to generate heatmap for cell annotation
Celltype_Annotation_Heatmap(seurat_obj, gene_list, species, cluster_col = "seurat_clusters", assay = "RNA", min_expression = 0.1, specificity_weight = 3, colour_low = "navy", colour_high = "firebrick3")
seurat_obj |
Enter the Seurat object with annotation columns such as "seurat_cluster" in meta.data to be annotated. |
gene_list |
A list of cells and corresponding gene controls, the name of the list is cell type, and the first column of the list corresponds to markers. Lists can be generated using functions such as "Markers_filter_Cellmarker2 ()", "Markers_filter_PanglaoDB ()", "read_excel_markers ()", "read_seurat_markers ()", etc. |
species |
This parameter selects the species "Human" or "Mouse" for standard gene format correction of markers entered by "Marker_list". |
cluster_col |
Enter annotation columns such as "seurat_cluster" in meta.data of the Seurat object to be annotated. Default parameters use "cluster_col = 'seurat_clusters'". |
assay |
Enter the assay used by the Seurat object, such as "RNA". Default parameters use "assay = 'RNA'". |
min_expression |
The min_expression parameter defines a threshold value to determine whether a cell's expression of a feature is considered "expressed" or not. It is used to filter out low-expression cells that may contribute noise to the analysis. Default parameters use "min_expression = 0.1". |
specificity_weight |
The specificity_weight parameter controls how much the expression variability (standard deviation) of a feature within a cluster contributes to its "specificity score." It amplifies or suppresses the impact of variability in the final score calculation. Default parameters use "specificity_weight = 3". |
colour_low |
Color for lowest probability level in Heatmap visualization of probability matrix. (default = "navy") |
colour_high |
Color for highest probability level in Heatmap visualization of probability matrix. (default = "firebrick3") |
A heatmap comparing the clusters in "cluster_col" of the Seurat object against the given gene set "gene_list", for use in annotation.
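To show how `colour_low` and `colour_high` could define the heatmap gradient, here is a small base-R sketch using `grDevices::colorRampPalette()`. It is an assumption about how such a two-colour scale might be built, not the package's plotting code; the probability matrix `prob` holds made-up values, and the 100-step resolution is arbitrary.

```r
colour_low  <- "navy"
colour_high <- "firebrick3"

# Build a 100-step gradient between the two end colours.
palette_fn <- grDevices::colorRampPalette(c(colour_low, colour_high))
palette <- palette_fn(100)

# Toy probability matrix: clusters in rows, cell types in columns.
prob <- matrix(
  c(0.9, 0.1, 0.2, 0.8),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("0", "1"), c("T cell", "B cell"))
)

# Map each probability in [0, 1] to a palette index from 1 to 100.
idx <- pmin(100, pmax(1, ceiling(as.vector(prob) * 100)))
colours <- matrix(palette[idx], nrow = nrow(prob), dimnames = dimnames(prob))
print(colours)
```

High probabilities land near the `colour_high` end of the gradient and low ones near `colour_low`, which is the reading the two colour arguments imply.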
Other Section_4_Semi_Automated_Annotation:
Celltype_Annotation_Combined(),
Celltype_Annotation_Features()
## Not run:
Celltype_Annotation_Heatmap(seurat_obj = sce,
  gene_list = Markers_list,
  species = "Human",
  cluster_col = "seurat_clusters",
  assay = "RNA",
  min_expression = 0.1,
  specificity_weight = 3,
  colour_low = "navy",
  colour_high = "firebrick3"
)
## End(Not run)
Uses "marker_list" from PanglaoDB for cell annotation
Celltype_annotation_PanglaoDB(seurat_obj, gene_list, species, cluster_col = "seurat_clusters", assay = "RNA", save_path = NULL, metric_names = NULL, colour_low = "white", colour_high = "navy", colour_low_mertic = "white", colour_high_mertic = "navy")
seurat_obj |
Enter the Seurat object with annotation columns such as "seurat_cluster" in meta.data to be annotated. |
gene_list |
Enter the standard "Marker_list" generated by the PanglaoDB database for the SlimR package, generated by the "Markers_filter_PanglaoDB ()" function. |
species |
This parameter selects the species "Human" or "Mouse" for standard gene format correction of markers entered by "Marker_list". |
cluster_col |
Enter annotation columns such as "seurat_cluster" in meta.data of the Seurat object to be annotated. Default parameters use "cluster_col = 'seurat_clusters'". |
assay |
Enter the assay used by the Seurat object, such as "RNA". Default parameters use "assay = 'RNA'". |
save_path |
The output path of the cell annotation picture. Example parameters use "save_path = './SlimR/Celltype_annotation_PanglaoDB/'". |
metric_names |
Warning: Do not enter information. This parameter is used to check if "Marker_list" conforms to the PanglaoDB database output. |
colour_low |
Color for lowest expression level. (default = "white") |
colour_high |
Color for highest expression level. (default = "navy") |
colour_low_mertic |
Color for lowest metric level. (default = "white") |
colour_high_mertic |
Color for highest metric level. (default = "navy") |
The cell annotation picture is saved in "save_path".
Other Section_5_Other_Functions_Provided:
Celltype_Compare(),
Celltype_annotation_Cellmarker2(),
Celltype_annotation_Excel(),
Celltype_annotation_Seurat()
## Not run:
Celltype_annotation_PanglaoDB(seurat_obj = sce,
  gene_list = Markers_list_panglaoDB,
  species = "Human",
  cluster_col = "seurat_clusters",
  assay = "RNA",
  save_path = file.path(tempdir(), "SlimR_Celltype_annotation_PanglaoDB"),
  colour_low = "white",
  colour_high = "navy",
  colour_low_mertic = "white",
  colour_high_mertic = "navy"
)
## End(Not run)
This function assigns SlimR per-cell predicted cell types directly to individual cells in a Seurat object's meta.data slot.
Celltype_Annotation_PerCell(seurat_obj, SlimR_percell_result, plot_UMAP = TRUE, annotation_col = "Cell_type_PerCell_SlimR", plot_confidence = FALSE)
seurat_obj |
A Seurat object. |
SlimR_percell_result |
List generated by Celltype_Calculate_PerCell() containing Cell_annotations data.frame with Cell_barcode and Predicted_cell_type columns. |
plot_UMAP |
Logical; if TRUE, plot the UMAP with cell type annotations. Default: TRUE. |
annotation_col |
Column name to write in meta.data. Default: "Cell_type_PerCell_SlimR". |
plot_confidence |
Logical; if TRUE, also plot a UMAP colored by confidence scores. Default: FALSE. |
A Seurat object with updated meta.data containing:
annotation_col: Predicted cell type for each cell
paste0(annotation_col, "_score"): Max score for each cell
paste0(annotation_col, "_confidence"): Confidence score for each cell
If plot_UMAP = TRUE, this function will print UMAP plot(s) as a side effect.
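The three output columns listed above are derived from `annotation_col` with `paste0()`. The base-R sketch below shows that naming scheme and a barcode-based alignment on plain data frames. It is illustrative only: `meta` stands in for the Seurat meta.data, and the `Max_score` and `Confidence` columns, the barcodes, and the "Unassigned" label are hypothetical (the manual names only `Cell_barcode` and `Predicted_cell_type`).

```r
# Stand-in for seurat_obj@meta.data, keyed by cell barcode (toy barcodes).
meta <- data.frame(row.names = c("AAAC-1", "AAAG-1", "AACT-1"))

# Stand-in for SlimR_percell_result$Cell_annotations, in a different row order.
cell_annotations <- data.frame(
  Cell_barcode        = c("AACT-1", "AAAC-1", "AAAG-1"),
  Predicted_cell_type = c("B cell", "T cell", "Unassigned"),
  Max_score           = c(0.81, 0.92, 0.30),  # hypothetical column
  Confidence          = c(1.8, 2.4, 1.0),     # hypothetical column
  stringsAsFactors = FALSE
)

annotation_col <- "Cell_type_PerCell_SlimR"

# Align annotation rows to the meta.data row order by barcode.
idx <- match(rownames(meta), cell_annotations$Cell_barcode)

meta[[annotation_col]] <- cell_annotations$Predicted_cell_type[idx]
meta[[paste0(annotation_col, "_score")]] <- cell_annotations$Max_score[idx]
meta[[paste0(annotation_col, "_confidence")]] <- cell_annotations$Confidence[idx]
print(meta)
```

Matching by barcode rather than by position keeps labels attached to the right cells even when the two tables are ordered differently.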
Other Section_3_Automated_Annotation:
Celltype_Annotation(),
Celltype_Calculate(),
Celltype_Calculate_PerCell(),
Celltype_Verification(),
Celltype_Verification_PerCell(),
Parameter_Calculate(),
percell_workflow
## Not run:
# Run per-cell annotation
result <- Celltype_Calculate_PerCell(
  seurat_obj = sce,
  gene_list = Markers_list,
  species = "Human"
)
# Annotate Seurat object
sce <- Celltype_Annotation_PerCell(
  seurat_obj = sce,
  SlimR_percell_result = result,
  plot_UMAP = TRUE,
  annotation_col = "Cell_type_PerCell_SlimR"
)
## End(Not run)
Uses "marker_list" from Seurat object for cell annotation
Celltype_annotation_Seurat(seurat_obj, gene_list, species, cluster_col = "seurat_clusters", assay = "RNA", save_path = NULL, metric_names = NULL, colour_low = "white", colour_high = "navy", colour_low_mertic = "white", colour_high_mertic = "navy")
seurat_obj |
Enter the Seurat object with annotation columns such as "seurat_cluster" in meta.data to be annotated. |
gene_list |
Enter the standard "Marker_list" generated by the Seurat object database for the SlimR package, generated by the "read_seurat_markers()" function. |
species |
This parameter selects the species "Human" or "Mouse" for standard gene format correction of markers entered by "Marker_list". |
cluster_col |
Enter annotation columns such as "seurat_cluster" in meta.data of the Seurat object to be annotated. Default parameters use "cluster_col = 'seurat_clusters'". |
assay |
Enter the assay used by the Seurat object, such as "RNA". Default parameters use "assay = 'RNA'". |
save_path |
The output path of the cell annotation picture. Example parameters use "save_path = './SlimR/Celltype_annotation_Seurat/'". |
metric_names |
Change the row names for the input metrics; not recommended unless necessary. (NULL is used as default parameter) |
colour_low |
Color for lowest expression level. (default = "white") |
colour_high |
Color for highest expression level. (default = "navy") |
colour_low_mertic |
Color for lowest metric level. (default = "white") |
colour_high_mertic |
Color for highest metric level. (default = "navy") |
The cell annotation picture is saved in "save_path".
Other Section_5_Other_Functions_Provided:
Celltype_Compare(),
Celltype_annotation_Cellmarker2(),
Celltype_annotation_Excel(),
Celltype_annotation_PanglaoDB()
## Not run:
Celltype_annotation_Seurat(seurat_obj = sce,
  gene_list = Markers_list_Seurat,
  species = "Human",
  cluster_col = "seurat_clusters",
  assay = "RNA",
  save_path = file.path(tempdir(), "SlimR_Celltype_annotation_Seurat"),
  colour_low = "white",
  colour_high = "navy",
  colour_low_mertic = "white",
  colour_high_mertic = "navy"
)
## End(Not run)
Uses "Marker_list" to calculate probabilities, prediction results, and AUC values, and generates a heatmap for cell annotation.
Celltype_Calculate(
  seurat_obj,
  gene_list,
  species,
  cluster_col = "seurat_clusters",
  assay = "RNA",
  min_expression = 0.1,
  specificity_weight = 3,
  threshold = 0.6,
  compute_AUC = TRUE,
  plot_AUC = TRUE,
  AUC_correction = FALSE,
  colour_low = "navy",
  colour_high = "firebrick3"
)
seurat_obj |
The Seurat object to be annotated, with a cluster column such as "seurat_clusters" in meta.data. |
gene_list |
A list of cell types and their corresponding marker genes: each list name is a cell type, and the first column of each element contains its markers. Lists can be generated using functions such as "Markers_filter_Cellmarker2()", "Markers_filter_PanglaoDB()", "Read_excel_markers()", "Read_seurat_markers()", etc. |
species |
Species of the data, "Human" or "Mouse", used to standardize the gene-symbol format of the markers in "Marker_list". |
cluster_col |
The cluster column in meta.data of the Seurat object to be annotated, such as "seurat_clusters". Default parameters use "cluster_col = 'seurat_clusters'". |
assay |
Enter the assay used by the Seurat object, such as "RNA". Default parameters use "assay = 'RNA'". |
min_expression |
The min_expression parameter defines a threshold value to determine whether a cell's expression of a feature is considered "expressed" or not. It is used to filter out low-expression cells that may contribute noise to the analysis. Default parameters use "min_expression = 0.1". |
specificity_weight |
The specificity_weight parameter controls how much the expression variability (standard deviation) of a feature within a cluster contributes to its "specificity score." It amplifies or suppresses the impact of variability in the final score calculation. Default parameters use "specificity_weight = 3". |
threshold |
Minimum normalized similarity between an "alternative cell type" and the "predicted cell type" for the alternative to be listed in the returned results. (default = 0.6) |
compute_AUC |
Logical indicating whether to calculate AUC values for predicted cell types. AUC measures how well the marker genes distinguish the cluster from others. When TRUE, adds an AUC column to the prediction results. (default: TRUE) |
plot_AUC |
Logical indicating whether to draw an AUC curve for the predicted cell type. When TRUE, adds an AUC_plot to the result. (default: TRUE) |
AUC_correction |
Logical value controlling AUC-based correction. (default = FALSE) When set to TRUE: 1. Computes AUC values for candidate cell types (probability > threshold). 2. Selects the cell type with the highest AUC as the final predicted type. 3. Records the selected type's AUC value in the "AUC" column. |
colour_low |
Color for lowest probability level in Heatmap visualization of probability matrix. (default = "navy") |
colour_high |
Color for highest probability level in Heatmap visualization of probability matrix. (default = "firebrick3") |
A list containing:
Expression_list: List of expression matrices for each cell type
Proportion_list: List of proportion of expression for each cell type
Expression_scores_matrix: Matrix of expression scores
Probability_matrix: Matrix of normalized probabilities
Prediction_results: Data frame with cluster annotations including:
cluster_col: Cluster identifier
Predicted_cell_type: Primary predicted cell type
AUC: Area Under the Curve value (when compute_AUC = TRUE)
Alternative_cell_types: Semicolon-separated alternative cell types
Heatmap_plot: Heatmap visualization of probability matrix (pheatmap object).
Can be displayed using print() or plot()
AUC_plot: AUC visualization of Predicted cell type (ggplot object)
AUC_list: The resulting list of AUC values calculated for genes in alternative cell types above the approximate threshold
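To make the AUC column more concrete, the following is a minimal base-R sketch of a rank-based AUC for one cluster. It assumes a single marker score per cell and is not taken from SlimR's source; `rank_auc`, `score`, and `in_cluster` are hypothetical names, and the package's own calculation may differ.

```r
# Rank-based AUC: the probability that a random in-cluster cell scores
# higher than a random out-of-cluster cell (ties count as one half).
rank_auc <- function(score, in_cluster) {
  n_pos <- sum(in_cluster)
  n_neg <- sum(!in_cluster)
  r <- rank(score)  # average ranks handle ties
  (sum(r[in_cluster]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy data: three cells in the cluster of interest, five outside it.
score <- c(2.1, 1.8, 2.5, 0.2, 0.4, 0.1, 1.9, 0.3)
in_cluster <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)

auc <- rank_auc(score, in_cluster)
print(auc)
```

An AUC near 1 would suggest the markers separate the cluster well from the remaining cells, while a value near 0.5 would suggest little discriminative power.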
Other Section_3_Automated_Annotation:
Celltype_Annotation(),
Celltype_Annotation_PerCell(),
Celltype_Calculate_PerCell(),
Celltype_Verification(),
Celltype_Verification_PerCell(),
Parameter_Calculate(),
percell_workflow
## Not run:
SlimR_anno_result <- Celltype_Calculate(seurat_obj = sce,
  gene_list = Markers_list,
  species = "Human",
  cluster_col = "seurat_clusters",
  assay = "RNA",
  min_expression = 0.1,
  specificity_weight = 3,
  threshold = 0.6,
  compute_AUC = TRUE,
  plot_AUC = TRUE,
  AUC_correction = FALSE,
  colour_low = "navy",
  colour_high = "firebrick3"
)
## End(Not run)
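As an illustration of how the threshold parameter could separate a primary prediction from alternative cell types, here is a small base-R sketch. It assumes a probability matrix in which each cluster's best cell type is scaled to 1; the matrix values and the `toy_results` name are made up for illustration, and this is not SlimR's internal code.

```r
# Toy probability matrix: rows = clusters, columns = candidate cell types.
# Values are illustrative only, not output from SlimR.
prob <- matrix(
  c(1.00, 0.72, 0.10,
    0.30, 1.00, 0.55,
    0.65, 0.20, 1.00),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("0", "1", "2"), c("T cell", "NK cell", "B cell"))
)
threshold <- 0.6

# Primary call: the cell type with the highest probability in each cluster.
predicted <- colnames(prob)[apply(prob, 1, which.max)]

# Alternatives: other cell types whose probability exceeds the threshold.
alternatives <- vapply(seq_len(nrow(prob)), function(i) {
  keep <- prob[i, ] > threshold & colnames(prob) != predicted[i]
  paste(colnames(prob)[keep], collapse = "; ")
}, character(1))

toy_results <- data.frame(
  cluster = rownames(prob),
  Predicted_cell_type = predicted,
  Alternative_cell_types = alternatives,
  stringsAsFactors = FALSE
)
print(toy_results)
```

Under this reading, raising the threshold shortens the list of alternatives, which also narrows the candidate set considered when AUC_correction = TRUE.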
Unlike cluster-based annotation, this function assigns cell type labels to each individual cell based on marker gene expression profiles. Optionally uses UMAP coordinates to smooth predictions via k-nearest neighbor voting.
Celltype_Calculate_PerCell(
  seurat_obj,
  gene_list,
  species,
  assay = "RNA",
  method = c("weighted", "mean", "AUCell"),
  min_expression = 0.1,
  use_umap_smoothing = FALSE,
  umap_reduction = "umap",
  k_neighbors = 15,
  smoothing_weight = 0.3,
  min_score = "auto",
  min_confidence = 1.2,
  return_scores = FALSE,
  ncores = 1,
  chunk_size = 5000,
  verbose = TRUE
)
seurat_obj |
Seurat object with normalized expression data. |
gene_list |
A standardized marker list (same format as Celltype_Calculate). |
species |
"Human" or "Mouse" for gene name formatting. |
assay |
Assay to use (default: "RNA"). |
method |
Scoring method: "AUCell" (rank-based), "mean" (average expression), or "weighted" (expression * detection weighted). Default: "weighted". |
min_expression |
Minimum expression threshold for detection. Default: 0.1. |
use_umap_smoothing |
Logical. If TRUE, apply k-NN smoothing using UMAP coordinates to improve annotation consistency. Default: FALSE. |
umap_reduction |
Name of UMAP reduction in Seurat object. Default: "umap". |
k_neighbors |
Number of neighbors for UMAP smoothing. Default: 15. |
smoothing_weight |
Weight for neighbor votes vs cell's own score (0-1). Higher values give more weight to neighbors. Default: 0.3. |
min_score |
Minimum score threshold to assign a cell type. Cells below this threshold are labeled "Unassigned". Default: "auto" which adaptively sets the threshold based on number of cell types (1.5 / n_celltypes). Set to a numeric value (e.g., 0.1) to use a fixed threshold. |
min_confidence |
Minimum confidence threshold. Cells with confidence below this value are labeled "Unassigned". Confidence is calculated as the ratio of max score to second-highest score. Default: 1.2 (max must be 20% higher than second). Set to 1.0 to disable confidence filtering. |
return_scores |
If TRUE, return full score matrix. Default: FALSE. |
ncores |
Number of cores for parallel processing. Default: 1. |
chunk_size |
Number of cells to process per chunk (memory optimization). Default: 5000. |
verbose |
Print progress messages. Default: TRUE. |
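The rank-based idea behind method = "AUCell" can be sketched in base R as the fraction of a marker set that falls among each cell's most highly expressed genes. This is a simplified stand-in written for illustration, not the AUCell package or SlimR's implementation; `rank_score` and `top_frac` are hypothetical names, and the 5% cut-off is an assumption.

```r
# Fraction of a marker set found among the top `top_frac` most highly
# expressed genes of each cell (genes in rows, cells in columns).
rank_score <- function(expr, markers, top_frac = 0.05) {
  n_top <- max(1, ceiling(nrow(expr) * top_frac))
  markers <- intersect(markers, rownames(expr))
  apply(expr, 2, function(cell) {
    top_genes <- names(sort(cell, decreasing = TRUE))[seq_len(n_top)]
    mean(markers %in% top_genes)
  })
}

# Simulated counts: 100 genes x 4 cells.
set.seed(1)
expr <- matrix(rpois(100 * 4, lambda = 1), nrow = 100,
               dimnames = list(paste0("G", 1:100), paste0("cell", 1:4)))
expr[c("G1", "G2", "G3"), "cell1"] <- 50  # cell1 strongly expresses the markers

scores <- rank_score(expr, markers = c("G1", "G2", "G3"))
print(scores)
```

Because only ranks within a cell matter, a score of this kind does not change when a cell's expression values are rescaled by a constant factor, which is one reason rank-based methods tend to be less sensitive to technical variation.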
"weighted" (recommended): Combines normalized expression with detection rate. For each cell and cell type: score = mean(expr_i * weight_i) where weight_i is derived from the marker's specificity across the dataset.
"mean": Simple average of normalized marker expression. Fast but less discriminative for overlapping marker sets.
"AUCell": Rank-based scoring similar to AUCell package. For each cell, genes are ranked by expression, and the score is the proportion of marker genes in the top X% of expressed genes. Robust to technical variation.
When use_umap_smoothing = TRUE, the function:
Computes initial per-cell scores
Finds k nearest neighbors in UMAP space for each cell
Smooths scores by weighted averaging with neighbors
Re-assigns cell types based on smoothed scores
This helps reduce noise and improve consistency of annotations within spatially coherent regions.
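The four smoothing steps above can be sketched in base R as follows. The blending formula (own score weighted by 1 - weight, neighbor mean weighted by weight) is an assumption consistent with the description of smoothing_weight, not a confirmed detail of SlimR; `smooth_scores` is a hypothetical helper, and `dist()` is used only because the toy data are tiny.

```r
# Blend each cell's own score with the mean score of its k nearest
# neighbours in a low-dimensional embedding (rows = cells).
smooth_scores <- function(scores, embedding, k = 15, weight = 0.3) {
  d <- as.matrix(dist(embedding))
  diag(d) <- Inf  # a cell is not its own neighbour
  smoothed <- scores
  for (i in seq_len(nrow(scores))) {
    nn <- order(d[i, ])[seq_len(k)]
    smoothed[i, ] <- (1 - weight) * scores[i, ] +
      weight * colMeans(scores[nn, , drop = FALSE])
  }
  smoothed
}

# Five cells on a line; cell 3 is a noisy outlier among type-A neighbours.
embedding <- cbind(UMAP_1 = c(0, 1, 2, 3, 4), UMAP_2 = 0)
scores <- cbind(A = c(0.9, 0.8, 0.45, 0.85, 0.9),
                B = c(0.1, 0.2, 0.55, 0.15, 0.1))

smoothed <- smooth_scores(scores, embedding, k = 2, weight = 0.3)
labels_raw <- colnames(scores)[max.col(scores, ties.method = "first")]
labels_smooth <- colnames(smoothed)[max.col(smoothed, ties.method = "first")]
print(data.frame(labels_raw, labels_smooth))
```

In this toy case the third cell is initially called "B" but switches to "A" once its neighbors are taken into account, which is the kind of isolated mislabel that smoothing is meant to reduce.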
A list containing:
Cell_annotations: Data frame with Cell_barcode, Predicted_cell_type, Max_score, Confidence
Cell_confidence: Numeric vector of confidence scores per cell
Summary: Summary table of cell type counts and percentages
Expression_list: List of mean expression matrices per cell type (for verification)
Proportion_list: List of detection proportion matrices per cell type
Prediction_results: Summary data frame with per-cell-type statistics
Probability_matrix: Full cell × cell_type probability matrix (normalized)
Raw_score_matrix: Full cell × cell_type raw score matrix (before normalization)
Parameters: List of parameters used including adaptive thresholds
Cell_scores: (if return_scores=TRUE) Same as Probability_matrix
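The interplay of min_score and min_confidence can be illustrated with a short base-R sketch. It applies the documented rules (an "auto" threshold of 1.5 / n_celltypes and a confidence ratio of the highest to the second-highest score) to a toy matrix; `assign_labels` is a hypothetical helper, and applying the thresholds to normalized probabilities is an assumption.

```r
# Label each cell (row) with its best-scoring cell type, or "Unassigned"
# when the best score or the confidence ratio is too low.
assign_labels <- function(prob, min_score = "auto", min_confidence = 1.2) {
  if (identical(min_score, "auto")) {
    min_score <- 1.5 / ncol(prob)  # adaptive rule quoted in the documentation
  }
  sorted <- t(apply(prob, 1, sort, decreasing = TRUE))
  max_score <- sorted[, 1]
  confidence <- max_score / sorted[, 2]  # ratio of best to second-best score
  label <- colnames(prob)[max.col(prob, ties.method = "first")]
  label[max_score < min_score | confidence < min_confidence] <- "Unassigned"
  data.frame(Predicted_cell_type = label, Max_score = max_score,
             Confidence = confidence, stringsAsFactors = FALSE)
}

# Toy scores for three cells and three candidate cell types.
prob <- rbind(c(0.70, 0.20, 0.10),   # clear winner
              c(0.50, 0.45, 0.05),   # best score is not 20% above the runner-up
              c(0.40, 0.35, 0.25))   # best score is below 1.5 / 3
colnames(prob) <- c("T cell", "B cell", "Myeloid")

res <- assign_labels(prob)
print(res)
```

With three candidate cell types the adaptive threshold is 0.5, so only the first cell keeps its label; the second fails the confidence ratio and the third fails the score threshold.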
Other Section_3_Automated_Annotation:
Celltype_Annotation(),
Celltype_Annotation_PerCell(),
Celltype_Calculate(),
Celltype_Verification(),
Celltype_Verification_PerCell(),
Parameter_Calculate(),
percell_workflow
## Not run:
# Basic per-cell annotation
result <- Celltype_Calculate_PerCell(
  seurat_obj = sce,
  gene_list = Markers_list,
  species = "Human",
  method = "weighted"
)

# Add annotations to Seurat object
sce$Cell_type_PerCell <- result$Cell_annotations$Predicted_cell_type

# With UMAP smoothing for more consistent annotations
result_smooth <- Celltype_Calculate_PerCell(
  seurat_obj = sce,
  gene_list = Markers_list,
  species = "Human",
  use_umap_smoothing = TRUE,
  k_neighbors = 20,
  smoothing_weight = 0.3
)
## End(Not run)
This function automatically aligns cell barcodes between two Seurat objects using a variety of normalization transformations, then cross-tabulates a cell type label column (from the first object) against a grouping column (from the second object). It returns count tables, proportion tables, a dominant mapping, and a heatmap.
Celltype_Compare(
  sce_label,
  sce,
  label_col = NULL,
  group_col = NULL,
  barcode_col = NULL,
  color_low = "grey70",
  color_high = "navy",
  show_plot = TRUE
)
sce_label |
A Seurat object containing the cell type label column. |
sce |
A Seurat object containing the grouping column. |
label_col |
Character. Name of the metadata column in |
group_col |
Character. Name of the metadata column in |
barcode_col |
Optional character. Name of a metadata column in both objects
that contains the cell barcode identifiers. If |
color_low |
Character. Color for low proportion values in the heatmap. Default: "grey70". |
color_high |
Character. Color for high proportion values in the heatmap. Default: "navy". |
show_plot |
Logical. If |
Cell barcode alignment:
The function automatically tries a set of normalization functions on the cell
identifiers (either from barcode_col or from column names) to maximise the
number of shared barcodes between the two objects. Transformations include:
identity, drop_numeric_suffix (removes e.g., "-1-2"), drop_suffix (removes
"-1"), and several prefix removals. The transformation pair yielding the highest
number of shared identifiers is selected.
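The search over transformation pairs can be sketched in base R as below. The regular expressions are guesses at what the named transformations might do, and `best_alignment` and `drop_prefix` are hypothetical names; the real function may use a different or larger set of rules.

```r
# Candidate normalisations, loosely named after those listed above.
transforms <- list(
  identity = function(x) x,
  drop_numeric_suffix = function(x) sub("(-[0-9]+)+$", "", x),  # "-1-2" -> ""
  drop_suffix = function(x) sub("-[0-9]+$", "", x),             # last "-N" only
  drop_prefix = function(x) sub("^[^_]+_", "", x)               # assumed "sample_" prefix
)

# Try every pair of transformations and keep the pair sharing most identifiers.
best_alignment <- function(ids_a, ids_b, transforms) {
  pairs <- expand.grid(a = names(transforms), b = names(transforms),
                       stringsAsFactors = FALSE)
  pairs$n_shared <- mapply(function(a, b) {
    length(intersect(transforms[[a]](ids_a), transforms[[b]](ids_b)))
  }, pairs$a, pairs$b, USE.NAMES = FALSE)
  pairs[which.max(pairs$n_shared), ]
}

ids_a <- c("S1_AAAC-1", "S1_AAAG-1", "S1_AACT-1")
ids_b <- c("AAAC-1-2", "AAAG-1-2", "TTTT-1-2")

best <- best_alignment(ids_a, ids_b, transforms)
print(best)
```

Here the best pair removes the sample prefix from the first object and the trailing "-2" from the second, recovering two shared barcodes that a plain comparison would miss.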
Proportion calculation:
Proportions are computed within each group_col level (column-wise),
i.e. for each group, the sum of proportions across all cell types equals 1.
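Column-wise proportions of this kind can be reproduced on toy labels with `table()` and `prop.table()`. The vectors below are invented for illustration, and the `dominant` vector is only one plausible way to summarize the table, not necessarily how the function builds its mapping.

```r
# Toy labels for eight aligned cells.
labels <- c("T", "T", "B", "T", "B", "B", "NK", "T")   # cell type labels
groups <- c("0", "0", "0", "1", "1", "1", "1", "0")    # grouping column

# Cross-tabulate, then normalise within each group (margin = 2 -> columns).
counts <- table(cell_type = labels, group = groups)
props <- prop.table(counts, margin = 2)

# Most frequent cell type label within each group.
dominant <- apply(props, 2, function(p) names(p)[which.max(p)])

print(round(props, 2))
print(dominant)
```

Each column of `props` sums to 1, matching the statement that proportions are computed within each group.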
Plot:
The heatmap uses ggplot2::geom_tile() with a fixed coordinate ratio and a
colour gradient from color_low to color_high.
A list with five components:
count_table |
A data frame (wide format) with rows = unique
|
prop_table |
Same shape as |
main_to_sub |
A data frame mapping each |
plot |
A ggplot2 heatmap object visualizing the proportion table. |
match_info |
A tibble with columns |
Other Section_5_Other_Functions_Provided:
Celltype_annotation_Cellmarker2(),
Celltype_annotation_Excel(),
Celltype_annotation_PanglaoDB(),
Celltype_annotation_Seurat()
## Not run:
# Basic usage with two Seurat objects and default barcode alignment
result <- Celltype_Compare(
  sce_label = seurat_obj1,
  sce = seurat_obj2,
  label_col = "cell_type",
  group_col = "cluster"
)

# Access the proportion table
head(result$prop_table)

# View the dominant mapping
print(result$main_to_sub)

# Display the heatmap
print(result$plot)

# Use a custom barcode column
result2 <- Celltype_Compare(
  sce_label = seurat_obj1,
  sce = seurat_obj2,
  label_col = "cell_type",
  group_col = "cluster",
  barcode_col = "cell_barcode"
)
## End(Not run)
This function verifies predicted cell types by selecting genes with high log2FC and a high expression proportion, and generates a validation dotplot.
Celltype_Verification(
  seurat_obj,
  SlimR_anno_result,
  assay = "RNA",
  gene_number = 5,
  colour_low = "white",
  colour_high = "navy",
  annotation_col = "Cell_type_SlimR"
)
seurat_obj |
A Seurat object containing single-cell data. |
SlimR_anno_result |
A list containing SlimR annotation results with: Expression_list - List of expression matrices for each cell type. Prediction_results - Data frame with cluster annotations. |
assay |
Enter the assay used by the Seurat object, such as "RNA". Default parameters use "assay = 'RNA'". |
gene_number |
Integer specifying number of top genes to select per cell type. |
colour_low |
Color for lowest expression level. (default = "white") |
colour_high |
Color for highest expression level. (default = "navy") |
annotation_col |
Character string specifying the column in meta.data to use for grouping. |
A ggplot object showing expression of top variable genes.
Other Section_3_Automated_Annotation:
Celltype_Annotation(),
Celltype_Annotation_PerCell(),
Celltype_Calculate(),
Celltype_Calculate_PerCell(),
Celltype_Verification_PerCell(),
Parameter_Calculate(),
percell_workflow
## Not run:
Celltype_Verification(seurat_obj = sce,
  SlimR_anno_result = SlimR_anno_result,
  assay = "RNA",
  gene_number = 5,
  colour_low = "white",
  colour_high = "navy",
  annotation_col = "Cell_type_SlimR"
)
## End(Not run)
This function verifies per-cell SlimR annotations by generating a dotplot showing marker gene expression across predicted cell types.
Celltype_Verification_PerCell(
  seurat_obj,
  SlimR_percell_result,
  assay = "RNA",
  gene_number = 5,
  colour_low = "white",
  colour_high = "navy",
  annotation_col = "Cell_type_PerCell_SlimR",
  min_cells = 10
)
seurat_obj |
A Seurat object with per-cell annotations. |
SlimR_percell_result |
A list from Celltype_Calculate_PerCell() containing Expression_list with marker genes per cell type. |
assay |
Assay to use. Default: "RNA". |
gene_number |
Number of top genes to show per cell type. Default: 5. |
colour_low |
Color for lowest expression. Default: "white". |
colour_high |
Color for highest expression. Default: "navy". |
annotation_col |
Column in meta.data with cell type annotations. Default: "Cell_type_PerCell_SlimR". |
min_cells |
Minimum number of cells required for a cell type to be included in the plot. Default: 10. |
A ggplot object showing marker gene expression dotplot.
Other Section_3_Automated_Annotation:
Celltype_Annotation(),
Celltype_Annotation_PerCell(),
Celltype_Calculate(),
Celltype_Calculate_PerCell(),
Celltype_Verification(),
Parameter_Calculate(),
percell_workflow
## Not run:
# After running Celltype_Calculate_PerCell and Celltype_Annotation_PerCell
dotplot <- Celltype_Verification_PerCell(
  seurat_obj = sce,
  SlimR_percell_result = result,
  gene_number = 5,
  annotation_col = "Cell_type_PerCell_SlimR"
)
print(dotplot)
## End(Not run)
Calculates optimal min_expression, specificity_weight, and threshold parameters using continuous adaptive algorithms based on dataset characteristics.
compute_adaptive_parameters(dataset_features, n_celltypes = 50)
dataset_features |
List of dataset characteristics from extract_dataset_features() |
n_celltypes |
Expected number of cell types in marker database |
List containing min_expression, specificity_weight, threshold, and rationale
Other Section_1_Functions_Use_in_Package:
calculate_cluster_variability(),
calculate_expression(),
calculate_expression_skewness(),
calculate_probability(),
estimate_batch_effect(),
extract_dataset_features()
Roughly estimates the potential impact of batch effects using available metadata.
estimate_batch_effect(seurat_obj, assay)
seurat_obj |
Seurat object |
assay |
Assay name |
Batch effect score (0 indicates no detectable batch effect)
Other Section_1_Functions_Use_in_Package:
calculate_cluster_variability(),
calculate_expression(),
calculate_expression_skewness(),
calculate_probability(),
compute_adaptive_parameters(),
extract_dataset_features()
Computes various statistical features from single-cell data that are used as input for the parameter prediction model.
extract_dataset_features(
  seurat_obj,
  features,
  assay = NULL,
  cluster_col = NULL
)
seurat_obj |
Seurat object |
features |
Features to analyze |
assay |
Assay name |
cluster_col |
Cluster column name |
List of dataset characteristics including expression statistics, variability measures, and cluster properties
Other Section_1_Functions_Use_in_Package:
calculate_cluster_variability(),
calculate_expression(),
calculate_expression_skewness(),
calculate_probability(),
compute_adaptive_parameters(),
estimate_batch_effect()
Create Marker_list from the Cellmarker2 database
Markers_filter_Cellmarker2(
  df,
  species = NULL,
  tissue_class = NULL,
  tissue_type = NULL,
  cancer_type = NULL,
  cell_type = NULL
)
df |
Standardized Cellmarker2 database. It is loaded with data(Cellmarker2) in the SlimR library. |
species |
Species information in the Cellmarker2 database. Common inputs are "Human" or "Mouse". Valid inputs can be retrieved from "Cellmarker2_table". For more information, please refer to the official Cellmarker2 website at http://117.50.127.228/CellMarker/. |
tissue_class |
Tissue_class information in the Cellmarker2 database. Valid inputs can be retrieved from "Cellmarker2_table". For more information, please refer to the official Cellmarker2 website at http://117.50.127.228/CellMarker/. |
tissue_type |
Tissue_type information in the Cellmarker2 database. Valid inputs can be retrieved from "Cellmarker2_table". For more information, please refer to the official Cellmarker2 website at http://117.50.127.228/CellMarker/. |
cancer_type |
Cancer_type information in the Cellmarker2 database. Valid inputs can be retrieved from "Cellmarker2_table". For more information, please refer to the official Cellmarker2 website at http://117.50.127.228/CellMarker/. |
cell_type |
Cell_type information in the Cellmarker2 database. Valid inputs can be retrieved from "Cellmarker2_table". For more information, please refer to the official Cellmarker2 website at http://117.50.127.228/CellMarker/. |
The standardized "Marker_list" in the SlimR package
Other Section_2_Standardized_Markers_List:
Markers_filter_PanglaoDB(),
Markers_filter_ScType(),
Read_excel_markers(),
Read_seurat_markers()
Cellmarker2 <- SlimR::Cellmarker2
Markers_list_Cellmarker2 <- Markers_filter_Cellmarker2(
  Cellmarker2,
  species = "Human",
  tissue_class = "Intestine",
  tissue_type = NULL,
  cancer_type = NULL,
  cell_type = NULL
)
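For orientation, the standardized list format described for gene_list (list names are cell types, and the first column of each element holds the markers) can be mimicked by hand in base R. This is only a sketch of that shape: the column name `marker` and the helper `is_marker_list` are hypothetical, and a hand-built list is not guaranteed to pass whatever validation SlimR performs internally.

```r
# A hand-built list with the documented shape: one table per cell type,
# markers in the first column. Gene choices are common textbook markers.
Markers_list_manual <- list(
  "T cell" = data.frame(marker = c("CD3D", "CD3E", "CD2")),
  "B cell" = data.frame(marker = c("CD79A", "MS4A1")),
  "Epithelial cell" = data.frame(marker = c("EPCAM", "KRT8", "KRT18"))
)

# Minimal structural check: a named list of data frames with >= 1 column.
is_marker_list <- function(x) {
  is.list(x) && !is.null(names(x)) && all(nzchar(names(x))) &&
    all(vapply(x, function(tbl) is.data.frame(tbl) && ncol(tbl) >= 1,
               logical(1)))
}

# Number of markers per cell type.
n_markers <- vapply(Markers_list_manual, nrow, integer(1))
print(is_marker_list(Markers_list_manual))
print(n_markers)
```

Inspecting a list returned by Markers_filter_Cellmarker2() with `names()` and `str()` in the same way is a quick check before passing it on to annotation functions.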
Create Marker_list from the PanglaoDB database
Markers_filter_PanglaoDB(df, species_input, organ_input)
df |
Standardized PanglaoDB database. It is read as data(PanglaoDB) in the SlimR library. |
species_input |
Species information in the PanglaoDB database. Common inputs are "Human" or "Mouse". Valid inputs can be retrieved from "PanglaoDB_table". For more information, please refer to the official PanglaoDB website at https://panglaodb.se/. |
organ_input |
Organ type information in the PanglaoDB database. Valid inputs can be retrieved from "PanglaoDB_table". For more information, please refer to the official PanglaoDB website at https://panglaodb.se/. |
The standardized "Marker_list" in the SlimR package
Other Section_2_Standardized_Markers_List:
Markers_filter_Cellmarker2(),
Markers_filter_ScType(),
Read_excel_markers(),
Read_seurat_markers()
PanglaoDB <- SlimR::PanglaoDB
Markers_list_panglaoDB <- Markers_filter_PanglaoDB(
  PanglaoDB,
  species_input = 'Human',
  organ_input = 'GI tract'
)
Create Marker_list from the ScType database
Markers_filter_ScType(df, tissue_type = NULL, cell_name = NULL)
df |
Standardized ScType database. It is read as |
tissue_type |
Tissue type information in the ScType database. The input
can be retrieved by |
cell_name |
Cell type name information in the ScType database. The input
can be retrieved by |
The standardized "Marker_list" in the SlimR package
Other Section_2_Standardized_Markers_List:
Markers_filter_Cellmarker2(),
Markers_filter_PanglaoDB(),
Read_excel_markers(),
Read_seurat_markers()
ScType <- SlimR::ScType
Markers_list_ScType <- Markers_filter_ScType(
  ScType,
  tissue_type = "Immune system",
  cell_name = NULL
)
A dataset containing marker genes for different Macrophage subtypes from the article "Macrophage diversity in cancer revisited in the era of single-cell omics"
Markers_list_PCTAM
A list with 7 tables.
This list contains marker tables for 7 types of tumor-associated macrophages (TAMs), obtained from the article "Macrophage diversity in cancer revisited in the era of single-cell omics". Reference: Ruo-Yu Ma et al. (2022) https://doi.org/10.1016/j.it.2022.04.008.
Other Section_0_Database:
Cellmarker2,
Cellmarker2_raw,
Cellmarker2_table,
Markers_list_PCTIT,
Markers_list_TCellSI,
Markers_list_scIBD,
PanglaoDB,
PanglaoDB_raw,
PanglaoDB_table,
ScType,
ScType_raw,
ScType_table
A dataset containing marker genes for different T cell types from the article "Pan-cancer single cell landscape of tumor-infiltrating T cells"
Markers_list_PCTIT
A list with 40 tables.
This list contains marker tables for 40 types of pan-cancer tumor-infiltrating T cells (PCTIT), obtained from the article "Pan-cancer single cell landscape of tumor-infiltrating T cells". Reference: L. Zheng et al. (2021) https://doi.org/10.1126/science.abe6474.
Other Section_0_Database:
Cellmarker2,
Cellmarker2_raw,
Cellmarker2_table,
Markers_list_PCTAM,
Markers_list_TCellSI,
Markers_list_scIBD,
PanglaoDB,
PanglaoDB_raw,
PanglaoDB_table,
ScType,
ScType_raw,
ScType_table
A dataset containing marker genes for different human intestine cell types from scIBD
Markers_list_scIBD
A list with 101 tables.
This list contains marker tables for 101 human intestine cell types, obtained from scIBD. Reference: Nie et al. (2023) https://doi.org/10.1038/s43588-023-00464-9. Note: The 'Markers_list_scIBD' was generated using section 2.5.2 with the parameters 'sort_by = "logFC"' and 'gene_filter = 20'.
doi:10.1038/s43588-023-00464-9
Other Section_0_Database:
Cellmarker2,
Cellmarker2_raw,
Cellmarker2_table,
Markers_list_PCTAM,
Markers_list_PCTIT,
Markers_list_TCellSI,
PanglaoDB,
PanglaoDB_raw,
PanglaoDB_table,
ScType,
ScType_raw,
ScType_table
A dataset containing marker genes for different T cell subtypes from TCellSI
Markers_list_TCellSI
A list with ten tables.
This list is a table of 10 types of T cell markers obtained from TCellSI. The data source is "https://github.com/GuoBioinfoLab/TCellSI/blob/main/data/markers.rda", and the reference literature is: Yang et al. (2024) https://doi.org/10.1002/imt2.231.
https://github.com/GuoBioinfoLab/TCellSI/
Other Section_0_Database:
Cellmarker2,
Cellmarker2_raw,
Cellmarker2_table,
Markers_list_PCTAM,
Markers_list_PCTIT,
Markers_list_scIBD,
PanglaoDB,
PanglaoDB_raw,
PanglaoDB_table,
ScType,
ScType_raw,
ScType_table
A dataset containing marker genes for different cell types from PanglaoDB
PanglaoDB
A data frame with 9 columns:
This dataset is used to filter and create a standardized marker list.
Other Section_0_Database:
Cellmarker2,
Cellmarker2_raw,
Cellmarker2_table,
Markers_list_PCTAM,
Markers_list_PCTIT,
Markers_list_TCellSI,
Markers_list_scIBD,
PanglaoDB_raw,
PanglaoDB_table,
ScType,
ScType_raw,
ScType_table
A dataset containing marker genes for different cell types from PanglaoDB
PanglaoDB_raw
A data frame with the 14 columns contained in the PanglaoDB database:
This dataset is used to filter and create a standardized marker list.
Other Section_0_Database:
Cellmarker2,
Cellmarker2_raw,
Cellmarker2_table,
Markers_list_PCTAM,
Markers_list_PCTIT,
Markers_list_TCellSI,
Markers_list_scIBD,
PanglaoDB,
PanglaoDB_table,
ScType,
ScType_raw,
ScType_table
A dataset containing marker genes for different cell types from PanglaoDB
PanglaoDB_table
A list containing the available filter values, such as species, organ, and cell type.
This list is used to choose filters when creating a standardized marker list.
Other Section_0_Database:
Cellmarker2,
Cellmarker2_raw,
Cellmarker2_table,
Markers_list_PCTAM,
Markers_list_PCTIT,
Markers_list_TCellSI,
Markers_list_scIBD,
PanglaoDB,
PanglaoDB_raw,
ScType,
ScType_raw,
ScType_table
This function automatically determines optimal min_expression, specificity_weight, and threshold parameters for single-cell data analysis based on dataset characteristics using adaptive algorithms derived from empirical analysis of single-cell datasets.
Parameter_Calculate(
  seurat_obj,
  features = NULL,
  assay = NULL,
  cluster_col = NULL,
  n_celltypes = 50,
  verbose = TRUE
)
seurat_obj |
A Seurat object containing single-cell data |
features |
Character vector of feature names (genes) to analyze. If NULL, will use highly variable features from the Seurat object. |
assay |
Name of assay to use (default: default assay) |
cluster_col |
Column name in metadata containing cluster information |
n_celltypes |
Expected number of cell types in marker database (default: 50). Used for threshold recommendation calculation. |
verbose |
Whether to print progress messages (default: TRUE) |
A list containing:
min_expression: Recommended expression threshold
specificity_weight: Recommended specificity weight
threshold: Recommended probability threshold for candidate selection
dataset_features: Extracted dataset characteristics
parameter_rationale: Explanation of parameter choices
Other Section_3_Automated_Annotation:
Celltype_Annotation(),
Celltype_Annotation_PerCell(),
Celltype_Calculate(),
Celltype_Calculate_PerCell(),
Celltype_Verification(),
Celltype_Verification_PerCell(),
percell_workflow
## Not run:
SlimR_params <- Parameter_Calculate(
  seurat_obj = sce,
  features = c("CD3E", "CD4", "CD8A"),
  assay = "RNA",
  cluster_col = "seurat_clusters",
  n_celltypes = 98,
  verbose = TRUE
)
## End(Not run)
Example workflow for using SlimR's per-cell annotation functions
The per-cell annotation workflow in SlimR provides an alternative to cluster-based annotation by scoring and labeling individual cells based on marker expression. This is useful when:
Clusters contain mixed cell types
You want finer-grained annotations
Cell states exist on a continuum
UMAP spatial context can improve annotation quality
# 1. Prepare your Seurat object (must have normalized data)
library(SlimR)
library(Seurat)
# 2. Create or load marker list
Markers_list <- Markers_filter_Cellmarker2(
Cellmarker2,
species = "Human",
tissue_class = "Intestine"
)
# 3. Run per-cell annotation
result <- Celltype_Calculate_PerCell(
seurat_obj = sce,
gene_list = Markers_list,
species = "Human",
method = "weighted", # "weighted", "mean", or "AUCell"
min_expression = 0.1,
min_score = 0.1,
verbose = TRUE
)
# 4. Annotate Seurat object
sce <- Celltype_Annotation_PerCell(
seurat_obj = sce,
SlimR_percell_result = result,
plot_UMAP = TRUE,
plot_confidence = TRUE,
annotation_col = "Cell_type_PerCell"
)
# 5. Verify annotations
dotplot <- Celltype_Verification_PerCell(
seurat_obj = sce,
SlimR_percell_result = result,
gene_number = 5,
annotation_col = "Cell_type_PerCell"
)
print(dotplot)
UMAP Spatial Smoothing:
# Use UMAP coordinates to smooth predictions via k-NN
# This reduces noise and improves consistency in spatial regions
result_smooth <- Celltype_Calculate_PerCell(
seurat_obj = sce,
gene_list = Markers_list,
species = "Human",
use_umap_smoothing = TRUE,
k_neighbors = 20, # Number of neighbors to consider
smoothing_weight = 0.3, # 30% weight given to neighbor votes
verbose = TRUE
)
# Compare smoothed vs unsmoothed
library(patchwork) # provides the | operator used to place plots side by side
sce$Cell_type_Smooth <- result_smooth$Cell_annotations$Predicted_cell_type
sce$Cell_type_Raw <- result$Cell_annotations$Predicted_cell_type
DimPlot(sce, group.by = "Cell_type_Raw") |
DimPlot(sce, group.by = "Cell_type_Smooth")
# Method 1: Weighted (recommended for most cases)
# Combines expression with marker specificity and detection rate
result_weighted <- Celltype_Calculate_PerCell(
seurat_obj = sce,
gene_list = Markers_list,
species = "Human",
method = "weighted"
)
# Method 2: Mean (simple, fast)
# Just averages normalized marker expression
result_mean <- Celltype_Calculate_PerCell(
seurat_obj = sce,
gene_list = Markers_list,
species = "Human",
method = "mean"
)
# Method 3: AUCell (rank-based, robust to batch effects)
# Scores based on proportion of markers in top 5%
result_aucell <- Celltype_Calculate_PerCell(
seurat_obj = sce,
gene_list = Markers_list,
species = "Human",
method = "AUCell"
)
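To make the "mean" idea above more concrete, here is a minimal base-R sketch on a toy matrix. It is an illustration only and not SlimR's internal code: the helper `mean_marker_score()`, the matrix `expr`, and the two marker sets are hypothetical, and the expression values are made up.

```r
# Toy normalized expression matrix: genes in rows, cells in columns (made-up values)
expr <- matrix(
  c(2, 0, 0,
    3, 0, 1,
    0, 4, 0,
    0, 2, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("CD3E", "CD4", "EPCAM", "VIL1"),
                  c("cell1", "cell2", "cell3"))
)

# Hypothetical marker sets, one character vector per cell type
markers <- list(T_cell = c("CD3E", "CD4"), Enterocyte = c("EPCAM", "VIL1"))

# Hypothetical helper: average the expression of each marker set in every cell
mean_marker_score <- function(expr, markers) {
  sapply(markers, function(genes) {
    genes <- intersect(genes, rownames(expr))  # ignore markers missing from the data
    colMeans(expr[genes, , drop = FALSE])
  })
}

scores <- mean_marker_score(expr, markers)     # cells in rows, cell types in columns

# Assign each cell the cell type with the highest score (first one wins on ties)
predicted <- colnames(scores)[max.col(scores, ties.method = "first")]
```

In this toy case `cell3` scores equally for both cell types, which is the kind of ambiguous cell that a confidence filter is meant to flag.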
# Cluster-based annotation (original SlimR approach)
cluster_result <- Celltype_Calculate(
seurat_obj = sce,
gene_list = Markers_list,
species = "Human",
cluster_col = "seurat_clusters"
)
sce <- Celltype_Annotation(
seurat_obj = sce,
cluster_col = "seurat_clusters",
SlimR_anno_result = cluster_result,
annotation_col = "Cell_type_Cluster"
)
# Per-cell annotation
percell_result <- Celltype_Calculate_PerCell(
seurat_obj = sce,
gene_list = Markers_list,
species = "Human"
)
sce <- Celltype_Annotation_PerCell(
seurat_obj = sce,
SlimR_percell_result = percell_result,
annotation_col = "Cell_type_PerCell"
)
# Compare
library(ggplot2)
library(patchwork)
p1 <- DimPlot(sce, group.by = "Cell_type_Cluster") +
ggtitle("Cluster-based")
p2 <- DimPlot(sce, group.by = "Cell_type_PerCell") +
ggtitle("Per-cell")
p1 | p2
# Check agreement
table(sce$Cell_type_Cluster, sce$Cell_type_PerCell)
# For large datasets, adjust chunk_size to manage memory
result <- Celltype_Calculate_PerCell(
seurat_obj = sce,
gene_list = Markers_list,
species = "Human",
chunk_size = 10000, # Process 10k cells at a time
verbose = TRUE
)
# For UMAP smoothing, install RANN for 10-100x speedup
# install.packages("RANN")
result_smooth <- Celltype_Calculate_PerCell(
seurat_obj = sce,
gene_list = Markers_list,
species = "Human",
use_umap_smoothing = TRUE,
k_neighbors = 15
# RANN will be used automatically if installed
)
# Cell-level annotations
head(result$Cell_annotations)
# Cell_barcode Predicted_cell_type Max_score Confidence
# 1 AAACCTGAG... Enterocyte 0.85 0.62
# 2 AAACCTGCA... Goblet cell 0.72 0.45
# Summary statistics
result$Summary
# Cell_type Count Percentage
# 1 Enterocyte 5432 45.2
# 2 Goblet cell 2156 17.9
# Full probability matrix (if return_scores = TRUE)
result$Probability_matrix[1:5, 1:3]
# Enterocyte Goblet_cell Stem_cell
# AAACCTGAG... 0.85 0.10 0.05
# Extract high-confidence cells
high_conf <- result$Cell_annotations$Cell_barcode[
result$Cell_annotations$Confidence > 0.5
]
# Extract uncertain cells for manual review
uncertain <- result$Cell_annotations$Cell_barcode[
result$Cell_annotations$Confidence < 0.2
]
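The k-NN smoothing mentioned in the UMAP example above can be pictured with a small base-R sketch. It assumes that smoothing blends each cell's own scores with the mean scores of its nearest neighbors in the embedding; whether SlimR uses exactly this formula is an assumption. The helper `smooth_scores()` and all values are hypothetical.

```r
# Toy per-cell score matrix (cells x cell types); values are made up
scores <- matrix(
  c(0.90, 0.10,
    0.80, 0.20,
    0.45, 0.55,   # borderline cell sitting among type-A neighbors
    0.10, 0.90,
    0.20, 0.80),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("cell", 1:5), c("A", "B"))
)

# Toy 2D embedding standing in for UMAP coordinates
umap <- cbind(c(0, 0.1, 0.05, 5, 5.1), c(0, 0.1, 0.05, 5, 5.1))
rownames(umap) <- rownames(scores)

# Hypothetical helper: blend each cell's scores with the mean of its k nearest neighbors
smooth_scores <- function(scores, coords, k = 2, weight = 0.3) {
  d <- as.matrix(dist(coords))
  diag(d) <- Inf                              # a cell is not its own neighbor
  smoothed <- scores
  for (i in seq_len(nrow(scores))) {
    nn <- order(d[i, ])[seq_len(k)]           # indices of the k closest cells
    neighbor_mean <- colMeans(scores[nn, , drop = FALSE])
    smoothed[i, ] <- (1 - weight) * scores[i, ] + weight * neighbor_mean
  }
  smoothed
}

smoothed <- smooth_scores(scores, umap, k = 2, weight = 0.3)

# Label each cell by its highest score before and after smoothing
label_of <- function(m) colnames(m)[max.col(m, ties.method = "first")]
raw_labels <- label_of(scores)
smooth_labels <- label_of(smoothed)
```

Here `cell3` switches from "B" to "A" after smoothing because both of its neighbors score strongly for "A". A larger `weight` pulls cells harder toward their neighborhood, which can also erase genuinely rare cells, so the value is a trade-off.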
Other Section_3_Automated_Annotation:
Celltype_Annotation(),
Celltype_Annotation_PerCell(),
Celltype_Calculate(),
Celltype_Calculate_PerCell(),
Celltype_Verification(),
Celltype_Verification_PerCell(),
Parameter_Calculate()
This S3 method allows pheatmap objects (returned by Celltype_Calculate())
to be plotted using the generic plot() function. Without this method,
attempting to use plot() on a pheatmap object results in an error.
## S3 method for class 'pheatmap'
plot(x, ...)
x |
A pheatmap object, typically from Celltype_Calculate() |
... |
Additional arguments (currently ignored) |
Pheatmap objects contain a gtable component that needs to be drawn using
grid graphics. This method handles that automatically when plot() is called.
Alternative ways to display pheatmaps:
print(pheatmap_object) - Works natively
plot(pheatmap_object) - Works after loading SlimR
grid::grid.draw(pheatmap_object$gtable) - Direct access
Invisibly returns the input pheatmap object after displaying it
## Not run:
# After running Celltype_Calculate()
cluster_results <- Celltype_Calculate(
  seurat_obj = sce,
  gene_list = Markers_list,
  species = "Human"
)

# Now both of these work:
print(cluster_results$Heatmap_plot)
plot(cluster_results$Heatmap_plot)
## End(Not run)
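For readers curious how such a method can work, the following is a minimal sketch of an S3 `plot()` method that draws the `gtable` component with grid graphics and returns its input invisibly, as described above. It is a hypothetical re-implementation for illustration and may differ from the method SlimR ships. The `mock` object is a stand-in: a real pheatmap object carries a gtable where the sketch uses a plain rectangle grob.

```r
library(grid)

# Hypothetical re-implementation for illustration only
plot.pheatmap <- function(x, ...) {
  grid.newpage()          # start from a clean page
  grid.draw(x$gtable)     # draw the grid object stored in the gtable component
  invisible(x)            # return the input without printing it
}

# Stand-in object with the same class and a drawable gtable component
mock <- structure(list(gtable = rectGrob()), class = "pheatmap")

pdf(NULL)                 # null device so the sketch also runs non-interactively
res <- withVisible(plot(mock))
dev.off()

res$visible               # FALSE: the method returned its value invisibly
```

Because the generic `plot()` dispatches on the class attribute, defining the method is enough; no change to the pheatmap object itself is needed.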
Create a "Marker_list" from Excel (".xlsx") files
Read_excel_markers(path, has_colnames = TRUE)
path |
Path to the marker file stored in ".xlsx" format. Each sheet in the file is named after a cell type. The first row of each sheet is the table header, the first column contains the markers, and the following columns contain metric information. |
has_colnames |
Logical value indicating whether the first row contains column names. If FALSE, the first column will be named "Markers" and subsequent columns will be named "Col1", "Col2", etc. |
The standardized "Marker_list" in the SlimR package.
Other Section_2_Standardized_Markers_List:
Markers_filter_Cellmarker2(),
Markers_filter_PanglaoDB(),
Markers_filter_ScType(),
Read_seurat_markers()
## Not run:
Markers_list_Excel <- Read_excel_markers(
  "D:/Laboratory/Marker_load.xlsx"
)
## End(Not run)
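The sheet layout described in the arguments can be sketched without an Excel file. The example below assumes that the resulting list holds one data frame per cell type, keyed by sheet name; that structure is an assumption, not something confirmed here. The helper `name_marker_columns()` is hypothetical and only mirrors the documented naming rule for `has_colnames = FALSE`.

```r
# Hypothetical helper: first column becomes "Markers", the rest "Col1", "Col2", ...
name_marker_columns <- function(sheet) {
  n_extra <- ncol(sheet) - 1
  names(sheet) <- c("Markers", if (n_extra > 0) paste0("Col", seq_len(n_extra)))
  sheet
}

# Two toy "sheets" as they might look when read without headers (made-up values)
raw_sheets <- list(
  "T cell" = data.frame(V1 = c("CD3E", "CD4"), V2 = c(0.9, 0.7)),
  "B cell" = data.frame(V1 = c("CD79A", "MS4A1"), V2 = c(0.8, 0.6))
)

# One data frame per cell type, keyed by the sheet name
Markers_list_sketch <- lapply(raw_sheets, name_marker_columns)
str(Markers_list_sketch)
```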
Create a "Marker_list" from a Seurat object
Read_seurat_markers(
  df,
  sources = c("Seurat", "presto"),
  sort_by = "FSS",
  gene_filter = 20
)
df |
Data frame generated by the "FindAllMarkers" function. Using the parameters "group.by = "Cell_type"" and "only.pos = TRUE" is recommended. |
sources |
Type of marker source to use. One of: "Seurat" or "presto". |
sort_by |
Marker sorting parameter, for Seurat sources, select "avg_log2FC" or
"p_val_adj" or "FSS" (Feature Significance Score, FSS, product value of |
gene_filter |
The number of markers kept for each cell type, ranked by the "sort_by" parameter. The default is "gene_filter = 20". |
The standardized "Marker_list" in the SlimR package.
Other Section_2_Standardized_Markers_List:
Markers_filter_Cellmarker2(),
Markers_filter_PanglaoDB(),
Markers_filter_ScType(),
Read_excel_markers()
## Not run:
# Example for Seurat sources markers
seurat_markers <- Seurat::FindAllMarkers(
  object = sce,
  group.by = "Cell_type",
  only.pos = TRUE
)
Markers_list_Seurat <- Read_seurat_markers(
  seurat_markers,
  sources = "Seurat",
  sort_by = "avg_log2FC",
  gene_filter = 20
)

# Example for presto sources markers
seurat_markers <- dplyr::filter(
  presto::wilcoxauc(
    X = sce,
    group_by = "Cell_type",
    seurat_assay = "RNA"
  ),
  padj < 0.05, logFC > 0.5
)
Markers_list_Seurat <- Read_seurat_markers(
  seurat_markers,
  sources = "presto",
  sort_by = "logFC",
  gene_filter = 20
)
## End(Not run)
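The effect of "sort_by" and "gene_filter" can be illustrated on a toy table in base R. This is a sketch of the general top-N-per-group idea, not the package's implementation. The helper `top_markers()` is hypothetical, the values are made up, and the "FSS" option is left out because its exact definition is not given in this section.

```r
# Toy table using column names that FindAllMarkers() returns (values are made up)
markers_df <- data.frame(
  cluster    = c("T cell", "T cell", "T cell", "B cell", "B cell"),
  gene       = c("CD3E", "CD4", "IL7R", "CD79A", "MS4A1"),
  avg_log2FC = c(2.1, 1.2, 3.0, 2.5, 1.9),
  p_val_adj  = c(1e-10, 1e-4, 1e-20, 1e-15, 1e-8),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the best `gene_filter` markers per cell type
top_markers <- function(df, sort_by = "avg_log2FC", gene_filter = 20) {
  lapply(split(df, df$cluster), function(d) {
    # larger fold change is better, smaller adjusted p-value is better
    ord <- order(d[[sort_by]], decreasing = (sort_by != "p_val_adj"))
    head(d[ord, c("gene", sort_by)], gene_filter)
  })
}

top2 <- top_markers(markers_df, sort_by = "avg_log2FC", gene_filter = 2)
top2[["T cell"]]$gene
```

With `gene_filter = 2`, the T cell entry keeps IL7R and CD3E and drops CD4, the marker with the lowest fold change.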
A processed long-format dataset containing marker genes for different cell types from the ScType database. Each row represents one marker gene for a given tissue type and cell type.
ScType
A tibble with 3 columns:
Tissue type (e.g., "Immune system", "Brain", "Liver")
Cell type name, formatted as "cellName(shortName)" when a short name is available, or "cellName" otherwise
Gene symbol of the marker
This dataset is used to filter and create a standardized marker list.
The dataset can be filtered based on tissue type and cell name to generate
a list of marker genes for specific cell types using
Markers_filter_ScType.
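As a rough picture of what such filtering involves, the base-R sketch below subsets a toy long-format table by tissue and groups the markers by cell type. The column names `tissue_type`, `cell_name`, and `marker`, as well as all rows, are hypothetical placeholders, and the arguments of Markers_filter_ScType are not shown in this section, so the function itself is not called.

```r
# Toy long-format table: one row per marker gene (made-up rows, hypothetical column names)
sctype_toy <- data.frame(
  tissue_type = c("Immune system", "Immune system", "Immune system", "Liver"),
  cell_name   = c("Naive B cells(B)", "Naive B cells(B)",
                  "Memory CD4+ T cells", "Hepatocytes"),
  marker      = c("CD19", "MS4A1", "CD4", "ALB"),
  stringsAsFactors = FALSE
)

# Keep one tissue, then collect the markers of each cell type into a named list
immune <- sctype_toy[sctype_toy$tissue_type == "Immune system", ]
marker_list_toy <- split(immune$marker, immune$cell_name)
marker_list_toy
```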
https://github.com/IanevskiAleksandr/sc-type
Other Section_0_Database:
Cellmarker2,
Cellmarker2_raw,
Cellmarker2_table,
Markers_list_PCTAM,
Markers_list_PCTIT,
Markers_list_TCellSI,
Markers_list_scIBD,
PanglaoDB,
PanglaoDB_raw,
PanglaoDB_table,
ScType_raw,
ScType_table
The original ScType marker database before processing.
ScType_raw
A tibble with 5 columns:
Tissue type
Full cell type name
Comma-separated positive marker genes
Comma-separated negative marker genes (not used in processing)
Abbreviated cell type name
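The step from this raw layout to a long format with one row per marker can be sketched in base R. The example follows the rules stated in these pages (positive markers are comma-separated, negative markers are not used, and the label is "cellName(shortName)" when a short name exists), but the column names and rows are hypothetical and the package's actual processing code may differ.

```r
# Toy raw table with the five kinds of columns listed above (made-up rows and names)
raw_toy <- data.frame(
  tissue     = c("Immune system", "Liver"),
  cell_name  = c("Naive B cells", "Hepatocytes"),
  positive   = c("CD19,MS4A1", "ALB,APOA1,TTR"),
  negative   = c("CD3E", ""),
  short_name = c("B", ""),
  stringsAsFactors = FALSE
)

# Build the label: "cellName(shortName)" when a short name exists, else "cellName"
label <- ifelse(nzchar(raw_toy$short_name),
                paste0(raw_toy$cell_name, "(", raw_toy$short_name, ")"),
                raw_toy$cell_name)

# Split the comma-separated positive markers; negative markers are ignored
genes <- strsplit(raw_toy$positive, ",", fixed = TRUE)

# One row per marker gene
long_toy <- data.frame(
  tissue    = rep(raw_toy$tissue, lengths(genes)),
  cell_type = rep(label, lengths(genes)),
  marker    = trimws(unlist(genes)),
  stringsAsFactors = FALSE
)
long_toy
```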
https://github.com/IanevskiAleksandr/sc-type
Other Section_0_Database:
Cellmarker2,
Cellmarker2_raw,
Cellmarker2_table,
Markers_list_PCTAM,
Markers_list_PCTIT,
Markers_list_TCellSI,
Markers_list_scIBD,
PanglaoDB,
PanglaoDB_raw,
PanglaoDB_table,
ScType,
ScType_table
A list of frequency tables summarizing the ScType database, useful for exploring available tissue types and cell types before filtering.
ScType_table
A list with 2 elements:
Frequency table of tissue types
Frequency table of cell type names
https://github.com/IanevskiAleksandr/sc-type
Other Section_0_Database:
Cellmarker2,
Cellmarker2_raw,
Cellmarker2_table,
Markers_list_PCTAM,
Markers_list_PCTIT,
Markers_list_TCellSI,
Markers_list_scIBD,
PanglaoDB,
PanglaoDB_raw,
PanglaoDB_table,
ScType,
ScType_raw