scglue.models.configure_dataset

scglue.models.configure_dataset(adata, prob_model, use_highly_variable=True, use_layer=None, use_rep=None, use_batch=None, use_cell_type=None, use_dsc_weight=None, use_obs_names=False, nan_sparse=False)

Configure dataset for model training

Parameters:

adata (AnnData) – Dataset to be configured
prob_model (str) – Probabilistic generative model used by the decoder, must be one of {"Normal", "ZIN", "ZILN", "NB", "ZINB", "Beta"}.
use_highly_variable (bool) – Whether to use highly variable features
use_layer (typing.Optional[str]) – Data layer to use (key in adata.layers)
use_rep (typing.Optional[str]) – Data representation to use as the first encoder transformation (key in adata.obsm)
use_batch (typing.Optional[str]) – Data batch to use (key in adata.obs)
use_cell_type (typing.Optional[str]) – Data cell type to use (key in adata.obs)
use_dsc_weight (typing.Optional[str]) – Discriminator sample weight to use (key in adata.obs)
use_obs_names (bool) – Whether to use obs_names to mark paired cells across different datasets
nan_sparse (bool) – Whether missing entries in a sparse matrix indicate NaN
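
Because several parameters are keys into adata.layers, adata.obsm or adata.obs, it can help to check them before configuring. The sketch below is a hypothetical pre-flight helper, not part of scglue: the name check_configure_args and its messages are made up for illustration, and it only encodes the allowed prob_model values and key locations listed above. It runs here on a stand-in object, so it needs neither scglue nor anndata.

```python
from types import SimpleNamespace

# Allowed prob_model values, copied from the parameter list above.
PROB_MODELS = {"Normal", "ZIN", "ZILN", "NB", "ZINB", "Beta"}


def check_configure_args(adata, prob_model, use_layer=None, use_rep=None,
                         use_batch=None, use_cell_type=None, use_dsc_weight=None):
    """Hypothetical pre-flight check (not part of scglue); returns a list of problems."""
    problems = []
    if prob_model not in PROB_MODELS:
        problems.append(f"unknown prob_model {prob_model!r}")
    # Each use_* key must exist in the container named in its description.
    lookups = [
        ("use_layer", use_layer, "layers"),
        ("use_rep", use_rep, "obsm"),
        ("use_batch", use_batch, "obs"),
        ("use_cell_type", use_cell_type, "obs"),
        ("use_dsc_weight", use_dsc_weight, "obs"),
    ]
    for name, key, attr in lookups:
        if key is not None and key not in getattr(adata, attr):
            problems.append(f"{name}={key!r} not found in adata.{attr}")
    return problems


# Stand-in object for illustration only; a real AnnData supports the same `in` checks.
demo = SimpleNamespace(layers={"counts": None}, obsm={}, obs={"batch": None})
problems = check_configure_args(demo, "NB", use_layer="counts",
                                use_rep="X_pca", use_batch="batch")
print(problems)
```

An empty list means every named key was found; in the demo, the missing "X_pca" entry in obsm is reported.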

Return type:

None

Note

The use_rep option applies to encoder inputs, but not the decoders, which are always fitted on data in the original space.
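
A minimal usage sketch of this split, assuming an RNA dataset whose raw counts are stored in a layer named "counts" and whose PCA embedding is stored in obsm under "X_pca" (both key names are assumptions about your data, not requirements of the function). The wrapper configure_rna and the RNA_CONFIG dictionary are hypothetical names used only here. With these settings the encoder would receive the "X_pca" representation, while, per the note above, the decoder is still fitted on data in the original space.

```python
# Assumed settings for an RNA dataset; the layer and obsm key names are examples only.
RNA_CONFIG = dict(
    prob_model="NB",           # one of the documented prob_model options
    use_highly_variable=True,  # same as the documented default
    use_layer="counts",        # assumed key in adata.layers
    use_rep="X_pca",           # assumed key in adata.obsm, used for encoder input
)


def configure_rna(rna, **overrides):
    """Hypothetical wrapper: call configure_dataset with RNA_CONFIG plus overrides."""
    import scglue  # imported lazily so this module loads without scglue installed

    kwargs = {**RNA_CONFIG, **overrides}
    # Documented return type is None, so nothing is returned here either.
    scglue.models.configure_dataset(rna, **kwargs)
```

Calling configure_rna(rna, use_batch="batch") would add a batch key on top of these settings, assuming adata.obs has a "batch" column.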