scglue.models.scglue.configure_dataset(adata, prob_model, use_highly_variable=True, use_layer=None, use_rep=None, use_batch=None, use_cell_type=None, use_dsc_weight=None, use_uid=None)

Configure dataset for model training

Parameters

  • adata (AnnData) – Dataset to be configured

  • prob_model (str) – Probabilistic generative model used by the decoder, must be one of {"Normal", "ZIN", "ZILN", "NB", "ZINB"}.

  • use_highly_variable (bool) – Whether to use highly variable features

  • use_layer (Optional[str]) – Data layer to use (key in adata.layers)

  • use_rep (Optional[str]) – Data representation to use as the first encoder transformation (key in adata.obsm)

  • use_batch (Optional[str]) – Data batch to use (key in adata.obs)

  • use_cell_type (Optional[str]) – Data cell type to use (key in adata.obs)

  • use_dsc_weight (Optional[str]) – Discriminator sample weight to use (key in adata.obs)

  • use_uid (Optional[str]) – Unique cell ID used to mark paired cells across multiple datasets (key in adata.obs)


The use_rep option applies to encoder inputs, but not the decoders, which are always fitted on data in the original space.
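The constraints listed above can be checked before calling the function. The following is a minimal sketch, not part of scglue: check_configure_args is a hypothetical helper that takes plain lists of key names (standing in for the keys of adata.layers, adata.obsm and adata.obs) and reports arguments that would not match the documented requirements. The key names "counts", "X_pca" and "batch" are assumptions chosen for illustration.

```python
# Allowed decoder models, copied from the prob_model description above.
PROB_MODELS = {"Normal", "ZIN", "ZILN", "NB", "ZINB"}


def check_configure_args(prob_model, layers=(), obsm=(), obs=(),
                         use_layer=None, use_rep=None, use_batch=None,
                         use_cell_type=None, use_dsc_weight=None):
    """Hypothetical pre-flight check (not a scglue function).

    `layers`, `obsm` and `obs` are the key names available in
    adata.layers, adata.obsm and adata.obs respectively.
    Returns a list of problem descriptions; an empty list means
    nothing was found.
    """
    problems = []
    if prob_model not in PROB_MODELS:
        problems.append(
            f"prob_model {prob_model!r} is not one of {sorted(PROB_MODELS)}"
        )
    # Each optional argument must name a key in the documented container.
    lookups = [
        ("use_layer", use_layer, "adata.layers", layers),
        ("use_rep", use_rep, "adata.obsm", obsm),
        ("use_batch", use_batch, "adata.obs", obs),
        ("use_cell_type", use_cell_type, "adata.obs", obs),
        ("use_dsc_weight", use_dsc_weight, "adata.obs", obs),
    ]
    for name, value, container, keys in lookups:
        if value is not None and value not in keys:
            problems.append(f"{name}={value!r} is not a key in {container}")
    return problems


# A consistent set of arguments yields no problems.
ok = check_configure_args(
    "NB",
    layers=["counts"], obsm=["X_pca"], obs=["batch"],
    use_layer="counts", use_rep="X_pca", use_batch="batch",
)
print(ok)

# An unsupported model and a missing layer are both reported.
for problem in check_configure_args("Poisson", layers=["counts"], use_layer="raw"):
    print(problem)

# With scglue installed and an AnnData object `adata` holding these keys,
# the corresponding call would look like:
# scglue.models.scglue.configure_dataset(
#     adata, "NB", use_highly_variable=True,
#     use_layer="counts", use_rep="X_pca", use_batch="batch",
# )
```

In the commented call, the encoder would receive the "X_pca" representation, while the decoder would still be fitted on the data in the original space, as stated in the note above.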

Return type