scglue.models.data.ArrayDataset

class scglue.models.data.ArrayDataset(*arrays, getitem_size=1)

Bases: Dataset

Array dataset for numpy.ndarray and scipy.sparse.spmatrix objects. The arrays are treated as unpaired, so they do not need to have identical sizes in the first dimension; smaller arrays are recycled. Data fetched from this dataset is automatically densified.
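The recycling and densification described above can be illustrated with a minimal sketch. This is not scglue's implementation: `fetch_recycled` is a hypothetical helper, and wrapping indices with a modulo is only one plausible reading of "recycled". It uses NumPy and SciPy only.

```python
import numpy as np
import scipy.sparse as sp


def fetch_recycled(arrays, index):
    """Hypothetical helper (not part of scglue): fetch rows from each array,
    wrapping the index around arrays that have fewer rows."""
    index = np.asarray(index)
    fetched = []
    for array in arrays:
        # Modulo wraps out-of-range indices back to the start of shorter arrays.
        rows = array[index % array.shape[0]]
        # Sparse slices are converted to dense numpy arrays.
        if sp.issparse(rows):
            rows = rows.toarray()
        fetched.append(rows)
    return fetched


x = np.arange(10, dtype=np.float32).reshape(5, 2)  # dense, 5 rows
y = sp.csr_matrix(np.eye(3, dtype=np.float32))     # sparse, 3 rows

bx, by = fetch_recycled([x, y], [0, 3, 4])
print(bx.shape, by.shape)  # both have 3 rows, although y only has 3 rows in total
```

Here indices 3 and 4 exceed the size of `y`, so they wrap to rows 0 and 1, and `by` comes back as a dense array.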

Parameters:

*arrays (typing.Union[numpy.ndarray, scipy.sparse._matrix.spmatrix]) – An arbitrary number of data arrays

Note

Data are kept as arrays rather than tensors because sparse tensors do not support slicing. Arrays are converted to tensors only after minibatch slicing.
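As a rough illustration of the note above, the sketch below slices a minibatch out of a sparse matrix first and densifies only that slice. The matrix size, batch size and variable names are assumptions made for the example, and the final tensor conversion is left out to avoid a deep learning dependency.

```python
import numpy as np
import scipy.sparse as sp

# Assumed toy data: a 1000 x 200 sparse matrix with about 1% nonzero entries.
full = sp.random(1000, 200, density=0.01, format="csr",
                 dtype=np.float32, random_state=0)

# Pick 32 rows as a minibatch (the batch size is an arbitrary choice).
rng = np.random.default_rng(0)
batch_idx = rng.choice(full.shape[0], size=32, replace=False)

# Slice while still sparse, then densify only the selected rows.
minibatch = full[batch_idx].toarray()

# A tensor conversion would follow here, after slicing.
print(minibatch.shape, minibatch.nbytes)
```

Only the 32 selected rows are ever held in dense form, which is the point of delaying the conversion until after slicing.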

Methods

accept_shuffle

Accept shuffling result

propose_shuffle

Propose shuffling using a given random seed

random_split

Randomly split the dataset into multiple subdatasets according to given fractions.
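The idea of splitting by fractions can be sketched on plain row indices. `split_indices` is a hypothetical helper written for this example, not the signature or implementation of `random_split`; the rounding rule and the `seed` parameter are assumptions.

```python
import numpy as np


def split_indices(n, fractions, seed=0):
    """Hypothetical helper (not scglue's random_split): shuffle the indices
    0..n-1 and cut them into consecutive chunks sized by the fractions."""
    if not np.isclose(sum(fractions), 1.0):
        raise ValueError("fractions must sum to 1")
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    # Cumulative fractions give the cut points; the last one (== n) is dropped.
    cuts = np.round(np.cumsum(fractions)[:-1] * n).astype(int)
    return np.split(perm, cuts)


train_idx, val_idx = split_indices(10, [0.8, 0.2])
print(len(train_idx), len(val_idx))
```

Each index lands in exactly one chunk, so the chunks together cover the whole dataset without overlap.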

Attributes

arrays

Internal array objects

logger