starling.utility
Module contents
- class starling.utility.ConcatDataset(datasets)[source]
Bases: Dataset
A dataset composed of datasets
- Parameters:
datasets (list[Tensor]) – the datasets to concatenate, each with d.shape[0] == m
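The requirement that every input shares the same first dimension m suggests that item i of the combined dataset pairs up row i of each input. The following is a minimal sketch of that idea using plain Python lists; the class name `ZippedDataset` is hypothetical and this is not the library's implementation.

```python
class ZippedDataset:
    """Hypothetical stand-in: item i is a tuple holding row i of every input."""

    def __init__(self, datasets):
        # Every input must have the same number of rows (the shared dimension m).
        lengths = {len(d) for d in datasets}
        if len(lengths) != 1:
            raise ValueError("all datasets must share the same first dimension m")
        self.datasets = list(datasets)

    def __len__(self):
        return len(self.datasets[0])

    def __getitem__(self, i):
        return tuple(d[i] for d in self.datasets)


expression = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # m = 3 rows, 2 features
cell_size = [10.0, 12.0, 9.0]                      # m = 3 entries
paired = ZippedDataset([expression, cell_size])
```

Indexing `paired[1]` then yields the second expression row together with the second cell size, which is the form a data loader would batch.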
- starling.utility.compute_p_s_given_gamma(S, Theta, dist_option)[source]
- Returns:
# of obs x # of cluster x # of cluster matrix - p(s_n | gamma_n = [c,c’])
- starling.utility.compute_p_s_given_gamma_model_overlap(S, Theta)[source]
- Returns:
# of obs x # of cluster x # of cluster matrix - p(s_n | gamma_n = [c,c’])
- starling.utility.compute_p_s_given_z(S, Theta, dist_option)[source]
- Returns:
# of obs x # of cluster matrix - p(s_n | z_n = c)
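To make the documented return shape concrete, here is a rough sketch of an observations x clusters table for cell size. It assumes the Normal option, one size mean and variance per cluster, and values in log space; the layout of Theta is not documented here, so `size_means` and `size_vars` are hypothetical inputs.

```python
import math


def log_p_s_given_z(S, size_means, size_vars):
    """Return an (observations x clusters) nested list of Normal log-densities."""
    return [
        [
            -0.5 * (math.log(2 * math.pi * v) + (s - m) ** 2 / v)
            for m, v in zip(size_means, size_vars)
        ]
        for s in S
    ]


S = [10.0, 30.0, 11.0]  # one cell size per observation
table = log_p_s_given_z(S, size_means=[10.0, 30.0], size_vars=[4.0, 4.0])
```

Row n of `table` holds one entry per cluster c, so the largest entry in a row marks the cluster whose size distribution best explains that cell.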
- starling.utility.compute_p_y_given_gamma(Y, Theta, dist_option)[source]
- Returns:
# of obs x # of cluster x # of cluster matrix - p(y_n | gamma_n = [c,c’])
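The doublet case adds one axis: each observation gets a score for every ordered pair of clusters [c, c’]. The sketch below only illustrates that observations x clusters x clusters shape. It assumes a Normal likelihood, independent features, and a doublet whose mean and variance are the averages of the two clusters' values; none of these modelling choices are confirmed by this page.

```python
import math


def log_p_y_given_gamma(Y, means, variances):
    """Return an (observations x clusters x clusters) nested list of log-likelihoods."""
    n_clusters = len(means)
    out = []
    for row in Y:
        per_pair = [[0.0] * n_clusters for _ in range(n_clusters)]
        for c in range(n_clusters):
            for d in range(n_clusters):
                total = 0.0
                for j, y in enumerate(row):
                    mu = (means[c][j] + means[d][j]) / 2           # assumed doublet mean
                    var = (variances[c][j] + variances[d][j]) / 2  # assumed doublet variance
                    total += -0.5 * (math.log(2 * math.pi * var) + (y - mu) ** 2 / var)
                per_pair[c][d] = total
        out.append(per_pair)
    return out


Y = [[1.0], [0.0]]            # two observations, one feature
means = [[0.0], [2.0]]        # two clusters
variances = [[1.0], [1.0]]
cube = log_p_y_given_gamma(Y, means, variances)
```

With these toy values, the first observation sits midway between the two cluster means, so its mixed pair scores higher than either same-cluster pair.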
- starling.utility.compute_p_y_given_z(Y, Theta, dist_option)[source]
- Returns:
# of obs x # of cluster matrix - p(y_n | z_n = c)
- starling.utility.init_clustering(initial_clustering_method, adata, k=None, labels=None)[source]
Compute initial cluster centroids, variances & labels
- Parameters:
adata (AnnData) – The initial data to be analyzed
initial_clustering_method (Literal['User', 'KM', 'GMM', 'FS', 'PG']) – The method for computing the initial clusters, one of KM (KMeans), GMM (Gaussian Mixture Model), FS (FlowSOM), User (user-provided), or PG (PhenoGraph).
k (Optional[int]) – The number of clusters. It is passed as n_components when initial_clustering_method is GMM (required), as k when initial_clustering_method is KM (required), as k when initial_clustering_method is FS (required), and as ? when initial_clustering_method is PG (optional). It can be omitted when initial_clustering_method is “User”, because the user passes in their own labels.
labels (Optional[ndarray]) – optional, user-provided labels
- Raises:
ValueError
- Return type:
AnnData
- Returns:
The annotated data with labels, centroids, and variances
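The interplay between the method, k, and labels can be summarised as a small pre-check. This is only a sketch: the helper `check_init_args` is hypothetical, and the exact conditions under which the library raises ValueError are an assumption drawn from the parameter descriptions above.

```python
def check_init_args(method, k=None, labels=None):
    """Hypothetical pre-check mirroring the documented k/labels requirements."""
    allowed = {"User", "KM", "GMM", "FS", "PG"}
    if method not in allowed:
        raise ValueError(f"unknown initial_clustering_method: {method!r}")
    # KM, GMM and FS are documented as requiring the number of clusters.
    if method in {"KM", "GMM", "FS"} and k is None:
        raise ValueError(f"k is required when initial_clustering_method is {method!r}")
    # Assumption: 'User' needs labels, since k may be omitted in that case.
    if method == "User" and labels is None:
        raise ValueError("labels are required when initial_clustering_method is 'User'")
    return method, k, labels
```

For example, `check_init_args("PG")` passes because k is optional for PhenoGraph, while `check_init_args("KM")` raises because KMeans needs k.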
- starling.utility.model_parameters(adata, singlet_prop)[source]
Return initial model parameters
- Parameters:
adata (AnnData) – The sample to be analyzed, with clusters and annotations from init_clustering()
singlet_prop (float) – The anticipated proportion of cells free of segmentation errors
- Return type:
Dict[str, ndarray]
- Returns:
the model parameters
- starling.utility.predict(dataLoader, model_params, dist_option, model_cell_size, model_zplane_overlap, threshold=0.5)[source]
Return singlet/doublet probabilities, the singlet cluster assignment probability matrix, and assignment labels
- Parameters:
dataLoader (DataLoader) – the dataloader
model_params (Dict[str, Tensor]) – the model parameters
dist_option (str) – one of ‘T’ for Student-T (df=2) or ‘N’ for Normal (Gaussian)
model_cell_size (bool) – whether cell size is modeled
model_zplane_overlap (bool) – whether z-plane overlap is modeled
threshold (float)
- Returns:
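How threshold is applied is not described on this page. One plausible reading, sketched below, is that a cell keeps its most probable cluster only when its singlet probability exceeds the threshold. The helper `label_cells` and the use of `None` for cells below the cutoff are hypothetical.

```python
def label_cells(singlet_prob, assignment_prob, threshold=0.5):
    """Pick the most probable cluster per cell, or None when the cell looks like a doublet."""
    labels = []
    for p_singlet, row in zip(singlet_prob, assignment_prob):
        best = max(range(len(row)), key=row.__getitem__)  # argmax over clusters
        labels.append(best if p_singlet > threshold else None)
    return labels


singlet_prob = [0.9, 0.2, 0.7]
assignment_prob = [[0.8, 0.2], [0.5, 0.5], [0.1, 0.9]]
labels = label_cells(singlet_prob, assignment_prob)
```

Here the second cell falls below the default cutoff of 0.5 and is left unlabelled, while the other two receive clusters 0 and 1.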
- starling.utility.simulate_data(Y, S=None, model_overlap=True)[source]
Use real data to simulate singlets and doublets in equal proportions. Returns the same number of cells as in Y/S: half are singlets and the other half are doublets
- Parameters:
Y (Tensor) – data matrix of shape m x n
S (Optional[Tensor]) – data matrix of shape m
model_overlap (bool) – if cell size is modelled, whether STARLING should also model z-plane overlap
- Return type:
Tuple[tensor]
- Returns:
the simulated data
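The half-singlet, half-doublet split can be illustrated with a small pure-Python sketch. It assumes a doublet is formed by averaging the rows of two randomly chosen real cells and ignores cell size entirely; the library's actual mixing rule is not documented here, and `simulate_half_doublets` is a hypothetical name.

```python
import random


def simulate_half_doublets(Y, seed=0):
    """Return (data, is_doublet) with as many rows as Y: half singlets, half doublets."""
    rng = random.Random(seed)
    n = len(Y)
    n_singlets = n // 2
    # Singlets are real cells sampled without replacement.
    singlets = [list(Y[i]) for i in rng.sample(range(n), n_singlets)]
    doublets = []
    for _ in range(n - n_singlets):
        a, b = rng.sample(range(n), 2)  # two distinct real cells
        doublets.append([(u + v) / 2 for u, v in zip(Y[a], Y[b])])  # assumed mixing rule
    data = singlets + doublets
    is_doublet = [0] * n_singlets + [1] * (n - n_singlets)
    return data, is_doublet


Y = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]
data, is_doublet = simulate_half_doublets(Y)
```

The returned indicator list gives a known ground truth for each simulated row, which is what makes such data useful for checking a singlet/doublet classifier.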
- starling.utility.validate_starling_arguments(adata, dist_option, singlet_prop, model_cell_size, cell_size_col_name, model_zplane_overlap, model_regularizer, learning_rate)[source]
- Parameters:
adata (AnnData)
dist_option (str)
singlet_prop (float)
model_cell_size (bool)
cell_size_col_name (str)
model_zplane_overlap (bool)
model_regularizer (float)
learning_rate (float)
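This page lists only the argument types, not the rules that are enforced. As a rough illustration of the kind of checks such a validator might perform on the scalar arguments, consider the sketch below. Every rule in it is an assumption, apart from the two dist_option values, which come from the description under predict(); the helper name is hypothetical.

```python
def check_scalar_arguments(dist_option, singlet_prop, model_cell_size,
                           model_zplane_overlap, model_regularizer, learning_rate):
    """Hypothetical checks; returns a list of problems (empty when all pass)."""
    problems = []
    if dist_option not in ("T", "N"):
        problems.append("dist_option must be 'T' or 'N'")
    if not 0.0 <= singlet_prop <= 1.0:
        problems.append("singlet_prop is a proportion and must lie in [0, 1]")
    for name, flag in (("model_cell_size", model_cell_size),
                       ("model_zplane_overlap", model_zplane_overlap)):
        if not isinstance(flag, bool):
            problems.append(f"{name} must be a bool")
    if learning_rate <= 0:  # assumption: a step size should be positive
        problems.append("learning_rate must be positive")
    # model_regularizer has no documented constraint, so it is left unchecked.
    return problems
```

Collecting problems in a list instead of raising sidesteps the question of which exception type the library itself uses.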