Spatial transcriptomic tasks

Spatial domain

class dance.modules.spatial.spatial_domain.Louvain(resolution=1)

Louvain class.

Parameters:

resolution (float) – Resolution parameter.

fit(adj, partition=None, weight='weight', randomize=None, random_state=None)

Fit function for model training.

Parameters:
  • adj – Adjacency matrix.

  • partition (dict) – A dictionary whose keys are graph nodes and whose values are the communities the nodes belong to.

  • weight (str) – The key in the graph to use as edge weight. Defaults to “weight”.

  • randomize (boolean) – If set, randomize the node evaluation order and the community evaluation order to get different partitions at each call.

  • random_state (int, RandomState instance or None) – If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by numpy.random.
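To make the partition dictionary and the resolution parameter concrete, here is a minimal pure-Python sketch that scores a node-to-community dictionary with resolution-scaled modularity, the kind of quantity Louvain-style methods try to increase. It is not the DANCE implementation. The nested-dictionary graph layout and the modularity helper are assumptions made for illustration, and the sketch uses the common convention in which resolution scales the expected-edge term; the library may use a different convention.

```python
def modularity(graph, partition, resolution=1.0):
    """Resolution-scaled modularity of a partition on an undirected weighted graph.

    graph: dict mapping node -> {neighbor: edge weight}, stored symmetrically.
    partition: dict mapping node -> community id.
    """
    # Every edge appears twice in the symmetric layout, so this sum equals 2m.
    two_m = sum(w for nbrs in graph.values() for w in nbrs.values())
    internal = {}  # weight of edge endpoints that stay inside each community
    degree = {}    # total weighted degree of each community
    for u, nbrs in graph.items():
        cu = partition[u]
        for v, w in nbrs.items():
            degree[cu] = degree.get(cu, 0.0) + w
            if partition[v] == cu:
                internal[cu] = internal.get(cu, 0.0) + w
    return sum(
        internal.get(c, 0.0) / two_m - resolution * (degree[c] / two_m) ** 2
        for c in degree
    )


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
graph = {n: {} for n in range(6)}
for u, v in edges:
    graph[u][v] = 1.0
    graph[v][u] = 1.0

split = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}   # one community per triangle
merged = {n: 0 for n in range(6)}              # everything in one community
q_split = modularity(graph, split)
q_merged = modularity(graph, merged)
```

Under this convention, a larger resolution penalizes large communities more strongly, so splitting the two triangles scores higher than merging them.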

predict(x=None)

Prediction function.

Parameters:

x – Not used. For compatibility with dance.modules.base.BaseMethod.fit_score(), which calls fit() with x.

class dance.modules.spatial.spatial_domain.SpaGCN(l=None, device='cpu')

SpaGCN class.

Parameters:

l (float) – The parameter to control percentage p.

fit(x, y=None, *, num_pcs=50, lr=0.005, epochs=2000, weight_decay=0, opt='admin', init_spa=True, init='louvain', n_neighbors=10, n_clusters=None, res=0.4, tol=0.001)

Fit function for model training.

Parameters:
  • embed – Input data.

  • adj – Adjacency matrix.

  • num_pcs (int) – The number of components used in PCA.

  • lr (float) – Learning rate.

  • epochs (int) – Maximum number of epochs.

  • weight_decay (float) – Weight decay.

  • opt (str) – Optimizer.

  • init_spa (bool) – Initialize spatial.

  • init (str) – “louvain” or “kmeans”.

  • n_neighbors (int) – The number of neighbors used by Louvain.

  • n_clusters (int) – The number of clusters used by kmeans.

  • res (float) – The resolution parameter used by Louvain.

  • tol (float) – Tolerance value for searching l.

predict(x)

Prediction function.

Returns:

The predicted labels and the predicted probabilities.

Return type:

Tuple[np.ndarray, np.ndarray]
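As an illustration of the returned pair, the sketch below derives hard labels from a matrix of per-cluster probabilities by taking the most probable cluster in each row. That the labels are obtained this way is an assumption, not something this page states, and labels_from_proba is a hypothetical helper written in pure Python rather than the DANCE code.

```python
def labels_from_proba(proba):
    """Return (labels, proba), where each label is the index of the row maximum."""
    labels = [max(range(len(row)), key=row.__getitem__) for row in proba]
    return labels, proba


# Three spots, three candidate domains; each row sums to 1.
proba = [
    [0.70, 0.20, 0.10],
    [0.10, 0.30, 0.60],
    [0.25, 0.50, 0.25],
]
labels, probs = labels_from_proba(proba)
```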

predict_proba(x)

Prediction function.

Returns:

The predicted labels and the predicted probabilities.

Return type:

Tuple[np.ndarray, np.ndarray]

search_l(p, adj, start=0.01, end=1000, tol=0.01, max_run=100)

Search best l.

Parameters:
  • p (float) – Percentage.

  • adj – Adjacency matrix.

  • start (float) – Starting value for searching l.

  • end (float) – Ending value for searching l.

  • tol (float) – Tolerance value for searching l.

  • max_run (int) – Maximum number of runs.

Returns:

l – Best l, the parameter to control percentage p.

Return type:

float
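A bracketing search of this kind can be sketched as follows. The sketch assumes the formulation of the original SpaGCN method, in which p is the average contribution of neighboring spots under a Gaussian kernel of width l, and it treats adj as a matrix of pairwise distances. Both calculate_p and search_l_sketch are hypothetical stand-alone helpers in pure Python, not the DANCE code.

```python
import math


def calculate_p(dist, l):
    """Average Gaussian-kernel weight each spot receives from all other spots."""
    n = len(dist)
    total = sum(math.exp(-(d ** 2) / (2 * l ** 2)) for row in dist for d in row)
    # Each spot contributes exp(0) = 1 to its own row, so subtract it.
    return total / n - 1.0


def search_l_sketch(p, dist, start=0.01, end=1000.0, tol=0.01, max_run=100):
    """Bisection on l; relies on calculate_p increasing monotonically with l."""
    lo, hi = start, end
    if calculate_p(dist, lo) > p + tol or calculate_p(dist, hi) < p - tol:
        return None  # target p is not bracketed by [start, end]
    for _ in range(max_run):
        mid = (lo + hi) / 2
        p_mid = calculate_p(dist, mid)
        if abs(p_mid - p) <= tol:
            return mid
        if p_mid < p:
            lo = mid
        else:
            hi = mid
    return None  # no l found within max_run iterations


# Four spots on a line at coordinates 0, 1, 2, 3.
coords = [0.0, 1.0, 2.0, 3.0]
dist = [[abs(a - b) for b in coords] for a in coords]
best_l = search_l_sketch(0.5, dist)
```

Returning None when the target is not bracketed or the run budget is exhausted is a design choice of this sketch; the documented method only states that it returns a float.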

search_set_res(x, l, target_num, start=0.4, step=0.1, tol=0.005, lr=0.05, epochs=10, max_run=10)

Search for optimal resolution parameter.
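The parameters suggest a step search that moves the resolution toward a target number of clusters. A minimal sketch of such a search is shown below, under the assumption that the step is halved whenever a trial overshoots the target. The count_clusters callable is a hypothetical stand-in for "train briefly and count the resulting clusters", and toy_count is an invented example, so none of this should be read as the DANCE implementation.

```python
def search_res_sketch(count_clusters, target_num, start=0.4, step=0.1, max_run=10):
    """Move res toward target_num clusters, halving the step on overshoot."""
    res = start
    num = count_clusters(res)
    for _ in range(max_run):
        if num == target_num:
            break
        sign = 1 if num < target_num else -1
        trial = res + sign * step
        trial_num = count_clusters(trial)
        trial_sign = 1 if trial_num < target_num else -1
        if trial_num == target_num or trial_sign == sign:
            # Still on the same side of the target (or exactly on it): accept.
            res, num = trial, trial_num
        else:
            # Overshot the target: stay put and retry with a smaller step.
            step /= 2
    return res


def toy_count(res):
    """Invented monotone mapping from resolution to number of clusters."""
    if res < 0.45:
        return 5
    if res < 0.65:
        return 6
    return 8


found = search_res_sketch(toy_count, target_num=6, start=0.4, step=0.3)
```

With these toy values the first trial (0.7) overshoots to 8 clusters, the step is halved to 0.15, and the second trial (0.55) hits the target.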

set_l(l)

Set l.

Parameters:

l (float) – The parameter to control percentage p.

class dance.modules.spatial.spatial_domain.StLouvain(resolution=1)

StLouvain class.

Parameters:

resolution (float) – Resolution parameter.

fit(adj, partition=None, weight='weight', randomize=None, random_state=None)

Fit function for model training.

Parameters:
  • adj – Adjacency matrix.

  • partition (dict) – A dictionary whose keys are graph nodes and whose values are the communities the nodes belong to.

  • weight (str) – The key in the graph to use as edge weight. Defaults to “weight”.

  • resolution (float) – Resolution.

  • randomize (boolean) – If set, randomize the node evaluation order and the community evaluation order to get different partitions at each call.

  • random_state (int, RandomState instance or None) – If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by numpy.random.

predict(x=None)

Prediction function.

class dance.modules.spatial.spatial_domain.Stagate(hidden_dims, device='auto', pretrain_path=None)

Stagate class.

Parameters:
  • hidden_dims – Hidden dimensions.

  • device (str) – Computation device.

  • pretrain_path (Optional[str]) – Save the cell representations from the trained STAGATE model to the specified path. Do not save if unspecified.

fit(inputs, epochs=100, lr=0.001, gradient_clipping=5, weight_decay=0.0001, num_cluster=7, gmm_reg_covar=0.00015, gmm_n_init=10, gmm_max_iter=300, gmm_tol=0.0002, random_state=None)

Fit function for training.

Parameters:
  • inputs (Tuple[ndarray, ndarray]) – A tuple containing (1) the input features and (2) the edge index array (coo representation) of the adjacency matrix.

  • epochs (int) – Number of epochs.

  • lr (float) – Learning rate.

  • gradient_clipping (float) – Gradient clipping.

  • weight_decay (float) – Weight decay.

  • num_cluster (int) – Number of clusters.

  • gmm_reg_covar (float) –

  • gmm_n_init (int) –

  • gmm_max_iter (int) –

  • gmm_tol (float) –

  • random_state (int | None) –
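The second element of inputs is described as an edge index array in COO form. As a rough illustration, the sketch below converts a small dense adjacency matrix into a pair of row and column index lists. The 2 x E layout (first list holds source nodes, second holds target nodes) and the to_edge_index helper are assumptions made for this example; the exact shape and dtype that fit() expects are not stated here.

```python
def to_edge_index(adj):
    """Collect the (row, col) positions of nonzero entries of a dense matrix."""
    rows, cols = [], []
    for i, row in enumerate(adj):
        for j, weight in enumerate(row):
            if weight != 0:
                rows.append(i)
                cols.append(j)
    return [rows, cols]


# A three-spot chain 0 - 1 - 2, stored symmetrically.
adj = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
edge_index = to_edge_index(adj)
```

Each undirected edge appears twice in the result, once per direction, because the dense matrix is symmetric.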

forward(features, edge_index)

Forward function for training.

Parameters:
  • features – Node features.

  • edge_index – Adjacency matrix.

Returns:

The second and the fourth hidden layers.

Return type:

Tuple[Tensor, Tensor]

predict(x=None)

Prediction function.

Parameters:

x (Optional[Any]) – Not used, for compatibility with BaseClusteringMethod.

Cell type deconvolution

class dance.modules.spatial.cell_type_deconvo.Card(basis, random_state=42)

The CARD cell-type deconvolution model.

Parameters:
  • basis (DataFrame) – The cell-type profile basis.

  • random_state (int | None) –

fit(inputs, y=None, max_iter=100, epsilon=0.0001, sigma=0.1, location_free=False)

Fit function for model training.

Parameters:
  • inputs (Tuple[ndarray, ndarray]) – A tuple containing (1) the input features encoding the scRNA-seq counts to be deconvoluted, and (2) a 2-d array of spatial location of each spot (spots x 2). If the spatial location information is all zero, or the location_free option is set to True, then do not use the spatial location information.

  • y (Optional[Any]) – Not used, for compatibility with the BaseRegressionMethod class.

  • max_iter (int) – Maximum number of iterations for optimization.

  • epsilon (float) – Optimization threshold.

  • sigma (float) – Spatial gaussian kernel scaling factor.

  • location_free (bool) – Do not use spatial location info if set to True.
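The roles of sigma and location_free can be sketched as below: a Gaussian kernel is built from pairwise spot distances, and it is skipped when location_free is set or when all coordinates are zero. The kernel form exp(-d^2 / (2 sigma^2)) and the gaussian_kernel helper are assumptions for illustration; the actual CARD code may rescale the coordinates or use a different kernel.

```python
import math


def gaussian_kernel(locations, sigma=0.1, location_free=False):
    """Return an n x n Gaussian kernel, or None when locations are not used."""
    all_zero = all(c == 0 for loc in locations for c in loc)
    if location_free or all_zero:
        return None
    n = len(locations)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Squared Euclidean distance between spot i and spot j.
            d2 = sum((a - b) ** 2 for a, b in zip(locations[i], locations[j]))
            kernel[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    return kernel


spots = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
kernel = gaussian_kernel(spots, sigma=1.0)
no_kernel = gaussian_kernel(spots, sigma=1.0, location_free=True)
```

A smaller sigma makes the kernel decay faster with distance, so only very close spots influence each other.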

predict(x=None)

Prediction function.

Parameters:

x (Optional[Any]) – Not used, for compatibility with the BaseRegressionMethod class.

Returns:

Predictions of cell-type proportions.

Return type:

numpy.ndarray

class dance.modules.spatial.cell_type_deconvo.DSTG(nhid=32, bias=False, dropout=0, device='auto')

DSTG cell-type deconvolution model.

Parameters:
  • nhid (int) – Number of units in the hidden layer (graph convolution).

  • bias (bool) – Include bias term, default False.

  • dropout (float) – Dropout rate, default 0.

  • device (str) – Computation device.

fit(inputs, y, lr=0.005, max_epochs=50, weight_decay=0)

Fit function for model training.

Parameters:
  • inputs (Tuple[Tensor, Tensor, Tensor]) – A tuple containing (1) the DSTG adjacency matrix, (2) the gene expression feature matrix, (3) the training mask indicating the training samples.

  • y (Tensor) – Cell-type proportion labels.

  • lr (float) – Learning rate.

  • max_epochs (int) – Maximum number of epochs to train.

  • weight_decay (float) – Weight decay parameter for optimization (Adam).
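To show what a training mask does, the sketch below averages a per-sample loss over only the rows marked as training samples. The choice of a cross-entropy against soft proportion labels is an assumption, as is the masked_cross_entropy helper; this page only states that a mask selects the training samples.

```python
import math


def masked_cross_entropy(pred, target, mask, eps=1e-12):
    """Mean cross-entropy over the rows where mask is True."""
    losses = [
        -sum(t * math.log(p + eps) for p, t in zip(p_row, t_row))
        for p_row, t_row, keep in zip(pred, target, mask)
        if keep
    ]
    return sum(losses) / len(losses)


pred = [[0.5, 0.5], [0.9, 0.1], [0.2, 0.8]]      # predicted proportions
target = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]    # proportion labels
mask = [True, False, True]                       # second row is held out
loss = masked_cross_entropy(pred, target, mask)
```

Rows with a False mask entry do not contribute to the loss, so their labels never influence the parameter updates.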

predict(x)

Prediction function.

Parameters:

x (Optional[Any]) – Not used, for compatibility with the BaseRegressionMethod class.

Returns:

Predictions of cell-type proportions.

Return type:

pred

class dance.modules.spatial.cell_type_deconvo.SPOTlight(ref_count, ref_annot, ct_select, rank=2, bias=False, init_bias=None, device='auto')

SPOTlight.

Parameters:
  • ref_count (ndarray) – Reference single cell RNA-seq counts data (cell x gene).

  • ref_annot (ndarray) – Reference cell-type label information.

  • ct_select (List[str]) – Selected cell-types to be considered for deconvolution.

  • rank (int) – Rank of the matrix factorization.

  • bias – Include bias term, default False.

  • init_bias – Initial bias term (background estimate).

fit(x, lr=0.001, max_iter=1000)

Fit function for model training.

Parameters:
  • x (Tensor) – Mixed cell expression to be deconvoluted (cell x gene).

  • lr (float) – Learning rate.

  • max_iter (int) – Maximum iterations allowed for matrix factorization solver.

predict(x=None)

Prediction function.

Parameters:

x (Optional[Any]) – Not used, for compatibility with the BaseRegressionMethod class.

Returns:

Predicted cell-type proportions (cell x cell-type).

Return type:

pred

class dance.modules.spatial.cell_type_deconvo.SpatialDecon(ct_profile, ct_select, bias=False, device='auto')

SpatialDecon.

Parameters:
  • ct_profile – Cell type characteristic profiles (cell-type x gene).

  • ct_select – Selected cell-types to be considered for deconvolution.

  • bias – Include bias term, default False.

fit(x, lr=0.0001, max_iter=500, print_period=100)

Fit function for model training.

Parameters:
  • x (Tensor) – Input expression matrix (cell x gene).

  • lr (float) – Learning rate.

  • max_iter (int) – Maximum number of iterations for optimization.

  • print_period (int) – Number of iterations between printed training results.

predict(x=None)

Return fitted parameters as cell-type proportion predictions.

Parameters:

x (Optional[Any]) – Not used, for compatibility with the BaseRegressionMethod class.

Returns:

Predictions of cell-type proportions (cell x cell-type).

Return type:

proportion_preds
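One way fitted non-negative weights can be read as proportions is to normalize each row so that it sums to one, as sketched below. Whether SpatialDecon normalizes its fitted parameters this way is an assumption; to_proportions is a hypothetical helper in pure Python used only to illustrate the (cell x cell-type) output layout.

```python
def to_proportions(weights):
    """Clip negatives to zero and rescale each row to sum to 1."""
    proportions = []
    for row in weights:
        clipped = [max(w, 0.0) for w in row]
        total = sum(clipped)
        if total > 0:
            proportions.append([w / total for w in clipped])
        else:
            # No signal for this cell: report zeros instead of dividing by zero.
            proportions.append([0.0] * len(row))
    return proportions


fitted = [
    [2.0, 1.0, 1.0],   # cell 0: weights for three cell types
    [0.0, 0.0, 0.0],   # cell 1: nothing fitted
]
preds = to_proportions(fitted)
```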