CellScope.cs.Manifold_Fitting_1
CellScope.cs. Manifold_Fitting_1 (fea : np.ndarray | csr_matrix, num_pca : int = 100, num_Selected_Gene : int = 500, knn : int = 20, num_center : int = 0, random_seed : int = 83)
Manifold_Fitting_1 fits the data onto a low-dimensional manifold by identifying significant genes.
This process involves dimensionality reduction through PCA [AW10], selecting manifold seeds [RL14], identifying high-confidence cliques, and selecting key genes that define the signal space based on these cliques.
Parameters
fea (
np.ndarray|csr_matrix):The feature matrix of shape (n_cell, n_gene), where rows correspond to cells and columns to genes. Supports both dense (NumPy array) and sparse (CSR matrix) formats.
num_pca (
int, optional, default=100):The number of principal components to retain after performing PCA on fea. This parameter determines the dimensionality of the PCA-transformed data.
num_Selected_Gene (
int, optional, default=500):The number of selected genes representing the signal space, based on the lowest p-values.
knn (
int, optional, default=20):The number of nearest neighbors used to compute the local density (rho) and distance to higher density neighbors (delta).
num_center (
int, optional, default=0):The number of cluster centers to select. If set to 0, the cluster centers are automatically determined using the findCenters function based on the product of rho and delta.
random_seed (
int, optional, default=83):The random seed for ensuring reproducibility of PCA results and Truncated SVD.
Return
fea_selected (
ndarray|csr_matrix):A matrix of shape (n_cell, num_Selected_Gene), representing the gene-selected data matrix.
significant_features_index (
ndarray):A vector containing the indices of the selected genes, corresponding to the most significant features based on p-values.
id_max_deltarho (
ndarray):An array of cluster center indices, representing the cells identified as cluster centers.