Description
Hi,
I am getting a segmentation fault with `DiffusionMap`. I am not changing any of the defaults.
- When inputting a data matrix as `data`, I get the following error:
*** caught segfault ***
address 0x7f7675a30cb0, cause 'memory not mapped'
Error: Could not call find_knn. Consider specifying knn_params = list(M = <larger number>). Original error:
long vectors not supported yet: ../../src/include/Rinlinedfuns.h:537
- When inputting a SingleCellExperiment object as `data`, I get the following error:
*** caught segfault ***
address 0x7f9dc6fb2cb0, cause 'memory not mapped'
Traceback:
1: knn_asym(data, k, distance)
2: knn.covertree::find_knn(data, k, query = query, distance = distance, sym = sym)
3: (function (data, k, ..., query = NULL, distance = c("euclidean", "cosine", "rankcor", "l2"), method = c("covertree", "hnsw"), sym = TRUE, verbose = FALSE) { p <- utils::modifyList(formals(RcppHNSW::hnsw_knn), list(...)) method <- match.arg(method) distance <- match.arg(distance) if (!is.double(data)) { warning("find_knn does not yet support sparse matrices, converting data to a dense matrix.") data <- as.matrix(data) } if (method == "covertree") { return(knn.covertree::find_knn(data, k, query = query, distance = distance, sym = sym)) } if (distance == "rankcor") { distance <- "cosine" data <- rank_mat(data) if (!is.null(query)) query <- rank_mat(query) } if (is.null(query)) { knn <- hnsw_knn(data, k + 1L, distance, M = p$M, ef_construction = p$ef_construction, ef = p$ef, verbose = verbose) knn$idx <- knn$idx[, -1, drop = FALSE] knn$dist <- knn$dist[, -1, drop = FALSE] } else { index <- hnsw_build(data, distance, M = p$M, ef = p$ef_construction, verbose = verbose) knn <- hnsw_search(query, index, k, ef = p$ef, verbose = verbose) } names(knn)[[1L]] <- "index" knn$dist_mat <- sparseMatrix(rep(seq_len(nrow(knn$index)), k), as.vector(knn$index), x = as.vector(knn$dist), dims = c(nrow(if (is.null(query)) data else query), nrow(data))) if (is.null(query)) { if (sym) knn$dist_mat <- symmetricise(knn$dist_mat) nms <- rownames(data) } else { nms <- rownames(query) } rownames(knn$dist_mat) <- rownames(knn$index) <- rownames(knn$dist) <- nms colnames(knn$dist_mat) <- rownames(data) knn})(new("dgCMatrix", i = c(11854L, 32418L, 46422L, 42L, 100L, 173L, 285L, 293L, 419L, 504L, 629L, 694L, 743L, 777L, 835L, 1122L, 1183L, 1214L, 1259L, 1318L, 1382L, 1389L, 1402L, 1407L, 1655L, 1738L, 1779L, 1997L, 2008L, 2018L, 2023L, 2060L, 2204L, 2241L, 2416L, 2500L, 2558L, 2635L, 2690L, 2701L, 2715L, 2738L, 2742L, 2908L, 2982L, 3118L, 3119L, 3153L, 3311L, 3420L, 3566L, 3605L, 3691L, 3695L, 3715L, 3759L, 4015L, 4108L, 4164L, 4209L, 4260L, 4307L, 4319L, 4373L, 4649L, 4672L, 4702L, 4860L, 5361L, 
5426L, 5593L, 5595L, 5638L, 5643L, 5675L, 5791L, 5934L, 5937L, 5942L, 6441L, 6442L, 6604L, 6714L, 6731L, 6740L, 6800L, 6844L, 6881L, 6906L, 6954L, 6984L, 7027L, 7033L, 7099L, 7177L, 7196L, 7260L, 7343L, 7356L, 7376L, 7569L, 7688L, 7831L, 7952L, 8024L, 8071L, 8097L, 8128L, 8131L, 8179L, 8207L, 8216L, 8444L, 8503L, 8527L, 8698L, 8718L, 8776L, 8820L, 8856L, 8987L, 8994L, 9116L, 9362L, 9363L, 9383L, 9449L, 9631L, 9686L, 9714L, 9750L, 9826L, 9873L, 10063L, 10079L, 10392L, 10400L, 10469L, 10504L, 10579L, 10600L, 10646L, 10866L, 10961L, 11055L, 11501L, 11511L, 11671L, 11780L, 11823L, 12115L, 12134L, 12242L, 12290L, 12353L, 12411L, 12544L, 12571L, 12890L, 12982L, 13013L, 13019L, 13029L, 13193L, 13259L, 13497L, 13548L, 13646L, 13704L, 13820L, 13896L, 13922L, 14016L, 14026L, 14045L, 14135L, 14158L, 14213L, 14221L, 14280L, 14368L, 14376L, 14390L, 14527L, 14598L, 14776L, 14850L, 14910L, 14942L, 15176L, 15356L, 15496L, 15505L, 15507L, 15566L, 15792L, 15824L, 15842L, 15951L, 16007L, 16331L, 16340L, 16345L, 16352L, 16406L, 16416L, 16471L, 16595L, 16656L, 16785L, 16869L, 16880L, 17217L, 17392L, 17461L, 17579L, 17582L, 17897L, 17948L, 18031L, 18195L, 18331L, 18378L, 18456L, 18459L, 18560L, 18590L, 18657L, 18820L, 18851L, 19034L, 19073L, 19181L, 19403L, 19689L, 19800L, 19851L, 19866L, 19918L, 19967L, 20026L, 20101L, 20104L, 20180L, 20225L, 20262L, 20549L, 20666L, 20737L, 20900L, 21116L, 21412L, 21725L, 21749L
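For context on the "long vectors not supported yet" message, here is a rough size check using the approximate dimensions of the data (about 100,000 cells by 20,000 genes). It is only a sketch: the exact dimensions are not given above, and the variable names are illustrative. It compares the number of elements a dense copy of the matrix would hold with R's limit for ordinary (non-long) vectors, and estimates the memory a dense double matrix would need.

```r
# Approximate dimensions from the report (assumed round numbers, not exact).
n_cells <- 1e5
n_genes <- 2e4

# Number of elements in a dense cells x genes matrix (computed as a double).
n_elements <- n_cells * n_genes

# Largest length of an ordinary (non-long) R vector: 2^31 - 1.
int_max <- .Machine$integer.max

# TRUE if a dense copy would need a long vector.
exceeds_limit <- n_elements > int_max

# Memory for a dense double matrix: 8 bytes per element, reported in GiB.
dense_gib <- n_elements * 8 / 1024^3

cat("elements:        ", format(n_elements, big.mark = ","), "\n")
cat("ordinary limit:  ", format(int_max, big.mark = ","), "\n")
cat("exceeds limit:   ", exceeds_limit, "\n")
cat("dense size (GiB):", round(dense_gib, 1), "\n")
```

With these round numbers the element count (2.0e9) sits just under the 2^31 - 1 limit, so a dataset only about 7% larger would cross it, and a dense copy would need roughly 15 GiB either way. That is consistent with the size of the data being the cause, though it does not prove it.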
I assume both errors are due to the large size of my data (~100,000 cells x ~20,000 genes), and that the best approach would be to input PCA scores rather than the normalised expression values. Or is there another way around this?
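A minimal sketch of the PCA-first idea, run on a small simulated matrix rather than the real data. The toy matrix, the choice of 10 components, and the variable names are assumptions made for illustration. Base `prcomp` is used here only because it needs no extra packages; a sparse matrix of the size described would probably need a truncated PCA instead (for example `irlba::prcomp_irlba`). The `DiffusionMap` call is guarded so the example also runs when destiny is not installed.

```r
set.seed(1)

# Toy stand-in for a normalised expression matrix: cells in rows, genes in columns.
expr <- matrix(rnorm(200 * 50), nrow = 200, ncol = 50)

# Keep only the first few principal components (10 is an arbitrary choice).
n_pcs <- 10
pca <- prcomp(expr, center = TRUE, scale. = FALSE, rank. = n_pcs)

# PCA scores: one row per cell, one column per component.
pcs <- pca$x
print(dim(pcs))

# Pass the much smaller score matrix to DiffusionMap (only if destiny is available).
if (requireNamespace("destiny", quietly = TRUE)) {
  dm <- destiny::DiffusionMap(pcs)
}
```

The nearest-neighbour search then works on a cells x components matrix instead of cells x genes, which for the data described would be a few million values rather than about two billion.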
Best wishes,
Lucy