Starting in RandBLAS 1.0, there are qualitative differences in the distributions of major-axis vectors of SparseDist depending on whether the major axis is short or long. This leaves us with a limitation: we can't generate a tall SparseSkOp with column-wise hashing-like structure, or a wide SparseSkOp with row-wise hashing-like structure. That can make things awkward when updating a sketch in a way that requires adding a small number of short-axis vectors. The only way around this is to define a larger operator and work with submatrices of it. For DenseSkOp that's a small ask, since we only generate the submatrix that's needed. For SparseSkOp it's a big ask, since we generate the entire operator even if we only work with a proper submatrix.
In RandBLAS 0.2 we attached memory to SparseSkOps inside sketching functions. In v1.0 we won't do that. In RandBLAS 1.1 we'll want to generate only the major-axis vectors that are needed in a given sketching operation.
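To make the workaround concrete, here is a rough sketch of a "sketch, then extend the sketch" update under the current design. Everything API-specific below is an assumption made for illustration rather than settled API: the SparseDist constructor arguments, the Axis::Short spelling, lskges living directly in the RandBLAS namespace, the RandBLAS.hh umbrella header, and the hypothetical sketch_then_extend / d_old / d_extra names.

```c++
#include <RandBLAS.hh>  // assumed umbrella header
#include <cstdint>

// A minimal sketch (not settled API) of the workaround: allocate one operator at the
// final size d_old + d_extra, generate *all* of it with fill_sparse, then sketch with
// row-submatrices of it via the (ro_s, co_s) offsets that lskges accepts.
template <typename T>
void sketch_then_extend(
    blas::Layout layout, int64_t d_old, int64_t d_extra, int64_t n, int64_t m,
    const T *A, int64_t lda, T *B_old, int64_t ldb_old, T *B_extra, int64_t ldb_extra
) {
    // Assumed constructor: SparseDist(n_rows, n_cols, vec_nnz, major_axis).
    RandBLAS::SparseDist D(d_old + d_extra, m, 8, RandBLAS::Axis::Short);
    RandBLAS::RNGState seed_state(0);
    RandBLAS::SparseSkOp<T> S(D, seed_state);
    RandBLAS::fill_sparse(S);  // pays for the whole operator up front

    using blas::Op;
    // Rows [0, d_old) of S produce the original sketch ...
    RandBLAS::lskges(layout, Op::NoTrans, Op::NoTrans, d_old,   n, m, (T) 1.0, S, 0,     0, A, lda, (T) 0.0, B_old,   ldb_old);
    // ... and rows [d_old, d_old + d_extra) produce the update, even though only
    // d_extra new short-axis vectors' worth of randomness is actually needed.
    RandBLAS::lskges(layout, Op::NoTrans, Op::NoTrans, d_extra, n, m, (T) 1.0, S, d_old, 0, A, lda, (T) 0.0, B_extra, ldb_extra);
}
```

In 1.1, the second call is the one we'd like to serve without ever generating the first d_old rows' worth of major-axis vectors.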
Some notes for whoever works on that
Let's look at LSKGES as an example. The entire implementation of that function is as follows:
```c++
if (S.nnz < 0) {
    SparseSkOp<T,RNG,sint_t> shallowcopy(S.dist, S.seed_state); // shallowcopy.own_memory = true.
    fill_sparse(shallowcopy);
    lskges(layout, opS, opA, d, n, m, alpha, shallowcopy, ro_s, co_s, A, lda, beta, B, ldb);
    return;
}
auto Scoo = coo_view_of_skop(S);
left_spmm(
    layout, opS, opA, d, n, m, alpha, Scoo, ro_s, co_s, A, lda, beta, B, ldb
);
return;
```
The most likely situation is that we'll need to change how fill_sparse works. Here's the implementation of that function:
```c++
using sint_t = typename SparseSkOp::index_t;
using T = typename SparseSkOp::scalar_t;
int64_t full_nnz = S.dist.full_nnz;
if (S.own_memory) {
    if (S.rows == nullptr) S.rows = new sint_t[full_nnz];
    if (S.cols == nullptr) S.cols = new sint_t[full_nnz];
    if (S.vals == nullptr) S.vals = new T[full_nnz];
}
randblas_require(S.rows != nullptr);
randblas_require(S.cols != nullptr);
randblas_require(S.vals != nullptr);
fill_sparse_unpacked_nosub(S.dist, S.nnz, S.vals, S.rows, S.cols, S.seed_state);
return;
```
Let's take a peek at how we handle submatrices in LSKGE3:
```c++
auto [rows_submat_S, cols_submat_S] = dims_before_op(d, m, opS);
constexpr bool maybe_denseskop = !std::is_same_v<std::remove_cv_t<DenseSkOp>, BLASFriendlyOperator<T>>;
if constexpr (maybe_denseskop) {
    if (!S.buff) {
        // DenseSkOp doesn't permit defining a "black box" distribution, so we have to pack the submatrix
        // into an equivalent datastructure ourselves.
        auto submat_S = submatrix_as_blackbox<BLASFriendlyOperator<T>>(S, rows_submat_S, cols_submat_S, ro_s, co_s);
        lskge3(layout, opS, opA, d, n, m, alpha, submat_S, 0, 0, A, lda, beta, B, ldb);
        return;
    } // else, continue with the function as usual.
}
```
and submatrix_as_blackbox currently works like this:
```c++
template <typename BFO, typename DenseSkOp>
BFO submatrix_as_blackbox(const DenseSkOp &S, int64_t n_rows, int64_t n_cols, int64_t ro_s, int64_t co_s) {
    randblas_require(ro_s + n_rows <= S.n_rows);
    randblas_require(co_s + n_cols <= S.n_cols);
    using T = typename DenseSkOp::scalar_t;
    T *buff = new T[n_rows * n_cols];
    auto layout = S.layout;
    fill_dense_unpacked(layout, S.dist, n_rows, n_cols, ro_s, co_s, buff, S.seed_state);
    int64_t dim_major = S.dist.dim_major;
    BFO submatrix{layout, n_rows, n_cols, buff, dim_major, true};
    return submatrix;
}
```
This all suggests that we'd need:
- A fill_sparse_unpacked function.
- A "submatrix_as_coo" function that that returns a new operator instead of writing to the sketching operator in-place, similar to how submatrix_as_blackbox works.