
[RFC]: Expansion of Extended BLAS Routines with WebAssembly Implementation and Enhanced Data Handling #112

Closed
7 tasks done
0PrashantYadav0 opened this issue Mar 22, 2025 · 11 comments
Assignees
Labels: 2025 (2025 GSoC proposal), received feedback (A proposal which has received feedback), rfc (Project proposal)

Comments

@0PrashantYadav0
Member

0PrashantYadav0 commented Mar 22, 2025

Full Name

Prashant Kumar Yadav

University status

Yes

University name

Indian Institute of Information Technology, Lucknow (IIITL)

University program

Bachelor of Technology in Computer Science and Artificial Intelligence

Expected graduation

May 2027

Short biography

I am a second-year B.Tech student at IIITL, majoring in Computer Science and Artificial Intelligence. I actively participate in AXIOS—the technical society at IIITL—as a member of both the FOSS and Web wings.

Recently, I completed an internship at UBIQCURE, a medical startup in India, where I developed an admin panel and a homepage using React, Express, and Next.js while leading a team of four. This experience deepened my understanding of JavaScript and the deployment process.

I have also won a hackathon by building a Next.js app for a Web3 project. My journey in technology began in the 11th grade when I selected computer science as one of my subjects, and since then, my passion for coding and innovation has grown steadily.

Over the past two years at IIITL, I have taken courses in full stack development (with JavaScript), data structures, algorithms in C, object-oriented programming in Java, computer networks, and compiler design. More recently, I have developed a keen interest in DevOps and have learned Go, Docker, and Kubernetes, which I am eager to explore further.

Timezone

Indian Standard Time (UTC +5:30)

Contact details

Platform

Mac

Editor

VSCode, for its handy extensions, excellent Git support, and compatibility with various programming languages.

Programming experience

I began learning JavaScript and Python in high school and later pursued full stack development in college using a project-based learning approach. Some of my notable projects include:

  • Scholarship DApp: Developed for a hackathon, this decentralized application allows users to apply for or sponsor scholarships. I integrated a 3D model and a chatbot on the homepage, utilizing shadcn for UI components.
  • Pokedex: Built with TypeScript and SCSS, this project enables users to search for Pokémon and create custom teams, with Firebase as the backend.
  • Blogging Site: Created using React and Appwrite (via appwrite-react-sdk), this site allows users to write and publish posts.

More projects can be found on my portfolio site: prashantyadav.site.

JavaScript experience

I've used JavaScript across many technologies. For backend development, I've worked with JavaScript frameworks such as Hono, Express, and Elysia, and I have also used Go for backend work. On the frontend, I've learned React, Vite, and Next.js. Since full stack development in JavaScript was part of my college coursework, I'm very familiar with the JavaScript ecosystem.

As a fun project, I built an app using Bun and Hono, which turned out pretty cool. You can check it out here: Expense Tracking App.

What I love most about JavaScript is how simple and easy it is to understand and work with. It also has a vast collection of packages—you can find one for almost anything!
The one downside, in my experience, is that JavaScript numbers are double-precision floats, so decimal arithmetic can be inexact (for example, 0.1 + 0.2 is not exactly 0.3), and the language lacks static type safety, which can lead to errors.

Still, I really enjoy working with JavaScript and love building with it!

Node.js experience

As I started learning JavaScript, I came across Node.js, and since then, it has been a constant part of my journey. Whether building vanilla JavaScript projects or developing backends, Node.js has always been helpful.

I've used Node.js for backend development with Express and have also worked with various Node.js libraries, such as fs (File System), path, and dotenv. Additionally, I’ve used many Node packages in my projects, making development much easier and more efficient.

C/Fortran experience

I completed my Data Structures and Design and Analysis of Algorithms courses in my first and second semesters, respectively, using C.

For my first-year college project, I built a Ticket Machine, a terminal-based tool that lets users select tickets and choose optional add-ons, and then generates a final receipt.

Interest in stdlib

While looking for a JavaScript library with strong mathematical support, I came across stdlib. I needed a math tool for a project, and as I explored more, I found everything I was looking for. It also offers many additional features that are really useful.

My favorite features include:

  • Mathematical tools
  • Matrix support and functions
  • Linear algebra package
  • Various statistical distributions, like binomial distribution
  • Complex number support

As I explored it further, I really liked the idea behind it and wanted to contribute. I found a few good first issues, opened some PRs, and started getting involved.

Version control

Yes

Contributions to stdlib

I started contributing by adding a C implementation for stats/base/dists. After opening PRs for several statistical distributions, I submitted a PR to add a C ndarray interface and refactor the implementation for the stats/base package. Later, I also opened a PR to add a C ndarray interface and refactor the implementation for the blas/ext/base/* package.
After that, I submitted a PR for adding the base C implementation for LAPACK and another PR for adding math/base/special/log1pf, along with a few other math/base/special packages. Later, I opened some PRs to add WebAssembly implementation for blas/ext/base/* and a PR for WebAssembly implementations in stats/strided.

Here are all my merged PRs:
Merged PRs

Here are all my open PRs:
Open PRs

stdlib showcase

JavaScript Neural Network Classifier with stdlib

I built this lightweight neural network classifier in pure JavaScript that works with the Iris dataset. Instead of using big ML frameworks, I implemented everything from scratch using stdlib's math functions.

The project features a simple feedforward network with one hidden layer that I trained using stochastic gradient descent. I added data normalization to improve accuracy and made sure it runs in both browsers and Node.js.

It was a fun challenge to create an efficient machine learning implementation without any heavyweight dependencies - just JavaScript and stdlib's excellent mathematical utilities!

Project Overview

The goal of this project is to enhance extended BLAS routines in the stdlib library by:

  1. Adding WebAssembly Implementations: Improve performance and efficiency by compiling extended BLAS routines into WebAssembly.
  2. Implementing a C ndarray Interface: Ensure all packages have a C ndarray interface for enhanced performance and compatibility.
  3. Refactoring Existing Packages: Update code to follow best practices and improve maintainability.
  4. Expanding Extended BLAS Routines: Include new routines with support for:
    • NaN Values: Implement functions that ignore NaN values during computations.
    • Masking: Add routines that operate on specific elements based on a given mask.
    • Complex Numbers and Multiple Data Types: Extend support to complex numbers and various data types (e.g., int, float, double).
  5. Enhancing Documentation, Tests, Examples, and Benchmarks: Provide comprehensive documentation and robust test coverage along with example usage and performance benchmarks.
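
As a rough illustration of the NaN-handling and masking behavior described in item 4, the sketch below uses plain JavaScript rather than any stdlib API. The function names `nansum` and `msksum` are hypothetical, and the mask convention (a `0` entry means "include this element") is an assumption made for this example.

```javascript
'use strict';

// Hypothetical helper (not a stdlib API): sums a strided array, skipping NaN values.
function nansum( N, x, strideX, offsetX ) {
    var sum = 0.0;
    var ix = offsetX;
    var i;
    for ( i = 0; i < N; i++ ) {
        // NaN is the only value that is not equal to itself:
        if ( x[ ix ] === x[ ix ] ) {
            sum += x[ ix ];
        }
        ix += strideX;
    }
    return sum;
}

// Hypothetical helper (not a stdlib API): sums only the elements whose mask entry is 0.
// The "0 means include" convention is an assumption for this sketch.
function msksum( N, x, strideX, offsetX, mask, strideMask, offsetMask ) {
    var sum = 0.0;
    var ix = offsetX;
    var im = offsetMask;
    var i;
    for ( i = 0; i < N; i++ ) {
        if ( mask[ im ] === 0 ) {
            sum += x[ ix ];
        }
        ix += strideX;
        im += strideMask;
    }
    return sum;
}

var a = new Float64Array( [ 1.0, NaN, 3.0, 4.0 ] );
var b = new Float64Array( [ 1.0, 2.0, 3.0, 4.0 ] );
var mask = new Uint8Array( [ 0, 0, 1, 0 ] );

console.log( nansum( 4, a, 1, 0 ) );             // 1 + 3 + 4
console.log( msksum( 4, b, 1, 0, mask, 1, 0 ) ); // 1 + 2 + 4
```

Both helpers take a stride and an offset so that the same loop can walk every element, every other element, or a sub-range of a larger buffer.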

Detailed Implementation Steps

1. Update the Manifest File

In the manifest.json file of the WebAssembly package, update the source dependency to point to the upstream JavaScript package (which includes the C implementation):

"dependencies": [
  "@stdlib/blas/ext/base/<package-name>"
]

2. Generate WebAssembly Binaries

After creating the required scripts (common across most packages), run:

make clean-wasm PKGS_WASM_PATTERN="blas/ext/base/wasm/<package-name>"

This command will generate:

  • main.wasm
  • main.wat

Ensure that the main.wat file includes the external Apache-2.0 license header.

3. Add REPL Files and Type Definitions

  • Create type definitions and REPL files for modules and routines.
  • Include tests for all modules, routines, and package functions in the types directory.

4. Implement JavaScript APIs

Add JavaScript files in the lib directory for modules and routines. These files will serve as the main API for external use.

5. Create Example Files

Develop example files for modules and routines that import functions from the lib directory. Test these examples using:

make examples EXAMPLES_FILTER=".*/blas/ext/base/wasm/<package-name>/.*"

6. Create Benchmark Files

Develop benchmarks for each module and routine. To run all benchmarks, use:

make benchmark BENCHMARKS_FILTER=".*/blas/ext/base/wasm/<package-name>/.*"

7. Write Test Files

Develop tests that cover various strides, offsets, and behaviors, including external tests for both routines and module functions. Run tests with:

make test TESTS_FILTER=".*/blas/ext/base/wasm/<package-name>/.*"

8. Create a README File

Provide clear descriptions and explanations for each method in the README file. Document the different function types, such as:

  • *.main() and *.ndarray()
  • Module functions (mod.main(), mod.ndarray())
  • Routine functions
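
To make the distinction between the two call styles concrete, here is a minimal plain-JavaScript sketch. The names `sum` and `sumNdarray` are hypothetical stand-ins, not the actual package exports, and the rule used to derive the starting index for negative strides is an assumption modeled on common BLAS conventions.

```javascript
'use strict';

// Hypothetical "ndarray"-style kernel: the caller supplies both a stride and a starting offset.
function sumNdarray( N, x, strideX, offsetX ) {
    var total = 0.0;
    var ix = offsetX;
    var i;
    for ( i = 0; i < N; i++ ) {
        total += x[ ix ];
        ix += strideX;
    }
    return total;
}

// Hypothetical "main"-style API: only a stride is supplied, and the starting index is derived from it.
// Assumption: for a negative stride, iteration starts at the far end of the accessed range.
function sum( N, x, strideX ) {
    var offsetX = ( strideX < 0 ) ? ( 1 - N ) * strideX : 0;
    return sumNdarray( N, x, strideX, offsetX );
}

var x = new Float64Array( [ 1.0, 2.0, 3.0, 4.0, 5.0, 6.0 ] );

console.log( sum( 3, x, 2 ) );           // visits x[0], x[2], x[4]
console.log( sum( 3, x, -2 ) );          // visits x[4], x[2], x[0]
console.log( sumNdarray( 3, x, 2, 1 ) ); // visits x[1], x[3], x[5]
```

The offset parameter is what lets the second style address an arbitrary sub-view of a buffer without creating a new typed array view.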

Expected Outcome

  • Improved Performance: Users will benefit from faster numerical computations via WebAssembly-based APIs.
  • Expanded Functionality: New routines supporting NaN values, masking, complex numbers, and various data types will increase the versatility of extended BLAS routines.
  • Enhanced Compatibility: The inclusion of a C ndarray interface will ensure smoother integration with other libraries.
  • Better Documentation and Testing: Comprehensive documentation, examples, and benchmarks will help users quickly adopt and integrate these new features.

Project Timeline

The project will be executed in three phases over a 12-week schedule:

Phase 1: Single Precision Extended BLAS Routines

  • Community Bonding (3 weeks): Set up the development environment, review existing PRs for the C ndarray interface, and plan for extended BLAS routines that lack this interface.
  • Weeks 1-3: Develop WebAssembly implementations for single-precision routines. Prioritize routines based on dependencies (e.g., implement blas/ext/base/sapxsumkbn before blas/ext/base/sapxsum).
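
The dependency ordering mentioned above can be pictured with a small plain-JavaScript sketch in which a general routine simply delegates to a compensated-summation routine. The functions `apxsumkbn` and `apxsum` below are hypothetical, double-precision illustrations and are not the stdlib sources; the compensation scheme shown is the Neumaier variant of Kahan-Babuška summation, chosen here as an assumption.

```javascript
'use strict';

// Hypothetical sketch: adds a constant to each strided element and sums the results
// using Kahan-Babuska (Neumaier) compensated summation.
function apxsumkbn( N, alpha, x, strideX, offsetX ) {
    var sum = 0.0;
    var c = 0.0; // running compensation for lost low-order bits
    var ix = offsetX;
    var v;
    var t;
    var i;
    for ( i = 0; i < N; i++ ) {
        v = alpha + x[ ix ];
        t = sum + v;
        if ( Math.abs( sum ) >= Math.abs( v ) ) {
            c += ( sum - t ) + v;
        } else {
            c += ( v - t ) + sum;
        }
        sum = t;
        ix += strideX;
    }
    return sum + c;
}

// Hypothetical sketch: the general routine delegates to the compensated one,
// which is why the compensated routine would need to exist first.
function apxsum( N, alpha, x, strideX, offsetX ) {
    return apxsumkbn( N, alpha, x, strideX, offsetX );
}

var x = new Float64Array( [ 1.0, 1.0e100, 1.0, -1.0e100 ] );

// A naive left-to-right sum of these values loses both 1.0 terms; the compensated sum keeps them.
console.log( apxsum( 4, 0.0, x, 1, 0 ) );
```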

Phase 2: Double Precision Extended BLAS Routines

  • Weeks 4-6: Extend WebAssembly implementations to double-precision routines, following a similar dependency-based prioritization.
  • Week 6 (Midterm): Present progress for evaluation by mentors.
  • Weeks 6-7: Continuation of WebAssembly implementations of double-precision routines.

Phase 3: Expansion of Extended BLAS Routines and Refactoring

  • Weeks 8-11: Implement new routines with support for NaN values and masking. Ensure all packages have a C ndarray interface.
  • Weeks 11-12: Implement new extended BLAS routines supporting complex numbers and multiple data types.
  • Final week: Present progress for final evaluation by mentors.

Why This Project?

The project is exciting because it combines low-level programming (C and WebAssembly) with high-level JavaScript to dramatically improve the performance and efficiency of numerical computing. BLAS routines are vital for scientific and engineering applications, and accelerating them with WebAssembly will benefit a broad developer community. My background in both open-source contributions and numerical computing uniquely positions me to contribute effectively to this project.

Qualifications

  • Strong Background in C and JavaScript: My coursework in data structures and algorithm design using C, combined with extensive JavaScript development, provides a solid foundation for this project.
  • Experience with BLAS and Numerical Computing: Contributions to stdlib, including refactoring BLAS packages and implementing LAPACK routines, have given me practical experience in numerical computing.
  • WebAssembly and Low-Level Optimizations: I have previously worked on integrating C ndarray interfaces and WebAssembly implementations in stdlib, which aligns directly with the project’s goals.
    View my WebAssembly PRs for blas/ext/base
  • Proven Open-Source Contributions: Leading teams, contributing to multiple projects, and collaborating effectively with maintainers highlight my commitment and ability to drive this project to success.

Prior Art

This project builds on similar initiatives in numerical computing:

  • Optimized BLAS Implementations: Libraries like OpenBLAS and Intel MKL provide highly optimized routines in C and Fortran.
  • WebAssembly for Numerical Computing: Projects such as TensorFlow.js and discussions around running NumPy in the browser illustrate the potential of WebAssembly for accelerating computations.
  • Existing stdlib WebAssembly Packages: Several BLAS routines in stdlib already have WebAssembly implementations, and this project will extend that work.
  • Technical Blogs and Papers: Articles such as the Testdriven.io blog on Python WebAssembly offer insight into using WebAssembly for performance gains.

Commitment

I plan to dedicate approximately 30 hours per week during GSoC, ensuring steady progress and timely completion:

  • Pre-GSoC (Community Bonding): ~10 hrs/week – Engage with the community, set up the development environment, and finalize milestones.
  • During GSoC (Coding Period): ~30 hrs/week – Implement features, attend mentor meetings, and refine the project.
  • Post-GSoC: ~5 hrs/week – Address feedback, update documentation, and support the community.

I have no major scheduling conflicts and will maintain regular communication with my mentors and the community.

Related Issues

No response

Checklist

  • I have read and understood the Code of Conduct.
  • I have read and understood the application materials found in this repository.
  • I understand that plagiarism will not be tolerated, and I have authored this application in my own words.
  • I have read and understood the patch requirement which is necessary for my application to be considered for acceptance.
  • I have read and understood the stdlib showcase requirement which is necessary for my application to be considered for acceptance.
  • The issue name begins with [RFC]: and succinctly describes your proposal.
  • I understand that, in order to apply to be a GSoC contributor, I must submit my final application to https://summerofcode.withgoogle.com/ before the submission deadline.
@0PrashantYadav0 0PrashantYadav0 added 2025 2025 GSoC proposal. rfc Project proposal. labels Mar 22, 2025
@0PrashantYadav0 0PrashantYadav0 changed the title [RFC]: add WebAssembly implementations for extended BLAS routines [RFC]: add WebAssembly implementations and expension for extended BLAS routines Mar 29, 2025
@0PrashantYadav0 0PrashantYadav0 changed the title [RFC]: add WebAssembly implementations and expension for extended BLAS routines [RFC]: add WebAssembly implementations and expansion for extended BLAS routines Mar 29, 2025
@0PrashantYadav0 0PrashantYadav0 changed the title [RFC]: add WebAssembly implementations and expansion for extended BLAS routines [RFC]: Expansion of Extended BLAS Routines with WebAssembly Implementation and Enhanced Data Handling Mar 30, 2025
@0PrashantYadav0
Member Author

@kgryte @Planeshifter Please review my proposal and provide your feedback. Thank you.

@0PrashantYadav0
Member Author

0PrashantYadav0 commented Apr 1, 2025

Will you please also comment on my title for the proposal?

@0PrashantYadav0
Member Author

@kgryte Do I have to change anything?

@kgryte kgryte self-assigned this Apr 4, 2025
@kgryte
Member

kgryte commented Apr 4, 2025

@0PrashantYadav0 Thanks for opening this RFC. A few comments/questions:

  1. Do you have a sense as to how many routines are missing C ndarray interfaces? I'd like to understand a bit more the scope of the task both for updating existing packages and for adding dedicated WebAssembly packages.
  2. You opted to prioritize single-precision over double-precision routines. Is there a reason for this?
  3. What extended BLAS routines are most amenable to complex number support?
  4. Have you given any thought to higher-level ndarray wrappers supporting stacking, as we have in @stdlib/blas (e.g., @stdlib/blas/dswap)?

@kgryte kgryte added the received feedback A proposal which has received feedback. label Apr 4, 2025
@0PrashantYadav0
Member Author

0PrashantYadav0 commented Apr 4, 2025

  1. Packages like dsort2ins, dsort2sh, dsort2hp, dsortsh, ssort2sh, ssort2hp, ssortsh, ssorthp, and ssortins are missing C ndarray interfaces. I have already opened PRs for most of them.

We already have WebAssembly implementations for dnansumpw, dasumpw, dapxsumpw, dapxsumors, dapxsumkbn, and dapxsum. For the rest, we have a tracking issue that lists all packages needing WebAssembly implementations:
👉 Add WebAssembly implementations for extended BLAS routines (tracking issue)

What we need to check (for the remaining ~86 routines):

  • Whether a C implementation exists
  • Whether a WebAssembly implementation exists

  2. I prioritized single-precision routines because we still need to add several nan and msk packages for them. For example, in single precision, we need packages like snancusumkbn2, snancusumpw, snancusumkbn, snancusumors, snancusum, and many more. Currently, the number of single-precision routines is relatively small, which makes adding WebAssembly implementations easier.

As I explored the project, I noticed that there are no common dependencies between single- and double-precision routines. Beyond that, I chose to focus on single precision first for the following reasons:

  • Performance Optimization
  • Wider Usage in ML/AI Applications
  • Simpler Debugging & Smaller Test Cases
  • Strategic Phased Approach

  3. Routines that operate on pairs of real numbers or inherently involve directional transforms, accumulation, or multiplication are naturally suited for complex number support.

Examples include:

  • csum, ccusum, cnansum, cdcusum: Complex numbers can be linearly accumulated.
  • cfill, crev: Trivial to implement for complex types.
  • csortsh, csort2sh: Sorting based on magnitude or phase.
  • capxsum, ccusapxsum: Common in signal processing; suitable for real/imaginary decomposition.
  • capx, capxsumpw: Complex scalar operations align well here.

Operations like dot products, FFTs, and matrix multiplications with complex support are traditionally more complex. However, extended BLAS’s scalar/vector-style routines are typically simpler to adapt.
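
To sketch why accumulation-style routines adapt easily to complex inputs, the plain-JavaScript example below sums complex values stored as interleaved real and imaginary components. The function name `csumInterleaved` and the interleaved layout are assumptions made for illustration; this is not an existing stdlib routine.

```javascript
'use strict';

// Hypothetical helper: sums N complex numbers stored as interleaved [re, im] pairs.
// The stride and offset are expressed in units of complex elements, not array slots.
function csumInterleaved( N, view, strideX, offsetX ) {
    var re = 0.0;
    var im = 0.0;
    var sx = strideX * 2; // each complex element occupies two slots
    var ix = offsetX * 2;
    var i;
    for ( i = 0; i < N; i++ ) {
        re += view[ ix ];
        im += view[ ix + 1 ];
        ix += sx;
    }
    return [ re, im ];
}

// Three complex numbers: 1+2i, 3+4i, 5+6i
var view = new Float32Array( [ 1.0, 2.0, 3.0, 4.0, 5.0, 6.0 ] );

console.log( csumInterleaved( 3, view, 1, 0 ) ); // sum of all three values
console.log( csumInterleaved( 2, view, 2, 0 ) ); // sum of the first and third values
```

The real and imaginary parts accumulate independently, so the loop is the real-valued sum loop run on two components at once.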


  4. Absolutely. Supporting higher-level ndarray wrappers (like @stdlib/blas/dswap) for extended BLAS routines is a natural evolution for enhancing usability and composability within the @stdlib ecosystem.

While I haven’t yet included this in my proposal, I’d be happy to add it if you recommend doing so.

Here’s why we should consider higher-level ndarray wrappers:

  • Improved Developer Experience: Easier to use in applications involving batch processing, broadcasting, or multi-dimensional operations. It mirrors NumPy-like APIs, making it easier for developers transitioning from Python.
  • Consistency with Existing @stdlib/blas: Since @stdlib/blas already offers stacked APIs (e.g., dswap, daxpy), extending this design to @stdlib/blas/ext/base ensures uniformity and reduces confusion.
  • Enables Composability: Wrappers could optimize calls to lower-level routines while offering a clean, vectorized interface. This is particularly useful for domains like signal processing, statistics, and ML, where multi-dimensional inputs are common.
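
As a minimal sketch of what such a wrapper could do, the plain-JavaScript example below applies a one-dimensional strided kernel to each vector in a stack stored in a single flat buffer. The names `sumKernel` and `sumStacked` are hypothetical, and the explicit stride arguments stand in for the metadata a real ndarray object would carry.

```javascript
'use strict';

// Hypothetical 1-D kernel with explicit stride and offset.
function sumKernel( N, x, strideX, offsetX ) {
    var total = 0.0;
    var ix = offsetX;
    var i;
    for ( i = 0; i < N; i++ ) {
        total += x[ ix ];
        ix += strideX;
    }
    return total;
}

// Hypothetical wrapper: treats the buffer as M stacked vectors of length N and
// applies the kernel to each one, moving the offset by `strideStack` per vector.
function sumStacked( M, N, x, strideStack, strideX, offsetX ) {
    var out = new Float64Array( M );
    var i;
    for ( i = 0; i < M; i++ ) {
        out[ i ] = sumKernel( N, x, strideX, offsetX + ( i * strideStack ) );
    }
    return out;
}

// A 2x3 matrix stored in row-major order:
var x = new Float64Array( [ 1.0, 2.0, 3.0, 4.0, 5.0, 6.0 ] );

console.log( sumStacked( 2, 3, x, 3, 1, 0 ) ); // one sum per row
console.log( sumStacked( 3, 2, x, 1, 3, 0 ) ); // one sum per column (strides swapped)
```

Swapping the two strides reduces along the other dimension without copying data, which is the kind of flexibility the stride and offset parameters of the lower-level interface are meant to enable.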

Thank you @kgryte for your feedback. I will add these changes to my final proposal as well.

@0PrashantYadav0
Member Author

@kgryte I have added a showcase project as well. Please review it.

@0PrashantYadav0
Member Author

GSOC-2025.pdf

This is my final proposal. Please review.

@0PrashantYadav0
Member Author

Which section should I put this project in, medium or large?

@gunjjoshi
Member

GSOC-2025.pdf

This is my final proposal. Please review.

Hey, @0PrashantYadav0, a few comments:

  • I think you can exclude the "Follow Up Questions" section from the proposal.
  • For merged PRs, you have added a link to "Closed" PRs. You can link it to "Merged" ones.
  • You can remove the “Set up the development environment” part from the Community Bonding Period, as I believe you might have already set it up, given your contributions to stdlib earlier.

@0PrashantYadav0
Member Author

Okay, I will make these changes. Thank you for the feedback.

@0PrashantYadav0
Member Author

GSOC-2025.pdf

I have made a few changes now.

@kgryte kgryte closed this as completed May 9, 2025