
I am a Ph.D. candidate in the FPGA/Parallel Computing Lab at the University of Southern California, advised by Prof. Prasanna. I will be joining the AI System Co-Design team at Meta as a Research Scientist this summer. My recent work focuses on developing automated ML systems, such as an auto-tuning runtime system for GNN training and automatic model parallelization for LLM serving. I was a Research Intern on the Microsoft AI Framework team, where I worked on optimizing distributed inference for LLM serving. Please refer to my personal website for more details. I completed my undergraduate studies at National Taiwan University (NTU), majoring in Electrical Engineering. I was also an Electrical Engineering Intern at Hewlett-Packard (HP) in Taipei.

Outside of research, I'm also interested in video editing and photography. Check out my vlogs and photos.

Pinned repositories

  1. RISC-V-CPU (Verilog)

     A RISC-V 5-stage pipelined CPU that supports vector instructions. Taped out with U18 technology.

  2. A-Single-Path-Delay-32-Point-FFT-Processor (Verilog)

     A 32-point pipelined Fast Fourier Transform processor using a single-path delay architecture, based on the radix-2 DIF (decimation-in-frequency) algorithm. Average SNR = 58.76. A minimal sketch of the radix-2 DIF recursion appears after this list.

  3. Systolic-Array-for-Smith-Waterman (Verilog)

     Implements the Smith-Waterman dynamic programming algorithm for local sequence alignment. Through parallelism, it runs 136× faster than a software implementation of the same algorithm. A reference sketch of the underlying recurrence appears after this list.

  4. GCN-Inference-Acceleration-HLS (C++)

     An end-to-end GCN inference accelerator written in HLS.

  5. microsoft/apex_plus (Python)

     APEX+ is an LLM serving simulator.
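
For item 2, here is a minimal, purely illustrative Python sketch of the radix-2 DIF (decimation-in-frequency) recursion that the FFT processor is based on. The repository itself realizes this as a single-path-delay pipeline in Verilog, not as this software recursion.

```python
import cmath

def fft_dif(x):
    """Radix-2 DIF FFT of a sequence whose length is a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    # DIF butterfly: the sum of the two halves feeds the even output bins,
    # the twiddled difference feeds the odd output bins.
    top = [x[i] + x[i + half] for i in range(half)]
    bot = [(x[i] - x[i + half]) * cmath.exp(-2j * cmath.pi * i / n) for i in range(half)]
    out = [0j] * n
    out[0::2] = fft_dif(top)   # X[2k]
    out[1::2] = fft_dif(bot)   # X[2k+1]
    return out

# Example: a 32-point transform, matching the processor's size.
spectrum = fft_dif([complex(i) for i in range(32)])
```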
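For item 3, the following is a small Python reference of the Smith-Waterman recurrence for local sequence alignment; the scoring values are hypothetical placeholders, and the repository realizes the recurrence as a systolic array in Verilog rather than this sequential double loop.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -1) -> int:
    """Return the best local-alignment score between sequences a and b."""
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 term lets a poor-scoring prefix be discarded (local alignment).
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best

# Example: score two short DNA fragments.
print(smith_waterman("GGTTGACTA", "TGTTACGG"))
```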