Hi there! I'm a curious builder with a passion for making machines a little smarter, and maybe even teaching them to write better code than I can (still working on that part).
This repository is a collection of deep learning projects I've implemented as part of my CS coursework, research, and late-night experiments with coffee and curiosity as my best friends.
Theme: "Backprop through time, one vanishing gradient at a time."
- Implemented basic RNNs and GRUs using PyTorch
- Explored hyperparameter tuning, gradient clipping (see the sketch after this list), and ReLU activations
- Conducted ablation studies to understand how GRU gates affect performance
- Trained on H.G. Wells' The Time Machine, which feels poetic, really
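
For flavor, here's a minimal sketch of the gradient-clipping pattern these experiments relied on. It's not the exact project code: the model, sizes, and `train_step` helper are illustrative placeholders.

```python
# Minimal sketch: training step for a character-level GRU with gradient
# clipping. Model, sizes, and helper names are illustrative placeholders.
import torch
import torch.nn as nn

vocab_size, hidden_size = 28, 256  # hypothetical sizes
model = nn.GRU(vocab_size, hidden_size, batch_first=True)
head = nn.Linear(hidden_size, vocab_size)
params = list(model.parameters()) + list(head.parameters())
optimizer = torch.optim.SGD(params, lr=1.0)
loss_fn = nn.CrossEntropyLoss()

def train_step(x, y, state=None):
    """One step: forward, backward, clip, update."""
    out, state = model(x, state)  # x: (batch, seq, vocab), one-hot
    loss = loss_fn(head(out).reshape(-1, vocab_size), y.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    # Rescale the global gradient norm to at most 1.0 so one
    # exploding-gradient step can't blow up training.
    torch.nn.utils.clip_grad_norm_(params, max_norm=1.0)
    optimizer.step()
    # Detach the hidden state for truncated backprop through time.
    return loss.item(), state.detach()
```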
Theme: "Teaching neural networks where to look."
- Implemented the additive (Bahdanau) attention mechanism from scratch (see the sketch after this list)
- Built attention-based sequence-to-sequence models for machine translation
- Visualized attention weights to understand model focus during translation
- Evaluated translation quality using BLEU scores across configurations
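
As a taste of the from-scratch implementation, here's a minimal sketch of additive attention, scoring query-key pairs with score(q, k) = wᵥᵀ tanh(W_q q + W_k k). The class name, dimensions, and shapes are illustrative assumptions, not the exact project code.

```python
# Minimal sketch of additive (Bahdanau) attention; names and shapes
# are illustrative assumptions.
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    def __init__(self, query_dim, key_dim, num_hiddens):
        super().__init__()
        self.W_q = nn.Linear(query_dim, num_hiddens, bias=False)
        self.W_k = nn.Linear(key_dim, num_hiddens, bias=False)
        self.w_v = nn.Linear(num_hiddens, 1, bias=False)

    def forward(self, queries, keys, values):
        # queries: (batch, nq, query_dim); keys: (batch, nkv, key_dim);
        # values: (batch, nkv, value_dim)
        q = self.W_q(queries).unsqueeze(2)  # (batch, nq, 1, h)
        k = self.W_k(keys).unsqueeze(1)     # (batch, 1, nkv, h)
        scores = self.w_v(torch.tanh(q + k)).squeeze(-1)  # (batch, nq, nkv)
        weights = torch.softmax(scores, dim=-1)
        # Returning the weights too makes them easy to visualize.
        return torch.bmm(weights, values), weights
```

Returning the attention weights alongside the context vectors is what makes the heatmap visualizations in this project a one-liner.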
Theme: "Compressing reality into lower dimensions."
- Built autoencoder architectures with configurable latent dimensions (see the sketch after this list)
- Experimented with different latent space sizes (2, 8, 32, 64)
- Analyzed the trade-off between compression and reconstruction quality
- Visualized how increasing latent dimensions improves image reconstruction
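
Here's a minimal sketch of the configurable-latent-dimension idea, assuming 28x28 grayscale inputs (e.g., MNIST); the `Autoencoder` class and layer sizes are illustrative, not the exact project code.

```python
# Minimal sketch: fully connected autoencoder with a configurable latent
# dimension. Input size (28x28 grayscale) and layer widths are assumptions.
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, latent_dim):
        super().__init__()
        # Encoder compresses a flattened image to `latent_dim` numbers.
        self.encoder = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        # Decoder reconstructs the image from the latent code.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, 28 * 28), nn.Sigmoid(),
            nn.Unflatten(1, (1, 28, 28)),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Sweeping the latent sizes from the experiments:
models = {d: Autoencoder(d) for d in (2, 8, 32, 64)}
```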
- Python, NumPy, PyTorch
- Jupyter Notebook, Matplotlib
- Git, GitHub
- To get my hands dirty implementing everything from the ground up
- To deeply understand how and why deep learning models work
- To develop strong debugging, experimentation, and research skills
- To document it all for future reference
If you're working on cool things in AI, education, or robotics, or just want to chat tech, coffee, or build ideas together, I'd love to connect.
Name: Aashish Dhakal
LinkedIn: linkedin.com/in/aashishdhakal
This repo is an evolving archive of projects I've loved building (and breaking) on my journey through deep learning.