GSoC 2025: Discussion on "Lazy Trajectory Analysis with Dask and a Lazy Timeseries API" #4986
Abdulrahman-PROG started this conversation in GSoC Discussions
Replies: 1 comment
Thanks for submitting your preproposal!
Hi @ljwoods2, @orbeckst, @yuxuanzhuang, and the MDAnalysis team,
I’m Abdulrahman Yasser Elbanna, a Bachelor’s student in Artificial Intelligence at Kafrelsheikh University, Egypt. I’m very interested in contributing to the "Lazy Trajectory Analysis with Dask and a Lazy Timeseries API" project for GSoC 2025. I’ve already contributed to MDAnalysis through PR #4951 (for issue #3743), which is currently under review. Additionally, I contributed to MolecularNodes by updating the van der Waals radii for elements 47–103 in a PR (link), which was successfully merged. I’m excited about improving MDAnalysis’s performance in HPC environments through this project.
I have strong experience in Python, data analysis, and object-oriented programming, having worked on projects like sign language classification using CNNs and Automatic Number Plate Recognition (ANPR) with OpenCV and EasyOCR. These projects honed my skills in handling large datasets and using NumPy-like interfaces, which are directly applicable to MDAnalysis’s trajectory analysis needs. I’ve been researching the project and related issues (#4713, #4598, #4561, #2865). I understand that the goal is to implement a `lazy_timeseries` API using Dask for trajectory readers, provide a sample H5MD implementation, and develop a lazy analysis base with an example algorithm like RMSD. I’m currently learning Dask and exploring the existing `timeseries` API and `H5MDReader` in `package/MDAnalysis/coordinates/H5MD.py`.
I’ve submitted a pre-proposal for this project, where I outlined my plan to develop a Dask-based lazy reader, create a `lazy_timeseries` interface compatible with `H5MDReader`, implement a lazy RMSD analysis, achieve 90% test coverage, and reduce analysis time by 20% for large trajectories in HPC. I’d love to discuss a few key aspects of the project to ensure my final proposal aligns with your expectations:
`lazy_timeseries` API: What specific features or performance benchmarks are you looking for in the `lazy_timeseries` interface? For example, should it prioritize memory efficiency, speed, or compatibility with existing readers?
I’m eager to contribute to MDAnalysis and make this project a success. Any guidance or feedback would be greatly appreciated as I prepare my final proposal, due on March 24, 2025. Thank you for your time!
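As a rough illustration of the idea discussed above (a sketch under stated assumptions, not existing MDAnalysis API): the helpers `lazy_timeseries` and `lazy_rmsd` below are hypothetical names, the random in-memory NumPy array merely stands in for H5MD position data of shape `(n_frames, n_atoms, 3)`, and the RMSD is computed without superposition to keep the example short. Only standard `dask.array` calls are used.

```python
import numpy as np
import dask.array as da

# Stand-in for on-disk position data with shape (n_frames, n_atoms, 3).
# Assumption: in a real reader this would be a sliceable on-disk dataset.
rng = np.random.default_rng(0)
positions = rng.normal(size=(1000, 50, 3))


def lazy_timeseries(array_like, frames_per_chunk=100):
    """Hypothetical helper: wrap frame-major coordinates in a chunked Dask array."""
    n_frames, n_atoms, n_dim = array_like.shape
    # Chunk only along the frame axis so every chunk holds whole frames.
    return da.from_array(array_like, chunks=(frames_per_chunk, n_atoms, n_dim))


def lazy_rmsd(traj, reference):
    """Hypothetical helper: per-frame RMSD to `reference`, no superposition.

    Returns a lazy Dask array; nothing is computed until .compute() is called.
    """
    diff = traj - reference                  # broadcast over frames
    sq_dist = (diff ** 2).sum(axis=2)        # squared displacement per atom
    return da.sqrt(sq_dist.mean(axis=1))     # mean over atoms, then root


traj = lazy_timeseries(positions)
rmsd = lazy_rmsd(traj, positions[0])  # builds a task graph only
values = rmsd.compute()               # evaluates chunk by chunk
print(values.shape)
```

Chunking along the frame axis keeps each task's memory use bounded by `frames_per_chunk`, which is one way to trade off memory efficiency against speed in such an interface.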
Best regards,
Abdulrahman Yasser Elbanna