2025/05/02 Meeting Notes #186
himanshunaidu started this conversation in Meeting Notes
Progress Update
Segmentation Post-Processing
The inconsistencies between the camera frame, segmentation mask, depth map, detected objects, etc. appear to be fixed, at least for iPhone Pro in portrait mode.
Still need to test the other iPhone modes (landscape and upside down). This is contingent on fixing the layout issues themselves: currently, when we switch to other modes, the preview layers end up in an incorrect layout. This needs to be fixed with dynamic SwiftUI layout features.
Also need to test on other device types, such as iPad Pro devices.
It turns out that most of the homography issues were caused by the orientation issues.
Now that the orientation issues have been fixed, the homography transforms appear correct; this has been tested qualitatively.
Other issues related to scaling and normalization of the homography matrix, which depend on how the matrix is used, have also been fixed.
Quantitative testing may be de-prioritized for now, since it can be done together with all the other post-processing techniques, or with a separate testing app that will take some time to develop properly.
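As an illustration of the scaling and normalization point above, here is a minimal pure-Python sketch of converting a homography defined on normalized [0, 1] coordinates into one that acts on pixel coordinates, and of rescaling a matrix so its bottom-right entry is 1. The function names and the assumption that the matrix is expressed in normalized coordinates are hypothetical; the app's actual conventions may differ.

```python
def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def scale_matrix(sx, sy):
    """Homogeneous 2D scaling matrix."""
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]]


def normalized_to_pixel_homography(h_norm, src_size, dst_size):
    """Assumption: h_norm maps normalized source coords to normalized
    destination coords. Returns H_px = S_dst * h_norm * S_src^-1."""
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    to_norm = scale_matrix(1.0 / src_w, 1.0 / src_h)  # source pixels -> normalized
    to_pixels = scale_matrix(dst_w, dst_h)            # normalized -> destination pixels
    return matmul3(to_pixels, matmul3(h_norm, to_norm))


def normalize_scale(h):
    """Divide by h[2][2] so the matrix has the conventional bottom-right 1."""
    s = h[2][2]
    if abs(s) < 1e-12:
        raise ValueError("h[2][2] is too close to zero to normalize")
    return [[v / s for v in row] for row in h]


def apply_homography(h, x, y):
    """Map a point through h, including the perspective divide."""
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (xh / w, yh / w)
```

Applying a normalized-coordinate matrix directly to pixel coordinates (or the reverse) is a common source of transforms that look roughly right but are offset or mis-scaled, so keeping the conversion in one helper makes the intended coordinate space explicit.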
Depth Calculation
Thanks to the orientation and homography transformation fixes, depth calculation now uses the correct centroids of the segmentation masks.
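The idea of reading depth at a mask centroid can be sketched as follows. This is a simplified pure-Python illustration, not the app's implementation: it assumes the segmentation mask and depth map are already aligned and share the same resolution, and the function names are hypothetical.

```python
def mask_centroid(mask, class_id):
    """Return the (row, col) centroid of all pixels labelled class_id,
    or None if the class is absent from the mask."""
    sum_r = sum_c = count = 0
    for r, row in enumerate(mask):
        for c, label in enumerate(row):
            if label == class_id:
                sum_r += r
                sum_c += c
                count += 1
    if count == 0:
        return None
    return (sum_r / count, sum_c / count)


def depth_at_centroid(mask, depth, class_id):
    """Look up the depth value at the pixel nearest the class centroid.
    Assumption: mask and depth are aligned grids of identical size."""
    centroid = mask_centroid(mask, class_id)
    if centroid is None:
        return None
    r = int(centroid[0] + 0.5)  # round half up to the nearest pixel
    c = int(centroid[1] + 0.5)
    return depth[r][c]
```

Note that for non-convex regions the centroid can fall outside the mask, so a production version might instead sample the nearest in-mask pixel or take a median depth over the region.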
Centroid Tracker Reliability
Thanks to the orientation and homography transformation fixes, the Centroid Tracker now tracks the detected object centroids more reliably.
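For readers unfamiliar with centroid tracking, the following is a minimal sketch of one common variant: greedy nearest-neighbour matching of new centroids to existing object IDs. It is an assumption-laden illustration rather than the project's tracker; the class name, the max_distance threshold, and the policy of dropping unmatched objects immediately are all hypothetical.

```python
import math


class SimpleCentroidTracker:
    """Greedy nearest-neighbour centroid tracker (illustrative only)."""

    def __init__(self, max_distance=50.0):
        self.max_distance = max_distance  # hypothetical matching threshold
        self.next_id = 0
        self.objects = {}  # object id -> (x, y) centroid

    def update(self, centroids):
        # All (distance, object id, detection index) pairs, closest first.
        pairs = sorted(
            (math.dist(pos, c), obj_id, idx)
            for obj_id, pos in self.objects.items()
            for idx, c in enumerate(centroids)
        )
        assigned = {}
        used_ids, used_idx = set(), set()
        for dist, obj_id, idx in pairs:
            if dist > self.max_distance:
                break  # every remaining pair is even farther apart
            if obj_id in used_ids or idx in used_idx:
                continue
            assigned[obj_id] = centroids[idx]
            used_ids.add(obj_id)
            used_idx.add(idx)
        # Unmatched detections become new objects.
        for idx, c in enumerate(centroids):
            if idx not in used_idx:
                assigned[self.next_id] = c
                self.next_id += 1
        # Unmatched old objects are dropped immediately in this sketch.
        self.objects = assigned
        return dict(assigned)
```

Because matching depends entirely on distances between centroids, any orientation or homography error that shifts centroids between frames directly degrades ID stability, which is consistent with the improvement noted above.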
Computer Vision ML Pipeline
Train a new pedestrian-centric model using the COCO-Stuff and hand-labelled datasets
A first attempt has been made, although performance is still lacking.
TODO: Perform further hyperparameter tuning, and retrain with coco_stuff in a two-phase strategy.
Analyze conversion issues with ESPNetv2
Issue raised with Apple coremltools team.
Issue title: "Conversion of ESPNetv2 semantic segmentation model from PyTorch to CoreML format using coremltools, taking very long" (apple/coremltools#2497)
Fixed a bug in the general torch model evaluation (ESPNetv2 and BiSeNetv2)
A couple of issues related to computing the cumulative evaluation metrics have been fixed.
Get More Specific Performance Metrics
Per-class evaluation metrics have been implemented (e.g. per-class IoU, from which mIoU is computed).
Once the last remaining bug in the evaluation metrics has been fixed, we can obtain correct values for these.
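To make the relationship between cumulative and per-class metrics concrete, here is a minimal pure-Python sketch that accumulates a confusion matrix over batches and derives per-class IoU and mIoU from it. It is not the project's evaluation code; the class name and the ignore_index value of 255 are assumptions, and labels are assumed to be flat sequences of valid class indices.

```python
class SegmentationMetrics:
    """Accumulates a confusion matrix across batches (illustrative only)."""

    def __init__(self, num_classes, ignore_index=255):
        self.num_classes = num_classes
        self.ignore_index = ignore_index  # assumed "void" label
        # confusion[t][p] counts pixels with true class t predicted as p
        self.confusion = [[0] * num_classes for _ in range(num_classes)]

    def update(self, target, prediction):
        """Add one batch of flattened labels to the running counts."""
        for t, p in zip(target, prediction):
            if t == self.ignore_index:
                continue
            self.confusion[t][p] += 1

    def per_class_iou(self):
        """IoU = TP / (TP + FP + FN) per class; None if the class never appears."""
        ious = []
        for c in range(self.num_classes):
            tp = self.confusion[c][c]
            fn = sum(self.confusion[c]) - tp
            fp = sum(self.confusion[r][c] for r in range(self.num_classes)) - tp
            denom = tp + fp + fn
            ious.append(tp / denom if denom else None)
        return ious

    def mean_iou(self):
        """Mean over the classes that appear in the targets or predictions."""
        valid = [v for v in self.per_class_iou() if v is not None]
        return sum(valid) / len(valid) if valid else None
```

Accumulating raw counts and computing IoU once at the end generally gives a different (and usually more meaningful) result than averaging per-batch IoU values, which is one place where cumulative-metric bugs tend to arise.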
Next Steps
Segmentation Post-Processing
Performance Issues
Some of these include:
Implementing a Metal version of contour detection and comparing it with VNDetectContoursRequest.
Assessing the performance of the current object tracking when there are many more objects to compare, and deciding how to optimize it accordingly.
More are listed in the GitHub issues.
Implementation of Union of Masks
TODO: Implement Union of Masks (with 3-5 consecutive frames)
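One possible shape for the union-of-masks step is a sliding window over the most recent frames, sketched below in pure Python. This is only an illustration under stated assumptions: masks are binary 2D lists of identical size, consecutive frames are already aligned with each other (for example via the homography transform), and the class name and default window size are hypothetical.

```python
from collections import deque


class MaskUnion:
    """Keeps the last `window` binary masks and returns their union."""

    def __init__(self, window=3):
        # deque with maxlen discards the oldest mask automatically
        self.frames = deque(maxlen=window)

    def add(self, mask):
        """Add the newest mask and return the union over the window."""
        self.frames.append(mask)
        return self.union()

    def union(self):
        if not self.frames:
            return []
        height = len(self.frames[0])
        width = len(self.frames[0][0]) if height else 0
        # A pixel is set if it is set in any frame of the window.
        return [
            [1 if any(f[r][c] for f in self.frames) else 0 for c in range(width)]
            for r in range(height)
        ]
```

A larger window fills more gaps caused by flickering predictions but also keeps stale pixels around longer, so the 3-5 frame range is a trade-off between stability and responsiveness.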
Implement sidewalk segmentation logic
Classes such as human (hand) can be used as a proxy for the calculation.
Fix remaining GPS-related issues with object location calculation algorithm
Address the remaining orientation issues using SwiftUI's dynamic layout APIs, such as the Layout protocol.
Computer Vision Pipeline
Compare and validate old and new implementations of ROM/RUM
Explore other models of interest.