Description
First, thank you for sharing this excellent work! I have a question regarding the data splitting strategy and performance benchmarking:
I noticed that Area 5 is designated as the test set but also appears to be used as the validation set during training. Could this introduce data leakage or lead to overestimated performance metrics?
Would an alternative approach be more methodologically rigorous, for example using 90% of Areas 1-4 and Area 6 for training and the remaining 10% for validation, while reserving Area 5 exclusively for testing? I'm particularly curious whether you have experimented with this configuration and how it affects the mIoU scores.
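To make the proposed split concrete, here is a minimal sketch of what I have in mind. It is not code from this repository: the function name `split_rooms`, the room naming pattern `Area_<n>_<room>`, the room-level (rather than point-level) split, and the fixed seed are all assumptions of mine for illustration.

```python
import random


def split_rooms(rooms, test_area=5, val_fraction=0.1, seed=0):
    """Split room names into train/val/test lists.

    Assumes (hypothetically) that each name looks like 'Area_<n>_<room>'.
    All rooms of `test_area` go to the test set; the remaining rooms are
    shuffled with a fixed seed and `val_fraction` of them is held out
    for validation.
    """
    def area_of(name):
        # 'Area_3_office_2' -> 3
        return int(name.split("_")[1])

    test = sorted(r for r in rooms if area_of(r) == test_area)
    rest = sorted(r for r in rooms if area_of(r) != test_area)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(rest)

    n_val = max(1, round(len(rest) * val_fraction))
    val = sorted(rest[:n_val])
    train = sorted(rest[n_val:])
    return train, val, test


# Toy example: 10 made-up rooms in each of the 6 areas.
rooms = [f"Area_{a}_office_{i}" for a in range(1, 7) for i in range(10)]
train, val, test = split_rooms(rooms)
print(len(train), len(val), len(test))
```

With a split like this, Area 5 is never seen during training or model selection, so the test mIoU is only computed once on the final checkpoint chosen using the held-out 10%.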
Additionally, while most reproduced implementations of PointNet++ on S3DIS (with Area 5 as test set) report mIoU around 0.53, I've been unable to achieve comparable scores using the dataset partitioning method I described above. Would you be able to shed light on any critical implementation details that might explain this discrepancy?
Your expertise would be greatly appreciated!