-
-
Notifications
You must be signed in to change notification settings - Fork 4.7k
Fix segfault, added tests for extractLabeledEuclideanClusters #4084
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Fix segfault, added tests for extractLabeledEuclideanClusters #4084
Conversation
Currently, it seems like it's the caller's responsibility to allocate properly based on the number of labels. You're shifting the responsibility to Keeping that in mind, if PCL seg-faults on it's own code, would it not make more sense to fix it at the source rather than inside the function? If it's not caused by PCL, it'd make more sense to pre-allocate based on the max label_cluster idx that can be possibly accessed const auto& max_label_idx = *std::max_element(cloud.begin(), cloud.end(), [](const auto& a, const auto&b) { return a.label < b.label; });
labeled_clusters.resize(max_label_idx); |
Hmm yes I can see how this is an enhancement rather than a fix. I suppose I framed it that way because it seems rather counter-intuitive to have to preallocate the output parameter, but then again I'm new to PCL. At the very least, it would help if this requirement was documented somewhere, since I spent many hours tracking this down. Simply changing the documentation would help with that. Also, consider the edge case with a single pointcloud with label of std::numeric_limits::max(). You would need to preallocate an array of that size just to store a single cluster, which seems very inefficient to me as opposed to enhancing the PCL code. I'm happy to make any changes; I'm just trying to prevent others from going through the same pain I've just went through :) |
Where are the labels coming from? I'm actually not 100% sure that the labels are generated solely by the library. If they are coming via library, they'd be in order and packed. Else your point stands and using a map-like approach makes sense.
You can create another PR to fix that while we discuss improvements here |
I am generating my points externally, and using the labels for semantic meaning. I want to extract/cluster the labeled points for further processing. In my case, I'm using object classification on 2D images to classify objects (and using the Label for classid with a custom map), and then projecting them to 3D points in space so I can later merge/deconflict/etc with RADAR/LIDAR scans, which are already pointclouds. At this point I can choose/limit my label values so I don't expect to need more than 30-40 right now (with values <= that amount), and I don't need this PR. I was just trying to smooth out what I perceived to be a rough edge in the library. Happy to implement a |
Instead of a map implementation, I'd recommend the following:
|
segmentation/include/pcl/segmentation/impl/extract_labeled_clusters.hpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This will significantly change the behavior of extractLabeledEuclideanClusters
with std::vector<std::vector<pcl::PointIndices>>
, esp wrt
- speed
- memory pressure
Could someone please benchmark this with a real-world dataset? IFF the difference is significant, no need to panic. We can refactor the meat-of-the-proposal using a std::variant<std::map<T>, std::vector<T>>
(with T = std::vector<pcl::Indices>
) and a visitor on operator[]
|
||
// get max label | ||
const auto extracted_max_label = std::max_element( labeled_map.begin(), labeled_map.end(), | ||
[]( const pcl::labeled_cluster_map_t::value_type& lhs, const pcl::labeled_cluster_map_t::value_type& rhs ) { return lhs.first < rhs.first; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[]( const pcl::labeled_cluster_map_t::value_type& lhs, const pcl::labeled_cluster_map_t::value_type& rhs ) { return lhs.first < rhs.first; } | |
[]( const auto& lhs, const auto& rhs ) { return lhs.first < rhs.first; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any C++11 concerns? Or are we >= C++14 now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PCL requires C++14
} | ||
|
||
// move clusters to output vector | ||
// convert pcl::Indices to pcl:PointIndices |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are planning on deprecating the PointIndices
. Could you please add an interface for vec-o-vec-o-Indices
and move this code to modify the output of call to that interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you envision this class having 3 extraction methods? I had assumed 2: the new map
method would supercede the vec-o-vec-o-PointIndices
. This function exists to provide backcompat. Or I may be misunderstanding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're touching this, I thought it might be easy to add it now, rather than later when someone is going through the interface updating the API.
That means 3 interfaces:
- vec-o-vec-Indices: new
- map-o-vec-Indices: new
- vec-o-vec-PointIndices: old
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If vec-o-vec-Indices
will behave similarly to vec-o-vec-PointIndices
, and therefore inherit the memory inefficiencies in the sparse label case, why not just deprecate and replace with the map
version?
Actually, in searching for existing usage within PCL, I stumbled across ConditionalEuclideanClustering
, which seems like a more generic solution to this grouping problem. Are there any good reasons not to simply deprecate this entire class in favor of ConditionalEuclideanClustering
with a trivial label matching function? Or, under the hood, rewrite this class to use ConditionalEuclideanClustering
, and then transform the result to comply with existing behavior. Maybe I'm missing something obvious
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrt deprecation and the alternate of rewriting, I'm not a user of these classes. Hopefully others can chime in
// iterator in labeled_clusters for the current label | ||
const auto current_label_cluster_it = labeled_clusters.find( current_label ); | ||
|
||
// max labels check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to check for max_labels
? The interface had an extra parameter but we don't need to add a potential pitfall for that (esp since it is unused in the existing implementation, with no check or documentation)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't want to break the interface by removing max_labels
, but I didn't want it to be left unimplemented either. So I implemented it. Should it be removed entirely, then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the new interface: remove it; in the old interface: keep it. I'll create a PR for deprecating it while adding an overload without the extra parameter.
{ | ||
// Search for sq_idx | ||
int ret = tree->radiusSearch (seed_queue[sq_idx], tolerance, nn_indices, nn_distances, std::numeric_limits<int>::max()); | ||
pcl::Indices nn_indices = {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pulling out the vectors from the loop reduces the total dynamic allocations required
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I had assumed the compiler would optimize this but a brief googling tells me I am probably wrong. I will allocate outside and clear
the array instead (which radiusSearch
should take care of, but just in case...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the compiler would optimize this
C++20 brings that (compiler optimize matching pair of new and delete). In any case, it'd not apply here because the allocations/deallocations aren't happening in pairs. If only there was a way to share a buffer in a loop from one iteration to another (without needing to use a PITA allocator)
////////////////////////////////////////////////////////////////////////////////////////////////////////////////// | ||
/** \brief Map of label and associated clusters | ||
*/ | ||
using labeled_cluster_map_t = std::map<std::uint32_t, std::vector<pcl::Indices>>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the keys are uint32
, it might make sense to either let the user choose the map by templating the function, or use a faster map than std::map
.
Pinging @Morwenn for the assist
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any plans to typedef the label_type
? Ideally, a label_t
or equivalent would go right here.
Also, I'm torn between unordered_map
and map
, any suggestions? I'm assuming a small n
, where n
==number of labels, and a map
may be slightly better. But I could be wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any plans to typedef the
label_type
Disclaimer: Personal opinion: None, though it makes sense.
if ( current_label_cluster_it == labeled_clusters.end() ) // label not found in map, add | ||
{ | ||
std::vector<pcl::Indices> v = {}; | ||
v.emplace_back(std::move(seed_queue)); | ||
labeled_clusters.emplace(current_label, std::move(v)); | ||
} | ||
else // label already exists in map; append seed_queue to label vector | ||
current_label_cluster_it->second.emplace_back(std::move(seed_queue)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if ( current_label_cluster_it == labeled_clusters.end() ) // label not found in map, add | |
{ | |
std::vector<pcl::Indices> v = {}; | |
v.emplace_back(std::move(seed_queue)); | |
labeled_clusters.emplace(current_label, std::move(v)); | |
} | |
else // label already exists in map; append seed_queue to label vector | |
current_label_cluster_it->second.emplace_back(std::move(seed_queue)); | |
labeled_clusters[current_label].emplace_back(std::move(seed_queue)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how I missed that one -- I guess it's been a while since I used map
. Thanks!
extractLabeledEuclideanClusters
would segfault at the following line if the output arraylabeled_clusters
was not properly sized, or was empty:pcl/segmentation/include/pcl/segmentation/impl/extract_labeled_clusters.hpp
Line 114 in f9f214f
This PR fixes this issue and adds some basic tests for
LabeledEuclideanClusterExtraction