Skip to content

curl_share_cleanup() not checked for errors #6722

@pitrou

Description

@pitrou

Describe the bug

On Apache Arrow we're getting Valgrind failures that happen in our Azure filesystem layer, which relies on the Azure SDK for C++.

According to Valgrind, the leak is in memory allocated by curl_share_init.
The Azure SDK does wrap the pointer returned by curl_share_init it in a smart pointer-like class CURLSHWrapper. However, the CURLSHWrapper destructor does not account for the fact that curl_share_cleanup can fail: not only according to the docs, but it can concretely return CURLSHE_IN_USE if the shared handle is still in use (by some ongoing operation perhaps?).

I don't know if that's concretely the problem in our case, but it seems the Azure SDK should at least check for errors returned by curl_share_cleanup and log an error somewhere.

Exception or Stack Trace

I'm pasting one stack trace for reference, but the other ones look similar:

==17420== 39,976 (928 direct, 39,048 indirect) bytes in 4 blocks are definitely lost in loss record 43 of 43
==17420==    at 0x49C6B9F: calloc (vg_replace_malloc.c:1675)
==17420==    by 0x7173253: curl_share_init (in /opt/conda/envs/arrow/lib/libcurl.so.4.8.0)
==17420==    by 0x5EE4A33: Azure::Core::Http::CurlConnection::CurlConnection(Azure::Core::Http::Request&, Azure::Core::Http::CurlTransportOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::chrono::duration<long, std::ratio<1l, 1000l> >) (in /opt/conda/envs/arrow/lib/libazure-core.so.1.16.0)
==17420==    by 0x5EE729A: Azure::Core::Http::_detail::CurlConnectionPool::ExtractOrCreateCurlConnection(Azure::Core::Http::Request&, Azure::Core::Http::CurlTransportOptions const&, std::chrono::duration<long, std::ratio<1l, 1000l> >, bool) (in /opt/conda/envs/arrow/lib/libazure-core.so.1.16.0)
==17420==    by 0x5EE7B03: Azure::Core::Http::CurlTransport::Send(Azure::Core::Http::Request&, Azure::Core::Context const&) (in /opt/conda/envs/arrow/lib/libazure-core.so.1.16.0)
==17420==    by 0x5F12F6D: Azure::Core::Http::Policies::_internal::TransportPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Http::Policies::NextHttpPolicy, Azure::Core::Context const&) const (in /opt/conda/envs/arrow/lib/libazure-core.so.1.16.0)
==17420==    by 0x5F07BDA: Azure::Core::Http::Policies::NextHttpPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Context const&) (in /opt/conda/envs/arrow/lib/libazure-core.so.1.16.0)
==17420==    by 0x5F06E1D: Azure::Core::Http::Policies::_internal::LogPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Http::Policies::NextHttpPolicy, Azure::Core::Context const&) const (in /opt/conda/envs/arrow/lib/libazure-core.so.1.16.0)
==17420==    by 0x5F07BDA: Azure::Core::Http::Policies::NextHttpPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Context const&) (in /opt/conda/envs/arrow/lib/libazure-core.so.1.16.0)
==17420==    by 0x5F0DCAB: Azure::Core::Http::Policies::_internal::RequestActivityPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Http::Policies::NextHttpPolicy, Azure::Core::Context const&) const (in /opt/conda/envs/arrow/lib/libazure-core.so.1.16.0)
==17420==    by 0x5F07BDA: Azure::Core::Http::Policies::NextHttpPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Context const&) (in /opt/conda/envs/arrow/lib/libazure-core.so.1.16.0)
==17420==    by 0x4BB254D: Azure::Storage::_internal::SharedKeyPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Http::Policies::NextHttpPolicy, Azure::Core::Context const&) const (in /opt/conda/envs/arrow/lib/libazure-storage-files-datalake.so)
==17420==    by 0x5F07BDA: Azure::Core::Http::Policies::NextHttpPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Context const&) (in /opt/conda/envs/arrow/lib/libazure-core.so.1.16.0)
==17420==    by 0x64BE701: Azure::Storage::_internal::StoragePerRetryPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Http::Policies::NextHttpPolicy, Azure::Core::Context const&) const (in /opt/conda/envs/arrow/lib/libazure-storage-common.so.12.10.0)
==17420==    by 0x5F07BDA: Azure::Core::Http::Policies::NextHttpPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Context const&) (in /opt/conda/envs/arrow/lib/libazure-core.so.1.16.0)
==17420==    by 0x64BF864: Azure::Storage::_internal::StorageSwitchToSecondaryPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Http::Policies::NextHttpPolicy, Azure::Core::Context const&) const (in /opt/conda/envs/arrow/lib/libazure-storage-common.so.12.10.0)
==17420==    by 0x5F07BDA: Azure::Core::Http::Policies::NextHttpPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Context const&) (in /opt/conda/envs/arrow/lib/libazure-core.so.1.16.0)
==17420==    by 0x5F11250: Azure::Core::Http::Policies::_internal::RetryPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Http::Policies::NextHttpPolicy, Azure::Core::Context const&) const (in /opt/conda/envs/arrow/lib/libazure-core.so.1.16.0)
==17420==    by 0x5F07BDA: Azure::Core::Http::Policies::NextHttpPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Context const&) (in /opt/conda/envs/arrow/lib/libazure-core.so.1.16.0)
==17420==    by 0x5F11D3B: Azure::Core::Http::Policies::_internal::TelemetryPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Http::Policies::NextHttpPolicy, Azure::Core::Context const&) const (in /opt/conda/envs/arrow/lib/libazure-core.so.1.16.0)
==17420==    by 0x5F07BDA: Azure::Core::Http::Policies::NextHttpPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Context const&) (in /opt/conda/envs/arrow/lib/libazure-core.so.1.16.0)
==17420==    by 0x4BB2816: Azure::Core::Http::Policies::_internal::RequestIdPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Http::Policies::NextHttpPolicy, Azure::Core::Context const&) const (in /opt/conda/envs/arrow/lib/libazure-storage-files-datalake.so)
==17420==    by 0x5F07BDA: Azure::Core::Http::Policies::NextHttpPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Context const&) (in /opt/conda/envs/arrow/lib/libazure-core.so.1.16.0)
==17420==    by 0x4BB18AF: Azure::Storage::_internal::StorageServiceVersionPolicy::Send(Azure::Core::Http::Request&, Azure::Core::Http::Policies::NextHttpPolicy, Azure::Core::Context const&) const (in /opt/conda/envs/arrow/lib/libazure-storage-files-datalake.so)
==17420==    by 0x4BFDFA3: Azure::Storage::Files::DataLake::_detail::PathClient::GetAccessControlList(Azure::Core::Http::_internal::HttpPipeline&, Azure::Core::Url const&, Azure::Storage::Files::DataLake::_detail::PathClient::GetPathAccessControlListOptions const&, Azure::Core::Context const&) (in /opt/conda/envs/arrow/lib/libazure-storage-files-datalake.so)
==17420==    by 0x4BC45F9: Azure::Storage::Files::DataLake::DataLakePathClient::GetAccessControlList(Azure::Storage::Files::DataLake::GetPathAccessControlListOptions const&, Azure::Core::Context const&) const (in /opt/conda/envs/arrow/lib/libazure-storage-files-datalake.so)
==17420==    by 0x5BA501B: arrow::fs::internal::CheckIfHierarchicalNamespaceIsEnabled(Azure::Storage::Files::DataLake::DataLakeFileSystemClient const&, arrow::fs::AzureOptions const&) (azurefs.cc:1363)
==17420==    by 0x40B6E76: arrow::fs::TestAzuriteFileSystem_CheckIfHierarchicalNamespaceIsEnabledTransportError_Test::TestBody() (azurefs_test.cc:2426)
==17420==    by 0x5E9A14F: void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (gtest.cc:2653)
==17420==    by 0x5EA8B45: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (gtest.cc:2689)
==17420==    by 0x5EA8D65: testing::Test::Run() (gtest.cc:2728)
==17420==    by 0x5EA90BB: testing::TestInfo::Run() (gtest.cc:2874)
==17420==    by 0x5EB1336: testing::TestSuite::Run() (gtest.cc:3052)
==17420==    by 0x5EB3D8C: testing::internal::UnitTestImpl::RunAllTests() (gtest.cc:6004)
==17420==    by 0x5E9A6A3: bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (gtest.cc:2653)
==17420==    by 0x5EA932C: bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (gtest.cc:2689)
==17420==    by 0x5EA94F7: testing::UnitTest::Run() (gtest.cc:5583)
==17420==    by 0x4B61841: RUN_ALL_TESTS() (gtest.h:2334)
==17420==    by 0x4B6187E: main (gtest_main.cc:64)
==17420== 

Setup (please complete the following information):

  • Azure SDK 1.16.0

Additional context
Add any other context about the problem here.

Information Checklist
Kindly make sure that you have added all the following information above and checkoff the required fields otherwise we will treat the issuer as an incomplete report

  • Bug Description Added
  • Repro Steps Added
  • Setup information Added

Metadata

Metadata

Assignees

No one assigned

    Labels

    Azure.CoreClientThis issue points to a problem in the data-plane of the library.Service AttentionWorkflow: This issue is responsible by Azure service team.customer-reportedIssues that are reported by GitHub users external to the Azure organization.needs-team-attentionWorkflow: This issue needs attention from Azure service team or SDK teamquestionThe issue doesn't require a change to the product in order to be resolved. Most issues start as that

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions