
GH-47052: [CI] Use Alpine Linux 3.20 instead of 3.18 #47148

Draft: wants to merge 3 commits into base: main

raulcd
Member

@raulcd raulcd commented Jul 21, 2025

Rationale for this change

Alpine Linux 3.18 is currently deprecated.

What changes are included in this PR?

Update the Alpine Linux version used in CI from 3.18 to 3.20.

Are these changes tested?

Via CI

@raulcd raulcd added the CI: Extra Run extra CI label Jul 21, 2025

⚠️ GitHub issue #47052 has been automatically assigned in GitHub to PR creator.

@github-actions github-actions bot added the awaiting committer review Awaiting committer review label Jul 21, 2025
@raulcd
Member Author

raulcd commented Jul 21, 2025

This is failing locally. It seems the link flags (lz4) are not correctly propagated when using the system ORC. ORC is linked statically here:

[1836/1899] Linking CXX executable debug/arrow-acero-plan-test
ninja: job failed: : && /usr/lib/ccache/bin/c++ -Wredundant-move -Wno-noexcept-type -Wno-self-move  -fdiagnostics-color=always  -Wall -Wno-conversion -Wno-sign-conversion -Wdate-time -Wimplicit-fallthrough -Wunused-result -fno-semantic-interposition -msse4.2  -g -Werror -O0 -ggdb  src/arrow/acero/CMakeFiles/arrow_acero_testing.dir/test_nodes.cc.o src/arrow/acero/CMakeFiles/arrow_acero_testing.dir/test_util_internal.cc.o src/arrow/compute/CMakeFiles/arrow_compute_testing.dir/test_env.cc.o src/arrow/dataset/CMakeFiles/arrow_dataset_testing.dir/test_util_internal.cc.o src/arrow/dataset/CMakeFiles/arrow-dataset-file-orc-test.dir/file_orc_test.cc.o -o debug/arrow-dataset-file-orc-test  -Wl,-rpath,/build/cpp/debug  src/arrow/compute/CMakeFiles/arrow_compute_core_testing.dir/./test_util_internal.cc.o  -ldl  /usr/lib/liborc.a  debug/libarrow_dataset.so.2100.0.0  debug/libparquet.so.2100.0.0  debug/libarrow_testing.so.2100.0.0  /usr/lib/libgmock.so.1.14.0  /usr/lib/libgtest_main.so.1.14.0  debug/libarrow_acero.so.2100.0.0  debug/libarrow_compute.so.2100.0.0  /usr/lib/libgtest.so.1.14.0  debug/libarrow.so.2100.0.0  -ldl  /usr/lib/libprotobuf.so.24.4.0  /usr/lib/libabsl_log_internal_check_op.so.2308.0.0  /usr/lib/libabsl_leak_check.so.2308.0.0  /usr/lib/libabsl_die_if_null.so.2308.0.0  /usr/lib/libabsl_log_internal_conditions.so.2308.0.0  /usr/lib/libabsl_log_internal_message.so.2308.0.0  /usr/lib/libabsl_log_internal_nullguard.so.2308.0.0  /usr/lib/libabsl_examine_stack.so.2308.0.0  /usr/lib/libabsl_log_internal_format.so.2308.0.0  /usr/lib/libabsl_log_internal_proto.so.2308.0.0  /usr/lib/libabsl_log_internal_log_sink_set.so.2308.0.0  /usr/lib/libabsl_log_sink.so.2308.0.0  /usr/lib/libabsl_log_entry.so.2308.0.0  /usr/lib/libabsl_flags.so.2308.0.0  /usr/lib/libabsl_flags_internal.so.2308.0.0  /usr/lib/libabsl_flags_marshalling.so.2308.0.0  /usr/lib/libabsl_flags_reflection.so.2308.0.0  /usr/lib/libabsl_flags_config.so.2308.0.0  /usr/lib/libabsl_flags_program_name.so.2308.0.0  
/usr/lib/libabsl_flags_private_handle_accessor.so.2308.0.0  /usr/lib/libabsl_flags_commandlineflag.so.2308.0.0  /usr/lib/libabsl_flags_commandlineflag_internal.so.2308.0.0  /usr/lib/libabsl_log_initialize.so.2308.0.0  /usr/lib/libabsl_log_globals.so.2308.0.0  /usr/lib/libabsl_log_internal_globals.so.2308.0.0  /usr/lib/libabsl_raw_hash_set.so.2308.0.0  /usr/lib/libabsl_hash.so.2308.0.0  /usr/lib/libabsl_city.so.2308.0.0  /usr/lib/libabsl_low_level_hash.so.2308.0.0  /usr/lib/libabsl_hashtablez_sampler.so.2308.0.0  /usr/lib/libabsl_statusor.so.2308.0.0  /usr/lib/libabsl_status.so.2308.0.0  /usr/lib/libabsl_cord.so.2308.0.0  /usr/lib/libabsl_cordz_info.so.2308.0.0  /usr/lib/libabsl_cord_internal.so.2308.0.0  /usr/lib/libabsl_cordz_functions.so.2308.0.0  /usr/lib/libabsl_exponential_biased.so.2308.0.0  /usr/lib/libabsl_cordz_handle.so.2308.0.0  /usr/lib/libabsl_crc_cord_state.so.2308.0.0  /usr/lib/libabsl_crc32c.so.2308.0.0  /usr/lib/libabsl_crc_internal.so.2308.0.0  /usr/lib/libabsl_crc_cpu_detect.so.2308.0.0  /usr/lib/libabsl_bad_optional_access.so.2308.0.0  /usr/lib/libabsl_str_format_internal.so.2308.0.0  /usr/lib/libabsl_strerror.so.2308.0.0  /usr/lib/libabsl_synchronization.so.2308.0.0  /usr/lib/libabsl_stacktrace.so.2308.0.0  /usr/lib/libabsl_symbolize.so.2308.0.0  /usr/lib/libabsl_debugging_internal.so.2308.0.0  /usr/lib/libabsl_demangle_internal.so.2308.0.0  /usr/lib/libabsl_graphcycles_internal.so.2308.0.0  /usr/lib/libabsl_kernel_timeout_internal.so.2308.0.0  /usr/lib/libabsl_malloc_internal.so.2308.0.0  /usr/lib/libabsl_time.so.2308.0.0  /usr/lib/libabsl_strings.so.2308.0.0  /usr/lib/libabsl_string_view.so.2308.0.0  /usr/lib/libabsl_throw_delegate.so.2308.0.0  /usr/lib/libabsl_strings_internal.so.2308.0.0  /usr/lib/libabsl_base.so.2308.0.0  /usr/lib/libabsl_spinlock_wait.so.2308.0.0  /usr/lib/libabsl_int128.so.2308.0.0  /usr/lib/libabsl_civil_time.so.2308.0.0  /usr/lib/libabsl_time_zone.so.2308.0.0  /usr/lib/libabsl_bad_variant_access.so.2308.0.0  
/usr/lib/libabsl_raw_logging_internal.so.2308.0.0  /usr/lib/libabsl_log_severity.so.2308.0.0 && :
/usr/lib/gcc/x86_64-alpine-linux-musl/13.2.1/../../../../x86_64-alpine-linux-musl/bin/ld: /usr/lib/liborc.a(Compression.cc.o): undefined reference to symbol 'LZ4_compressBound'
/usr/lib/gcc/x86_64-alpine-linux-musl/13.2.1/../../../../x86_64-alpine-linux-musl/bin/ld: /usr/lib/liblz4.so.1: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
ninja: subcommand failed

@kou any idea what might be the issue? To be honest, I am unsure whether this is a problem with the ORC package built for Alpine or with our third-party toolchain.
I have tried with ORC_SOURCE=BUNDLED and the build succeeds.

@kou
Member

kou commented Jul 21, 2025

It seems that apache-orc-dev 2.0.3 on Alpine Linux 3.20 and 3.21 doesn't provide orcConfig.cmake. (apache-orc-dev on Alpine Linux 3.22 provides it.)

How about the following as a workaround?

diff --git a/cpp/cmake_modules/FindorcAlt.cmake b/cpp/cmake_modules/FindorcAlt.cmake
index ce8cd11b4c..406f186a48 100644
--- a/cpp/cmake_modules/FindorcAlt.cmake
+++ b/cpp/cmake_modules/FindorcAlt.cmake
@@ -66,10 +66,25 @@ find_package_handle_standard_args(
 
 if(orcAlt_FOUND)
   if(NOT TARGET orc::orc)
+    # For old Apache Orc. For example, apache-orc 2.0.3 on Alpine
+    # Linux 3.20 and 3.21.
     add_library(orc::orc STATIC IMPORTED)
     set_target_properties(orc::orc
                           PROPERTIES IMPORTED_LOCATION "${ORC_STATIC_LIB}"
                                      INTERFACE_INCLUDE_DIRECTORIES "${ORC_INCLUDE_DIR}")
+    if(ARROW_WITH_LZ4 AND TARGET LZ4::lz4)
+      target_link_libraries(orc::orc INTERFACE LZ4::lz4)
+    endif()
+    if(ARROW_WITH_SNAPPY AND TARGET Snappy::snappy)
+      target_link_libraries(orc::orc INTERFACE Snappy::snappy)
+    endif()
+    if(ARROW_WITH_ZSTD)
+      if(TARGET zstd::libzstd_shared)
+        target_link_libraries(orc::orc INTERFACE zstd::libzstd_shared)
+      elseif(TARGET zstd::libzstd_static)
+        target_link_libraries(orc::orc INTERFACE zstd::libzstd_static)
+      endif()
+    endif()
   endif()
   set(orcAlt_VERSION ${ORC_VERSION})
 endif()
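
Why this workaround helps: an imported static library carries no record of its own dependencies, so anything that links `orc::orc` only gets `liborc.a` unless the target declares interface link libraries. The following is a toy Python model of that propagation, not CMake's actual algorithm; the `Target` class and the `link_line` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    """Toy stand-in for an imported CMake target (hypothetical model)."""
    location: str
    interface_libs: list = field(default_factory=list)  # names of other targets


def link_line(name, targets, seen=None):
    """Return the library path for `name` followed by its transitive interface deps."""
    seen = set() if seen is None else seen
    if name in seen:
        return []
    seen.add(name)
    target = targets[name]
    libs = [target.location]
    for dep in target.interface_libs:
        libs.extend(link_line(dep, targets, seen))
    return libs


targets = {
    "orc::orc": Target("/usr/lib/liborc.a"),
    "LZ4::lz4": Target("/usr/lib/liblz4.so"),
}

# Without interface dependencies, consumers see only the archive.
before = link_line("orc::orc", targets)

# Declaring the dependency mirrors target_link_libraries(orc::orc INTERFACE LZ4::lz4).
targets["orc::orc"].interface_libs.append("LZ4::lz4")
after = link_line("orc::orc", targets)

print(before)
print(after)
```

In this model the lz4 library only reaches the consumer's link line once the dependency is declared on the ORC target, which is the effect the `target_link_libraries(... INTERFACE ...)` calls in the diff aim for.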

@raulcd
Member Author

raulcd commented Jul 22, 2025

Thanks @kou! I had to add Zlib to your snippet as well, and with that we were able to build. The job is still failing, but now with a core dump and failures in the ORC tests, which will require more investigation. The same tests pass if I build ORC with ORC_SOURCE=BUNDLED.

@raulcd
Member Author

raulcd commented Jul 22, 2025

The Python failures are unrelated; they are also failing on other PRs. I opened:

@kou
Member

kou commented Jul 23, 2025

https://github.com/apache/arrow/actions/runs/16442827719/job/46467431083#step:6:4162

 70/104 Test  #72: arrow-dataset-file-orc-test ..................***Failed   11.12 sec
Running arrow-dataset-file-orc-test, redirecting output into /build/cpp/build/test-logs/arrow-dataset-file-orc-test.txt (attempt 1/1)
/arrow/cpp/build-support/run-test.sh: line 88: 23669 Aborted                 (core dumped) $TEST_EXECUTABLE "$@" > $LOGFILE.raw 2>&1
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1753185218.973151   23669 descriptor_database.cc:656] File already exists in database: orc_proto.proto
F0000 00:00:1753185218.973271   23669 descriptor.cc:2160] Check failed: GeneratedDatabase()->Add(encoded_file_descriptor, size)

It seems that ORC was loaded multiple times. liborc.a may be linked to the test file multiple times.

Let's check the link command line for the test file.
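
A check like this can be done by eye, but a small script makes it less error-prone on a command this long. The following is a minimal Python sketch, assuming the link command has been copied out of a verbose build log; `count_static_archives` is a hypothetical helper and the sample command is shortened for illustration.

```python
import shlex
from collections import Counter


def count_static_archives(link_command):
    """Count how often each static archive (*.a) appears in a link command."""
    tokens = shlex.split(link_command)
    return Counter(tok for tok in tokens if tok.endswith(".a"))


# Shortened, illustrative command; paste the real one from the verbose build log.
sample = (
    "/usr/lib/ccache/bin/c++ -g file_orc_test.cc.o "
    "-o debug/arrow-dataset-file-orc-test "
    "-ldl /usr/lib/liborc.a debug/libarrow_dataset.so.2100.0.0 "
    "debug/libarrow.so.2100.0.0 /usr/lib/liblz4.so"
)

counts = count_static_archives(sample)
duplicates = {lib: n for lib, n in counts.items() if n > 1}
print(counts["/usr/lib/liborc.a"], duplicates)
```

A count of one on the command line does not rule out a second copy already embedded in one of the shared libraries being linked.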

@raulcd
Member Author

raulcd commented Jul 24, 2025

I've tested with -DCMAKE_VERBOSE_MAKEFILE=ON. It doesn't seem to be linked more than once:

[1820/1899] : && /usr/lib/ccache/bin/c++ -Wredundant-move -Wno-noexcept-type -Wno-self-move -fdiagnostics-color=always -Wall -Wno-conversion -Wno-sign-conversion -Wdate-time -Wimplicit-fallthrough -Wunused-result -fno-semantic-interposition -msse4.2 -g -Werror -O0 -ggdb src/arrow/acero/CMakeFiles/arrow_acero_testing.dir/test_nodes.cc.o src/arrow/acero/CMakeFiles/arrow_acero_testing.dir/test_util_internal.cc.o src/arrow/compute/CMakeFiles/arrow_compute_testing.dir/test_env.cc.o src/arrow/dataset/CMakeFiles/arrow_dataset_testing.dir/test_util_internal.cc.o src/arrow/dataset/CMakeFiles/arrow-dataset-file-orc-test.dir/file_orc_test.cc.o -o debug/arrow-dataset-file-orc-test -Wl,-rpath,/build/cpp/debug src/arrow/compute/CMakeFiles/arrow_compute_core_testing.dir/./test_util_internal.cc.o -ldl /usr/lib/liborc.a debug/libarrow_dataset.so.2100.0.0 debug/libparquet.so.2100.0.0 debug/libarrow_testing.so.2100.0.0 /usr/lib/libgmock.so.1.14.0 /usr/lib/libgtest_main.so.1.14.0 debug/libarrow_acero.so.2100.0.0 debug/libarrow_compute.so.2100.0.0 /usr/lib/libgtest.so.1.14.0 debug/libarrow.so.2100.0.0 -ldl /usr/lib/liblz4.so /usr/lib/libsnappy.so.1.1.10 /usr/lib/libzstd.so /lib/libz.so /usr/lib/libprotobuf.so.24.4.0 /usr/lib/libabsl_log_internal_check_op.so.2308.0.0 /usr/lib/libabsl_leak_check.so.2308.0.0 /usr/lib/libabsl_die_if_null.so.2308.0.0 /usr/lib/libabsl_log_internal_conditions.so.2308.0.0 /usr/lib/libabsl_log_internal_message.so.2308.0.0 /usr/lib/libabsl_log_internal_nullguard.so.2308.0.0 /usr/lib/libabsl_examine_stack.so.2308.0.0 /usr/lib/libabsl_log_internal_format.so.2308.0.0 /usr/lib/libabsl_log_internal_proto.so.2308.0.0 /usr/lib/libabsl_log_internal_log_sink_set.so.2308.0.0 /usr/lib/libabsl_log_sink.so.2308.0.0 /usr/lib/libabsl_log_entry.so.2308.0.0 /usr/lib/libabsl_flags.so.2308.0.0 /usr/lib/libabsl_flags_internal.so.2308.0.0 /usr/lib/libabsl_flags_marshalling.so.2308.0.0 /usr/lib/libabsl_flags_reflection.so.2308.0.0 /usr/lib/libabsl_flags_config.so.2308.0.0 
/usr/lib/libabsl_flags_program_name.so.2308.0.0 /usr/lib/libabsl_flags_private_handle_accessor.so.2308.0.0 /usr/lib/libabsl_flags_commandlineflag.so.2308.0.0 /usr/lib/libabsl_flags_commandlineflag_internal.so.2308.0.0 /usr/lib/libabsl_log_initialize.so.2308.0.0 /usr/lib/libabsl_log_globals.so.2308.0.0 /usr/lib/libabsl_log_internal_globals.so.2308.0.0 /usr/lib/libabsl_raw_hash_set.so.2308.0.0 /usr/lib/libabsl_hash.so.2308.0.0 /usr/lib/libabsl_city.so.2308.0.0 /usr/lib/libabsl_low_level_hash.so.2308.0.0 /usr/lib/libabsl_hashtablez_sampler.so.2308.0.0 /usr/lib/libabsl_statusor.so.2308.0.0 /usr/lib/libabsl_status.so.2308.0.0 /usr/lib/libabsl_cord.so.2308.0.0 /usr/lib/libabsl_cordz_info.so.2308.0.0 /usr/lib/libabsl_cord_internal.so.2308.0.0 /usr/lib/libabsl_cordz_functions.so.2308.0.0 /usr/lib/libabsl_exponential_biased.so.2308.0.0 /usr/lib/libabsl_cordz_handle.so.2308.0.0 /usr/lib/libabsl_crc_cord_state.so.2308.0.0 /usr/lib/libabsl_crc32c.so.2308.0.0 /usr/lib/libabsl_crc_internal.so.2308.0.0 /usr/lib/libabsl_crc_cpu_detect.so.2308.0.0 /usr/lib/libabsl_bad_optional_access.so.2308.0.0 /usr/lib/libabsl_str_format_internal.so.2308.0.0 /usr/lib/libabsl_strerror.so.2308.0.0 /usr/lib/libabsl_synchronization.so.2308.0.0 /usr/lib/libabsl_stacktrace.so.2308.0.0 /usr/lib/libabsl_symbolize.so.2308.0.0 /usr/lib/libabsl_debugging_internal.so.2308.0.0 /usr/lib/libabsl_demangle_internal.so.2308.0.0 /usr/lib/libabsl_graphcycles_internal.so.2308.0.0 /usr/lib/libabsl_kernel_timeout_internal.so.2308.0.0 /usr/lib/libabsl_malloc_internal.so.2308.0.0 /usr/lib/libabsl_time.so.2308.0.0 /usr/lib/libabsl_strings.so.2308.0.0 /usr/lib/libabsl_string_view.so.2308.0.0 /usr/lib/libabsl_throw_delegate.so.2308.0.0 /usr/lib/libabsl_strings_internal.so.2308.0.0 /usr/lib/libabsl_base.so.2308.0.0 /usr/lib/libabsl_spinlock_wait.so.2308.0.0 /usr/lib/libabsl_int128.so.2308.0.0 /usr/lib/libabsl_civil_time.so.2308.0.0 /usr/lib/libabsl_time_zone.so.2308.0.0 /usr/lib/libabsl_bad_variant_access.so.2308.0.0 
/usr/lib/libabsl_raw_logging_internal.so.2308.0.0 /usr/lib/libabsl_log_severity.so.2308.0.0 && :

@kou
Member

kou commented Jul 25, 2025

Ah, we may not need liborc.a here because file_orc_test.cc doesn't use the ORC API directly:

diff --git a/cpp/src/arrow/dataset/CMakeLists.txt b/cpp/src/arrow/dataset/CMakeLists.txt
index d87afdf5bd..fa6875527d 100644
--- a/cpp/src/arrow/dataset/CMakeLists.txt
+++ b/cpp/src/arrow/dataset/CMakeLists.txt
@@ -191,8 +191,7 @@ if(ARROW_JSON)
 endif()
 
 if(ARROW_ORC)
-  add_arrow_dataset_test(file_orc_test EXTRA_LINK_LIBS ${ARROW_DATASET_TEST_LINK_LIBS}
-                         orc::orc)
+  add_arrow_dataset_test(file_orc_test EXTRA_LINK_LIBS ${ARROW_DATASET_TEST_LINK_LIBS})
 endif()
 
 if(ARROW_PARQUET)
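
One plausible reading of the failure, stated as an assumption rather than something confirmed in this thread: if the Arrow shared libraries already contain ORC, linking `liborc.a` into the test executable as well would give the process two copies of the generated `orc_proto` code, each trying to register the same file name at startup. The toy Python model below illustrates that kind of duplicate registration; `DescriptorRegistry`, `load`, and the image names are hypothetical, and this is not protobuf's implementation.

```python
class DescriptorRegistry:
    """Toy process-wide registry keyed by .proto file name (hypothetical)."""

    def __init__(self):
        self._files = {}

    def add(self, filename, owner):
        # A second registration of the same file name is rejected.
        if filename in self._files:
            raise RuntimeError(
                f"File already exists in database: {filename} "
                f"(first registered by {self._files[filename]}, again by {owner})"
            )
        self._files[filename] = owner


def load(images):
    """Run the 'static initializers' of every image that embeds ORC."""
    registry = DescriptorRegistry()
    for image in images:
        registry.add("orc_proto.proto", image)
    return registry


# Only the shared library embeds ORC: registration happens once.
load(["libarrow.so"])

# The test executable also links liborc.a: the same file is registered twice.
try:
    load(["libarrow.so", "arrow-dataset-file-orc-test"])
    failed = False
except RuntimeError as error:
    failed = True
    print(error)
```

Under that assumption, dropping `orc::orc` from the test's link libraries leaves a single copy in the process, which is consistent with the change proposed in the diff.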

Labels
awaiting committer review Awaiting committer review CI: Extra Run extra CI Component: C++

2 participants