feat: Add JNI-based Hadoop FileSystem support for S3 and other Hadoop-compatible stores #1992
Conversation
Codecov Report
Attention: Patch coverage is

    @@             Coverage Diff              @@
    ##               main    #1992      +/-   ##
    ============================================
    + Coverage     56.12%   58.01%    +1.88%
    - Complexity      976     1152      +176
    ============================================
      Files           119      134       +15
      Lines        11743    13095     +1352
      Branches      2251     2432      +181
    ============================================
    + Hits          6591     7597     +1006
    - Misses        4012     4264      +252
    - Partials      1140     1234       +94

View full report in Codecov by Sentry.
Thanks @drexler-sky for this PR! I've been working on something similar, and there is overlap as well as some differences in our respective methods. In particular, I am looking at how to integrate with the `native_iceberg_compat` mode, as well as reusing the `SeekableInputStream` that `native_iceberg_compat` already has.
I'll put my code on GitHub in a day or so and we can discuss how to get to an implementation that works for both paths.
    @@ -292,4 +299,104 @@ public static native void currentColumnBatch(
       * @param handle
       */
      public static native void closeRecordBatchReader(long handle);
Can we move these to a new file, say `JniHDFSBridge`?
Yes, fixed.
native/core/Cargo.toml (Outdated)

    @@ -50,6 +50,7 @@ lazy_static = "1.4.0"
     prost = "0.13.5"
     jni = "0.21"
     snap = "1.1"
    +chrono = { version = "0.4", default-features = false, features = ["clock"] }
Maybe use the dependency defined in the main Cargo.toml (i.e. make this a workspace dependency)?
Fixed.
    /// # Returns
    /// Returns `Ok(usize)` with the file size in bytes on success, or an `ObjectStoreError`
    /// if the operation fails.
    pub fn call_get_length(
Just `get_length`, perhaps?
Changed to `get_length`.
    /// # Returns
    /// Returns `Ok(Vec<u8>)` containing the requested bytes on success, or an
    /// `ObjectStoreError` if the operation fails.
    pub fn call_read(
Just `read`, or maybe `read_range`?
Changed to `read`.
        &self,
        _prefix: Option<&Path>,
    ) -> BoxStream<'static, Result<ObjectMeta, ObjectStoreError>> {
        futures::stream::empty().boxed()
Did you intend to return an empty stream here?
Changed to `todo!()`.
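For illustration, here is a minimal sketch of a third option for a not-yet-supported `list`: return a stream that yields an explicit `NotImplemented` error, instead of an empty listing (which silently looks like an empty directory) or `todo!()` (which panics). The helper name is hypothetical and this is not the PR's actual code; it only assumes the `object_store` and `futures` crates already used in this file.

```rust
use futures::stream::{self, BoxStream, StreamExt};
use object_store::{path::Path, Error as ObjectStoreError, ObjectMeta};

// Hypothetical helper (not the PR's code): a listing stream that surfaces an
// explicit error rather than pretending the store is empty.
fn not_implemented_listing(
    _prefix: Option<&Path>,
) -> BoxStream<'static, Result<ObjectMeta, ObjectStoreError>> {
    stream::once(async { Err(ObjectStoreError::NotImplemented) }).boxed()
}
```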
    @@ -384,6 +392,17 @@ pub(crate) fn prepare_object_store_with_configs(

        let (object_store, object_store_path): (Box<dyn ObjectStore>, Path) = if scheme == "hdfs" {
            parse_hdfs_url(&url)
        } else if scheme == "s3" && use_jni {
Ah, this is a tricky bit. If `use_jni` is true we should be able to use any file system available to Hadoop. We really need to scan the config for all `spark.hadoop.fs.customfs.impl` entries and register the JNI-based object store for every `customfs`.
Thanks! I updated the code to scan for `spark.hadoop.fs.<scheme>.impl` configs and use `JniObjectStore` for any matching scheme when `use_jni` is true.
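For illustration, a minimal sketch of the kind of config scan described here, assuming a plain key/value view of the Spark configuration (the map and function names are hypothetical, not the PR's actual code):

```rust
use std::collections::{HashMap, HashSet};

/// Collect every scheme that has a `spark.hadoop.fs.<scheme>.impl` entry,
/// e.g. "spark.hadoop.fs.customfs.impl" -> "customfs".
fn hadoop_schemes(configs: &HashMap<String, String>) -> HashSet<String> {
    configs
        .keys()
        .filter_map(|key| {
            key.strip_prefix("spark.hadoop.fs.")
                .and_then(|rest| rest.strip_suffix(".impl"))
                // Skip nested keys such as "fs.s3a.aws.credentials.provider".
                .filter(|scheme| !scheme.contains('.'))
                .map(str::to_owned)
        })
        .collect()
}
```

Any scheme in the resulting set would then be routed to the JNI-backed object store when `use_jni` is true.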
    @@ -0,0 +1,332 @@
    // Licensed to the Apache Software Foundation (ASF) under one
Can we name this `jni_hdfs.rs`, since this is a JNI-based implementation of an HDFS object store?
Changed
Thanks for making the changes @drexler-sky. I did a more detailed pass and have some more comments.
    static JVM: OnceCell<JavaVM> = OnceCell::new();

    pub fn init_jvm(env: &JNIEnv) {
        let _ = JVM.set(env.get_java_vm().expect("Failed to get JavaVM"));
    }

    fn get_jni_env<'a>() -> jni::AttachGuard<'a> {
        JVM.get()
            .expect("JVM not initialized")
            .attach_current_thread()
            .expect("Failed to attach thread")
    }
We are already executing this code in a JVM, and the `JNIEnv` is available to us in `Java_org_apache_comet_Native_executePlan` (jni_api.rs). We could probably pass that in all the way here.
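Roughly, the suggestion could look like the sketch below: obtain the `JavaVM` from the `JNIEnv` the JVM already passes to the native entry point and hand it down to the object store, instead of initializing a global `OnceCell`. The entry-point name and struct are hypothetical; only `JNIEnv::get_java_vm` and `JavaVM::attach_current_thread` from the `jni` 0.21 crate are assumed.

```rust
use jni::objects::JClass;
use jni::{JNIEnv, JavaVM};

// Hypothetical store that owns the JavaVM handed to it by the caller.
struct JniHdfsStore {
    jvm: JavaVM,
}

// Hypothetical entry point (the real one is Java_org_apache_comet_Native_executePlan
// in jni_api.rs): the JNIEnv is already available here, so the JavaVM can be
// obtained once and threaded down to wherever the object store is constructed.
#[no_mangle]
pub extern "system" fn Java_org_apache_comet_Native_exampleEntryPoint(
    env: JNIEnv,
    _class: JClass,
) {
    let jvm = env.get_java_vm().expect("Failed to get JavaVM");
    let _store = JniHdfsStore { jvm };
    // Later, inside the store's read/length methods, the store would call
    // _store.jvm.attach_current_thread() to get a JNIEnv for that thread.
}
```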
        Ok(jmap)
    }

    pub fn jni_error(e: jni::errors::Error) -> ObjectStoreError {
`jvm_bridge/mod.rs` has a `jni_map_error` macro. It also has the very useful `jni_call` and `jni_static_call`. Perhaps we can use those and make the code consistent?
    let (object_store, object_store_path): (Box<dyn ObjectStore>, Path) = if scheme == "hdfs" {
        parse_hdfs_url(&url)
    } else if use_jni && hadoop_schemes.contains(scheme) {
I think this check should come first, even before the s3a scheme has been renamed to s3. Note that a user could override the implementation to be used for the `hdfs` scheme.
Either way, if the `use_jni` flag is set, we should use JNI if the scheme is `hdfs` or if the config specifies an implementation for the scheme.
Also, we can rename s3a to s3 only if `use_jni` is false; this way we won't be renaming s3a back and forth.
    test("Comet uses JNI object store when use_jni is true") {
      spark.conf.set("spark.comet.use_jni_object_store", "true")
      runParquetScanAndAssert()
Is there a way we can verify the jni implementation did get called?
@comphead @Kontinuation you might be interested in looking at this PR.
Thanks @drexler-sky for the PR.
Please help me to understand how this object store is different from https://github.com/apache/datafusion-comet/blob/main/native/hdfs/src/object_store/hdfs.rs ?
I took a look at the
BTW, is there any concern with enabling hdfs support by default and switching the default fs-hdfs dependency to
datafusion-comet/native/hdfs/Cargo.toml, lines 34 to 37 in d885f4a
Thanks @Kontinuation, I was about to create a PR to enable hdfs support by default. cc @kazuyukitanimura.
Regarding configuring the Hadoop client from the Rust side, I used the command line
So Rust was able to get the Hadoop client configurations. But if it could be improved, feel free to extend the
Thank you for the comments. That all makes sense to me. Here's the plan I propose:
While it's currently possible to set configurations via environment variables or spark.hadoop.*, I believe enabling a more explicit and programmatic approach will improve flexibility and user experience. Does this approach sound reasonable to everyone? If so, I'll start with step 1 and submit a PR to arrow-rs once it's ready.
I did not realize that
I have a couple of use cases in mind that I'm hoping this will cover.
I'm assuming that as long as we are using libhdfs, these cases are covered?
Thanks @drexler-sky, I think this approach makes a lot of sense. You probably also want to cover the case where multiple config sources are set (env var, Spark Hadoop config, programmatic): which one should override the others?
@parthchandra I'm not sure if with
without
Hope so, my understanding is
@comphead were you able to run this on a system with no hadoop/hdfs installed? I have an old version of hadoop on my mac and
Update on this:
With the above addressed, and with appropriate modifications to Comet to always use the hdfs object store, I was able to access an
I'll verify with a custom credentials provider as well. So it appears that using the
There are some problems with the approach of using fs-hdfs (libhdfs).

Problems

1. Linking against libjvm.so

As we discovered before, using fs-hdfs makes
Here are the problems with linking against libjvm.so:
Linux:
macOS:
We have to point

2. Problem with
I did not run into this issue when trying to use fs-hdfs with S3. I did have to add S3 to fs-hdfs (I'll submit a PR for this) and hit a few minor issues, but nothing major. I do recall we had the requirement to have libjvm in the library load path early on, but I don't recall what changed to remove the requirement.
With @drexler-sky and @Kontinuation's changes we might be in a better position to evaluate this approach in a production environment.
+1
I found another problem when setting up the GitHub Workflow for fs-hdfs:
The
UPDATE: The CI failure is not caused by incomplete reads, but by hardcoding the value of O_APPEND on a specific platform instead of defining it in a portable way. But my point still holds.
Yes, I saw this too. I will have a PR ready today.
Should we set this to draft as we discuss the design going forward?
Which issue does this PR close?
Closes #.

Rationale for this change
This PR adds support for Approach 2 (JNI-based Hadoop FileSystem access) to enable S3 reads via the native DataFusion Parquet scanner. The original discussion of both approaches can be found in issue #1766.

What changes are included in this PR?
- Added JNI-based integration for accessing the Hadoop FileSystem in Comet
- Introduced a new config flag, `spark.comet.use_jni_object_store`, to toggle this feature

How are these changes tested?
New test.