Credentials Vending #4415
Replies: 2 comments
-
To be more specific about the credentials loading part: you can pass in a custom credentials provider in Rust by implementing one. The real problem is how to load such a provider from configuration. The only way I know of is to export a constructor from the custom provider:

    #[no_mangle]
    pub extern "C" fn make_provider() -> *mut dyn ProvideCredentials { ... }

and then in object_store, do something like:

    use libloading::{Library, Symbol};
    use aws_types::credentials::ProvideCredentials;

    // Loading the library and calling into it are both unsafe operations.
    unsafe {
        let lib = Library::new("/path/to/libmyprovider.so")?;
        // The symbol type must use the same "C" ABI as the exported function.
        // Note: `*mut dyn Trait` is a fat pointer and is not FFI-safe.
        let constructor: Symbol<unsafe extern "C" fn() -> *mut dyn ProvideCredentials> =
            lib.get(b"make_provider")?;
        let raw_ptr = constructor();
        let provider: Box<dyn ProvideCredentials> = Box::from_raw(raw_ptr);
        // use provider (it must not outlive `lib`)
    }

But that seems quite unsafe and cumbersome. Not sure if there is any better way.
-
One alternative is to do it in Lance, above the Arrow object_store level. In the Lance ObjectStore, we add a specific code path for credentials vending that tracks the expiration time of the credentials passed in. When they expire, we re-initialize a new underlying object_store. This seems to be a much better approach than doing it the Iceberg way: we would be able to consistently support credentials vending for any storage backend and across all languages, as long as we make it work in the Rust core.
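A minimal sketch of that idea, using only the standard library. Everything here is hypothetical and for illustration: `VendedCredentials`, `InnerStore` (standing in for the underlying object_store), `RefreshingStore`, the `expires_at` field, and the `vend` callback (standing in for a call back to the namespace) are not existing Lance or object_store APIs.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the temporary credentials a namespace vends.
#[derive(Clone, Debug, PartialEq)]
pub struct VendedCredentials {
    pub access_key_id: String,
    pub secret_access_key: String,
    pub session_token: String,
    /// Assumption: the namespace also reports when the credentials expire.
    pub expires_at: Instant,
}

/// Hypothetical stand-in for the underlying object_store instance.
#[derive(Debug)]
pub struct InnerStore {
    pub credentials: VendedCredentials,
}

/// Wrapper that rebuilds the inner store when its credentials are about to expire.
pub struct RefreshingStore<F: FnMut() -> VendedCredentials> {
    inner: InnerStore,
    /// Callback that fetches a fresh set of credentials.
    vend: F,
    /// Refresh this long before the actual expiry to avoid mid-request failures.
    margin: Duration,
    /// Number of times the inner store has been re-initialized.
    pub rebuilds: usize,
}

impl<F: FnMut() -> VendedCredentials> RefreshingStore<F> {
    pub fn new(mut vend: F, margin: Duration) -> Self {
        let inner = InnerStore { credentials: vend() };
        Self { inner, vend, margin, rebuilds: 0 }
    }

    /// Returns a store whose credentials are still valid at `now`,
    /// re-initializing the inner store first if they are within `margin` of expiry.
    pub fn store(&mut self, now: Instant) -> &InnerStore {
        if now + self.margin >= self.inner.credentials.expires_at {
            self.inner = InnerStore { credentials: (self.vend)() };
            self.rebuilds += 1;
        }
        &self.inner
    }
}
```

Every operation would go through `store(...)` rather than holding on to the inner store, so a long-running operation picks up fresh credentials on its next request. The clock is passed in as a parameter only to keep the sketch easy to test.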
-
As we have now also added Lance namespace as another layer on top of the table and file format, features like credentials vending come into discussion.
The expectation is that users do not have real access to the underlying object store. Instead, when they call DescribeTable against a namespace, the namespace vends temporary credentials that can be used for subsequent operations.
The initial step is pretty straightforward: you just pass in the credentials from DescribeTableResponse as access_key_id, secret_access_key, and session_token to initialize the dataset.
The challenge comes when the credentials expire in the middle of an operation (these temporary credentials are typically very short-lived), and a callback is needed to get a new set of credentials to keep the operation going.
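To make the initial step concrete, here is a minimal sketch, assuming the dataset is initialized from a string-keyed options map. The `VendedCredentials` struct and the `storage_options` helper are hypothetical names for illustration; only the three key names come from the description above.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the credential fields of DescribeTableResponse.
pub struct VendedCredentials {
    pub access_key_id: String,
    pub secret_access_key: String,
    pub session_token: String,
}

/// Builds the options used to initialize the dataset once.
/// The map is static: nothing in it can refresh the values after they expire.
pub fn storage_options(creds: &VendedCredentials) -> HashMap<String, String> {
    HashMap::from([
        ("access_key_id".to_string(), creds.access_key_id.clone()),
        ("secret_access_key".to_string(), creds.secret_access_key.clone()),
        ("session_token".to_string(), creds.session_token.clone()),
    ])
}
```

Because the options are copied in once at initialization, an operation that outlives the credentials has no way to pick up new ones, which is why a refresh callback is needed.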
In systems like Iceberg, this was done by plugging in a custom credentials provider, but in Lance there are a few challenges.
Ref: current list of AWS configurations in object_store https://github.com/apache/arrow-rs-object-store/blob/main/src/aws/builder.rs#L398-L441
Note: ideally we will do this for AWS, GCS and Azure, but starting with AWS is probably the easiest, and we can apply the same strategy to the other clouds once we figure out a solution.
cc @wjones127 @Xuanwo @bryanck