
datasets.get() returns a DatasetVersion with an incorrect blob reference: the dataset name in Foundry does not match the blob name in storage #41960

Open
@m-gheini

  • Package Name: azure-ai-projects
  • Package Version: azure-ai-projects==1.0.0b11
  • Operating System: Windows
  • Python Version: 3.12.10

Describe the bug
I am trying to retrieve a dataset uploaded to Foundry and access its content using the datasets.get() and datasets.get_credentials() operations. I used the dataset name and version specified during upload. datasets.get() returns a DatasetVersion object with a dataUri pointing to a blob storage location, rather than the dataset itself. I then tried to use the sasUri from datasets.get_credentials() to fetch the blob from storage. The main issue is that the name in the DatasetVersion object does not match the blob name in the storage container (the storage instance uses a different name than the one shown in Foundry).

This mismatch makes it impossible to retrieve the dataset directly using the get function.
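
To make the mismatch concrete, the blob path embedded in a data URI can be compared with the dataset name. The following is a minimal sketch using only the standard library. The helper name `split_blob_url` and the example values are hypothetical, and it assumes the URI follows the usual `https://<account>.blob.core.windows.net/<container>/<blob>` layout.

```python
from urllib.parse import unquote, urlsplit


def split_blob_url(url: str) -> tuple[str, str]:
    """Return (container, blob_name) from a blob URL, ignoring any query string."""
    path = urlsplit(url).path.lstrip("/")
    container, _, blob_name = path.partition("/")
    return container, unquote(blob_name)


# Hypothetical values for illustration only; not taken from the report.
data_uri = "https://example.blob.core.windows.net/my-container/some-folder/data.csv"
dataset_name = "my-dataset"

container, blob_name = split_blob_url(data_uri)
print(f"container={container!r}, blob={blob_name!r}")
print(f"blob name equals dataset name: {blob_name == dataset_name}")
```

If the two names differ, a request built from the dataset name alone would be expected to target a blob that does not exist.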

To Reproduce
Steps to reproduce the behavior:

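    from azure.storage.blob.aio import BlobClient

    # Upload the file as a new dataset version.
    dataset_version = "1.0"
    data = await project_client.datasets.upload_file(
        name=dataset_name,
        version=dataset_version,
        file_path=<path_to_file>,
    )

    # Retrieve the dataset version and its storage credentials.
    dataset = await project_client.datasets.get(name=dataset_name, version=dataset_version)

    dataset_credential = await project_client.datasets.get_credentials(
        name=dataset_name,
        version=dataset_version,
    )

    # Assumption: the original snippet did not show where sas_uri comes from;
    # this attribute path on the credential response is a best guess.
    sas_uri = dataset_credential.blob_reference.credential.sas_uri

    # Build the blob URL from the SAS URI and the dataset name.
    blob_uri, sas_token = sas_uri.split("?")
    blob_url = f"{blob_uri}/{dataset.name}?{sas_token}"
    print(f"Downloading data from: {blob_url}")
    blob_client = BlobClient.from_blob_url(blob_url)
    print(f"Blob client created for: {blob_client}")

    # Raises ResourceNotFoundError: no blob with this name exists in the container.
    stream = await blob_client.download_blob()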
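
The string concatenation in the reproduction can also be sketched as a small helper that percent-encodes the blob name and preserves the SAS query string. This is only an illustration under stated assumptions: `build_blob_url` is a hypothetical helper, the SAS URI is assumed to point at a container, and the example values are made up. It does not resolve the name mismatch itself.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def build_blob_url(container_sas_uri: str, blob_name: str) -> str:
    """Append a blob name to a container-level SAS URI, keeping the SAS query string."""
    parts = urlsplit(container_sas_uri)
    # Percent-encode the blob name but keep "/" so virtual folders survive.
    blob_path = parts.path.rstrip("/") + "/" + quote(blob_name, safe="/")
    return urlunsplit((parts.scheme, parts.netloc, blob_path, parts.query, ""))


# Hypothetical SAS URI and blob name for illustration only.
sas_uri = "https://example.blob.core.windows.net/my-container?sv=2024-01-01&sig=abc"
blob_url = build_blob_url(sas_uri, "folder/my data.csv")
print(blob_url)
```

The resulting URL could then be passed to `BlobClient.from_blob_url`, as in the reproduction steps.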

Expected behavior

  1. datasets.get() returns the dataset itself rather than a DatasetVersion.
  2. The blob can be downloaded from storage.

Error
azure.core.exceptions.ResourceNotFoundError: The specified blob does not exist.


Labels: AI, AI Projects, Service Attention, bug, customer-reported, needs-team-attention
