feat: support azure blob storage #1242

Open
wants to merge 1 commit into
base: main

wcy-fdu (Contributor) commented Apr 24, 2025

Which issue does this PR close?

  • Closes #.

What changes are included in this PR?

This PR adds support for Azure Blob Storage as a storage backend, similar to the earlier PR that added GCS support. Authentication uses account_name, account_key, and an endpoint URL. Connectivity and correctness have already been verified on the RisingWave side, where both reads and writes against Azure Blob work.
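
As a rough illustration of the three settings named above, here is a minimal sketch of collecting them from a property map. The property keys (`azblob.account-name`, `azblob.account-key`, `azblob.endpoint`), the `AzblobConfig` struct, and the `azblob_config_from_props` function are all hypothetical names chosen for this example; the actual keys and types used by the PR may differ.

```rust
use std::collections::HashMap;

// Hypothetical property keys: the description names the three settings,
// not the exact keys, so these are assumptions for illustration only.
const ACCOUNT_NAME: &str = "azblob.account-name";
const ACCOUNT_KEY: &str = "azblob.account-key";
const ENDPOINT: &str = "azblob.endpoint";

// Debug is deliberately not derived so the account key is not printed by accident.
struct AzblobConfig {
    account_name: String,
    account_key: String,
    endpoint: String,
}

/// Build the config from a property map, returning an error that names
/// the first required key that is missing.
fn azblob_config_from_props(props: &HashMap<String, String>) -> Result<AzblobConfig, String> {
    let get = |key: &str| {
        props
            .get(key)
            .cloned()
            .ok_or_else(|| format!("missing required property: {key}"))
    };
    Ok(AzblobConfig {
        account_name: get(ACCOUNT_NAME)?,
        account_key: get(ACCOUNT_KEY)?,
        endpoint: get(ENDPOINT)?,
    })
}
```

Failing early on a missing key keeps a misconfigured warehouse from surfacing later as an opaque authentication error.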

Are these changes tested?

Xuanwo (Member) commented Apr 27, 2025

Azblob and azdls are different storage services, but most Iceberg implementations seem to treat them as the same.

CC @Fokko, what are your thoughts? Would it be better to add native azblob support, or should we just add azdls?

corleyma commented

I think which blob storage to use in Azure should be a choice for the folks deploying the warehouse and not something that needs to be decided by iceberg sdks -- in other words, why not both? But azdls is definitely recommended for this kind of workload.

Xuanwo (Member) commented Apr 29, 2025

> I think which blob storage to use in Azure should be a choice for the folks deploying the warehouse and not something that needs to be decided by iceberg sdks -- in other words, why not both? But azdls is definitely recommended for this kind of workload.

Hi, I agree with your comments. The issue I'm trying to resolve is which service's API specification we should use: azblob or azdls.

From the Java code, it seems we should talk to Azure through azdls instead:

https://github.com/apache/iceberg/blob/829ae7a11dc1eb62246c801ce1c7e501356c5463/azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSLocation.java#L39C1-L44C29

 * For compatibility, locations using the wasb scheme are also accepted but will use the Azure Data
 * Lake Storage Gen2 REST APIs instead of the Blob Storage REST APIs.
 *
 * <p>See <a
 * href="https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction-abfs-uri#uri-syntax">Azure
 * Data Lake Storage URI</a>
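
To make the scheme question concrete, here is a minimal sketch of a location parser that accepts both the `abfs[s]` and `wasb[s]` schemes and splits a location into container, host, and path, so that a single backend could serve either scheme. The `AzureLocation` struct and `parse_azure_location` function are hypothetical names for this example, and treating the container as optional is an assumption; this is not the Java implementation or the code in this PR.

```rust
#[derive(Debug, PartialEq)]
struct AzureLocation {
    scheme: String,
    container: Option<String>,
    host: String,
    path: String,
}

/// Parse `scheme://[container@]host/path` for the abfs, abfss, wasb, and wasbs schemes.
fn parse_azure_location(location: &str) -> Result<AzureLocation, String> {
    let (scheme, rest) = location
        .split_once("://")
        .ok_or_else(|| format!("missing scheme in location: {location}"))?;
    if !matches!(scheme, "abfs" | "abfss" | "wasb" | "wasbs") {
        return Err(format!("unsupported scheme: {scheme}"));
    }
    // Everything before the first '/' is the authority; the remainder is the object path.
    let (authority, path) = rest.split_once('/').unwrap_or((rest, ""));
    // The container (file system) precedes '@'; it is treated as optional here.
    let (container, host) = match authority.split_once('@') {
        Some((c, h)) => (Some(c.to_string()), h),
        None => (None, authority),
    };
    if host.is_empty() {
        return Err(format!("missing storage account host in location: {location}"));
    }
    Ok(AzureLocation {
        scheme: scheme.to_string(),
        container,
        host: host.to_string(),
        path: path.to_string(),
    })
}
```

With a parser like this, the choice of REST API (Blob Storage or Data Lake Storage Gen2) can be made independently of the scheme that appears in table locations.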

christophediprima commented May 12, 2025

Hi, we are also working with Iceberg and Azure, and we can't really use this because the only schemes our currently supported catalogs handle are wasb and abfs.
