Azure Batch worker pool supports managed identity #5670

adamrtalbot (Collaborator) commented Jan 14, 2025

This feature adds an option to tell Azure Batch the worker pool has a managed identity attached.

azcopy needs to be updated, since the current version isn't recent enough.

When using this feature, you can tell Nextflow that a node pool has a managed identity attached:

azure {

    storage {
        accountName = 'seqeralabs'
    }

    batch {
        location = 'eastus'
        accountName = 'seqeralabs'
        copyToolInstallMode = "off"
        pools {
            "hello-world-entra-mi" {
                vmType = "standard_e2ds_v5"
                managedIdentityId = "myManagedIdentityId"
            }

        }
    }
}

Nextflow will then use the native azcopy managed identity authentication to download and upload files.
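A minimal sketch of what managed identity authentication with azcopy can look like on a node, assuming azcopy v10 is on the PATH and the pool has the identity attached. It relies on the two environment variables this PR sets for the task (`AZCOPY_AUTO_LOGIN_TYPE` and `AZCOPY_MSI_CLIENT_ID`); the `az_stage` helper is hypothetical and is not how Nextflow itself wraps azcopy.

```shell
#!/bin/sh
# Sketch only: copy a blob with azcopy using a user-assigned managed identity.

# Ask azcopy to log in with the managed identity instead of a SAS token.
export AZCOPY_AUTO_LOGIN_TYPE=MSI
# Client ID of the user-assigned identity (placeholder value from the config).
export AZCOPY_MSI_CLIENT_ID="myManagedIdentityId"

# az_stage SOURCE TARGET (hypothetical helper)
# Works in both directions: blob URL to local path, or local path to blob URL.
# The URL carries no SAS query string because azcopy authenticates by itself.
az_stage() {
    azcopy copy "$1" "$2"
}
```

On a node this might be called as `az_stage "https://seqeralabs.blob.core.windows.net/my-container/input.txt" ./input.txt`, where `my-container` and the file name are placeholders.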

I haven't handled all the old code yet, so Nextflow still generates a SAS token, which it uses only in some code paths.

Importantly, this 'fixes' #5669, because it doesn't need a SAS token to access the files (tested).

To use:

  1. Create a node pool in Azure Batch and attach a user-assigned managed identity. Follow the instructions here or try this if you want to use Terraform
  2. Change the start task to have the following settings to update azcopy:
    • Command line: bash -c "tar -xzvf azcopy.tar.gz && chmod +x azcopy*/azcopy && mkdir -p $AZ_BATCH_NODE_SHARED_DIR/bin/ && cp azcopy*/azcopy $AZ_BATCH_NODE_SHARED_DIR/bin/"
    • Resource file URL: https://aka.ms/downloadazcopy-v10-linux, path azcopy.tar.gz
  3. Retrieve the managed identity client ID
  4. Use the following Nextflow config, where the pool name is my-pool:
process.queue = 'my-pool'
azure {

    storage {
        accountName = 'seqeralabs'
    }

    batch {
        location = 'eastus'
        accountName = 'seqeralabs'
        copyToolInstallMode = "off"
        pools {
            "my-pool" {
                managedIdentityId = "myManagedIdentityId"
            }

        }
    }
}
  5. Launch the pipeline and 🤞
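Steps 2 and 3 above can be sketched as two small shell functions. `install_azcopy` mirrors the start task command line and is meant to run on the node; `identity_client_id` is one possible way to look up the client ID from a workstation with the Azure CLI. Both function names, and the identity and resource group names in the usage note, are hypothetical.

```shell
#!/bin/sh
# Sketch only: the start task steps and the client ID lookup as functions.

# install_azcopy [ARCHIVE]
# Unpacks the azcopy tarball in the current directory and copies the binary
# into the shared bin directory, as the start task command line does.
# The parentheses run the body in a subshell, so variables stay local.
install_azcopy() (
    archive="${1:-azcopy.tar.gz}"
    # AZ_BATCH_NODE_SHARED_DIR is provided by Azure Batch on the node.
    bin_dir="${AZ_BATCH_NODE_SHARED_DIR:?AZ_BATCH_NODE_SHARED_DIR is not set}/bin"
    tar -xzf "$archive" &&
        chmod +x azcopy*/azcopy &&
        mkdir -p "$bin_dir" &&
        cp azcopy*/azcopy "$bin_dir/"
)

# identity_client_id IDENTITY_NAME RESOURCE_GROUP
# Prints the client ID of a user-assigned managed identity (needs the az CLI).
identity_client_id() {
    az identity show --name "$1" --resource-group "$2" \
        --query clientId --output tsv
}
```

For example, `identity_client_id my-identity my-resource-group` would print the value to use for `managedIdentityId`, assuming an identity and resource group with those names exist.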

To do:

  • See if we can apply the managed ID to pools created by Nextflow. This might need an update from Microsoft
  • Documentation
  • Remove creation of SAS when using managed identity to make everything more secure and allow this to work if we switch off SAS tokens

This feature adds an option to tell Azure Batch the worker pool has a managed identity attached.

Currently, the only feature is that it passes the managed identity client ID to the task as an environment variable, but this could be extended to support staging files in or out.

Signed-off-by: adamrtalbot <[email protected]>

netlify bot commented Jan 14, 2025

Deploy Preview for nextflow-docs-staging ready!

Name Link
🔨 Latest commit 29524a4
🔍 Latest deploy log https://app.netlify.com/sites/nextflow-docs-staging/deploys/67f92d58ef25550008426ffa
😎 Deploy Preview https://deploy-preview-5670--nextflow-docs-staging.netlify.app

Signed-off-by: adamrtalbot <[email protected]>
File staging no longer requires SAS tokens; it can use the managed identity to authenticate.

Signed-off-by: adamrtalbot <[email protected]>
@adamrtalbot adamrtalbot requested a review from a team as a code owner January 14, 2025 18:18
@Takadonet

Hello,
This new feature will be very helpful for our project, since it lets us avoid long-lived SAS tokens.
Is there a timeline for it, or can we provide some assistance?

@pditommaso pditommaso force-pushed the master branch 2 times, most recently from 5a93547 to 27345a6 Compare February 10, 2025 21:46
@adamrtalbot adamrtalbot changed the title feat: Azure Batch worker pool supports managed identity [draft] Azure Batch worker pool supports managed identity Feb 17, 2025
Comment on lines 500 to 503
// This is a bad solution and breaks Fusion for everyone
if (!(pool.opts.managedIdentityId && it.key == "AZURE_STORAGE_SAS_TOKEN")) {
    opts += "-e $it.key=$it.value "
}
Collaborator Author

This breaks Fusion for everyone who isn't using managed identities. Ideally we would take the pool.opts when creating the launcher, but I've run out of time.

Member

What's the rationale? To remove AZURE_STORAGE_SAS_TOKEN when pool.opts.managedIdentityId is provided?

Collaborator Author

We were having authentication problems with Fusion using the SAS token instead of the identity; they might be resolved now.

Ideally, we would not generate a SAS token if we already have a managed identity to reduce our security surface further.

Collaborator Author

@bentsherman and anyone who can help, I've spent ages trying to work out how to get the poolOpts into AzFusionEnv to disable creating a SAS token here, but I can't see a good way of doing it without doing a full Azure specific implementation of TaskBean, FusionScriptLauncher, etc. Is there a better way?

I wonder if I can create a class that is task + pool, kinda like the AzJobKey then use that in the Fusion env builder.

Anyway, all of this to say a proper solution for Fusion would require a bit of a refactor top-to-bottom and I don't know how far I should go with this before punting off to actual good developers 😆

Using azcopy is fine, it's basically ready to go.
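The filter discussed in this thread can be illustrated with a small shell sketch. The PR implements it in Groovy; this version only reproduces the same rule for illustration: build the `-e KEY=VALUE ` option string and skip `AZURE_STORAGE_SAS_TOKEN` when a managed identity ID is set. The function name `container_env_opts` and the `EXAMPLE_VAR` variable are hypothetical.

```shell
#!/bin/sh
# Sketch only: the SAS token filter from the review thread, redone in shell.

# container_env_opts MANAGED_IDENTITY_ID KEY=VALUE...
# Prints "-e KEY=VALUE " for every pair, except that AZURE_STORAGE_SAS_TOKEN
# is dropped when MANAGED_IDENTITY_ID is non-empty.
# The parentheses run the body in a subshell, so variables stay local.
container_env_opts() (
    managed_identity_id="$1"
    shift
    opts=""
    for pair in "$@"; do
        key="${pair%%=*}"   # text before the first "="
        if [ -n "$managed_identity_id" ] && [ "$key" = "AZURE_STORAGE_SAS_TOKEN" ]; then
            continue        # identity present: do not forward the SAS token
        fi
        opts="${opts}-e ${pair} "
    done
    printf '%s\n' "$opts"
)
```

Calling `container_env_opts "myManagedIdentityId" EXAMPLE_VAR=1 AZURE_STORAGE_SAS_TOKEN=secret` keeps only `EXAMPLE_VAR`, while passing an empty first argument forwards both variables.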

Comment on lines +529 to +530
env.put('AZCOPY_AUTO_LOGIN_TYPE', 'MSI') // azcopy
env.put('AZCOPY_MSI_CLIENT_ID', pool.opts.managedIdentityId) // azcopy
Member

These look unrelated

Collaborator Author

This is what makes azcopy work.

The first iteration of this PR did not include Fusion.

Member

This should not be affected

Builds out tests for the new Fusion functionality for Azure Batch, including managed identity support. This required refactoring AzBatchService to use the new createBatchTaskContent method for more precise testing, which should be generally helpful.

Signed-off-by: adamrtalbot <[email protected]>
@adamrtalbot adamrtalbot changed the title [draft] Azure Batch worker pool supports managed identity Azure Batch worker pool supports managed identity Apr 10, 2025
@adamrtalbot
Collaborator Author

Superseded by #6118

@pditommaso
Member

Closing in favour of #6118

@pditommaso pditommaso closed this May 26, 2025