Azure docs rewrite #6087

Open: adamrtalbot wants to merge 40 commits into master

adamrtalbot (Collaborator) commented May 15, 2025:

It started as a quick update to the docs and turned into a full re-write.

Changes:

  • Restructures Azure docs to be more logical
    • authentication
    • storage
    • compute
    • advanced
  • Removes redundancy and clarifies some bits
  • Simplifies where possible
  • Guides user from A to Z in a more straightforward manner

The Azure environment variables were not documented. This PR adds them to the documentation pages.

@adamrtalbot requested a review from a team as a code owner May 15, 2025 16:48
@adamrtalbot changed the title from "azure docs rewrite" to "Azure docs rewrite" May 15, 2025

netlify bot commented May 15, 2025

Deploy Preview for nextflow-docs-staging ready!

Name Link
🔨 Latest commit 845423a
🔍 Latest deploy log https://app.netlify.com/projects/nextflow-docs-staging/deploys/6840717d5af8cb0008fb2691
😎 Deploy Preview https://deploy-preview-6087--nextflow-docs-staging.netlify.app

Signed-off-by: adamrtalbot <[email protected]>
…orks and move some stuff around for clarity

Signed-off-by: adamrtalbot <[email protected]>
@nextflow-io nextflow-io deleted a comment from adamrtalbot May 16, 2025

christopher-hakkaart (Collaborator) left a comment:

The number of suggestions looks much worse than it is. I try to put suggestions on separate lines so you can screen them and accept or reject each one rather than edit the suggestion.

adamrtalbot and others added 12 commits May 16, 2025 15:24
Co-authored-by: Chris Hakkaart <[email protected]>
Signed-off-by: Adam Talbot <[email protected]>
Signed-off-by: Christopher Hakkaart <[email protected]>

adamrtalbot (Collaborator, Author) commented:

@christopher-hakkaart this is ready for a re-review.

Signed-off-by: Christopher Hakkaart <[email protected]>

christopher-hakkaart (Collaborator) commented Jun 3, 2025:

@adamrtalbot I pushed a bunch of changes rather than loading the PR with suggestions again.

Noteworthy changes:

  • Made passive sentences active by moving the second half of the sentences with a comma to the front.
  • Aligned punctuation and style for lists with and without colons.
  • Some minor voice and tense alignments (e.g., removing "will" from sentences).
  • Moved all H4 headings (####) up to H3 (###) because H4 is too deep in the navigation.
  • Corrected capitalization of Azure-related words.
  • Minor punctuation fixes.

Feel free to revert anything you disagree with.

I'll give it a quick look again tomorrow with fresh eyes and make any final suggestions. Overall, it's looking really good.

docs/azure.md Outdated
Comment on lines 23 to 74
1. Create an Azure Batch account in the Azure portal.
2. Increase the quotas in your Azure Batch account to the pipeline's needs.

:::{note}
Quotas impact the number of pools, CPUs, and jobs you can create.
:::

3. Create a storage account and Azure Blob Storage in the same region as the Batch account.
4. Configure Nextflow to submit processes to Azure Batch using configuration. For example:

```groovy
process {
    executor = 'azurebatch'
}

azure {
    storage {
        accountName = '<STORAGE_ACCOUNT_NAME>'
    }
    batch {
        location = '<LOCATION>'
        accountName = '<BATCH_ACCOUNT_NAME>'
        autoPoolMode = true
        allowPoolCreation = true
    }
}
```

Replace the following:

- `STORAGE_ACCOUNT_NAME`: your storage account name
- `LOCATION`: your Azure region
- `BATCH_ACCOUNT_NAME`: your Batch account name

:::{note}
The above snippet excludes authentication for Azure services. See [Authentication](#authentication) for more information.
:::

5. Launch your pipeline with the above configuration and add a working directory on Azure Blob Storage:

```bash
nextflow run <PIPELINE_NAME> -w az://<BLOB_STORAGE>/
```

Replace the following:

- `PIPELINE_NAME`: your pipeline, for example, `nextflow-io/rnaseq-nf`
- `BLOB_STORAGE`: your Azure Blob Storage from the storage account defined in your configuration

:::{tip}
You can list Azure regions with: `az account list-locations -o table`
:::

adamrtalbot (Collaborator, Author) commented:

This is a bit messy and hard to follow:

[image]

Perhaps we should move the comments and placeholder comments out of the numbered points to keep the flow simple.

adamrtalbot (Collaborator, Author) commented:

Suggested change
1. Create an Azure Batch account in the Azure portal.
2. Increase the quotas in your Azure Batch account to the pipeline's needs.
3. Create a storage account and Azure Blob Storage in the same region as the Batch account.
4. Configure Nextflow to submit processes to Azure Batch using configuration. For example:
```groovy
process {
    executor = 'azurebatch'
}
azure {
    storage {
        accountName = '<STORAGE_ACCOUNT_NAME>'
    }
    batch {
        location = '<LOCATION>'
        accountName = '<BATCH_ACCOUNT_NAME>'
        autoPoolMode = true
        allowPoolCreation = true
    }
}
```
5. Launch your pipeline with the above configuration and add a working directory on Azure Blob Storage:
```bash
nextflow run <PIPELINE_NAME> -w az://<BLOB_STORAGE>/
```
:::{note}
The above snippet excludes authentication for Azure services. See [Authentication](#authentication) for more information.
:::
Replace the following:
- `STORAGE_ACCOUNT_NAME`: your storage account name
- `LOCATION`: your Azure region
- `BATCH_ACCOUNT_NAME`: your Batch account name
- `PIPELINE_NAME`: your pipeline, for example, `nextflow-io/rnaseq-nf`
- `BLOB_STORAGE`: your Azure Blob Storage from the storage account defined in your configuration
:::{note}
Quotas impact the number of pools, CPUs, and jobs you can create.
:::
:::{tip}
You can list Azure regions with: `az account list-locations -o table`
:::

Maybe?

Collaborator commented:

I prefer to have these attached to the relevant step. For a user who may follow the steps one at a time, it doesn't make sense to have them at the bottom. E.g., quota considerations belong with the step that sets the quota, not after the whole thing is set up.

We could try and wrap it into the same point to avoid the note:

Submit a request through the Azure Portal's Quotas page to increase your Azure Batch account limits on pools, CPUs, and jobs based on your pipeline's resource needs.

adamrtalbot (Collaborator, Author) commented:

I've tried to simplify it with c422f6a.

  • Moved notes to within the bullet points (if you have something to say, say it properly)
  • Removed config items and referred to the authentication section

Comment on lines +182 to +185
Nextflow uses the following environment variable if the managed identity client ID is not provided in the Nextflow configuration file:

- `AZURE_MANAGED_IDENTITY_USER`: The client ID for a user-assigned managed identity.
:::
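
For context, the configuration-file setting that the note above contrasts with might look like the following sketch. It assumes the `azure.managedIdentity.clientId` option from the Nextflow Azure configuration scope, which is not shown in the quoted lines, and `<USER_ASSIGNED_MANAGED_IDENTITY_CLIENT_ID>` is a placeholder rather than a real value.

```groovy
azure {
    managedIdentity {
        // Placeholder (assumption): client ID of the user-assigned managed identity.
        // Per the quoted note, if this is not set, Nextflow uses the
        // AZURE_MANAGED_IDENTITY_USER environment variable instead.
        clientId = '<USER_ASSIGNED_MANAGED_IDENTITY_CLIENT_ID>'
    }
}
```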

adamrtalbot (Collaborator, Author) commented:

The language is a bit weird here.

adamrtalbot (Collaborator, Author) commented:

@christopher-hakkaart all looks good - I'm never sure about punctuation for what is an Azure product and what isn't.

If we aren't using 4th-level headings, we should remove "Pool management" and just have a few other 3rd-level headings. I've left a comment.

justinegeffen and others added 6 commits June 3, 2025 22:14
Co-authored-by: Adam Talbot <[email protected]>
Signed-off-by: Justine Geffen <[email protected]>
Yup - I missed that! Thanks!

Co-authored-by: Adam Talbot <[email protected]>
Signed-off-by: Chris Hakkaart <[email protected]>
Good point

Co-authored-by: Adam Talbot <[email protected]>
Signed-off-by: Chris Hakkaart <[email protected]>

christopher-hakkaart (Collaborator) commented:

Thanks @adamrtalbot - agree with your point about headings. I've accepted your suggestions.

I've also asked @justinegeffen to look at this PR as well. As I made so many changes/suggestions it would be good to get a second review from the style side. We don't want to slow this PR down, so if the review is taking too long, we can look at merging this now, and then making a second PR to fix any style-related issues I introduced.

Regarding the naming, I never know what to do either. I took notes on everything I wasn't sure about and checked as I went. I'll write up a cheat sheet for future reference.

2. Increase the quotas in your Azure Batch account to the pipeline's needs. Quotas impact the number of pools, CPUs, and jobs you can create.
3. Create a storage account and Azure Blob Storage in the same region as the Batch account.
4. Add authentication and account details to your Nextflow configuration file. See [Authentication](#authentication) for examples of configuration.
5. Configure Nextflow to submit processes to Azure Batch by setting the process.executor directive to `azurebatch`.

Collaborator commented:

Suggested change
5. Configure Nextflow to submit processes to Azure Batch by setting the `process.executor` directive to `azurebatch`.

To run pipelines with Azure Batch:

1. Create an Azure Batch account in the Azure portal.
2. Increase the quotas in your Azure Batch account to the pipeline's needs. Quotas impact the number of pools, CPUs, and jobs you can create.

Collaborator commented:

Suggested change
2. Increase the quotas in your Azure Batch account to support the pipeline's needs. Quotas impact the number of pools, CPUs, and jobs you can create.

Collaborator commented:

@adamrtalbot, does this suggestion make sense in the context?

A system-assigned identity is tied to a specific Azure resource. To use it:

1. Enable [system-assigned managed identity](https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/how-to-configure-managed-identities?pivots=qs-configure-portal-windows-vm#system-assigned-managed-identity) on your Azure resource.
2. Configure the required role assignments. See [Required Roles](#required-roles) for more information.

Collaborator commented:

Suggested change
2. Configure the required role assignments. See [Required roles](#required-roles) for more information.
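
The quoted steps stop at role assignment. As a rough sketch of the configuration that would typically follow, assuming the `azure.managedIdentity.system` option (not shown in the quoted lines) and the same placeholder account names used earlier in this thread:

```groovy
azure {
    managedIdentity {
        // Assumption: tells Nextflow to authenticate with the system-assigned
        // managed identity of the Azure resource it runs on.
        system = true
    }
    storage {
        // Placeholder: replace with your storage account name.
        accountName = '<STORAGE_ACCOUNT_NAME>'
    }
    batch {
        // Placeholders: replace with your Azure region and Batch account name.
        location = '<LOCATION>'
        accountName = '<BATCH_ACCOUNT_NAME>'
    }
}
```

No client ID appears here because a system-assigned identity is tied to the resource itself, as the quoted text states.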

Signed-off-by: Justine Geffen <[email protected]>

3 participants