Azure docs rewrite #6087
base: master
The Azure environment variables were not documented. This PR adds them to the documentation pages. Signed-off-by: adamrtalbot <[email protected]>
Changes:
- Restructures Azure docs to be more logical
  - authentication
  - storage
  - compute
  - advanced
- Removes redundancy and clarifies some bits
- Simplifies where possible
- Guides user from A to Z in a more strict manner

Signed-off-by: adamrtalbot <[email protected]>
✅ Deploy Preview for nextflow-docs-staging ready!
Signed-off-by: adamrtalbot <[email protected]>
Signed-off-by: adamrtalbot <[email protected]>
Signed-off-by: adamrtalbot <[email protected]>
Signed-off-by: adamrtalbot <[email protected]>
…orks and move some stuff around for clarity Signed-off-by: adamrtalbot <[email protected]>
Signed-off-by: adamrtalbot <[email protected]>
Signed-off-by: adamrtalbot <[email protected]>
The number of suggestions looks much worse than it is. I try to put suggestions on separate lines so you can screen them and accept or reject specific comments rather than edit the suggestion.
Co-authored-by: Chris Hakkaart <[email protected]> Signed-off-by: Adam Talbot <[email protected]>
Co-authored-by: Chris Hakkaart <[email protected]> Signed-off-by: Adam Talbot <[email protected]>
Co-authored-by: Chris Hakkaart <[email protected]> Signed-off-by: Adam Talbot <[email protected]>
Co-authored-by: Chris Hakkaart <[email protected]> Signed-off-by: Adam Talbot <[email protected]>
Co-authored-by: Chris Hakkaart <[email protected]> Signed-off-by: Adam Talbot <[email protected]>
Co-authored-by: Chris Hakkaart <[email protected]> Signed-off-by: Adam Talbot <[email protected]>
Co-authored-by: Chris Hakkaart <[email protected]> Signed-off-by: Adam Talbot <[email protected]>
Co-authored-by: Chris Hakkaart <[email protected]> Signed-off-by: Adam Talbot <[email protected]>
Co-authored-by: Chris Hakkaart <[email protected]> Signed-off-by: Adam Talbot <[email protected]>
Signed-off-by: Christopher Hakkaart <[email protected]>
Signed-off-by: Christopher Hakkaart <[email protected]>
Signed-off-by: adamrtalbot <[email protected]>
Signed-off-by: adamrtalbot <[email protected]>
@christopher-hakkaart this is ready for a re-review.
Signed-off-by: Christopher Hakkaart <[email protected]>
Signed-off-by: Christopher Hakkaart <[email protected]>
@adamrtalbot I pushed a bunch of changes rather than loading the PR with suggestions again. Noteworthy changes:
Feel free to revert anything you disagree with. I'll give it a quick look again tomorrow with fresh eyes and make any final suggestions. Overall, it's looking really good.
docs/azure.md
Outdated
1. Create an Azure Batch account in the Azure portal.
2. Increase the quotas in your Azure Batch account to the pipeline's needs.

    :::{note}
    Quotas impact the number of pools, CPUs, and jobs you can create.
    :::

3. Create a storage account and Azure Blob Storage in the same region as the Batch account.
4. Configure Nextflow to submit processes to Azure Batch using configuration. For example:

    ```groovy
    process {
        executor = 'azurebatch'
    }

    azure {
        storage {
            accountName = '<STORAGE_ACCOUNT_NAME>'
        }
        batch {
            location = '<LOCATION>'
            accountName = '<BATCH_ACCOUNT_NAME>'
            autoPoolMode = true
            allowPoolCreation = true
        }
    }
    ```

    Replace the following:

    - `STORAGE_ACCOUNT_NAME`: your storage account name
    - `LOCATION`: your Azure region
    - `BATCH_ACCOUNT_NAME`: your batch account name

    :::{note}
    The above snippet excludes authentication for Azure services. See [Authentication](#authentication) for more information.
    :::

5. Launch your pipeline with the above configuration and add a working directory on Azure Blob Storage:

    ```bash
    nextflow run <PIPELINE_NAME> -w az://<BLOB_STORAGE>/
    ```

    Replace the following:

    - `PIPELINE_NAME`: your pipeline, for example, `nextflow-io/rnaseq-nf`
    - `BLOB_STORAGE`: your Azure Blob Storage from the storage account defined in your configuration

:::{tip}
You can list Azure regions with: `az account list-locations -o table`
:::
1. Create an Azure Batch account in the Azure portal.
2. Increase the quotas in your Azure Batch account to the pipeline's needs.
3. Create a storage account and Azure Blob Storage in the same region as the Batch account.
4. Configure Nextflow to submit processes to Azure Batch using configuration. For example:

    ```groovy
    process {
        executor = 'azurebatch'
    }

    azure {
        storage {
            accountName = '<STORAGE_ACCOUNT_NAME>'
        }
        batch {
            location = '<LOCATION>'
            accountName = '<BATCH_ACCOUNT_NAME>'
            autoPoolMode = true
            allowPoolCreation = true
        }
    }
    ```

5. Launch your pipeline with the above configuration and add a working directory on Azure Blob Storage:

    ```bash
    nextflow run <PIPELINE_NAME> -w az://<BLOB_STORAGE>/
    ```

:::{note}
The above snippet excludes authentication for Azure services. See [Authentication](#authentication) for more information.
:::

Replace the following:

- `STORAGE_ACCOUNT_NAME`: your storage account name
- `LOCATION`: your Azure region
- `BATCH_ACCOUNT_NAME`: your batch account name
- `PIPELINE_NAME`: your pipeline, for example, `nextflow-io/rnaseq-nf`
- `BLOB_STORAGE`: your Azure Blob Storage from the storage account defined in your configuration

:::{note}
Quotas impact the number of pools, CPUs, and jobs you can create.
:::

:::{tip}
You can list Azure regions with: `az account list-locations -o table`
:::
Maybe?
I prefer to have these attached to the relevant step. For a user who may follow the steps one at a time, it doesn't make sense to have them at the bottom. E.g., quota considerations belong with the step that sets the quota, not after the whole thing is set up.
We could try and wrap it into the same point to avoid the note:
Submit a request through the Azure Portal's Quotas page to increase your Azure Batch account limits on pools, CPUs, and jobs based on your pipeline's resource needs.
I've tried to simplify it with c422f6a.
- Moved notes to within the bullet points (if you have something to say, say it properly)
- Removed config items and referred to the authentication section
Nextflow uses the following environment variable if the managed identity client ID is not provided in the Nextflow configuration file:

- `AZURE_MANAGED_IDENTITY_USER`: The client ID for a user-assigned managed identity.
:::
The language is a bit weird here.
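For context, the fallback the snippet describes could be sketched as a config fragment. This is only an illustration, assuming the `azure.managedIdentity.clientId` config option; `<CLIENT_ID>` is a placeholder and not a value from this PR.

```groovy
azure {
    managedIdentity {
        // Assumed option: client ID of the user-assigned managed identity.
        // Per the snippet under review, if this is not set, Nextflow uses
        // the AZURE_MANAGED_IDENTITY_USER environment variable instead.
        clientId = '<CLIENT_ID>'
    }
}
```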
@christopher-hakkaart all looks good - I'm never sure about punctuation for what is an Azure product and what isn't. If we aren't using 4th level headings we should remove "Pool management" and just have a few other 3rd level headings, I've left a comment.
Co-authored-by: Adam Talbot <[email protected]> Signed-off-by: Justine Geffen <[email protected]>
Yup - I missed that! Thanks! Co-authored-by: Adam Talbot <[email protected]> Signed-off-by: Chris Hakkaart <[email protected]>
Good point Co-authored-by: Adam Talbot <[email protected]> Signed-off-by: Chris Hakkaart <[email protected]>
Co-authored-by: Adam Talbot <[email protected]> Signed-off-by: Chris Hakkaart <[email protected]>
Co-authored-by: Adam Talbot <[email protected]> Signed-off-by: Chris Hakkaart <[email protected]>
Co-authored-by: Adam Talbot <[email protected]> Signed-off-by: Chris Hakkaart <[email protected]>
Thanks @adamrtalbot - agree with your point about headings. I've accepted your suggestions. I've also asked @justinegeffen to look at this PR as well. As I made so many changes/suggestions it would be good to get a second review from the style side. We don't want to slow this PR down, so if the review is taking too long, we can look at merging this now, and then making a second PR to fix any style-related issues I introduced. Regarding the naming, I never know what to do either. I took notes on everything I wasn't sure about and checked as I went. I'll write up a cheat sheet for future reference.
Signed-off-by: adamrtalbot <[email protected]>
2. Increase the quotas in your Azure Batch account to the pipeline's needs. Quotas impact the number of pools, CPUs, and jobs you can create.
3. Create a storage account and Azure Blob Storage in the same region as the Batch account.
4. Add authentication and account details to your Nextflow configuration file. See [Authentication](#authentication) for examples of configuration.
5. Configure Nextflow to submit processes to Azure Batch by setting the process.executor directive to `azurebatch`.
5. Configure Nextflow to submit processes to Azure Batch by setting the process.executor directive to `azurebatch`.
5. Configure Nextflow to submit processes to Azure Batch by setting the `process.executor` directive to `azurebatch`.
To run pipelines with Azure Batch:

1. Create an Azure Batch account in the Azure portal.
2. Increase the quotas in your Azure Batch account to the pipeline's needs. Quotas impact the number of pools, CPUs, and jobs you can create.
2. Increase the quotas in your Azure Batch account to the pipeline's needs. Quotas impact the number of pools, CPUs, and jobs you can create.
2. Increase the quotas in your Azure Batch account to support the pipeline's needs. Quotas impact the number of pools, CPUs, and jobs you can create.
@adamrtalbot, does this suggestion make sense in the context?
Signed-off-by: Justine Geffen <[email protected]>
A system-assigned identity is tied to a specific Azure resource. To use it:

1. Enable [system-assigned managed identity](https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/how-to-configure-managed-identities?pivots=qs-configure-portal-windows-vm#system-assigned-managed-identity) on your Azure resource.
2. Configure the required role assignments. See [Required Roles](#required-roles) for more information.
2. Configure the required role assignments. See [Required Roles](#required-roles) for more information.
2. Configure the required role assignments. See [Required roles](#required-roles) for more information.
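As an illustration of where the system-assigned steps lead, a minimal config sketch follows. It assumes the `azure.managedIdentity.system` option and reuses the account placeholders from the earlier snippet; it is not the text added by this PR.

```groovy
azure {
    managedIdentity {
        // Assumed option: use the system-assigned identity of the
        // Azure resource that Nextflow is running on.
        system = true
    }
    storage {
        accountName = '<STORAGE_ACCOUNT_NAME>'
    }
    batch {
        location = '<LOCATION>'
        accountName = '<BATCH_ACCOUNT_NAME>'
    }
}
```

With a managed identity, no account keys appear in the configuration, which is why the role assignments in step 2 are required.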
Signed-off-by: Justine Geffen <[email protected]>
It started as a quick update to the docs and turned into a full re-write.
Changes: