Graceful fallback when `array` is used on unsupported executors
#5924
Comments
I would just label the processes for which you want to use job arrays and use withLabel to set the array directive for those processes. It's generally better to be specific in that way rather than enabling job arrays across the board.
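A minimal sketch of the label-based approach, for illustration only. The label name `short_task` and the array size `100` are assumptions, not values from this thread; each process that should use job arrays would declare `label 'short_task'` in its definition.

```groovy
// nextflow.config
// 'short_task' is a hypothetical label; 100 is an arbitrary array size.
process {
    withLabel: 'short_task' {
        array = 100   // submit tasks of matching processes in arrays of up to 100
    }
}
```

Because this selector sits at the top level of the config, it applies whichever executor is selected, including ones without job array support.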
I'm not sure this fully addresses the issue. If I set the following in the config
it will still fail if I run the pipeline with the
I assume you have two profiles for local and slurm? In that case you can just put the array config in the slurm profile.
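A sketch of what that could look like, assuming profiles named `local` and `slurm`, a hypothetical label `short_task`, and an arbitrary array size of `100`. None of these values are taken from the pipeline under discussion.

```groovy
// nextflow.config
profiles {
    local {
        process {
            executor = 'local'   // no array directive here, so nothing to reject
        }
    }
    slurm {
        process {
            executor = 'slurm'
            // Hypothetical label; only processes carrying it use job arrays.
            withLabel: 'short_task' {
                array = 100
            }
        }
    }
}
```

With this layout, `nextflow run <pipeline> -profile slurm` would pick up the array setting, while `-profile local` never sees it.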
That's the solution I have implemented for the time being. The pipeline is nf-core, so it needs to be as portable as possible. I hoped I could somehow make using arrays the default behavior on grid executors, but it seems it's not possible for the time being.
I would caution against making job arrays the default across the entire pipeline, because there are cases where it can actually hurt you. For example, the first step in the pipeline will probably receive a deluge of tasks all at once because it's just loading the inputs. But a later step might receive tasks at a slower rate, as some upstream tasks take longer than others. In the latter case, it might not be so important to batch the job submissions if they are already slow, and enabling job arrays might just needlessly delay job submissions while waiting for a full array to submit. That's just my intuition; in reality it would depend on the actual submit rates vs. the capacity of the scheduler. I do wonder whether this issue would come up in practice or not.
New feature
Currently, config parsing always fails if the `array` process directive is used in an executor that does not support job arrays (most importantly `local`). It would be reasonable to have Nextflow fall back to running single processes instead and print a warning to `stderr`.
If needed, this behavior can be controlled with an environment variable.
Use case
Some pipelines have processes that take very little time to run (< 5 minutes). In that case, using job arrays is a reasonable default behavior. However, it is currently impossible to set this as a pipeline-wide default, as it will fail on executors without job arrays.
Suggested implementation
As described above.