
Implicit workflow syntax #10

Closed
@bentsherman

Description


One of the restrictions I added to the script grammar was that imperative code is not allowed outside the workflow body. Essentially, the only valid top-level statements are:

  • shebang
  • feature flags
  • includes
  • processes
  • workflows
  • output definition

I added this restriction because arbitrary top-level code makes parsing more difficult and makes the script harder to read. Such code is essentially part of the entry workflow, so it should be placed there. The nf-core pipelines are slowly moving towards this pattern as well.

Of course, this restriction would break most of our code examples in the Nextflow docs. Most notably, the basic "hello world" wouldn't work:

println "Hello, World!"

To support this use case, I added an alternate rule to the parser called an "implicit workflow". Basically, if the script consists only of imperative code (no includes, processes, workflows, etc), then the parser wraps the entire script in an entry workflow under the hood:

workflow {
  println "Hello, World!"
}
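As a rough illustration of the wrapping step, here is a minimal Python sketch. It is not the actual parser, which works on the grammar rather than on text lines; the function name `wrap_implicit_workflow` and the keyword pattern are hypothetical and only cover the declaration kinds listed above.

```python
import re

# Hypothetical pattern: a line that starts one of the declaration kinds
# named in this issue. The real parser decides this from the grammar.
DECLARATION = re.compile(r"^\s*(include|process|workflow|output)\b")


def wrap_implicit_workflow(script: str) -> str:
    """Wrap a purely imperative script in an entry workflow."""
    lines = script.strip().splitlines()
    if any(DECLARATION.match(line) for line in lines):
        # The script has declarations, so it is not an implicit workflow.
        return script
    body = "\n".join("  " + line for line in lines)
    return "workflow {\n" + body + "\n}"


if __name__ == "__main__":
    print(wrap_implicit_workflow('println "Hello, World!"'))
```

A script that already contains a declaration is returned unchanged in this sketch; how the parser reports that case is a separate question.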

If the script does contain some processes or workflows, the parser tries to accept them and then raises an error along the lines of "processes and workflows are not allowed with an implicit workflow script". That way users can get some guidance from the editor rather than something more cryptic like "unexpected token".

But now the problem is that certain syntax errors are handled incorrectly. For example:

process foo {
  publishDir = params.outdir
  // ...
}

The equals sign is invalid here, so the process rule fails and the parser falls back to interpreting the entire script as an implicit workflow. Under that rule, processes and workflows just look like function calls, so the parser accepts the script and immediately declares every process/workflow invalid because it now treats the script as an implicit workflow 😆

If it had been something like a stray comma or curly brace, the syntax error would still be reported correctly, because the implicit workflow rule also fails to parse it.
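The fallback behavior can be modelled with a small Python toy. It is only a sketch under heavy assumptions: the two "rules" are line-based checks rather than real grammar rules, every name (`parse_strict`, `parse_implicit`, `check`, `SyntaxProblem`) is hypothetical, and the strict rule only knows about processes.

```python
import re


class SyntaxProblem(Exception):
    pass


def braces_balanced(script):
    depth = 0
    for ch in script:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def parse_strict(script):
    """Toy strict rule: only process blocks, and no '=' inside their bodies."""
    if not braces_balanced(script):
        raise SyntaxProblem("unexpected token")
    in_process = False
    for line in script.splitlines():
        text = line.strip()
        if not text or text.startswith("//"):
            continue
        if re.match(r"process\s+\w+\s*\{$", text):
            in_process = True
        elif text == "}":
            in_process = False
        elif in_process and "=" not in text:
            continue  # looks like a directive, e.g. `publishDir params.outdir`
        else:
            raise SyntaxProblem("unexpected token")


def parse_implicit(script):
    """Toy implicit rule: any balanced code parses; `foo { ... }` is just a call."""
    if not braces_balanced(script):
        raise SyntaxProblem("unexpected token")
    # Post-parse check: declarations are rejected in an implicit workflow.
    if re.search(r"^\s*(process|workflow)\b", script, re.MULTILINE):
        raise SyntaxProblem(
            "processes and workflows are not allowed with an implicit workflow script"
        )


def check(script):
    """Try the strict rule first, then fall back to the implicit rule."""
    try:
        parse_strict(script)
        return "ok"
    except SyntaxProblem:
        pass  # the strict error is discarded, which is the root of the problem
    try:
        parse_implicit(script)
        return "ok"
    except SyntaxProblem as err:
        return str(err)


if __name__ == "__main__":
    print(check("process foo {\n  publishDir = params.outdir\n}"))
    print(check("process foo {\n  publishDir params.outdir\n}}"))
```

In this toy, the script with the equals sign is reported as "processes and workflows are not allowed with an implicit workflow script", while the script with the stray brace is reported as "unexpected token", because both rules reject it.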

In this particular example, I could extend the parser rule for processes to accept the equals sign and raise an error later, but I fear that there are endless edge cases which could still cause this problem.

There are a few other ways I can think of to deal with this, each with its own trade-offs. I need to ponder this question more to see if there is a better way out.
