DSL2 pipe syntax enhancements #3243

jemunro · 2022-09-25T02:24:19Z

New feature

Enhancing the pipe operator to allow processes and workflows with multiple inputs to be piped together.

Usage scenario

Given the following processes:

process toUpper {
    input: val x
    output: stdout
    script: "echo -n $x | tr a-z A-Z"
}
process strConcat {
    input: val x; val y
    output: stdout
    script: "echo -n $x; echo -n $y"
}

If we want to run our data through toUpper and into strConcat, the best we can do with piping is the following:

workflow {
    append = channel.value('!')
    strConcat(
        channel.of('foo', 'bar') | toUpper,
        append) |
        view()
}

which gives output:

N E X T F L O W  ~  version 22.04.5
Launching `works.nf` [zen_mclean] DSL2 - revision: 2f1ccf68d9
executor >  local (4)
[68/eb8b8d] process > toUpper (2)   [100%] 2 of 2 ✔
[e9/7fb402] process > strConcat (2) [100%] 2 of 2 ✔
FOO!
BAR!

Suggest implementation

The following would be more intuitive and readable:

workflow {
    append = channel.value('!')
    channel.of('foo', 'bar') |
        toUpper() | 
        strConcat(append) |
        view()
}

Attempting to run the above currently gives the following error:

Process `toUpper` declares 1 input channel but 0 were specified
 -- Check script 'main.nf' at line: 17 or see '.nextflow.log' file for more details

If we replace toUpper() | with toUpper |, the error becomes:

Process `strConcat` declares 2 input channels but 1 were specified

This could be achieved by making the first argument of a process be the piped value (regardless of whether parentheses are present). This already seems to be the case for operators that accept multiple arguments, e.g.:

channel.of('foo', 'bar') |
        mix(channel.of('baz')) |
        view
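The proposed rule (the piped value always becomes the first argument, with or without parentheses) can be modelled outside Nextflow. The sketch below is plain Python, not Nextflow code: the Stage class, the use of lists in place of channels, and the to_upper / str_concat helpers are all assumptions made for illustration, and it says nothing about how Nextflow itself would implement this.

```python
class Stage:
    """A pipeline step whose first positional argument comes from the pipe."""

    def __init__(self, func, *extra):
        self.func = func
        self.extra = extra

    def __call__(self, *extra):
        # `stage(arg)` only records the extra arguments;
        # the piped value is still pending.
        return Stage(self.func, *extra)

    def __ror__(self, piped):
        # `piped | stage` calls func(piped, *extra).
        return self.func(piped, *self.extra)


# Lists stand in for channels (an assumption of this model).
to_upper = Stage(lambda xs: [x.upper() for x in xs])
str_concat = Stage(lambda xs, y: [x + y for x in xs])

result = ['foo', 'bar'] | to_upper | str_concat('!')
print(result)  # ['FOO!', 'BAR!']

# Parentheses are optional for a step with no extra arguments.
same = ['foo', 'bar'] | to_upper() | str_concat('!')
print(same == result)  # True
```

In this model, to_upper and to_upper() behave identically, which is the behaviour requested above for processes.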


bentsherman · 2022-09-26T20:31:19Z

Basically you want to be able to curry channel arguments. I think it would work for processes but not for operators, because operators use an object-oriented syntax, e.g. a | op(b) is equivalent to a.op(b). It's an interesting idea.

Also, the example with mix works because mix can take any number of arguments.

jemunro · 2022-09-27T08:24:59Z

As far as I can tell this already works for most operators, just not for processes.

For example the following works fine:

workflow {
    left  = channel.of(['A', 1], ['B', 2], ['C', 3], ['D', 7])
    right = channel.of(['B', 6], ['C', 5], ['D', 2], ['A', 8])

    left | 
        join(right) | 
        map { s, x, y -> [s, x, y, x + y] } |
        groupTuple(by: 3) |
        map { ss, xx, yy, z -> [ss.join('-'), xx.sum() + yy.sum() ]} |
        combine(left) |
        view()
}

[B-C, 16, A, 1]
[B-C, 16, B, 2]
[B-C, 16, C, 3]
[B-C, 16, D, 7]
[A-D, 18, A, 1]
[A-D, 18, B, 2]
[A-D, 18, C, 3]
[A-D, 18, D, 7]

jemunro · 2022-09-27T08:28:50Z

This piping paradigm is widely used in R: see https://r4ds.had.co.nz/pipes.html

bentsherman · 2022-09-27T13:29:59Z

Yes, most operators already work this way because they use an object-oriented syntax, so it is clear which argument is being piped. With processes I only worry that it's not clear which input to pipe the argument into.

jemunro · 2022-09-28T03:16:19Z

I think it would be fairly intuitive for the piped value to go into the first process input.

To illustrate further, let's consider some operators that could be implemented as processes:
1. unary operator:

process SPLIT_CSV {
    input: val(x)
    output: val(split)
    exec: split = x.split(',')
}
workflow {
    csv = channel.of('a,b,c', 'd,e,f')
    // --------------- operator ---------------
    csv.splitCsv().view()        // (1) - valid
    csv | splitCsv | view()      // (2) - valid
    splitCsv(csv) | view()       // (3) - not valid
    // --------------- process ----------------
    csv.SPLIT_CSV().view()       // (1) - not valid
    csv | SPLIT_CSV | view()     // (2) - valid
    SPLIT_CSV(csv) | view()      // (3) - valid
}

2. binary operator:

process MERGE {
    input: val(x); val(y)
    output: val(xy)
    exec: xy = (x instanceof List ? x : [x]) + (y instanceof List ? y : [y])
}
workflow {
    left  = channel.of(['A', 1], ['B', 2], ['C', 3])
    right = channel.of(['X', 4], ['Y', 5], ['Z', 6])
    // --------------- operator ---------------
    left.merge(right).view()      // (1) - valid
    left | merge(right) | view()  // (2) - valid
    merge(left, right) | view()   // (3) - not valid
    // --------------- process ----------------
    left.MERGE(right).view()      // (1) - not valid
    left | MERGE(right) | view()  // (2) - not valid
    MERGE(left, right) | view()   // (3) - valid
}

From this we can see that option (2) is valid for both the process and the operator in the unary case, but only for the operator in the binary case.

bentsherman · 2022-09-28T20:39:22Z

Thank you for the detailed example. It is interesting: Nextflow's operator syntax suggests the first input should be piped, whereas currying in functional programming would suggest that the last input should be piped. I foresee future tribal divisions over this question 😄
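The first-input versus last-input distinction can be shown in a few lines. This is a Python sketch, not Nextflow: str_concat here is a hypothetical plain function standing in for the strConcat process, and functools.partial stands in for currying, since it binds leading arguments.

```python
from functools import partial


def str_concat(x, y):
    """Stand-in for the two-input strConcat process."""
    return x + y


# Currying style: the supplied argument is bound first,
# so the piped value lands in the LAST input.
piped_last = partial(str_concat, '!')


# Operator style (a.op(b)): the piped value is the FIRST input
# and the supplied argument follows it.
def piped_first(x):
    return str_concat(x, '!')


print(piped_last('FOO'))   # !FOO
print(piped_first('FOO'))  # FOO!
```

The two conventions give different results for any process whose inputs are not interchangeable, so whichever one is chosen has to be applied consistently.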

I also wanted to mention, there is a & operator for invoking multiple processes with the same inputs (docs). What if we could use this syntax on channels as well?

// and operator with processes
workflow {
    channel.from('Hello') | (foo & bar) | mix | view
}

// and operator with channels
workflow {
    (left & right) | merge | view
}
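As a rough model of the first snippet, the sketch below shows one input being fanned out to two steps and the results merged. It is Python rather than Nextflow, and the Proc class, the list-based channels, and the bodies of foo and bar are assumptions for illustration only. The model concatenates outputs in a fixed order for simplicity; it does not assume any ordering guarantee from the real mix operator.

```python
class Proc:
    """One or more steps that all receive the same piped input."""

    def __init__(self, *funcs):
        self.funcs = funcs

    def __and__(self, other):
        # `foo & bar` groups the steps so they can share one input.
        return Proc(*self.funcs, *other.funcs)

    def __ror__(self, piped):
        # `piped | (foo & bar)` runs every step on the same input.
        outputs = tuple(f(piped) for f in self.funcs)
        return outputs[0] if len(outputs) == 1 else outputs


# Hypothetical step bodies; lists stand in for channels.
foo = Proc(lambda xs: [x + ' world' for x in xs])
bar = Proc(lambda xs: [x.upper() for x in xs])
# Merges several outputs into one list.
mix = Proc(lambda outs: [item for out in outs for item in out])

result = ['Hello'] | (foo & bar) | mix
print(result)  # ['Hello world', 'HELLO']
```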

jemunro · 2022-09-29T05:36:23Z

The & operator is an interesting idea. Using the first example, are you suggesting using & something like this?

workflow {
    append = channel.value('!')
    channel.of('foo', 'bar') |
        toUpper &
        append |
        strConcat |
        view()
}

bentsherman · 2022-09-29T14:19:39Z

Well, it wouldn't work in that case, because you want channel.of('foo', 'bar') to be piped into toUpper and not append. So it looks like your argument currying idea is more versatile.

jemunro · 2022-09-30T02:52:28Z

Another thought: what if there were an operator that worked similarly to Groovy's with?

Using OO syntax, with already mostly works:

append = channel.value('!')
channel.of('foo', 'bar')
    .with { toUpper (it) }
    .with { strConcat(it, append) }
    .view()

We could define an operator, say withOp, that works with pipes:

append = channel.value('!')
channel.of('foo', 'bar') |
    toUpper |
    withOp { strConcat(it, append) } |
    view

The nice thing about this approach is that it is quite flexible, you could do any arbitrary thing with channels, operators and processes in the closure and then pipe the result.

I made a draft PR that enables the above to work: #3254
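The closure-based idea can be modelled independently of Nextflow. The following is a Python sketch under stated assumptions: WithOp is a hypothetical class named after the proposed operator, lists stand in for channels, and to_upper / str_concat are plain functions standing in for the processes. It does not reflect the code in the draft PR.

```python
class WithOp:
    """Pipe stage that hands the piped value to an arbitrary closure."""

    def __init__(self, closure):
        self.closure = closure

    def __ror__(self, piped):
        # `piped | WithOp(closure)` evaluates closure(piped).
        return self.closure(piped)


def to_upper(xs):
    return [x.upper() for x in xs]


def str_concat(xs, y):
    return [x + y for x in xs]


append = '!'
result = (
    ['foo', 'bar']
    | WithOp(to_upper)
    | WithOp(lambda it: str_concat(it, append))
)
print(result)  # ['FOO!', 'BAR!']
```

Because the closure names its own parameter, the caller decides which input receives the piped value, which sidesteps the first-versus-last question.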

stale · 2023-03-18T09:26:53Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

bentsherman added the lang/dsl2 label Sep 26, 2022

jemunro mentioned this issue Sep 30, 2022

draft withOp #3254

Closed

jemunro mentioned this issue Nov 7, 2022

exec and eval operators #3356

Closed

stale bot added the stale label Mar 18, 2023

pditommaso added the pinned label Mar 18, 2023

stale bot removed the stale label Mar 18, 2023

bentsherman linked a pull request Nov 16, 2023 that will close this issue

Support closure operand for pipe operator #3423

Closed
