DSL2 pipe syntax enhancements #3243
Comments
Basically you want to be able to curry channel arguments. I think it would work for processes but not for operators, because operators use an object-oriented syntax, e.g. Also, the example with |
As far as I can tell this already works for most operators, just not for processes. For example, the following works fine:
workflow {
left = channel.of(['A', 1], ['B', 2], ['C', 3], ['D', 7])
right = channel.of(['B', 6], ['C', 5], ['D', 2], ['A', 8])
left |
join(right) |
map { s, x, y -> [s, x, y, x + y] } |
groupTuple(by: 3) |
map { ss, xx, yy, z -> [ss.join('-'), xx.sum() + yy.sum() ]} |
combine(left) |
view()
}
|
This piping paradigm is widely used in R: see https://r4ds.had.co.nz/pipes.html |
Yes, most operators already work this way because they use an object-oriented syntax, so it is clear which argument is being piped. With processes I only worry that it's not clear which input to pipe the argument into. |
I think it would be fairly intuitive for the piped value to go into the first process input. To illustrate further, let's consider some operators that could be implemented as processes:
1. unary operator:
process SPLIT_CSV {
input: val(x)
output: val(split)
exec: split = x.split(',')
}
workflow {
csv = channel.of('a,b,c', 'd,e,f')
// --------------- operator ---------------
csv.splitCsv().view() // (1) - valid
csv | splitCsv | view() // (2) - valid
splitCsv(csv) | view() // (3) - not valid
// --------------- process ----------------
csv.SPLIT_CSV().view() // (1) - not valid
csv | SPLIT_CSV | view() // (2) - valid
SPLIT_CSV(csv) | view() // (3) - valid
}
2. binary operator:
process MERGE {
input: val(x); val(y)
output: val(xy)
exec: xy = (x instanceof List ? x : [x]) + (y instanceof List ? y : [y])
}
workflow {
left = channel.of(['A', 1], ['B', 2], ['C', 3])
right = channel.of(['X', 4], ['Y', 5], ['Z', 6])
// --------------- operator ---------------
left.merge(right).view() // (1) - valid
left | merge(right) | view() // (2) - valid
merge(left, right) | view() // (3) - not valid
// --------------- process ----------------
left.MERGE(right).view() // (1) - not valid
left | MERGE(right) | view() // (2) - not valid
MERGE(left, right) | view() // (3) - valid
}
From this we can see that option (2) is valid for both the process and the operator in the unary case, but only for the operator in the binary case. |
Thank you for the detailed example. It is interesting: Nextflow's operator syntax suggests the first input should be piped, whereas currying in functional programming would suggest that the last input should be piped. I foresee future tribal divisions over this question 😄 I also wanted to mention, there is a
// and operator with processes
workflow {
channel.from('Hello') | (foo & bar) | mix | view
}
// and operator with channels
workflow {
(left & right) | merge | view
} |
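The first-versus-last question raised above can be sketched with plain Groovy closures, which offer both `curry` (fixes leading arguments) and `rcurry` (fixes trailing arguments). This is only an illustration: the `strConcat` closure below is a hypothetical stand-in for a two-input process, not a Nextflow process, and the variable names are assumptions made for the example.

```groovy
// Hypothetical stand-in for a two-input process: concatenates its inputs in order
def strConcat = { String x, String y -> x + y }

// "Piped value is the FIRST input": fix the trailing argument with rcurry,
// leaving the first parameter open for the piped value
def pipeIntoFirst = strConcat.rcurry('!')
assert pipeIntoFirst('FOO') == 'FOO!'

// "Piped value is the LAST input": fix the leading argument with curry,
// leaving the last parameter open for the piped value
def pipeIntoLast = strConcat.curry('!')
assert pipeIntoLast('FOO') == '!FOO'
```

The two conventions give different results for the same call, which is why the choice would need to be fixed and documented if processes with multiple inputs became pipeable.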
The workflow {
append = channel.value('!')
channel.of('foo', 'bar') |
toUpper &
append |
strConcat |
view()
} |
Well, it wouldn't work in that case, because you want |
Another thought, what if there was an operator that worked similarly to Groovy's Using OO syntax,
append = channel.value('!')
channel.of('foo', 'bar')
.with { toUpper (it) }
.with { strConcat(it, append) }
.view()
We could define an operator, say
append = channel.value('!')
channel.of('foo', 'bar') |
toUpper |
withOp { strConcat(it, append) } |
view
The nice thing about this approach is that it is quite flexible: you could do any arbitrary thing with channels, operators and processes in the closure and then pipe the result. I made a draft PR that enables the above to work: #3254 |
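For readers unfamiliar with the Groovy method used in the OO example above, here is a minimal sketch of how `with` behaves on ordinary values: it calls the closure with the receiver and returns the closure's result, so calls can be chained. The `toUpper` and `strConcat` closures are hypothetical stand-ins for the processes discussed in this thread, operating on plain strings rather than channels.

```groovy
// Hypothetical stand-ins for the toUpper and strConcat processes
def toUpper = { String s -> s.toUpperCase() }
def strConcat = { String x, String y -> x + y }
def append = '!'

// `with` passes the receiver to the closure as `it` and returns the closure's result,
// so each step feeds the next and extra arguments can be placed freely
def result = 'foo'
    .with { toUpper(it) }
    .with { strConcat(it, append) }

assert result == 'FOO!'
```

Because the closure body decides where `it` goes, this style sidesteps the question of which input receives the piped value, at the cost of being more verbose than a bare pipe.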
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
New feature
Enhancing the pipe operator to allow processes and workflows with multiple inputs to be piped together.
Usage scenario
Given the following processes:
If we want to run our data through `toUpper` and into `strConcat`, the best we can do with piping is the following:
which gives output:
Suggest implementation
The following would be more intuitive and readable:
Attempting to run the above currently gives the following error:
If we replace `toUpper() |` with `toUpper |`, the error becomes:
This could be achieved by making the first argument of a process be the piped value (regardless of whether parentheses are present). This already seems to be the case for operators that accept multiple arguments, e.g.: