
Commit 75cb555

pmapbatch with type

1 parent 3d323af

File tree

9 files changed: +436 -479 lines changed

.gitignore (+1)

@@ -1,3 +1,4 @@
 Manifest.toml
 *.cov
 coverage
+docs/build

.travis.yml (+11 -3)

@@ -7,12 +7,20 @@ julia:
   - 1.2
   - 1
   - nightly
-matrix:
+jobs:
   allow_failures:
     - julia: nightly
   fast_finish: true
+  include:
+    - stage: "Documentation"
+      julia: 1.4
+      os: linux
+      script:
+        - julia --project=docs/ -e 'using Pkg; Pkg.develop(PackageSpec(path=pwd()));
+          Pkg.instantiate()'
+        - julia --project=docs/ docs/make.jl
+      after_success: skip
 notifications:
   email: false
 after_success:
-  - julia -e 'using Pkg; Pkg.add("Coverage"); using Coverage; Coveralls.submit(process_folder())'
-  - julia -e 'using Pkg; Pkg.add("Coverage"); using Coverage; Codecov.submit(process_folder())'
+  - julia -e 'using Pkg; Pkg.add("Coverage"); using Coverage; Codecov.submit(process_folder())'

README.md (+31 -71)

@@ -1,8 +1,9 @@
 # ParallelUtilities.jl
 
 [![Build Status](https://travis-ci.com/jishnub/ParallelUtilities.jl.svg?branch=master)](https://travis-ci.com/jishnub/ParallelUtilities.jl)
-[![Coverage Status](https://coveralls.io/repos/github/jishnub/ParallelUtilities.jl/badge.svg?branch=master)](https://coveralls.io/github/jishnub/ParallelUtilities.jl?branch=master)
 [![codecov](https://codecov.io/gh/jishnub/ParallelUtilities.jl/branch/master/graph/badge.svg)](https://codecov.io/gh/jishnub/ParallelUtilities.jl)
+[![Stable](https://img.shields.io/badge/docs-stable-blue.svg)](https://jishnub.github.io/ParallelUtilities.jl/stable)
+[![Dev](https://img.shields.io/badge/docs-dev-blue.svg)](https://jishnub.github.io/ParallelUtilities.jl/dev)
 
 Parallel mapreduce and other helpful functions for HPC, meant primarily for embarrassingly parallel operations that often require one to split up a list of tasks into subsections that can be processed on individual cores.
 
@@ -14,29 +15,6 @@ Install the package using
 pkg> add ParallelUtilities
 julia> using ParallelUtilities
 ```
-
-# Exported functions
-
-* `pmap`-related functions
-    * `pmapreduce`
-    * `pmapreduce_commutative`
-    * `pmapsum`
-    * `pmapreduce_elementwise`
-    * `pmapsum_elementwise`
-* Functions to evenly split a Tuple of ranges
-    * `ProductSplit`
-    * `ntasks`
-    * `whichproc`
-    * `procrange_recast`
-    * `localindex`
-    * `whichproc_localindex`
-    * `extremadims`
-    * `extrema_commonlastdim`
-* Utility functions to query the cluster
-    * `gethostnames`
-    * `nodenames`
-    * `nprocs_node`
-
 # Quick start
 
 ```julia
@@ -47,17 +25,17 @@ julia> addprocs(2)
 
 julia> @everywhere using ParallelUtilities
 
-julia> pmapreduce(x->ones(2).*myid(),x->hcat(x...),1:nworkers())
+julia> pmapreduce(x -> ones(2).*myid(), x -> hcat(x...), 1:nworkers())
 2×2 Array{Float64,2}:
  2.0  3.0
  2.0  3.0
 
-julia> pmapreduce_commutative(x->ones(2).*myid(),sum,1:nworkers())
+julia> pmapreduce_commutative(x -> ones(2).*myid(), sum, 1:nworkers())
 2-element Array{Float64,1}:
  5.0
  5.0
 
-julia> pmapsum(x->ones(2).*myid(),1:nworkers())
+julia> pmapsum(x -> ones(2).*myid(), 1:nworkers())
 2-element Array{Float64,1}:
  5.0
  5.0
@@ -76,7 +54,7 @@ julia> @everywhere begin
 where each parameter takes up values in a range, and we would like to sample the entire parameter space. As an example, we choose the ranges to be
 
 ```julia
-julia> xrange,yrange,zrange = 1:3,2:4,3:6 # ranges should be strictly increasing
+julia> xrange, yrange, zrange = 1:3, 2:4, 3:6 # ranges should be strictly increasing
 ```
 
 There are a total of 36 possible `(x,y,z)` combinations given these ranges. Let's say that we would like to split the evaluation of the function over 10 processors. We describe the simple way to evaluate this and then explain how this is achieved.
@@ -111,33 +89,16 @@ Secondly, the iterator is passed to the function in batches and not elementwise,
 As an example we demonstrate how to evaluate the function `f` for the ranges of parameters listed above:
 
 ```julia
-julia> p = pmapbatch_elementwise(f,(xrange,yrange,zrange));
+julia> p = pmapbatch_elementwise(f, (xrange,yrange,zrange));
 
 julia> Tuple(p)
 (6, 7, 8, 7, 8, 9, 8, 9, 10, 7, 8, 9, 8, 9, 10, 9, 10, 11, 8, 9, 10, 9, 10, 11, 10, 11, 12, 9, 10, 11, 10, 11, 12, 11, 12, 13)
-
-# Check for correctness
-julia> p == map(f,vec(collect(Iterators.product(xrange,yrange,zrange))))
-true
-
-# pmapbatch_elementwise produces the same result as pmap, although the internals are different
-julia> pmapbatch_elementwise(x->x^2,1:3)
-3-element Array{Int64,1}:
- 1
- 4
- 9
-
-julia> pmap(x->x^2,1:3)
-3-element Array{Int64,1}:
- 1
- 4
- 9
 ```
 
 There is also a function `pmapbatch` that deals with batches of parameters that are passed to each processor, and `pmapbatch_elementwise` calls this function under the hood to process the parameters one by one. We may use this directly as well if we need the entire batch for some reason (e.g. reading values off a disk, which needs to be done once for the entire set and not for every parameter). As an example we demonstrate how to obtain the same result as above using `pmapbatch`:
 
 ```julia
-julia> p = pmapbatch(x->[f(i...) for i in x],(xrange,yrange,zrange));
+julia> p = pmapbatch(x->[f(i...) for i in x], (xrange,yrange,zrange));
 
 julia> Tuple(p)
 (6, 7, 8, 7, 8, 9, 8, 9, 10, 7, 8, 9, 8, 9, 10, 9, 10, 11, 8, 9, 10, 9, 10, 11, 10, 11, 12, 9, 10, 11, 10, 11, 12, 11, 12, 13)
@@ -149,22 +110,22 @@ Often a parallel execution is followed by a reduction (eg. a sum over the result
 
 As an example, to sum up a list of numbers in parallel we may call
 ```julia
-julia> pmapsum_elementwise(identity,1:1000)
+julia> pmapsum_elementwise(identity, 1:1000)
 500500
 ```
 
 Here the mapped function is taken to be `identity`, which just returns its argument. To sum the squares of the numbers in a list we may use
 
 ```julia
-julia> pmapsum_elementwise(x->x^2,1:1000)
+julia> pmapsum_elementwise(x -> x^2, 1:1000)
 333833500
 ```
 
 We may choose an arbitrary reduction operator in the functions `pmapreduce` and `pmapreduce_commutative`, and the elementwise function `pmapreduce_commutative_elementwise`. The reductions are carried out as a binary tree across all workers.
 
 ```julia
 # Compute 1^2 * 2^2 * 3^2 in parallel
-julia> pmapreduce_commutative_elementwise(x->x^2,prod,1:3)
+julia> pmapreduce_commutative_elementwise(x -> x^2, prod, 1:3)
 36
 ```
 
@@ -177,7 +138,7 @@ julia> workers()
  3
 
 # The signature is pmapreduce(fmap,freduce,iterable)
-julia> pmapreduce(x->ones(2).*myid(),x->hcat(x...),1:nworkers())
+julia> pmapreduce(x -> ones(2).*myid(), x -> hcat(x...), 1:nworkers())
 2×2 Array{Float64,2}:
  2.0  3.0
  2.0  3.0
@@ -192,7 +153,7 @@ julia> sum(workers())
 5
 
 # We compute ones(2).*sum(workers()) in parallel
-julia> pmapsum(x->ones(2).*myid(),1:nworkers())
+julia> pmapsum(x -> ones(2).*myid(), 1:nworkers())
 2-element Array{Float64,1}:
  5.0
  5.0
@@ -201,14 +162,14 @@ julia> pmapsum(x->ones(2).*myid(),1:nworkers())
 It is possible to specify the return types of the map and reduce operations in these functions. To specify the return types use the following variants:
 
 ```julia
-# Signature is pmapreduce(fmap,Tmap,freduce,Treduce,iterators)
-julia> pmapreduce(x->ones(2).*myid(),Vector{Float64},x->hcat(x...),Matrix{Float64},1:nworkers())
+# Signature is pmapreduce(fmap, Tmap, freduce, Treduce, iterators)
+julia> pmapreduce(x -> ones(2).*myid(), Vector{Float64}, x -> hcat(x...), Matrix{Float64}, 1:nworkers())
 2×2 Array{Float64,2}:
  2.0  3.0
  2.0  3.0
 
-# Signature is pmapsum(fmap,Tmap,iterators)
-julia> pmapsum(x->ones(2).*myid(),Vector{Float64},1:nworkers())
+# Signature is pmapsum(fmap, Tmap, iterators)
+julia> pmapsum(x -> ones(2).*myid(), Vector{Float64}, 1:nworkers())
 2-element Array{Float64,1}:
  5.0
  5.0
@@ -219,13 +180,13 @@ Specifying the types would lead to a type coercion if possible, or an error if a
 ```julia
 # The result is converted from Vector{Float64} to Vector{Int}.
 # Conversion works as the numbers are integers
-julia> pmapsum(x->ones(2).*myid(),Vector{Int},1:nworkers())
+julia> pmapsum(x -> ones(2).*myid(), Vector{Int}, 1:nworkers())
 2-element Array{Int64,1}:
  5
  5
 
 # Conversion fails here as the numbers aren't integers
-julia> pmapsum(x->rand(2),Vector{Int},1:nworkers())
+julia> pmapsum(x -> rand(2), Vector{Int}, 1:nworkers())
 ERROR: On worker 2:
 InexactError: Int64(0.7742577217010362)
 ```
@@ -236,12 +197,12 @@ The progress of the map-reduce operation might be tracked by setting the keyword
 
 ```julia
 # Running on 8 workers, artificially induce load using sleep
-julia> pmapreduce(x->(sleep(myid());myid()),x->hcat(x...),1:nworkers(),showprogress=true)
+julia> pmapreduce(x -> (sleep(myid()); myid()), x -> hcat(x...), 1:nworkers(), showprogress=true)
 Progress in pmapreduce : 100%|██████████████████████████████████████████████████| Time: 0:00:09
 1×8 Array{Int64,2}:
  2  3  4  5  6  7  8  9
 
-julia> pmapreduce(x->(sleep(myid());myid()),x->hcat(x...),1:nworkers(),showprogress=true,progressdesc="Progress : ")
+julia> pmapreduce(x -> (sleep(myid()); myid()), x -> hcat(x...), 1:nworkers(), showprogress=true, progressdesc="Progress : ")
 Progress : 100%|████████████████████████████████████████████████████████████████| Time: 0:00:09
 1×8 Array{Int64,2}:
  2  3  4  5  6  7  8  9
@@ -264,14 +225,13 @@ with appropriately chosen parameters, and in many ways a `ProductSplit` behaves
 The signature of the constructor is
 
 ```julia
-ProductSplit(tuple_of_ranges,number_of_processors,processor_rank)
+ProductSplit(tuple_of_ranges, number_of_processors, processor_rank)
 ```
 
 where `processor_rank` takes up values in `1:number_of_processors`. Note that this is different from MPI where the rank starts from 0. For example, we check the tasks that are passed on to the processor number 4:
 
 ```julia
-julia> ps = ProductSplit((xrange,yrange,zrange),10,4)
-ProductSplit{Tuple{Int64,Int64,Int64},3,UnitRange{Int64}}((1:3, 2:4, 3:5), (0, 3, 9), 10, 4, 10, 12)
+julia> ps = ProductSplit((xrange, yrange, zrange), 10, 4);
 
 julia> collect(ps)
 4-element Array{Tuple{Int64,Int64,Int64},1}:
@@ -337,10 +297,10 @@ julia> val = (3,3,4)
 julia> val in ps
 true
 
-julia> localindex(ps,val)
+julia> localindex(ps, val)
 3
 
-julia> val=(10,2,901);
+julia> val = (10,2,901);
 
 julia> @btime $val in $ps_long
   50.183 ns (0 allocations: 0 bytes)
@@ -354,10 +314,10 @@ julia> @btime localindex($ps_long, $val)
 Another useful function is `whichproc` that returns the rank of the processor a specific set of parameters will be on, given the total number of processors. This is also computed using a binary search.
 
 ```julia
-julia> whichproc(params_long,val,10)
+julia> whichproc(params_long, val, 10)
 4
 
-julia> @btime whichproc($params_long,$val,10)
+julia> @btime whichproc($params_long, $val, 10)
   1.264 μs (14 allocations: 448 bytes)
 4
 ```
@@ -367,18 +327,18 @@ julia> @btime whichproc($params_long,$val,10)
 We can compute the ranges of each variable on any processor in `O(1)` time.
 
 ```julia
-julia> extrema(ps,dim=2) # extrema of the second parameter on this processor
+julia> extrema(ps, dim=2) # extrema of the second parameter on this processor
 (3, 4)
 
-julia> Tuple(extrema(ps,dim=i) for i in 1:3)
+julia> Tuple(extrema(ps, dim=i) for i in 1:3)
 ((1, 3), (3, 4), (4, 4))
 
 # Minimum and maximum work similarly
 
-julia> (minimum(ps,dim=2),maximum(ps,dim=2))
+julia> (minimum(ps, dim=2), maximum(ps, dim=2))
 (3, 4)
 
-julia> @btime extrema($ps_long,dim=2)
+julia> @btime extrema($ps_long, dim=2)
   52.813 ns (0 allocations: 0 bytes)
 (1, 3000)
 ```
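The commit title, "pmapbatch with type", suggests that `pmapbatch` now also accepts a return-type argument, analogous to the `Tmap` argument of `pmapsum` and `pmapreduce` shown above. The README changes in this commit do not document that method, so the following is only a guess at the usage by analogy (hypothetical signature, not confirmed by this diff):

```julia
# Hypothetical: pmapbatch(fmap, Tmap, iterators), by analogy with pmapsum(fmap, Tmap, iterators);
# the actual method added by this commit is not shown in the README diff.
julia> p = pmapbatch(x -> [f(i...) for i in x], Vector{Int}, (xrange, yrange, zrange));
```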

docs/Project.toml (+5)

@@ -0,0 +1,5 @@
+[deps]
+Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
+
+[compat]
+Documenter = "0.25"

docs/make.jl (+23)

@@ -0,0 +1,23 @@
+using Documenter
+using ParallelUtilities
+
+DocMeta.setdocmeta!(ParallelUtilities, :DocTestSetup, :(using ParallelUtilities); recursive=true)
+
+makedocs(;
+    modules=[ParallelUtilities],
+    authors="Jishnu Bhattacharya",
+    repo="https://github.com/jishnub/ParallelUtilities.jl/blob/{commit}{path}#L{line}",
+    sitename="ParallelUtilities.jl",
+    format=Documenter.HTML(;
+        prettyurls=get(ENV, "CI", "false") == "true",
+        canonical="https://jishnub.github.io/ParallelUtilities.jl",
+        assets=String[],
+    ),
+    pages=[
+        "Reference" => "index.md",
+    ],
+)
+
+deploydocs(;
+    repo="github.com/jishnub/ParallelUtilities.jl",
+)
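`deploydocs` pushes the built documentation to the `gh-pages` branch, which on Travis typically requires a deploy key. That setup is not part of this commit; if it were needed, a common one-time step (a sketch, assuming DocumenterTools is available) is:

```julia
# One-time setup to let Travis push to gh-pages (not part of this commit):
# generates an SSH key pair and prints instructions for GitHub and Travis.
using DocumenterTools
DocumenterTools.genkeys(user="jishnub", repo="ParallelUtilities.jl")
```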

docs/src/index.md (+9)

@@ -0,0 +1,9 @@
+```@meta
+CurrentModule = ParallelUtilities
+```
+
+# ParallelUtilities.jl
+
+```@autodocs
+Modules = [ParallelUtilities]
+```
