Multiprocessing in R with future.callr

1 minute read


Multiprocessing in R with future.callr

I learned Python before getting to know R, but I now use R more often than Python. When trying to loop lots of lots of items for processing, I always wanted to just import multiprocessing as mp, but I haven’t found something like that in R until very recently.

If you have some familiarity in R, especially experiences of dealing with data, you probably have used the tidyverse collection, especially how easy to use the map_* functions in the purrr package. But do you also know that there is the multiprocessing version furrr package and its future_map_* family of function? Basically, every single map function in purrr will have a counterpart in furrr.

But I always find the default future package not as easy to work with as I would like. Until I came across future.callr package.

Let’s have an example.

## Loading required package: future
plan(callr, workers = 10) 
# choose the number of workers wisely

# first try the serial version
purrr::walk(1:10, function(x) Sys.sleep(1))
## 10.039 sec elapsed
# then the multiprocessing via callr
furrr::future_walk(1:10, function(x) Sys.sleep(1))
## 2.383 sec elapsed

There are, in fact, some costs when using multiprocessing, and this will be compensated if you apply it on many many things. You code will run faster, with multiple CPU cores carrying out the jobs in the same time.

Leave a Comment

Your email address will not be published. Required fields are marked *