#beginners-need-help
Rroger

11/28/2021, 2:46 AM
What are the practical differences between the `ThreadRunner` and the `ParallelRunner`?
1. I know that the `ParallelRunner` won't take lambda functions. In that case, how should I deal with nodes that use identity lambda functions (`lambda x: x`)?
2. I tried running with `ThreadRunner`; it did run the nodes at the same time and shortened the end-to-end runtime. Are there situations in which I shouldn't use `ThreadRunner`?
datajoely

11/28/2021, 6:57 PM
Python (CPython) doesn't have true parallel multi-threading because of the GIL, so ParallelRunner runs isolated branches as separate processes. ThreadRunner is designed for Spark jobs, where the execution happens via an API call outside of the Python world. You can read more here: https://kedro.readthedocs.io/en/latest/11_tools_integration/01_pyspark.html
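On the lambda question: running nodes in separate processes generally means the node functions have to be serialized with `pickle`, and that is most likely why a lambda gets rejected. The sketch below uses only the standard library, not Kedro itself, to show the difference. The names `identity` and `can_pickle` are made up for illustration, and the assumption is that swapping `lambda x: x` for a named module-level function is enough for your pipeline.

```python
import pickle


def identity(x):
    """Named, module-level stand-in for `lambda x: x` (hypothetical helper)."""
    return x


def can_pickle(obj) -> bool:
    """Return True if the standard-library pickler can serialize `obj`."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A lambda has no importable name, so pickle cannot store a reference to it.
lambda_ok = can_pickle(lambda x: x)

# A function defined at module level is pickled by its qualified name.
named_ok = can_pickle(identity)

print(f"lambda picklable: {lambda_ok}")
print(f"named function picklable: {named_ok}")
```

If that assumption holds, defining `identity` once in a regular module of your project and passing it to the node in place of the lambda should let the same pipeline run under either runner.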