What are the practical differences between the `Th...
# beginners-need-help
r
What are the practical differences between the `ThreadRunner` and `ParallelRunner`? 1. I know that the `ParallelRunner` won't take lambda functions. In this case, how do I deal with nodes that use identity lambda functions (`lambda x: x`)? 2. I tried running with `ThreadRunner` and it did run the nodes at the same time and shortened the end-to-end runtime. Are there situations in which I shouldn't use `ThreadRunner`?
d
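Python doesn't have true multi-threading because of the GIL (Global Interpreter Lock): threads in one process can't execute Python bytecode in parallel. So ParallelRunner runs isolated branches as separate processes instead. ThreadRunner is designed for Spark jobs, where the execution happens via an API call outside of the Python world, so the threads mostly wait rather than compete for the GIL. You can read more here: https://kedro.readthedocs.io/en/latest/11_tools_integration/01_pyspark.html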
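On the lambda question: since ParallelRunner uses separate processes, node functions have to be picklable, and the standard `pickle` module can't serialize lambdas. A rough sketch of the usual workaround is to swap `lambda x: x` for a named, module-level function. The names `identity` and `can_pickle` below are made up for illustration and are not part of Kedro:

```python
import pickle


def identity(x):
    """Named, module-level stand-in for `lambda x: x` (hypothetical helper)."""
    return x


def can_pickle(obj):
    """Return True if the standard pickle module can serialize `obj`."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    # A lambda has no importable name, so pickling it fails.
    print("lambda picklable:", can_pickle(lambda x: x))
    # A module-level function is pickled by reference to its name.
    print("identity picklable:", can_pickle(identity))
```

You would then pass `identity` to the node in place of the lambda. ThreadRunner doesn't need this because threads share memory and nothing gets pickled.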