Edak
04/11/2022, 4:36 AM
datajoely
04/11/2022, 9:11 AM
...
return {
    "part/foo": lambda: pd.DataFrame({"data": [1, 2]}),
    "part/bar": lambda: pd.DataFrame({"data": [3, 4]}),
}
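To illustrate the lazy-saving idea, here is a minimal sketch of how a dictionary of callables can be consumed one partition at a time. It is not Kedro's implementation: `save_partitions`, `make_partitions` and the in-memory `store` are hypothetical stand-ins for the node and the dataset's save logic, and plain lists replace the DataFrames so the example runs without pandas.

```python
from typing import Any, Callable, Dict, List

calls: List[str] = []  # records when each partition is actually built


def make_partitions() -> Dict[str, Callable[[], Any]]:
    """Node-like function: returns callables instead of materialized data."""

    def build_foo() -> List[int]:
        calls.append("part/foo")
        return [1, 2]

    def build_bar() -> List[int]:
        calls.append("part/bar")
        return [3, 4]

    return {"part/foo": build_foo, "part/bar": build_bar}


def save_partitions(partitions: Dict[str, Any], store: Dict[str, Any]) -> None:
    """Hypothetical save loop: build and write one partition per iteration."""
    for name, data in partitions.items():
        if callable(data):
            data = data()  # the partition is only materialized here
        store[name] = data


partitions = make_partitions()
calls_before_save = list(calls)  # still empty: nothing has been built yet

store: Dict[str, Any] = {}
save_partitions(partitions, store)
```

Because each callable runs inside the loop, only one partition needs to be held in memory while it is being written, rather than all of them at once inside the node.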
- The values are callable functions; you can use a def or, as in this example, an anonymous lambda
- The dataset then saves each partition in a loop, but in a way that is much more memory efficient than doing it all within the node
Edak
04/11/2022, 3:05 PM
return {
    key: (lambda: _preprocess_partion(load_func())) for key, load_func in partitioned_input.items()
}
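One thing to watch for in a comprehension like the one above is Python's late binding of closures: each lambda looks up `load_func` when it is called, not when it is created, so every partition can end up using the last loader. Below is a minimal sketch of the issue and one common fix, binding the current value as a default argument. The body of `_preprocess_partion`, the contents of `partitioned_input` and the loaders are hypothetical stand-ins, not the thread's actual code.

```python
from typing import Callable, Dict, List


def _preprocess_partion(rows: List[int]) -> List[int]:
    """Hypothetical preprocessing step: double every value."""
    return [value * 2 for value in rows]


# Hypothetical stand-in for the node input: partition name -> load function.
partitioned_input: Dict[str, Callable[[], List[int]]] = {
    "part/foo": lambda: [1, 2],
    "part/bar": lambda: [3, 4],
}

# Late binding: every lambda reads load_func when called, so all of them
# end up using the loader from the final iteration.
late_bound = {
    key: (lambda: _preprocess_partion(load_func()))
    for key, load_func in partitioned_input.items()
}

# Default argument: each lambda keeps the loader from its own iteration.
early_bound = {
    key: (lambda load_func=load_func: _preprocess_partion(load_func()))
    for key, load_func in partitioned_input.items()
}

late_result = {key: func() for key, func in late_bound.items()}
early_result = {key: func() for key, func in early_bound.items()}
```

In this sketch `late_result` holds the processed "part/bar" data under both keys, while `early_result` holds a distinct result per partition. Using `functools.partial` or a small helper `def` that takes `load_func` as a parameter would work equally well.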
datajoely
04/11/2022, 3:07 PM
Edak
04/11/2022, 3:07 PM
datajoely
04/11/2022, 3:07 PM
Edak
04/11/2022, 3:15 PM
datajoely
04/11/2022, 3:23 PM
Zhee
06/02/2022, 7:59 AM