noklam
05/24/2022, 11:00 AM
noklam
05/24/2022, 11:02 AM
map, but a for loop is fine too.
noklam
05/24/2022, 11:04 AM
pandas is that it is memory hungry, especially during I/O and certain operations. Using the chunk args helps mitigate this problem by loading and processing only a small batch of data at a time and stitching the results together at the end.
If the new dataset already iterates through the entire dataset before you start applying any transformation logic, then it doesn't help your memory problem.
Mackson
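A minimal sketch of the chunked approach described above, assuming the "chunk args" refers to the chunksize argument of pandas.read_csv. The in-memory CSV, the double_value transformation and the process_in_chunks helper are hypothetical names used only for illustration.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for a large file on disk.
CSV_TEXT = "id,value\n1,10\n2,20\n3,30\n4,40\n5,50\n"


def double_value(chunk: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical per-chunk transformation."""
    out = chunk.copy()
    out["value"] = out["value"] * 2
    return out


def process_in_chunks(source, chunksize: int) -> pd.DataFrame:
    # With chunksize set, read_csv returns an iterator that yields DataFrames
    # of at most `chunksize` rows instead of loading everything at once.
    reader = pd.read_csv(source, chunksize=chunksize)
    processed = [double_value(chunk) for chunk in reader]
    # Stitch the processed batches back together at the end.
    return pd.concat(processed, ignore_index=True)


result = process_in_chunks(io.StringIO(CSV_TEXT), chunksize=2)
```

Note that the processed chunks are still collected in a list here, so the saving only matters when each chunk shrinks during processing (filtering, aggregating) or is written out instead of being kept.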
05/24/2022, 11:08 AM
noklam
05/24/2022, 11:18 AM
datajoely
05/24/2022, 11:22 AM
Mackson
05/24/2022, 11:24 AM
datajoely
05/24/2022, 11:25 AM
datajoely
05/24/2022, 11:25 AM
noklam
05/24/2022, 11:29 AM
Mackson
05/24/2022, 11:30 AM
noklam
05/24/2022, 11:31 AM
Mackson
05/24/2022, 11:32 AM
datajoely
05/24/2022, 2:32 PM
noklam
05/24/2022, 2:44 PM
datajoely
05/24/2022, 2:46 PM
noklam
05/24/2022, 2:54 PM
Lazy2PickName
05/25/2022, 2:01 PM
node(
    func=foo,
    inputs=['input', 'params:parameter'],
    outputs='output'
),
Is there something similar I can do to pass a credential from the credentials.yml to my node?
datajoely
05/25/2022, 2:03 PM
parameters.yml in your local folder, but typically your nodes should focus on data flow, not IO
Lazy2PickName
05/25/2022, 2:04 PM
waylonwalker
05/25/2022, 4:06 PM
node(lambda *frames: pd.concat(frames), ["cars", "cars"], "two_cars")
How would you concatenate pandas dataframes?
antony.milne
05/25/2022, 4:17 PM
hello_world
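To see what the callable in that node does on its own, here is a small pandas-only sketch. The cars frame and the concat_frames name are made up for illustration; in the node above the inputs would come from the catalog rather than being built by hand.

```python
import pandas as pd

# Same callable as in the node above, given a name so it can be called directly.
# *frames gathers all positional inputs into a tuple, which pd.concat accepts.
concat_frames = lambda *frames: pd.concat(frames)

# Hypothetical stand-in for the "cars" dataset.
cars = pd.DataFrame({"model": ["a", "b"], "mpg": [21.0, 30.5]})

# Passing the same frame twice mirrors the ["cars", "cars"] inputs.
two_cars = concat_frames(cars, cars)
```

The original row labels are kept, so the index repeats; pass ignore_index=True to pd.concat if a fresh index is wanted.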
05/26/2022, 3:39 PM
datajoely
05/26/2022, 3:41 PM
from kedro.io.data_catalog import DataSetError before you do this
datajoely
05/26/2022, 3:42 PM
except DataSetError rather than the full classpath
hello_world
05/26/2022, 3:43 PM
JA_next
05/26/2022, 10:49 PM
datajoely
05/27/2022, 9:30 AM
pickle.PickleDataSet
https://kedro.readthedocs.io/en/stable/data/data_catalog.html
noklam
05/27/2022, 9:37 AM
datajoely
05/27/2022, 10:23 AM
datajoely
05/27/2022, 10:23 AM