I think you mentioned your problem is with a big dataset the Kedro #beginners-need-help

noklam

05/24/2022, 11:04 AM

I think you mentioned your problem is with a big dataset. The problem with `pandas` is that it is memory hungry, especially during I/O and certain operations. Using the `chunk` args helps mitigate this by loading and processing only a small batch of data at a time and stitching the results together at the end. If the new dataset already iterates through the entire dataset before you start applying any transformation logic, then it doesn't help your memory problem.
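A rough sketch of that idea, in case it helps. It assumes the data is a CSV read with `pd.read_csv` and its `chunksize` parameter; the helper name `chunked_group_sum`, the column names `key` and `value`, and the toy data are all made up for illustration and are not from your pipeline.

```python
import io

import pandas as pd


def chunked_group_sum(csv_source, chunksize):
    """Sum `value` per `key`, reading the CSV `chunksize` rows at a time."""
    partials = []
    # With chunksize set, read_csv yields DataFrames lazily instead of
    # loading the whole file into memory at once.
    for chunk in pd.read_csv(csv_source, chunksize=chunksize):
        # Reduce each chunk right away so only the small partial result is kept.
        partials.append(chunk.groupby("key", as_index=False)["value"].sum())
    # Stitch the partial results together and combine them at the end.
    combined = pd.concat(partials, ignore_index=True)
    return combined.groupby("key")["value"].sum()


# Hypothetical toy data standing in for a large file on disk.
csv_text = "key,value\na,1\nb,2\na,3\nb,4\na,5\n"
totals = chunked_group_sum(io.StringIO(csv_text), chunksize=2)
print(totals)
```

The saving only shows up because each chunk is reduced inside the loop. Something like `pd.concat(pd.read_csv(path, chunksize=n))` before any transformation would pull every row into memory anyway, which is the situation described above.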
