noklam
05/24/2022, 11:04 AM
pandas is memory hungry, especially during I/O and certain operations. Using the chunk args helps to mitigate this problem by loading and processing only a small batch of data at a time and stitching the results together at the end.
If the new dataset already iterates through the entire dataset before you start applying any transformation logic, then chunking doesn't help with your memory problem.
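A rough sketch of what I mean, assuming the chunk arg in question is `chunksize` on `pd.read_csv`. The in-memory CSV and the `process` function are made up for illustration; swap in your own file and transformation logic.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for a large file on disk.
csv_data = io.StringIO("value\n" + "\n".join(str(i) for i in range(10)))


def process(chunk: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical per-chunk transformation: keep even values only."""
    return chunk[chunk["value"] % 2 == 0]


# With chunksize set, read_csv returns an iterator of DataFrames
# instead of loading the whole file at once.
reader = pd.read_csv(csv_data, chunksize=4)

# Transform each chunk as it is read, so only one raw chunk is in
# memory at a time, then stitch the (smaller) results together.
processed = [process(chunk) for chunk in reader]
result = pd.concat(processed, ignore_index=True)

# Anti-pattern: collecting every raw chunk before transforming, e.g.
# list(reader), pulls the entire dataset into memory and defeats the purpose.
```

The saving only shows up when the per-chunk result is smaller than the raw chunk (filtering, aggregating, selecting columns). If the transformation keeps everything, the final concat still needs the full data in memory.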