Is anyone having problems with adlfs + partitioned...
# advanced-need-help
Is anyone else having problems with adlfs + `PartitionedDataSet` + `ParallelRunner`? With this combination, the dataset apparently can't retrieve its partitions from blob storage. From my tests, it may be related to the asyncio calls inside the adlfs glob function.
python
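# pipeline_registry.py
from kedro.pipeline import Pipeline, node
# One node that prints the 'partitioned' input and produces no outputs.
pipelines["partitioned"] = Pipeline([node(print, 'partitioned', None)])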
yml
# catalog.yml
partitioned:
  type: PartitionedDataSet
  dataset: pandas.CSVDataSet
  path: abfs://...dfs.core.windows/...
  credentials: lab
  filename_suffix: .csv
log
[10/10/22 14:39:54] INFO     Kedro project
[10/10/22 14:39:55] INFO     Loading data from 'partitioned' (PartitionedDataSet)...
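In case it helps anyone narrow this down: a minimal sketch of a timeout wrapper for checking whether a listing call returns at all or just blocks. It uses only the standard library. `call_with_timeout` and `fake_listing` are hypothetical names of mine, and `fake_listing` is a stand-in for the real blob-storage listing call, not something from adlfs or Kedro.

```python
import threading
import time


def call_with_timeout(func, timeout_s, *args, **kwargs):
    """Run func in a daemon thread and wait at most timeout_s seconds.

    Returns (True, result) if func finished in time, (False, None) otherwise.
    Exceptions raised by func are re-raised in the caller.
    """
    outcome = {}

    def target():
        try:
            outcome["result"] = func(*args, **kwargs)
        except Exception as exc:  # keep the error so the caller can see it
            outcome["error"] = exc

    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout_s)

    if worker.is_alive():
        # Still running after the timeout: treat it as a hang.
        return False, None
    if "error" in outcome:
        raise outcome["error"]
    return True, outcome["result"]


def fake_listing(delay_s):
    """Hypothetical stand-in for a filesystem listing call."""
    time.sleep(delay_s)
    return ["part_a.csv", "part_b.csv"]


if __name__ == "__main__":
    finished, partitions = call_with_timeout(fake_listing, 1.0, 0.01)
    print("finished:", finished, "partitions:", partitions)
```

Calling the same wrapper around the real listing call, once in the main process and once from inside a node, would show whether the call only blocks when it runs under the parallel runner. That is an assumption about where the problem sits, not something I have confirmed.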