antony.milne
11/10/2021, 2:29 PM
_validate_catalog explains a bit what's going on here:
Ensure that all data sets are serializable and that we do not have non proxied memory data sets being used as outputs as their content will not be synchronized across threads.
The second part about memory datasets is what's relevant here. As Joel said, the default for the parallel runner is that _SharedMemoryDataset is used rather than MemoryDataSet (see ParallelRunner.create_default_data_set for where this happens).
In theory you could specify this dataset type explicitly in the catalog, but the fact that it's private means that's probably not a good idea, and I've never seen anyone do so. Just don't define the memory datasets in the catalog and they will default to _SharedMemoryDataset and everything should work ok 🙂
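To make the "non proxied ... will not be synchronized" point concrete, here's a minimal standard-library sketch. It is not Kedro's code, just an analogy: a plain in-memory object written to by a worker process only changes the worker's copy, while an object proxied through a multiprocessing manager is visible to the parent. The names write_output and run_demo are made up for illustration, and the sketch assumes a POSIX platform where the "fork" start method is available.

```python
import multiprocessing as mp


def write_output(store):
    # Runs in a child process and writes a result into the given store.
    store["output"] = "from child"


def run_demo():
    # Assumption: "fork" is available (Linux/macOS); it is not on Windows.
    ctx = mp.get_context("fork")

    # Plain dict: the child mutates its own copy, the parent never sees it.
    plain = {}
    worker = ctx.Process(target=write_output, args=(plain,))
    worker.start()
    worker.join()

    # Proxied dict: writes are forwarded to the manager process,
    # so the parent can read them back afterwards.
    with ctx.Manager() as manager:
        proxied = manager.dict()
        worker = ctx.Process(target=write_output, args=(proxied,))
        worker.start()
        worker.join()
        proxied_result = dict(proxied)

    return plain, proxied_result


if __name__ == "__main__":
    plain, proxied_result = run_demo()
    print("plain:", plain)
    print("proxied:", proxied_result)
```

In this sketch the plain dict stays empty in the parent while the proxied one holds the child's write, which is the kind of problem the catalog validation is guarding against when a plain memory dataset is used as an output under the parallel runner.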