user (03/12/2022, 6:30 PM):
In the register_flow.py script, the datasets in the catalog object are named as in the file. However, in the nodes the input and output datasets are namespaced. When running that flow, it therefore creates only MemoryDataSets, because it assumes none of the datasets exist in the catalog. If I change register_flow.py so that it does not create MemoryDataSets for everything, the run_node function no longer works: the node input names and the catalog names don't match up, so the save/load calls fail (it tries to load a namespaced dataset that it can't find in the catalog). Is there a way to obtain either a namespaced catalog, or a pipeline object whose node inputs/outputs are not namespaced, so that run_node works properly? 🙂
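The mismatch described in the question can be sketched without Kedro itself. Assuming (as with Kedro's modular pipelines) that a namespace simply prefixes every node input/output with `namespace.`, a catalog keyed by the bare file-level names no longer matches, so every lookup falls through to a MemoryDataSet. The helper names below are illustrative, not Kedro API:

```python
# Illustrative sketch (no Kedro dependency): a pipeline namespace prefixes
# every node input/output, so lookups against a catalog keyed by the bare
# dataset names miss, and the runner falls back to MemoryDataSets.

def apply_namespace(name: str, namespace: str) -> str:
    """Mimic how namespacing renames a dataset: 'x' -> 'namespace.x'."""
    return f"{namespace}.{name}"

def strip_namespace(name: str, namespace: str) -> str:
    """One possible workaround: map a namespaced node name back to the
    bare catalog name before load/save."""
    prefix = f"{namespace}."
    return name[len(prefix):] if name.startswith(prefix) else name

# Catalog keys as they appear in catalog.yml (no namespace):
catalog_keys = {"preprocessed_companies", "preprocessed_shuttles"}

node_input = apply_namespace("preprocessed_companies", "data_processing")
assert node_input == "data_processing.preprocessed_companies"
assert node_input not in catalog_keys  # mismatch -> MemoryDataSet fallback
assert strip_namespace(node_input, "data_processing") in catalog_keys
```

Stripping the prefix before catalog lookup is only one side of the coin; the alternative is to register the catalog entries under their namespaced names.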

datajoely (03/13/2022, 11:56 AM):

user (03/13/2022, 6:56 PM):
(with 0.17.6 it all works)
The catalog for the spaceflights starter for version 0.17.7 does not include the additional data_science and data_processing namespaces, e.g. preprocessed_companies instead of data_processing.preprocessed_companies. Therefore, only MemoryDataSets are ever used when running either the Prefect workflow or a normal kedro run. Is that how it is supposed to be? This layering could get a bit intricate.
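One way to reconcile the two sides (a sketch of the idea, not the starter's actual fix) is to re-key the catalog entries under their namespaced names before registration, so persisted datasets are found instead of defaulting to MemoryDataSets. `namespace_catalog` below is a hypothetical helper, not Kedro API, and the entry shown is just an example catalog.yml-style definition:

```python
# Hypothetical helper (not Kedro API): re-key catalog entries under their
# namespaced names so that namespaced node inputs/outputs resolve to the
# persisted datasets instead of defaulting to MemoryDataSets.

def namespace_catalog(entries: dict, namespace: str) -> dict:
    """Return a copy of `entries` with every key prefixed by `namespace.`."""
    return {f"{namespace}.{key}": value for key, value in entries.items()}

# Example entry, mirroring what a catalog.yml definition might contain:
catalog = {
    "preprocessed_companies": {
        "type": "pandas.ParquetDataSet",
        "filepath": "data/02_intermediate/preprocessed_companies.pq",
    },
}

namespaced = namespace_catalog(catalog, "data_processing")
assert "data_processing.preprocessed_companies" in namespaced
```

With the keys prefixed this way, a node asking for data_processing.preprocessed_companies finds a real entry rather than silently getting an in-memory one.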

datajoely (03/14/2022, 9:17 AM):

antony.milne (03/14/2022, 9:59 AM):

Flow (03/14/2022, 10:03 AM):

datajoely (03/14/2022, 10:05 AM):

Flow (03/14/2022, 10:06 AM):

datajoely (03/14/2022, 10:40 AM):

Flow (03/14/2022, 12:28 PM):

avan-sh (03/14/2022, 1:46 PM):