#beginners-need-help

LightMiner

04/21/2022, 7:52 PM
Hi everyone, I've been using Kedro for a little while and have followed the DataEngineerOne videos. I have a question about programmatically adding datasets. For one of my projects I have a hierarchy of files that grows in a structured way: recordings are added for each new subject (data/sub01/recordings.txt). In one of the DataEngineerOne videos he does this by changing the ProjectContext class in the run.py file, but it seems that in recent versions this file no longer exists.

https://www.youtube.com/watch?v=CIRVpMqWEIs

I want to be able to create the datasets automatically from params, and create the corresponding nodes from params. I was thinking of 4 solutions:
1. Find the equivalent of the ProjectContext class. I've been wondering whether this class is still used in a new file, or whether there is an equivalent in the new version of Kedro.
2. Use Jinja2 in the catalog. If I use Jinja2, I'm not sure how I could load the parameters to iterate over them and create the catalog entries.
3. Create a custom dataset class, but then I'm not sure how to return a dictionary of callables like PartitionedDataSet does.
4. Use hooks, as proposed in a past question, but honestly I have never used them.
Which solution is the best? Or is there another, simpler one?

datajoely

04/21/2022, 8:09 PM
To me this feels like a job for a subclass of PartitionedDataSet that handles multiple directories
Happy to help you through it
But in theory you can steal most of the logic and tweak for your situation

LightMiner

04/21/2022, 9:08 PM
So I should create a class that inherits from PartitionedDataSet and overrides the load and save methods?

datajoely

04/21/2022, 9:22 PM
Exactly, because (and correct me if I'm wrong) the only difference is that you are dealing with multiple directories rather than a single one, right?
If so steal and tweak!

LightMiner

04/21/2022, 10:33 PM
Yes, exactly. I'll dig into the code of PartitionedDataSet and try to return a two-level dictionary from the load function. Thanks!
Here is the hierarchicalDataset code in case it helps someone. It's quite sloppy Python, but it works. Thanks once again!
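
A minimal sketch of the idea discussed in this thread, not LightMiner's actual code and not built on Kedro's API. It only illustrates the "two-level dictionary of callables" pattern in plain Python: the outer keys are subject directories, the inner keys are file stems, and each value is a function that reads the file only when called. The class name `HierarchicalLoader`, the helper `_make_loader`, and the `*.txt` pattern are assumptions made for illustration, based on the `data/sub01/recordings.txt` layout mentioned above.

```python
from pathlib import Path
from typing import Callable, Dict


class HierarchicalLoader:
    """Hypothetical stand-in for a dataset class; plain Python, no Kedro."""

    def __init__(self, root: str, pattern: str = "*.txt") -> None:
        self._root = Path(root)      # e.g. "data"
        self._pattern = pattern      # which files to pick up in each subject dir

    @staticmethod
    def _make_loader(path: Path) -> Callable[[], str]:
        # Bind the path now, but read the file only when the callable is invoked.
        def _load() -> str:
            return path.read_text()

        return _load

    def load(self) -> Dict[str, Dict[str, Callable[[], str]]]:
        # Outer level: one entry per subject directory (sub01, sub02, ...).
        # Inner level: one lazy loader per matching file in that directory.
        result: Dict[str, Dict[str, Callable[[], str]]] = {}
        for subject_dir in sorted(p for p in self._root.iterdir() if p.is_dir()):
            files = {
                f.stem: self._make_loader(f)
                for f in sorted(subject_dir.glob(self._pattern))
            }
            if files:
                result[subject_dir.name] = files
        return result
```

With this shape, code that receives the dictionary can loop over subjects and call, for example, `loaded["sub01"]["recordings"]()` only when it needs the contents, so directories for new subjects are picked up the next time `load()` runs. In a real Kedro project the same logic would presumably live in the load method of a custom dataset, as suggested above.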