#advanced-need-help

Rjify

04/20/2022, 7:01 PM
Hello guys, I am working on reformatting an ML project to Kedro. The project has three pipelines: data engineering, data science and prediction. Along with the main nodes for these pipelines, I also have a lot of helper/utility functions that need to be moved into Kedro somewhere. I am unsure how to structure these helper functions: should I turn them into sub-pipelines, or use them as is in the form of helper scripts? I would like to know what the Kedro standard is for this use case. TIA

datajoely

04/20/2022, 7:02 PM
So this is more of an art than a science
My view is that you should have very little business logic in your nodes
And simply call other packages within them
Happy to help you think through in more detail
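A minimal sketch of what "little business logic in your nodes" could look like. The module split and every name here (`normalise_name`, `drop_missing`, `preprocess`) are hypothetical and only for illustration; nothing below is a Kedro API, since a node function is just a plain Python function.

```python
from typing import Any, Dict, List

# helpers.py (hypothetical): existing, already-tested utility functions


def normalise_name(name: str) -> str:
    """Lower-case a column name and replace spaces with underscores."""
    return name.strip().lower().replace(" ", "_")


def drop_missing(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only the rows in which no value is None."""
    return [row for row in rows if all(v is not None for v in row.values())]


# nodes.py (hypothetical): the node only strings the helpers together


def preprocess(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Thin node function: all real logic lives in the helpers above."""
    complete = drop_missing(rows)
    return [{normalise_name(k): v for k, v in row.items()} for row in complete]
```

With this split the helpers stay importable and testable on their own, and the node stays short enough to read at a glance.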

Rjify

04/20/2022, 7:10 PM
So basically I should keep the helper scripts as is and only put the relevant logic in the nodes. Does it make sense to convert helper functions to nodes as well and have a sort of helper pipeline?

datajoely

04/20/2022, 7:10 PM
So it does depend on the complexity and contents of your helper scripts
If they're pure Python functions which don't do any IO then they're ready to use as they are
Especially if they're already tested!
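To illustrate the "no IO" distinction, here is a rough sketch with two hypothetical helpers (`total_sales_from_file` and `total_sales` are made-up names). The first reads a file itself; the second only transforms data it is handed, which is the shape that drops into a node most easily, because in Kedro loading and saving are normally left to the Data Catalog.

```python
import csv
from typing import Dict, List


def total_sales_from_file(path: str) -> float:
    """Does its own IO: it opens the file, so the caller cannot swap the data source."""
    with open(path, newline="") as handle:
        return sum(float(row["amount"]) for row in csv.DictReader(handle))


def total_sales(rows: List[Dict[str, str]]) -> float:
    """Pure: data in, result out. Loading is left to whoever calls it."""
    return sum(float(row["amount"]) for row in rows)
```

The pure version can be unit-tested with an in-memory list, for example `total_sales([{"amount": "1.5"}, {"amount": "2.5"}])`, without touching the file system.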

Rjify

04/20/2022, 7:12 PM
Yeah, they are mostly pure Python functions
and they are already tested

datajoely

04/20/2022, 7:12 PM
So then I'd focus on readability and maintainability
Kedro nodes should be simple and in general just string together logic defined in other places
So it sounds like you're in a good place
Other bits of advice: feel free to `kedro pipeline create` many single-purpose pipelines; they can be combined easily and namespaced for both your mental model and visualisation

Rjify

04/20/2022, 7:17 PM
It makes sense. Thanks for your suggestion @datajoely.

datajoely

04/20/2022, 7:17 PM
Good luck
Shout if you need a sense check

Rjify

04/26/2022, 8:09 PM
Something like this @Mirko