#beginners-need-help
j c h a r l e s

12/30/2021, 10:07 AM
Creating pipelines with dynamic inputs has been slightly less intuitive than I expected, although I have not used tools like kedro before. I am hoping that this set-up is a one-time cost that I will not have to incur again. I have also noticed that kedro's own validation of input types gives cryptic errors. I think if you pass a list of pipeline objects by accident, it gives the error "list object has no attribute filter"; a potentially more useful error would be something like "ValueError, expected a pipeline object and received list". Another error I found was when I tried to pass literal values to a node function: the error was something like "cannot split", and could be improved to "ValueError: inputs are not allowed to contain literal values (like integers). Please use functools.partial to create a node function with the desired literal argument specified". These errors are very hard to debug because they are thrown deep in the kedro library code. Better validation errors could save a lot of hassle.
datajoely

12/30/2021, 10:23 AM
Hello - I've created a thread since there are a few things here
10:24 AM
1. Firstly, to install Kedro on Python 3.9 (or 3.10, for that matter) you can follow these steps: https://github.com/quantumblacklabs/kedro/discussions/1117#discussioncomment-1822667
10:25 AM
2. Since it's a work in progress, the Kedro team can't support users on the develop branch; only released versions of Kedro are supported
10:25 AM
3. Office hours are a great idea - would you create a GitHub issue asking for this? If enough people react/comment we could set this up
10:26 AM
4. I'd argue that the error messages are descriptive of what's going on, but it's hard to debug from here. Could you share a repo?
10:27 AM
ValueError, expected a pipeline object and received list
If it's like this [Pipeline(), Pipeline(), Pipeline()] you may need to sum() the list or wrap it in another Pipeline()
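A rough illustration of why sum() helps here, using a toy stand-in class rather than Kedro's real Pipeline (ToyPipeline and the node names are made up for this sketch). Python's sum() starts from the integer 0, so it only works on objects that define both __add__ and __radd__; the advice above suggests Kedro's Pipeline can be combined the same way. The last lines reproduce the kind of AttributeError that appears when code expects a pipeline method on a plain list.

```python
class ToyPipeline:
    """Toy stand-in for a pipeline: just an ordered list of node names.

    This is NOT Kedro's Pipeline class; it only mimics the addition behaviour.
    """

    def __init__(self, nodes=()):
        self.nodes = list(nodes)

    def __add__(self, other):
        # pipeline + pipeline -> a new pipeline holding the nodes of both
        if not isinstance(other, ToyPipeline):
            return NotImplemented
        return ToyPipeline(self.nodes + other.nodes)

    def __radd__(self, other):
        # sum() evaluates 0 + first_item, so 0 + pipeline must return the pipeline
        if isinstance(other, int) and other == 0:
            return self
        return NotImplemented


pipelines = [ToyPipeline(["clean"]), ToyPipeline(["train"]), ToyPipeline(["report"])]

# Collapse the list into a single pipeline object before handing it on.
combined = sum(pipelines)
print(combined.nodes)

# Passing the raw list where a pipeline is expected fails only once
# something tries to call a pipeline method on it.
try:
    pipelines.filter()
except AttributeError as err:
    print(err)
```

With real Kedro objects the equivalent would be sum([Pipeline(...), Pipeline(...)]) or Pipeline([Pipeline(...), Pipeline(...)]), as suggested above.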
10:29 AM
ValueError: inputs are not allowed to contain literal values (like integers). Please use functools.partial to create a node function with the desired literal argument specified.
Again, inputs need to refer to catalog entries or parameter keys; you can't provide a regular Python object without partially applying the function first
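A minimal sketch of partially applying a function so that a literal never has to appear in inputs. The names add_offset, add_ten and the dataset names in the closing comment are hypothetical; only functools.partial and functools.update_wrapper are real standard-library calls.

```python
from functools import partial, update_wrapper


def add_offset(values, offset):
    """Add a constant offset to every value in a list."""
    return [v + offset for v in values]


# Bake the literal into the function instead of listing it as a node input.
add_ten = partial(add_offset, offset=10)

# partial objects have no __name__ of their own; copying the metadata over
# gives anything that logs the function name something readable to show.
update_wrapper(add_ten, add_offset)

result = add_ten([1, 2, 3])
print(add_ten.__name__, result)

# In a Kedro project this could then be wired up roughly as
#   node(add_ten, inputs="raw_values", outputs="shifted_values")
# where both dataset names are hypothetical catalog entries.
```

The node now only takes dataset names as inputs, and the literal 10 lives inside the function.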
j c h a r l e s

12/30/2021, 11:30 AM
I think that if we paired on the error messages you would probably agree that they are fairly generic and very hard to trace, as the exceptions are thrown from deep within the core kedro library. I probably just didn't do a great job of communicating that.
11:32 AM
I do understand the solutions to #4 now, after a bit of difficult debugging. A better error message would increase adoption and drive usage of this software. Instead of spending 45+ minutes debugging, new users would need about 15 seconds.
datajoely

12/30/2021, 11:32 AM
What sort of error message would have been helpful?
j c h a r l e s

12/30/2021, 11:32 AM
I am happy to create a GitHub issue for #3
11:32 AM
A validation error for each case
datajoely

12/30/2021, 11:33 AM
What do you mean by that? Isn’t that a ValueError?
j c h a r l e s

12/30/2021, 11:33 AM
Yes
11:33 AM
I'd try to throw a ValueError that lets the user know that the input type is incorrect
11:34 AM
For this case:
If it's like this [Pipeline(), Pipeline(), Pipeline()]  you may need to sum() the list or wrap it in another Pipeline()
11:34 AM
And also for passing literal values as inputs/outputs, i.e. values that are not pandas DataFrame-type objects
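One way the two validation errors proposed here could look, as a sketch only. Nothing below is Kedro code: the Pipeline class is an empty stand-in, and check_is_pipeline and check_node_inputs are hypothetical helpers. The input check assumes that node inputs may be None, a string, a list of strings, or a dict whose values are strings.

```python
class Pipeline:
    """Empty stand-in so the sketch runs without Kedro installed."""


def check_is_pipeline(obj, pipeline_type=Pipeline):
    """Raise a descriptive ValueError unless obj is a pipeline object."""
    if not isinstance(obj, pipeline_type):
        raise ValueError(
            f"expected a {pipeline_type.__name__} object "
            f"and received {type(obj).__name__}"
        )
    return obj


def check_node_inputs(inputs):
    """Raise a descriptive ValueError if inputs hold anything but names."""
    if inputs is None:
        names = []
    elif isinstance(inputs, str):
        names = [inputs]
    elif isinstance(inputs, dict):
        names = list(inputs.values())
    elif isinstance(inputs, (list, tuple)):
        names = list(inputs)
    else:
        names = [inputs]

    # Anything that is not a string cannot be a catalog entry or parameter key.
    literals = [n for n in names if not isinstance(n, str)]
    if literals:
        raise ValueError(
            f"inputs are not allowed to contain literal values (got {literals!r}). "
            "Please use functools.partial to create a node function with the "
            "desired literal argument specified"
        )
    return inputs


# Both mistakes from this thread now fail early with a readable message.
for bad_call in (
    lambda: check_is_pipeline([Pipeline(), Pipeline()]),
    lambda: check_node_inputs(["my_dataset", 3]),
):
    try:
        bad_call()
    except ValueError as err:
        print(err)
```

Checking at the point where the user's object enters the library keeps the traceback close to the user's own code.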
datajoely

12/30/2021, 11:36 AM
I’m not entirely convinced by the pipeline suggestion, but I think we can do a better job with the second one
11:36 AM
It’s jumping to a complex solution without first suggesting the most common one
11:37 AM
I’ll raise this with the team in the new year no doubt
j c h a r l e s

12/30/2021, 11:37 AM
What do you mean by jumping to a complex solution?
datajoely

12/30/2021, 11:38 AM
Partially applying a function is something we don’t really document, and we have plans to make it neater in 0.18.0
j c h a r l e s

12/30/2021, 11:38 AM
oh yeah
11:38 AM
No worries about that
11:38 AM
I'm happy that I have something that's working, honestly
11:38 AM
I still think the error message is important to patch, though
datajoely

12/30/2021, 11:39 AM
The pipeline error can happen so many different ways we may want to keep that generic
j c h a r l e s

12/30/2021, 11:39 AM
"list object has no method filter" getting thrown deep within an unknown library is fairly daunting to see
datajoely

12/30/2021, 11:39 AM
That one we can deffo do a better job on
j c h a r l e s

12/30/2021, 11:39 AM
I saw that and was like "#$%#", ... I have no idea what caused this
datajoely

12/30/2021, 11:39 AM
That should say expected a pipeline object
j c h a r l e s

12/30/2021, 11:39 AM
yeah
11:39 AM
for sure
datajoely

12/30/2021, 11:39 AM
Agreed there
11:40 AM
Yeah happy for you to raise an issue for that one and the partial error. If you’re feeling brave - you could even raise the PR into the main library!
j c h a r l e s

12/30/2021, 11:40 AM
I might if I have some extra time
datajoely

12/30/2021, 11:40 AM
It will be towards the bottom of the backlog in the short term
11:40 AM
Since it’s not broken, it just could be improved
j c h a r l e s

12/30/2021, 11:40 AM
Is there any way to get trained up on any of this? I'm already actively using develop and am fairly able to satisfy my use case
datajoely

12/30/2021, 11:41 AM
We can’t support the develop branch as it’s subject to change
11:41 AM
To run on 3.9 I would follow the instructions to install the released version but override the version lock
j c h a r l e s

12/30/2021, 11:41 AM
Would that be where I'd be making PRs, if I were to suggest improvements?
11:42 AM
to run kedro-viz?
datajoely

12/30/2021, 11:42 AM
No, you PR to main
11:42 AM
There is a contribution guide
11:42 AM
For viz I’d suggest raising an issue on that repo
j c h a r l e s

12/30/2021, 11:43 AM
Okay. Will raise an issue for 3.9 on kedro-viz and one for office hours on the main kedro git repo, and optionally a couple of PRs to the main branch to add validation for the pipeline input
j c h a r l e s

12/30/2021, 11:43 AM
Yeah, thanks for the responses as usual 💯
datajoely

12/30/2021, 11:44 AM
No problem and good luck
j c h a r l e s

12/30/2021, 11:44 AM
Will link those here as I have them
datajoely

12/30/2021, 11:44 AM
We get notifications on slack whenever these things come through so no need 😊
j c h a r l e s

12/30/2021, 11:45 AM
Maybe for others who eventually follow this thread, could be useful?
datajoely

12/30/2021, 11:45 AM
Sure thing
j c h a r l e s

12/30/2021, 11:45 AM
Definitely has helped me to read through threads here between you and others etc
datajoely

12/30/2021, 11:45 AM
That’s good to know - I’m never sure if people actually do that!
j c h a r l e s

12/30/2021, 11:46 AM
I do that a lot, was searching through this discord a lot tonight to double check my intuition about a bunch of different topics