# beginners-need-help
Does the new catalog lazy loading somehow hit the describe functionality of a dataset now, prior to the dataset being initialized?
I have the following custom dataset:
I’m not entirely sure what you’re looking to achieve
Also you can use the regular SQL datasets in Kedro with snowflake as long as you install
ya still trying to type everything out
The issue is that the _describe function is being hit during initialization, before self.sql = sql has run.
And is the problem speed or something
No, it's that the run fails with a catalog load error because the dataset cannot be instantiated.
Oh got you
I mean you can return an empty dictionary in the describe method
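A minimal sketch of a variation on that suggestion: instead of always returning an empty dictionary, `_describe` falls back to a default when the attribute is not set yet. `EagerBase` and `TolerantQueryDataSet` are hypothetical stand-ins written for this illustration, not Kedro classes; the only assumption is that some parent code ends up calling `_describe` before the subclass has assigned its attributes.

```python
from typing import Any, Dict


class EagerBase:
    """Hypothetical stand-in (not Kedro code) for a base class whose
    constructor renders the instance as a string, which calls _describe()."""

    def __init__(self) -> None:
        self.summary = str(self)

    def __str__(self) -> str:
        return f"{type(self).__name__}({self._describe()})"

    def _describe(self) -> Dict[str, Any]:
        raise NotImplementedError


class TolerantQueryDataSet(EagerBase):
    def __init__(self, sql: str) -> None:
        super().__init__()  # _describe() runs here, before self.sql exists
        self.sql = sql

    def _describe(self) -> Dict[str, Any]:
        # getattr with a default avoids AttributeError on a half-built instance
        return {"sql": getattr(self, "sql", None)}


ds = TolerantQueryDataSet("SELECT 1")
print(ds.summary)      # description captured while sql was still unset
print(ds._describe())  # description after construction finished
```

This keeps the dataset constructible, but it hides the ordering problem rather than fixing it, so the early description is incomplete.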
But I would encourage you to use the regular pandas.SQLQueryDataSet with the Snowflake engine library, as it’s tested
@User we are basically doing what you are saying, just managing credentials inside a singleton in another package. Regardless, I am baffled as to why _describe is being hit prior to __init__, or even how that works.
Yeah it’s happening somewhere as the catalog is built up
Also, note this worked until upgrading from 17.2 to 17.5.
I need to finish for the day - but I will look into this first thing
Sounds good! Thanks for the help @User!
I found the problem. super() is hitting the AbstractDataSet class, which now has the following definition for
which calls self.describe() and is hit upon creation.
This seems a bit dangerous to me, given that AbstractDataSet is kinda designed to be inherited the same way my dataset does it. Anyone who handles the input params the way I did (super() first, then self.x being assigned) will hit this same error.
Hi @User ! AbstractDataSet doesn't have a constructor, so when extending it you don't need to call
. I assume you're trying to pass these to
, does that call super on
or sth? I'm not sure why you'd hit
before anything else happens, can you show us your Snowflake class?
Or I'd also make use of
for debugging when using multiple inheritance like here.
@User, I am not entirely following what you are saying. From what I can tell, a custom dataset has to inherit from the abstract class so that the catalog can build it; if you remove AbstractDataSet from the inheritance, it no longer functions. __str__ is called, like __init__, during the instantiation of the class. So what is happening is that my class defines _describe with an attribute that is not defined yet, because it is assigned after the super() call. Simply moving super() down resolves it. I suppose a viable solution would be to focus super() on the Snowflake class, but again I suspect a lot of people will run into this same issue with how it is currently set up.
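The ordering issue described above can be reproduced in isolation. This is only a sketch under assumptions: `EagerBase`, `SuperFirst` and `SuperLast` are hypothetical names, and `EagerBase` simply mimics a parent whose constructor triggers `__str__` (and through it `_describe`). It is not the actual AbstractDataSet source.

```python
from typing import Any, Dict, Optional


class EagerBase:
    """Hypothetical parent (not Kedro code): building the instance
    stringifies it, and __str__ delegates to _describe()."""

    def __init__(self) -> None:
        self.summary = str(self)

    def __str__(self) -> str:
        return f"{type(self).__name__}({self._describe()})"

    def _describe(self) -> Dict[str, Any]:
        raise NotImplementedError


class SuperFirst(EagerBase):
    def __init__(self, sql: str) -> None:
        super().__init__()  # _describe() is reached before self.sql is set
        self.sql = sql

    def _describe(self) -> Dict[str, Any]:
        return {"sql": self.sql}


class SuperLast(EagerBase):
    def __init__(self, sql: str) -> None:
        self.sql = sql      # assign everything _describe() needs first
        super().__init__()  # now the parent can safely describe the instance

    def _describe(self) -> Dict[str, Any]:
        return {"sql": self.sql}


super_first_error: Optional[AttributeError] = None
try:
    SuperFirst("SELECT 1")
except AttributeError as exc:
    super_first_error = exc  # instantiation fails because sql is missing

fixed = SuperLast("SELECT 1")
print(super_first_error)
print(fixed.summary)
```

Moving the `super().__init__()` call below the attribute assignments is the same change that resolved the problem in the message above.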
Can we see the
class to make sense of what's going on?
@User There is more to these classes, but I have shortened them to what you would care about.
Thanks @User we will get back to you shortly