#beginners-need-help
WolVez

09/23/2021, 5:18 PM
Does the new catalog lazy loading somehow hit the describe functionality of a dataset now, prior to the dataset being initialized?
5:19 PM
I have the following custom dataset:
datajoely

09/23/2021, 5:21 PM
I’m not entirely sure what you’re looking to achieve
5:22 PM
Also you can use the regular SQL datasets in Kedro with snowflake as long as you install https://pypi.org/project/snowflake-sqlalchemy/
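A minimal sketch of that setup, assuming the `snowflake://` URL form used by snowflake-sqlalchemy. The helper name `snowflake_url` and every account value are hypothetical placeholders, not anything from this thread. The resulting string is what would sit under the `con` key of the credentials handed to Kedro's SQL datasets.

```python
from urllib.parse import quote_plus


def snowflake_url(user, password, account, database, schema, warehouse):
    """Build a SQLAlchemy URL for the snowflake-sqlalchemy dialect.

    Hypothetical helper for illustration; every argument is a placeholder.
    """
    # Quote user and password so characters such as "@" or "/" stay valid in a URL.
    return (
        f"snowflake://{quote_plus(user)}:{quote_plus(password)}"
        f"@{account}/{database}/{schema}?warehouse={warehouse}"
    )


# Kedro's SQL datasets expect the connection string under the "con" key.
credentials = {
    "con": snowflake_url("my_user", "p@ss/word", "my_account", "MY_DB", "PUBLIC", "MY_WH")
}
```

In a real project the values would come from credentials.yml rather than being hard-coded.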
WolVez

09/23/2021, 5:22 PM
ya still trying to type everything out
5:22 PM
the issue is the _describe function is being hit during the initialization prior to the self.sql = sql
datajoely

09/23/2021, 5:22 PM
And is the problem speed or something
WolVez

09/23/2021, 5:23 PM
No, it's that the run fails with a catalog load error because the dataset cannot be instantiated.
datajoely

09/23/2021, 5:23 PM
Oh got you
5:23 PM
I mean you can return an empty dictionary in the describe method
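One way to sketch that workaround is to have `_describe` tolerate a half-built instance. This uses a plain stand-in class rather than Kedro's base class, and `getattr` with a default is a variation on the "empty dictionary" suggestion, not something stated in the thread.

```python
class QueryDataSetSketch:
    """Stand-in for a custom dataset; not Kedro's AbstractDataSet."""

    def __init__(self, sql):
        self.sql = sql

    def _describe(self):
        # getattr with a default avoids an AttributeError if this runs
        # before self.sql has been assigned.
        return dict(sql=getattr(self, "sql", None))


# Simulate a describe call on an instance whose __init__ has not run yet.
half_built = QueryDataSetSketch.__new__(QueryDataSetSketch)
early = half_built._describe()
late = QueryDataSetSketch("SELECT 1")._describe()
```

This hides the symptom rather than the ordering issue, so it is best treated as a stopgap.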
5:24 PM
But I would encourage you to use the regular pandas.SQLQueryDataSet with the snowflake engine library as it’s tested
WolVez

09/23/2021, 5:33 PM
@User we are basically doing what you are saying, but just managing credentials inside of a singleton in another package. Regardless, I am baffled as to why _describe is being hit prior to __init__, or even how that works.
datajoely

09/23/2021, 5:37 PM
Yeah it’s happening somewhere as the catalog is built up
WolVez

09/23/2021, 5:37 PM
Also, note this worked until upgrading from 0.17.2 to 0.17.5.
datajoely

09/23/2021, 5:37 PM
I need to finish for the day - but I will look into this first thing
WolVez

09/23/2021, 5:37 PM
Sounds good! Thanks for the help @User!
5:49 PM
I found the problem. super() is hitting the AbstractDataSet class, which now has the following definition for __str__, which calls self._describe() and is hit upon creation.
5:51 PM
This seems a bit dangerous to me, given that AbstractDataSet is kinda designed to be inherited the same way my dataset does it. Anyone who sets their input params the way I did (super() first, then self.x being created) will hit this same error.
Lorena

09/24/2021, 9:29 AM
Hi @User! AbstractDataSet doesn't have a constructor, so when extending it you don't need to call super().__init__. I assume you're trying to pass these to Snowflake, does that call super on AbstractDataSet or sth? I'm not sure why you'd hit str() before anything else happens, can you show us your Snowflake class?
9:31 AM
Or I'd also make use of mro() for debugging when using multiple inheritance like here.
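For reference, a small sketch of inspecting the method resolution order. The class names are hypothetical stand-ins for the multiple-inheritance setup being discussed, since the actual classes are not shown in the thread.

```python
class AbstractBase:
    """Hypothetical stand-in for a framework base class."""


class SnowflakeMixin:
    """Hypothetical stand-in for the credentials-handling class."""


class CustomDataSet(SnowflakeMixin, AbstractBase):
    pass


# mro() lists the classes in the order super() will walk through them.
order = [cls.__name__ for cls in CustomDataSet.mro()]
print(order)
```

Reading the list left to right shows which `__init__` a `super().__init__()` call in `CustomDataSet` would reach first.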
WolVez

09/24/2021, 4:55 PM
@User, I am not entirely following what you are saying. From what I can tell, a custom dataset has to inherit from the abstract class so that the catalog can build it; if you remove AbstractDataSet from the inheritance, it no longer functions. __str__ is called, like __init__, during the instantiation of the class. So what is happening is that my class defines _describe with an element which is not defined yet, as it is defined after the super() call. Simply moving super() down resolves it. I suppose a viable solution would be to focus super() on the Snowflake class, but again I suspect a lot of people will run into this same issue with how it is currently set up.
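The ordering problem described here can be reproduced without Kedro. In this sketch `BaseSketch` is a hypothetical stand-in whose `__init__` calls `str(self)`; that behaviour is an assumption made to trigger the failure, not a copy of Kedro's code. `SuperFirst` and `SuperLast` are likewise illustrative names.

```python
class BaseSketch:
    """Hypothetical base: its __init__ formats the instance, which calls _describe()."""

    def __init__(self):
        self._repr_at_init = str(self)

    def __str__(self):
        return f"{type(self).__name__}({self._describe()})"


class SuperFirst(BaseSketch):
    def __init__(self, sql):
        super().__init__()  # _describe() runs here, before self.sql exists
        self.sql = sql

    def _describe(self):
        return dict(sql=self.sql)


class SuperLast(BaseSketch):
    def __init__(self, sql):
        self.sql = sql  # set attributes first
        super().__init__()  # _describe() now finds self.sql

    def _describe(self):
        return dict(sql=self.sql)


try:
    SuperFirst("SELECT 1")
    failed = False
except AttributeError:
    failed = True

ok = SuperLast("SELECT 1")
print(failed, ok)
```

Assigning attributes before calling `super().__init__()` is the "moving super() down" fix mentioned above.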
datajoely

09/27/2021, 8:23 AM
Can we see the Snowflake class to make sense of what's going on?
WolVez

09/27/2021, 2:22 PM
@User There is more to these classes, but I have shortened them to what you would care about.
2:26 PM
message has been deleted
datajoely

09/27/2021, 2:30 PM
Thanks @User we will get back to you shortly