# advanced-need-help
f
Hello, I am migrating an internal lib that we developed to `kedro==0.18.0`. We use `kedro.framework.session.get_current_session` to get the current session in order to either create a new session if it is None, or use it directly. This function was removed in `0.18.0` (with https://github.com/kedro-org/kedro/pull/1138). What is the new way to find the current active session?
d
As per the release notes, we have deprecated this functionality: https://github.com/kedro-org/kedro/blob/develop/RELEASE.md
We now view every session as equivalent to a run, so sessions are ephemeral
Apologies if this changes some of your assumptions. The change was made because trying to manage existing sessions, particularly in concurrent contexts, was becoming too complex to handle in every case
f
OK, so is `session.run` still the best way to run a pipeline programmatically? And if I want to access the context and catalog for a given env, I used to use a session. Is that still OK, or do I have to find another way?
d
so my rule of thumb: if you're accessing the context directly, it's a sign you've gone too far
the correct way to access the live library objects is via Kedro hooks
f
What do you mean by "gone too far"? We have defined datasets in the catalog, and I want to load a dataset. With kedro < 0.18, we used to load the context to access the catalog, and then use `catalog.load`. I do not see how I can use hooks for this use case
We have pipelines where all this is taken care of, but other parts of our apps need to access the data (say, for a plot). I find it useful to have only one source of truth for the data, in the catalog
d
so philosophically we believe the nodes should have no knowledge of IO and should be pure Python functions
so we typically don't encourage people to access the catalog within a node
f
I mean outside of a node
d
then isn't a `before_pipeline_run` or `after_pipeline_run` hook the right place to do this?
f
I do not want to run a pipeline, just access the data
d
Oh okay then in this case it makes sense
it will be a new session
f
An example would be a Flask app where one endpoint runs a pipeline, and another makes a plot. The second endpoint needs the catalog
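For the plotting endpoint described above, a minimal sketch of loading one catalog entry without running a pipeline could look like this. It assumes Kedro 0.18.x and a project at `project_path`. The names `load_dataset`, `_default_session_factory`, and the `session_factory` parameter are hypothetical and exist only for illustration; they are not part of Kedro. Each call opens its own short-lived session, which matches the "it will be a new session" answer.

```python
from pathlib import Path
from typing import Any, Callable, Optional


def _default_session_factory(project_path: Path, env: Optional[str]) -> Any:
    """Create a fresh KedroSession (hypothetical helper, not a Kedro API)."""
    # Imported lazily so this module can be imported without Kedro installed.
    from kedro.framework.session import KedroSession
    from kedro.framework.startup import bootstrap_project

    # bootstrap_project configures the project settings before a session exists.
    bootstrap_project(project_path)
    return KedroSession.create(project_path=project_path, env=env)


def load_dataset(
    name: str,
    env: Optional[str] = None,
    project_path: Optional[Path] = None,
    session_factory: Callable[[Path, Optional[str]], Any] = _default_session_factory,
) -> Any:
    """Load one catalog entry inside a short-lived session, without running a pipeline."""
    resolved_path = Path(project_path) if project_path else Path.cwd()
    # The session is a context manager, so it is closed once the data is loaded.
    with session_factory(resolved_path, env) as session:
        catalog = session.load_context().catalog
        return catalog.load(name)
```

A Flask view could then call something like `load_dataset("my_plot_data", env="prod")`, where the dataset name is whatever the project's catalog defines.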
d
yes with you
I personally want to make that workflow native - it's on the backlog but we haven't gone there
in that situation, maybe a good reference would be to poke around Kedro-Viz's internals
and then copy how we do it there
We're in the process of updating the demo to 0.18.x but I'm pretty sure this part is the same
f
OK, I'll have a look, thanks
I think in an app we ran into problems where there was already an active session, hence the need to get the current session (if any)
d
So I think our change should have remedied that
please shout if it doesn't
but sessions should now be isolated
f
OK, it seems that I will have to change the API we proposed. We could previously propose:
```python
from kedro.framework.session import KedroSession

# this would create a session and return the catalog
catalog = get_catalog()
# but this would also return the catalog inside the already created session
with KedroSession.create(env="prod"):
    catalog = get_catalog()
```
Without access to the current session, I do not see how we can provide it now
d
If you add an `as session` to the end of your context manager you can access the objects live
f
Yes, I know, I am the one providing the `get_catalog` function, as a helper for those who do not remember that you need to do `session.load_context().catalog`. But maybe it is not that useful
Or, for that matter, for those who do not remember to use a session at all
n
`session.run()` would be the preferred way to run a pipeline programmatically
Do you actually need the active session, or is it just that this is blocking you from creating another session?
f
Well, ideally, if I am already in a session for some reason, I'd like to use it and not close it
n
If that's the case, using `with KedroSession.create() as session` will probably give you access to one session without the need to close it or recreate a new one. The active session will soon be removed and you can create as many sessions as you need, but most likely you only need one.
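Since `get_current_session` is gone in 0.18.0, one way to keep a `get_catalog` helper that reuses a session when there is one is to have the caller pass it in explicitly. The sketch below assumes Kedro 0.18.x. `session_scope` and the `session` parameter are hypothetical names for illustration, not Kedro APIs, and this is not necessarily how the original helper was written.

```python
from contextlib import contextmanager
from pathlib import Path
from typing import Any, Iterator, Optional


@contextmanager
def session_scope(
    session: Optional[Any] = None,
    env: Optional[str] = None,
    project_path: Optional[Path] = None,
) -> Iterator[Any]:
    """Yield the given session, or create and close a new one if none was given."""
    if session is not None:
        # The caller owns this session: reuse it and leave it open.
        yield session
        return

    # Imported lazily so the helper can be imported without Kedro installed.
    from kedro.framework.session import KedroSession
    from kedro.framework.startup import bootstrap_project

    resolved_path = Path(project_path) if project_path else Path.cwd()
    bootstrap_project(resolved_path)
    with KedroSession.create(project_path=resolved_path, env=env) as new_session:
        yield new_session


def get_catalog(session: Optional[Any] = None, env: Optional[str] = None) -> Any:
    """Return the catalog from `session` if provided, else from a new session."""
    with session_scope(session, env=env) as active:
        return active.load_context().catalog
```

Usage would then be `get_catalog(env="prod")` on its own, or `get_catalog(session)` inside a `with KedroSession.create(env="prod") as session:` block. The trade-off is that the session is passed explicitly instead of being discovered globally, which fits the point that sessions are now isolated.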