advanced-need-help
  • d

    datajoely

    08/16/2022, 9:08 PM
    Hi @marioFeynman, we made a related change to the SQL dataset last year: https://github.com/kedro-org/kedro/pull/1163. This may provide some inspiration on how to think about things. Are the open connections causing issues?
  • d

    datajoely

    08/16/2022, 9:08 PM
    You could probably get somewhere with an after pipeline run hook too
  • m

    marioFeynman

    08/16/2022, 9:11 PM
    Yes, we are having some problems due to a spike in connections, so I wanted to close all the connections that depend on me
  • m

    marioFeynman

    08/16/2022, 9:12 PM
    I read about the singleton for the engines, so I will need to iterate somehow over all the dataset objects and close all the engines after the pipelines succeed or fail?
  • m

    marioFeynman

    08/16/2022, 9:13 PM
    That was my first idea... is that the recommended way?
  • d

    datajoely

    08/16/2022, 9:13 PM
    So the after-pipeline-run hook has access to the catalog as a live object, so you should be able to access all of the SQL dataset instances (a rough sketch follows below)
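    A rough sketch of such a hook, as a sketch only and not something confirmed in the thread: the class name is made up, and the "_engine" attribute and catalog._get_dataset are implementation details that vary by Kedro version.

    # hooks.py -- illustrative sketch; attribute names are assumptions
    from kedro.framework.hooks import hook_impl
    from kedro.io import DataCatalog


    class CloseSQLEnginesHook:
        @hook_impl
        def after_pipeline_run(self, catalog: DataCatalog) -> None:
            # Walk every dataset in the live catalog and dispose of any
            # SQLAlchemy engine it holds, closing its pooled connections.
            for name in catalog.list():
                dataset = catalog._get_dataset(name)
                engine = getattr(dataset, "_engine", None)
                if engine is not None:
                    engine.dispose()

    In recent Kedro versions (0.17+) such a hook would then be registered via the HOOKS tuple in settings.py.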
  • d

    datajoely

    08/16/2022, 9:13 PM
    It's a bit hacky but feels doable
  • m

    marioFeynman

    08/16/2022, 9:14 PM
    Yeah, but it's all about the hacky stuff
  • m

    marioFeynman

    08/16/2022, 9:15 PM
    I will run a test and let you know if it works
  • m

    marioFeynman

    08/16/2022, 9:55 PM
    We are using an old 0.16.x version, so, sadly, I'm not able to grab the engine
  • d

    datajoely

    08/16/2022, 9:57 PM
    Ah - I would encourage you to upgrade at some point if you can
  • m

    marioFeynman

    08/16/2022, 9:58 PM
    It's on my current roadmap, but like 6 months in the future hahaha
  • u

    user

    08/20/2022, 10:59 PM
    dynamic parameters on datasets in Kedro https://stackoverflow.com/questions/73430557/dynamic-parameters-on-datasets-in-kedro
  • u

    user

    08/23/2022, 2:29 PM
    How to use generators with kedro? https://stackoverflow.com/questions/73460511/how-to-use-generators-with-kedro
  • j

    Jose Alejandro M

    08/23/2022, 8:36 PM
    Hi, I am currently trying to use Kedro for an ML prediction pipeline which has a preprocessing component and a prediction component. Our client wants to combine this pipeline with another development that has been done in JavaScript and then use both of them via an API, which I have implemented in FastAPI. Since I am consuming the JavaScript development via API, I want to capture the last output of the Kedro pipeline into a memory dataset so that I can use it and then provide a JSON answer through the API. I am having problems getting this output, since I always get an empty dictionary. I am using a MemoryDataSet in the catalog and I specified the copy mode as assign, but it still does not work. My concrete question is: does anyone know how I can import the pipeline into a Python script and get the output of the pipeline into a variable in Python so I can use it for further purposes? This is my script template:
    from pathlib import Path

    from kedro.framework.session import KedroSession
    from kedro.framework.startup import bootstrap_project

    # Point Kedro at the project root and create a session for the "base" env
    metadata = bootstrap_project(Path("Mypath"))
    session = KedroSession.create(metadata.package_name, Path("Mypath"), env="base")
    context = session.load_context()

    # Run the default pipeline and capture its returned outputs as a dict
    result = session.run(pipeline_name="__default__")
    print(result)
    thanks for your help
  • d

    datajoely

    08/23/2022, 8:57 PM
    So we can work to find some solution for this, but could you not use a persistence layer to just pick up the data in the downstream process?
  • j

    Jose Alejandro M

    08/23/2022, 9:01 PM
    Yes, well, this is the way I am doing it, but since the data can grow quite fast I would like to keep it in memory. Our client is more concerned about performance. I am not sure if this is possible to do (easily), but I have been wondering if it is. I appreciate your help 😄
  • d

    datajoely

    08/23/2022, 9:02 PM
    So I'm 90% sure the last items in the pipeline should be returned in a dictionary as a result of session.run()
  • d

    datajoely

    08/23/2022, 9:02 PM
    But you need to ensure they are not declared as catalog entries and are implicitly treated as MemoryDataSets
  • n

    noklam

    08/23/2022, 9:09 PM
    I think @datajoely is correct: it will return the free outputs, which are defined as the pipeline outputs minus the catalog entries (illustrated below). This is one of the bits that I think we could change, but it is how it is done currently.
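    To illustrate the point above with a hypothetical dataset name (final_data is made up here): only pipeline outputs that have no catalog entry come back from session.run().

    # Continuing the script template shared earlier in the thread.
    result = session.run(pipeline_name="__default__")

    # If "final_data" is produced by the last node and is NOT declared in
    # catalog.yml, it stays an implicit MemoryDataSet and is returned here
    # as a "free output".
    final_data = result["final_data"]

    # Per the explanation above, if "final_data" were declared in the catalog,
    # it would no longer be a free output and would not appear in "result".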
  • j

    Jose Alejandro M

    08/23/2022, 9:10 PM
    When you say that they should not be declared in the catalog, I have a problem: when I use the "env" parameter and do not register the output, it takes the variable from the base environment and as a result I never get the output. I do not know if you could suggest a way around this
  • d

    datajoely

    08/23/2022, 9:12 PM
    So you can explicitly declare MemoryDataSets in your catalog; if you do this in your custom env, does it work?
  • j

    Jose Alejandro M

    08/23/2022, 9:13 PM
    That's what I am doing, declaring it, but it still returns an empty dictionary 🥲
  • d

    datajoely

    08/23/2022, 9:13 PM
    If you remove the entry from your base env does it work?
  • j

    Jose Alejandro M

    08/23/2022, 9:13 PM
    I am using:
    final_data:
      type: MemoryDataSet
      copy_mode: assign
  • j

    Jose Alejandro M

    08/23/2022, 9:14 PM
    I will try it and I will let you know
  • j

    Jose Alejandro M

    08/23/2022, 9:17 PM
    😄 Yeah, it did work!!! I do not know why I never thought of removing it from the base environment. Thanks for your help. 🥳
  • d

    datajoely

    08/23/2022, 9:17 PM
    Amazing! I agree with @noklam we can do better here
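    For reference, a sketch of the catalog state that resolved the issue above (the environment name is illustrative; the final_data entry is the one posted earlier): the entry was removed from conf/base/catalog.yml and kept only in the custom environment, after which session.run() returned the output.

    # conf/base/catalog.yml
    #   (no final_data entry -- it was removed here, as suggested above)

    # conf/<custom_env>/catalog.yml
    final_data:
      type: MemoryDataSet
      copy_mode: assign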
  • s

    skuma458

    08/24/2022, 3:09 PM
    Hi, we have upgraded Kedro to v0.17.7, but now Viz seems to have stopped working. It always hangs at:
    /usr/local/lib/python3.8/site-packages/hdfs/config.py:15: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
      from imp import load_source
  • s

    skuma458

    08/24/2022, 3:09 PM
    There is no verbose option to check if there are any failures. Any input on how to fix this?