Powered by Linen
beginners-need-help
  • d

    datajoely

    02/16/2022, 8:40 PM
    We'll have 3.9, 3.10 support shortly
  • d

    datajoely

    02/16/2022, 8:41 PM
    we can update it quickly because it will introduce a breaking change and thus will be in 0.18.0
  • r

    reduction

    02/16/2022, 8:43 PM
    thanks for the help
  • e

    Edak

    02/17/2022, 8:58 PM
    Any recommendations on how to add "great expectations" into an existing kedro project? Not too familiar with using hooks so any example code would be really useful to start with.
  • d

    Dhaval

    02/17/2022, 9:01 PM
    kedro.io.core.DataSetError: Save path `/home/thakkar/Work/kedro_project/data/03_primary/Master_table.pkl/2022-02-17T20.20.56.877Z/Master_table.pkl` for PickleDataSet(backend=<module 'pickle' from '/home/thakkar/anaconda3/envs/kedro_project/lib/python3.8/pickle.py'>, filepath=/home/thakkar/Work/kedro_project/data/03_primary/Master_table.pkl, load_args={}, protocol=file, save_args={}, version=Version(load=None, save='2022-02-17T20.20.56.877Z')) must not exist if versioning is enabled.
    I am currently using the example Prefect code from the Kedro tutorials and came across a weird bug. When I register a flow while the folders in the `data` folder are empty, the first run works fine, but when the same flow is run again it raises the error above. PS: I have enabled `versioned=True` for this dataset. Ideally every run should have its own timestamped folder containing the pkl/csv file, but that's not the case when working with Prefect. I don't know what is going on under the hood there, so I would really appreciate some help.
  • d

    datajoely

    02/17/2022, 9:11 PM
    Hi Dhaval - I think there is an old run (without versioning) at the same location. Would you remove it and try again?
  • d

    datajoely

    02/17/2022, 9:12 PM
    Hi @User, there are some examples on the hooks page of the docs, but my top tip is that if you are only using pandas, pandera is much easier to use.
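    [Editor's sketch] The hook-based approach mentioned above could look roughly like the following. This is a minimal sketch, not the example from the Kedro docs: the class name `DataValidationHooks`, the `no_missing_ids` check and the `master_table` dataset name are hypothetical, and the checks are plain Python callables standing in for a Great Expectations or pandera validation. It assumes the `before_node_run` hook specification and the `hook_impl` marker from `kedro.framework.hooks`, with a fallback so the snippet also runs where Kedro is not installed.

```python
from typing import Any, Callable, Dict

try:
    from kedro.framework.hooks import hook_impl
except ImportError:
    # Fallback so the sketch still runs where Kedro is not installed.
    def hook_impl(func):
        return func

# Hypothetical type: a check receives the loaded data and returns True if it is valid.
Check = Callable[[Any], bool]


class DataValidationHooks:
    """Runs user-supplied checks on node inputs before each node executes."""

    def __init__(self, checks: Dict[str, Check]):
        # Maps a dataset name to the check to run on it (names are illustrative).
        self._checks = checks

    @hook_impl
    def before_node_run(self, node, inputs):
        # `inputs` maps dataset names to the data already loaded for this node.
        for name, data in inputs.items():
            check = self._checks.get(name)
            if check is not None and not check(data):
                raise ValueError(f"Validation failed for dataset '{name}'")


def no_missing_ids(rows):
    """Illustrative check: every row (a dict) must carry a non-null 'id'."""
    return all(row.get("id") is not None for row in rows)


hooks = DataValidationHooks({"master_table": no_missing_ids})
```

    In a real project the instance would typically be registered through the project's hook settings, and the check body would call whichever validation library you prefer.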
  • d

    Dhaval

    02/17/2022, 9:13 PM
    I have completely emptied the folder location and rerun the flow. The error still persists, which is why I thought I'd reach out for help. After removing `versioned=True` the code works as expected, but then I can't see what the results of previous runs looked like.
  • d

    Dhaval

    02/17/2022, 9:22 PM
    @User For your reference: the first run is executed when there are no folders inside the `data/03_primary` folder. The second time, as expected, a folder is present, but the run cannot save the versioned dataset there because it reuses the timestamp from the first run of the flow.
  • a

    avan-sh

    02/18/2022, 7:18 AM
    Hi Dhaval, I'm not sure I'm right, but just throwing out my thoughts on why this could be happening: is it possible that the Kedro context initiated the first time is being reused on the second run?
  • d

    Dhaval

    02/18/2022, 7:19 AM
    I'm guessing it's the same thing too. The flow that is being used has a fixed timestamp at all times, which doesn't make sense. Something goes on in the backend and I don't know how to change the timestamp for the runs 😅
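    [Editor's sketch] The suspected failure mode (a save version fixed once and then reused on every run) can be reproduced in isolation. The snippet below is only an illustration under that assumption and uses no Kedro or Prefect code: `FrozenFlow`, `FreshFlow`, `save` and `make_version` are hypothetical names, and the timestamp format simply mirrors the one visible in the error message above.

```python
from datetime import datetime, timezone
from typing import Callable, Set

Clock = Callable[[], datetime]


def utc_now() -> datetime:
    return datetime.now(timezone.utc)


def make_version(now: datetime) -> str:
    """Formats a timestamp like 2022-02-17T20.20.56.877Z (millisecond resolution)."""
    return now.strftime("%Y-%m-%dT%H.%M.%S.%f")[:-3] + "Z"


def save(version: str, existing: Set[str]) -> str:
    """Mimics a versioned save: refuses to overwrite an existing version."""
    if version in existing:
        raise FileExistsError(f"Save path for version {version} must not exist")
    existing.add(version)
    return version


class FrozenFlow:
    """Picks its save version once, when the flow object is built."""

    def __init__(self, clock: Clock = utc_now):
        self._version = make_version(clock())

    def run(self, existing: Set[str]) -> str:
        # Every run reuses the same version, so the second save collides.
        return save(self._version, existing)


class FreshFlow:
    """Picks a new save version at the start of every run."""

    def __init__(self, clock: Clock = utc_now):
        self._clock = clock

    def run(self, existing: Set[str]) -> str:
        return save(make_version(self._clock()), existing)
```

    If this is what is happening, the fix would be to make sure whatever object holds the save version is created inside each flow run rather than once at registration time.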
  • a

    avan-sh

    02/18/2022, 7:24 AM
    I'll try to replicate the issue. Would following this guide (https://kedro.readthedocs.io/en/stable/10_deployment/05_prefect.html) replicate your setup?
  • d

    Dhaval

    02/18/2022, 7:26 AM
    Yes, this is exactly what I've used. Versioned datasets cause issues when a flow is run repeatedly in quick succession.
  • a

    avan-sh

    02/18/2022, 7:50 AM
    prefect-versioned-datasets
  • i

    idriss__

    02/18/2022, 11:55 AM
    Hello everyone! I wanted to know why Kedro has its own `base` and `local` envs, and what each one is for. Why can't I create my own default env?
  • d

    datajoely

    02/18/2022, 11:56 AM
    Hi @User this should explain everything... https://kedro.readthedocs.io/en/latest/04_kedro_project_setup/02_configuration.html#additional-configuration-environments
  • d

    datajoely

    02/18/2022, 11:56 AM
    not sure how I added that sticker
  • d

    datajoely

    02/18/2022, 11:56 AM
    ¯\_(ツ)_/¯
  • d

    datajoely

    02/18/2022, 11:57 AM
    but the logic is that `base` is committed to git, `local` isn't and overrides any duplicate keys
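    [Editor's sketch] The override rule described above can be pictured as a dictionary merge in which the second environment wins on duplicate keys. This is a simplified illustration and not Kedro's config loader: `merge_environments` and the example keys are hypothetical, and the merge here is assumed to replace whole top-level keys rather than merge nested values.

```python
from typing import Any, Dict


def merge_environments(base: Dict[str, Any], local: Dict[str, Any]) -> Dict[str, Any]:
    """Returns base config with any duplicate top-level keys replaced by local."""
    merged = dict(base)   # start from the shared, git-committed settings
    merged.update(local)  # machine-specific settings win on duplicate keys
    return merged


base = {"model": {"n_estimators": 100}, "test_size": 0.2}
local = {"test_size": 0.5}

config = merge_environments(base, local)
print(config["test_size"])  # 0.5, taken from local
print(config["model"])      # {'n_estimators': 100}, untouched from base
```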
  • d

    datajoely

    02/18/2022, 11:57 AM
    you can then add additional `env`s as described in the docs above
  • i

    Isaac89

    02/18/2022, 2:13 PM
    Hi! is there a way to show the list of all the versions of a dataset in the catalog?
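    [Editor's sketch] For datasets stored on the local filesystem, one way to see the available versions is to list the timestamped folders under the dataset's `filepath`. This assumes the `<filepath>/<version>/<filename>` layout seen in the error message earlier in this channel. `list_versions` is a hypothetical helper, not part of the Kedro catalog API.

```python
from pathlib import Path
from typing import List


def list_versions(filepath: str) -> List[str]:
    """Lists version folder names under a versioned dataset path, newest first."""
    root = Path(filepath)
    if not root.is_dir():
        # Nothing saved yet, or the dataset is not versioned (a plain file).
        return []
    # Version names are timestamps, so lexical order matches chronological order.
    return sorted((p.name for p in root.iterdir() if p.is_dir()), reverse=True)
```

    For example, `list_versions("data/03_primary/Master_table.pkl")` would return the timestamp strings of every saved version of that dataset.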
  • i

    Isaac89

    02/18/2022, 3:49 PM
    Hi! Has anyone ever tried to run a pipeline in Jupyter that returns a plot? Everything works till the end and the figure is rendered correctly, but it raises the error
    AttributeError: 'Affine2D' object has no attribute '_boxout'
  • d

    datajoely

    02/18/2022, 3:49 PM
    what kind of plot is this? Matplotlib?
  • i

    Isaac89

    02/18/2022, 3:49 PM
    yes
  • i

    Isaac89

    02/18/2022, 3:51 PM
    I'm running it in this way :
    SequentialRunner().run(pipeline=my_pipeline.from_nodes("node1").to_nodes("node10"),catalog=test)
  • i

    Isaac89

    02/18/2022, 3:52 PM
    the plot node is a memory dataset and is the last one. it is returned as a dict
    {"node10": <Figure size 720x720 with 1 Axes>}
  • d

    datajoely

    02/18/2022, 3:52 PM
    If it gets to that point I'm not sure it's a Kedro issue
  • d

    datajoely

    02/18/2022, 3:53 PM
    it's an issue with Matplotlib render step
  • d

    datajoely

    02/18/2022, 3:53 PM
    so i would hunt around their docs