Powered by Linen
beginners-need-help
  • g

    gui42

    04/05/2022, 6:34 PM
    Yep. They should have the same structure and just be named properly. I'll look into partitioned dataset!
  • g

    gui42

    04/05/2022, 6:39 PM
    These seem cool. Can the incremental dataset be used to run a pipeline per partition?
  • d

    datajoely

    04/05/2022, 6:40 PM
    Incremental checkpoints the last partition seen
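To make the checkpoint idea concrete, here is a minimal plain-Python sketch. It is not Kedro's implementation: `new_partitions` and the partition names are hypothetical, and it assumes partition IDs sort in the order they arrive. It only illustrates "remember the last partition seen, then process what comes after it".

```python
from typing import List, Optional


def new_partitions(partition_ids: List[str], checkpoint: Optional[str]) -> List[str]:
    """Return the partition IDs that sort strictly after the checkpoint."""
    ordered = sorted(partition_ids)
    if checkpoint is None:
        return ordered  # nothing processed yet, so everything is new
    return [pid for pid in ordered if pid > checkpoint]


# Hypothetical partition names; the checkpoint is the last one already seen.
seen_so_far = "2022-04-02.csv"
available = ["2022-04-01.csv", "2022-04-02.csv", "2022-04-03.csv", "2022-04-04.csv"]

to_process = new_partitions(available, seen_so_far)
# After a successful run the checkpoint would move to the last processed ID.
next_checkpoint = to_process[-1] if to_process else seen_so_far
```

Under these assumptions a re-run picks up only the partitions that arrived since the previous run, rather than running once per partition.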
  • b

    beats-like-a-helix

    04/06/2022, 9:23 PM
    Let's say I have thousands of operations to perform which are computationally expensive. Each iteration yields a set of parameters which I'd like to write one at a time to the same file -- whether this be as rows of a csv, SQL table or whatever -- so that already written data is preserved should the script fail on a particular iteration. What is the "Kedro approved" factory pattern for a use case like this? Any advice would be much appreciated, cheers.
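The requirement described here (write each result as soon as it is computed, so a later failure does not lose earlier rows) can be sketched in plain Python, independent of Kedro. This is only an illustration under stated assumptions: `run_and_append`, `read_rows`, and the `compute` callback are hypothetical names, and the target is assumed to be a local CSV file.

```python
import csv
import os
from typing import Callable, Dict, Iterable, List


def run_and_append(
    inputs: Iterable[int],
    compute: Callable[[int], Dict[str, float]],
    path: str,
    fieldnames: List[str],
) -> None:
    """Append one CSV row per iteration so earlier rows survive a later failure."""
    # Only write the header when the file is new or empty.
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        if write_header:
            writer.writeheader()
        for item in inputs:
            row = compute(item)  # the expensive step; may raise
            writer.writerow(row)
            handle.flush()  # push the row out of Python's buffer right away


def read_rows(path: str) -> List[Dict[str, str]]:
    """Read back whatever has been written so far."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))
```

If `compute` raises on some iteration, the rows written before it remain in the file, and calling `run_and_append` again with the remaining inputs appends to the same file without repeating the header.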
  • z

    Zemeio

    04/06/2022, 9:40 PM
    Hey guys. I was following the tutorial on experiment tracking, but all of my data is being saved as a MemoryDataSet, regardless of what the catalog says. The 09 folder is not created and the results are not saved. Any idea why that would happen?
    Tutorial: https://kedro.readthedocs.io/en/stable/tutorial/set_up_experiment_tracking.html
    Python version: 3.8.5
    Kedro version: 0.18.0
    2022-04-07 06:29:25,232 - kedro.io.data_catalog - INFO - Saving data to `data_processing.preprocessed_companies` (MemoryDataSet)...
    preprocessed_companies:
      type: pandas.ParquetDataSet
      filepath: data/02_intermediate/preprocessed_companies.pq
      layer: intermediate
  • n

    noklam

    04/06/2022, 9:42 PM
    Would be great if you can share a gist or repo of your working directory
  • d

    datajoely

    04/06/2022, 9:45 PM
    factory pattern
  • z

    Zemeio

    04/06/2022, 9:48 PM
    Ok, created a temporary repo for that https://github.com/Zemeio/kedro_experiment/tree/main/new-kedro-project
  • n

    noklam

    04/06/2022, 10:09 PM
    spaceflights tutorial is not saving dataset correctly to be shown in kedro-viz
  • s

    sebaxtian

    04/07/2022, 6:45 PM
    Hi everyone, I got this error as well (https://github.com/kedro-org/kedro/issues/1409), I would like to know if anybody else got the same error?
    (.venv) sebaxtian@Lenovo:~/Workspaces/Sebaxtian/kedro-hello-world$ python hello_kedro.py
    Traceback (most recent call last):
      File "hello_kedro.py", line 39, in <module>
        print(runner.run(greeting_pipeline, data_catalog))
    TypeError: run() missing 1 required positional argument: 'hook_manager'
    kedro version: v0.18.0
  • d

    datajoely

    04/07/2022, 6:47 PM
    The hello kedro example is currently broken and we will be fixing it in the next few days. For now the spaceflights tutorial is working as expected!
  • s

    sebaxtian

    04/07/2022, 6:47 PM
    Thanks !
  • s

    sebaxtian

    04/07/2022, 6:49 PM
    @datajoely I fixed it as I mentioned here: https://github.com/kedro-org/kedro/issues/1409#issuecomment-1092053287 I'm not sure that is the right solution, but it works for me.
  • g

    gui42

    04/07/2022, 10:51 PM
    Guys, sometimes when running `kedro run ...` the process hangs at the end and doesn't finish. Any idea how to debug it?
  • d

    datajoely

    04/08/2022, 8:52 AM
    Hi @gui42 it would be good to learn more. What stage fails? Can you provide some logs? Have you tried putting a `breakpoint()` within the node?
  • e

    Edak

    04/11/2022, 4:36 AM
    Is there any example of a node that lazily loads and lazily saves a partitioned dataset while performing some transformations? I've seen the examples on this page (https://kedro.readthedocs.io/en/stable/data/kedro_io.html#partitioned-dataset-save) but am having trouble wrapping my head around how this is done in a node that applies a few transformations to a large dataset.
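One possible shape for such a node, as a hedged sketch rather than an official recipe: it assumes the node receives a dictionary mapping partition IDs to load functions, and that the output dataset accepts a dictionary mapping partition IDs to callables that produce the data when saved. `transform` and `process_partitions` are hypothetical names, and the doubling logic is a stand-in for real transformations.

```python
from typing import Any, Callable, Dict


def transform(record: Any) -> Any:
    """Hypothetical per-partition transformation; replace with real logic."""
    return [value * 2 for value in record]


def process_partitions(
    partitions: Dict[str, Callable[[], Any]]
) -> Dict[str, Callable[[], Any]]:
    """Map each partition's load function to a deferred load-and-transform step."""

    def make_step(load: Callable[[], Any]) -> Callable[[], Any]:
        # Bind `load` per partition; a bare lambda in the comprehension would
        # capture only the last loop variable.
        return lambda: transform(load())

    return {
        partition_id: make_step(load)
        for partition_id, load in partitions.items()
    }
```

Nothing is loaded when `process_partitions` returns. Each partition is loaded and transformed only when its callable is invoked, so only one partition needs to be in memory at a time.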
  • m

    Malaguth

    04/11/2022, 5:03 AM
    Hello, everyone. I'm having a problem when I try to use the `TemplatedConfigLoader` through the settings file. I'm receiving the error: `dynaconf.validator.ValidationError`: Invalid value `kedro.config.templated_config.TemplatedConfigLoader` received for setting `CONFIG_LOADER_CLASS`. It must be a subclass of `kedro.config.config.ConfigLoader`. Is this validation correct? After the changes in 0.18, wouldn't the superclass be the `AbstractConfigLoader`?
  • n

    noklam

    04/11/2022, 7:00 AM
    This is indeed a bug and we are aware of it. This will be fixed soon. https://github.com/kedro-org/kedro/issues/1402
  • d

    datajoely

    04/11/2022, 9:10 AM
    Lazy partition save
  • m

    Malaguth

    04/11/2022, 12:46 PM
    Thanks, I'll try the workaround in the issue for the time being
  • n

    nd0rf1n

    04/11/2022, 8:05 PM
    Hello, everybody! It's my first day playing with Kedro and I'm getting the following error when I run `kedro run` during the "Extend the data processing pipeline" step of the spaceflights tutorial (that's where you add the `pandas.ParquetDataset` to the catalog):
    kedro.io.core.DataSetError: Class `pandas.ParquetDataset` not found or one of its dependencies has not been installed.
    Any ideas on what the issue is? I've spent quite a few hours playing around with conda environments, tried both on Windows and WSL, but I keep getting the same error. Googling around has not helped either.
  • d

    Daehyun Kim

    04/14/2022, 8:57 PM
    Hi Team, do you have an official ML Kedro example project?
  • d

    datajoely

    04/14/2022, 8:59 PM
    I haven't updated this to 0.18.x yet but this may be helpful https://github.com/datajoely/modular-spaceflights
  • d

    Daehyun Kim

    04/14/2022, 9:02 PM
    thank you
  • z

    Zemeio

    04/18/2022, 1:19 PM
    Hey guys. I am trying to create a new kedro project using the mini-kedro starter, but I am getting an error saying it does not exist. I am using kedro 0.18.0, with python 3.8.5. Was it just not ported yet to 0.18.0?
  • z

    Zemeio

    04/18/2022, 1:20 PM
    Error: Kedro project template not found at git+https://github.com/kedro-org/kedro-starters.git. Specified tag 0.18.0. The following tags are available: 0.17.0, 0.17.1, 0.17.2, 0.17.3, 0.17.4, 0.17.5, 0.17.6, 0.17.7, 0.18.0. The aliases for the official Kedro starters are: 
    - astro-airflow-iris
    - mini-kedro
    - pandas-iris
    - pyspark
    - pyspark-iris
    - spaceflights
  • d

    datajoely

    04/18/2022, 1:20 PM
    What command are you typing?
  • z

    Zemeio

    04/18/2022, 1:20 PM
    kedro new -s mini-kedro
  • d

    datajoely

    04/18/2022, 1:21 PM
    I think there is a chance that one has changed its name
  • z

    Zemeio

    04/18/2022, 1:21 PM
    I saw that and tried the alias, but it didn't work either