beginners-need-help
  • d

    datajoely

    11/18/2021, 2:28 PM
    So this gets tricky with the parallel runner due to limitations in how multiprocessing works in Python. What do you need from the current one that's not available in an old one?
  • d

    datajoely

    11/18/2021, 2:28 PM
    Realistically hooks are the way that we encourage users to do this sort of thing
  • i

    Isaac89

    11/18/2021, 2:32 PM
    The problem is that there is no current session. I was trying to load a dataset from the catalog. The old one would also be ok
  • d

    datajoely

    11/18/2021, 2:33 PM
    So you're within a node and want access to a dataset, why not pass it to the node ahead of time?
  • i

    Isaac89

    11/18/2021, 3:21 PM
    I'm passing a function to the node along with its arguments, and I want to execute it. The node is always the same, but the functions may differ, each with a different parameter defining its arguments. I want the node to be able to use any function as long as the function and the parameters are compatible. That's the reason why I don't want to make the function dependent on the argument of a specific function. Of course I could load the dataset in a different way, like passing the path of the dataset to the function as an argument and letting the function deal with it, but the Kedro catalog is cool, so I was trying to use it, and with the SequentialRunner everything was fine. Is the Kedro session passed to the subprocesses with the ParallelRunner? Is there a way of saving it and loading it back?
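A minimal sketch of the "pass it to the node ahead of time" suggestion, written in plain Python so it runs without Kedro. All names here (apply_step, top_n, clip) are hypothetical and do not come from the thread. The assumption is that, in a real pipeline, the data argument would be a catalog entry listed in the node's inputs, so the runner loads it and the node body never needs to reach for the session:

```python
from typing import Any, Callable, Dict, List


def apply_step(func: Callable[..., Any], data: Any, params: Dict[str, Any]) -> Any:
    """Generic node body: call `func` on data that was loaded before the node ran."""
    return func(data, **params)


# Two hypothetical, interchangeable step functions with different parameters.
def top_n(rows: List[int], n: int) -> List[int]:
    """Keep the n largest values."""
    return sorted(rows, reverse=True)[:n]


def clip(rows: List[int], lower: int, upper: int) -> List[int]:
    """Clamp every value into the range [lower, upper]."""
    return [min(max(r, lower), upper) for r in rows]


rows = [5, 1, 9, 3]
print(apply_step(top_n, rows, {"n": 2}))
print(apply_step(clip, rows, {"lower": 2, "upper": 6}))
```

The node stays generic because it only requires that the function accept the data as its first argument and the parameters as keywords; which dataset arrives is decided by the pipeline definition rather than inside the function.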
  • d

    datajoely

    11/18/2021, 4:05 PM
    I think you may be pushing things outside of the way it's been designed and, in many ways, the way we want you to use it. I'm sure it's doable, but it's hard to help this far off-piste, if that makes sense.
  • i

    Isaac89

    11/18/2021, 5:01 PM
    Sure, I understand. Probably I need to design it in another way. Thanks!
  • r

    RRoger

    11/21/2021, 4:00 AM
    I'm following this tutorial (Extracting Notebook Cell Functions as Pipeline Nodes - Jupyter to Kedro Ep. 2). Is the Kedro-wings section (https://youtu.be/cKlrkIgGYEw?t=531) different for version 0.17.5? I don't think kedro new produces run.py.
  • r

    RRoger

    11/21/2021, 4:25 AM
    How do I get a before_node_run hook to execute? I added the following to src/bnhm/hooks.py (as suggested in https://kedro.readthedocs.io/en/stable/07_extend_kedro/02_hooks.html#use-hooks-to-extend-a-node-s-behaviour):
    def say_hello(node: Node):
        """An extra behaviour for a node to say hello before running."""
        print(f"Hello from {node.name}")
    
    
    class ProjectHooks:
        @hook_impl
        def register_config_loader(...):
            return ConfigLoader(conf_paths)
    
        @hook_impl
        def register_catalog(...):
            return ....
    
        @hook_impl
        def before_node_run(self, node: Node):
            # adding extra behaviour to a single node
            if node.name == "Country Plot":
                say_hello(node)
    The pipeline is as attached. The output doesn't contain the expected "Hello from Country Plot":
    2021-11-21 15:17:45,228 - kedro.framework.session.store - INFO - `read()` not implemented for `BaseSessionStore`. Assuming empty store.
    2021-11-21 15:17:45,306 - root - INFO - ** Kedro project bnhm
    2021-11-21 15:17:45,768 - kedro.io.data_catalog - INFO - Loading data from `GBIF` (GBIFRequestDataSet)...
    2021-11-21 15:17:59,472 - kedro.pipeline.node - INFO - Running node: get_country_plot([GBIF]) -> [country_plot]
    2021-11-21 15:17:59,534 - kedro.io.data_catalog - INFO - Saving data to `country_plot` (MatplotlibWriter)...
    2021-11-21 15:17:59,608 - kedro.runner.sequential_runner - INFO - Completed 1 out of 1 tasks
    2021-11-21 15:17:59,609 - kedro.runner.sequential_runner - INFO - Pipeline execution completed successfully.
    2021-11-21 15:17:59,609 - kedro.framework.session.store - INFO - `save()` not implemented for `BaseSessionStore`. Skipping the step.
  • d

    dmb23

    11/21/2021, 8:44 AM
    Hey @RRoger, I think the problem is the name of the node you check against: you test against "Country Plot", but first, this is the prettified name (I think the variable itself is named country_plot), and second, this is a dataset, not a node. So either you change your hook into a dataset hook, or you check for something like if node.name == "get_country_plot" to get the node which creates this dataset. Does something like this work?
  • r

    RRoger

    11/21/2021, 9:06 AM
    I changed the method to:
    @hook_impl
    def before_node_run(self, node: Node):
        # adding extra behaviour to all nodes
        print(f"Hello from {node.name}")
    Which should print before each node, right? But still nothing.
  • d

    dmb23

    11/21/2021, 9:14 AM
    That is weird. Did you register the hooks in settings.py? Otherwise I'm out of ideas too...
  • r

    RRoger

    11/21/2021, 9:15 AM
    Yes, ProjectHooks is in settings.py by default:
    HOOKS = (ProjectHooks(),)
    And the before_node_run method was added to the ProjectHooks class.
  • d

    datajoely

    11/21/2021, 9:16 AM
    Hi @RRoger, I can look at this in detail tomorrow, but maybe put a breakpoint() in your hook and inspect the variables at runtime to make sense of what’s going on.
  • r

    RRoger

    11/21/2021, 9:17 AM
    Thanks, I'll try this.
  • k

    khern

    11/22/2021, 8:19 AM
    Hello! I'm a complete beginner in Kedro. I am getting this error when I run my pipeline. Can someone help me? Thank you so much!
    kedro.io.core.DataSetError: Failed while saving data to data set MemoryDataSet(). maximum recursion depth exceeded
  • d

    datajoely

    11/22/2021, 9:29 AM
    Hi @User did the breakpoint help?
  • d

    datajoely

    11/22/2021, 9:30 AM
    Hey @User so that error means that you have some code within your node that recurses without ending. Do you have any variables that are calling themselves?
  • a

    antony.milne

    11/22/2021, 1:23 PM
    I can confirm having just tried this myself that this works:
    class ProjectHooks:
        @hook_impl
        def before_node_run(self, node):
            print("Hello from ",  node.name)
    If you're doing exactly the same as that and it's not appearing, try doing log.info and see if that appears. I can also confirm that the structure of the Kedro project template has changed since the video you were watching was made: there's now no need to define the run command in your project.
  • k

    khern

    11/22/2021, 3:58 PM
    Hello @User, found the error! I checked my nodes, and my code seemed okay. Then I checked my catalog.yml, and there was a typo. The pipeline ran okay after correcting the typo. Thank you so much!
  • d

    datajoely

    11/22/2021, 4:03 PM
    Good! Unlucky that it caused a recursion error, but yes, breakpoints are your friend, and if you see MemoryDataSet it's often a good sign that Kedro can't resolve the catalog name!
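One way a "maximum recursion depth exceeded" error can appear without any recursive node code, sketched with the standard library only. It rests on an assumption that is not confirmed in the thread: that the in-memory dataset deep-copies the object when saving it. Under that assumption, a sufficiently nested object exhausts Python's recursion limit during the copy. The helper names (build_nested, deepcopy_overflows) are hypothetical:

```python
import copy


def build_nested(depth):
    """Return a list nested `depth` levels deep, e.g. [[[[]]]] for depth 3."""
    nested = []
    for _ in range(depth):
        nested = [nested]
    return nested


def deepcopy_overflows(obj):
    """Report whether copy.deepcopy(obj) hits Python's recursion limit."""
    try:
        copy.deepcopy(obj)
    except RecursionError:
        return True
    return False


shallow = build_nested(10)      # well within the default recursion limit
deep = build_nested(50_000)     # far deeper than the default recursion limit

print(deepcopy_overflows(shallow))
print(deepcopy_overflows(deep))
```

If the catalog entry had resolved to a file-backed dataset instead, the object would have gone through that dataset's own save logic rather than an in-memory copy, which fits the observation that fixing the catalog typo made the error disappear.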
  • r

    RRoger

    11/23/2021, 6:23 AM
    I put the breakpoint in hooks.py, but it didn't even activate.
  • i

    Isaac89

    11/23/2021, 12:07 PM
    Hi! I used update_wrapper_partial(function, params) as the function for some nodes, but in Kedro-Viz all the nodes that use update_wrapper_partial show no metadata, as in the following picture. Has anyone experienced anything like this? Thanks!
  • i

    Isaac89

    11/23/2021, 12:08 PM
    message has been deleted
  • d

    datajoely

    11/23/2021, 12:18 PM
    Hello - this is a known issue. We will look to fix it shortly!
  • i

    Isaac89

    11/23/2021, 12:19 PM
    ok perfect! thanks !
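For context, a sketch of what a helper like update_wrapper_partial might look like. The thread does not show its implementation, so named_partial below is a hypothetical stand-in built from functools.partial and functools.update_wrapper. A bare partial object does not carry the wrapped function's __name__ or __doc__; copying them over is the usual fix. Whether this is connected to the missing metadata in Kedro-Viz is not stated in the thread, which only says it is a known issue:

```python
from functools import partial, update_wrapper


def scale(values, factor):
    """Multiply every value by `factor`."""
    return [v * factor for v in values]


def named_partial(func, **params):
    """Hypothetical helper: bind `params` and copy func's name and docstring."""
    bound = partial(func, **params)
    # update_wrapper copies __name__, __doc__, __module__ and friends onto
    # `bound`, sets bound.__wrapped__ = func, and returns `bound`.
    return update_wrapper(bound, func)


plain = partial(scale, factor=2)
wrapped = named_partial(scale, factor=2)

print(getattr(plain, "__name__", "<no __name__>"))
print(wrapped.__name__)
print(wrapped([1, 2, 3]))
```

Both objects behave the same when called; the difference is only in the metadata that tools can read from them.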
  • r

    RRoger

    11/24/2021, 4:00 AM
    Is the intention of Kedro to wrap all individual calculations as individual nodes?
  • d

    datajoely

    11/24/2021, 10:07 AM
    This is a style point - I think we err on the side of modular, easily testable code. I'm currently working on a sample project that is more representative of what using Kedro in anger looks like. Perhaps you can use that as inspiration: https://github.com/datajoely/modular-spaceflights
  • k

    khern

    11/24/2021, 12:11 PM
    Hello! I can't seem to run my project by nodes. It worked fine when I ran:
    kedro run --pipeline de
    I tried these commands:
    kedro run --node node_name
    kedro run --pipeline de --node node_name
    but I end up with:
    ValueError: Pipeline does not contain nodes named node_name.
  • a

    Arnaldo

    11/24/2021, 12:25 PM
    @User did you specify the name of your node like this?
    node(
        train_model,
        ["example_train_x", "example_train_y", "parameters"],
        "example_model",
        name="train",
    ),