beginners-need-help
  • m

    metalmind

    01/05/2022, 5:21 PM
    As far as I know, this is a tool for manual labeling. What I'm looking for is a script with one function used for labeling and another for feature generation.
  • m

    metalmind

    01/05/2022, 5:25 PM
    Data can be loaded via configuration, right? I want to use Kedro during the experiment/R&D phase, so I need to be able to mix and match raw data + feature generator + label generator + parameters. Say I have 10 raw data files, 10 feature generation functions (source files), etc. I want to try Raw Data 1 + Featurizer 2 + Labeler 5 + Parameter Set 6 (1/2/5/6), then 3/4/1/7, etc., all while letting MLflow record the results.
  • m

    metalmind

    01/05/2022, 5:25 PM
    So not only the raw data should be configurable and loadable, but also the scripts (functions) used to generate the features and labels.
  • m

    metalmind

    01/05/2022, 5:27 PM
    --config may be used as a starting point, but how do I change the function in a pipeline node from it? I have an idea already, but wanted to check first whether there's a way to do it out of the box.
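One way to picture the "pick the function from configuration" idea is to store a dotted import path in the config and resolve it at run time. The sketch below is plain Python, not a Kedro feature: `load_callable`, `run_experiment`, and the config keys `featurizer`, `labeler`, and `params` are hypothetical names, and the builtins stand in for real feature and label functions.

```python
import importlib
from typing import Any, Callable, Dict


def load_callable(dotted_path: str) -> Callable[..., Any]:
    """Resolve a string such as 'package.module.function' to the function object."""
    module_path, _, name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"Expected 'module.function', got {dotted_path!r}")
    module = importlib.import_module(module_path)
    return getattr(module, name)


def run_experiment(raw: Any, config: Dict[str, Any]) -> Dict[str, Any]:
    """Apply the featurizer and labeler named in the config to the raw data."""
    featurize = load_callable(config["featurizer"])
    label = load_callable(config["labeler"])
    params = config.get("params", {})
    return {"features": featurize(raw, **params), "labels": label(raw)}


# Hypothetical experiment definition; builtins stand in for project functions.
example_config = {
    "featurizer": "builtins.sorted",
    "labeler": "builtins.len",
    "params": {"reverse": True},
}
result = run_experiment([3, 1, 2], example_config)
print(result)
```

In a Kedro project, a dictionary like `example_config` could plausibly live in the parameters file and reach a node as a parameter input, so switching combinations would mean editing configuration rather than pipeline code.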
  • d

    datajoely

    01/05/2022, 5:28 PM
    @User would you mind switching to the thread?
  • c

    ChainYo

    01/06/2022, 10:17 AM
    Hey, I just discovered Kedro last night and gave it a try this morning with the spaceflights tutorial. I'm a bit confused about using Kedro in a conda env. It seems that
    kedro install
    or
    pip install -r src/requirements.txt
    installs things correctly in the env, but
    kedro ipython
    doesn't use it... I was getting errors with loading things in
    catalog
    and
    %reload_kedro
    . The errors are gone if I use the main Ubuntu env without conda.
  • c

    ChainYo

    01/06/2022, 10:18 AM
    I'm wondering if I'm missing something with the conda env, or if something is broken on my workstation?
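A quick way to check whether a session is really running inside the activated conda env is to print the interpreter details from within it. This is only a diagnostic sketch using the standard library: `describe_environment` is a hypothetical helper name, and it assumes `CONDA_PREFIX` is set by `conda activate` in the shell that launched the session.

```python
import os
import sys


def describe_environment() -> dict:
    """Collect the interpreter details that reveal which environment is in use."""
    conda_prefix = os.environ.get("CONDA_PREFIX")  # set by `conda activate`
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "conda_prefix": conda_prefix,
        # True only when this interpreter lives inside the active conda env.
        "matches_conda_env": conda_prefix is not None
        and os.path.realpath(sys.prefix) == os.path.realpath(conda_prefix),
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

If `matches_conda_env` comes out `False` inside the IPython session but `True` in a plain `python` started from the same shell, the IPython launcher is probably being picked up from a different installation.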
  • i

    idriss__

    01/06/2022, 4:50 PM
    Hello! We are currently building a pipeline with Kedro and have created a custom dataset based on AbstractDataset. We are wondering how to access its _describe() method from inside a node; at the moment we only have access to what the Dataset object's _load method returns. Thanks.
  • d

    datajoely

    01/06/2022, 4:53 PM
    Hi @User, the node actually can't access that sort of thing; by design, the node shouldn't care how the data is loaded/saved. If you want to expose this information somehow, perhaps the
    before_node_run
    hook may be useful
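A rough sketch of what such a hook might look like follows. `DescribeInputsHooks` is a hypothetical class name, and the sketch assumes the catalog exposes the private `_get_dataset` method and that datasets implement `_describe()`; both are private, so they may differ between Kedro versions. The guarded import is only there so the sketch can be exercised without Kedro installed.

```python
import logging

try:
    from kedro.framework.hooks import hook_impl
except ImportError:  # fallback so the sketch can run without Kedro installed
    def hook_impl(func):
        return func

logger = logging.getLogger(__name__)


class DescribeInputsHooks:
    """Hypothetical hook class that records how each node input is configured."""

    def __init__(self):
        self.descriptions = {}

    @hook_impl
    def before_node_run(self, node, catalog):
        for name in node.inputs:
            # ASSUMPTION: _get_dataset and _describe are private APIs and may
            # change or be absent depending on the Kedro version.
            dataset = catalog._get_dataset(name)
            self.descriptions[name] = dataset._describe()
            logger.info("Input %s of node %s: %s", name, node, self.descriptions[name])
```

The hook would still need to be registered (for example in the project's `settings.py`) before Kedro calls it, and the node itself stays unaware of how its data is stored.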
  • m

    Mackson

    01/09/2022, 1:44 PM
    Hello, guys. In my DS pipeline I experiment A LOT, and as I understand it the focus of Kedro is on good software (not on running lots of experiments). The nodes themselves are really great because I only write them once. My issue is with the pipelines and init files, where I have to write a lot of new pipelines and declarations to create these different experiments (I am keeping track in MLflow, so that part is really OK). Namespaces really help, but they're not flexible enough. I have no problems with this current workflow. My question is: am I using Kedro in the correct way, or am I missing something? Thanks a lot!
  • a

    antony.milne

    01/10/2022, 7:41 AM
    Hi @User and welcome! This sounds like you're doing roughly the right thing, but you might be missing a few things: * the best way to clone many nodes/pipelines that are similar but with slightly different parameters (e.g. different model) is through modular pipelines. @User will have some nice examples here to illustrate * note you can nest namespaces, e.g.
    linear.experiment_1.train_model
    . kedro-viz also supports arbitrarily nested hierarchies like this * experiment tracking is now natively supported in Kedro. It's still a work in progress, but if you're interested you can see the documentation at https://kedro.readthedocs.io/en/stable/08_logging/02_experiment_tracking.html and a demo at https://demo.kedro.org/ (click the flask icon on the left sidebar)
  • d

    datajoely

    01/10/2022, 9:41 AM
    Example of multiple params: https://github.com/datajoely/modular-spaceflights/blob/main/conf/base/parameters/modelling.yml
  • d

    Daehyun Kim

    01/11/2022, 6:15 PM
    I have a question about the hooks plugin example, https://kedro.readthedocs.io/en/stable/07_extend_kedro/04_plugins.html#hooks. I created the hooks example and installed it into my venv. If I want to add MyHooks to my project, what do I need to do?
  • d

    datajoely

    01/11/2022, 6:17 PM
    You need to add it to
    settings.py
    . What do you mean by installing the hooks?
  • d

    datajoely

    01/11/2022, 6:17 PM
    I have an example of adding the custom
    TimingHooks
    here: https://github.com/datajoely/modular-spaceflights/blob/main/src/modular_spaceflights/settings.py
  • d

    Daehyun Kim

    01/11/2022, 6:19 PM
    Is TimingHooks in a plugin?
  • d

    datajoely

    01/11/2022, 6:19 PM
    Apologies! I misread what you wanted, it is not
  • d

    datajoely

    01/11/2022, 6:19 PM
    it's a lifecycle hook
  • d

    datajoely

    01/11/2022, 6:19 PM
    Our plugin documentation could use some work...
  • d

    Daehyun Kim

    01/11/2022, 6:19 PM
    What I want to do is define a custom hook in a plugin, so that when I install the plugin, the custom hook is applied in my Kedro project.
  • d

    datajoely

    01/11/2022, 6:20 PM
    Gotcha - @User who will be online tomorrow can help here, @User if he sees this knows that stuff well too
  • d

    datajoely

    01/11/2022, 6:20 PM
    The other thing you can do is read through how
    kedro-telemetry
    works since it's a really simple plugin https://github.com/quantumblacklabs/kedro-telemetry
  • d

    Daehyun Kim

    01/11/2022, 6:23 PM
    thank you, i will read it.
  • d

    Daehyun Kim

    01/11/2022, 6:28 PM
    It looks like all I need to do is install the plugin. After installing the example plugin I could see
    2022-01-11 10:27:39,174 - root - INFO - Reached after_catalog_created hook
    when I run the pipeline. This is awesome, thanks!
  • d

    Daehyun Kim

    01/11/2022, 6:30 PM
    One more question: if I already have
    after_catalog_created
    in a different hook in the current kedro project, which one will have priority between after_catalog_created() in the current kedro project and after_catalog_created() from the plugin?
  • d

    datajoely

    01/11/2022, 6:40 PM
    I don't know if you can guarantee the order - this is deffo an advanced question!
  • d

    datajoely

    01/11/2022, 6:40 PM
    I'll wait for the others to be online tomorrow to get back to you on this one!
  • d

    Daehyun Kim

    01/11/2022, 6:40 PM
    ok, thanks!
  • d

    deepyaman

    01/11/2022, 7:18 PM
    The one from the plugin will be called first; see https://kedro.readthedocs.io/en/latest/07_extend_kedro/02_hooks.html#registering-your-hook-implementations-with-kedro for more info on hook order. 🙂 (I didn't know this until I looked it up just now either lol)
  • i

    idriss__

    01/12/2022, 8:57 AM
    Hi guys, hope you are doing well. I'm wondering how to release a MemoryDataset that I won't use in any later pipeline nodes. Thank you.