Kedro Discord (https://kedro.org/), archived by Linen
advanced-need-help
  • datajoely
    08/17/2021, 7:46 AM
    Thanks for answering @User 👑
  • user
    08/25/2021, 1:28 PM
    Saving data with DataCatalog https://stackoverflow.com/questions/68923747/saving-data-with-datacatalog
  • Anish Shah @ WANDB
    09/02/2021, 6:29 PM
    Has anyone served a custom MLflow model that was trained and saved in Kedro via kedro-mlflow? I have trained a mock custom model in which I define operations to save internal data sources to an MLflow run and load them into the desired custom `mlflow.pyfunc.PythonModel`. I am able to serve the model using `--no-conda`; without this flag I get `ModuleNotFoundError: No module named '<kedro_package_name>'`. In my MLmodel file I set `loader_module: <kedro_package_name>.extras.mlflow.loader_cosine_model`. I can provide any additional details if anyone can help!
  • Arnaldo
    09/02/2021, 6:39 PM
    @User
  • user
    09/03/2021, 8:28 PM
    Hi, I think you have declared your PipelineML object as in this demo: https://github.com/Galileo-Galilei/kedro-mlflow-tutorial/blob/4c85c357162a85093f0875fe3085fbd9ebe2e4be/src/kedro_mlflow_tutorial/hooks.py#L60-L70 You can see that it is possible to specify the ``conda_env`` here. It accepts either a path or a dictionary. In the tutorial I suggest using ``{your_kedro_package}=={__version__}``, because in an enterprise setup I often deploy my package to an internal Nexus/PyPI, so it can be downloaded when needed. It won't work for you if you haven't published your package on PyPI first, because conda tries to pip install it. 4 solutions:
    - pass the dictionary of your requirements instead of the default one in the ``pipeline_ml_factory`` call
    - pass the path to your requirements.txt or conda.yml file in the ``pipeline_ml_factory`` call
    - publish your package on PyPI so conda can download it (not recommended for public projects, but likely the best solution in an enterprise setup)
    - create an empty conda env, activate it, install your package manually inside it (``pip install -e /path/to/kedro/package/src``) and call ``mlflow serve`` inside it.
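    The environment dictionary mentioned in the first option is plain Python and can be sketched like this; the ``pipeline_ml_factory`` keyword names vary across kedro-mlflow versions, so treat this as a hedged sketch (check your installed version's signature), not copy-paste code:

    ```python
    # Sketch: build the conda_env dictionary to pass to kedro-mlflow's
    # pipeline_ml_factory (usage assumed from the linked tutorial; verify
    # the parameter name against your kedro-mlflow version).
    def build_conda_env(package_name, version, python_version="3.8"):
        """MLflow-style conda spec that pins your packaged Kedro project."""
        return {
            "name": f"{package_name}_env",
            "channels": ["defaults"],
            "dependencies": [
                f"python={python_version}",
                "pip",
                # Only resolvable if the package is published somewhere pip
                # can reach (public PyPI or an internal Nexus/PyPI index):
                {"pip": [f"{package_name}=={version}"]},
            ],
        }

    conda_env = build_conda_env("my_kedro_package", "0.1.0")
    ```

    If the package is not published anywhere, the last two options in the list (publishing it, or serving from a manually prepared env) are the ones that avoid this dictionary entirely.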
  • user
    09/09/2021, 7:16 PM
    Hello everyone. How do I get the current run_id inside a node when using SequentialRunner.run (not using KedroSession nor KedroContext)?
  • datajoely
    09/09/2021, 8:05 PM
    It's present in the hooks: https://kedro.readthedocs.io/en/latest/kedro.framework.hooks.specs.NodeSpecs.html#kedro.framework.hooks.specs.NodeSpecs.before_node_run Further reading: https://kedro.readthedocs.io/en/latest/07_extend_kedro/02_hooks.html
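    To illustrate the pattern (deliberately without Kedro imports, so this is a shape sketch rather than real hook registration): a hook object can stash the `run_id` handed to `before_node_run`. In an actual project the method would carry the `@hook_impl` decorator and the class would be registered with the framework:

    ```python
    # Pure-Python sketch of the hook pattern described in the NodeSpecs docs:
    # the framework calls before_node_run with the current run_id, so a hook
    # instance can capture it once and expose it to other code.
    class RunIdCaptureHook:
        def __init__(self):
            self.run_id = None

        def before_node_run(self, node_name, run_id):
            # Invoked before each node executes; remember the run's id.
            self.run_id = run_id

    hook = RunIdCaptureHook()
    # Simulate what the runner would do for one node:
    hook.before_node_run("train_model", "2021-09-09T20.05.33.000Z")
    ```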
  • user
    09/09/2021, 11:20 PM
    I know about hooks, but I don't see in the documentation how to register hooks without KedroSession. I am using SequentialRunner.run to run my pipelines, not KedroSession.
  • user
    09/10/2021, 3:28 AM
    How to let kedro execute nodes in sequence https://stackoverflow.com/questions/69126849/how-to-let-kedro-execute-nodes-in-sequence
  • datajoely
    09/10/2021, 6:21 AM
    Ah, I'm sorry, I misunderstood. If you're not following the standard structure, I think your only option is to define a custom runner.
  • user
    09/10/2021, 9:07 AM
    Running a kedro pipeline with inputs and outputs defined in the command line https://stackoverflow.com/questions/69129846/running-a-kedro-pipeline-with-inputs-and-outputs-defined-in-the-command-line
  • user
    09/10/2021, 1:07 PM
    While running kedro - ImportError: Bad git executable https://stackoverflow.com/questions/69132675/while-running-kedro-importerror-bad-git-executable
  • Isaac89
    09/13/2021, 6:36 AM
    Hi everyone! I would like to be able to pass command line arguments (like input path and output path) to update the catalog. I saw the possibility of using kedro.config.TemplatedConfigLoader, but it requires manually defining the globals_dict variables in the hooks. Is there a way to intercept arguments defined through the CLI and use them in the TemplatedConfigLoader? Furthermore, thinking about the problem of dynamic catalogs and reproducibility, it would be nice to be able to save the rendered Jinja2 catalog, so that anyone wanting to reproduce a run would only have to run the pipeline with the saved catalog. What would be the Kedro way of doing this? Thanks for your help!
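    For intuition, a stripped-down sketch of the substitution step TemplatedConfigLoader performs with its globals: replace ``${...}`` placeholders in catalog strings. The ``render`` helper below is hypothetical, not Kedro API; the real loader additionally handles nesting, defaults, and optional Jinja2 templating:

    ```python
    import re

    # Simplified stand-in for TemplatedConfigLoader's placeholder substitution:
    # fill ${name} markers in a catalog value from a globals dictionary.
    def render(template, globals_dict):
        return re.sub(
            r"\$\{(\w+)\}",
            lambda m: str(globals_dict[m.group(1)]),
            template,
        )

    # e.g. a catalog filepath templated on a CLI-supplied input path:
    entry = render("data/01_raw/${input_path}/data.csv", {"input_path": "run_42"})
    ```

    Saving the rendered result (here, `entry`) alongside a run is essentially the "frozen catalog" idea in the question.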
  • Yetunde
    09/14/2021, 10:33 AM
    We've actually been talking about a lot of issues related to configuration here: https://github.com/quantumblacklabs/kedro/issues/891 Do you want to pop a comment on this GitHub issue?
  • Solarer
    09/14/2021, 12:37 PM
    I also have a configuration-related question: can I run `kedro run --env test` and still have my settings in 'test' overwritten by my settings in 'local'? I think it only works when NOT specifying an environment
  • Solarer
    09/14/2021, 12:37 PM
    otherwise my settings in conf/local are ignored
  • datajoely
    09/14/2021, 3:01 PM
    You would have to alter the TemplatedConfigLoader to do this. IIRC there is a post on this very topic by @Ignacio if you search for it.
  • Solarer
    09/14/2021, 3:11 PM
    Hmm, I am thinking about opening a feature request for that, because in my current project I am always using an environment, so 'local' does not work at all. Can you point me to the post by @User? I was not able to find it on GitHub.
  • datajoely
    09/14/2021, 3:14 PM
    I think this is it: https://discord.com/channels/778216384475693066/846330075535769601/875275273430515795
  • Isaac89
    09/15/2021, 12:31 PM
    Interesting results! Thanks for pointing me to it!
  • Waldrill
    09/20/2021, 4:39 PM
    Hello everyone. I'd like an opinion on a scalability issue I'm facing, mostly because I'm new to Kedro, and maybe y'all can point me to the best way out of this. My problem is that I have a generic enough pipeline that can be applied to several pieces of factory equipment. Currently it serves only one, but the client wants to scale this up. A project characteristic is that the catalog and parameters change substantially between the applications. And a restriction is that we want to be able to run the pipelines independently as needed, meaning that not only the raw data but also the outputs should change between them. I'm worried about duplicating code, because it makes maintenance difficult, and I'm also worried about losing the convenient environment usage to separate domains: `local`, `production_vm`, `production_cloud`, etc. That said ... how can I keep the pipeline reusable with substantial changes in catalog and parameters? Thanks for reading! 👍
  • Waldrill
    09/20/2021, 4:49 PM
    P.S.: It's not that I haven't found ways out of it ... but my idea is to check on best practices and avoid future rework.
  • user
    09/20/2021, 5:28 PM
    You can use nested environments in this case. In conf/, you can maintain the following structure:
    conf/
    |__ app1/
        |__ local
        |__ vm
    |__ app2/
    And run the pipeline with `kedro run --env="app1/local"`
  • Waldrill
    09/20/2021, 6:03 PM
    Thanks for the idea @User. Following your example, will it respect a default parameter file for each app?
    conf/
    ├── app1/
    │   ├── parameters.yml
    │   ├── local/
    │   │   └── catalog.yml
    │   └── vm/
    │       └── catalog.yml
    └── app2/
    Will it read conf/app1/parameters.yml if passed `--env="app1/local"`?
  • datajoely
    09/20/2021, 6:27 PM
    @Waldrill it should take any of the keys present in those nested parameters files as overrides
  • datajoely
    09/20/2021, 6:28 PM
    https://kedro.readthedocs.io/en/latest/04_kedro_project_setup/02_configuration.html#additional-configuration-environments
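    Conceptually, the environment mechanism in that page is a layered merge: base configuration is read first, then the chosen `--env` overrides matching keys. A minimal sketch with plain dicts (not Kedro's actual ConfigLoader, and ignoring its duplicate-key checks):

    ```python
    # Minimal sketch of environment layering: the selected environment's
    # values win over base; keys absent from the environment fall through.
    def layer_config(base, env):
        merged = dict(base)
        merged.update(env)
        return merged

    base_params = {"learning_rate": 0.01, "epochs": 10, "model": "xgboost"}
    app1_local = {"epochs": 50}  # hypothetical conf/app1/local/parameters.yml
    params = layer_config(base_params, app1_local)
    ```

    This is why, in the nested-environment layout above, keys defined under `conf/app1/local/` override the app-level defaults while everything else is inherited.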
  • Waldrill
    09/20/2021, 6:32 PM
    Cool @User, were environments made for this kind of separation (between applications, I mean), or is it just a workaround? I don't want to use something that might not be designed for this and then struggle with it in future releases.
  • datajoely
    09/20/2021, 6:33 PM
    No, it's a core part of Kedro
  • datajoely
    09/20/2021, 6:33 PM
    The canonical example is staging/prod/test
  • datajoely
    09/20/2021, 6:33 PM
    But we've seen this used for per-country pipelines, for example