Powered by Linen
beginners-need-help
  • r

    rafael.gildin

    09/24/2022, 9:47 PM
    I’m looking for a custom sequential runner. Any tips?
  • d

    Deemac

    09/26/2022, 11:54 AM
    Hello. Would someone be able to please tell me the best way to debug my pipelines? I use PyCharm. I wish to use the debugger. Which script should I run? Do I need to change anything in the run configuration? Thanks!
  • d

    datajoely

    09/26/2022, 11:55 AM
    There is a tutorial for setting up a PyCharm run configuration in the docs! Then you just add breakpoints and use the debugger
  • d

    datajoely

    09/26/2022, 11:55 AM
    Also if you're lazy you can use pdb and the
    breakpoint()
    statement
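    A minimal sketch of the breakpoint() suggestion. The node function `preprocess` and its input are hypothetical, not part of any Kedro project mentioned here; the point is only where the call goes.

```python
def preprocess(rows):
    """Hypothetical node function: normalise a list of strings."""
    cleaned = [row.strip().lower() for row in rows]
    # Execution pauses here and opens pdb when the node runs.
    # Setting the environment variable PYTHONBREAKPOINT=0 turns the call into a no-op.
    breakpoint()
    return cleaned
```

    Inside pdb, `p cleaned` prints the intermediate value and `c` continues the run.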
  • n

    noklam

    09/26/2022, 12:12 PM
    https://kedro.readthedocs.io/en/stable/development/set_up_pycharm.html Got you covered!
  • r

    rohan_ahire

    09/26/2022, 2:33 PM
    Hi all. How do I reference credentials from an Azure-backed key vault so that the data catalog can access the Azure storage buckets?
  • d

    datajoely

    09/26/2022, 2:51 PM
    You'll need to use a before catalog created hook to inject the credentials into our catalog
  • d

    datajoely

    09/26/2022, 2:51 PM
    I think
  • d

    datajoely

    09/26/2022, 2:51 PM
    That or tweak how your templated config loader works in settings.py
  • r

    rohan_ahire

    09/26/2022, 3:23 PM
    I added the credentials to my spark configuration and it worked. So I am not passing credentials to data catalog, just making them available at the cluster level.
  • d

    datajoely

    09/26/2022, 3:43 PM
    That's the best way
  • g

    Goss

    09/26/2022, 5:33 PM
    Trying to adapt the spaceflight tutorial to run on Kubeflow but I get this error:
    kedro.io.data_catalog - INFO - Loading data from 'data_science.candidate_modelling_pipeline.metrics' (MemoryDataSet)...
    Since this is Kubeflow and MemoryDataSet cannot be used, this is not surprising. I observe that
    data_science.candidate_modelling_pipeline.metrics
    is not mentioned in catalog.yml. Is the fix to simply modify catalog.yml to add in this dataset similar to the way
    data_science.active_modelling_pipeline.metrics
    is already in there? As in, was this just an oversight? In which case, I would have separate output files for active and candidate metrics...
  • g

    Goss

    09/26/2022, 8:56 PM
    If I run
    kedro catalog create --pipeline __default__
    on the space tutorial, it generates a bunch of datasets not in the catalog:
    data_science.active_modelling_pipeline.X_test:
      type: MemoryDataSet
    data_science.active_modelling_pipeline.X_train:
      type: MemoryDataSet
    data_science.active_modelling_pipeline.y_test:
      type: MemoryDataSet
    data_science.active_modelling_pipeline.y_train:
      type: MemoryDataSet
    data_science.candidate_modelling_pipeline.X_test:
      type: MemoryDataSet
    data_science.candidate_modelling_pipeline.X_train:
      type: MemoryDataSet
    data_science.candidate_modelling_pipeline.y_test:
      type: MemoryDataSet
    data_science.candidate_modelling_pipeline.y_train:
      type: MemoryDataSet
    Why aren't these included in
    conf/base/catalog.yml
    when their absence causes errors like
    ValueError: Pipeline input(s) {'data_science.active_modelling_pipeline.y_train', 'data_science.active_modelling_pipeline.X_train'} not found in the DataCatalog
    ???
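    One possible way to get persistent entries for the datasets listed above is to generate them rather than type them by hand. This is a sketch under assumptions: the folder `data/05_model_input` and the choice of `pickle.PickleDataSet` are illustrative, and the helper names are made up for this example.

```python
from itertools import product

NAMESPACES = [
    "data_science.active_modelling_pipeline",
    "data_science.candidate_modelling_pipeline",
]
SPLITS = ["X_test", "X_train", "y_test", "y_train"]


def persistent_entries(folder="data/05_model_input"):
    """Build one catalog entry per namespaced split (folder is an assumption)."""
    entries = {}
    for namespace, split in product(NAMESPACES, SPLITS):
        name = f"{namespace}.{split}"
        entries[name] = {
            "type": "pickle.PickleDataSet",
            "filepath": f"{folder}/{name}.pkl",
        }
    return entries


def to_catalog_yaml(entries):
    """Render the entries in the same layout as conf/base/catalog.yml."""
    lines = []
    for name, spec in entries.items():
        lines.append(f"{name}:")
        for key, value in spec.items():
            lines.append(f"  {key}: {value}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_catalog_yaml(persistent_entries()))
```

    The printed text can be pasted into the catalog file and adjusted to whatever storage the cluster actually uses.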
  • g

    Goss

    09/26/2022, 9:39 PM
    If I run ` kedro catalog create pipeline
  • r

    rohan_ahire

    09/27/2022, 8:20 PM
    Is it possible to access the traceback of a failed Kedro pipeline? We want to call a REST API where the request body contains the error message of the failed pipeline.
  • n

    noklam

    09/27/2022, 8:22 PM
    Sure, it's just a Python program. You can wrap the run in a try/except block and return the traceback message if you want
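    A minimal sketch of the try/except idea, using only the standard library. `run_pipeline` stands for whatever callable starts the pipeline in your project (an assumption), and the payload field names are invented for illustration. Sending the body to the REST API is left out.

```python
import json
import traceback


def run_with_error_report(run_pipeline):
    """Call run_pipeline() and describe the outcome as a plain dict."""
    try:
        run_pipeline()
    except Exception as exc:
        return {
            "status": "failed",
            "error": repr(exc),
            # Full traceback text of the exception currently being handled.
            "traceback": traceback.format_exc(),
        }
    return {"status": "succeeded"}


def to_json_body(report):
    """Serialise the report so it can be used as a request body."""
    return json.dumps(report)
```

    The resulting string can be posted with any HTTP client once the run has finished or failed.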
  • n

    noklam

    09/27/2022, 8:24 PM
    @rohan_ahire https://github.com/kedro-org/kedro/issues/1846 We are quite interested to understand how people are exposing Kedro pipelines via a web API, and to see if there is anything we can improve
  • b

    Barros

    09/29/2022, 10:42 PM
    How do I disable rich in kedro 0.18.3? For my use case, rich backtrace is too verbose and it doesn't integrate well with vscode.
  • b

    Barros

    09/29/2022, 10:44 PM
    Sometimes rich breaks the line and I can't just ctrl-click the file to get to the code in vscode, and this can become annoying over time
  • m

    Merel

    09/30/2022, 1:37 PM
    You can switch back to "plain" logging following these instructions: https://kedro.readthedocs.io/en/stable/logging/logging.html#use-plain-console-logging
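    For reference, this is roughly what a plain console setup looks like in the standard `logging.config.dictConfig` schema. It is a sketch only: the handler and formatter names are assumptions and may not match the keys in your project's logging configuration, so follow the linked instructions for the actual file.

```python
import logging
import logging.config

PLAIN_LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "simple": {
            "format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
        }
    },
    "handlers": {
        # A plain StreamHandler: no colours, no line wrapping by the handler.
        "console": {
            "class": "logging.StreamHandler",
            "level": "INFO",
            "formatter": "simple",
            "stream": "ext://sys.stdout",
        }
    },
    "root": {"level": "INFO", "handlers": ["console"]},
}


def use_plain_console_logging():
    """Replace the root logger's handlers with the plain console handler."""
    logging.config.dictConfig(PLAIN_LOGGING)
```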
  • b

    Barros

    09/30/2022, 2:19 PM
    Didn't work :/
  • g

    Goss

    09/30/2022, 5:57 PM
    How do I use a different default dataset? Trying to run on Kubeflow where MemoryDataSet won't work. Want to change it to something like PickleDataSet. Seems like DataCatalogWithDefault is no longer in v0.18...
  • d

    datajoely

    09/30/2022, 6:01 PM
    You can define the default dataset with a custom runner
  • g

    Goss

    09/30/2022, 6:05 PM
    Got it... thanks! https://kedro.readthedocs.io/en/latest/nodes_and_pipelines/run_a_pipeline.html#custom-runners
  • b

    Barros

    09/30/2022, 6:38 PM
    Oh no, actually it worked partially! The rich logging handler stops logging when I do this change, but the rich backtrace (when something breaks) is still there. This can be tested when you call a pipeline that has wrong inputs or a pipeline that does not exist, for example. So this is only for the console logging, not for the actual backtrace when an exception occurs
  • b

    Barros

    09/30/2022, 6:38 PM
    This may be worth opening an issue for
  • b

    Barros

    09/30/2022, 6:43 PM
    You can see this here: the upper part (the console handler) and below the backtrace still using rich
  • d

    datajoely

    09/30/2022, 6:48 PM
    So you can disable tracebacks this way I think https://github.com/kedro-org/kedro/issues/1712#issuecomment-1189517668
  • b

    Barros

    09/30/2022, 6:55 PM
    No, this still makes rich the main backtrace. Thanks to this I have found a related issue that is interesting for this discussion: https://github.com/Textualize/rich/issues/2461
  • b

    Barros

    09/30/2022, 7:00 PM
    So the workaround right now is to comment out lines 217-219 in
    kedro/framework/project/__init__.py
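    A possible alternative that avoids editing the installed package, sketched under two assumptions: that the rich traceback is installed by replacing `sys.excepthook`, and that this code runs after Kedro's project module has been imported (for example from the project's own settings or entry point). The function name is made up for this example.

```python
import sys


def restore_default_traceback():
    """Put Python's original exception hook back in place."""
    # sys.__excepthook__ always holds the interpreter's default hook,
    # regardless of what has been assigned to sys.excepthook since startup.
    sys.excepthook = sys.__excepthook__
```

    This does not change logging output, only how uncaught exceptions are printed.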