Powered by Linen
beginners-need-help
  • s

    silvadenis

    09/15/2022, 1:06 PM
    Any suggestions for handling dynamic pipelines? I have a scenario that I would like to run a model training and model evaluation based on labels on a dataset. So each label would trigger one pipeline.
  • d

    datajoely

    09/15/2022, 1:41 PM
    So dynamic pipelines are a bit of a minefield because they become complex very quickly!
  • d

    datajoely

    09/15/2022, 1:41 PM
    In this situation I'd go for one of two approaches:
    - have your labeling process write to a dataset Kedro can read
    - have the output of your labeling process passed in via kedro run --params
  • s

    silvadenis

    09/15/2022, 1:44 PM
    So for example we have two labels, A and B: the labeling pipeline would output a single dataframe with values for A and B, and the training pipeline is coded to handle one label; then I can pass kedro run --params label:A. Did I get it right?
  • d

    datajoely

    09/15/2022, 2:33 PM
    Something like that
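A minimal sketch of what that could look like (the function and column names here are illustrative, not from the thread): a training node that filters the labeled dataframe down to whichever label is passed via --params.

```python
import pandas as pd


def train_for_label(labeled: pd.DataFrame, label: str) -> pd.DataFrame:
    """Keep only the rows for one label before fitting a model on them."""
    subset = labeled[labeled["label"] == label].reset_index(drop=True)
    # ... fit and evaluate the model on `subset` here ...
    return subset
```

In the pipeline definition the parameter would be wired in as `params:label`, e.g. `node(train_for_label, inputs=["labeled_data", "params:label"], outputs="training_subset")`, and `kedro run --params label:A` overrides it at run time.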
  • n

    nikolarahman

    09/15/2022, 2:41 PM
    Hi Kedro community!! I just started using Kedro and I got stuck. I'm pretty sure lots of people have had the same issue, but I've tried searching everywhere and I can't find the solution. Could anyone point me to the right answer? My pipeline is pretty standard, I assume:
    * load images from a directory
    * preprocess
    * CNN inference
    * postprocess
    * save output
    I've created a partitioned image dataset like in the pokemon example here: https://kedro.readthedocs.io/en/latest/extend_kedro/custom_datasets.html. I'd like to apply the inference pipeline sequentially to each image (or each minibatch), but I can't figure out how to pass the images one by one to the pipeline. Does anyone know how to do this?
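For reference, a node fed by a PartitionedDataSet receives a dict mapping each partition id to a zero-argument load callable, so images can be processed one at a time without loading the whole directory up front. A minimal sketch (the names run_inference and model are illustrative):

```python
from typing import Any, Callable, Dict


def run_inference(
    partitions: Dict[str, Callable[[], Any]],
    model: Callable[[Any], Any],
) -> Dict[str, Any]:
    """Apply `model` to each partition sequentially.

    Each value in `partitions` is a load function, so an image is only
    read from disk when its callable is invoked.
    """
    results = {}
    for partition_id, load in sorted(partitions.items()):
        image = load()  # lazily load one image (or one minibatch)
        results[partition_id] = model(image)
    return results
```

Returning a dict keyed by partition id also lets the output be written back out through another PartitionedDataSet.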
  • s

    sri

    09/16/2022, 7:49 PM
    In catalog.yml we are using runtime arguments like bucket_name below:

        raw_boat_data:
          filepath: "s3a://${bucket_name}/${key_prefix}/${folders.raw}/boats.csv"
          file_format: parquet

    bucket_name is also defined in globals.yml and is overridden by a runtime parameter through the command line. I see that sometimes bucket_name is not picked up correctly from the command line; it still takes the value from globals.yml. What am I missing here?
  • d

    datajoely

    09/16/2022, 7:50 PM
    The command-line part is custom; default Kedro doesn't do that, IIRC
  • n

    noklam

    09/16/2022, 7:53 PM
    See this if you are on 0.18.x https://github.com/kedro-org/kedro/discussions/1782
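For context, on 0.18.x the ${...} templating described above is typically enabled through TemplatedConfigLoader in settings.py. A sketch assuming a standard project layout (this fragment only runs inside a Kedro project):

```python
# src/<package_name>/settings.py (0.18.x)
from kedro.config import TemplatedConfigLoader

CONFIG_LOADER_CLASS = TemplatedConfigLoader
CONFIG_LOADER_ARGS = {
    # values from *globals.yml fill the ${...} placeholders in catalog.yml
    "globals_pattern": "*globals.yml",
}
```

Overriding those globals from the command line is not built in, which is what the linked discussion covers.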
  • r

    rohan_ahire

    09/17/2022, 4:35 PM
    Understanding data catalog datasets
  • r

    rohan_ahire

    09/19/2022, 8:06 PM
    Using kedro in databricks workflow job
  • v

    Vici

    09/20/2022, 7:44 AM
    Hey, I wonder whether the Pipeline README markdowns are meant to end up in the automatic sphinx docs. It doesn't seem so? I searched the html files for keywords from my Pipelines' markdown files and didn't have any matches. Or is there an easy way to switch behavior? Thank you!
  • d

    datajoely

    09/20/2022, 8:47 AM
    So we provide only a very thin layer on top of Sphinx here and aren't opinionated about how you construct your docs. The README files can easily be added to the toctree in index.rst, but they aren't included automatically.
  • v

    Vici

    09/20/2022, 1:33 PM
    I see, thank you for helping out a novice 😊
  • d

    datajoely

    09/20/2022, 1:47 PM
    We all were once!
  • g

    Goss

    09/22/2022, 3:53 PM
    Is there any way to have a pipeline run a bash script as a node?
  • d

    datajoely

    09/22/2022, 3:54 PM
    Via the subprocess module but it's messy
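A minimal sketch of the subprocess approach: a node that shells out and returns the script's stdout so downstream nodes can consume it (the script path in the usage note is illustrative).

```python
import subprocess


def run_shell_command(command: list[str]) -> str:
    """Run an external command; return its stdout, raising on a non-zero exit."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout
```

In a pipeline this could be wrapped as `node(lambda: run_shell_command(["bash", "scripts/my_script.sh"]), inputs=None, outputs="script_output")` — messy, as noted, because the script's real inputs and outputs are invisible to Kedro.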
  • g

    Goss

    09/22/2022, 3:58 PM
    Anyone ever try the sh module for this?
  • r

    rafael.gildin

    09/23/2022, 2:51 PM
    Hi guys, is there any way to continue the pipeline even if a node fails?
  • n

    noklam

    09/23/2022, 3:00 PM
    Potentially use the on_node_error hook, but be cautious about the error handling.
  • d

    datajoely

    09/23/2022, 3:06 PM
    Or a custom runner
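A sketch of the hook route (class name and log message are illustrative, and this fragment needs a Kedro project to run). Note that on_node_error only gives you a place to log or report the failure; by itself it does not resume the run, which is why a custom runner is the more complete fix:

```python
# src/<package_name>/hooks.py (register the class in settings.py HOOKS)
import logging

from kedro.framework.hooks import hook_impl

logger = logging.getLogger(__name__)


class NodeErrorHooks:
    @hook_impl
    def on_node_error(self, error: Exception, node):
        # Called when a node raises, before the run stops.
        logger.error("Node %s failed: %s", node.name, error)
```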
  • r

    rafael.gildin

    09/23/2022, 3:09 PM
    Thanks, I'll look into them!
  • r

    rafael.gildin

    09/23/2022, 3:10 PM
    Another issue: is there a way to reduce the error message from the CLI?
  • n

    noklam

    09/23/2022, 3:27 PM
    You can set the level of logs to display via logging.yml, but in general I don't think you want to hide error messages
  • r

    rafael.gildin

    09/23/2022, 3:32 PM
    thanks
  • r

    rafael.gildin

    09/23/2022, 4:22 PM
    how?
  • n

    noklam

    09/24/2022, 8:40 PM
    You simply update the logging level to the one you want
  • n

    noklam

    09/24/2022, 8:41 PM
    There should be a logging.yml; it uses the standard Python logging module if you need more docs about it.
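For example, a conf/base/logging.yml excerpt that raises the threshold (the keys follow Python's logging.config.dictConfig schema; the exact logger and handler names vary by project):

```yaml
loggers:
  kedro:
    level: WARNING   # only WARNING and above from Kedro itself
root:
  level: INFO
```

Note this only filters log records; a traceback the CLI prints on failure is not a log record, which may be why it still shows up.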
  • r

    rafael.gildin

    09/24/2022, 9:46 PM
    Even if I change it, the huge error message doesn't disappear.
  • r

    rafael.gildin

    09/24/2022, 9:46 PM
    Thank you anyway