beginners-need-help
  • d

    datajoely

    01/12/2022, 10:20 AM
    So I think this is a limitation of the current runner. You could write an explicit `after_node_run` hook, since that has the node name, catalog and inputs available: https://kedro.readthedocs.io/en/stable/kedro.framework.hooks.specs.NodeSpecs.html#kedro.framework.hooks.specs.NodeSpecs.after_node_run
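A minimal sketch of what such a hook could look like. The class name `NodeReportHooks` and the `seen` list are hypothetical, and the fallback decorator only exists so the snippet runs without Kedro installed; the hook arguments used here (`node`, `catalog`, `inputs`, `outputs`) are a subset of those listed in the linked `after_node_run` spec.

```python
import logging

try:
    from kedro.framework.hooks import hook_impl
except ImportError:
    # Fallback so the sketch can be run without Kedro installed.
    def hook_impl(func):
        return func

logger = logging.getLogger(__name__)


class NodeReportHooks:
    """Hypothetical hook class: records which inputs each node consumed."""

    def __init__(self):
        self.seen = []

    @hook_impl
    def after_node_run(self, node, catalog, inputs, outputs):
        # `inputs` and `outputs` map dataset names to the loaded/produced data.
        self.seen.append((node.name, sorted(inputs)))
        logger.info(
            "Node %s consumed %s and produced %s",
            node.name, sorted(inputs), sorted(outputs),
        )
```

In Kedro versions from around the time of this thread, an instance of the class would typically be registered through the `HOOKS` tuple in the project's `settings.py`; check the docs for your version.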
  • i

    idriss__

    01/12/2022, 10:58 AM
    ok i'll try that thanks !
  • c

    ChainYo

    01/13/2022, 10:16 AM
    Is there a way to automatically add a dataset to the catalog based on params? The API endpoint is exactly the same apart from the name
  • d

    datajoely

    01/13/2022, 10:17 AM
    You would have to use a lifecycle hook, but it's very possible
  • d

    datajoely

    01/13/2022, 10:17 AM
    it's not 'kedrific' because we like things to be readable at rest, but it is very possible
  • d

    datajoely

    01/13/2022, 10:18 AM
    essentially you would define a hook that retrieves parameters and dynamically adds datasets using `catalog.add()`
  • c

    ChainYo

    01/13/2022, 10:20 AM
    Thanks for the hint. Same applies for Nodes I imagine ?
  • c

    ChainYo

    01/13/2022, 10:20 AM
    I mean the preprocessing is the same, so in my pipeline I could also have a lifecycle hook?
  • c

    ChainYo

    01/13/2022, 10:21 AM
    Or do I need to add N nodes to the pipeline, where N is the number of dataset names?
  • d

    datajoely

    01/13/2022, 10:25 AM
    Let me try and dig up an example
  • d

    datajoely

    01/13/2022, 10:26 AM
    in the future we want to support the problem you're dealing with natively
  • d

    datajoely

    01/13/2022, 10:26 AM
    as it's a super reasonable use-case
  • d

    datajoely

    01/13/2022, 10:49 AM
    AfterCatalogCreated hook to prepopulate datasets from parameters
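The linked example is not reproduced in this archive, so here is a rough sketch of the idea described above: an `after_catalog_created` hook that reads dataset names from the parameters and registers one dataset per name with `catalog.add()`. Several things are assumptions for illustration: the parameter key `dataset_names`, the class name, the `make_dataset` factory, and that parameters are exposed under `feed_dict["parameters"]` (true for Kedro versions from around the time of this thread, but check your version). The fallback decorator only lets the snippet run without Kedro installed.

```python
from typing import Any, Callable, Dict

try:
    from kedro.framework.hooks import hook_impl
except ImportError:
    # Fallback so the sketch can be run without Kedro installed.
    def hook_impl(func):
        return func


class DynamicDatasetHooks:
    """Hypothetical hook class: one catalog entry per name in the parameters."""

    def __init__(self, make_dataset: Callable[[str], Any]):
        # make_dataset builds a dataset object for a given name, for example
        # an API dataset pointing at the shared endpoint with that name.
        self._make_dataset = make_dataset

    @hook_impl
    def after_catalog_created(self, catalog, feed_dict: Dict[str, Any]) -> None:
        params = feed_dict.get("parameters", {})
        # "dataset_names" is an assumed key in parameters.yml.
        for name in params.get("dataset_names", []):
            catalog.add(name, self._make_dataset(name))
```

The factory is injected so the hook itself stays independent of any particular dataset class. The nodes that consume these datasets still have to exist in the pipeline, so the same list of names would normally drive a loop that builds the N nodes as well.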
  • g

    ggerog

    01/13/2022, 3:42 PM
    Hi all, first of all thanks for the great work done on kedro! Just had an easy question about the logger. How do I start logging with the journal logger, not just the info.log logger?
  • d

    datajoely

    01/13/2022, 4:08 PM
    Hi @User, the journal is about to be deprecated so I'd be a little careful about using that one. 1) There are a bunch of settings in `logging.yml` to configure what the various loggers do. 2) In Python you can grab a logger and start using it via `logging.getLogger(name)`
  • d

    datajoely

    01/13/2022, 4:08 PM
    I can't exactly remember what the `name` of the journal logger is, maybe `kedro.journal`, but I'm not 100% sure
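For reference, a small standard-library sketch of the `logging.getLogger(name)` pattern mentioned above. The logger name `my_project.nodes` and the `split_data` function are made up for illustration; inside a Kedro project the `basicConfig` call would not be needed because the project's logging configuration is applied for you.

```python
import logging

# Stand-in for the project's logging configuration.
logging.basicConfig(
    level=logging.INFO,
    format="%(name)s - %(levelname)s - %(message)s",
)

# Hypothetical logger name; logging.getLogger(__name__) is the usual choice.
logger = logging.getLogger("my_project.nodes")


def split_data(rows, ratio):
    """Hypothetical node function that logs what it is doing."""
    cut = int(len(rows) * ratio)
    logger.info("Splitting %d rows at index %d", len(rows), cut)
    return rows[:cut], rows[cut:]


train, test = split_data(list(range(10)), 0.8)
```

Calling `logging.getLogger` twice with the same name returns the same logger object, so modules can fetch it independently.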
  • g

    ggerog

    01/13/2022, 4:13 PM
    Cool so I might just add another log then. I guess that would require adding another handler. Not sure how I will allocate the name of the log though?
    handlers:
        console:
            class: logging.StreamHandler
            level: INFO
            formatter: simple
            stream: ext://sys.stdout
    
        info_file_handler:
            class: logging.handlers.RotatingFileHandler
            level: INFO
            formatter: simple
            filename: logs/info.log
            maxBytes: 10485760 # 10MB
            backupCount: 20
            encoding: utf8
            delay: True
    
        error_file_handler:
            class: logging.handlers.RotatingFileHandler
            level: ERROR
            formatter: simple
            filename: logs/errors.log
            maxBytes: 10485760 # 10MB
            backupCount: 20
            encoding: utf8
            delay: True
    
        journal_file_handler:
            class: kedro.versioning.journal.JournalFileHandler
            level: INFO
            base_dir: logs/journals
            formatter: json_formatter
  • d

    datajoely

    01/13/2022, 4:14 PM
    we actually have docs here https://kedro.readthedocs.io/en/stable/08_logging/01_logging.html
  • d

    datajoely

    01/13/2022, 4:15 PM
    to introduce a new logger just use the 'key' under handlers
  • g

    ggerog

    01/13/2022, 4:17 PM
    thanks, yeah I had been reading that but it didn't mention the bit about keys. Cool, so logging.getLogger("info_file_handler") is how it would work for the logs/info.log log.
  • d

    datajoely

    01/13/2022, 4:17 PM
    👍
  • d

    datajoely

    01/13/2022, 4:18 PM
    yeah we could probably flesh out the docs a bit more (or if you want to make an open source contribution, feel free to raise a PR 😄 )
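One caveat on the exchange above, based on standard-library behaviour rather than anything Kedro-specific: the string passed to `logging.getLogger()` is a logger name, while the keys under `handlers` name handlers. A handler only receives records once it is attached to a logger, either under `loggers` or under `root`. The sketch below shows that wiring with `dictConfig`; the handler key `audit_file_handler`, the logger name `my_project.audit`, and the temporary file path are all invented for the example.

```python
import logging
import logging.config
import os
import tempfile

# Write to a temporary directory so the sketch leaves no files in the project.
audit_path = os.path.join(tempfile.mkdtemp(), "audit.log")

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "simple": {
            "format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s",
        },
    },
    "handlers": {
        # This key names the handler, not a logger.
        "audit_file_handler": {
            "class": "logging.handlers.RotatingFileHandler",
            "level": "INFO",
            "formatter": "simple",
            "filename": audit_path,
            "maxBytes": 10485760,  # 10MB
            "backupCount": 20,
            "encoding": "utf8",
        },
    },
    "loggers": {
        # This key is the name to pass to logging.getLogger().
        "my_project.audit": {
            "level": "INFO",
            "handlers": ["audit_file_handler"],
            "propagate": False,
        },
    },
}

logging.config.dictConfig(LOGGING)

audit_logger = logging.getLogger("my_project.audit")
audit_logger.info("model training started")
```

The same structure maps one-to-one onto a `logging.yml` file: add the handler under `handlers`, then reference it from a named entry under `loggers`.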
  • g

    ggerog

    01/13/2022, 4:40 PM
    Oh yeah, one final question: I was curious whether other variables are available for the formatter and how they are defined? Looks like jinja templating I guess.
    formatters:
        simple:
            format: "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
  • d

    datajoely

    01/13/2022, 5:03 PM
    This is actually pure python
  • d

    datajoely

    01/13/2022, 5:03 PM
    https://docs.python.org/3/library/logging.html#logrecord-attributes
  • d

    datajoely

    01/13/2022, 5:04 PM
    we then just take this yaml file and pass it to `logging.config.dictConfig`
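To make the point about the format string concrete: the `%(...)s` placeholders are ordinary %-style fields filled from attributes of a `LogRecord`, as listed in the linked Python docs. A small sketch using only the standard library follows; the logger name `kedro.io` and the message are just sample values.

```python
import logging

# Same format string as the `simple` formatter in logging.yml.
formatter = logging.Formatter(
    "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)

# Build a record by hand to show where each placeholder comes from.
record = logging.LogRecord(
    name="kedro.io",        # -> %(name)s
    level=logging.INFO,     # -> %(levelname)s
    pathname="example.py",
    lineno=1,
    msg="Loading %s",       # msg % args -> %(message)s
    args=("cars",),
    exc_info=None,
)

line = formatter.format(record)
print(line)
```

Any other attribute from the LogRecord table, such as `%(module)s` or `%(lineno)d`, can be added to the format string in the same way.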
  • g

    ggerog

    01/13/2022, 5:24 PM
    cool nice never knew about that!
  • g

    ggerog

    01/14/2022, 12:54 PM
    Back with more questions 😃. Was wondering what the recommended way of testing nodes would be? Particularly if you need some intermediate processing as an input for one of the nodes. Do you know if there would be a way to pause runs, then run tests? I guess you could do `kedro run --to-nodes=<node name>` and go step by step.
  • g

    ggerog

    01/14/2022, 12:55 PM
    The other idea I had was doing a diff on a log. Probably going step by step and doing unit-tests might be a bit more thorough.
  • d

    datajoely

    01/14/2022, 12:56 PM
    So the two ways I do it: 1. Using the CLI like you suggest and doing things like `kedro run --node` etc. 2. Using a proper debugger in VS Code or PyCharm
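On the unit-test idea raised earlier in this exchange, one possible approach (not something suggested in the thread itself) relies on the fact that node functions are plain Python functions, so they can be called directly with a small hand-built stand-in for the intermediate data. The function and test below are hypothetical and would normally be collected by a test runner such as pytest.

```python
def drop_missing_prices(rows):
    """Hypothetical node function: keep only rows that have a price."""
    return [row for row in rows if row.get("price") is not None]


def test_drop_missing_prices():
    # Hand-built sample of the intermediate data the node would receive.
    intermediate = [
        {"id": 1, "price": 9.5},
        {"id": 2, "price": None},
        {"id": 3},
    ]
    assert drop_missing_prices(intermediate) == [{"id": 1, "price": 9.5}]
```

This covers the logic of a single node without running the pipeline; the CLI and debugger routes above remain the way to check how nodes behave on real intermediate datasets.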