Powered by Linen
beginners-need-help
  • j

    JA_next

    06/15/2022, 7:15 PM
    Thanks @avan-sh, is this by design? Then can I export the catalog from a Jupyter notebook to a local data catalog YAML file? I don't think copying and pasting the path and dataset name is good practice, or the only approach.
  • n

    noklam

    06/15/2022, 8:05 PM
    Are you trying to create the catalog interactively? It is quite difficult to infer the YAML file based on the Python object; there are many possible options.
  • j

    JA_next

    06/16/2022, 2:11 AM
    yeah, that's why I am asking. I am a little surprised we cannot update the catalog YAML interactively. can you elaborate a little on the other options?
  • a

    avan-sh

    06/16/2022, 5:37 AM
    write catalog info to files
  • o

    ouzo61

    06/16/2022, 1:18 PM
    Hello everyone, does anyone have a good resource on how to use Great Expectations with Kedro?
  • y

    Yetunde

    06/16/2022, 2:38 PM
    Using Great Expectations
  • w

    waylonwalker

    06/16/2022, 7:34 PM
    Is there a discussion already open about how heavy-handed kedro is with its requirements? I occasionally have issues where things have moved on since the last kedro release, but kedro does not allow new enough versions. Currently my build-reqs is broken because of this issue in pip-tools: https://github.com/jazzband/pip-tools/issues/1617 It's fixed after 6.5.1, but kedro does not want me to use it (I am able to, but get pip warnings all over). Will raise an issue if there isn't already one somewhere.
  • a

    antony.milne

    06/16/2022, 8:23 PM
    kedro requirements
  • m

    mjmare

    06/17/2022, 2:12 PM
    How do I get a list of pipelines (in a CLI utility)? In the past (<0.18) I did something like this:
    metadata = bootstrap_project(project_dir)
    
    with KedroSession.create(metadata.package_name) as session:
        context = session.load_context()
        pipeline_names = sorted(context.pipelines.keys())
    Now, however, context.pipelines does not exist anymore. What is the proper way to do this? TIA
  • a

    antony.milne

    06/17/2022, 2:33 PM
    That should be
    from kedro.framework.project import pipelines
  • d

    datajoely

    06/17/2022, 2:34 PM
    This is now possible because, in 0.18, pipelines are basically Python packages and don't need any magic to be registered
  • m

    mjmare

    06/17/2022, 2:43 PM
    @datajoely @antony.milne Antony's solution works for me, since I really needed the names under which the pipelines were registered, not the package. Later on, I do a session.run(pipeline_name=pipeline_name)
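A minimal sketch of the approach discussed above, for readers who want the registered pipeline names from a small CLI helper. It assumes Kedro 0.18 and that `bootstrap_project` has to run before `pipelines` is read; the function names `registered_names` and `list_project_pipelines` are hypothetical and not part of Kedro.

```python
from pathlib import Path
from typing import List, Mapping


def registered_names(registry: Mapping[str, object]) -> List[str]:
    """Return the names under which pipelines are registered, sorted."""
    return sorted(registry.keys())


def list_project_pipelines(project_dir: str) -> List[str]:
    """List registered pipeline names for the Kedro project at project_dir."""
    # Imported lazily so registered_names() stays usable without Kedro installed.
    from kedro.framework.project import pipelines
    from kedro.framework.startup import bootstrap_project

    # bootstrap_project() configures the project so that `pipelines`
    # resolves to whatever register_pipelines() returns.
    bootstrap_project(Path(project_dir))
    return registered_names(pipelines)
```

Each name returned this way is a registered name (not a package name), so it can be passed on as the pipeline_name in the session.run call mentioned above.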
  • g

    greeeen

    06/22/2022, 1:20 AM
    hi everyone, i have a question about the project structure. if i want to separate the core logic and the kedro pipelines into 2 modules, is there any configuration that i need to modify correspondingly?
    file structure (omitting all other files):
    - my_project
      - src
        - core
        - kedro
  • a

    avan-sh

    06/23/2022, 2:31 AM
    You don’t need to change any configuration for this; your pipelines should run fine irrespective of where you define them, as long as they’re imported & registered under register_pipelines
  • g

    greeeen

    06/23/2022, 2:37 AM
    got it, thanks. i was just wondering if there will be issues with packaging/Kedro's load_context..
  • a

    avan-sh

    06/23/2022, 2:49 AM
    It should be fine as long as you’re not moving settings & pipeline_registry files.
  • n

    noklam

    06/23/2022, 1:03 PM
    So core is some Python module and kedro is where the kedro project is? This is fine, as core is external to kedro, like any site_package you have. kedro is really where the kedro project is, so you just run kedro xxx from there.
  • g

    greeeen

    06/23/2022, 1:15 PM
    > So core is some Python module and kedro is where the kedro project is?
    yes, you are correct. i tried it out anyway and i found that i need to modify the pyproject.toml as follows:
    [tool.kedro]
    package_name = "kedro"
    project_name = "my_project"
    project_version = "0.17.5"
  • w

    williamc

    06/23/2022, 4:31 PM
    Let's say I just cloned my kedro project repo to another machine, and its datasets are versioned and configured to use S3 for storage. If I try to run a pipeline that depends on those datasets I get the infamous kedro.io.core.VersionNotFoundError. Bucket has versions all the way up to 2022-06-07T22.04.39.460Z/ and the error says 2022-06-23T16.20.52.945Z. Is this the intended behavior? Thanks
  • n

    noklam

    06/23/2022, 8:56 PM
    What did the file look like before your changes?
  • n

    noklam

    06/23/2022, 8:58 PM
    What is the command that you ran? Did you specify any version? If not, it should just grab the latest version you have in the S3 store
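To make the two timestamps in the question easier to compare, here is a small standalone sketch. It uses only the standard library, not Kedro, and assumes the version folder names follow the layout of the strings quoted above. The helper names `parse_version` and `latest_version` are hypothetical.

```python
from datetime import datetime

# Assumed layout of the version strings quoted in the question.
VERSION_FORMAT = "%Y-%m-%dT%H.%M.%S.%fZ"


def parse_version(version: str) -> datetime:
    """Parse a version folder name, ignoring a trailing slash from a listing."""
    return datetime.strptime(version.rstrip("/"), VERSION_FORMAT)


def latest_version(versions):
    """Return the most recent version string from an iterable of folder names."""
    return max((v.rstrip("/") for v in versions), key=parse_version)


latest_in_bucket = latest_version(["2022-06-07T22.04.39.460Z/"])
reported = "2022-06-23T16.20.52.945Z"

# True here means the version named in the error is newer than anything stored.
reported_is_newer = parse_version(reported) > parse_version(latest_in_bucket)
```

With the values from the question, the version named in the error is newer than the latest one in the bucket and falls on the day the question was posted. That would be consistent with a timestamp generated at run time, although this reading is an assumption and is not confirmed in the thread.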
  • l

    lancechua

    06/24/2022, 1:02 PM
    I'm trying to upgrade from v0.17.4 to v0.18.1, and when I try to do kedro run, it's still trying to look for kedro.versioning
  • a

    antony.milne

    06/24/2022, 4:20 PM
    What is the error exactly? If it's something about journals, I suspect it's because it's still mentioned in your logging.yml file. Easiest way to fix that is just to copy and paste this into your logging.yml file: https://github.com/kedro-org/kedro/blob/0.18.1/kedro/templates/project/%7B%7B%20cookiecutter.repo_name%20%7D%7D/conf/base/logging.yml
  • l

    lancechua

    06/25/2022, 2:47 AM
    Yes, it was indeed about some journal logger. Will try that. Thanks!
  • i

    inigohrey

    06/26/2022, 6:49 PM
    Hello. Where would I go looking to find where the params file is being read from disk? I'm having an issue with non-ASCII characters on Windows. I am saving my params.yml as UTF-8 encoding but python, taking locale.getpreferredencoding(), is attempting to read it using CP1252 which generates gibberish for characters like ñ, é etc. If I wanted to change the config of a YAMLDataSet I would change it like here: https://github.com/kedro-org/kedro/issues/772#issuecomment-847650332 but I don't know if there is a similar config for the parameters.yml file.
  • a

    antony.milne

    06/27/2022, 8:10 AM
    params file encoding
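The gibberish described in the encoding question above can be reproduced with plain Python, independent of Kedro. This sketch only illustrates the mismatch between a UTF-8 file and a CP1252 reader; it does not show how Kedro itself opens parameters.yml, and the sample text is made up.

```python
import os
import tempfile

text = "name: señor café\n"

# Write the file as UTF-8, as described in the question.
fd, path = tempfile.mkstemp(suffix=".yml")
with os.fdopen(fd, "w", encoding="utf-8") as handle:
    handle.write(text)

# Reading it back as CP1252 turns each non-ASCII character into two wrong ones.
with open(path, encoding="cp1252") as handle:
    garbled = handle.read()

# Reading it with an explicit UTF-8 encoding round-trips cleanly.
with open(path, encoding="utf-8") as handle:
    clean = handle.read()

os.remove(path)
```

Passing the encoding explicitly, rather than relying on locale.getpreferredencoding(), is what makes the second read independent of the Windows locale.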
  • w

    williamc

    06/27/2022, 5:01 PM
    Sorry for the late response. I just tried kedro run, no version specified.
  • s

    sjster

    06/27/2022, 5:18 PM
    Hello, running a Kedro pipeline results in my job getting killed. It looks like it is running out of memory as it tries to save the result of a node to a ParquetDataSet. My inputs are about 2 GB in size. Any solutions or suggestions?
  • d

    datajoely

    06/27/2022, 5:21 PM
    It would be great to get a stack trace to understand what's going on. Could you try wrapping your ParquetDataSet in a PartitionedDataSet? That would let you write smaller chunks, which should avoid the failure.
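A rough sketch of the chunking idea suggested above, assuming the node returns a dictionary that maps partition ids to chunks and that PartitionedDataSet writes one file per key. The helper `split_into_partitions` and the `part_0000` naming are hypothetical. Plain lists stand in for what would be DataFrame slices in a real pipeline.

```python
from typing import Any, Dict, List, Sequence


def split_into_partitions(rows: Sequence[Any], chunk_size: int) -> Dict[str, List[Any]]:
    """Split rows into consecutive chunks keyed by a partition id."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    partitions: Dict[str, List[Any]] = {}
    for start in range(0, len(rows), chunk_size):
        # Zero-padded ids keep the partitions in order when listed by name.
        partition_id = f"part_{start // chunk_size:04d}"
        partitions[partition_id] = list(rows[start:start + chunk_size])
    return partitions
```

Note that all chunks still exist in memory at once here, so this alone may not lower peak memory; whether it helps depends on where the memory is actually being used, which is why the stack trace requested above matters.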
  • s

    sjster

    06/27/2022, 5:33 PM
    Will try that