beginners-need-help
  • d

    datajoely

    03/28/2022, 7:22 PM
    If you wanted to share the modern way of doing this back to the community it would be much appreciated! Perhaps a gist or something?
  • d

    Dhaval

    03/28/2022, 7:43 PM
@User I have just used the code from the Kedro lifecycle management section of the docs. It can be found here: https://kedro.readthedocs.io/en/stable/04_kedro_project_setup/03_session.html?highlight=load_context#create-a-session
from kedro.framework.session import KedroSession
from kedro.framework.startup import bootstrap_project
from pathlib import Path

# Read the project metadata from the current working directory
metadata = bootstrap_project(Path.cwd())
# Open a session for the project and load its context
with KedroSession.create(metadata.package_name) as session:
    context = session.load_context()
  • d

    Dhaval

    03/28/2022, 7:44 PM
After this I am able to access all elements, so now I can work with the context.
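A minimal sketch of what that enables, assuming a hypothetical catalog entry named "my_dataset":

# Continuing from the snippet above: the context exposes the catalog and parameters
catalog = context.catalog        # the project's DataCatalog
params = context.params         # parameters merged from conf/
df = catalog.load("my_dataset")  # "my_dataset" is a hypothetical catalog entry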
  • b

    Burn1n9m4n

    03/28/2022, 10:33 PM
Is there a way to output an Excel file from a Kedro pipeline that uses the autofilter function from xlsxwriter? I want to be able to provide a complete data set that comes prefiltered when it is opened by the user.
  • b

    Burn1n9m4n

    03/28/2022, 10:33 PM
    It needn’t be done entirely within the catalog either.
  • a

    avan-sh

    03/29/2022, 3:38 AM
    xlsxwriter-autofilter
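One possible approach, sketched below, is to write the workbook from inside a node with pandas' ExcelWriter on the xlsxwriter engine and call autofilter on the worksheet; the output path and sheet name are assumptions, and the same call could equally live in a custom dataset's _save if you'd rather keep it in the catalog:

import pandas as pd


def save_prefiltered_excel(df: pd.DataFrame, path: str = "data/08_reporting/report.xlsx") -> None:
    # The xlsxwriter engine exposes the underlying worksheet object
    with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
        df.to_excel(writer, sheet_name="data", index=False)
        worksheet = writer.sheets["data"]
        # Autofilter over the header row plus every data row and column
        worksheet.autofilter(0, 0, len(df), len(df.columns) - 1)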
  • w

    Walber Moreira

    03/29/2022, 6:13 PM
Is there an optimal way to visualize namespaced pipelines in kedro-viz? If I have two pipelines, "train" and "predict", I can't visualize them together. This also holds for checking the node execution order/inputs/outputs through pipelines['pipename'].describe() in kedro ipython
  • b

    Bruno

    03/29/2022, 7:57 PM
Hello, does anyone know how to pass a list as a node input? Here's an example:
node(
    func=dataframe_melting,
    inputs=["mapped_df", ["altitude"], "disease"],
    outputs="melted_fcl_altitude_df",
    name="fcl_altitude_dataframe_melting_node",
),
  • n

    noklam

    03/29/2022, 7:59 PM
    What's the function signature?
  • b

    Bruno

    03/29/2022, 7:59 PM
    def dataframe_melting(df, id_vars, var_name) -> pd.DataFrame:
  • b

    Bruno

    03/29/2022, 8:00 PM
    df is a DataFrame, id_vars is a list and var_name a str
  • n

    noklam

    03/29/2022, 8:04 PM
Then you can use it like a normal variable; the string literal is just an alias for the variable: ["mapped_df", "altitude", "disease"]
  • b

    Bruno

    03/29/2022, 8:04 PM
and what if id_vars is a dictionary?
  • n

    noklam

    03/29/2022, 8:15 PM
The node is not aware of the type; it just treats it as a variable. You can also pass in named arguments, and you can use a dictionary of string literals as node inputs/outputs. For example: https://github.com/quantumblacklabs/kedro-starters/blob/main/pandas-iris/{{ cookiecutter.repo_name }}/src/{{ cookiecutter.python_package }}/pipelines/data_engineering/pipeline.py
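To illustrate both forms (the body of dataframe_melting is a guess, and the params: entries assume the list and string live in conf/base/parameters.yml):

import pandas as pd
from kedro.pipeline import node


def dataframe_melting(df: pd.DataFrame, id_vars, var_name) -> pd.DataFrame:
    return df.melt(id_vars=id_vars, var_name=var_name)


# Positional form: each string names a dataset or parameter
melt_positional = node(
    func=dataframe_melting,
    inputs=["mapped_df", "params:id_vars", "params:var_name"],
    outputs="melted_df",
)

# Named form: map the function's argument names to dataset/parameter names
melt_named = node(
    func=dataframe_melting,
    inputs={"df": "mapped_df", "id_vars": "params:id_vars", "var_name": "params:var_name"},
    outputs="melted_named_df",
)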
  • d

    datajoely

    03/29/2022, 9:01 PM
    visualising namespaces
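As a rough sketch of one workaround (hypothetical package and pipeline names, Kedro 0.18-style pipeline_registry.py), registering a combined pipeline lets kedro viz show both namespaces in one graph:

# src/my_project/pipeline_registry.py -- "my_project", "train" and "predict" are hypothetical
from typing import Dict

from kedro.pipeline import Pipeline

from my_project.pipelines import predict, train


def register_pipelines() -> Dict[str, Pipeline]:
    train_pipeline = train.create_pipeline()
    predict_pipeline = predict.create_pipeline()
    return {
        "train": train_pipeline,
        "predict": predict_pipeline,
        # Select this entry in the kedro viz dropdown (or kedro viz --pipeline train_and_predict)
        # to see both namespaced pipelines together.
        "train_and_predict": train_pipeline + predict_pipeline,
        "__default__": train_pipeline + predict_pipeline,
    }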
  • p

    pypeaday

    03/30/2022, 2:09 PM
@User is there a way to set any kind of lifecycle on versioned datasets? I'm not seeing anything in the docs about that... Or is Kedro's position that we should use an underlying filesystem's capabilities here (like ZFS snapshots or S3 lifecycle policies)? @User fyi
  • d

    datajoely

    03/30/2022, 2:30 PM
What do you mean by lifecycle? Some sort of expiry?
  • p

    pypeaday

    03/30/2022, 3:58 PM
    ya ya exactly - like if I want to keep the last 5 versions, and then versions at the beginning of the month for the past 12 months, and then annual versions for the last X years or whatever...
  • m

    Matheus Serpa

    03/30/2022, 5:19 PM
    Hello there! Is there a way to get the input name inside a node function? For example,
    node(func=melt_data, inputs="fcl_elevation")
    ...
    
    def melt_data(df):
        # how to get "fcl_elevation" inside func?
  • w

    WolVez

    04/01/2022, 9:16 PM
Is there a way to flag a dataset to not run asynchronously and to wait until other nodes are complete, if async is enabled?
  • d

    datajoely

    04/01/2022, 9:24 PM
    The best way to do that is to break the pipeline into pieces and execute from the CLI like
    kedro run --pipeline a && kedro run --pipeline b --async && kedro run --pipeline c
  • d

    datajoely

    04/01/2022, 9:25 PM
&& will wait for the previous statement to complete; a single & will run both simultaneously
  • m

    mulajumento

    04/05/2022, 1:15 AM
Hello guys! I would like to know if it is possible to "extract" the file path of a partitioned dataset catalog entry used as an input in a node. I tried to look on the internet for alternatives but I couldn't find a solution for it.
  • m

    munchmuch

    04/05/2022, 4:12 AM
Hi all, I'm trying to get the TemplatedConfigLoader to work. I'm getting this error:
It must be a subclass of kedro.config.config.ConfigLoader
It appears TemplatedConfigLoader inherits from AbstractConfigLoader in 0.18.0. Any idea how to fix this? I tried changing it to inherit from ConfigLoader itself, which passes the assert but doesn't use my globals.yml. Thank you
  • d

    datajoely

    04/05/2022, 4:32 AM
    using individual partitions
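For reference, a PartitionedDataSet input arrives in the node as a dictionary keyed by partition ID, which is the file path relative to the dataset's path (the exact form depends on settings such as filename_suffix), so the path can be read from the keys. A rough sketch, assuming each partition loads as a DataFrame:

from typing import Any, Callable, Dict

import pandas as pd


def process_partitions(partitions: Dict[str, Callable[[], Any]]) -> pd.DataFrame:
    frames = []
    for partition_id, load_func in partitions.items():
        # partition_id is the partition's path relative to the dataset's base path
        df = load_func()  # each value is a lazy-loading callable
        df["source_path"] = partition_id
        frames.append(df)
    return pd.concat(frames, ignore_index=True)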
  • d

    datajoely

    04/05/2022, 4:36 AM
    ConfigLoader issue with 0.18.x
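In 0.18.x the config loader is normally swapped in settings.py rather than subclassed or registered through a hook; a minimal sketch, assuming a standard project layout and a globals.yml in conf/base:

# src/<package_name>/settings.py
from kedro.config import TemplatedConfigLoader

CONFIG_LOADER_CLASS = TemplatedConfigLoader
# Pick up globals.yml files from the conf environments for template values
CONFIG_LOADER_ARGS = {"globals_pattern": "*globals.yml"}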
  • g

    gui42

    04/05/2022, 6:31 PM
Hey guys! Quick question: how could I handle, in the catalog, a directory that can contain an unknown number of conveniently named files?
  • d

    datajoely

    04/05/2022, 6:31 PM
    PartitionedDataset!
  • g

    gui42

    04/05/2022, 6:32 PM
This seems nice. My use case is more ML-driven. Think of it as train/test sets, but generated by another team/application.
  • d

    datajoely

    04/05/2022, 6:33 PM
    So I think it should work - but we do have an assumption that things are reproducible so be careful!
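Following up on the PartitionedDataSet suggestion, a minimal sketch of declaring one programmatically; the folder path, entry name and file type are hypothetical, and the equivalent catalog.yml entry would use type: PartitionedDataSet:

from kedro.extras.datasets.pandas import CSVDataSet
from kedro.io import DataCatalog, PartitionedDataSet

# One catalog entry that covers every CSV under the folder, however many files it holds
incoming_files = PartitionedDataSet(
    path="data/01_raw/incoming",  # hypothetical folder
    dataset=CSVDataSet,
)
catalog = DataCatalog({"incoming_files": incoming_files})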