Kedro (https://kedro.org/)
Powered by Linen
advanced-need-help
  • d

    datajoely

    07/13/2022, 10:07 AM
    understood
  • m

    Matthias Roels

    07/13/2022, 10:08 AM
    As a rather hacky solution, I could already have the `env` set as an actual environment variable (`KEDRO_ENV`) and use that in my custom `register_pipelines` function to create a new instance of `config_loader` and fetch my pipelines from there. But it is not an ideal workaround...
  • d

    datajoely

    07/13/2022, 10:09 AM
    I was going to suggest that as an interim
  • d

    datajoely

    07/13/2022, 10:19 AM
    can I ask if the push to upgrade is driven by the cookiecutter vulnerability or 0.18.x features?
  • d

    datajoely

    07/13/2022, 10:19 AM
    because I'm 99% sure you can upgrade the version of cookiecutter without breaking 0.17.x
  • m

    Matthias Roels

    07/13/2022, 10:53 AM
    Well, the push to upgrade is mainly for two reasons:
    - overall we have the requirement to regularly upgrade for security compliance (open source packages can only run behind on 1 minor version)
    - we need click >= 8.0.0 as a requirement for other dependencies that we want to upgrade
  • a

    antony.milne

    07/13/2022, 10:55 AM
    @Matthias Roels very much sympathise with your pain here! This discussion is very relevant: https://github.com/kedro-org/kedro/discussions/1436. The way to do this now is through the `after_context_created` hook, which makes `context.config_loader` available. You can pass this as a global variable to your `register_pipelines` function and do `config_loader.get("pipeline")` (or whatever) there. This still feels very hacky to me and I would like to have a better way to do it, but unfortunately, since per-environment pipeline.yml files aren't really the "kedro way", it is awkward to support. IMO we should make it much easier for people to use a plugin that enables this functionality
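A minimal sketch of the hook-plus-global pattern described above. Kedro imports are omitted so the snippet stays self-contained; in a real project the hook method would carry the `@hook_impl` decorator and the hooks class would be registered in `settings.HOOKS`:

```python
# after_context_created stores the config loader in a module-level variable
# that register_pipelines reads later. Names here are illustrative.

_CONFIG_LOADER = None

class ProjectHooks:
    def after_context_created(self, context):  # @hook_impl in a real project
        # Kedro calls this once the context (and its config loader) exists.
        global _CONFIG_LOADER
        _CONFIG_LOADER = context.config_loader

def register_pipelines():
    # By the time Kedro calls register_pipelines, the hook has already run,
    # so the environment-aware config loader is available here.
    return _CONFIG_LOADER.get("pipeline*")
```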
  • a

    antony.milne

    07/13/2022, 11:01 AM
    An alternative would be to instead use this code in `register_pipelines` (which is how the config loader is instantiated internally):
    ```python
    from kedro.framework.project import settings

    # project_path, env and extra_params have to be supplied by the caller
    config_loader_class = settings.CONFIG_LOADER_CLASS
    config_loader = config_loader_class(
        conf_source=str(project_path / settings.CONF_SOURCE),
        env=env,
        runtime_params=extra_params,
        **settings.CONFIG_LOADER_ARGS,
    )
    ```
    To get this working fully you still need to get `env` and `extra_params` into `register_pipelines`. I think the cleanest way to do that is again through the `after_context_created` hook, but in theory you could just directly extract them from the CLI arguments if you always run kedro that way.
  • m

    Matthias Roels

    07/13/2022, 12:06 PM
    Thanks for the suggestion! How would you extract it from the CLI arguments? That looks like a nice (and clean) alternative
  • m

    Matthias Roels

    07/13/2022, 7:09 PM
    I did some digging myself and found this: https://github.com/kedro-org/kedro/blob/main/kedro/framework/cli/hooks/specs.py#L12 however I don't know how to use it. Is there any example available?
  • a

    antony.milne

    07/13/2022, 7:43 PM
    @Matthias Roels I wouldn't recommend `before_command_run` for this. Better to access it using `click`, as something like `click.get_current_context(silent=True).params["env"]`.
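Wrapped up as a helper, that suggestion could look something like this (a sketch with illustrative names; it assumes the running command defines an `--env` option, and falls back to a default when there is no active click context):

```python
import click

def env_from_cli(default: str = "local") -> str:
    # Reads the --env option of the currently running click command
    # (e.g. `kedro run --env staging`). With silent=True, click returns
    # None instead of raising when there is no active context (e.g. a
    # programmatic run), in which case we fall back to a default.
    ctx = click.get_current_context(silent=True)
    if ctx is None:
        return default
    return ctx.params.get("env") or default
```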
  • a

    antony.milne

    07/13/2022, 7:45 PM
    The catch with extracting arguments from the CLI like this is that it won't work for programmatic kedro runs using the Python API (i.e. calling `session.run` directly rather than doing `kedro run` from the CLI)
  • a

    antony.milne

    07/13/2022, 7:47 PM
    The bonus with this approach is you don't need any hooks at all; you can just put the `click` stuff directly in `register_pipelines`
  • m

    Matthias Roels

    07/14/2022, 7:39 AM
    Thanks for the suggestion! Still have to come up with a workaround to make the `reload_kedro` magic work in notebooks. But we can work around that ourselves. Thanks for all the help!
  • x

    xxavier

    07/14/2022, 8:34 AM
    Hi everyone, I am trying to achieve the following:
    - get a list of items from an API using APIDataSet
    - loop over the list of items and make one API call per item to get more information using the APIDataSet
    The first part is working fine (after a small modification to the APIDataSet to allow for the use of token authentication, as described in a previous message [1]). However, I am not sure what the best way to proceed is for the second part. I was considering checking the catalog.py option but wanted to know if there was a better way to loop over items and aggregate them within the catalog. Any help is appreciated. 🙂
    [1] https://discord.com/channels/778216384475693066/778998585454755870/973951577561890856
  • a

    arum

    07/14/2022, 9:19 AM
    Hi there, anyone here who has got a kedro project working in Airflow?
  • d

    datajoely

    07/14/2022, 9:32 AM
    pushing the limits of api dataset
  • d

    datajoely

    07/14/2022, 9:34 AM
    Have you checked out the kedro-airflow plugin?
  • a

    arum

    07/14/2022, 10:02 AM
    yes, here is my issue
  • a

    arum

    07/14/2022, 10:04 AM
    The `kedro package` command created dist/ with .egg and .whl files, and also another folder named after the project, epc_fi.egg-info/, which has a requires.txt file
  • a

    arum

    07/14/2022, 10:05 AM
    This requires.txt file has a custom package which should be downloaded from Nexus and is not available locally on my Mac. And when I run `pip install dist/epc_fi-0.1-py3-none-any.whl`
  • a

    arum

    07/14/2022, 10:06 AM
    it couldn't find my custom package. How can I get it referenced in the kedro package?
  • a

    arum

    07/14/2022, 10:06 AM
    ```
    ERROR: Could not find a version that satisfies the requirement custom-py-tools==2.1.349 (from epc-fi) (from versions: none)
    ERROR: No matching distribution found for custom-py-tools==2.1.349
    WARNING: There was an error checking the latest version of pip.
    ```
  • d

    datajoely

    07/14/2022, 10:07 AM
    I think what you're describing is not really a kedro issue. `kedro package` basically does `python -m build`; `custom-py-tools` isn't provided by Kedro, nor is the Nexus repository manager.
  • a

    arum

    07/14/2022, 10:25 AM
    right!! I seem to be missing the basics... I got those installed now
  • a

    arum

    07/14/2022, 10:26 AM
    still struggling to get started
  • a

    arum

    07/14/2022, 10:26 AM
    ```
    [2022-07-14 13:21:06,000] {logging_mixin.py:115} INFO - Model version 20220713-163948
    [2022-07-14 13:21:06,000] {taskinstance.py:1909} ERROR - Task failed with exception
    Traceback (most recent call last):
      File "/Users/sl/airflow/dags/epc_fi.py", line 36, in execute
        with KedroSession.create(self.package_name,
      File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/kedro/framework/session/session.py", line 172, in create
        session._setup_logging()
      File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/kedro/framework/session/session.py", line 188, in _setup_logging
        conf_logging = self._get_logging_config()
      File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/kedro/framework/session/session.py", line 176, in _get_logging_config
        conf_logging = self._get_config_loader().get(
      File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/kedro/config/templated_config.py", line 161, in get
        config_raw = _get_config_from_patterns(
      File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/kedro/config/common.py", line 69, in _get_config_from_patterns
        raise ValueError(
    ValueError: Given configuration path either does not exist or is not a valid directory: /conf/base
    [2022-07-14 13:21:06,004] {taskinstance.py:1415} INFO - Marking task as UP_FOR_RETRY. dag_id=epc-fi, task_id=preprocess, execution_date=20220714T102039, start_date=20220714T102055, end_date=20220714T102106
    [2022-07-14 13:21:06,007] {standard_task_runner.py:92} ERROR - Failed to execute job 121 for task preprocess (Given configuration path either does not exist or is not a valid directory: /conf/base; 4330)
    [2022-07-14 13:21:06,051] {local_task_job.py:156} INFO - Task exited with return code 1
    [2022-07-14 13:21:06,060] {local_task_job.py:273} INFO - 0 downstream tasks scheduled from follow-on schedule check
    ```
  • a

    arum

    07/14/2022, 10:27 AM
    I have got conf/base in the same place as dags/ in airflow; does that seem to be the wrong place?
  • d

    datajoely

    07/14/2022, 10:29 AM
    So we don't package the configuration; you need to place it there yourself: https://kedro.readthedocs.io/en/stable/deployment/airflow_astronomer.html#deployment-process
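For context on the error above: the traceback shows the configuration path resolving to /conf/base, i.e. relative to the filesystem root, which suggests `KedroSession.create` was never told where the project lives and fell back to the working directory. One sketch of a fix, assuming conf/ sits next to dags/ (the kedro-specific lines are shown as comments and are illustrative):

```python
from pathlib import Path

def project_path_for(dag_file: str) -> Path:
    # /Users/sl/airflow/dags/epc_fi.py -> /Users/sl/airflow
    # (assumes conf/base lives at /Users/sl/airflow/conf/base,
    # i.e. conf/ is a sibling of dags/)
    return Path(dag_file).resolve().parent.parent

# In the Airflow operator, pass the path explicitly, e.g.:
# from kedro.framework.session import KedroSession
# with KedroSession.create("epc_fi", project_path=project_path_for(__file__)) as session:
#     session.run()
```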
  • a

    arum

    07/14/2022, 12:54 PM
    I did place conf/ in the same directory as dags/ in airflow/ as shown below, yet I get the same error. Could someone point out where the configuration directory should be placed?