# advanced-need-help
1. Sorry for my mistake; my team confused this with the session.py that appears on the side. (Nobody on my team has worked with Kedro, so some of the questions may rest on wrong assumptions.)
2. Yes, I use the kedro-airflow plugin.
3. I am struggling with this part. I have tried most of what I could find online, but nothing gets me close to a Kedro DAG that actually runs (the tips are generic and directional and don't translate into an operational tutorial). I am starting to think we shouldn't run Kedro pipelines in Airflow.
4. Yes, this is related to the DAG auto-generated by kedro-airflow.

Here is a summary of what I am trying to do:
1- I want to use Kedro so that the data science team writes production-ready code from the get-go.
2- As an MLOps engineer, I want to automate the process so that a Kedro pipeline can be turned into a DAG without much friction and then orchestrated.
3- This Kedro DAG should be able to run stand-alone: conf files and data should be read from a bucket (or from a DWH or Redis) rather than from local storage. We can't have data and files cluttering the repo, hence the bucket storage.

What's your take on my approach?
Sorry for the late response! I think your approach sounds good, and it's very much in line with the Kedro philosophy of how deployed pipelines should be set up. If you use the kedro-airflow plugin, it will essentially help you create an Airflow DAG from a Kedro pipeline. Then you need to package the resulting project just like any Python package and run it as such (step 3 in the README: https://github.com/kedro-org/kedro-plugins/tree/main/kedro-airflow). It is true that we don't have a step-by-step tutorial for this, but it's still possible to make your Kedro pipelines run on Airflow.
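As a rough outline of that flow, here is a minimal sketch that only lists the commands involved (generate the DAG, package the project, install the wheel where Airflow runs). The function name `deployment_commands` and the wheel filename are hypothetical, used for illustration only; check the README linked above for the exact steps and options for your versions.

```python
import shlex


def deployment_commands(wheel_path: str) -> list[list[str]]:
    """Return the CLI commands for the flow described above, in order.

    `wheel_path` is a hypothetical path to the wheel built by `kedro package`.
    """
    return [
        # Generate an Airflow DAG file from the Kedro pipeline (kedro-airflow plugin).
        ["kedro", "airflow", "create"],
        # Build the Kedro project as a regular Python package.
        ["kedro", "package"],
        # Install the package in the environment Airflow executes tasks in.
        ["pip", "install", wheel_path],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in deployment_commands("dist/my_project-0.1-py3-none-any.whl"):
        print(shlex.join(cmd))
```

The generated DAG file also has to be copied into the folder your Airflow instance reads DAGs from; that location depends on your deployment, so it is left out of the sketch.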
What specific step in the process are you struggling with? Is it the lack of tutorial guidance or do you get errors?
To come back to your fourth question, "4- why there is kedro operator instead of python or base operator in the auto-generated DAG?": from what I can tell by looking at the code, this operator carries Kedro-specific information that is needed to run the pipeline, such as the package name and the pipeline name.
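To make that concrete, here is a simplified stand-in for what such an operator holds and does. It is not the code that kedro-airflow generates: the class name `KedroNodeTask` and its fields are assumptions for illustration, and it is a plain dataclass rather than an Airflow operator so the sketch can be read without Airflow installed. The Kedro imports happen inside `execute`, so Kedro is only needed when a task actually runs.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class KedroNodeTask:
    """Hypothetical stand-in for the operator in the auto-generated DAG."""

    package_name: str   # name of the packaged Kedro project
    pipeline_name: str  # which registered pipeline to run
    node_name: str      # the single Kedro node this task executes
    project_path: Path  # where the project's conf/ directory lives
    env: str            # Kedro configuration environment, e.g. "local"

    def execute(self) -> None:
        # Imported lazily so the class can be instantiated without Kedro installed.
        from kedro.framework.project import configure_project
        from kedro.framework.session import KedroSession

        # Point Kedro at the installed package's settings and pipelines.
        configure_project(self.package_name)
        # Run exactly one node of the chosen pipeline inside a Kedro session.
        with KedroSession.create(project_path=self.project_path, env=self.env) as session:
            session.run(self.pipeline_name, node_names=[self.node_name])
```

A plain Python operator could do the same work, but each task would then have to be handed the package name, pipeline name, and node name separately; bundling them into one Kedro-aware operator keeps the generated DAG short.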