noklam [09/09/2022, 12:38 PM]
`kedro-airflow` helps you create an Airflow DAG. However, you don't usually want a 1-1 mapping between Kedro pipelines and orchestrator DAGs, since orchestrator DAG nodes are usually conceptually larger.
noklam [09/09/2022, 12:41 PM]
`kedro run --pipeline a`, or optionally the Python API (which `kedro-airflow` helps you to do).
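For illustration, a minimal sketch of that CLI route from an Airflow DAG; the DAG id, schedule, and project path are placeholders, and the `kedro-airflow` plugin generates this kind of wiring for you:
```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Placeholder DAG that runs one whole Kedro pipeline per Airflow task
# by shelling out to the Kedro CLI.
with DAG(
    dag_id="kedro_pipeline_a",  # placeholder name
    start_date=datetime(2022, 9, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    run_pipeline_a = BashOperator(
        task_id="run_pipeline_a",
        # Runs the registered Kedro pipeline "a" as a single task.
        bash_command="cd /path/to/kedro-project && kedro run --pipeline a",
    )
```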
Eliãn [09/09/2022, 12:56 PM]
```python
from airflow import DAG
from airflow.decorators import task

configs = {
    'products': {
        'schedule_interval': '@weekly'
    },
    'customers': {
        'schedule_interval': '@daily'
    },
}

def generate_dag(dag_id, start_date, schedule_interval, details):
    with DAG(dag_id, start_date=start_date, schedule_interval=schedule_interval) as dag:
        @task
        def run():
            ...  # task body elided

        run()
    return dag

for name, detail in configs.items():
    dag_id = f'dag_{name}'
    globals()[dag_id] = generate_dag(dag_id, ...)
```
@noklam something like this
noklam [09/09/2022, 2:16 PM]
`kedro run --params=<config>`, or just use the Python API with `KedroSession.create(extra_params=<params>)` and then do a `session.run(pipeline_name=<some_pipeline>)`.
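A minimal sketch of that Python API route, assuming Kedro 0.18.x; the pipeline name and parameter values here are placeholders:
```python
from pathlib import Path

from kedro.framework.session import KedroSession
from kedro.framework.startup import bootstrap_project

# Placeholder overrides; equivalent to `kedro run --params=...` on the CLI.
extra_params = {"run_date": "2022-09-09"}

metadata = bootstrap_project(Path.cwd())
with KedroSession.create(metadata.package_name, extra_params=extra_params) as session:
    # Run a specific registered pipeline instead of "__default__".
    session.run(pipeline_name="a")
```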
rohan_ahire [09/09/2022, 7:07 PM]
```python
from kedro.framework.session import KedroSession
from kedro.framework.startup import bootstrap_project
from pathlib import Path

metadata = bootstrap_project(Path.cwd())
with KedroSession.create(metadata.package_name) as session:
    session.run()
```
2. Does Kedro have pipeline templates? For example, a pipeline template for a regression or classification use case? Or do we just use `kedro pipeline create data_processing` to create a sample template and add processing code to it?
datajoely [09/09/2022, 7:13 PM]
`before_pipeline_run` hook!
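A minimal sketch of such a hook; the class name and print line are placeholders, while the `before_pipeline_run` signature follows Kedro's hook specs:
```python
from kedro.framework.hooks import hook_impl


class ProjectHooks:  # placeholder class name
    @hook_impl
    def before_pipeline_run(self, run_params, pipeline, catalog):
        # run_params carries the run config (pipeline name, env, extra_params, ...).
        print(f"About to run pipeline with: {run_params}")


# Registered in src/<package_name>/settings.py:
# HOOKS = (ProjectHooks(),)
```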
Goss [09/14/2022, 5:13 PM]
In the `build_kubeflow_pipeline.py` script, there are many references to AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY in the generated kfp YAML file. But I'm not using AWS at all. Why are these in there?
mrjpz99 [09/14/2022, 10:18 PM]
When I ran `kedro viz` to visualize the pipeline, I got the error shown in the screenshot. I configured catalog.yml with the filepath pointing to the dataset class script. Am I doing anything wrong? Or is it a bug in kedro-viz that it can't handle a custom dataset, since it's looking for a specific installed package?
mrjpz99 [09/14/2022, 10:28 PM]
```
DataSetError: An exception occurred when parsing config for DataSet 'name_match_model':
Class 'name_matching_v2.extras.datasets.transformer_dataset.SentenceTransformerModel' not found or one of its dependencies has not been installed.
```
mrjpz99 [09/14/2022, 10:33 PM]
I put the custom dataset script in the `{project_name}/extras` folder. Then I specify the `type` of the model artifact in the catalog.yml file as `{project_name}.extras.datasets.{custom_dataset.py}.{custom_dataset_class}`. Anything else I missed?
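For reference, a minimal sketch of what that custom dataset could look like; the load/save bodies are placeholders and the module path mirrors the dotted `type` string from the error above. The key point is that the module must be importable (e.g. the project package installed or on the path) in the environment where kedro-viz runs:
```python
# src/name_matching_v2/extras/datasets/transformer_dataset.py
# (module path mirrors the dotted `type` string in catalog.yml)
from typing import Any, Dict

from kedro.io import AbstractDataSet


class SentenceTransformerModel(AbstractDataSet):
    def __init__(self, filepath: str):
        self._filepath = filepath

    def _load(self) -> Any:
        ...  # placeholder: load the model from self._filepath

    def _save(self, data: Any) -> None:
        ...  # placeholder: save the model to self._filepath

    def _describe(self) -> Dict[str, Any]:
        return {"filepath": self._filepath}
```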