07/28/2021, 5:38 PM
Sorry I'm late to the party. @User if you open after running
kedro viz
you will see our OpenAPI schema and the Redoc documentation. Please shout if you need help integrating Viz into your lineage export. Indeed we can and should turn Viz into a lineage tool: if you toggle off all of the nodes, what's left is a DAG of data flow. Kedro pipelines are driven by data topology, with nodes connected through their inputs and outputs, so it's a DAG of data, not a DAG of tasks as in other workflow engines. In a sense, that's already data lineage. If we get data schemas and column-level lineage, we can render that too. But extracting column-level lineage in a heterogeneous execution environment is difficult. We could do it for Spark, SQL, etc. independently, but not when they are all mixed together in one pipeline.
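To make the "DAG of data" point concrete, here is a rough sketch in plain Python. It does not use Kedro or the Viz API. The node tuples, dataset names, and the helpers `data_lineage` and `upstream` are all made up for illustration. The only assumption is that each node declares its inputs and outputs, as Kedro nodes do.

```python
from collections import defaultdict

# Hypothetical nodes as (name, inputs, outputs); all names are illustrative.
NODES = [
    ("clean", ["raw_orders"], ["clean_orders"]),
    ("join", ["clean_orders", "customers"], ["orders_with_customers"]),
    ("train", ["orders_with_customers"], ["model"]),
]


def data_lineage(nodes):
    """Map each dataset to the sorted datasets derived directly from it."""
    edges = defaultdict(set)
    for _name, inputs, outputs in nodes:
        for src in inputs:
            for dst in outputs:
                # Dropping the task node leaves a dataset-to-dataset edge.
                edges[src].add(dst)
    return {src: sorted(dsts) for src, dsts in edges.items()}


def upstream(dataset, nodes):
    """Return every dataset that `dataset` transitively depends on."""
    parents = defaultdict(set)
    for _name, inputs, outputs in nodes:
        for dst in outputs:
            parents[dst].update(inputs)
    seen, stack = set(), [dataset]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


lineage = data_lineage(NODES)
model_sources = upstream("model", NODES)
```

Here `lineage` is the dataset-level graph you get once the task nodes are hidden, and `model_sources` is the table-level lineage of `model`. Column-level lineage would need per-engine parsing on top of this, which is the hard part mentioned above.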