Mad Hatter
07/15/2021, 12:55 PM
noklam
07/16/2021, 5:08 AM
noklam
07/16/2021, 5:08 AM
sigma
07/16/2021, 2:00 PM
sigma
07/16/2021, 2:01 PM
waylonwalker
07/23/2021, 12:51 PM
kedro-diff? Or is it more ethical to leave that one on?
datajoely
07/23/2021, 12:54 PM
waylonwalker
07/23/2021, 12:56 PM
Lorena
07/23/2021, 3:41 PM
waylonwalker
07/23/2021, 6:08 PM
datajoely
07/26/2021, 12:31 PM
waylonwalker
07/26/2021, 12:44 PM
waylonwalker
07/26/2021, 12:45 PM
datajoely
07/26/2021, 12:45 PM
user
07/28/2021, 5:38 PM
kedro viz
you will see our OpenAPI schema and the Redoc documentation. Please shout if you need help integrating Viz into your lineage export. Indeed we can and should turn Viz into a lineage tool, because if you toggle off all of the nodes, you get a DAG of data flow. In fact, Kedro pipelines are controlled by data topology, connecting nodes through their inputs and outputs, so it's a DAG of data, not a DAG of tasks like in other workflow engines. In a sense, that's already data lineage. If we get data schema and column-level lineage, we can also render that. But extracting column-level lineage in a heterogeneous execution environment is difficult. We could do that for Spark, SQL, etc. independently, but not when they are all mixed together in a pipeline.
datajoely
07/28/2021, 6:42 PM
sigma
07/28/2021, 6:45 PM
datajoely
07/28/2021, 6:47 PM
datajoely
07/28/2021, 6:47 PM
waylonwalker
08/02/2021, 1:43 PM
waylonwalker
08/02/2021, 1:44 PM
datajoely
08/02/2021, 1:45 PM
waylonwalker
08/02/2021, 1:46 PM
waylonwalker
08/02/2021, 1:47 PM
waylonwalker
08/02/2021, 2:11 PM
waylonwalker
08/11/2021, 2:04 PM
kedro-diff is, if you have large files in your project that are not required for making the pipeline objects, it can fill up your tmp directory.
I think the solution is to implement a .kedroignore that will ignore certain files and directories specified by the user. If one does not exist, I will ignore the following items by default.
python
default_ignore_items = [".envrc", ".venv", ".kedro-diff", "data"]
What else belongs in the default ignore? Are there large files stored in common directories that we don't need while running diff?
Arnaldo
08/11/2021, 2:11 PM
docs and logs folders as well, @User
waylonwalker
08/11/2021, 2:11 PM
user
08/16/2021, 2:01 AM
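For anyone following the .kedroignore idea from the 08/11 messages, here is a rough sketch of how it could work. It is not taken from the kedro-diff code base: `load_ignore_items` and `copy_project` are hypothetical helper names, and the one-pattern-per-line file format with `#` comments is an assumption. It only shows reading the file, falling back to the default list quoted above, and skipping matching names while copying a project with the standard library.

```python
import shutil
import tempfile
from pathlib import Path

# Default list quoted from the message above; used only when no .kedroignore exists.
DEFAULT_IGNORE_ITEMS = [".envrc", ".venv", ".kedro-diff", "data"]


def load_ignore_items(project_dir):
    """Return patterns from .kedroignore, or the defaults if the file is absent.

    Assumed format (hypothetical): one glob-style name per line,
    blank lines and lines starting with '#' are skipped.
    """
    ignore_file = Path(project_dir) / ".kedroignore"
    if not ignore_file.exists():
        return list(DEFAULT_IGNORE_ITEMS)
    stripped = (line.strip() for line in ignore_file.read_text().splitlines())
    return [line for line in stripped if line and not line.startswith("#")]


def copy_project(project_dir, dest_dir):
    """Copy the project to dest_dir, skipping names that match an ignore pattern.

    dest_dir must not exist yet; shutil.copytree creates it.
    """
    patterns = load_ignore_items(project_dir)
    return shutil.copytree(
        project_dir, dest_dir, ignore=shutil.ignore_patterns(*patterns)
    )
```

Usage would be something like `copy_project(".", Path(tempfile.mkdtemp()) / "project")`. Note that `shutil.ignore_patterns` matches on file and directory names at every level, not on full paths, so a pattern like `data` also skips nested folders called `data`.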