#beginners-need-help
datajoely

11/05/2021, 10:22 AM
Hi @User - I can write up my thoughts now. tldr - out of the box we don't do all we could in this area, and I personally want to limit the YAML our users have to write.
- Looking at the industry, the best things I've seen are what Great Expectations (https://docs.greatexpectations.io/docs/tutorials/getting_started/check_out_data_docs/) and dbt (https://docs.getdbt.com/docs/building-a-dbt-project/documentation) are able to do.
- Practically, it's not a lot of work to take a Kedro DataCatalog object via the Python API, feed it into GE and generate this sort of documentation. It's not something we offer as a first-party integration (yet), but we would like to some day.
- Something we've been keen to do for a long time is extend `kedro-viz` to have some sort of 'catalog manager', which is where this would make sense to live. It's not under active development, but if users start shouting that they'd like it, it gets more weight on the backlog 🙂
- Finally, the most structured way of doing this today is to use Sphinx and the built-in `kedro build-docs` command to generate static docs. This is mostly there for Python API docs, but everything on the Kedro docs (https://kedro.readthedocs.io/en/stable/) is made this way, so you can steal how we do it too. I think it would be pretty neat to write a script that used the DataCatalog Python API to create data documentation stubs, which you then fill in with human-readable descriptions.
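A rough sketch of what such a stub-generating script could look like. To keep it runnable without Kedro installed, it works on the plain dict form of the catalog (the same shape as `catalog.yml`, which `DataCatalog.from_config` accepts) rather than on a live `DataCatalog` object. The function names, the example entries and the choice of reStructuredText field lists are all assumptions made for illustration, not anything Kedro ships.

```python
from typing import Any, Dict


def render_stub(name: str, config: Dict[str, Any]) -> str:
    """Render one reStructuredText stub for a single catalog entry."""
    lines = [
        name,
        "-" * len(name),  # reST section underline must match the title length
        "",
        f":Type: ``{config.get('type', 'unknown')}``",
    ]
    # Not every dataset has a filepath (e.g. in-memory datasets), so it is optional.
    if "filepath" in config:
        lines.append(f":Filepath: ``{config['filepath']}``")
    # Placeholder for the human-written description.
    lines += ["", "TODO: describe this dataset.", ""]
    return "\n".join(lines)


def render_catalog_docs(catalog_config: Dict[str, Dict[str, Any]]) -> str:
    """Render a page of stubs, one section per dataset, sorted by name."""
    title = "Data catalog"
    header = [title, "=" * len(title), ""]
    stubs = [render_stub(name, cfg) for name, cfg in sorted(catalog_config.items())]
    return "\n".join(header + stubs)


# Hypothetical entries in the shape of catalog.yml; replace with your own config.
EXAMPLE_CATALOG = {
    "model_input": {"type": "MemoryDataSet"},
    "companies": {
        "type": "pandas.CSVDataSet",
        "filepath": "data/01_raw/companies.csv",
    },
}

if __name__ == "__main__":
    print(render_catalog_docs(EXAMPLE_CATALOG))
```

The output could be written to a `.rst` file under the Sphinx source directory so it is picked up by the docs build; where exactly that file belongs depends on how your project's docs are laid out.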