# beginners-need-help
u
Hello everyone, I'm in the early stage of my Kedro understanding and I am wondering what the best practice is for visualising data? Typically when using notebooks I'll use packages like matplotlib and seaborn but I'm not entirely sure how they fit into the Kedro workflow? Any advice would be appreciated! Thank you, Lawrence
a
This is a great question! There are a few parts to the answer:
* There's a set of `kedro jupyter` commands that launch a Jupyter notebook with useful Kedro functionality already enabled (like being able to load datasets): https://kedro.readthedocs.io/en/stable/tools_integration/ipython.html. Within a notebook you can do all your usual work with seaborn etc.
* There are dataset types that handle plots, e.g. `MatplotlibWriterDataSet` (which also handles seaborn and other libraries built on top of matplotlib) and `PlotlyDataSet`. So it's very common to generate a plot in a node and then save it as a file using these.
* These datasets can also be shown when you explore your pipeline in kedro-viz: https://kedro.readthedocs.io/en/stable/tutorial/visualise_pipeline.html#visualise-plotly-charts-in-kedro-viz. Check https://demo.kedro.org/ and click on "Price histogram" if you want an example.
* In the future you should be able to see them in experiment tracking too, so you can compare graphs from different pipeline runs: https://kedro.readthedocs.io/en/stable/tutorial/set_up_experiment_tracking.html

FYI @tynan
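To make the "generate a plot and save it with a dataset" part concrete, here's a rough sketch of what the node side could look like. The function name, the `prices` input and the file path are made up for illustration, and the catalog entry in the comments assumes the matplotlib writer is registered as `matplotlib.MatplotlibWriter`, which may differ depending on your Kedro version, so check the docs for the one you're on.

```python
from typing import Sequence

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import matplotlib.pyplot as plt
from matplotlib.figure import Figure


def plot_price_histogram(prices: Sequence[float], bins: int = 10) -> Figure:
    """Hypothetical node function: build a histogram and return the figure.

    The function only creates the figure; it does not save it. Writing it
    to disk is left to whatever dataset the node's output is mapped to.
    """
    fig, ax = plt.subplots()
    ax.hist(prices, bins=bins)
    ax.set_xlabel("price")
    ax.set_ylabel("count")
    ax.set_title("Price histogram")
    return fig


# In a Kedro project this would be wired up roughly like so (assumed names):
#
#   from kedro.pipeline import node
#   node(plot_price_histogram, inputs="prices", outputs="price_histogram")
#
# with a catalog entry along these lines (assumed type string and path):
#
#   price_histogram:
#     type: matplotlib.MatplotlibWriter
#     filepath: data/08_reporting/price_histogram.png
```

The nice thing about this split is that the plotting code stays a plain function you can also call from a notebook, while the saving is handled by the catalog.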
u
Thank you! This was very helpful! I had figured out the Jupyter notebooks part, but it seemed to go somewhat against the Kedro workflow of moving everything out of Jupyter into a more production-type system. I'd not discovered the graphical datasets though, so that is what I shall use going forward! Thanks again for your help! 😀