# advanced-need-help
Hello all. Newbie to kedro here. I'm currently testing kedro for my organisation, which uses Qubole for most of our ML work. In the Qubole environment we have access to a plain Jupyter notebook where we can read files from S3 and run our models on the available clusters. I've uploaded my kedro project as a folder in S3 and tried to run the project from my notebook, but can't get the code to work. I believe it has to do with how I reference the path where the kedro project is located in S3. Attached is the code I ran in Jupyter. Do you have any recommendations on how I can get this working? Here's the error I'm getting: * Could not find the project configuration file 'pyproject.toml' in //s3:/*
Hi Aanan, I'm afraid I'm not familiar with Qubole, but I'm wondering if you could use your kedro project as a package, i.e. package up your project as a Python package, pip install it and run it with `python -m my_package`. And you were right, the error message suggests it's not finding the file that distinguishes a regular folder from a kedro project, namely `pyproject.toml`.
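To make that error concrete, here is a minimal sketch of the kind of check involved. It assumes the project has to sit on the local filesystem of the machine running the notebook and that a kedro project's `pyproject.toml` contains a `[tool.kedro]` table. The helper name `looks_like_kedro_project` is made up for illustration and is not part of kedro.

```python
from pathlib import Path


def looks_like_kedro_project(project_path: str) -> bool:
    """Return True if project_path is a local folder holding a kedro-style pyproject.toml.

    Hypothetical helper for illustration only, not a kedro API.
    """
    config_file = Path(project_path).expanduser() / "pyproject.toml"
    # An "s3://..." string is treated as an ordinary (non-existent) local path here,
    # so the file lookup simply fails.
    if not config_file.is_file():
        return False
    # Assumption: kedro projects declare a [tool.kedro] table in pyproject.toml.
    return "[tool.kedro]" in config_file.read_text(encoding="utf-8")
```

Calling it with an S3 URL (any bucket name) should return `False`, which would be consistent with the message above: the path has to point at a local copy of the project folder.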
Another thing to try would be to see if you can use line magics to sneakily run the kedro project that way, e.g. using the system line magic, which runs a command (so you do `!kedro run`, which works in my test project). Or if that fails maybe you could use `%run` and point to the kedro project file. https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-run
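If the magics are blocked, roughly the same thing can be done from plain Python with the standard library. This is only a sketch: it assumes the `kedro` executable is installed in the notebook's environment and that the project folder exists locally. `run_in_project` is a hypothetical helper name.

```python
import subprocess


def run_in_project(command, project_dir):
    """Run a command with project_dir as the working directory.

    Returns (exit_code, combined_output). Hypothetical helper for illustration.
    """
    result = subprocess.run(
        command,
        cwd=project_dir,           # should be the folder that contains pyproject.toml
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout so errors show up in the notebook
        text=True,
        check=False,               # report failures via the exit code instead of raising
    )
    return result.returncode, result.stdout
```

In a notebook cell you would call something like `run_in_project(["kedro", "run"], "/path/to/local/project")` and print the returned output. The path here is a placeholder.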
The conventional way of running a kedro project is through the command line with `kedro run`, but I guess you know that and the option isn't available to you? I don't know how locked down the Qubole environment is and whether you'll be able to effectively simulate running a command line through the Jupyter notebook environment. In some Jupyter instances you even have the option to launch a terminal window in the environment itself, which means you could just run a project from the CLI as normal.