# beginners-need-help

noklam

10/05/2022, 4:24 PM
I am not sure how dbx works, but how would you execute code from there for a non-Kedro project? I imagine you must start from the same directory, so why do you need to specify the absolute path?

rohan_ahire

10/05/2022, 4:26 PM
dbx deploys code to a Databricks workflow job. The workflow job has the script name coded with a uniquely generated path, and the Python package dependencies are also installed on the cluster.

noklam

10/05/2022, 4:31 PM
Do you have to configure the script every time?

rohan_ahire

10/05/2022, 4:32 PM
Not really. Deployment of the code happens through GitHub Actions. So dbx helps package the code and dependencies, deploys them to Databricks, and creates a workflow job.

noklam

10/05/2022, 4:33 PM
But you still have to define an entrypoint, I guess? Like executing a particular file?

rohan_ahire

10/05/2022, 4:34 PM
The Python script is the file that creates the Kedro session.
The Databricks job calls the Python script in the screenshot.

noklam

10/05/2022, 4:34 PM
Can you just use a relative path in this case?

rohan_ahire

10/05/2022, 4:35 PM
like Path(script_name).parent
relative to the script being executed?
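
A minimal sketch of that idea, assuming the entry script sits a known number of folders below the project root. The helper name `project_root_from_script` and the `levels_up` parameter are made up for illustration; they are not part of Kedro or dbx.

```python
from pathlib import Path


def project_root_from_script(script_file: str, levels_up: int) -> Path:
    """Return the directory `levels_up` folders above the script's own folder.

    `levels_up` is a hypothetical parameter: 0 means the folder that contains
    the script, 1 means its parent, and so on.
    """
    if levels_up < 0:
        raise ValueError("levels_up must be zero or positive")
    # Resolve first so the result does not depend on the current working directory.
    script_dir = Path(script_file).resolve().parent
    if levels_up == 0:
        return script_dir
    # parents[0] is the direct parent, parents[1] the grandparent, etc.
    return script_dir.parents[levels_up - 1]
```

Inside the entry script this would be called as `project_root_from_script(__file__, levels_up=...)`, with the count chosen to match wherever the script actually lives in the deployed layout.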

noklam

10/05/2022, 4:37 PM
Something like that. It is also a bit strange that this script lives in src/pipeline/data_preprocessing/. It should probably just live at the top level as a __main__.py or something similar.

rohan_ahire

10/05/2022, 4:38 PM
I can move it up, that is not a problem. It's just a way of arranging multiple scripts. There are many.

noklam

10/05/2022, 4:38 PM
This isn't too important though, as long as the path is correct. But if it starts from the root level, KedroSession should pick up pyproject.toml automatically.
You can leave it as is then, you just need to figure out the relative path.
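
One way to avoid hard-coding either an absolute path or a folder count is to walk upwards from the script until pyproject.toml is found. This is only a sketch: `find_project_root` is a hypothetical helper, and the commented-out Kedro calls at the bottom assume the Kedro 0.18-style API (`bootstrap_project`, `KedroSession.create(project_path=...)`) and have not been checked against the dbx setup discussed here.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path, marker: str = "pyproject.toml") -> Optional[Path]:
    """Return the closest directory at or above `start` that contains `marker`.

    Returns None when no such directory exists.
    """
    start = start.resolve()
    # Check the starting directory itself, then each ancestor up to the filesystem root.
    for candidate in (start, *start.parents):
        if (candidate / marker).is_file():
            return candidate
    return None


# Possible use in the entry script (assumes Kedro is installed on the cluster):
#
#   from kedro.framework.session import KedroSession
#   from kedro.framework.startup import bootstrap_project
#
#   root = find_project_root(Path(__file__).parent)
#   bootstrap_project(root)
#   with KedroSession.create(project_path=root) as session:
#       session.run()
```

The search only works if pyproject.toml is actually shipped alongside the script in the deployed location, which is worth confirming for the uniquely generated path.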

rohan_ahire

10/05/2022, 4:38 PM
I will try a relative path. I just wanted to check whether there was a cleaner approach.

noklam

10/05/2022, 4:40 PM
I think the proper way to deploy is to package your Kedro project: https://kedro.readthedocs.io/en/stable/tutorial/package_a_project.html Then you should just do something like
from my_project.__main__ import main
main(["--pipeline", "<pipeline_name>"])
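
A hedged sketch of what such an entry script could look like, assuming the packaged project exposes a `main` callable in its `__main__.py` (as the Kedro project template does) that accepts a list of CLI-style arguments. The wrapper name `run_packaged_project` is hypothetical. Swallowing a clean `SystemExit` is also an assumption: click-based commands usually call `sys.exit()` when they finish, which can otherwise look like a failure to the calling job.

```python
import importlib
from typing import Optional, Sequence


def run_packaged_project(package_name: str, cli_args: Optional[Sequence[str]] = None) -> None:
    """Import `<package_name>.__main__` and call its `main` with CLI-style arguments."""
    entry = importlib.import_module(f"{package_name}.__main__")
    try:
        entry.main(list(cli_args or []))
    except SystemExit as exc:
        # A zero or missing exit code means the run finished normally;
        # anything else is re-raised so the job is marked as failed.
        if exc.code not in (0, None):
            raise
```

It would be called as `run_packaged_project("my_project", ["--pipeline", "<pipeline_name>"])`, or with no arguments to run the default pipeline.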

rohan_ahire

10/05/2022, 4:43 PM
To execute it this way, we still need to run this code from the project root dir, right? Where the toml file exists?