I am not sure how dbx works, but how would you exe...
# beginners-need-help
n
I am not sure how dbx works, but how would you execute code from there for a non-Kedro project? I imagine you must start from the same directory, so why do you need to specify the absolute path?
r
dbx deploys code to a Databricks workflow job. The workflow job has the script name hard-coded with a uniquely generated path, and the Python package dependencies are also installed on the cluster.
n
Do you have to configure the script every time?
r
Not really. Deployment of the code happens through GitHub Actions, so dbx helps package the code and dependencies, deploy them to Databricks, and create a workflow job.
n
but you still have to define an entrypoint I guess? like executing a particular file?
r
the python script is the file that creates the kedro session
databricks job calls the python script in the screenshot
n
Can you just use a relative path in this case?
r
like Path(script_name).parent
relative to the script being executed?
n
something like that - it is also a bit strange why this script lives in src/pipeline/data_preprocessing/? It should probably just live at the top level as a __main__.py or something similar
r
I can move it up, that is not a problem. It's just a way of arranging multiple scripts. There are many.
n
this isn't too important though, as long as the path is correct, but if it starts from the root level, KedroSession should pick up pyproject.toml automatically
You can leave it as is then, just need to figure out the relative path
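A rough sketch of what I mean, assuming the script lives somewhere below the folder that holds pyproject.toml. `find_project_root` is a made-up helper name for illustration, not something Kedro or dbx provides:

```python
from pathlib import Path


def find_project_root(start: Path, marker: str = "pyproject.toml") -> Path:
    """Walk upwards from `start` until a directory containing `marker` is found."""
    start = start.resolve()
    # If `start` is a file (e.g. the script itself), begin at its folder.
    current = start if start.is_dir() else start.parent
    for candidate in (current, *current.parents):
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(f"No {marker} found at or above {start}")
```

In the entry script you could then call `find_project_root(Path(__file__))` and pass the result on as the project path instead of hard-coding an absolute one. This assumes `__file__` is defined when the job runs the script.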
r
I will try relative path. I just wanted to make sure if there was a cleaner approach.
n
I think the proper way to deploy is to package your Kedro project. https://kedro.readthedocs.io/en/stable/tutorial/package_a_project.html Then you should just do something like
from my_project.__main__ import main
main(["--pipeline", "<pipeline_name>"])
r
To execute this way, we still need to run this code from the project root dir right? Where the toml file exists?