Galileo-Galilei
02/14/2022, 9:58 PM

```python
# First test with plain mlflow
# temp.py
import os

import mlflow

os.environ["MLFLOW_S3_ENDPOINT_URL"] = "http://192.168.0.150:9000"
os.environ["AWS_ACCESS_KEY_ID"] = "minioadmin"
os.environ["AWS_SECRET_ACCESS_KEY"] = "minioadmin"

mlflow.set_tracking_uri("postgresql://postgres:postgres@localhost:5432/mlflow_db")

with mlflow.start_run():
    mlflow.log_param("a", 1)
```
Then open the UI and check whether the results are stored where you want. If they are not, your mlflow configuration is incorrect: check your server settings / port / password.
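For instance, you can point the UI at the same backend store as in ``temp.py`` (a sketch only; adjust the host, port and credentials to your setup):

```bash
export MLFLOW_S3_ENDPOINT_URL=http://192.168.0.150:9000
export AWS_ACCESS_KEY_ID=minioadmin
export AWS_SECRET_ACCESS_KEY=minioadmin
mlflow ui --backend-store-uri postgresql://postgres:postgres@localhost:5432/mlflow_db
```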
If it works as expected, restart your kernel and try the following script:
```python
# Second test: kedro-mlflow for the configuration setup, plain mlflow for logging
import mlflow

from kedro.framework.session import KedroSession
from kedro_mlflow.config import get_mlflow_config

with KedroSession.create() as session:
    mlflow_config = get_mlflow_config()
    mlflow_config.setup()
    with mlflow.start_run():
        mlflow.log_param("a", 1)
```
This sets the environment variables and the tracking URI through ``kedro-mlflow``, then logs with plain mlflow. Does it log where you want?
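(For reference, a minimal sketch of the ``kedro-mlflow`` configuration this relies on; the exact key layout varies a bit across versions, and ``my_mlflow_credentials`` is just an illustrative name.)

```yaml
# conf/local/mlflow.yml (sketch)
server:
  mlflow_tracking_uri: postgresql://postgres:postgres@localhost:5432/mlflow_db
  credentials: my_mlflow_credentials

# conf/local/credentials.yml (sketch): kedro-mlflow exports each key/value pair
# of the referenced entry as an environment variable before the run
my_mlflow_credentials:
  MLFLOW_S3_ENDPOINT_URL: http://192.168.0.150:9000
  AWS_ACCESS_KEY_ID: minioadmin
  AWS_SECRET_ACCESS_KEY: minioadmin
```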
Galileo-Galilei
02/14/2022, 10:06 PM

```python
# in node.py
def compute_metrics(y_true, y_preds):
    # <compute>
    return metric1, metric2, [metric3_step0, metric3_step1, metric3_step2, metric3_step3]

# in pipeline_registry.py
Pipeline([
    ...,
    node(compute_metrics, ["y_true", "y_preds"], ["my_metric1", "my_metric2", "my_metric3"]),
    ...,
])
```

```yaml
# in catalog.yml
my_metric1:
  type: kedro_mlflow.io.metrics.MlflowMetricDataSet
my_metric2:
  type: kedro_mlflow.io.metrics.MlflowMetricDataSet
my_metric3:
  type: kedro_mlflow.io.metrics.MlflowMetricHistoryDataSet
```
This is easier than returning the complex format above.
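(Roughly speaking, and only as a sketch, these datasets boil down to plain mlflow calls like the ones below: one value per metric, plus one value per step for the history dataset. The values here are illustrative.)

```python
import mlflow

metric1, metric2 = 0.92, 0.87            # illustrative values
metric3_history = [0.1, 0.2, 0.3, 0.4]   # one value per step

with mlflow.start_run():
    mlflow.log_metric("my_metric1", metric1)
    mlflow.log_metric("my_metric2", metric2)
    for step, value in enumerate(metric3_history):
        mlflow.log_metric("my_metric3", value, step=step)
```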
3. Unfortunately this is not linked to ``kedro-mlflow``: I don't modify the UI. I know that mlflow often changes its UI and that it varies slightly across versions. Feel free to [open an issue on my repo](https://github.com/Galileo-Galilei/kedro-mlflow/issues); I'll add it to my backlog to log it as a tag in the future so we have consistent access across mlflow versions.
Dhaval
02/15/2022, 7:58 AM
DarthGreedius
02/16/2022, 10:03 PM
DarthGreedius
02/16/2022, 10:04 PM
DarthGreedius
02/16/2022, 10:04 PM
IS_0102
02/17/2022, 4:56 PM
Alexandros Tsakpinis
03/30/2022, 12:17 PM
datajoely
03/30/2022, 12:21 PM
datajoely
03/30/2022, 12:22 PM
shaunc
03/30/2022, 1:11 PM
noklam
04/21/2022, 11:08 AM
Downforu
04/22/2022, 12:50 PM
1/ docker-compose up postgres
2/ In another terminal: docker-compose up init_db
3/ In the previous terminal: docker-compose up scheduler webserver
Thanks a lot for your help!
noklam
04/22/2022, 2:50 PM
Can you try your ``logging.yml`` with this setting?
"disable_existing_loggers": True
This is an issue that we are keen to fix, so please share if you have any findings! Thank you for the very detailed report!
Downforu
04/22/2022, 3:24 PM
noklam
04/22/2022, 3:43 PM
Downforu
04/27/2022, 12:03 PM

```python
from typing import Any, Dict

import yaml
from kedro.config import ConfigLoader
from kedro.framework.hooks import hook_impl


class UpdateParametersFile:
    @hook_impl
    def before_pipeline_run(self, run_params: Dict[str, Any]) -> None:
        conf_paths = ["conf/base", "conf/local"]
        conf_loader = ConfigLoader(conf_paths)
        config_params = conf_loader.get("parameters*", "parameters*/**")
        with open("conf/base/parameters.yml", "w") as file:
            pass
        config_params["param1"] = dict(key1=10, key2=24)
        # Initialize key-value pairs
        config_params["param3"] = dict(key1="sum")
        with open("conf/base/parameters.yml", "w") as file:
            yaml.dump(config_params, file)
```
I also made the config_params available to the nodes that require "parameters" by adding the following to the ProjectHooks class:
```python
from kedro.config import ConfigLoader
from kedro.framework.hooks import hook_impl
from kedro.pipeline.node import Node


class ProjectHooks:
    @hook_impl
    def before_node_run(self, node: Node, inputs):
        conf_paths = ["conf/base", "conf/local"]
        conf_loader = ConfigLoader(conf_paths)
        config_params = conf_loader.get("parameters*", "parameters*/**")
        if node.name == "node_name":
            return {"parameters": config_params}
```
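(For completeness, a sketch of how these hooks get registered, assuming a standard Kedro ``settings.py``; ``my_project`` is a placeholder package name.)

```python
# src/my_project/settings.py (sketch)
from my_project.hooks import ProjectHooks, UpdateParametersFile

HOOKS = (ProjectHooks(), UpdateParametersFile())
```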
Thank you in advance.
noklam
04/27/2022, 12:50 PM
Is the ``pass`` intended in the ``UpdateParametersFile``?
04/27/2022, 1:26 PM
datajoely
04/27/2022, 1:28 PM
datajoely
04/27/2022, 1:29 PM
You don't need to use ``ConfigLoader`` like this: the parameters are already accessible in the catalog object.
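A minimal sketch (assuming a parameter named ``param1`` in ``parameters.yml``): a node can take all parameters, or a single one, directly as inputs.

```python
from kedro.pipeline import Pipeline, node


def report(params, param1):
    # "parameters" provides the full dict, "params:param1" just that key
    print(params["param1"], param1)


pipeline = Pipeline([
    node(report, inputs=["parameters", "params:param1"], outputs=None),
])
```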
Downforu
04/27/2022, 1:36 PM
Downforu
04/27/2022, 1:47 PM
datajoely
04/27/2022, 1:58 PM
Downforu
04/27/2022, 2:12 PM
What happens with ``kedro run`` is that I see experiments and metrics recorded in the Azure Workspace, which is not the case when I trigger the DAG from the Airflow UI.
datajoely
04/27/2022, 2:14 PM
Downforu
04/27/2022, 2:15 PM
Downforu
04/27/2022, 2:17 PM
Downforu
04/27/2022, 2:26 PM

```yaml
services:
  postgres:
    image: postgres:13
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=airflow
      - POSTGRES_DB=airflow
    ports:
      - "5434:5432"

  init_db:
    build:
      context: .
      dockerfile: Dockerfile
    command: bash -c "airflow db init && airflow db upgrade"
    env_file: .env
    depends_on:
      - postgres

  scheduler:
    build:
      context: .
      dockerfile: Dockerfile
    restart: on-failure
    command: bash -c "airflow scheduler"
    env_file: .env
    depends_on:
      - postgres
    ports:
      - "8080:8793"
    volumes:
      - ./airflow_dags:/opt/airflow/dags
      - ./airflow_logs:/opt/airflow/logs
    healthcheck:
      test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
      interval: 30s
      timeout: 30s
      retries: 3

  webserver:
    build:
      context: .
      dockerfile: Dockerfile
    hostname: webserver
    restart: always
    env_file: .env
    depends_on:
      - postgres
    command: bash -c "airflow users create -r Admin -u admin -e admin@example.com -f admin -l user -p admin && airflow webserver"
    volumes:
      - ./airflow_dags:/opt/airflow/dags
      - ./airflow_logs:/opt/airflow/logs
    ports:
      - "5000:8080"
    healthcheck:
      test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
      interval: 30s
      timeout: 30s
      retries: 32
```
Downforu
05/02/2022, 10:47 AM
1/ kedro package
2/ docker-compose up postgres
3/ Open another terminal: docker-compose up init_db
4/ In the new terminal: docker-compose up scheduler webserver
Thank you in advance!