Burn1n9m4n
03/23/2022, 5:27 PM
datajoely
03/23/2022, 5:28 PM
Burn1n9m4n
03/23/2022, 7:21 PM
datajoely
03/23/2022, 7:26 PM
datetimes key included a list of columns, I would map a dictionary of column name to date type pairs and pass those to df.astype
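A minimal sketch of that idea, assuming the config is a plain dict with a datetimes key holding column names. The helper names, the example column names, and the datetime64[ns] target dtype are illustrative assumptions, not something stated in the thread:

```python
from typing import Any, Dict, Iterable


def build_dtype_map(columns: Iterable[str], dtype: str = "datetime64[ns]") -> Dict[str, str]:
    """Map every listed column name to the same target dtype."""
    return {column: dtype for column in columns}


def cast_datetime_columns(df: Any, config: Dict[str, Any]) -> Any:
    """Cast the columns listed under config["datetimes"] with df.astype.

    df is assumed to be a pandas DataFrame. Nothing is imported here, so the
    helper only relies on the object exposing an astype(mapping) method.
    """
    columns = config.get("datetimes", [])
    if not columns:
        return df  # nothing to cast, hand the frame back untouched
    return df.astype(build_dtype_map(columns))


# Hypothetical config, e.g. loaded from a parameters file.
example_config = {"datetimes": ["created_at", "updated_at"]}
dtype_map = build_dtype_map(example_config["datetimes"])
```

With a real DataFrame this would be called as cast_datetime_columns(df, example_config). If the columns hold strings in an unusual format, pd.to_datetime with an explicit format is probably the safer route than astype.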
waylonwalker
03/23/2022, 7:45 PM
datajoely
03/23/2022, 7:46 PM
waylonwalker
03/23/2022, 9:34 PM
datajoely
03/23/2022, 9:44 PM
load_args and save_args that you should be aware of and in some cases keep declarative and configurable in the catalog
idriss__
03/24/2022, 9:32 AM
datajoely
03/24/2022, 9:51 AM
noestl
03/24/2022, 1:48 PM
datajoely
03/24/2022, 1:55 PM
datajoely
03/24/2022, 1:56 PM
noestl
03/24/2022, 2:53 PM
datajoely
03/24/2022, 3:02 PM
jcasanuevam
03/28/2022, 8:28 AM
antony.milne
03/28/2022, 8:57 AM
versioned: true then you can keep track of the file over time as well. There are types like yaml.YAMLDataSet available for this sort of thing: https://kedro.readthedocs.io/en/stable/kedro.extras.datasets.html
jcasanuevam
03/28/2022, 9:04 AM
vivecalindahl
03/28/2022, 1:18 PM
[...]
node(
    name="process",
    func=process_fcn,
    inputs=dict(
        df="data_at_filepath",
        df_info="${DATA_INFO}"
    ),
    [...]
)
where DATA_INFO would be an environment variable. However, AFAICT I can't inject an environment variable like this; the globals dict is not available (?). The two solutions I see are
1) just using os.getenv inside of the function process_fcn, or
2) instead make the data info a parameter, refer to it as params:data_info, and pass it in via kedro run --params data_info:<something>.
Or is there a better way?
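The two options can also be combined. Here is a rough sketch, assuming process_fcn takes the data info as an optional second argument and falls back to the DATA_INFO environment variable when nothing is passed in. The resolve_data_info helper and its error message are hypothetical, and the actual processing is left out:

```python
import os
from typing import Any, Optional, Tuple


def resolve_data_info(data_info: Optional[str] = None, env_var: str = "DATA_INFO") -> str:
    """Prefer an explicitly passed value, else read the environment variable."""
    if data_info is not None:
        return data_info
    value = os.getenv(env_var)
    if value is None:
        raise ValueError(f"No data info passed and {env_var} is not set")
    return value


def process_fcn(df: Any, df_info: Optional[str] = None) -> Tuple[Any, str]:
    """Placeholder node function: resolves the data info, then hands df back."""
    info = resolve_data_info(df_info)
    # Real processing of df using info would go here.
    return df, info
```

Wired up as a node, df_info would then point at params:data_info when the parameter route is used, while the os.getenv fallback covers the case where only the environment variable is available.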
This looks pretty similar to what I'm asking about: https://github.com/kedro-org/kedro/issues/1076
Dhaval
03/28/2022, 7:12 PM
datajoely
03/28/2022, 7:13 PM
datajoely
03/28/2022, 7:13 PM
Dhaval
03/28/2022, 7:13 PM
https://www.youtube.com/watch?v=fYkVtzXUEBE
datajoely
03/28/2022, 7:13 PM
datajoely
03/28/2022, 7:14 PM
Dhaval
03/28/2022, 7:14 PM
Dhaval
03/28/2022, 7:15 PM
Dhaval
03/28/2022, 7:15 PM
2022-03-29 00:25:09.405 Traceback (most recent call last):
  File "/home/thakkar/anaconda3/envs/basic_vis/lib/python3.8/site-packages/streamlit/scriptrunner/script_runner.py", line 443, in _run_script
    exec(code, module.__dict__)
  File "/home/thakkar/Work/ramp-zendesk/app.py", line 17, in <module>
    data = context.catalog.list()
  File "/home/thakkar/anaconda3/envs/basic_vis/lib/python3.8/site-packages/kedro/framework/context/context.py", line 320, in catalog
    return self._get_catalog()
  File "/home/thakkar/anaconda3/envs/basic_vis/lib/python3.8/site-packages/kedro/framework/context/context.py", line 356, in _get_catalog
    conf_catalog = self.config_loader.get("catalog*", "catalog*/**", "**/catalog*")
  File "/home/thakkar/anaconda3/envs/basic_vis/lib/python3.8/site-packages/kedro/framework/context/context.py", line 449, in config_loader
    return self._get_config_loader()
  File "/home/thakkar/anaconda3/envs/basic_vis/lib/python3.8/site-packages/kedro/framework/context/context.py", line 432, in _get_config_loader
    raise KedroContextError(
kedro.framework.context.context.KedroContextError: Expected an instance of `ConfigLoader`, got `NoneType` instead.
datajoely
03/28/2022, 7:16 PM
Dhaval
03/28/2022, 7:21 PM
Dhaval
03/28/2022, 7:21 PM