saving CachedDataSet in S3
# beginners-need-help
d
Hi there, I have a question about saving a dataset to S3. This is part of my `catalog.yml`:
```
train_raw_data:
  type: CachedDataSet
  versioned: true
  dataset:
    type: pickle.PickleDataSet
    filepath: s3://test/data/train_raw_data.pickle
```
If I have proper AWS credentials in `~/.aws/credentials`, it works well (the dataset is saved to the S3 path I set). But I need to set the credentials in `conf/local/credentials.yml` for some reason, so I removed `~/.aws/credentials` and created `conf/local/credentials.yml` like this:
```
aws_access_key_id: AAA
aws_secret_access_key: BBB
aws_session_token: XXX
```
It doesn't work, and I think boto3 prints an `Unable to locate credentials` message. I also tried changing the format of `credentials.yml`, together with a modified `catalog.yml`, like this:
```
dev_s3:
  aws_access_key_id: AAA
  aws_secret_access_key: BBB
  aws_session_token: XXX
```
```
train_raw_data:
  type: CachedDataSet
  versioned: true
  credentials: dev_s3
  dataset:
    type: pickle.PickleDataSet
    filepath: s3://test/data/train_raw_data.pickle
```
It doesn't work either, and it shows `DataSet 'train_raw_data' must only contain arguments valid for the constructor of kedro.io.cached_dataset.CachedDataSet.`
d
So there are two things here:
1. You can use the `.aws` credentials or environment variables ahead of the Kedro approach; we just expose it so you have a way of doing it consistently.
2. The cached dataset is a wrapper, so you need to push the `credentials` key down one level, under `dataset`.
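Applied to the catalog entry from the question, moving `credentials` under `dataset` might look like the sketch below. Everything except the position of the `credentials` line is copied from the question, and `dev_s3` is assumed to be the entry defined in `conf/local/credentials.yml`:

```yaml
train_raw_data:
  type: CachedDataSet
  versioned: true
  dataset:
    type: pickle.PickleDataSet
    filepath: s3://test/data/train_raw_data.pickle
    # credentials belongs to the wrapped dataset, not to the CachedDataSet wrapper
    credentials: dev_s3
```

With the key at the top level, it is handed to the `CachedDataSet` constructor, which is what the "must only contain arguments valid for the constructor" error is complaining about.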
d
thank you
it works!!
One more quick question: is there a way to set default credentials in `conf/local/credentials`? If we could set default credentials instead of named credentials such as `dev_s3`, we might not need to specify `credentials: dev_s3` for every dataset.
d
There isn't, but you can do a couple of different things:
- For S3 stuff, environment variables can simplify this for you outside of Kedro.
- In Kedro YAML, you can use the anchor syntax to reuse the same structure over and over: https://blog.daemonl.com/2016/02/yaml.html
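As a rough sketch of the anchor approach, the shared dataset settings could be defined once and merged into each entry. The names `_s3_pickle` and `test_raw_data` are hypothetical and only for illustration; the leading underscore relies on Kedro skipping catalog keys that start with `_`, so the anchor itself is not treated as a dataset:

```yaml
# Hypothetical shared block: defined once, reused via the anchor &s3_pickle
_s3_pickle: &s3_pickle
  type: pickle.PickleDataSet
  credentials: dev_s3

train_raw_data:
  type: CachedDataSet
  versioned: true
  dataset:
    <<: *s3_pickle   # merge in type and credentials from the anchor
    filepath: s3://test/data/train_raw_data.pickle

# Hypothetical second entry showing the reuse
test_raw_data:
  type: CachedDataSet
  dataset:
    <<: *s3_pickle
    filepath: s3://test/data/test_raw_data.pickle
```

This does not give a true default, since each entry still has to merge the anchor, but the credentials name then lives in one place.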
d
thank you!
d
No problem!