FelicioV
03/02/2022, 5:42 PM
PartitionedDataSets with pandas.ExcelDataSet and specifying load_args as sheet_name, names and dtype. It works like a charm, but I'm worried about the size of catalog/ingest.yml. I've been searching for a way to split that catalog yml into a few files, maybe along business-oriented segments, but I have had no luck with it. Is there an intended way to do such a thing? If there is no intended way, I've been thinking (not really tried, though) of messing around with register_catalog on the ProjectHooks class. Am I making any sense? Thanks!
datajoely
03/02/2022, 6:08 PMdatajoely
03/02/2022, 6:09 PMFelicioV
03/02/2022, 6:09 PMdatajoely
03/02/2022, 8:03 PMdatajoely
03/02/2022, 8:03 PMdatajoely
03/02/2022, 8:04 PM
catalog* and catalog*/**, so we will pick up any files that are prefixed with catalog or live (recursively) within a folder with that prefix
datajoely
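To make the two patterns concrete, here is a minimal standard-library sketch of how catalog* and catalog*/** could select files inside one configuration folder. It does not call Kedro itself; the folder layout, the file names and the helper names find_catalog_files and demo are assumptions made up for illustration, as is the restriction to .yml/.yaml files.

```python
import tempfile
from glob import iglob
from pathlib import Path

PATTERNS = ("catalog*", "catalog*/**")  # the two patterns named in the thread
SUFFIXES = {".yml", ".yaml"}  # assumption: only YAML files matter here


def find_catalog_files(conf_dir: Path) -> list[str]:
    """Return files under conf_dir matched by either pattern, as relative POSIX paths."""
    found = set()
    for pattern in PATTERNS:
        # recursive=True lets "**" descend into nested folders
        for match in iglob(str(conf_dir / pattern), recursive=True):
            path = Path(match)
            # The patterns also match directories, so keep regular files only
            if path.is_file() and path.suffix in SUFFIXES:
                found.add(path.relative_to(conf_dir).as_posix())
    return sorted(found)


def demo() -> list[str]:
    """Build a throwaway conf folder (hypothetical layout) and list the matches."""
    with tempfile.TemporaryDirectory() as tmp:
        conf_dir = Path(tmp) / "conf" / "base"
        for rel in (
            "catalog.yml",
            "catalog_ingest.yml",
            "catalog/sales.yml",
            "catalog/finance/ledger.yml",
            "parameters.yml",  # no "catalog" prefix, so it should be skipped
        ):
            target = conf_dir / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text("{}\n")
        return find_catalog_files(conf_dir)


if __name__ == "__main__":
    print(demo())
```

Under these assumptions, files prefixed with catalog and files nested anywhere below a catalog-prefixed folder are both picked up, while parameters.yml is left out.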
03/02/2022, 8:04 PMdatajoely
03/02/2022, 8:05 PMdatajoely
03/02/2022, 8:06 PMdatajoely
03/02/2022, 8:06 PMdatajoely
03/02/2022, 8:07 PMFelicioV
03/02/2022, 8:08 PMFelicioV
03/02/2022, 8:09 PM
# <conf_root>/<env>/catalog/<pipeline_name>.yml
rockets:
  type: MemoryDataSet
scooters:
  type: MemoryDataSet
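As a rough illustration of what splitting the catalog implies, the sketch below merges several per-file catalog dictionaries into one. It assumes each file has already been parsed into a plain dict, and it treats a dataset name defined in two files as an error; that rule, the function name merge_catalogs, the file names and the raw_sales entry are all assumptions of this sketch rather than anything stated in the thread.

```python
def merge_catalogs(files: dict[str, dict]) -> dict:
    """Merge per-file catalog dicts, rejecting dataset names defined twice."""
    merged: dict = {}
    origin: dict[str, str] = {}  # dataset name -> file that first defined it
    for filename, entries in files.items():
        for name, spec in entries.items():
            if name in merged:
                raise ValueError(
                    f"{name!r} is defined in both {origin[name]} and {filename}"
                )
            merged[name] = spec
            origin[name] = filename
    return merged


# Hypothetical file names; rockets/scooters mirror the YAML snippet above.
split_files = {
    "catalog/transport.yml": {
        "rockets": {"type": "MemoryDataSet"},
        "scooters": {"type": "MemoryDataSet"},
    },
    "catalog/ingest.yml": {
        "raw_sales": {"type": "MemoryDataSet"},  # made-up entry for illustration
    },
}

catalog = merge_catalogs(split_files)
```

Keeping track of which file first defined each name makes the error message point at both offending files, which helps once the catalog is spread over many of them.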
datajoely
03/02/2022, 8:10 PMdatajoely
03/02/2022, 8:10 PMdatajoely
03/02/2022, 8:10 PMdatajoely
03/02/2022, 8:11 PMdatajoely
03/02/2022, 8:11 PMFelicioV
03/02/2022, 8:12 PMdatajoely
03/02/2022, 8:13 PMdatajoely
03/02/2022, 8:13 PMdatajoely
03/02/2022, 8:13 PMFelicioV
03/02/2022, 8:13 PMdatajoely
03/02/2022, 8:14 PMFelicioV
03/02/2022, 8:15 PMdatajoely
03/02/2022, 8:15 PM