datajoely
05/16/2022, 10:46 PM
datajoely
05/16/2022, 10:46 PM
wwliu
05/16/2022, 10:49 PM
AnnaRie
05/18/2022, 6:59 PM
antony.milne
05/18/2022, 8:32 PM
wwliu
05/18/2022, 11:29 PM
noklam
05/18/2022, 11:35 PM
noklam
05/18/2022, 11:37 PM
datajoely
05/19/2022, 10:48 AM
Execution order is guaranteed only per dependency level, i.e. if dataset `D` requires `A`, `B` and `C`, then `D` will always be executed last, but the order in which `A`, `B` and `C` are executed is not fixed from run to run.
Lazy2PickName
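To illustrate the ordering point above, here is a minimal sketch in plain Python, not Kedro's own scheduler. It uses the standard-library `graphlib.TopologicalSorter` to group the hypothetical datasets `A`, `B`, `C` and `D` into dependency levels; the helper name `dependency_levels` is an assumption made for this example.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each key depends on the names in its set.
deps = {"D": {"A", "B", "C"}, "A": set(), "B": set(), "C": set()}


def dependency_levels(deps):
    """Group items into levels; each level only depends on earlier levels."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    levels = []
    while sorter.is_active():
        # Everything returned here is runnable now, in any order.
        ready = sorted(sorter.get_ready())  # sorted only for stable display
        levels.append(ready)
        sorter.done(*ready)
    return levels


levels = dependency_levels(deps)
print(levels)
```

Within a level the items are sorted here only to make the output stable; a runner is free to execute them in any order, which is why only the position of `D` is guaranteed.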
05/19/2022, 4:41 PM
def _parse_inctf() -> Pipeline:
    return Pipeline(
        [
            node(
                func=nodes.insert_columns_inctf,
                inputs='external-inct-fracionada',
                outputs="inctf-preprocess-01-insert-columns",
                name="read-and-insert-columns-inctf",
            ),
            node(
                func=nodes.parse_inct_dates,
                inputs="inctf-preprocess-01-insert-columns",
                outputs="inctf-preprocess-02-parse-dates"
            ),
            node(
                func=nodes.get_pct_change,
                inputs="inctf-preprocess-02-insert-columns",
                outputs="inctf-preprocessed"
            ),
        ]
    )
Of those datasets, only `external-inct-fracionada` and `inctf-preprocessed` are actually declared in the `catalog.yml`. I want to pass the others as MemoryDataSets, since they are intermediate datasets in my pipeline, but when I run it I get this error:
ValueError: Pipeline input(s) {'inctf-preprocess-02-insert-columns'} not found in the DataCatalog
Is there a way of doing this without declaring each intermediate dataset in my catalog? Just so you know, this is the entry for `external-inct-fracionada` in my catalog:
external-inct-fracionada:
  type: project.io.encrypted_excel.EncryptedExcelDataSet
  filepath: "${DATA_DIR}/External/INCT/INCTF_0222.xls"
The implementation of `EncryptedExcelDataSet` is in the attached file.
noklam
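As a rough sketch of why that error appears, assuming the pipeline and catalog entries are exactly as posted: Kedro normally treats datasets missing from the catalog as in-memory datasets, but only when some node produces them. The plain-Python check below (no Kedro import; `unresolved_inputs` is a hypothetical helper) lists the inputs that no node outputs and that the catalog does not declare.

```python
# Plain-Python stand-in for the posted pipeline: (inputs, outputs) per node.
pipeline_nodes = [
    ({"external-inct-fracionada"}, {"inctf-preprocess-01-insert-columns"}),
    ({"inctf-preprocess-01-insert-columns"}, {"inctf-preprocess-02-parse-dates"}),
    ({"inctf-preprocess-02-insert-columns"}, {"inctf-preprocessed"}),
]

# Dataset names declared in the catalog, as described in the message.
catalog = {"external-inct-fracionada", "inctf-preprocessed"}


def unresolved_inputs(pipeline_nodes, catalog):
    """Return inputs that no node produces and the catalog does not declare."""
    produced = set().union(*(outs for _, outs in pipeline_nodes))
    consumed = set().union(*(ins for ins, _ in pipeline_nodes))
    free_inputs = consumed - produced  # must come from outside the pipeline
    return free_inputs - catalog


missing = unresolved_inputs(pipeline_nodes, catalog)
print(missing)
```

The only name reported is `inctf-preprocess-02-insert-columns`, which the third node consumes while the second node outputs `inctf-preprocess-02-parse-dates`, so the mismatch between those two names is the likely cause of the error.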
05/19/2022, 4:53 PM
Lazy2PickName
05/19/2022, 4:57 PM
SirTylerDurden
05/20/2022, 1:34 AM
datajoely
05/20/2022, 9:56 AM
datajoely
05/20/2022, 10:01 AM
SirTylerDurden
05/21/2022, 12:08 AM
datajoely
05/21/2022, 3:04 AM
RRoger
05/21/2022, 6:16 AM
`output` to a list of length 2000, i.e. `["senate_2006-03-30", "senate_2006-03-31", ...]`, i.e. a 2000-line `pipeline.py`? Or is there some sort of clever templating?
datajoely
05/21/2022, 6:35 AM
RRoger
05/21/2022, 11:46 AM
Mackson
05/24/2022, 12:37 AM
datajoely
05/24/2022, 1:10 AM
Mackson
05/24/2022, 8:37 AM
Mackson
05/24/2022, 8:38 AM
datajoely
05/24/2022, 8:50 AM
Mackson
05/24/2022, 8:51 AM
Mackson
05/24/2022, 8:56 AM
noklam
05/24/2022, 10:02 AM
Mackson
05/24/2022, 10:54 AM
Mackson
05/24/2022, 10:55 AM
Mackson
05/24/2022, 10:55 AM