datajoely
11/25/2021, 10:40 AM
Bastian
11/25/2021, 4:05 PM
datajoely
11/25/2021, 4:07 PM
RRoger
11/25/2021, 11:55 PM
node but threw an error.
I know that transcoding allows the same dataset to be loaded in multiple ways; I want to go the other way around.
datajoely
11/26/2021, 10:05 AM
Onéira
11/26/2021, 8:47 PM
datajoely
11/26/2021, 8:48 PM
datajoely
11/26/2021, 8:50 PM
Onéira
11/26/2021, 8:54 PM
Onéira
11/26/2021, 8:54 PM
datajoely
11/26/2021, 8:56 PM
Onéira
11/26/2021, 8:59 PM
Rroger
11/27/2021, 9:40 PM
process_col_A, process_col_B, ..., process_col_ZZ.
datajoely
11/27/2021, 9:47 PM
brewski
11/27/2021, 10:09 PM
Rroger
11/28/2021, 2:46 AM
ThreadRunner and ParallelRunner.
1. I know that the ParallelRunner won't take lambda functions. In this case, how should I deal with nodes that use identity lambda functions (lambda x: x)?
2. I tried running with ThreadRunner, and it did run the nodes at the same time and shortened the end-to-end runtime. Are there situations in which I shouldn't use ThreadRunner?
sri
11/28/2021, 12:16 PM
sri
11/28/2021, 3:32 PM
datajoely
11/28/2021, 6:55 PM
datajoely
11/28/2021, 6:58 PM
datajoely
11/28/2021, 7:00 PM
datajoely
11/28/2021, 7:04 PM
sri
11/29/2021, 9:48 AM
datajoely
11/29/2021, 9:57 AM
pandas.SQLTableDataSet catalog references with different config
datajoely
11/29/2021, 9:58 AM
sri
11/29/2021, 10:38 AM
datajoely
11/29/2021, 10:39 AM
SQLQueryDataSet + Jinja or define your own dataset
Apoorva
11/29/2021, 10:52 AM
datajoely
11/29/2021, 11:54 AM
insert, upsert or overwrite. There isn't really a 'read only' mode. If you only plan on reading, you can select any of those and it will never matter if you never save back to Hive. If you really want to block saves, you can inherit the dataset and override the save() method to raise NotImplementedError.
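A minimal sketch of the "inherit and block saves" idea from the message above. To keep it self-contained, HiveTableDataSet here is a hypothetical in-memory stand-in, not Kedro's real Hive dataset class, and its constructor arguments are assumptions. In Kedro itself the method to override may be _save rather than save, depending on the version, so it is worth checking the dataset's source first.

```python
class HiveTableDataSet:
    """Hypothetical in-memory stand-in for a Hive-backed dataset (assumption)."""

    VALID_WRITE_MODES = {"insert", "upsert", "overwrite"}

    def __init__(self, table, write_mode="insert"):
        if write_mode not in self.VALID_WRITE_MODES:
            raise ValueError(f"Unsupported write_mode: {write_mode!r}")
        self.table = table
        self.write_mode = write_mode
        self._rows = []

    def load(self):
        # Return a copy so callers cannot mutate the stored rows.
        return list(self._rows)

    def save(self, data):
        # Simplified: 'upsert' is treated like 'insert' in this stand-in.
        if self.write_mode == "overwrite":
            self._rows = list(data)
        else:
            self._rows.extend(data)


class ReadOnlyHiveTableDataSet(HiveTableDataSet):
    """Same loading behaviour, but any attempt to save is rejected."""

    def save(self, data):
        raise NotImplementedError(
            f"{self.table} is read-only; saving is disabled"
        )


read_only = ReadOnlyHiveTableDataSet("db.my_table")
rows = read_only.load()  # loading still works
try:
    read_only.save([{"id": 1}])
except NotImplementedError as err:
    print(err)
```

The write mode still has to be set to one of the three accepted values on the subclass, but it never takes effect because save() fails before anything is written.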
sri
11/29/2021, 5:34 PM