ggerog
01/14/2022, 1:03 PM
datajoely
01/14/2022, 1:52 PM
jaweiss2305
01/16/2022, 3:13 PM
datajoely
01/16/2022, 4:13 PM
czix
01/17/2022, 9:55 PM
RRoger
01/18/2022, 3:33 AM
`pandas.SQLQueryDataSet`? i.e. like using the SSH Tunnel feature (e.g. in DBeaver)
datajoely
01/18/2022, 10:13 AM
datajoely
01/18/2022, 10:14 AM
martinlarsalbert
01/19/2022, 4:15 PM
`{% for id in $ids %}` and `{% for id in ${ids} %}` but none of this works...
datajoely
01/19/2022, 4:20 PM
`globals.yml` this is currently on the backlog. I think the easiest way to make these variables available is to customise how you register `TemplatedConfigLoader` in `hooks.py` or perhaps subclass and extend `TemplatedConfigLoader`
datajoely
01/19/2022, 4:23 PM
`anyconfig` the flag `ac_template=True` enables jinja
datajoely
01/19/2022, 4:24 PM
`anyconfig.load` actually takes an extra parameter called `paths` where you can declare `jinja2` templates
datajoely
01/19/2022, 4:24 PM
martinlarsalbert
01/19/2022, 4:25 PM
datajoely
01/19/2022, 4:26 PM
datajoely
01/19/2022, 4:27 PM
Rroger
01/19/2022, 9:44 PM
`1073741819 (0xC0000005)`? It happens when using `ThreadRunner`. Using `SequentialRunner` is fine. Is it because too many processes are trying to access the same dataset simultaneously?
datajoely
01/19/2022, 9:45 PM
`ThreadRunner` with Spark or Dask? For Python pipelines you should be using `ParallelRunner`
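A note on what `ParallelRunner` implies in practice. As a minimal sketch, assuming the runner hands nodes to separate worker processes through Python's standard multiprocessing machinery, everything sent to a worker has to survive `pickle`. The snippet uses only the standard library (no Kedro), and `double` and `can_pickle` are hypothetical names chosen for illustration. It shows that lambdas and module objects cannot be pickled, while a named module-level function or a `functools.partial` built from an importable function can.

```python
import functools
import math
import operator
import pickle


def double(x):
    """A named, module-level function: pickled by reference to its import path."""
    return 2 * x


def can_pickle(obj):
    """Return True if pickle can serialise obj, False if it raises."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True


# Picklable: importable functions and partials built from them.
print(can_pickle(double))
print(can_pickle(functools.partial(operator.mul, 2)))

# Not picklable: lambdas (no importable name) and module objects.
print(can_pickle(lambda x: 2 * x))
print(can_pickle(math))
```

In a pipeline this usually means replacing inline lambdas with named functions defined in an importable module, and not passing module objects between nodes.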
Rroger
01/19/2022, 9:47 PM
`ThreadRunner` with Pandas. I tried using `ParallelRunner` a while ago and got an error about lambda functions.
datajoely
01/19/2022, 9:49 PM
`ParallelRunner` here? We can help you work through the errors. Due to the way Python works, you don't get true concurrency with threads, so we have to use multi-processing.
datajoely
01/19/2022, 9:49 PM
datajoely
01/19/2022, 9:49 PM
datajoely
01/19/2022, 9:50 PM
`&&` operator in your CLI command
Rroger
01/19/2022, 10:08 PM
`ParallelRunner` leads to `TypeError: cannot pickle 'module' object`.
datajoely
01/19/2022, 10:09 PM
datajoely
01/19/2022, 10:09 PM
datajoely
01/19/2022, 10:10 PM
datajoely
01/19/2022, 10:12 PM
Rroger
01/19/2022, 10:38 PM
martinlarsalbert
01/20/2022, 9:34 AM
`ac_context=globals` which would expose the globals to jinja2, but it also seems that `globals.yml` and `catalog.yml` are loaded together in arbitrary order, so the globals are not known at the time of the jinja2 rendering. I suspect (as mentioned) that `TemplatedConfigLoader` needs a major overhaul to change this
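
To illustrate the ordering problem, here is a minimal sketch that uses `jinja2` directly rather than Kedro or anyconfig (it needs the `jinja2` package installed). It assumes the globals have already been loaded into a dictionary before the catalog template is rendered; `globals_dict`, `catalog_template`, and the dataset names and file paths are hypothetical. Note that Jinja2 refers to context variables by bare name, so the loop is written `{% for id in ids %}`, without `$` or `${...}`.

```python
from jinja2 import Environment, StrictUndefined
from jinja2.exceptions import UndefinedError

# Hypothetical stand-in for the contents of globals.yml, loaded first.
globals_dict = {"ids": ["a", "b"]}

# Hypothetical catalog template; entry names and paths are illustrative only.
catalog_template = """\
{% for id in ids %}
dataset_{{ id }}:
  type: pandas.CSVDataSet
  filepath: data/01_raw/{{ id }}.csv
{% endfor %}
"""

# StrictUndefined turns a missing variable into an error instead of an empty loop.
env = Environment(undefined=StrictUndefined, trim_blocks=True)
template = env.from_string(catalog_template)

# Rendering after the globals are known expands one entry per id.
rendered = template.render(**globals_dict)
print(rendered)

# Rendering before the globals are known fails, which mirrors the load-order issue.
try:
    template.render()
    missing_context_fails = False
except UndefinedError:
    missing_context_fails = True
print(missing_context_fails)
```

Without `StrictUndefined`, Jinja2's default behaviour is to treat the missing `ids` as empty, so the loop would silently render no catalog entries at all.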