# advanced-need-help
s
Is there a way to run a DAG so that one or more nodes failing in the middle don't stop the whole pipeline? I'd like it to continue running the nodes that don't depend on the failed nodes, track the failed and skipped nodes, and raise an error or log message at the end summarizing what failed or was skipped. I've thought of writing a custom runner for this, but it occurred to me this might be something others have worked on or been interested in.
I was thinking of adapting the `ParallelRunner` by changing its exception handling. Instead of re-raising, it could prune the to-do list of nodes that depend on the failing one, capturing the failed and skipped ones, and then just continue. At the end it could log the failed and skipped nodes, and maybe even raise an exception. That way it tries to do everything it can; sometimes it's pretty valuable to have as many parts of the pipeline succeed as possible.
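Concretely, the pruning step could look something like this (untested sketch; it only relies on `node.inputs` / `node.outputs` being lists of dataset names and on kedro `Node` objects being hashable, so check against your kedro version):

```python
from collections import deque


def find_downstream(failed_node, remaining_nodes):
    """Nodes in `remaining_nodes` that (transitively) depend on `failed_node`."""
    # Map each dataset name to the not-yet-run nodes that consume it.
    consumers = {}
    for n in remaining_nodes:
        for ds in n.inputs:
            consumers.setdefault(ds, []).append(n)

    skipped = set()
    frontier = deque(failed_node.outputs)  # these datasets will never be produced
    while frontier:
        missing = frontier.popleft()
        for n in consumers.get(missing, []):
            if n not in skipped:
                skipped.add(n)
                frontier.extend(n.outputs)  # their outputs are now missing too
    return skipped
```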
a
Just use the sequential runner. Wrap each node in a try/except, eat the exception, and add the node to a failed set. Then, when you look for the next node to run, treat any node with a failed dependency as skipped and add it to that list too.
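Something like this (untested sketch, not the real `SequentialRunner`: it drives `node.run()` by hand with a plain dict instead of a catalog, and `free_inputs` has to contain every raw dataset and parameter that no node produces; attribute names may differ across kedro versions):

```python
def run_tolerant(pipeline, free_inputs):
    """Run as much of the pipeline as possible, skipping downstream of failures."""
    data = dict(free_inputs)            # dataset name -> value
    failed, skipped = [], []

    for node in pipeline.nodes:         # kedro returns these in topological order
        if not all(ds in data for ds in node.inputs):
            skipped.append(node.name)   # an upstream node failed or was skipped
            continue
        try:
            outputs = node.run({ds: data[ds] for ds in node.inputs})
            data.update(outputs)
        except Exception as exc:        # deliberately broad: keep going
            failed.append((node.name, exc))

    if failed or skipped:
        raise RuntimeError(
            f"failed: {[name for name, _ in failed]}, skipped: {skipped}"
        )
    return data
```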
s
That’s definitely an idea. It would be easier if kedro natively had an actual graph representation and methods for listing a node's descendants. I can use networkx for that, but converting kedro pipelines into a graph is a pain; I've done it many times, but always with custom code. Am I missing something that already exists?
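For reference, the conversion I keep re-writing boils down to something like this (sketch; assumes `node.inputs` / `node.outputs` are the dataset-name lists):

```python
import networkx as nx


def pipeline_to_graph(pipeline):
    """Build a DiGraph with an edge A -> B whenever B consumes one of A's outputs."""
    graph = nx.DiGraph()
    graph.add_nodes_from(pipeline.nodes)

    producers = {}                      # dataset name -> node that produces it
    for n in pipeline.nodes:
        for ds in n.outputs:
            producers[ds] = n

    for n in pipeline.nodes:
        for ds in n.inputs:
            if ds in producers:         # free/raw inputs have no producer
                graph.add_edge(producers[ds], n)
    return graph


# Everything downstream of a failed node, i.e. what would have to be skipped:
# skipped = nx.descendants(graph, failed_node)
```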
a
I think you're overcomplicating it. For fewer than 1000 nodes (you probably have ~50) you can just brute-force it: keep scanning the remaining nodes, marking the ones with a failed or skipped parent, until a scan adds nothing new. It's like 10 lines of Python.
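Something like (sketch; `nodes` is the pipeline's node list and `failed` is the set of node names that raised):

```python
def find_skipped(nodes, failed):
    """Fixed-point scan: mark nodes whose parents are dead until nothing changes."""
    dead_outputs = {ds for n in nodes if n.name in failed for ds in n.outputs}
    skipped = set()
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n.name in failed or n.name in skipped:
                continue
            if any(ds in dead_outputs for ds in n.inputs):
                skipped.add(n.name)
                dead_outputs.update(n.outputs)
                changed = True
    return skipped
```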