#beginners-need-help
Bertozzo

08/11/2021, 5:14 PM
Greetings! I'm facing an issue very similar to this one, which happened in an older version: https://github.com/quantumblacklabs/kedro/issues/291. Basically, I can't save my dataset due to an encoding error.
datajoely

08/11/2021, 5:15 PM
Hi @User
5:15 PM
have you tried this in your YAML entry?
fs_args:
    open_args_load:
        mode: "rb"
5:17 PM
In general, when this comes up, this sort of approach tends to work:
my_dataset:
  type: pandas.CSVDataSet
  filepath: xxxxx.csv
  fs_args:
    open_args_load:
      mode: "rb"
      encoding: "utf-8"
    open_args_save:
      mode: "w"
      encoding: "utf-8"
Bertozzo

08/11/2021, 5:17 PM
Hey there ! yes, that was my first attempt
datajoely

08/11/2021, 5:17 PM
Could you post your stack trace
Bertozzo

08/11/2021, 5:17 PM
i'll try this one
5:19 PM
now i've a different output
datajoely

08/11/2021, 5:19 PM
Progress!
5:19 PM
can you post it here
Bertozzo

08/11/2021, 5:20 PM
File "yaml_yaml.pyx", line 707, in yaml._yaml.CParser.get_single_node
File "yaml_yaml.pyx", line 725, in yaml._yaml.CParser._compose_document
File "yaml_yaml.pyx", line 776, in yaml._yaml.CParser._compose_node
File "yaml_yaml.pyx", line 890, in yaml._yaml.CParser._compose_mapping_node
File "yaml_yaml.pyx", line 776, in yaml._yaml.CParser._compose_node
File "yaml_yaml.pyx", line 892, in yaml._yaml.CParser._compose_mapping_node
File "yaml_yaml.pyx", line 905, in yaml._yaml.CParser._parse_next_event
yaml.scanner.ScannerError: mapping values are not allowed in this context
  in "C:\Users\lbertozz\Downloads\aut-ia-avaliador-de-materias\conf\base\catalog.yml", line 16, column 11
datajoely

08/11/2021, 5:20 PM
okay, that's just a bad YAML file
5:20 PM
there will be a bad indent somewhere
5:20 PM
line 16 to be exact
Bertozzo

08/11/2021, 5:21 PM
mm, thats interesting
5:21 PM
and where do i find this file ?
5:22 PM
i only know where my xlsx and csv files are
datajoely

08/11/2021, 5:22 PM
it's your catalog file
C:\Users\lbertozz\Downloads\aut-ia-avaliador-de-materias\conf\base\catalog.yml
5:22 PM
line 16
Bertozzo

08/11/2021, 5:23 PM
ok, that was just an indent error 😅
5:24 PM
now i am back to the original one, charmap encode etc etc
datajoely

08/11/2021, 5:24 PM
okay interesting
5:24 PM
Can you post your YAML entry here?
5:24 PM
for that dataset
Bertozzo

08/11/2021, 5:25 PM
one moment, i guess i found it
datajoely

08/11/2021, 5:25 PM
no problem
Bertozzo

08/11/2021, 5:25 PM
i need to put this in all csvs right ?
5:26 PM
so i can read them all correctly
datajoely

08/11/2021, 5:26 PM
There is a way to re-use the same pattern over and over
5:26 PM
but let's get it working for one
5:27 PM
my_dataset:
  type: pandas.CSVDataSet
  filepath: xxxxx.csv
  load_args:
    on_bad_lines: skip  # skip malformed rows; needs pandas >= 1.3
    encoding: 'utf-8'
5:28 PM
the other option is to try different encodings like utf-8-sig and utf-16
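The "charmap" error and the idea of trying several encodings can be sketched in plain Python. This is only an illustration under assumptions: the sample text, the helper names `can_encode` and `first_working_encoding`, and the candidate list are hypothetical and are not part of Kedro or pandas.

```python
import os
import tempfile

# Hypothetical sample text; the check mark has no mapping in cp1252.
TEXT = "status,note\nok,passed \u2713\n"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if `text` can be written using `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def first_working_encoding(path: str, candidates: list) -> str:
    """Return the first candidate that decodes the whole file, else raise."""
    with open(path, "rb") as handle:
        raw = handle.read()
    for encoding in candidates:
        try:
            raw.decode(encoding)
            return encoding
        except UnicodeError:
            continue
    raise ValueError(f"none of {candidates} could decode {path}")


# cp1252 (a common Windows default) cannot encode this text; UTF-8 can.
print(can_encode(TEXT, "cp1252"))  # False
print(can_encode(TEXT, "utf-8"))   # True

# Write a file as UTF-16, then probe the candidate encodings in order.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.csv")
    with open(sample_path, "w", encoding="utf-16", newline="") as handle:
        handle.write(TEXT)
    detected = first_working_encoding(sample_path, ["utf-8", "utf-16"])
    print(detected)  # utf-16
```

A successful decode does not prove the encoding is the right one (some codecs accept almost any bytes), so the result is best treated as a hint and checked against the data.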
Bertozzo

08/11/2021, 5:28 PM
i applied the same command to all the dfs, and it looks like it advanced, but now it's saying that saving None to a dataset is not allowed
5:29 PM
which is pretty logical tbh
datajoely

08/11/2021, 5:29 PM
which command are you talking about?
Bertozzo

08/11/2021, 5:29 PM
this one
datajoely

08/11/2021, 5:29 PM
You will get the None error if your node returns nothing
5:30 PM
ah understood
5:30 PM
So I'm a little confused, did we get it working?
5:30 PM
or do you still need help
Bertozzo

08/11/2021, 5:31 PM
File "c:\users\lbertozz\appdata\local\programs\python\python37\lib\site-packages\kedro\io\core.py", line 232, in save
    raise DataSetError("Saving `None` to a `DataSet` is not allowed")
kedro.io.core.DataSetError: Saving `None` to a `DataSet` is not allowed
5:31 PM
that was the last message
datajoely

08/11/2021, 5:31 PM
Okay, so can you show me the Node you use to process the dataset
5:32 PM
because it works by Catalog -> load() -> Node(Python function) -> save() -> Catalog
5:33 PM
If you're getting the None error it means your node is returning a None value
5:33 PM
not a DataFrame
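The load -> node -> save flow and the None error can be sketched without Kedro. This is a simplified stand-in under assumptions, not Kedro's actual code: the `save` helper, the dict used as a catalog, and both node functions are hypothetical, and the exception name and message are borrowed from the traceback above.

```python
# Simplified stand-in for the load -> node -> save flow; not Kedro's real code.


class DataSetError(Exception):
    """Raised when data cannot be saved (name borrowed from the traceback)."""


def save(data, store: dict, name: str) -> None:
    """Refuse to store None, mirroring the check seen in the traceback."""
    if data is None:
        raise DataSetError("Saving `None` to a `DataSet` is not allowed")
    store[name] = data


def broken_node(rows: list):
    """Sorts in place but has no return statement, so the caller gets None."""
    rows.sort()


def fixed_node(rows: list) -> list:
    """Returns the processed rows explicitly."""
    return sorted(rows)


catalog = {}

# A node that returns its result can be saved.
save(fixed_node([3, 1, 2]), catalog, "clean_rows")
print(catalog["clean_rows"])  # [1, 2, 3]

# A node that forgets to return hands None to save(), which raises.
try:
    save(broken_node([3, 1, 2]), catalog, "broken_rows")
except DataSetError as err:
    print(f"save failed: {err}")
```

The usual culprits are a missing `return`, a branch that falls through without returning, or returning the result of a method that modifies its object in place.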
Bertozzo

08/11/2021, 5:33 PM
the weirdest thing of all
5:34 PM
is that all my colleagues that use ubuntu can run it
5:34 PM
without any issues
datajoely

08/11/2021, 5:34 PM
hmm
Bertozzo

08/11/2021, 5:34 PM
they don't even need to set the encoding parameters
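One plausible reason for the Windows/Ubuntu difference, not confirmed in this thread, is the platform default text encoding: when no encoding is passed, Python's `open()` falls back to the locale's preferred encoding, which is usually UTF-8 on Linux and often cp1252 on Windows. The sketch below prints that default and shows that an explicit encoding round-trips the same way on any platform; the sample text and file name are hypothetical.

```python
import locale
import os
import tempfile

# Encoding used by open() in text mode when no encoding argument is given.
default_encoding = locale.getpreferredencoding(False)
print(f"default text encoding on this machine: {default_encoding}")

# Hypothetical sample with accented characters ("matéria", "álgebra").
text = "mat\u00e9ria,nota\n\u00e1lgebra,9\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.csv")
    # Passing the encoding explicitly removes the dependency on the OS default.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(text)
    with open(path, "r", encoding="utf-8", newline="") as handle:
        round_trip = handle.read()

print(round_trip == text)  # True
```

Setting the encoding in the catalog entry plays the same role as the explicit `encoding` argument here: the project then behaves the same regardless of the machine's locale.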
datajoely

08/11/2021, 5:34 PM
maybe it's your Python environment
5:34 PM
are you running in a virtual env?
Bertozzo

08/11/2021, 5:34 PM
nope
datajoely

08/11/2021, 5:35 PM
so I can't guarantee that will fix things
5:35 PM
but it sometimes makes things easier not having to worry about multiple Python versions, multiple versions of Kedro, etc.
Bertozzo

08/11/2021, 5:35 PM
that's what i've done: just installed Python 3.7, set it as a global variable on my system
5:36 PM
and then as the primary interpreter in VS Code
datajoely

08/11/2021, 5:36 PM
And we're sure it's the same version of the code?
5:36 PM
this is super weird
5:36 PM
Can you post a screenshot of the catalog entry and the python node
5:37 PM
I can't really work it out without seeing them
Bertozzo

08/11/2021, 5:37 PM
one moment pls
5:40 PM
can i call u pls ? and then share the screen ?
datajoely

08/11/2021, 5:41 PM
I can't tonight (I'm also on calls 🤦‍♂️)
5:41 PM
I could book some time tomorrow?
5:41 PM
or do it async on here
Bertozzo

08/11/2021, 5:42 PM
np
5:42 PM
ok, lets start over
datajoely

08/11/2021, 5:43 PM
Would you like to book some time in tomorrow?
5:43 PM
as it's getting towards end of day here in London
Bertozzo

08/11/2021, 5:44 PM
message has been deleted
5:44 PM
thats my screen rn
datajoely

08/11/2021, 5:44 PM
okay, and I need to see the .py file that has the nodes
5:44 PM
if you scroll a bit further up on the terminal it should tell you the name of the node
Bertozzo

08/11/2021, 5:46 PM
like this ?
5:46 PM
021-08-11 14:43:12,636 - kedro.runner.sequential_runner - WARNING - There are 2 nodes that have not run.
You can resume the pipeline run by adding the following argument to your previous command:
  --from-nodes "download_files,files_to_text"
Traceback (most recent call last):
  File "c:\users\lbertozz\appdata\local\programs\python\python37\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
datajoely

08/11/2021, 5:46 PM
Even further please
5:47 PM
you can right click the terminal and select all -> copy
Bertozzo

08/11/2021, 5:48 PM
message has been deleted
datajoely

08/11/2021, 5:50 PM
That's very helpful thank you
5:50 PM
so let me explain how to read the logs, as they explain what's going on
5:50 PM
message has been deleted
5:51 PM
The node you need to look at is called full_data_clean
5:51 PM
and that will be in a .py file somewhere in your project folder
5:51 PM
It looks like there is a bad return somewhere in there where a None object is being returned
5:52 PM
sorry, my screenshot is wrong
5:53 PM
the node to look at is this one
5:53 PM
message has been deleted
5:53 PM
download_files
Bertozzo

08/11/2021, 5:55 PM
ok, im checking for it
datajoely

08/11/2021, 5:55 PM
So this is no longer a CSV encoding issue
5:56 PM
it looks like the encoding tweak fixed things
5:56 PM
but now it's just a badly formed node
5:56 PM
I'm going to log off for the evening - but feel free to post questions here and I'll pick them up when I'm next online 🙂
5:56 PM
good luck!
Bertozzo

08/11/2021, 5:57 PM
i see, and why do you think it's not working then? i really thought it was a windows issue or something
5:57 PM
sure, thank you so much !
5:57 PM
really helped ! have a good day, see u soon 😄
datajoely

08/11/2021, 5:57 PM
💪
Bertozzo

08/11/2021, 8:58 PM
Passing by to close the issue here, all set and running smoothly now ! Thanks again for your support @datajoely ! Take care !
datajoely

08/11/2021, 8:58 PM
Nice! Well done