#beginners-need-help
ende

09/27/2021, 11:06 PM
Hi. Am I correct in suspecting that the AbstractDataSet examples in https://kedro.readthedocs.io/en/stable/07_extend_kedro/03_custom_datasets.html are actually using code from AbstractVersionedDataSet?
11:14 PM
for example, at https://kedro.readthedocs.io/en/stable/07_extend_kedro/03_custom_datasets.html
from pathlib import PurePosixPath
from typing import Any, Dict

from kedro.io.core import (
    AbstractVersionedDataSet,
    get_filepath_str,
    get_protocol_and_path,
)

import fsspec
import numpy as np
from PIL import Image


class ImageDataSet(AbstractVersionedDataSet):
    """``ImageDataSet`` loads / save image data from a given filepath as `numpy` array using Pillow.

    Example:
    ::

        >>> ImageDataSet(filepath='/img/file/path.png')
    """

    def __init__(self, filepath: str):
        """Creates a new instance of ImageDataSet to load / save image data for given filepath.

        Args:
            filepath: The location of the image file to load / save data.
        """
        # parse the path and protocol (e.g. file, http, s3, etc.)
        protocol, path = get_protocol_and_path(filepath)
        self._protocol = protocol
        self._filepath = PurePosixPath(path)
        self._fs = fsspec.filesystem(self._protocol)

    def _load(self) -> np.ndarray:
        """Loads data from the image file.

        Returns:
            Data from the image file as a numpy array
        """
        # using get_filepath_str ensures that the protocol and path are appended correctly for different filesystems
        load_path = get_filepath_str(self._get_load_path(), self._protocol)
...
This is supposed to be just a complete example of AbstractDataSet.
11:15 PM
The main problem here seems to be that _get_load_path is not inherited from the AbstractDataSet base class.
datajoely

09/28/2021, 8:38 AM
Hi @User - I'm not sure I follow. This example demonstrates how to define a versioned dataset; if you're not interested in versioning, just inherit from AbstractDataSet.
ende

09/28/2021, 12:18 PM
In other words, I think there is a bug in the documentation. The examples for both AbstractDataSet and AbstractVersionedDataSet demonstrate code using AbstractVersionedDataSet.
12:21 PM
(In other words, the example for AbstractDataSet does not demonstrate AbstractDataSet.)
12:23 PM
This leads to a problem with this line: 'load_path = get_filepath_str(self._get_load_path(), self._protocol)', because _get_load_path() doesn't seem to be a method of the AbstractDataSet base class.
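A minimal sketch of the failure described in the message above. It uses simplified stand-in classes rather than Kedro's real ones: `BaseDataSet`, `VersionedDataSet`, `BrokenImageDataSet`, `WorkingImageDataSet` and `try_load` are all hypothetical names, and the sketch assumes (as the thread suggests) that `_get_load_path()` is provided only by the versioned base class.

```python
from pathlib import PurePosixPath


class BaseDataSet:
    """Stand-in for a non-versioned base class: it has no _get_load_path()."""

    def __init__(self, filepath: str):
        self._filepath = PurePosixPath(filepath)

    def load(self) -> str:
        return self._load()


class VersionedDataSet(BaseDataSet):
    """Stand-in for a versioned base class that adds _get_load_path()."""

    def _get_load_path(self) -> PurePosixPath:
        return self._filepath


class BrokenImageDataSet(BaseDataSet):
    """Uses the versioned example's _load() but inherits from the plain base."""

    def _load(self) -> str:
        # Fails: the plain base class does not define _get_load_path().
        return str(self._get_load_path())


class WorkingImageDataSet(VersionedDataSet):
    """Same _load(), but the versioned base supplies _get_load_path()."""

    def _load(self) -> str:
        return str(self._get_load_path())


def try_load(dataset: BaseDataSet) -> str:
    """Return the loaded value, or a description of the AttributeError."""
    try:
        return dataset.load()
    except AttributeError as exc:
        return f"AttributeError: {exc}"
```

Calling `try_load(BrokenImageDataSet('/img/file/path.png'))` reports an `AttributeError`, while the same `_load()` body works when the class inherits from the versioned stand-in.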
datajoely

09/28/2021, 12:34 PM
ah I'm with you
12:34 PM
I'll log a ticket
12:35 PM
which class is using AbstractDataSet?
12:35 PM
I can only see AbstractVersionedDataSet on that page?
ende

09/28/2021, 6:34 PM
That's the issue: none of the examples are using AbstractDataSet (but presumably the first example should be).
6:35 PM
I imagine what happened is that someone had code for the AbstractVersionedDataSet example and copied and modified it for the AbstractDataSet section.
datajoely

09/29/2021, 8:06 AM
Thanks - I understand now!
ende

09/29/2021, 3:58 PM
Feel free to drop a link to the ticket if you want me to elaborate on anything. I managed to put together a simple AbstractDataSet example from some other example code.
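For readers looking for the non-versioned shape, here is a rough sketch of what such a dataset could look like. This is not the example mentioned in the message above, which is not shown in this thread. `TextFileDataSet` is a hypothetical name, a text file is used instead of an image to keep the sketch dependency-free, and the fallback base class is only an assumption so that the code runs without Kedro installed. The key difference from the documentation snippet is that `_load` uses the stored path directly instead of calling `_get_load_path()`.

```python
from pathlib import Path
from typing import Any, Dict

try:
    # Same module the documentation snippet quoted earlier imports from.
    from kedro.io.core import AbstractDataSet
except ImportError:
    # Assumption: a tiny stand-in so the sketch runs without Kedro
    # (or with a Kedro version that no longer exposes this name).
    class AbstractDataSet:
        def load(self) -> Any:
            return self._load()

        def save(self, data: Any) -> None:
            self._save(data)


class TextFileDataSet(AbstractDataSet):
    """Hypothetical non-versioned dataset that reads / writes a text file."""

    def __init__(self, filepath: str):
        self._filepath = Path(filepath)

    def _load(self) -> str:
        # Use the stored path directly; no _get_load_path() is needed.
        return self._filepath.read_text(encoding="utf-8")

    def _save(self, data: str) -> None:
        self._filepath.write_text(data, encoding="utf-8")

    def _describe(self) -> Dict[str, Any]:
        # Attributes that describe this dataset instance.
        return dict(filepath=str(self._filepath))
```

A remote-capable version would presumably keep the `fsspec` and `get_protocol_and_path` handling from the documentation example and pass `self._filepath` to `get_filepath_str` in place of `self._get_load_path()`.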
datajoely

09/29/2021, 3:58 PM
Great - it's on our backlog
3:58 PM
good luck!
3:58 PM
apologies for the confusion
ende

09/29/2021, 5:35 PM
No worries at all. Just trying to contribute back 🙂