# beginners-need-help
e
Hi. Am I correct in suspecting that the `AbstractDataSet` examples in https://kedro.readthedocs.io/en/stable/07_extend_kedro/03_custom_datasets.html are actually using code from the `AbstractVersionedDataSet`?
For example, at https://kedro.readthedocs.io/en/stable/07_extend_kedro/03_custom_datasets.html
from pathlib import PurePosixPath
from typing import Any, Dict

from kedro.io.core import (
    AbstractVersionedDataSet,
    get_filepath_str,
    get_protocol_and_path,
)

import fsspec
import numpy as np
from PIL import Image


class ImageDataSet(AbstractVersionedDataSet):
    """``ImageDataSet`` loads / save image data from a given filepath as `numpy` array using Pillow.

    Example:
    ::

        >>> ImageDataSet(filepath='/img/file/path.png')
    """

    def __init__(self, filepath: str):
        """Creates a new instance of ImageDataSet to load / save image data for given filepath.

        Args:
            filepath: The location of the image file to load / save data.
        """
        # parse the path and protocol (e.g. file, http, s3, etc.)
        protocol, path = get_protocol_and_path(filepath)
        self._protocol = protocol
        self._filepath = PurePosixPath(path)
        self._fs = fsspec.filesystem(self._protocol)

    def _load(self) -> np.ndarray:
        """Loads data from the image file.

        Returns:
            Data from the image file as a numpy array
        """
        # using get_filepath_str ensures that the protocol and path are appended correctly for different filesystems
        load_path = get_filepath_str(self._get_load_path(), self._protocol)
...
This is supposed to just be a complete example of `AbstractDataSet`.
The main problem here seems to be that `_get_load_path` is not inherited from the `AbstractDataSet` base class.
d
Hi @User - I'm not sure I follow. This example demonstrates how to define a versioned dataset; if you're not interested in versioning, then just inherit from `AbstractDataSet`.
e
iow I think there is a bug in the documentation. The examples for both `AbstractDataSet` and `AbstractVersionedDataSet` demonstrate code using `AbstractVersionedDataSet`.
(iow the example for `AbstractDataSet` does not demonstrate `AbstractDataSet`)
This leads to a problem with this line: `load_path = get_filepath_str(self._get_load_path(), self._protocol)`, because `_get_load_path()` doesn't seem to be a method of the `AbstractDataSet` base class.
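To make the failure mode concrete, here is a rough sketch using plain Python stand-ins. `BaseDataSet` and `VersionedBaseDataSet` are hypothetical simplifications, not Kedro's real classes. They only mirror the relationship described above: `_get_load_path` lives on the versioned class, not on the plain base. `BrokenDataSet`, `FixedDataSet` and `try_load` are likewise made up for illustration.

```python
from pathlib import PurePosixPath


class BaseDataSet:
    """Hypothetical stand-in for AbstractDataSet: no versioning helpers."""

    def load(self):
        return self._load()


class VersionedBaseDataSet(BaseDataSet):
    """Hypothetical stand-in for AbstractVersionedDataSet."""

    def _get_load_path(self) -> PurePosixPath:
        # The real class resolves a versioned path; this stand-in
        # simply returns the stored path.
        return self._filepath


class BrokenDataSet(BaseDataSet):
    """Mirrors the documented example: plain base, versioned helper call."""

    def __init__(self, filepath: str):
        self._filepath = PurePosixPath(filepath)

    def _load(self) -> str:
        # Fails: _get_load_path is not defined on BaseDataSet.
        return str(self._get_load_path())


class FixedDataSet(BaseDataSet):
    """Uses the stored filepath directly, so no versioned helper is needed."""

    def __init__(self, filepath: str):
        self._filepath = PurePosixPath(filepath)

    def _load(self) -> str:
        return str(self._filepath)


def try_load(dataset) -> str:
    """Return the loaded value, or a short description of the AttributeError."""
    try:
        return dataset.load()
    except AttributeError as exc:
        return f"AttributeError: {exc}"
```

Calling `try_load(BrokenDataSet('/img/file/path.png'))` reports an `AttributeError`, while `FixedDataSet` loads fine. In the real example the analogous change would presumably be to pass `self._filepath` to `get_filepath_str` instead of `self._get_load_path()`, though I haven't checked that against the Kedro source.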
d
ah I'm with you
I'll log a ticket
which class is using `AbstractDataSet`?
I can only see `AbstractVersionedDataSet` on that page?
e
That's the issue, none of the examples are using `AbstractDataSet` (but presumably the first example should be).
I imagine what happened is someone had code for the `AbstractVersionedDataSet` example case and copy+modified that for the `AbstractDataSet` section.
d
Thanks - I understand now!
e
Feel free to drop a link to the ticket if you want me to elaborate on anything. I managed to put together a simple AbstractDataSet example from some other example code.
d
Great - it's on our backlog
good luck!
apologies for the confusion
e
No worries at all. Just trying to contribute back 🙂