3
\$\begingroup\$

I'm writing a function that validates array-like container to have correct type, length or to be hashable. Some of conditions may not be used for validation.

def check_container(container, type_=None, length=None, hashable=None):
    """
    Validates container to have correct type and length, or hashable parameter.

    Arguments:
        container {iterable} -- instance to be validated

    Keyword Arguments:
        type_ {type/tuple of types} -- valid type/type
        length {int} -- valid length
        hashable {bool} -- if container should be hashable

    Returns:
        bool -- result of verification
    """
    valid = True
    if type_ is not None:
        valid = valid and isinstance(container, type_)

    if length is not None:
        valid = valid and len(container) == length

    if hashable is not None:
        valid = valid and isinstance(container, collections.Hashable)

    return valid

I confused by the style I collect all booleans together: valid = valid and other_condition. It doesn't look "pythonic" for me. Any thoughts on how it can be writed more compact, readable and "pythonic way"?

Update

Usage of check_container function in code:

class one:

class BoxToMultiDiscrete:

    def __init__(self, bounds, n_bins):

        if not check_container(bounds, tuple):
            raise TypeError("bounds should be a tuple")

        if not check_container(n_bins, tuple, len(bounds[0])):
            raise TypeError("n_bins should be a tuple of the same length as boundaries")

        low = list(bounds[0])
        high = list(bounds[1])

        self.bins_list = []
        for i in range(len(n_bins)):
            bins = np.linspace(low[i], high[i], n_bins[i] - 1)
            self.bins_list.append(bins)

    def transform(self, input_vector):

        if not check_container(input_vector, tuple, len(self.bins_list)):
            raise TypeError("input_vector should be a tuple of the same length as n_bins")

        output_vector = tuple()
        for i in range(len(input_vector)):
            digit = np.digitize(input_vector[i], self.bins_list[i])
            output_vector += (int(digit),)
        return output_vector

class two:

class MultiDiscreteToDiscrete:
    def __init__(self, n_bins):

        if not check_container(n_bins, tuple):
            raise TypeError("n_bins should be a tuple")

        self.n_bins = n_bins
        self.space_size = np.product(n_bins)

        temp = (1,) + n_bins[:-1]
        self.cumprod = np.cumprod(temp)

    def transform(self, input_vector):

        if not check_container(input_vector, tuple, len(self.n_bins)):
            raise TypeError("input_vector should be a tuple of the same length as n_bins")

        return self._convert(input_vector)

    def _convert(self, input_vector):
        num = 0
        for i, bin in enumerate(input_vector):
            num += bin * self.cumprod[i]
        return num

P.S. Maybe you can give some feedback about classes too.

\$\endgroup\$
4
  • 1
    \$\begingroup\$ Note that type is a built-in function. Best practice is to avoid shadowing built-in names. \$\endgroup\$ Commented Apr 4, 2022 at 20:59
  • 1
    \$\begingroup\$ Won't isinstance({}, collections.Hashable) always evaluate to False? \$\endgroup\$ Commented Apr 4, 2022 at 21:30
  • 2
    \$\begingroup\$ What would be the use case of that function? I would refactor the function into oblivion and perform the three respective checks on-site. \$\endgroup\$ Commented Apr 4, 2022 at 21:33
  • \$\begingroup\$ @RichardNeumann I have multiple places in my code where I need to check containers. How to avoid duplicated code without moving it into separate function? \$\endgroup\$ Commented Apr 4, 2022 at 21:54

2 Answers 2

4
\$\begingroup\$

By the Single Responsibility Principle, each function should do one thing, and the name of the function should reflect its purpose. Your function does multiple things, which is the underlying reason for the code being awkward. As a result, the code that calls check_container() is also awkward, since it can no longer give one specific reason why the validation failed.

It performs zero, one, or two checks for the type of container. In modern Python, types of parameters should be declared using type annotations. You can either do static analysis using mypy, or runtime validation using enforce.

I'm not sure why you insist that certain containers be tuples. Is it important that it be immutable? Or hashable? If it's not actually important that they be tuples, consider dropping that requirement in favour of duck typing.

With the type checks taken care of by other means, all you have left is a dimension check. So, write a function for that, or maybe you can just write that check inline instead.

\$\endgroup\$
1
\$\begingroup\$

I think a checking function like this should return as soon as the result is known. E.g.

if type_ is not None and not isinstance(container, type_):
    return False
if length is not None and len(container) != length:
    return False
if hashable is not None and not isinstance(container, collections.Hashable):
    return False
return True

This strikes me as more readable than maintaining the valid variable throughout the function.

We could replace the whole thing with a single condition:

return ((type_ is None or isinstance(container, type_) and
        (length is None or len(container) == length) and
        (hashable is None or isinstance(container, collections.Hashable))

I'd probably use an optional boolean for hashable:

hashable is None or hashable == isinstance(container, collections.Hashable)
\$\endgroup\$

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.