error: float() argument must be a string or a number, not 'SAMPLES'

Question

    from random import random
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import LeaveOneOut
    from sklearn.ensemble import RandomForestClassifier
    import numpy as np
    
    numberOfTest = 8
    numberOfFeature = 5
    numberOfSamplePerEachFeature = 450
    
    # Create a 3D list (test x feature x sample) to store the data used for training and testing.
    # It holds only the feature values; the target values are stored separately.

    dataForLearning = [[[0.0 for i in range(numberOfSamplePerEachFeature)] for j in range(numberOfFeature)] for k in range(numberOfTest)]

    # Create a list to store the target values used for training and testing.

    targetValue = [0.0 for i in range(numberOfTest)]
    
    # Here the data set is initialized ===============================
    
    for i in range(numberOfTest):
        for j in range(numberOfFeature):
            for k in range(numberOfSamplePerEachFeature):
                dataForLearning[i][j][k] = random()
    
    for i in range(numberOfTest):
        targetValue[i] = random()
    
    # =========================================================
    
    
    class SAMPLES:
        def __init__(self, k=0, value=0.0):
            self.object = np.zeros(numberOfSamplePerEachFeature)
            self.object[k] = value
    
        def __repr__(self):
            return repr(self.object)  # __repr__ must return a str, not an ndarray
    
        
    # To convert the 3D list to a 2D list, the third dimension is wrapped in an object,
    # and each object holds an array of data. This was done because the random forest
    # classifier raised an error on 3D input.

    temp = [[SAMPLES() for i in range(numberOfFeature)] for j in range(numberOfTest)]
    for i in range(numberOfTest):
        for j in range(numberOfFeature):
            for k in range(numberOfSamplePerEachFeature):
                temp[i][j].object[k] = dataForLearning[i][j][k]
    X = np.array(temp)
    
    Y = np.array(targetValue)
    Y_pred = np.zeros(len(targetValue))
    
    oneOfAll = LeaveOneOut()
    oneOfAll.get_n_splits(X)
    for train_index, test_index in oneOfAll.split(X):
        X_train, X_test = X[train_index], X[test_index]
        Y_train, Y_test = Y[train_index], Y[test_index]
    
        # define the model
        model = RandomForestClassifier()
        # fit the model on the training fold (all samples except the held-out one)
        model.fit(X_train, Y_train)
        Y_pred[test_index] = model.predict(X_test)
    
    print(mean_squared_error(Y, Y_pred))

Full error message:

    Traceback (most recent call last):
      File "G:\machineLearning.py", line 60, in <module>
        model.fit(X_train, Y_train)
      File "C:\Users\mehran\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\ensemble\_forest.py", line 327, in fit
        X, y = self._validate_data(
      File "C:\Users\mehran\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\base.py", line 581, in _validate_data
        X, y = check_X_y(X, y, **check_params)
      File "C:\Users\mehran\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\utils\validation.py", line 964, in check_X_y
        X = check_array(
      File "C:\Users\mehran\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\utils\validation.py", line 746, in check_array
        array = np.asarray(array, order=order, dtype=dtype)
    TypeError: float() argument must be a string or a number, not 'SAMPLES'

@ex4, thanks for your comment. How can I do it? (My IDE is PyCharm.)
– mehran, Commented Jun 3, 2022 at 8:56
Just run your code and copy and paste the full error message, not just that summary line. I don't know about PyCharm, but you can run your code in a console.
– ex4, Commented Jun 3, 2022 at 8:59

Iguananaut · Accepted Answer · 2022-06-03 09:37:40Z

2

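You are passing a (nested) list of your SAMPLES objects to np.array, but NumPy has no idea how to convert an arbitrary object to a number, so it builds an array with dtype=object. When scikit-learn later tries to convert that array to floats, the conversion fails. What you're doing is equivalent to this shorter example: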

    >>> import numpy as np
    >>> class Sample:
    ...     pass
    ... 
    >>> samples = [Sample(), Sample()]
    >>> arr = np.array(samples)
    >>> arr
    array([<__main__.Sample object at 0x7f7c03812250>,
           <__main__.Sample object at 0x7f7c03812640>], dtype=object)
    >>> np.asarray(arr, dtype=float)  # np.float was an alias for float; it was removed in NumPy 1.24
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: float() argument must be a string or a number, not 'Sample'

I think you should modify your SAMPLES class (if it's even needed) to take the array it wraps as an argument. The triply nested for-loop that assigns data to the sample objects is also very inefficient.
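As a rough sketch of both suggestions, the data can be generated in a single NumPy call and a wrapper class can receive its array directly. The class name `Samples`, the attribute `values`, the snake_case variable names, and the seed are illustrative assumptions, not part of the original code:

```python
import numpy as np

number_of_tests = 8
number_of_features = 5
samples_per_feature = 450

# Seeded generator so the sketch is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(0)

# One call replaces the triply nested loop: shape (tests, features, samples).
data_for_learning = rng.random(
    (number_of_tests, number_of_features, samples_per_feature)
)
target_value = rng.random(number_of_tests)


class Samples:
    """Hypothetical wrapper that takes the array it wraps as an argument."""

    def __init__(self, values):
        self.values = np.asarray(values, dtype=float)

    def __repr__(self):
        # __repr__ has to return a str, not an ndarray.
        return f"Samples({self.values!r})"


wrapped = Samples(data_for_learning[0, 0])
print(data_for_learning.shape)  # (8, 5, 450)
print(wrapped.values.shape)     # (450,)
```

Note that wrapping the arrays this way still does not make them usable as scikit-learn input; the estimator needs a plain numeric 2-D array.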


6 Comments

mehran Over a year ago

Thanks for your answer. As I explained above, because each feature has a large number of samples per test, I use the class structure (object) to convert 3D train data to 2D.

Iguananaut Over a year ago

Your class isn't adding any value and won't help you here. Without knowing what problem you're trying to solve, it's hard to know what the best answer is, but you can reduce a 3-D array to 2-D simply by reshaping it: numpy.org/devdocs/user/…
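A minimal sketch of that reshape, assuming the same 8 x 5 x 450 layout as in the question (the random data and the seed are placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((8, 5, 450))  # (tests, features, samples)

# Flatten the last two axes: each test becomes one row of 5 * 450 values.
X = data.reshape(data.shape[0], -1)

print(X.shape)  # (8, 2250)
print(X.dtype)  # float64

# Row i holds feature 0's 450 samples, then feature 1's, and so on.
assert np.array_equal(X[0, :450], data[0, 0])
assert np.array_equal(X[0, 450:900], data[0, 1])
```

The result is an ordinary 2-D float array, which is the input shape scikit-learn estimators expect.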

mehran Over a year ago

That is, I will convert 5 features, each of which has 450 elements, into 450 x 5 features. These two modes do not make a difference in the complexity of the machine learning algorithm.
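For illustration, the flattened 450 x 5 layout described in this comment could be run through the leave-one-out loop roughly as follows. This is only a sketch under assumptions: the data is random, the seed and `n_estimators=20` are arbitrary, and `RandomForestRegressor` is swapped in for the question's `RandomForestClassifier` because the targets are continuous floats, which a classifier is not designed to handle.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(0)
X = rng.random((8, 5, 450)).reshape(8, -1)  # 8 rows, 5 * 450 = 2250 columns
y = rng.random(8)
y_pred = np.zeros_like(y)

for train_index, test_index in LeaveOneOut().split(X):
    # Fit on the seven training rows, then predict the single held-out row.
    model = RandomForestRegressor(n_estimators=20, random_state=0)
    model.fit(X[train_index], y[train_index])
    y_pred[test_index] = model.predict(X[test_index])

mse = mean_squared_error(y, y_pred)
print(mse)
```

With only 8 rows and 2250 columns, the resulting error says little about real performance; the point is only that the reshaped array passes input validation.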

mehran Over a year ago

The third dimension here is the values for each Feature obtained over time.

mehran Over a year ago

Do you think the recursive neural network can be suitable for this example? I know that the information I have given you about the issue may not be enough.
