I have multiple objects (part files) in an S3 bucket. I need to read them and concatenate them into a single NumPy array. I am using the code below:

def read_and_concat(bucket, key_list):
    length = len(key_list)
    for index, key in enumerate(key_list):
        s3_client.download_file(bucket, key, 'test.out')
        target_data = genfromtxt('test.out', delimiter=',')
        data_shape = target_data.shape
        data[index] = np.array(data_shape)
        data[index] = target_data
    result = np.concatenate([data[i] for i in range(length)])
    return result

This raises NameError: name 'data' is not defined. I guess I need to define data as a 2D NumPy array before using it in the line data[index] = np.array(data_shape), but I am not sure how.

Or is there anything else I am missing?

Please suggest.

1 Answer

I think data needs to be defined before you use it here. Assigning by index to a name that doesn't exist raises a NameError. I also don't think the extra step of creating the array is needed, because genfromtxt already returns an ndarray. You can append each result to a list and concatenate once at the end:

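import numpy as np
from numpy import genfromtxt

def read_and_concat(bucket, key_list):
    # Collect one ndarray per part file, then join them in a single call.
    data = []
    for key in key_list:
        # s3_client is assumed to be a boto3 S3 client created elsewhere.
        s3_client.download_file(bucket, key, 'test.out')
        data.append(genfromtxt('test.out', delimiter=','))
    return np.concatenate(data)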
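
The same list-then-concatenate pattern can be tried without S3 at all. The sketch below is only an illustration: concat_csv_parts, part_a and part_b are hypothetical names, and the byte strings stand in for what you would get from something like s3_client.get_object(Bucket=bucket, Key=key)['Body'].read(). It also assumes every part has the same number of columns and more than one column, since a part with a single row comes back from genfromtxt as a 1-D array and has to be made 2-D before concatenating.

```python
import io

import numpy as np


def concat_csv_parts(parts):
    """Parse each CSV payload (bytes) and stack the rows into one array.

    Hypothetical helper: `parts` is any iterable of bytes objects,
    each holding comma-separated numeric rows.
    """
    arrays = []
    for raw in parts:
        # genfromtxt accepts file-like objects, so no temp file is needed.
        arr = np.genfromtxt(io.BytesIO(raw), delimiter=',')
        # A single-row part parses as 1-D; promote it to shape (1, n_cols).
        # Assumption: parts have more than one column.
        arrays.append(np.atleast_2d(arr))
    return np.concatenate(arrays)


# Stand-ins for two part files (made-up data).
part_a = b"1,2,3\n4,5,6\n"
part_b = b"7,8,9\n"

combined = concat_csv_parts([part_a, part_b])
print(combined.shape)  # (3, 3)
```

Reading into memory like this avoids overwriting test.out on every iteration, but whether it is preferable to download_file depends on how large the part files are.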