Efficient way to flatten 1D string numpy array to 1D float numpy array

Question

I'd like to ask for a better way to use numpy to convert/flatten the 1D string numpy array below into a 1D or 2D float array. I have to store lists of float values in the columns of a pandas dataframe in order to avoid the awkward array. However, it is painful to read those values back out, because pandas treats each float list as a single string. Here is the format I usually get when I call df.branchName:

array(['[110.762924, 176.10782, 97.453545, 47.24211, 9.123961, 49.076572, 9.892334, 155.76273]',
       '[54.983498, 42.953392, 26.73925, 20.285473, 5.817261, 10.84536, 7.2550445, 9.386389]',
       '[65.68088, 131.3692, 142.83168, 75.19385, 59.589417, -5.885845]',
       '[99.765884, 123.900116, 151.18433, 137.31078, 13.298813, 18.483736, 8.851394, 24.93825]',
       '[66.62968, 71.72392, 71.836624, 59.481956, 61.341305]',
       '[66.629616, 71.72373, 71.8364, 59.48184, 61.34116]',
       '[2.667129, 28.940117, 58.59804, 89.9932, 8.460876, 2.7282248, 36.31937, 63.39166]'])

I expect to get

array([110.762924, 176.10782, 97.453545, 47.24211, 9.123961, 49.076572, 9.892334, 155.76273, 54.983498, 42.953392, 26.73925, 20.285473, 5.817261, 10.84536, 7.2550445, 9.386389, ....])

Many thanks in advance.

It looks like you had a pandas dataframe with lists as column values and then wrote it to a csv, and this is what you get when reading the csv back. Do you see the quoted lists in the csv? That isn't a good way of storing a dataframe. You can convert those strings to lists with eval (one string at a time), but overall the approach is messy.
– hpaulj, Commented May 9, 2020 at 3:04
Hi hpaulj, as I mentioned, I saved the float lists into my dataframe, but each one is treated as a single string when I read it back from the csv file. I know it is not a good way, but I haven't figured out a better one. Is there a way in pandas to keep the data type consistent between saving and reading?
– Zhen Yan, Commented May 9, 2020 at 4:11
A csv can only save/read a 2D table. Your structure is more complex than that.
– hpaulj, Commented May 9, 2020 at 4:47
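One possible way to get the lists back when reading the csv is to parse the column during loading. The sketch below assumes the column is named branchName (as in the question) and that every cell holds a valid Python list literal; the in-memory buffer stands in for a real file and the sample values are taken from the question.

```python
import ast
import io

import pandas as pd

# Small stand-in for the original dataframe: one column of float lists.
df = pd.DataFrame({'branchName': [[66.62968, 71.72392], [2.667129]]})

# Write to an in-memory buffer instead of a real file (illustration only).
buf = io.StringIO()
df.to_csv(buf, index=False)
buf.seek(0)

# The converter runs on each cell of the named column while reading,
# turning the text "[66.62968, 71.72392]" back into a Python list.
restored = pd.read_csv(buf, converters={'branchName': ast.literal_eval})

print(type(restored['branchName'][0]))
```

This keeps the csv format but moves the string parsing into the read step. A format that stores nested values natively may be a cleaner choice if the csv is not a hard requirement.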
Since the strings are valid string representations of lists, eval can convert them to lists, which can then be joined into one array: np.hstack([eval(a) for a in arr]). Sometimes a column holds arrays, which display without the commas (and possibly with '...'); those require a bit of editing.
– hpaulj, Commented May 9, 2020 at 6:27
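A minimal sketch of the same idea using ast.literal_eval in place of eval, since literal_eval only accepts Python literals and will not execute arbitrary expressions. The name arr and the shortened sample rows are assumptions for illustration; the values come from the question.

```python
import ast

import numpy as np

# Shortened sample in the same format as the array in the question.
arr = np.array([
    '[110.762924, 176.10782, 97.453545]',
    '[65.68088, 131.3692]',
])

# Parse each string into a Python list of floats.
rows = [ast.literal_eval(str(s)) for s in arr]

# Join the variable-length lists into one flat float array.
flat = np.concatenate(rows)

print(flat.shape)
```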

Ehsan · Accepted Answer · 2020-05-09 03:58:06Z

I agree with @hpaulj that there must be a more standard way to achieve what you are trying to do (maybe post a new question explaining the issue). But in case you insist on doing it this way, here is a solution:

np.hstack([[float(st) for st in item.strip('[] ').split(',')] for item in a])

Note that you cannot stack them vertically, as the lists have different lengths.

output:

[110.762924  176.10782    97.453545   47.24211     9.123961   49.076572
   9.892334  155.76273    54.983498   42.953392   26.73925    20.285473
   5.817261   10.84536     7.2550445   9.386389   65.68088   131.3692
 142.83168    75.19385    59.589417   -5.885845   99.765884  123.900116
 151.18433   137.31078    13.298813   18.483736    8.851394   24.93825
  66.62968    71.72392    71.836624   59.481956   61.341305   66.629616
  71.72373    71.8364     59.48184    61.34116     2.667129   28.940117
  58.59804    89.9932      8.460876    2.7282248  36.31937    63.39166  ]
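If a 2D result is wanted despite the unequal lengths, one option is to pad the shorter rows. The sketch below is only an illustration under that assumption: it fills the missing positions with NaN, and the names a, rows and padded are chosen here rather than taken from the answer.

```python
import numpy as np

# Shortened sample in the same format as the array in the question.
a = np.array([
    '[66.62968, 71.72392, 71.836624]',
    '[2.667129, 28.940117]',
])

# Parse each string into its own list of floats.
rows = [[float(x) for x in s.strip('[] ').split(',')] for s in a]

# Allocate a NaN-filled 2D array wide enough for the longest row,
# then copy each row into the left part of its line.
width = max(len(r) for r in rows)
padded = np.full((len(rows), width), np.nan)
for i, r in enumerate(rows):
    padded[i, :len(r)] = r

print(padded.shape)
```

NaN-aware functions such as np.nanmean can then skip the padding when reducing along a row.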
