I have a list fetched from a database:
[{
'name': 'John',
'score': 30
}, {
'name': 'Jan',
'score': 23
}, {
'name': 'Mike',
'score': 34
}]
Can NumPy compute the sum of the scores (without looping through the items one by one using for ... in)?
You can do this by performing a sum on a list comprehension that collects all the "scores":
sum( [x['score'] for x in MyListOfDictionaries] )
(PS. Numpy is not necessary here)
Edit: as pointed out by @sebastian in the comments, the brackets around the list comprehension aren't necessary since we're plugging this directly into a function, i.e.:
sum(x['score'] for x in MyListOfDictionaries)
This is known as a generator expression. It can be more efficient because it avoids allocating memory for an intermediate list before summing.
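As a minimal sketch of the two forms above, using the data from the question (the variable name `records` is only illustrative, standing in for MyListOfDictionaries):

```python
# Sample data copied from the question; `records` is an illustrative name.
records = [
    {'name': 'John', 'score': 30},
    {'name': 'Jan', 'score': 23},
    {'name': 'Mike', 'score': 34},
]

# List comprehension: builds a temporary list of scores, then sums it.
total_from_list = sum([r['score'] for r in records])

# Generator expression: hands the scores to sum() one at a time,
# so no intermediate list is created.
total_from_gen = sum(r['score'] for r in records)

print(total_from_list, total_from_gen)  # 87 87
```

Both forms still visit every dictionary once; the difference is only whether the scores are collected into a list first.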
[] brackets though - you can call sum on a generator as well, avoiding the creation of a temporary list containing all the scores.
In [1963]: ll=[{
...: 'name': 'John',
...: 'score': 30
...: }, {
...
...: }]
First, the obvious iterative solution:
In [1965]: sum([d['score'] for d in ll])
Out[1965]: 87
I can turn it into an object array with:
In [1966]: np.array(ll)
Out[1966]:
array([{'score': 30, 'name': 'John'}, {'score': 23, 'name': 'Jan'},
{'score': 34, 'name': 'Mike'}], dtype=object)
Applying sum directly to that array won't help, but this works:
In [1967]: from operator import itemgetter
In [1970]: np.frompyfunc(itemgetter('score'),1,1)(ll).sum()
Out[1970]: 87
See my recent answer https://stackoverflow.com/a/38936480/901925 for more on how to access attributes of objects in an array.
frompyfunc doesn't really get rid of iteration - it just wraps it in a user friendly manner. And the itemgetter is still doing item['score'] for each dictionary in the list.
This use of itemgetter is basically the same as:
In [1974]: list(map(itemgetter('score'), ll))
Out[1974]: [30, 23, 34]
List comprehension, map, frompyfunc are all ways of iterating through the list and getting the score value from each dictionary.
pandas may be able to turn this whole list into a dataframe, but don't be fooled by its ease of use - it's doing all of this, and more, under the covers.
NumPy is a library for processing numerical arrays. If you specifically want NumPy and its performance, use numeric column indices instead of names, convert your collection to a matrix, and do the calculations with NumPy.
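A rough sketch of that conversion, assuming the data from the question (the names `records` and `scores` are illustrative, not from the original). Note that building the array still iterates over the list once; only the summation itself runs inside NumPy:

```python
import numpy as np

# Sample data copied from the question; `records` is an illustrative name.
records = [
    {'name': 'John', 'score': 30},
    {'name': 'Jan', 'score': 23},
    {'name': 'Mike', 'score': 34},
]

# Pull the numeric field into a 1-D integer array.
# This step is still a Python-level loop over the dictionaries.
scores = np.fromiter((r['score'] for r in records), dtype=np.int64)

# The reduction itself is done by NumPy on the numeric array.
total = scores.sum()

print(total)  # 87
```

For three records this is slower than the built-in sum; it only starts to pay off if the array is reused for many numerical operations.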
I suggest you try the pandas library: it has a DataFrame type, which was created to hold and process collections like yours (like data frames in R or tables in MATLAB), i.e. tables with rows and columns. It has a sum method, which solves your problem.
I'd guess this isn't the only thing you want to do with your data, and that speed matters, so I'd recommend using this library.
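A minimal sketch of the pandas route, assuming pandas is installed and using the data from the question (`records` and `df` are illustrative names):

```python
import pandas as pd

# Sample data copied from the question; `records` is an illustrative name.
records = [
    {'name': 'John', 'score': 30},
    {'name': 'Jan', 'score': 23},
    {'name': 'Mike', 'score': 34},
]

# A list of dictionaries becomes a table: one row per dict, one column per key.
df = pd.DataFrame(records)

# Sum the 'score' column.
total = df['score'].sum()

print(total)  # 87
```

Constructing the DataFrame also walks the list internally, so this is a convenience for further analysis rather than a way to avoid iteration altogether.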
score, and I did mention without loop through 1 by 1. like numpy.sum(list)