Goal: Efficiently do something for each element in a list, and then return the original list, so that I can do something else with the original list. For example, let lst be a very large list, and suppose we apply many operations to it before applying our foreach. What I want to do is something like this:
lst.many_operations().foreach(x => f(x)).something_else()
However, foreach returns Unit. I seek a way to iterate through the list and return the original list supplied, so that I can call something_else() on it. To reduce the memory impact, I need to avoid saving the result of lst.many_operations() to a variable.
An obvious, but imperfect, solution is to replace foreach with map. Then the code looks like:
lst
  .many_operations()
  .map(x => {
    f(x)
    x
  }).something_else()
However, this is not good because map constructs a new list, effectively duplicating the very large list that it iterated through.
What is the right way to do this in Scala?
Comment: In Scala 2.13 you may use tap, but really it is the same. Also, if you are very concerned about memory usage: a) you are probably wrong, but b) use a view or iterator or any other lazy collection (although, again, you may actually make things slower).
Comment: lst contains a billion or so items. If I save to a variable after each composition which changes the list, then I effectively have several lists, each of which is many GB in size, available as variables in the local scope. If I don't save them as variables, then they are not stored in memory, except where they are used, right? In this case, I need to be very memory cognizant.
Comment: Every transformation of a List returns a new List, hence the recommendation of looking for a lazy collection if you really want to avoid such waste. Or, if you are really at the scale of billions, consider looking into a streaming solution like fs2, AkkaStreams, Monix or ZIO.
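To make the two suggestions from the comments concrete, here is a minimal sketch, assuming Scala 2.13 or later. The object name ForeachAndKeep, the helper names withTap and withIterator, the element type Int, and the use of map(_ * 2) and sum as stand-ins for many_operations() and something_else() are all hypothetical and chosen only for illustration.

```scala
import scala.util.chaining._ // Scala 2.13+: adds tap (and pipe) to every value

object ForeachAndKeep {
  // Strict variant: tap runs the side effect and hands back the very same
  // list instance, so no extra copy is made for the foreach step. The list
  // produced by map(_ * 2) is still fully materialized, though.
  def withTap(lst: List[Int], f: Int => Unit): Int =
    lst.map(_ * 2).tap(_.foreach(f)).sum

  // Lazy variant: an Iterator pushes one element at a time through every
  // stage, so no intermediate collection is built at all. An iterator can
  // only be traversed once.
  def withIterator(lst: List[Int], f: Int => Unit): Int =
    lst.iterator.map(_ * 2).map { x => f(x); x }.sum
}
```

Both variants call f once per element, in order, and then continue the chain. Which one is faster or lighter on memory depends on the workload, so it is worth measuring before committing to either.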