
Goal: efficiently do something for each element of a list and then get the original list back, so that I can do something else with it. For example, let lst be a very large list, and suppose we apply many operations to it before the foreach. What I want to write is something like this:

    lst.many_operations().foreach(x => f(x)).something_else()

However, foreach returns Unit. I am looking for a way to iterate through the list and get back the list that was supplied, so that I can call something_else() on it. To reduce the memory impact, I need to avoid saving the result of lst.many_operations() to a variable.

An obvious, but imperfect, solution is to replace foreach with map. Then the code looks like:

    lst
      .many_operations()
      .map { x =>
        f(x)
        x
      }
      .something_else()

However, this is not good because map constructs a new list, effectively duplicating the very large list that it iterated through.

What is the right way to do this in Scala?

  • "To reduce the memory impact, I need to avoid saving the result of lst.many_operations() to a variable" - that doesn't make any sense; saving it to a variable is the answer you are looking for. If you want to be fancy and you are on 2.13, you may use tap, but really it is the same. Also, if you are very concerned about memory usage: a) you are probably wrong, but b) use a view or iterator or any other lazy collection (although, again, you may actually make things slower). Commented Sep 26, 2022 at 15:26
  • @LuisMiguelMejíaSuárez pipe/tap seem to be what I was looking for. Thank you Commented Sep 26, 2022 at 15:38
  • "To reduce the memory impact, I need to avoid saving the result of lst.many_operations() to a variable." - agreeing with Luis Miguel Mejía Suárez: this sentence makes no sense. You're trying to optimize away something entirely negligible and irrelevant. Before "optimizing" anything further, please make sure that you remember how reference types are treated on the JVM. Your list is certainly not a stack-allocated array or anything like that. Commented Sep 26, 2022 at 15:51
  • @AndreyTyukin Assume lst contains a billion or so items. If I save to a variable after each composition which changes the list, then I effectively have several lists, each of which is many GB in size, available as variables in the local scope. If I don't save them as variables, then they are not stored in memory, except where they are used, right? In this case, I need to be very memory cognizant Commented Sep 26, 2022 at 15:55
  • "If I don't save them to the list, they are not available in memory, right?" Wrong: the objects will always exist in memory whether or not you save them to a variable. Each operation on a List returns a new List, hence the recommendation to look at a lazy collection if you really want to avoid such waste. Or, if you are really at the scale of billions, consider a streaming solution like fs2, Akka Streams, Monix or ZIO. Commented Sep 26, 2022 at 16:07
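
The tap/pipe suggestion from the comments could look roughly like the sketch below. It assumes Scala 2.13 or later; manyOperations, somethingElse, and run are hypothetical stand-ins for the question's many_operations, something_else, and the overall chain. As the comments point out, this only avoids naming the intermediate list; the list is still fully built in memory.

```scala
import scala.util.chaining._ // Scala 2.13+: adds tap and pipe to every value

object TapSketch {
  // Hypothetical stand-ins for the question's many_operations and something_else.
  def manyOperations(xs: List[Int]): List[Int] = xs.filter(_ % 2 == 0).map(_ * 10)
  def somethingElse(xs: List[Int]): Int = xs.sum

  // f is the per-element side effect from the question.
  def run(lst: List[Int], f: Int => Unit): Int =
    manyOperations(lst)
      .tap(_.foreach(f))   // run the side effect, then return the same list instance
      .pipe(somethingElse) // hand that list to the next step
}
```

Here tap plays the role of a foreach that gives the list back, so the chain can continue without an intermediate val.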

1 Answer


The simplest way seems to be:

    lst.foreach(many_operations)
    lst.foreach(something_else)

However, relying on side effects is really not a good idea. I would urge you to revisit your design and use explicit pure transformations rather than side effects and mutation.

To address your concern about having multiple lists in memory at the same time, you can use view or iterator to emulate streaming processing, and discard intermediate results you do not need to use again:

    val newList = lst.iterator
      .map(foo)
      .map(bar)
      .map(baz)
      .toList

(lst will get garbage collected if you do not reference it again).
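
As a rough illustration of the difference, the sketch below contrasts a strict chain with view and iterator versions. The bodies of foo, bar, and baz are hypothetical placeholders, and viaIterator shows one possible way to thread a per-element side effect f through the same pass; none of this is taken from the original code.

```scala
object LazyPipelineSketch {
  // Hypothetical element-wise steps standing in for foo, bar, and baz.
  def foo(x: Int): Int = x + 1
  def bar(x: Int): Int = x * 2
  def baz(x: Int): Int = x - 3

  // Strict: every map allocates a complete intermediate List.
  def strict(lst: List[Int]): List[Int] =
    lst.map(foo).map(bar).map(baz)

  // Lazy view: the three steps are composed and only the final List is built.
  def viaView(lst: List[Int]): List[Int] =
    lst.view.map(foo).map(bar).map(baz).toList

  // Iterator: same idea, with a side effect f applied to each element
  // between steps, so no separate traversal is needed.
  def viaIterator(lst: List[Int], f: Int => Unit): List[Int] =
    lst.iterator
      .map(foo)
      .map { x => f(x); x } // side effect, element passed through unchanged
      .map(bar)
      .map(baz)
      .toList
}
```

All three produce the same result; the lazy versions differ only in that each element flows through every step before the next element is touched, so no intermediate lists are allocated.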


1 Comment

"lst will get garbage collected if you do not reference it again" - This is the answer I needed. The garbage collector reclaims the list once it is no longer referenced, so it is not kept in memory until the end of the scope. Also, this answer helps me understand @LuisMiguelMejíaSuárez's suggestion regarding view and iterator. Thank you. Answer accepted.
