45

With Scala, what is the best way to read from an InputStream into a byte array?

I can see that you can convert an InputStream to a char array:

Source.fromInputStream(is).toArray()

12 Answers

49

How about:

Stream.continually(is.read).takeWhile(_ != -1).map(_.toByte).toArray

Update: use LazyList instead of Stream (since Stream is deprecated in Scala 3)

LazyList.continually(is.read).takeWhile(_ != -1).map(_.toByte).toArray

7 Comments

Could you explain the difference between this and the variant in the question?
@Jus12 I was looking for a byte array. What I have in the question is a way to obtain the char array.
Won't that create a huge linked list, then convert it to an array? That doesn't look very efficient, in time or memory.
It looks like this does not create a linked list after all. Stream.continually produces an iterator, and takeWhile and map seem to convert iterators to iterators. E.g. evaluating Array(1, 2, 3, 4, -1).iterator.takeWhile(-1 !=).map(_.toByte) in a Scala 2.9.3 REPL gives me Iterator[Byte] = non-empty iterator.
This seemed to cause OOM errors for me. Things were GC'd eventually but the spikes were beyond what my server could handle.
47

Just removed bottleneck in our server code by replacing

Stream.continually(request.getInputStream.read()).takeWhile(_ != -1).map(_.toByte).toArray

with

org.apache.commons.io.IOUtils.toByteArray(request.getInputStream)

Or in pure Scala:

import java.io.InputStream
import java.util.Arrays

def bytes(in: InputStream, initSize: Int = 8192): Array[Byte] = {
  var buf = new Array[Byte](initSize)
  val step = initSize
  var pos, n = 0
  while ({
    // double the buffer when the next read might not fit
    if (pos + step > buf.length) buf = Arrays.copyOf(buf, buf.length << 1)
    n = in.read(buf, pos, step)
    n != -1
  }) pos += n
  // trim the buffer down to the number of bytes actually read
  if (pos != buf.length) buf = Arrays.copyOf(buf, pos)
  buf
}

Do not forget to close the opened input stream in all cases:

val in = request.getInputStream
try bytes(in) finally in.close()
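On Scala 2.13+ the try/finally can also be written with scala.util.Using, which closes the stream for you even if reading throws. A minimal sketch (the Iterator-based helper here is illustrative, standing in for any of the reading functions on this page):

```scala
import java.io.{ByteArrayInputStream, InputStream}
import scala.util.Using

// Illustrative reader; any InputStream => Array[Byte] function works here.
def bytes(in: InputStream): Array[Byte] =
  Iterator.continually(in.read()).takeWhile(_ != -1).map(_.toByte).toArray

// Using.resource closes the stream after the body runs, even on exception.
val data = Using.resource(new ByteArrayInputStream("hello".getBytes("UTF-8")))(bytes)
```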

4 Comments

That's org.apache.commons.io.IOUtils.toByteArray, in case anyone was wondering.
This definitely feels faster. Anyone done any benchmarks or tests with larger files?
Thank you. I had huge issues with GC Overhead errors running this with Apache Spark, where 90% of the time my tasks spent in GC. Replacing with toByteArray massively sped up things.
It's important to point out how drastically this solution can out-perform the alternatives, which do things like map(_.toByte), iterating over the input byte-by-byte. Don't do that if you are working with big data!
20

In a similar vein to Eastsun's answer... I started this as a comment, but it ended up getting just a bit too long!

I'd caution against using Stream: if you hold a reference to the head element, a stream can easily consume a lot of memory.

Given that you're only going to read the file once, an Iterator is a much better choice:

def inputStreamToByteArray(is: InputStream): Array[Byte] =
  Iterator continually is.read takeWhile (-1 !=) map (_.toByte) toArray

Comments

14
import scala.tools.nsc.io.Streamable
Streamable.bytes(is)

Don't remember how recent that is: probably measured in days. Going back to 2.8, it's more like

new Streamable.Bytes { def inputStream() = is } toByteArray

4 Comments

Is it safe to use stuff from scala.tools packages? Are they even a part of the standard library?
No. But if you want to know how to write it, there it is.
It seems to have moved to the more standard scala.reflect.io package now.
scala.reflect.io.Streamable.bytes
11

With Scala IO, this should work:

def inputStreamToByteArray(is: InputStream): Array[Byte] =
   Resource.fromInputStream(is).byteArray

Comments

7

With better-files, you can simply do is.bytes

1 Comment

better.files should just be in std lib. It is so much better. Also if you want Array[Byte] you need to use is.byteArray instead.
3

Source.fromInputStream(is).map(_.toByte).toArray

1 Comment

This fails on binary/false encoded text files: stackoverflow.com/questions/13327536/…
2

How about a buffered version of the stream-based solution, plus a ByteArrayOutputStream to avoid the boilerplate of growing the final array by hand?

import java.io.{ByteArrayOutputStream, InputStream}

val EOF: Int = -1

def readBytes(is: InputStream, bufferSize: Int): Array[Byte] = {
  val buf = Array.ofDim[Byte](bufferSize)
  val out = new ByteArrayOutputStream(bufferSize)

  // read(buf) returns the number of bytes read, or -1 at end of stream
  Stream.continually(is.read(buf)) takeWhile { _ != EOF } foreach { n =>
    out.write(buf, 0, n)
  }

  out.toByteArray
}
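Since Stream is deprecated in newer Scala versions, the same buffered approach can be sketched with Iterator instead (the name readBytesIter is illustrative):

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream}

val EOF: Int = -1

// Same buffering idea, with Iterator replacing the deprecated Stream.
def readBytesIter(is: InputStream, bufferSize: Int = 8192): Array[Byte] = {
  val buf = new Array[Byte](bufferSize)
  val out = new ByteArrayOutputStream(bufferSize)
  Iterator.continually(is.read(buf)).takeWhile(_ != EOF).foreach(n => out.write(buf, 0, n))
  out.toByteArray
}

val result = readBytesIter(new ByteArrayInputStream(Array[Byte](1, 2, 3)))
```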

Comments

1

Here's an approach using scalaz-stream:

import scalaz.concurrent.Task
import scalaz.stream._
import scodec.bits.ByteVector

def allBytesR(is: InputStream): Process[Task, ByteVector] =
  io.chunkR(is).evalMap(_(4096)).reduce(_ ++ _).lastOr(ByteVector.empty)

2 Comments

probably no reason to reduce, that would defeat the incremental nature of streams
The reason is that the question asks for a byte array.
1

Since JDK 9:

is.readAllBytes()
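A minimal usage sketch, assuming a JDK 9+ runtime:

```scala
import java.io.ByteArrayInputStream

// InputStream.readAllBytes (JDK 9+) reads the rest of the stream in one call.
val all = new ByteArrayInputStream("abc".getBytes("UTF-8")).readAllBytes()
```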

Comments

0

We can do this using Guava's ByteStreams:

com.google.common.io.ByteStreams

Pass the stream to the ByteStreams.toByteArray method for conversion:

ByteStreams.toByteArray(stream)

Comments

-1
import java.io.InputStream
import scala.collection.mutable.ListBuffer

def inputStreamToByteArray(is: InputStream): Array[Byte] = {
    val buf = ListBuffer[Byte]()
    var b = is.read()
    while (b != -1) {
        buf.append(b.byteValue)
        b = is.read()
    }
    buf.toArray
}

1 Comment

Does List[Byte] have a method "add"?
