18
$\begingroup$

I have a dataset of about $4\times 10^6$ 3D coordinates.

From this volume I am sequentially selecting coordinates along one axis and manipulating this subset.

My question: Can the Select function be replaced by something faster?

Here is example code, with the time needed for the selection:

SeedRandom[1];

coordinates = RandomReal[10, {4000000, 3}]; // AbsoluteTiming

{0.0989835, Null}

selectedCoordinates = Select[coordinates, #[[1]] > 6 && #[[1]] < 7 & ]; // AbsoluteTiming

{5.88215, Null}

Dimensions[selectedCoordinates]

{400416, 3}
$\endgroup$
4
  • 5
    $\begingroup$ Pick[coordinates, 6 < # < 7 & /@ coordinates[[All, 1]]] is almost twice as fast as Select[..] $\endgroup$ Commented Oct 25, 2017 at 11:43
  • 6
    $\begingroup$ You can compile your Select: compiled = Compile[{{coords, _Integer, 2}}, Select[coords, #[[1]] > 6 && #[[1]] < 7 &], CompilationTarget -> "C"] . Then compiled[coordinates] takes 0.2 secs on my machine. $\endgroup$ Commented Oct 25, 2017 at 11:51
  • 1
    $\begingroup$ Cases[coordinates, {x_, y_, z_} /; x > 6 && y < 7], assuming that you want #[[1]] > 6 && #[[2]] < 7. Otherwise the output would always be {}: no integer can be > 6 and < 7 at the same time ;-). $\endgroup$ Commented Oct 25, 2017 at 12:00
  • $\begingroup$ @RMMA: Thank you for your remark. I changed to RandomReal. $\endgroup$ Commented Oct 25, 2017 at 14:30

3 Answers

29
$\begingroup$
res1 = Select[coordinates, #[[1]] > 6 && #[[1]] < 7 &]; // 
  AbsoluteTiming // First

6.997629

res2 = Select[coordinates, 6 < #[[1]] < 7 &]; // AbsoluteTiming // First

4.676356

res3 = Pick[coordinates, 6 < # < 7 & /@ coordinates[[All, 1]]]; // 
  AbsoluteTiming // First

5.266651

res4 = Pick[coordinates, (1 - UnitStep[# - 7]) (1 - UnitStep[6 - #]) &@
      coordinates[[All, 1]], 1]; // AbsoluteTiming // First

0.353154

res6 = compiled[coordinates]; // AbsoluteTiming // First

0.667676

where

compiled = Compile[{{coords, _Real, 2}}, Select[coords, #[[1]] > 6 && #[[1]] < 7 &]]

is the method suggested in Leonid's comment (without the option CompilationTarget -> "C").

Equal[res1, res2, res3, res4, res6]

True
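
For anyone wondering why the res4 mask works, here is a minimal sketch on made-up toy data (pts, xs and mask are hypothetical names, not from the question). UnitStep[x] is 1 for x >= 0 and 0 otherwise, so the product of the two factors is 1 exactly where 6 < x < 7:

```mathematica
(* hypothetical toy data, not taken from the question *)
pts = {{5.5, 1., 2.}, {6.2, 3., 4.}, {6.9, 5., 6.}, {7.3, 7., 8.}};
xs = pts[[All, 1]];

(* first factor is 1 where x < 7, second factor is 1 where x > 6 *)
mask = (1 - UnitStep[xs - 7]) (1 - UnitStep[6 - xs])
(* {0, 1, 1, 0} *)

(* keep the rows whose mask entry equals 1 *)
Pick[pts, mask, 1]
(* {{6.2, 3., 4.}, {6.9, 5., 6.}} *)
```

The speedup presumably comes from the mask being built with whole-array arithmetic on the first column instead of evaluating a pure function once per row.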

$\endgroup$
7
  • $\begingroup$ Thanks so much. The last solution is, in my case, about 35 times faster than Select. $\endgroup$ Commented Oct 25, 2017 at 14:04
  • $\begingroup$ A fairer comparison would chain the inequalities for Select as well. And it would be nice to include the comparison for a compiled selector. $\endgroup$ Commented Oct 25, 2017 at 14:18
  • $\begingroup$ @mrz, my pleasure. Thank you for the accept. $\endgroup$ Commented Oct 25, 2017 at 14:25
  • $\begingroup$ @Alan, I added the variant of Select you suggested. I don't have a C compiler installed, so I cannot include timings for the method suggested by Leonid. Without CompilationTarget -> "C", compiled is slower than Pick. $\endgroup$ Commented Oct 25, 2017 at 14:32
  • $\begingroup$ Something like Pick[c,UnitStep[c-6,7-c],1] is more compact, but is about 30 times slower than your res4 formulation! The problem is with the multi-dimensional UnitStep. Something like Pick[c, UnitStep[c - 6]*UnitStep[7 - c], 1] seems as fast or slightly faster than the res4 formulation. $\endgroup$ Commented Oct 25, 2017 at 16:14
22
$\begingroup$

Using Clip is slightly faster than @kglr's solution:

SeedRandom[1];
coordinates = RandomReal[10, {4000000, 3}];

r1 = Pick[
    coordinates,
    Unitize @ Clip[coordinates[[All,1]], {6, 7}, {0, 0}],
    1
];//RepeatedTiming

r2 = Pick[
    coordinates,
    (1-UnitStep[#-7]) (1-UnitStep[6-#])&@coordinates[[All,1]],
    1
];//RepeatedTiming

r1 === r2

{0.10, Null}

{0.15, Null}

True
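
A minimal sketch of what the Clip step does, again on made-up values (xs is a hypothetical name). Clip[x, {6, 7}, {0, 0}] leaves values inside the interval unchanged and replaces everything below 6 or above 7 with 0; Unitize then maps every nonzero entry to 1:

```mathematica
(* hypothetical toy values, not taken from the question *)
xs = {5.5, 6.2, 6.9, 7.3};

(* values outside [6, 7] become 0, values inside stay as they are *)
clipped = Clip[xs, {6, 7}, {0, 0}];

(* nonzero entries become 1, zeros stay 0 *)
Unitize[clipped]
(* {0, 1, 1, 0} *)
```

Two caveats under these assumptions: the endpoints 6 and 7 themselves are kept, unlike with the strict inequalities in Select (for random reals this practically never matters), and the trick would miss a value of exactly 0 if the interval contained 0.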

$\endgroup$
2
  • $\begingroup$ Thanks a lot for your help ... I have to remember Clip ... this function is extremely fast. $\endgroup$ Commented Oct 25, 2017 at 19:08
  • $\begingroup$ Thanks. I learned about Pick combined with Clip. How wise! $\endgroup$ Commented Oct 26, 2017 at 17:23
4
$\begingroup$

My question: can the Select function be replaced by something faster?

Yes! Check out the BoolEval package.

SeedRandom[1];
coordinates = RandomReal[10, {4000000, 3}]; // AbsoluteTiming
(* {0.118832, Null} *)

selectedCoordinates = 
   Select[coordinates, #[[1]] > 6 && #[[1]] < 7 &]; // AbsoluteTiming
(* {6.08899, Null} *)
Needs["BoolEval`"]

selectedCoordinates2 = BoolPick[coordinates, 6 < coordinates[[All, 1]] < 7]; // AbsoluteTiming
(* {0.145518, Null} *)

selectedCoordinates == selectedCoordinates2
(* True *)

Be sure to read the documentation of the package to see more usage examples and learn about caveats.

$\endgroup$
1
  • $\begingroup$ This is great! I just installed and tried it. $\endgroup$ Commented Aug 5, 2019 at 10:18
