3

I'm learning Julia coming from Python. I want to get the elements of an array b that also appear in array a. My attempt in Julia is shown after the Python code that does what I need. My question is this: is there a better/faster way to do this in Julia? I'm suspicious of how simple my Julia code is, and I worry that such a naive looking solution might have suboptimal performance (again, coming from Python).

Python:

import numpy as np
a = np.array([1, 2, 3, 4])
b = np.array([7, 8, 2, 3, 5])
indices_b_in_a = np.nonzero(np.isin(b, a))
b_in_a = b[indices_b_in_a]
# array([2, 3])

Julia:

a = [1, 2, 3, 4];
b = [7, 8, 2, 3, 5];
indices_b_in_a = findall(ele -> ele in a, b);
b_in_a = b[indices_b_in_a];
#2-element Vector{Int64}:
# 2
# 3
1
  • Apart from suggestions in Shayan's answer, there is also: filter(∈(a),b) which is short. Commented Dec 20, 2022 at 21:59

2 Answers

5

Maybe this would be a helpful answer:

julia> intersect(Set(a), Set(b))
Set{Int64} with 2 elements:
  2
  3

# Or even
julia> intersect(a, b)
2-element Vector{Int64}:
 2
 3

Note that if b contains repeated values, this method does not exactly replicate your expected behavior, because intersect returns only the unique common values. If you have repeated elements, you need an element-by-element membership test instead; in that case, binary search on a sorted copy of a would be a good choice.
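The binary-search idea mentioned above can be sketched as follows. This is only a minimal illustration: the helper name elements_in_sorted and the sample data with a repeated 3 are assumptions made for the example, not part of the original answer.

```julia
# Hypothetical helper: keep every element of `b` (duplicates included)
# that occurs somewhere in `a`, using binary search for the lookup.
function elements_in_sorted(a, b)
    sorted_a = sort(a)  # binary search requires sorted input
    # searchsorted returns the (possibly empty) range of positions equal to x
    return [x for x in b if !isempty(searchsorted(sorted_a, x))]
end

a = [1, 2, 3, 4]
b = [7, 8, 2, 3, 5, 3]              # note the repeated 3

result = elements_in_sorted(a, b)   # keeps both 3s
unique_only = intersect(a, b)       # drops the duplicate
```

With this data, result keeps the second 3 while intersect returns each common value only once.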
Another approach is using broadcasting in Julia:

julia> a = rand(1:100, 1000);
       b = rand(1:3000, 5000);

julia> b[in.(b, Ref(a))]
161-element Vector{Int64}:
  8
  5
 70
 73
  ⋮

# Exactly the same approach with a slightly different syntax
julia> b[b .∈ Ref(a)]
161-element Vector{Int64}:
  8
  5
 70
 73
 30
 63
 73
  ⋮

Q: What is the role of Ref in the above code block?
Ans: Wrapping a in Ref makes broadcasting treat a as a single (scalar) value, so every element of b is tested against the whole of a. Without Ref, broadcasting would iterate over a and b simultaneously and pair them element by element, which is not what we want: it raises an error when the lengths differ, and it gives the wrong answer even when both objects have the same length.
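To make the effect of Ref concrete, here is a small sketch. The array c is an assumption introduced only so that both arguments have the same length; it does not appear in the original question.

```julia
a = [1, 2, 3, 4]
b = [7, 8, 2, 3, 5]
c = [4, 3, 2, 1]               # hypothetical: same length as `a`

# With Ref: each c[i] is tested against all of `a`.
with_ref = in.(c, Ref(a))

# Without Ref: c[i] is paired with a[i], i.e. `4 in 1`, `3 in 2`, ...
without_ref = in.(c, a)

# Without Ref and with different lengths, broadcasting cannot pair the elements.
mismatch = try
    in.(b, a)
    false
catch err
    err isa DimensionMismatch
end
```

Here with_ref is all true, without_ref is all false (a number only "contains" itself), and mismatch is true because b and a have lengths 5 and 4.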
Julia's syntax has its own idioms, but it's not that complicated. I mention this because you wrote:

I worry that such a naive looking solution...

Last but not least, do not forget to wrap your code in a function if you want good performance in Julia.
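As a rough sketch of what wrapping the code in a function could look like: the name keep_members is hypothetical, and converting a to a Set is an extra technique swapped in here (not something the answer above uses) so that each membership test is a hash lookup rather than a linear scan. No timings were measured for this example.

```julia
# Hypothetical helper name; not from the original answer.
function keep_members(a, b)
    lookup = Set(a)                # hash-based membership, built once
    return filter(in(lookup), b)   # keeps the order and duplicates of `b`
end

a = rand(1:100, 1000)
b = rand(1:3000, 5000)

from_set = keep_members(a, b)
from_broadcast = b[in.(b, Ref(a))]   # the broadcasting version, for comparison
```

Both versions should return the same elements; the difference is only in how the lookup is done. Benchmarking on your own data is the way to see whether it matters.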


7 Comments

What you've written is more semantically meaningful than my solution, and your comment on julia's syntax is also helpful.
@JaredFrazier, note that if b contains repeated values, this method does not exactly replicate your expected behavior, because intersect returns only the unique common values. If you have repeated elements, you need an element-by-element membership test instead; in that case, binary search on a sorted copy of a would be a good choice.
I just noticed that and was about to suggest that you should add that as a part of your answer. Thanks for the clarification
@JaredFrazier, Sure! Also, I will suggest another approach within minutes.
What is the advantage here of using Ref? I see based on this discourse that it is used in broadcasting, but other than that I don't quite get it.
2

Another approach using array comprehensions.

julia> [i for i in a for j in b if i == j]
2-element Vector{Int64}:
 2
 3
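One thing to be aware of with the nested comprehension: it loops over a on the outside, so the result follows the order of a and compares every pair of elements. A single-loop variant, shown below as a sketch, walks b and keeps its order. The input b with a repeated 3 is an assumption chosen to make the difference visible.

```julia
a = [1, 2, 3, 4]
b = [3, 2, 3]                     # hypothetical input with a repeated value

# Nested form from the answer: ordered by `a`, one entry per matching pair.
nested = [i for i in a for j in b if i == j]

# Single-loop form: ordered by `b`, one entry per element of `b` found in `a`.
single = [x for x in b if x in a]
```

Here nested is [2, 3, 3] and single is [3, 2, 3]; the second matches what b[findall(ele -> ele in a, b)] from the question returns.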
