0

I have 2 objects (imagine them as database tables):

O1:
field1(id)
field2
field3

O2:
field1
field2
field3(id)
field4

I have 2 lists:
L1 is a list of O1 objects
L2 is a list of O2 objects

Question: is there a way to join these two lists on L1.field1 and L2.field3, just like an SQL JOIN? The two lists always contain the same number of items (1:1 relation), but they are not necessarily sorted by these two fields.

3 Answers

1

You can do it the simple and naive way:

joined = [ i + j for i in L1 for j in L2 if i[0] == j[2] ]

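This assumes each item is a tuple or list, so that i + j concatenates the two rows. For tiny lists it is likely faster than going through pandas, but it compares every pair of items, so it performs poorly on large ones.

A middle way is to use an auxiliary dictionary: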

D2 = { j[2]: j for j in L2 }
joined = [ i + D2[i[0]] for i in L1 ]

It now runs in O(len(L1) + len(L2)) instead of O(len(L1) * len(L2)). It is still less efficient than the highly optimized pandas module for very large data sets, but far better than the naive approach once the lists are no longer tiny.
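As a minimal sketch of the dictionary approach, here is a version using attribute access instead of positional indexes. It assumes the items are namedtuples; the O1 and O2 type definitions and the sample values are made up for illustration.

```python
from collections import namedtuple

# Hypothetical record types mirroring the question's field layout.
O1 = namedtuple("O1", ["field1", "field2", "field3"])
O2 = namedtuple("O2", ["field1", "field2", "field3", "field4"])

# Sample data; the two lists are deliberately in different orders.
L1 = [O1(1, "a", "x"), O1(2, "b", "y"), O1(3, "c", "z")]
L2 = [O2("p", "q", 3, 30), O2("r", "s", 1, 10), O2("t", "u", 2, 20)]

# Index L2 by its join key (field3) in a single pass.
D2 = {j.field3: j for j in L2}

# Pair each L1 item with its partner from L2; a single pass over L1.
joined = [(i, D2[i.field1]) for i in L1]

for left, right in joined:
    print(left.field1, left.field2, right.field4)
```

With the guaranteed 1:1 relation every lookup succeeds. If a key could be missing, D2[i.field1] would raise a KeyError; using D2.get(i.field1) instead would yield None for the missing partner, which behaves roughly like a LEFT JOIN.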


0

pandas has a lot of functions to deal with data in this way.

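Turn your lists into pd.DataFrames and then you can use DataFrame.join. Like SQL JOIN, it lets you choose the join type through the how parameter: inner, left, right, or outer.

# field2 exists in both frames, so suffixes are needed to avoid a column-name clash
dfL1.set_index('field1').join(dfL2.set_index('field3'), lsuffix='_l1', rsuffix='_l2')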
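For a fuller picture, here is a sketch that builds the two frames from plain tuples and joins them with pd.merge, which accepts different key columns on each side. The sample rows and the '_l1' / '_l2' suffixes are assumptions chosen for illustration.

```python
import pandas as pd

# Sample rows, made up for illustration; the lists are in different orders.
L1 = [(1, "a", "x"), (2, "b", "y"), (3, "c", "z")]
L2 = [("p", "q", 3, 30), ("r", "s", 1, 10), ("t", "u", 2, 20)]

dfL1 = pd.DataFrame(L1, columns=["field1", "field2", "field3"])
dfL2 = pd.DataFrame(L2, columns=["field1", "field2", "field3", "field4"])

# Inner join on dfL1.field1 == dfL2.field3. Column names that occur in
# both frames receive the suffixes so they stay distinguishable.
joined = pd.merge(
    dfL1, dfL2,
    left_on="field1", right_on="field3",
    how="inner", suffixes=("_l1", "_l2"),
)

print(joined[["field1_l1", "field2_l1", "field4"]])
```

Changing how to "left", "right", or "outer" gives the corresponding SQL join types; with a strict 1:1 relation they all return the same rows.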


0

I'll try to show an example, if I understand the question correctly. Let's say you have these classes:

class User():
  def __init__(self, id, name):
    self.id = id
    self.name = name

class Image():
  def __init__(self, id, user_id, filename):
    self.id = id
    self.user_id = user_id
    self.filename = filename

And the following collections:

users = [User(1, 'Jim'), User(2, 'Spock')]
images = [Image(1, 1, 'jim_1.jpg'), Image(2, 1, 'jim_2.jpg'), Image(3, 2, 'spk_1.jpg')]

Once you fetch a user from the collection, let's say the first:

user = users[0]

You can query for images in this way:

user_images = [ image for image in images if image.user_id == user.id ]

for image in user_images:
  print(image.filename)

Conversely, if you have an image, you can find its user. Since this is a one-to-many relation, each image belongs to exactly one user:

image = images[0]
user = [user for user in users if user.id == image.user_id][0] # [0] because each image has exactly one user


For the join table:

join_table = [ {'name': user.name, 'filename': image.filename} for user in users for image in images if user.id == image.user_id ]

for e in join_table:
  print(e['name'], e['filename'])

Which prints:

# Jim jim_1.jpg
# Jim jim_2.jpg
# Spock spk_1.jpg
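The nested comprehension above compares every user with every image. As a sketch of one possible alternative for larger collections, you could index the users by id first and then walk the images once. The users_by_id name is introduced here for illustration only.

```python
class User:
    def __init__(self, id, name):
        self.id = id
        self.name = name

class Image:
    def __init__(self, id, user_id, filename):
        self.id = id
        self.user_id = user_id
        self.filename = filename

users = [User(1, 'Jim'), User(2, 'Spock')]
images = [Image(1, 1, 'jim_1.jpg'), Image(2, 1, 'jim_2.jpg'), Image(3, 2, 'spk_1.jpg')]

# Index users by id once, so each image finds its user with a dict lookup.
users_by_id = {user.id: user for user in users}

join_table = [
    {'name': users_by_id[image.user_id].name, 'filename': image.filename}
    for image in images
    if image.user_id in users_by_id  # skip images without a matching user
]

for e in join_table:
    print(e['name'], e['filename'])
```

Note that the rows come out in the order of images rather than users; for this sample data the two orders coincide.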
