
We could use help structuring our MongoDB database. We need to store country IDs, then run queries that return the documents containing a given country ID. Assume the IDs are strings 6-10 characters long.

Two options:

1) Store the country IDs as one long string separated by a delimiter (e.g., /). Ex: "IDIDID1/IDIDID2/IDIDID3/IDIDID4/IDIDID5".

2) Store the IDs in an array. Ex: ["IDIDID1", "IDIDID2", "IDIDID3", "IDIDID4", "IDIDID5"].

We want to optimize for queries like "Find all documents containing country ID IDIDID3."

For option 1, we plan to use a RegEx to query documents (e.g., /IDIDID3/).

For option 2, we will use the standard $in operator.

Which option yields better read performance?

Does using the string approach yield better performance because you can index strings (as opposed to the limitation of only one array indexable by Mongo)?

We're using MongoMapper.

  • why the close vote? this is a legitimate programming question. Commented Mar 26, 2013 at 7:57

1 Answer

From the MongoDB manual:

$regex can only use an index efficiently when the regular expression 
has an anchor for the beginning (i.e. ^) of a string and is a case-sensitive match.
Additionally, while /^a/, /^a.*/, and /^a.*$/ match equivalent strings, 
they have different performance characteristics. 
All of these expressions use an index if an appropriate index exists; 
however, /^a.*/, and /^a.*$/ are slower. /^a/ can stop scanning after matching the prefix.
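To make the quoted point concrete, here is a minimal Python sketch. It does not talk to MongoDB; it only imitates a sorted index with a sorted list, and the sample strings and function names are made up for illustration. An anchored prefix can jump to the right spot with a binary search and stop early, while an unanchored pattern has to test every key.

```python
import bisect
import re

# Hypothetical delimited strings, kept sorted the way index keys would be.
sorted_keys = sorted([
    "IDIDID1/IDIDID2/IDIDID3",
    "IDIDID3/IDIDID4",
    "IDIDID4/IDIDID5",
    "IDIDID2/IDIDID31",
])

def anchored_prefix_scan(keys, prefix):
    """Imitate /^prefix/: binary-search to the prefix, stop at the first miss."""
    i = bisect.bisect_left(keys, prefix)
    matches, examined = [], 0
    while i < len(keys):
        examined += 1
        if not keys[i].startswith(prefix):
            break  # sorted order guarantees no later key can match
        matches.append(keys[i])
        i += 1
    return matches, examined

def unanchored_scan(keys, needle):
    """Imitate /needle/: every key must be tested."""
    rx = re.compile(re.escape(needle))
    matches = [k for k in keys if rx.search(k)]
    return matches, len(keys)

anchored_matches, anchored_examined = anchored_prefix_scan(sorted_keys, "IDIDID3")
unanchored_matches, unanchored_examined = unanchored_scan(sorted_keys, "IDIDID3")
```

Note that the anchored scan only finds strings that start with the ID, so it cannot answer "contains IDIDID3 anywhere" for the delimited-string layout. The unanchored scan can, but it touches every key.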

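An unanchored pattern such as /IDIDID3/ therefore cannot use an index efficiently and has to examine every string. Storing the IDs in an array with a multikey index makes more sense in terms of read performance.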
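As a rough illustration of why the array layout reads faster, the sketch below imitates a multikey index with a plain dictionary that maps each array element to the documents containing it. This is a simplification in Python, not MongoDB's actual storage, and the documents and function names are hypothetical. It also shows a correctness problem with the unanchored regex: with IDs of 6-10 characters, /IDIDID3/ also matches a longer ID such as IDIDID31.

```python
import re
from collections import defaultdict

# Hypothetical documents: doc id -> array of country IDs.
docs = {
    1: ["IDIDID1", "IDIDID2", "IDIDID3"],
    2: ["IDIDID4", "IDIDID5"],
    3: ["IDIDID31", "IDIDID2"],
}

def build_multikey_index(docs):
    """One index entry per array element, pointing back at the document."""
    index = defaultdict(set)
    for doc_id, country_ids in docs.items():
        for country_id in country_ids:
            index[country_id].add(doc_id)
    return index

def find_by_index(index, country_id):
    """Exact lookup: no scan over the documents is needed."""
    return sorted(index.get(country_id, ()))

def find_by_regex(docs, country_id):
    """Option 1: unanchored regex over the '/'-delimited string of every doc."""
    rx = re.compile(re.escape(country_id))
    return sorted(d for d, ids in docs.items() if rx.search("/".join(ids)))

index = build_multikey_index(docs)
by_index = find_by_index(index, "IDIDID3")
by_regex = find_by_regex(docs, "IDIDID3")
```

Here the index lookup returns only document 1, while the regex scan also returns document 3 because of the substring match. In MongoDB itself, a plain equality condition on an array field matches documents whose array contains that value, so $in is only needed when searching for several candidate IDs at once.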


3 Comments

thanks, @orid. what if each doc contains more than one array that requires an index? as far as we understand, mongo only allows one array in each collection to get indexed. is this right?
No, that isn't correct. The restriction on multikey (array) indexes applies to compound indexes: at most one field in a compound index may hold an array.
sorry that's what i meant. we need to index an array in the context of a compound index. or maybe this is a poor assumption? our query finds docs based on six attributes. all six have high cardinality, so we assume indexing each attribute in the compound index is the right path. assuming this is correct, we need to index multiple attributes -- which could either be arrays or long strings -- within a compound index.
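The compound-index restriction discussed in these comments can be pictured with a small Python simulation. This is only a sketch of the idea, not MongoDB's implementation, and the field names are hypothetical. A compound index with one array field gets one entry per array element; two array fields would need an entry for every combination, which is why MongoDB rejects that case.

```python
from itertools import product

def compound_index_keys(doc, fields):
    """Generate the index entries a compound index would hold for one document."""
    array_fields = [f for f in fields if isinstance(doc[f], list)]
    if len(array_fields) > 1:
        # Mirrors the rule from the comment above: one array field at most.
        raise ValueError("cannot index parallel arrays: " + ", ".join(array_fields))
    # Wrap scalars in a one-element list so product() treats all fields alike.
    values = [doc[f] if isinstance(doc[f], list) else [doc[f]] for f in fields]
    return list(product(*values))

# Hypothetical document with one scalar field and one array field.
doc = {"region": "EU", "country_ids": ["IDIDID1", "IDIDID2"], "tags": ["a", "b"]}

keys = compound_index_keys(doc, ["region", "country_ids"])

try:
    compound_index_keys(doc, ["country_ids", "tags"])
    rejected = False
except ValueError:
    rejected = True
```

If several array attributes really must be searched together, one option is to keep one array in the compound index and index the others separately, then check which plan the query actually uses.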
