This question covers a lot of topics (in my mind at least), so I want to break it down into its core components:
Design
As David noted (first comment), there is a design problem here - an ever-growing array as a subdocument is not ideal (please refer to this blog post for more details).
On the other hand, when we imagine what a separate collection of messages would look like, it would be something like this:
_id: ObjectId('...') // how do I identify the message
channel_id: 'cn247f9' // the private chat or group the message belongs to
user_id: 1234 // which user posted this message
message: 'hello or something' // the message itself
This is also not great, because the channel and user ids are repeated in every message, and the number of messages keeps growing over time. This is why the bucket pattern is used.
So... what is the "best" approach here?
Concept
The most relevant question right now is - "which features and loads is this chat supposed to support?". Many chats only support displaying messages, without any further complexity (like searching inside a message). Keeping that in mind, there is a chance that we are storing information in our database that the database never actually needs to work with.
This is (almost) like storing binary data (such as an image) inside our db. We can do this, but with no actual good reason. So, if we are not going to support full-text search inside our messages, there is no point in storing the messages inside our db... at all.
But... what if we want to support full-text search? Well - who said that we need to give this task to our database? We can download messages (using pagination) and perform the search on the client side itself (while the keyword is not found, download the previous page and search it), taking the load off our database!
So... it seems that messages are not ideal for storage in a database in terms of size, functionality and load (you may find this conclusion shocking).
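The client-side search loop described above can be sketched roughly as follows. This is only an illustration: `PAGES`, `fetchPage` and `searchBackwards` are hypothetical names, and the in-memory array stands in for whatever paginated endpoint the application actually exposes.

```javascript
// Hypothetical in-memory stand-in for a paginated "get messages" endpoint.
// Pages are ordered oldest to newest; a real client would issue a request here.
const PAGES = [
  [{ id: 1, text: 'welcome to the channel' }, { id: 2, text: 'lunch at noon?' }],
  [{ id: 3, text: 'sure, see you there' }, { id: 4, text: 'running late' }],
  [{ id: 5, text: 'no problem' }, { id: 6, text: 'hello or something' }],
];

async function fetchPage(pageIndex) {
  return PAGES[pageIndex] || null; // null means "no such page"
}

// Search from the newest page backwards and stop at the first page with a match.
async function searchBackwards(keyword, newestPageIndex, fetch = fetchPage) {
  const needle = keyword.toLowerCase();
  for (let pageIndex = newestPageIndex; pageIndex >= 0; pageIndex--) {
    const page = await fetch(pageIndex);
    if (page === null) break;
    const hits = page.filter((m) => m.text.toLowerCase().includes(needle));
    if (hits.length > 0) return { pageIndex, hits };
  }
  return null; // keyword not found in any page
}
```

The database (or storage) only ever serves whole pages here; all of the matching work happens on the client, at the cost of downloading more pages when the keyword is old or absent.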
ReDesign
- Use a hybrid approach where messages are stored in a separate collection with pagination (the bucket pattern supports this as described here)
- Store messages outside your database (since you are using Node.js you may consider using chunk store), keeping only a reference to them in the database itself
- Set your page to a size relevant to your application needs, and add calculated fields (for instance: the number of messages currently in the page) to ease database loads as much as possible
Schema
channels:
_id: ObjectId
pageIndex: Int32
isLastPage: Boolean
// The number of items here should not exceed page size
// when it does - a new document will be created with incremental pageIndex value
// suggestion: update previous page isLastPage field to ease querying of next page
messages:
[
{ userId: ObjectId, link: string, timestamp: Date }
]
messagesCount: Int32
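To make the page roll-over concrete, here is a minimal in-memory sketch of appending a message under this schema. It is not MongoDB driver code: `pages` stands in for the channels collection, `externalStore` for the storage outside the database, and `PAGE_SIZE`, `channelId`, `storeBody` and `appendMessage` are names assumed for the illustration (the schema above does not spell out how a page is tied to its channel).

```javascript
const PAGE_SIZE = 3; // assumed small so the roll-over is easy to see

// Stand-in for storage outside the database: link -> message body.
const externalStore = new Map();
let nextLinkId = 0;

function storeBody(body) {
  const link = `msg/${nextLinkId++}`; // hypothetical link format
  externalStore.set(link, body);
  return link;
}

// Stand-in for the "channels" collection: one document per page.
const pages = [];

function appendMessage(channelId, userId, body, now = new Date()) {
  let last = pages.find((p) => p.channelId === channelId && p.isLastPage);
  if (!last || last.messagesCount >= PAGE_SIZE) {
    // Page is full (or missing): close it and open the next one.
    if (last) last.isLastPage = false;
    last = {
      channelId,
      pageIndex: last ? last.pageIndex + 1 : 0,
      isLastPage: true,
      messages: [],
      messagesCount: 0,
    };
    pages.push(last);
  }
  // Only a reference to the body is kept in the page document.
  last.messages.push({ userId, link: storeBody(body), timestamp: now });
  last.messagesCount += 1;
  return last;
}
```

With a real database, the "is the page full?" check and the push would have to happen atomically to be safe under concurrent writers; that part is deliberately left out of this sketch.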
Final Conclusion
I know - it seems like complete overkill for such a "simple" question, but Dawid Esterhuizen convinced me that designing your database to support your future loads from the very beginning is crucial, and always better than oversimplifying the db design.
The bottom line is that the question "which features and loads is this chat supposed to support?" still needs to be answered if you intend to design your db efficiently (i.e. to find the Goldilocks zone where your design suits your application needs in the most optimal way).