A couple of points worth mentioning:
- Your example code in file "courses.py" is inserting grades as a
string that represents an array, not an actual array. Matt pointed
this out in the comments, and you asked for an explanation. If you
insert a string that looks like an array, you cannot perform $unwind
or $lookup on sub-elements, because they aren't sub-elements; they
are part of a string.
- You have array data in courses that holds students' grades, which are
the data points you want, but you start the aggregation on the
students collection. Instead, consider coming at it from the courses
collection rather than from the student perspective. If you do, you
may restate the requirement as "show me all courses and student
grades where student id is 0".
- Your array data seems to have a datatype mismatch. The student id
is an integer in your string variable "array", but the student
collection has the student id as a string. The two need to be
consistent for the $lookup to work properly (unless you want to
perform a lot of casting).
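To make the first and third points concrete, here is a minimal sketch in plain Python. The sample string is hypothetical (I am assuming your "array" variable holds JSON-like text with integer student ids), and parse_grades is a helper name I made up for illustration. It shows that the string is just a str until it is parsed, and that the ids can be cast to strings at the same time so they match the _id values in students.

```python
import json

# Hypothetical sample: grades stored as a string that merely looks like an
# array, with integer student ids (an assumption about the original data).
grades_as_string = '[{"student_id": 0, "grade": 83.442}, {"student_id": 1, "grade": 45.323}]'


def parse_grades(raw):
    """Parse the string into a real list and cast student_id to str."""
    return [
        {"student_id": str(entry["student_id"]), "grade": entry["grade"]}
        for entry in json.loads(raw)
    ]


grades = parse_grades(grades_as_string)

print(type(grades_as_string).__name__)  # str: MongoDB would store a plain string
print(type(grades).__name__)            # list: MongoDB would store an array
print(grades[0])                        # {'student_id': '0', 'grade': 83.442}
```

Inserting grades (the list) rather than grades_as_string is what makes $unwind and $lookup usable on the individual entries.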
Nonetheless, here is a possible solution to your problem. I have revised the Python code and redefined the aggregation.
The name of my test database is pythontest, as seen in this code example.
MongoDB creates the database on the first insert, so run students.py and courses.py before aggregation.py.
File students.py
from pymongo import MongoClient
import pprint
client = MongoClient("mongodb://127.0.0.1:27017")
db = client.pythontest
student = [
    {"_id": "0", "firstname": "Bert", "lastname": "Holden"},
    {"_id": "1", "firstname": "Sam", "lastname": "Olsen"},
    {"_id": "2", "firstname": "James", "lastname": "Swan"},
]
students = db.students
students.insert_many(student)
pprint.pprint(students.find_one())
Then the courses file. Notice that the field grades is no longer a string but a valid array object, and that the student id is a string, not an integer. (In reality, a stronger datatype such as UUID or int would likely be preferable.)
File courses.py
from pymongo import MongoClient
import pprint
client = MongoClient("mongodb://127.0.0.1:27017")
db = client.pythontest
course = [{"_id": "10",
           "coursename": "Databases",
           "grades": [{"student_id": "0", "grade": 83.442},
                      {"student_id": "1", "grade": 45.323},
                      {"student_id": "2", "grade": 87.435}]}]
courses = db.courses
courses.insert_many(course)
pprint.pprint(courses.find_one())
Finally, the aggregation file with the changed aggregation pipeline:
File aggregation.py
from pymongo import MongoClient
import pprint
client = MongoClient("mongodb://127.0.0.1:27017")
db = client.pythontest
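pipeline = [
    # Keep only courses that contain a grade for student "0".
    {"$match": {"grades.student_id": "0"}},
    # Emit one document per element of the grades array.
    {"$unwind": "$grades"},
    {"$project": {"coursename": 1, "student_id": "$grades.student_id", "grade": "$grades.grade"}},
    # Join each grade to its student document.
    {
        "$lookup": {
            "from": "students",
            "localField": "student_id",
            "foreignField": "_id",
            "as": "student",
        }
    },
    {"$unwind": "$student"},
    {"$project": {"student._id": 0}},
    # After $unwind, drop the grades that belong to other students.
    {"$match": {"student_id": "0"}},
]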
pprint.pprint(list(db.courses.aggregate(pipeline)))
Output of running program
> python3 aggregation.py
[{'_id': '10',
'coursename': 'Databases',
'grade': 83.442,
'student': {'firstname': 'Bert', 'lastname': 'Holden'},
'student_id': '0'}]
The format of the data at the end of the program may not be as desired, but can be tweaked by manipulating the aggregation.
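As one example of such a tweak, a final $project stage could lift the nested student fields to the top level. This is only a sketch: flatten_stage and with_flat_output are names I invented, and the choice of output fields is an assumption about what you might want.

```python
# Hypothetical extra stage: flatten the nested "student" sub-document and
# rename _id to course_id. The field selection here is an assumption.
flatten_stage = {
    "$project": {
        "_id": 0,
        "course_id": "$_id",
        "coursename": 1,
        "student_id": 1,
        "firstname": "$student.firstname",
        "lastname": "$student.lastname",
        "grade": 1,
    }
}


def with_flat_output(pipeline):
    """Return a new pipeline with the flattening stage appended."""
    return list(pipeline) + [flatten_stage]
```

In aggregation.py this would be used as db.courses.aggregate(with_flat_output(pipeline)), which should return one flat document per matching course instead of a nested student field.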
** EDIT **
If you want to approach this aggregation from the student rather than from the course, you can still do so, but because the array is in courses, the aggregation is a bit more complicated. The $lookup must use a pipeline of its own to prepare the foreign data:
Aggregation from Student perspective (mongo shell syntax)
db.students.aggregate([
    { $match: { _id: "0" } },
    { $addFields: { "colStudents._id": "$_id" } },
    {
        $lookup:
        {
            from: "courses",
            let: { varStudentId: "$colStudents._id" },
            pipeline:
            [
                { $unwind: "$grades" },
                { $match: { $expr: { $eq: ["$grades.student_id", "$$varStudentId"] } } },
                { $project: { course_id: "$_id", coursename: 1, grade: "$grades.grade", _id: 0 } }
            ],
            as: "student_course"
        }
    },
    { $project: { _id: 0, student_id: "$_id", firstname: 1, lastname: 1, student_course: 1 } }
])
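The pipeline above uses mongo shell syntax, where keys may be unquoted. In a Python file every key has to be a quoted string. A rough PyMongo equivalent might look like the sketch below; build_student_pipeline and student_courses are names I made up, and db is assumed to be the client.pythontest handle used in the earlier files.

```python
def build_student_pipeline(student_id):
    """Build the student-side pipeline as plain Python dicts (all keys quoted)."""
    return [
        {"$match": {"_id": student_id}},
        {"$addFields": {"colStudents._id": "$_id"}},
        {
            "$lookup": {
                "from": "courses",
                "let": {"varStudentId": "$colStudents._id"},
                "pipeline": [
                    {"$unwind": "$grades"},
                    {"$match": {"$expr": {"$eq": ["$grades.student_id", "$$varStudentId"]}}},
                    {"$project": {"course_id": "$_id", "coursename": 1,
                                  "grade": "$grades.grade", "_id": 0}},
                ],
                "as": "student_course",
            }
        },
        {"$project": {"_id": 0, "student_id": "$_id", "firstname": 1,
                      "lastname": 1, "student_course": 1}},
    ]


def student_courses(db, student_id="0"):
    """Run the pipeline on db.students; db is assumed to be a PyMongo database."""
    return list(db.students.aggregate(build_student_pipeline(student_id)))
```

With the same client setup as in aggregation.py, calling pprint.pprint(student_courses(db)) would run this version.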
Output
> python3 aggregation.py
[{'firstname': 'Bert',
'lastname': 'Holden',
'student_course': [{'course_id': '10',
'coursename': 'Databases',
'grade': 83.442}],
'student_id': '0'}]