Lost in MongoDB: Learning to Aggregate



I am getting to the end of stage one in writing a database driver for MongoDB to use in my Lockdown Project sudsjs.com. All has gone well until I get to totalling up a field in the database. I had to enter the wonderful world of aggregation, which is confusing, powerful and function-rich. 


I started with counting the number of sales transactions for a customer.

In SQL you just pass this to the database system:

"SELECT COUNT(*) FROM salesfile WHERE customer_number=25"

Well, not exactly. In real life you say customer_number=? and then give the number (25) separately. That is for security reasons, but we will ignore that for the present.

The MongoDB equivalent is pretty straightforward. Easier really because you don't need the question mark thing.

db.collection(salesfile).countDocuments({customer: 25})

No problem at all with that.

Then I wanted to add up the value of those sales. In SQL

"SELECT SUM(value) FROM salesfile WHERE customer_number=25"

But in MongoDB you get this:

array=await db.collection(salesfile).aggregate([  {
       $match: {customer_number: 25}
   },
   {
       $group: {
             _id: null,
             result: {$sum: '$value'}
         }
     } ]).toArray();
total=array[0].result;



Remember one of the claims made on the MongoDB website is that it is
"simple for developers to learn and use". I have to tell you that simple this is not.
 
In fact the aggregate function is a very rich and powerful, providing dozens,
if not hundreds of useful capabilities.  It takes a while to get your head around it, not helped much by the documentation. Fortunately I managed to find a video tutorial on aggregation and once you understand it it is really impressive.  

Let me break this example down. You go through stages where the output of one stage is fed into the next.   They call this a pipeline. Not really a good name because a pipeline is a long continuous thing, whereas the aggregate function is a series of steps. More like a production line.  

Stage 1: $match: Match filters the data to select records (documents in mongo-speak) for customer 25.

Stage 2: $group: Group the results by some other field - but we say null because we want all of the documents.  Then for each group (in this case the only group) add up the value field.

Then turn the resulting object into an array so we can use it.  The result is the first (and only) element in the array.

There it is.


No comments:

Post a Comment