AI is everywhere these days. Suppose you have built a website that shows AI-generated articles on trending topics. The articles are stored in a MongoDB collection named “Articles”, and each document has a field “AI” that holds the name of the AI model used to generate that article.
To let users filter articles by the model that wrote them, you need a list of all AI models. Selecting the AI field from every document directly produces duplicates, so the same option could appear in the filter several times.
This is where the distinct() method comes in: it returns the unique AI model names, with no duplicates. Let’s look at it in detail.
MongoDB distinct() Method
The MongoDB distinct() method returns an array of unique values for a specified field across all documents that match the given query criteria.
In simple words, the returned array contains every value that the specified field takes in the matching documents, with each value listed only once, no matter how many documents contain it.
Syntax:
db.collection.distinct(field, query, options)
Parameters:
- field: The name of the field whose unique values we want.
- query: Optional. A query that selects the documents from which the distinct values are retrieved. If omitted, all documents in the collection are considered.
- options: Optional. A document for additional settings such as collation, maxTimeMS, readConcern and readPreference.
Return:
It returns an array that contains unique values for the specified field from the documents selected according to the query parameter.
If the specified field’s value is an array, distinct() treats each element of the array as a separate value. For example, if a field has the value [10, [10], 10], distinct() treats 10, [10], and 10 as separate values.
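To make the array rule concrete, here is a small in-memory model written in plain JavaScript. It is only a sketch of the behaviour described above, not MongoDB's implementation: the helper name distinctValues and the sample documents are made up for illustration, it handles top-level fields only, and it compares values by their JSON text, which is a simplification.

```javascript
// Hypothetical helper: models how distinct() treats array fields.
// Values are compared by their JSON text (a simplification for this sketch).
function distinctValues(docs, field) {
  const seen = new Map();
  for (const doc of docs) {
    // This model skips documents that lack the field.
    if (!(field in doc)) continue;
    const value = doc[field];
    // An array contributes each of its elements; nested arrays stay intact.
    const candidates = Array.isArray(value) ? value : [value];
    for (const candidate of candidates) {
      const key = JSON.stringify(candidate);
      if (!seen.has(key)) seen.set(key, candidate);
    }
  }
  return [...seen.values()];
}

// Made-up documents: one array value and one scalar value.
const docs = [{ sizes: [10, [10], 10] }, { sizes: 10 }];
console.log(distinctValues(docs, "sizes")); // [ 10, [ 10 ] ]
```

The scalar 10 and the nested array [10] are kept as two different values, while the repeated 10 collapses into one entry.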
Examples of distinct() Method
In this section, we will look at some examples of how to use the distinct() method inside the Mongo shell.
Example 1: Returning Distinct Values from All Documents
We will use the following collection for this example:
> db.movies.find().pretty()
{
"_id" : ObjectId("60322d3501cd70079c48cb65"),
"title" : "Enchanted",
"year" : 2006,
"score" : 10,
"rating" : "PG",
"__v" : 0
}
{
"_id" : ObjectId("60322d3501cd70079c48cb67"),
"title" : "Final Destination II",
"year" : 2015,
"score" : 10,
"rating" : "PG-13",
"__v" : 0
}
{
"_id" : ObjectId("6190189ef5c8903629012fe1"),
"title" : "Fifty Shades of Grey",
"year" : 2015,
"score" : 10,
"rating" : "NC-17",
"__v" : 0
}
{
"_id" : ObjectId("6190189ef5c8903629012fe2"),
"title" : "Cars",
"year" : 2006,
"score" : 8,
"rating" : null,
"__v" : 0
}
{
"_id" : ObjectId("6190189ef5c8903629012fe3"),
"title" : "The Matrix",
"year" : 1999,
"score" : null,
"rating" : "R",
"__v" : 0
}
{
"_id" : ObjectId("61901f82f5c8903629012fe4"),
"title" : "Salt",
"year" : 2010,
"score" : 9,
"rating" : "",
"__v" : 0
}
{
"_id" : ObjectId("61901f82f5c8903629012fe5"),
"title" : "Knowing",
"year" : 2009,
"score" : 8,
"rating" : "",
"__v" : 0
}
{
"_id" : ObjectId("61924e7512800ff6d3639076"),
"title" : "The Revenant",
"year" : 2015,
"score" : 6,
"rating" : "R",
"__v" : 0
}
{
"_id" : ObjectId("61924e7512800ff6d3639077"),
"title" : "Maleficient: The Mistress of Evil",
"year" : 2019,
"score" : 10,
"rating" : "PG",
"__v" : 0
}
Using the distinct() method to get the unique values of the rating field:
> db.movies.distinct( "rating" )
[ null, "", "NC-17", "PG", "PG-13", "R" ]
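The call above passes only the field name. The query parameter from the syntax section narrows the set of documents first. The sketch below wraps that call in a small helper; ratingsForYear is a hypothetical name, and collection is assumed to be any object that exposes distinct(field, query), such as db.movies in the shell.

```javascript
// Hypothetical helper: distinct ratings among movies released in a given year.
// `collection` is assumed to expose distinct(field, query), e.g. db.movies in the Mongo shell.
function ratingsForYear(collection, year) {
  // The second argument is an ordinary query filter, like the one passed to find().
  return collection.distinct("rating", { year: year });
}

// In the Mongo shell:
//   ratingsForYear(db.movies, 2015)
```

In the sample collection the three movies from 2015 are rated PG-13, NC-17 and R, so those are the values this call should return; the PG movies from 2006 and 2019 are filtered out before the unique values are collected.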
Example 2: Returning Distinct Values for an Embedded Field
In this example, we will use the distinct() method on a MongoDB collection with a data model that includes embedded documents.
The collection we used in the previous example didn’t contain embedded documents. Hence, we will use a new collection as shown below:
> db.drones.find().pretty()
{
"_id" : ObjectId("61673f46b34f185eb7b2bf0c"),
"utility" : [
"Natural Resource Exploration",
"Remote sensing",
"Real estate and construction",
"Recreation",
"Delivery"
],
"onSale" : false,
"name" : "Nimbari Gryphon Medeta 65",
"price" : 77500,
"weight" : "77 kilograms",
"additionalDetails" : {
"material" : "carbon fiber",
"moreUses" : [
"Precision Agriculture",
"Land Inspection",
"Water Inspection",
"Cinematography"
]
}
}
{
"_id" : ObjectId("61673f46b34f185eb7b2bf0d"),
"utility" : [
"Natural Resource Exploration",
"Remote sensing",
"Real estate and construction",
"Recreation",
"Delivery"
],
"onSale" : false,
"name" : "X-Strimmer Eye",
"price" : 23500,
"weight" : "24 kilograms",
"additionalDetails" : {
"material" : "glass fiber",
"moreUses" : [
"Precision Agriculture",
"Cinematography"
]
}
}
{
"_id" : ObjectId("61673f46b34f185eb7b2bf0e"),
"utility" : [
"Natural Resource Exploration",
"Remote sensing",
"Real estate and construction",
"Recreation",
"Delivery"
],
"onSale" : false,
"name" : "Khai Balemosh Shefqa TRX",
"price" : 120500,
"weight" : "80 kilograms",
"additionalDetails" : {
"material" : "aluminum",
"moreUses" : [
"Precision Agriculture",
"Land Inspection"
]
}
}
{
"_id" : ObjectId("61673f46b34f185eb7b2bf0f"),
"utility" : [
"Natural Resource Exploration",
"Recreation",
"Delivery"
],
"onSale" : false,
"name" : "Sifinist Croma AX",
"price" : 99500,
"weight" : "97 kilograms",
"additionalDetails" : {
"material" : "lithium",
"moreUses" : [
"Precision Agriculture",
"Land Inspection",
"Water Inspection",
"Videography"
]
}
}
{
"_id" : ObjectId("61673f46b34f185eb7b2bf10"),
"utility" : [
"Remote sensing",
"Real estate and construction",
"Recreation"
],
"onSale" : false,
"name" : "Drovce Finnifield FR-7",
"price" : 87600,
"weight" : "13 kilograms",
"additionalDetails" : {
"material" : "polysterene",
"moreUses" : [
"Precision Agriculture",
"Land Inspection",
"Water Inspection",
"Videography"
]
}
}
Using the distinct() method to return unique values for an embedded field from the above collection:
> db.drones.distinct( "additionalDetails.material" )
[ "aluminum", "carbon fiber", "glass fiber", "lithium", "polysterene" ]
Example 3: Returning Unique Values for an Embedded Array Field
Let’s try to return unique values for an embedded array field using the distinct() method from the same collection since it already contains embedded documents and arrays:
> db.drones.distinct( "additionalDetails.moreUses" )
[
"Cinematography",
"Land Inspection",
"Precision Agriculture",
"Videography",
"Water Inspection"
]
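Dot notation also works inside the query argument, so an embedded array field can be combined with a filter on a sibling embedded field. A minimal sketch, where usesForMaterial is a hypothetical helper and collection is assumed to be something like db.drones:

```javascript
// Hypothetical helper: distinct extra uses among drones built from one material.
// `collection` is assumed to expose distinct(field, query), e.g. db.drones in the Mongo shell.
function usesForMaterial(collection, material) {
  return collection.distinct(
    "additionalDetails.moreUses",               // embedded array field
    { "additionalDetails.material": material }  // filter on an embedded field
  );
}

// In the Mongo shell:
//   usesForMaterial(db.drones, "glass fiber")
```

In the sample data only "X-Strimmer Eye" is made of glass fiber, so the call should yield just its two uses, Precision Agriculture and Cinematography.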
MongoDB distinct() returns only the unique field values, but sometimes you may want the full record, i.e., an entire document for each distinct value. For this, you can use the aggregation framework.
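One way to do that, sketched under assumptions, is to group on the field and keep one whole document per group. The helper name onePerValuePipeline is hypothetical; $group, $first, $$ROOT and $replaceRoot are standard aggregation features. Which document is kept for each value is not defined unless a $sort stage comes before the $group.

```javascript
// Hypothetical helper: builds a pipeline that keeps one full document per distinct value.
function onePerValuePipeline(field) {
  return [
    // Group by the field; $$ROOT refers to the whole input document.
    { $group: { _id: "$" + field, doc: { $first: "$$ROOT" } } },
    // Promote the saved document back to the top level.
    { $replaceRoot: { newRoot: "$doc" } }
  ];
}

// In the Mongo shell:
//   db.movies.aggregate(onePerValuePipeline("rating"))
```

Unlike distinct(), $group treats an array value as a single grouping key, so for an array field such as additionalDetails.moreUses a $unwind stage would be needed first to group by individual elements.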
Summary
In short, the MongoDB distinct() method returns an array of unique values for a specified field from the documents that satisfy the given query condition. When the field holds an array, each element is treated as a separate value. The method is useful for removing duplicate field values, but it cannot return full records. To get full documents, we can use the aggregation framework, particularly the $group stage. When the field is indexed, distinct() can often be answered from the index alone as a covered query, without scanning the full documents.