Location>code7788 >text

Three ways for mongo to filter arrays in documents

Popularity:303 ℃/2024-10-12 11:21:43

preamble

There are a variety of data types in mongo, common ones include:

data type (for) instance descriptive
String { "x" : "foot" } String. A common data type for storing data. In MongoDB, UTF-8 encoded strings are legal.
Integer { "x" : 1 } Integer value. Used to store values. Depending on the server you are using, it can be 32-bit or 64-bit.
Object { "x" : { "y" : "foot" } } For inline documentation
Array { "x" : [ "a" , "b" ] } Used to store an array or list or multiple values as a key.

There is a very common query that filters some data in an array and returns only the data that meets the requirements. The data is as follows, keep the record with vehicle=train in travel below, filter out the other elements, and return the entire document.

{
    "name": "tom",
    "travel": [
        {
            "vehicle" : "train",
            "city" : "Beijing, capital of People's *"
        },
        {
            "vehicle" : "plane",
            "city" : "Shanghai"
        },
        {
            "vehicle" : "train",
            "city" : "Shenzhen subprovincial city in Guangdong, special economic zone close *"
        }
    ]
}

There are three ways to try to implement filtering of arrays, including:

  1. Aggregate Queries Use$unwindcommander-in-chief (military)travelThe array is broken up, and the result set is obtained using the$matchFilter the eligible data and finally use the$groupPerform aggregation to get the final result set
  2. Aggregate Queries Use$matchFilter the result set of eligible root documents and then use the$project returns the corresponding field at the same time as thetravelArrays using the$filterPerform internal filtering to return the final result set
  3. Ordinary query Filter the records first, then filter the array by the projection query

The following is an analysis of whether these three methods can fulfill the requirements.

Add Data

Suppose there are two records, each of which is information about a person, including name, occupation, and cities traveled. The cities traveled to is an array containing the names of the cities as well as the means of transportation.

({
    "uid" : "1000001",
    "name" : "zhangsan",
    "job": "coder",
    "travel" : [
        {
            "vehicle" : "train",
            "city" : "Beijing, capital of People's *"
        },
        {
            "vehicle" : "plane",
            "city" : "Shanghai"
        },
        {
            "vehicle" : "train",
            "city" : "Shenzhen subprovincial city in Guangdong, special economic zone close *"
        }
    ]
})
({

    "uid" : "1000002",
    "name" : "lisi",
    "job": "coder",
    "travel" : [
        {
            "vehicle" : "plane",
            "city" : "Beijing, capital of People's *"
        },
        {
            "vehicle" : "car",
            "city" : "Shanghai"
        },
        {
            "vehicle" : "train",
            "city" : "Shenzhen subprovincial city in Guangdong, special economic zone close *"
        }
    ]
})
()
{ _id: ObjectId("6708d3e646d2075ca11e88ce"),
  uid: '1000001',
  name: 'zhangsan',
  job: 'coder',
  travel:
   [ { vehicle: 'train', city: 'Beijing, capital of People's *' },
     { vehicle: 'plane', city: 'Shanghai' },
     { vehicle: 'train', city: 'Shenzhen subprovincial city in Guangdong, special economic zone close *' } ] }
{ _id: ObjectId("6708d3f646d2075ca11e88cf"),
  uid: '1000002',
  name: 'lisi',
  job: 'coder',
  travel:
   [ { vehicle: 'plane', city: 'Beijing, capital of People's *' },
     { vehicle: 'car', city: 'Shanghai' },
     { vehicle: 'train', city: 'Shenzhen subprovincial city in Guangdong, special economic zone close *' } ] }

Validation of three methods

Statement of Requirements

Now the goal is to filter out all the records where the city is traveled by train, i.e., the records with vehicle=train in the travel array, and filter out the non-target records.

Method 1

Method 1: Use$unwindcommander-in-chief (military)travelThe array is broken up, and the result set is obtained using thematchFilter the eligible data and finally use the$groupPerform aggregation to obtain the final result set.

('test').aggregate(
    [
        {   
            $unwind: "$travel" 
        },
        { 
            $match : {
                "job":"coder", 
                "": "train" 
            } 
        },
        { 
            $group : { 
                "_id" : "$uid", 
                "travel": { $push: "$travel" } 
            } 
        } 
    ]
)

Results:

{ _id: '1000002', travel: [ { vehicle: 'train', city: 'Shenzhen subprovincial city in Guangdong, special economic zone close *' } ] }
{ _id: '1000001', travel: [ { vehicle: 'train', city: 'Beijing, capital of People's *' }, { vehicle: 'train', city: 'Shenzhen subprovincial city in Guangdong, special economic zone close *' } ] }

Analysis:

unwind can split an array, e.g. unwind has the following effect:

{ _id: ObjectId("6708d3e646d2075ca11e88ce"),
  uid: '1000001',
  name: 'zhangsan',
  job: 'coder',
  travel: { vehicle: 'train', city: 'Beijing, capital of People's *' } }
{ _id: ObjectId("6708d3e646d2075ca11e88ce"),
  uid: '1000001',
  name: 'zhangsan',
  job: 'coder',
  travel: { vehicle: 'plane', city: 'Shanghai' } }
{ _id: ObjectId("6708d3e646d2075ca11e88ce"),
  uid: '1000001',
  name: 'zhangsan',
  job: 'coder',
  travel: { vehicle: 'train', city: 'Shenzhen subprovincial city in Guangdong, special economic zone close *' } }
{ _id: ObjectId("6708d3f646d2075ca11e88cf"),
  uid: '1000002',
  name: 'lisi',
  job: 'coder',
  travel: { vehicle: 'plane', city: 'Beijing, capital of People's *' } }
{ _id: ObjectId("6708d3f646d2075ca11e88cf"),
  uid: '1000002',
  name: 'lisi',
  job: 'coder',
  travel: { vehicle: 'car', city: 'Shanghai' } }
{ _id: ObjectId("6708d3f646d2075ca11e88cf"),
  uid: '1000002',
  name: 'lisi',
  job: 'coder',
  travel: { vehicle: 'train', city: 'Shenzhen subprovincial city in Guangdong, special economic zone close *' } }

Then filter the eligible data by match

{ _id: ObjectId("6708d3e646d2075ca11e88ce"),
  uid: '1000001',
  name: 'zhangsan',
  job: 'coder',
  travel: { vehicle: 'train', city: 'Beijing, capital of People's *' } }
{ _id: ObjectId("6708d3e646d2075ca11e88ce"),
  uid: '1000001',
  name: 'zhangsan',
  job: 'coder',
  travel: { vehicle: 'train', city: 'Shenzhen subprovincial city in Guangdong, special economic zone close *' } }
{ _id: ObjectId("6708d3f646d2075ca11e88cf"),
  uid: '1000002',
  name: 'lisi',
  job: 'coder',
  travel: { vehicle: 'train', city: 'Shenzhen subprovincial city in Guangdong, special economic zone close *' } }

Finally, aggregation is performed through groups, using _id as the aggregation dependency and merging data with the same _id.

Summary:

This method is able to achieve the requirements of filtering arrays, but there is a problem, split the array is relatively simple, want to merge it up again is not easy. group can only be aggregated with a certain variable as the base, the other variables will be lost. For example, the final result only retains the _id and travel, other variables are lost.

Method II

Method 2: Use$matchFilter the result set of eligible root documents and then use the$projectWhile returning the corresponding field in thetravelArrays using the$filterPerform internal filtering to return the final result set

('test').aggregate(
    [
        { 
            $match : { "job": "coder" } 
        },
        {
            $project: {
                "uid": 1,
                "name": 1,
                "travel": {
                    $filter: {
                        input: "$travel",
                        as: "item",
                        cond: { $eq : ["$$","train"] }
                    }
                }
            }
        }
    ]
)

Analysis of results:

{ _id: ObjectId("6708d3e646d2075ca11e88ce"),
  uid: '1000001',
  name: 'zhangsan',
  travel: [ { vehicle: 'train', city: 'Beijing, capital of People's *' },{ vehicle: 'train', city: 'Shenzhen subprovincial city in Guangdong, special economic zone close *' } ] }
{ _id: ObjectId("6708d3f646d2075ca11e88cf"),
  uid: '1000002',
  name: 'lisi',
  travel: [ { vehicle: 'train', city: 'Shenzhen subprovincial city in Guangdong, special economic zone close *' } ] }

Analysis:

There are two types of queries in mongo: normal queries and advanced queries. Advanced queries include aggregate queries, which are implemented with theaggregate keyword.

MongoDB's aggregation pipeline takes a MongoDB document that has been processed in one pipeline and passes the results to the next pipeline for processing. Pipeline operations are repeatable.

Here we describe a few common operations used in the aggregation framework:

  • $project: Modifies the structure of the input document. Can be used to rename, add or remove fields, and to create calculations as well as nested documents.
  • $match: Used to filter data and output only documents that match the conditions. $match uses MongoDB's standard query operations.
  • $limit: Used to limit the number of documents returned by the MongoDB aggregation pipeline.
  • $skip: Skips the specified number of documents in the aggregation pipeline and returns the remaining documents.
  • $unwind: Splits an array type field in a document into multiple entries, each containing a value from the array.
  • $group: Groups documents in a collection and can be used for statistical results.
  • $sort: Sort the input document and output it.
  • $geoNear: Outputs ordered documents close to a geographic location.

Here first all jobs=coder are filtered using match and then the structure of the output is modified using project. Filter is used in project to filter the elements in the array.

The definition of filter is as follows:

Selects a subset of the array to be returned based on the specified condition. Returns an array containing only those elements that match the condition. The elements returned are in their original order.

$filter has the following syntax:

{ $filter: { input: <array>, as: <string>, cond: <expression> } }
realm norm
input Parses into an array ofdisplayed formula
as Optional. Represents each individual element of the array'svariantname (of a thing)<u><font style="color:rgb(199, 37, 78);background-color:rgb(249, 242, 244);">input</font></u>. If no name is specified, the variable name defaults to<u><font style="color:rgb(199, 37, 78);background-color:rgb(249, 242, 244);">this</font></u>
cond ought todisplayed formulacan be parsed as a boolean value that is used to determine whether elements should be included in the output array. The expression<u><font style="color:rgb(199, 37, 78);background-color:rgb(249, 242, 244);">input</font></u>Each element of the array is referenced separately using the variable names specified in the<u><font style="color:rgb(199, 37, 78);background-color:rgb(249, 242, 244);">as</font></u>

/manual/reference/operator/aggregation/filter/

At cond leave the element with vehicle=train and exclude the others.

Summary:

This approach accomplishes the query goal of both filtering out elements of the array and returning the complete document.

Method III

Method Three:

With a projection query, the eligible records are first selected, and in by using the projection operator, the fields that need to be returned, as well as excluding specific fields.

(
      {
         job: "coder"
      }, 
      {  
          uid: 1, 
          name: 1, 
          travel: {
             $filter: {
                input: "$travel",
                as: "item",
                cond: { $eq : ["$$","train"] }
             } 
          } 
      }
)

Results:

{ _id: ObjectId("6708d3e646d2075ca11e88ce"),
  uid: '1000001',
  name: 'zhangsan',
  travel:
   [ { vehicle: 'train', city: 'Beijing, capital of People's *' },
     { vehicle: 'train', city: 'Shenzhen subprovincial city in Guangdong, special economic zone close *' } ] }
{ _id: ObjectId("6708d3f646d2075ca11e88cf"),
  uid: '1000002',
  name: 'lisi',
  travel: [ { vehicle: 'train', city: 'Shenzhen subprovincial city in Guangdong, special economic zone close *' } ] }

Analysis:

What is a projection query?

In MongoDB, a projection query is a query operation that is used to selectively return fields from a document. By using the projection operator, we can specify the fields to be returned and whether to exclude specific fields.

The projection query syntax is shown below:

({ <query> }, { <projection> })

Among them. is a query expression used to filter the documents that satisfy the conditions. is an optional parameter that specifies the field to be returned.

Retain fields, exclude fields in PROJECTION,Select or exclude specific elements of an array. Utilizing the property of selecting or excluding specific elements in an array can also serve the purpose.

Example:

If we only want to return the first tag in each document, we can do so:

({}, { tags: { $slice: 1 } })

In this post the array is filtered by the filter method to keep the elements that match the conditions.

Summary:

This method accomplishes the query goal and is a clean implementation with low common query complexity and without much use of keywords.

reference document

/mongodb/mongodb-questions/393_mongodb_mongo_query_with_projection.html

/a/1190000016629733

/manual/reference/operator/aggregation/filter/

/weixin_44009447/article/details/115479348