MongoDB Study Notes

These notes cover MongoDB in detail: core concepts, software structure, basic operations, schemas, data types, relations, indexing, aggregation, and transactions.

MongoDB?

MongoDB concepts that map onto RDBMS concepts

| RDBMS | MongoDB |
| --- | --- |
| Database | Database |
| Table | Collection |
| Row (record) | Document |

The difference between an RDBMS row and a MongoDB document is that a document is schemaless. Mongoose lets you define a Schema and manage documents in a structured way, but a MongoDB collection itself is schemaless by default.

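Software Structure

install_compass: installs Compass, the GUI tool
mongod: the executable that actually runs the MongoDB server
mongo: the CLI shell executable (newer MongoDB releases ship mongosh in its place)

Driver

As with RDBMS drivers, MongoDB provides drivers for many programming languages. See the MongoDB documentation.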

Basic Operations

DROP DATABASE

db.dropDatabase()

DROP COLLECTION

db.myCollection.drop()

insertOne(data, options) MongoDB accepts and displays data as JSON-like documents, but internally it stores them as BSON, a binary encoding of JSON. BSON can also store finer-grained number types that JSON does not distinguish.

db.flightData.insertOne({
  aircraft: "Airbus A380",
  distance: 12000,
  departAirport: "GIMPO"
})
...
{
  acknowledged: true,
  insertedId: ObjectId("61caf9a5735da54142345461")
}
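
To see why the finer-grained number types matter, here is a small plain-JavaScript sketch (no MongoDB connection involved). It only relies on the fact that JSON numbers parsed by JavaScript are double-precision floats; the sample value is chosen for illustration, and BigInt merely stands in for a 64-bit integer type such as BSON's int64.

```javascript
// JSON numbers parsed by JavaScript are IEEE-754 doubles, so integers above
// Number.MAX_SAFE_INTEGER (2^53 - 1) cannot all be represented exactly.
const parsed = JSON.parse('{"big": 9007199254740993}');
const bigSurvivesJson = String(parsed.big) === "9007199254740993";

// A 64-bit integer type keeps the exact value; BigInt stands in for it here.
const exact = BigInt("9007199254740993");
const bigSurvivesInt64 = exact.toString() === "9007199254740993";

console.log(bigSurvivesJson, bigSurvivesInt64);
```

In the shell, a value like this would be passed in string form, for example NumberLong("9007199254740993"), so that it is not rounded before it reaches the server.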

To use a custom ID, set the _id field yourself.

db.flightData.insertOne({
  _id: "id1",
  aircraft: "Airbus A380",
  distance: 12000,
  departAirport: "GIMPO"
})
...
{ acknowledged: true, insertedId: 'id1' }

insertMany(data, options)

db.flightData.insertMany([{
  aircraft: "Airbus A380",
  distance: 12000,
  departAirport: "GIMPO"
},{
  aircraft: "Airbus A381",
  distance: 12001,
  departAirport: "INCHEON"
}])
...
{
  acknowledged: true,
  insertedIds: {
    '0': ObjectId("61cb0535735da54142345463"),
    '1': ObjectId("61cb0535735da54142345464")
  }
}

find(filter, options) find does not hand back all of the data at once; it returns a Cursor object. (In the shell, the cursor prints 20 documents at a time by default.)

// find all data
db.flightData.find()
db.flightData.find({distance: 12000}).pretty()
db.flightData.find({distance: {$gt: 10000}}).pretty()

// Collects every document into an array.
db.passengers.find().toArray()
// Application code can iterate like this. The documents are pulled through the cursor instead of being loaded all at once, which is more efficient.
db.passengers.find().forEach((passengerData) => { ... });

...

[
  {
    _id: 'id1',
    aircraft: 'delete',
    distance: 12000,
    departAirport: 'GIMPO',
    marker: 'delete'
  },
  {
    _id: ObjectId("61cb03f9735da54142345462"),
    aircraft: 'Airbus A380',
    distance: 12000,
    departAirport: 'GIMPO'
  },
  {
    _id: ObjectId("61cb0535735da54142345463"),
    aircraft: 'Airbus A380',
    distance: 12000,
    departAirport: 'GIMPO'
  }
]
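
The lazy behaviour of a cursor can be sketched with a generator in plain JavaScript. This is only a model under stated assumptions: cursorOver and fetchBatch are hypothetical names, not driver APIs, and the batch size of 20 simply mirrors the shell default mentioned above.

```javascript
// Hypothetical stand-in for a cursor: yields documents lazily, asking
// `fetchBatch` (an assumed helper) for one batch at a time.
function* cursorOver(fetchBatch, batchSize) {
  let offset = 0;
  while (true) {
    const batch = fetchBatch(offset, batchSize);
    if (batch.length === 0) return;
    yield* batch;
    offset += batch.length;
  }
}

const passengers = Array.from({ length: 45 }, (_, i) => ({ _id: i }));
let batchesFetched = 0;
const fetchBatch = (offset, size) => {
  batchesFetched += 1;
  return passengers.slice(offset, offset + size);
};

// Reading only the first 5 documents touches a single batch.
const firstFive = [];
for (const doc of cursorOver(fetchBatch, 20)) {
  firstFive.push(doc);
  if (firstFive.length === 5) break;
}
console.log(firstFive.length, batchesFetched);
```

Collecting everything, as toArray() does, would instead walk through every batch before returning.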

findOne(filter, options)
update(filter, data, options) Unlike updateOne and updateMany, update accepts a second argument that contains no $ operators. In that case the matched document is replaced with the second argument as-is. (Using replaceOne is the safer way to do this.)

db.flightData.update({distance: 12000}, {marker: "delete"})

...

{
  marker: "delete"
}

updateOne(filter, data, options)

// To set a field, use the $set operator.
// $set updates the key if it exists and adds it if it does not.
db.flightData.updateOne({distance: 12000}, {$set: {marker: "delete"}})

...

{
  acknowledged: true,
  insertedId: null,
  matchedCount: 1,
  modifiedCount: 1,
  upsertedCount: 0
}
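
The difference between replacing a document and applying $set can be modelled in a few lines of plain JavaScript. This is a simplified sketch, not how the server implements it: replaceDocument and applySet are hypothetical helpers, and only top-level fields are handled.

```javascript
// Replacement semantics: every field except _id is discarded.
function replaceDocument(doc, replacement) {
  return { _id: doc._id, ...replacement };
}

// $set semantics: listed fields are added or overwritten, the rest stay.
function applySet(doc, fields) {
  return { ...doc, ...fields };
}

const flight = { _id: "id1", aircraft: "Airbus A380", distance: 12000 };
const replaced = replaceDocument(flight, { marker: "delete" });
const patched = applySet(flight, { marker: "delete" });

console.log(replaced); // aircraft and distance are gone
console.log(patched);  // aircraft and distance are kept
```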

updateMany(filter, data, options)
replaceOne(filter, data, options)
deleteOne(filter, options)

db.flightData.deleteOne({ departAirport: "GIMPO" })
db.flightData.deleteOne({ _id: "my-id" })
...
{ acknowledged: true, deletedCount: 1 }

deleteMany(filter, options)

// Delete All
db.flightData.deleteMany({ })
db.flightData.deleteMany({ key: "value" })

Projection Use a projection when a query should return only the fields you need, or return the data in a reshaped form.

db.passengers.find(
    // where
    {},
    {
        // 1 includes the name field
        name: 1,
        // 0 excludes the _id field
        _id: 0
    }
)

Embedded Document & Array

A document can contain nested (embedded) documents. Embedded documents come with the following limits.

Up to 100 levels of nesting
Max 16 MB per document

How to find structured data

{
    name: "Albert Twostone",
    hobbies: ["sports", "cooking"]
}
// Equality on an array field searches the values inside the array.
db.passengers.find({hobbies: "sports"})
/*
{
    _id: ObjectId("61cb0535735da54142345464"),
    aircraft: 'Airbus A381',
    distance: 12001,
    departAirport: 'INCHEON',
    status: { description: 'on-time' }
}
*/
// To query a field of a nested document, use a quoted dot-notation path ("parent.child").
db.flightData.find({"status.description": "on-time"})
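
As a rough mental model of the two lookups above, the following plain-JavaScript sketch resolves a dotted path and treats equality against an array as "contains". getPath and matchesEquality are hypothetical helpers; real MongoDB matching covers more cases (for example, dotted paths that pass through arrays of sub-documents), which this sketch deliberately ignores.

```javascript
// Resolve a dotted path such as "status.description" against a document.
function getPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Equality match that also looks inside arrays, like {hobbies: "sports"}.
function matchesEquality(doc, path, expected) {
  const value = getPath(doc, path);
  return Array.isArray(value) ? value.includes(expected) : value === expected;
}

const passenger = { name: "Albert Twostone", hobbies: ["sports", "cooking"] };
const flight = { aircraft: "Airbus A381", status: { description: "on-time" } };

console.log(matchesEquality(passenger, "hobbies", "sports"));
console.log(matchesEquality(flight, "status.description", "on-time"));
console.log(matchesEquality(flight, "status.missing", "on-time"));
```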

Schema

MongoDB is schemaless by default, but a schema is still used to keep documents maintainable. It seems similar to choosing TypeScript in strict mode even though vanilla JavaScript would work. How strictly to define the schema probably depends on the developer's preferences and the nature of the project.

Maximizing flexibility

{
    title: "book",
    price: 10000
},
{
    name: "book",
    itemPrice: 10000,
    description: "~~"
}

Fixing only a basic schema shape while placing few restrictions on storing excess properties

{
    title: "book",
    price: 10000
}
{
    title: "book2",
    price: 11000,
    description: "Excess Field"
}

Managing the schema strictly

{
    title: "book",
    price: 10000
}
{
    title: "book2",
    price: 11000
}

Data Type

Text
- "Heejae"
Boolean
- true
Number
- NumberInt (int32)
  - 55
- NumberLong (int64)
  - 10000000000
- NumberDecimal
  - 12.99
ObjectId
- ObjectId("61d1042426f2c7a99df87d63")
ISODate
- ISODate("2022-01-02")
Timestamp
- Timestamp(11421532)
Embedded Document
Arrays

Relations

Nested/Embedded Documents

{
    userName: "max",
    address: {
        street: "Second Street"
    }
}

References

// Customers
{
    userName: "max",
    favoriteBooks: ["bookId1", "bookId2"]
}
// Books
{
    _id: "bookId1",
    name: "Star Wars "
}

Relations can be modeled with the two approaches above, and each one fits different cases.

| Embedded Documents | References |
| --- | --- |
| Group data together logically | Split data across collections |
| Great for data that belongs together and is not really overlapping with other data | Great for related but shared data as well as for data which is used in relations and standalone |
| Avoid super deep nesting (100+ levels) or extremely long arrays (16 MB document limit) | Allows you to overcome nesting and size limits |

OneToOne Relation (Embedded) When data belongs to exactly one entity and is never separated from it, designing the collection with an embedded relation is efficient. For example, a "patient" and that patient's "treatment summary" are better stored as embedded data. Splitting them into two collections would require two queries to fetch the data, which is inefficient.
OneToOne Relation (Reference) When two documents have a 1:1 relation but the link can change over time, a reference is the more efficient design. With an embedded design, the related data would have to be rewritten every time the relation changes. Example: Person and Car.
OneToMany Relation (Embedded) When child data belongs to one document in a one-to-many relation and the relation between the two entities will not change, an embedded design is efficient. Example: Question and Answer.
OneToMany Relation (Reference) As in the examples above, when two entities can be linked and unlinked freely, a reference is efficient. Example: City and Citizen. If citizens were embedded in a city, the data would have to be moved every time a citizen relocates, and a city with a large population could exceed MongoDB's document size limit (16 MB).
ManyToMany Relation (Embedded) Take Customer and Product as an example. In a SQL database this would be modeled with three tables: Products, Customers, and Orders. In a NoSQL collection, the orders can be stored embedded in the customer.

db.customers.insertOne({ name: "Heejae", orders: [ {productId: ObjectId("abcde"), quantity: 1} ] })

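What are the downsides of this embedded approach? Data can be duplicated, and when a product's name changes, every document that embeds it has to be updated. Sometimes, though, this can be turned into an advantage: the embedded copy is a snapshot of the data, so the values as they were at that moment are kept permanently.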
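
The snapshot effect can be illustrated with plain JavaScript objects. This is a sketch under assumptions: the productName field copied into the order and the resolveName helper are hypothetical, introduced only to contrast an embedded copy with a reference that is resolved on read.

```javascript
const products = new Map([["p1", { _id: "p1", name: "Star Wars" }]]);

// Embedded: the order copies the product name at purchase time.
const embeddedOrder = {
  productId: "p1",
  productName: products.get("p1").name,
  quantity: 1,
};

// Reference: the order stores only the id and resolves the name on read.
const referencedOrder = { productId: "p1", quantity: 1 };
const resolveName = (order) => products.get(order.productId).name;

// The product is renamed after the order was placed.
products.get("p1").name = "Star Wars: A New Hope";

console.log(embeddedOrder.productName);    // still the name at purchase time
console.log(resolveName(referencedOrder)); // the current name
```

Which behaviour is preferable depends on the use case: an order history usually wants the snapshot, while a product catalogue usually wants the current value.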

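ManyToMany Relation (Reference) Consider storing Books and Authors. If the author data is embedded, then whenever an author's information changes you have to find every Book that embeds it and rewrite the author information there. To sum up, when the related data is accessed (and especially updated) rarely, an embedded design often works well; when it is accessed frequently, a reference design is often the better choice.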

$lookup

Like a JOIN in SQL, documents that are related to each other can be combined through aggregate.

db.books.aggregate([
  {
    $lookup: {
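      // Collection to join with
      from: "authors", 
      // Field on the book document
      localField: "authors",
      // Field on the joined collection used as the key
      foreignField: "_id",
      // Output array field; $lookup requires it (the name here is only an example)
      as: "authorDetails",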
    }
  }
])
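
What this stage produces can be approximated in plain JavaScript. The sketch below is a simplified model of a left-outer-join style lookup where localField holds an array of ids; the lookup function, the sample documents, and the output field name authorDetails are all assumptions for illustration.

```javascript
// Simplified model of $lookup: attach all matching foreign documents
// to each local document under the `as` field.
function lookup(localDocs, foreignDocs, { localField, foreignField, as }) {
  return localDocs.map((doc) => {
    const local = doc[localField];
    const keys = Array.isArray(local) ? local : [local];
    const matches = foreignDocs.filter((f) => keys.includes(f[foreignField]));
    return { ...doc, [as]: matches };
  });
}

const books = [
  { _id: "b1", authors: ["a1", "a2"] },
  { _id: "b2", authors: ["a9"] }, // no such author: joined array stays empty
];
const authors = [
  { _id: "a1", name: "Max" },
  { _id: "a2", name: "Manuel" },
];

const joined = lookup(books, authors, {
  localField: "authors",
  foreignField: "_id",
  as: "authorDetails",
});
console.log(JSON.stringify(joined, null, 2));
```

Note that a book without matching authors is still returned, with an empty array, which is what makes this behave like a left outer join.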

Schema Validation

MongoDB is schemaless by default, but Schema Validation can be configured to keep documents consistent.

| validationLevel | validationAction |
| --- | --- |
| Which documents get validated? | What happens if validation fails? |
| strict / moderate | error / warn |

Example: adding Schema Validation while creating a collection

db.createCollection("posts", { 
  validator: { 
    $jsonSchema: { 
      bsonType: "object", 
      required: ["title", "text", "creator", "comments"], 
      properties: {
        title: {
          bsonType: "string",
          description: "must be a string and is required"
        },
        text: {
          bsonType: "string",
          description: "must be a string and is required"
        },
        creator: {
          bsonType: "objectId",
          description: "must be a objectId and is required"
        },
        comments: {
          bsonType: "array",
          description: "must be a array and is required",
          items: {
            bsonType: "object",
            required: ["text", "author"],
            properties: {
              text: {
                bsonType: "string",
                description: "must be a string and is required"
              },
              author: {
                bsonType: "objectId",
                description: "must be an objectId and is required"
              }
            }
          }
        }
      }
    }
  },
  validationAction: "warn"
})

Adding Schema Validation to an existing collection

db.runCommand({
  collMod: "posts",
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["title", "text", "creator", "comments"],
      properties: {
        title: {
          bsonType: "string",
          description: "must be a string and is required"
        },
        text: {
          bsonType: "string",
          description: "must be a string and is required"
        },
        creator: {
          bsonType: "objectId",
          description: "must be a objectId and is required"
        },
        comments: {
          bsonType: "array",
          description: "must be a array and is required",
          items: {
            bsonType: "object",
            required: ["text", "author"],
            properties: {
              text: {
                bsonType: "string",
                description: "must be a string and is required"
              },
              author: {
                bsonType: "objectId",
                description: "must be an objectId and is required"
              }
            }
          }
        }
      }
    }
  },
  validationAction: "warn"
})

Create

| API | Example | Description |
| --- | --- | --- |
| insertOne() | db.collectionName.insertOne({field: "value"}) | Inserts a single document |
| insertMany() | db.collectionName.insertMany([{}, {}]) | Inserts one or more documents |
| insert() | db.collectionName.insert() | Can insert one or more documents, but is not recommended |

Ordered Insert

db.hobbies.insertMany([{_id: "sports", name: "Sports"}])
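// ordered defaults to true. The inserts then run one at a time in order, and if one fails the inserts after it are not attempted (documents inserted before the failure stay).
// With ordered: false the inserts may run in parallel, and a failed insert does not stop the remaining ones.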
db.hobbies.insertMany([{_id: "sports", name: "Sports"}], {ordered: false})
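
The effect of the ordered option can be modelled with an in-memory Map keyed by _id. This is only a sketch of the behaviour described in the comments above: the insertMany function here is a hypothetical stand-in, it runs sequentially in both modes, and its return shape is made up for the example.

```javascript
// In-memory stand-in for insertMany on a collection with a unique _id.
function insertMany(collection, docs, { ordered = true } = {}) {
  const errors = [];
  let insertedCount = 0;
  for (const doc of docs) {
    if (collection.has(doc._id)) {
      errors.push(`duplicate key: ${doc._id}`);
      if (ordered) break; // ordered: stop at the first failure
      continue;           // unordered: keep going with the rest
    }
    collection.set(doc._id, doc);
    insertedCount += 1;
  }
  return { insertedCount, errors };
}

const docs = [{ _id: "yoga" }, { _id: "sports" }, { _id: "hiking" }];
// "sports" already exists in both collections, so the second insert fails.
const orderedResult = insertMany(new Map([["sports", {}]]), docs);
const unorderedResult = insertMany(new Map([["sports", {}]]), docs, { ordered: false });

console.log(orderedResult.insertedCount, unorderedResult.insertedCount);
```

In the ordered run, "yoga" stays inserted even though the batch stopped; nothing is rolled back.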

Write Concern

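// Summary: the level of guarantee for a write
// w: how many replica set members must acknowledge the write before it counts as successful.
// j: when true, the write counts as complete only after it has been written to the on-disk journal.
// wtimeout: time limit in milliseconds for satisfying the write concern (for example, while waiting for replication from the primary to secondaries). If the limit is exceeded, an error is returned even if the data was actually written on the primary.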

db.persons.insertOne({name: "Heejae", age: 29}, {writeConcern: {w: 1, j: true, wtimeout: 1}})

Read

Comparison Operator

Equal Operator

db.movies.find({runtime: {$eq: 60}})

Not Equal Operator

db.movies.find({runtime: {$ne: 60}})

Lower Than Operator

db.movies.find({runtime: {$lt: 60}})
db.movies.find({runtime: {$lte: 60}})

Greater Than Operator

db.movies.find({runtime: {$gt: 60}})
db.movies.find({runtime: {$gte: 60}})

Embedded Fields Operator

db.movies.find({"rating.average": {$gt: 7.0}})
// Also matches documents whose array contains "Drama", e.g. ["Drama", "Comedy"]
db.movies.find({genres: "Drama"})
// Matches only documents whose genres array is exactly ["Drama"]
db.movies.find({genres: ["Drama"]})

$in and $nin

db.movies.find({runtime: {$in: [30, 42]}})
db.movies.find({runtime: {$nin: [30, 42]}})

$or and $nor

db.movies.find({$or: [{"rating.average": {$lt: 5}}, {"rating.average": {$gt: 9}}]})
db.movies.find({$nor: [{"rating.average": {$lt: 5}}, {"rating.average": {$gt: 9}}]})

$and

db.movies.find({$and: [{"rating.average": {$lt: 5}}, {"genres": "Drama"}]})

$not

db.movies.find({runtime: {$not: {$eq: 5}}})
// The query above behaves the same as the one below.
db.movies.find({runtime: {$ne: 5}})

$exists

db.users.find({age: {$exists: true, $gt: 29}})
// A common way to check that the field holds an actual value:
db.users.find({age: {$exists: true, $ne: null}})
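
The distinction between a missing field and a null field is easy to get wrong, so here is a plain-JavaScript sketch of the three predicates involved. The helper names (matchesNull, exists, hasValue) and the sample users are assumptions for illustration; they model the documented matching rules rather than call MongoDB.

```javascript
// {age: null} in a filter matches documents where age is null OR missing.
const matchesNull = (doc, field) => !(field in doc) || doc[field] === null;
// {age: {$exists: true}} matches any document that has the field, even if null.
const exists = (doc, field) => field in doc;
// {age: {$exists: true, $ne: null}}: the field is present and holds a value.
const hasValue = (doc, field) => exists(doc, field) && doc[field] !== null;

const users = [{ name: "A", age: 30 }, { name: "B", age: null }, { name: "C" }];
const names = (pred) => users.filter((u) => pred(u, "age")).map((u) => u.name);

console.log(names(matchesNull)); // null or missing
console.log(names(exists));      // present, including null
console.log(names(hasValue));    // present with a real value
```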

$type

// In the shell, plain JavaScript numbers are stored as doubles by default, so searching with "double" can also return them; "number" matches any numeric type.
db.users.find({phone: {$type: "number"}})

$regex

db.movies.findOne({summary: {$regex: /musical/}})

$expr

// Returns the documents whose volume is greater than their target.
db.sales.find({$expr: {$gt: ["$volume", "$target"]}})
// Uses volume - 10 when volume is 190 or more, otherwise volume itself, and returns the documents where that value is greater than target.
db.sales.find({$expr: {$gt: [{$cond: {if: {$gte: ["$volume", 190]}, then: {$subtract: ["$volume", 10]}, else: "$volume"}}, "$target"]}})
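
To check the logic of the $cond expression, it can be rewritten as an ordinary JavaScript predicate and applied to a few made-up documents. The passes function and the sample sales data are assumptions for illustration only.

```javascript
// Mirrors: $gt: [ (volume >= 190 ? volume - 10 : volume), target ]
const passes = ({ volume, target }) =>
  (volume >= 190 ? volume - 10 : volume) > target;

const sales = [
  { volume: 195, target: 190 }, // adjusted to 185, not above 190
  { volume: 200, target: 180 }, // adjusted to 190, above 180
  { volume: 100, target: 90 },  // unadjusted, above 90
  { volume: 89, target: 100 },  // unadjusted, not above 100
];
const results = sales.map(passes);
console.log(results);
```

The first document shows why the adjustment matters: it would pass the simple volume-versus-target comparison but fails once 10 is subtracted.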

$size

db.users.insertOne({name: "Chris", hobbies: ["Sports", "Cooking", "Hiking"]})
db.users.find({hobbies: {$size: 3}})

// This matches movies whose genre array is exactly ["action", "thriller"]: "action" at index 0 and "thriller" at index 1 (order matters).
db.movieStarts.find({genre: ["action", "thriller"]})
// This matches movies whose genre contains both "action" and "thriller", in any order.
db.movieStarts.find({genre: {$all: ["action", "thriller"]}})

// To find users whose "Sports" hobby has a frequency of 3 or more, the query below does not work as intended.
// The two conditions may be satisfied by different array elements, so a user also matches when some other hobby has frequency >= 3.
db.users.find({$and: [{"hobbies.title": "Sports"}, {"hobbies.frequency": {$gte: 3}}]})
// To apply all conditions to the same array element, like a subquery, use $elemMatch.
db.users.find({hobbies: {$elemMatch: {title: "Sports", frequency: {$gte: 3}}}})
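
The difference between the two queries can be reproduced with Array.prototype.some in plain JavaScript. The sample users are made up; the point of the sketch is that the dotted form checks each condition against any element, while the $elemMatch form requires a single element to satisfy both.

```javascript
const users = [
  { name: "Max", hobbies: [{ title: "Sports", frequency: 2 }, { title: "Cooking", frequency: 6 }] },
  { name: "Anna", hobbies: [{ title: "Sports", frequency: 3 }] },
];

// Dotted conditions: each may be satisfied by a different array element.
const dotted = users
  .filter((u) =>
    u.hobbies.some((h) => h.title === "Sports") &&
    u.hobbies.some((h) => h.frequency >= 3))
  .map((u) => u.name);

// $elemMatch: one element must satisfy all conditions at once.
const elemMatch = users
  .filter((u) => u.hobbies.some((h) => h.title === "Sports" && h.frequency >= 3))
  .map((u) => u.name);

console.log(dotted, elemMatch);
```

Max appears only in the dotted result: his Sports frequency is 2, but his Cooking hobby satisfies the frequency condition.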

Cursor

In the shell, a cursor shows only 20 documents at a time by default. To get the next document, call next(); whether more documents remain can be checked with hasNext(). Sorting is done with sort().

// Ascending
db.movies.find().sort({"rating.average": 1})
// Descending
db.movies.find().sort({"rating.average": -1})

// Sorting by two properties
// Ascending by rating, then descending by runtime
db.movies.find().sort({"rating.average": 1, runtime: -1})

To skip a number of documents at the start of the result, use skip().

db.movies.find().sort({"rating.average": 1, runtime: -1}).skip(100)

To limit how many documents the cursor returns, use limit().

db.movies.find().sort({"rating.average": 1, runtime: -1}).skip(100).limit(10)

Syntactically, sort, skip, and limit can be chained in any order, but MongoDB always applies them in the order sort, skip, limit.
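
The fixed application order can be sketched as a small function over an in-memory array. runCursor is a hypothetical helper: whatever order the options are written in, it sorts first, then skips, then limits, which is the behaviour described above.

```javascript
// Applies sort, then skip, then limit, regardless of how the options are listed.
function runCursor(docs, { sort, skip = 0, limit = Infinity }) {
  const sorted = sort ? [...docs].sort(sort) : [...docs];
  return sorted.slice(skip, skip + limit);
}

const movies = [5, 3, 9, 1, 7].map((rating) => ({ rating }));

// Options deliberately written as limit, skip, sort.
const page = runCursor(movies, {
  limit: 2,
  skip: 1,
  sort: (a, b) => a.rating - b.rating,
});
console.log(page.map((m) => m.rating));
```

If the limit were applied before the sort, the result would be taken from the first two unsorted documents instead, which is rarely what a paginated query wants.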

To fetch only specific fields (projection), pass a second argument to find. Note that _id is always returned unless it is explicitly set to 0.

db.users.find({}, {name: 1, genres: 1, runtime: 1, "rating.average": 1})
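// For an array, "field.$": 1 returns only the first element that matches the query filter, so the filter must include a condition on that array.
db.users.find({genres: "Drama"}, {name: 1, runtime: 1, "genres.$": 1})
// $slice can be used as well (here: the first 2 elements).
db.users.find({}, {name: 1, runtime: 1, genres: {$slice: 2}})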

Update

As in SQL, an update consists of (1) a WHERE part and (2) a SET part. The WHERE part (the filter) works the same as in a find, and the SET part has a variety of operators.

$set

Use $set to completely overwrite a specific property.

// If the hobbies property exists it is changed; otherwise it is added.
db.users.updateOne({_id: ObjectId("abcd")}, {$set: {hobbies: [{title: "Sports"}, {title: "Cooking"}]}})

$set can also modify several properties at once.

db.users.updateOne({_id: ObjectId("abcd")}, {$set: {age: 29, phone: "01012341234"}})

$inc (increment and decrement)

// Increases Manuel's age by 2.
db.users.updateOne({name: "Manuel"}, {$inc: {age: 2}})
// Decreases Manuel's age by 2 (there is no $dec operator; pass a negative value to $inc) and sets isSporty to false.
db.users.updateOne({name: "Manuel"}, {$inc: {age: -2}, $set: {isSporty: false}})

$min & $max & $mul

// Sets Chris's age to 35 only if the current age is greater than 35 ($min keeps the smaller value).
db.users.updateOne({name: "Chris"}, {$min: {age: 35}})
// Sets Chris's age to 31 only if the current age is less than 31 ($max keeps the larger value).
db.users.updateOne({name: "Chris"}, {$max: {age: 31}})
// Multiplies Chris's age by 1.1.
db.users.updateOne({name: "Chris"}, {$mul: {age: 1.1}})
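
Because the names $min and $max are easy to read backwards, here is a plain-JavaScript sketch of what each one does to a single numeric field. applyMin and applyMax are hypothetical helpers; they also model the documented case where the field does not exist yet and is simply set to the given value.

```javascript
// $min: keep the smaller of the current value and the given value.
const applyMin = (current, value) =>
  current === undefined || value < current ? value : current;

// $max: keep the larger of the current value and the given value.
const applyMax = (current, value) =>
  current === undefined || value > current ? value : current;

console.log(applyMin(38, 35)); // lowered to the given value
console.log(applyMin(30, 35)); // unchanged, already smaller
console.log(applyMax(29, 31)); // raised to the given value
console.log(applyMax(38, 31)); // unchanged, already larger
```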

Delete Field

// The query below does not remove the phone field; it sets it to null.
db.users.updateMany({isSporty: true}, {$set: {phone: null}})
// The query below removes the phone field.
// The value given for phone ("" here) is ignored.
db.users.updateMany({isSporty: true}, {$unset: {phone: ""}})

Rename Field

// Renames age to totalAge in every document.
db.users.updateMany({}, {$rename: {age: "totalAge"}})

Upsert

upsert can be passed in the third argument (options) of updateOne and updateMany.

// Inserts the document if no match exists in the collection; otherwise updates it.
db.users.updateOne({name: "NotExists"}, {$set: {age: 29, hobbies: [{title: "Good food"}]}}, {upsert: true})

Update Array

Using the property.$.newProperty syntax inside $set, a new field can be added to an element of an array.

// { "_id" : ObjectId("61dfbfbdcde6fbd14bd613a5"), "name" : "Max", "hobbies" : [ { "title" : "Sports", "frequency" : 3 }, { "title" : "Cooking", "frequency" : 6 } ], "phone" : 131782734 }
db.users.find({hobbies: {$elemMatch: {title: "Sports", frequency: {$gte: 3}}}})
// { "_id" : ObjectId("61dfbfbdcde6fbd14bd613a5"), "name" : "Max", "hobbies" : [ { "title" : "Sports", "frequency" : 3, "highFrequency" : true }, { "title" : "Cooking", "frequency" : 6 } ], "phone" : 131782734 }
db.users.updateMany({hobbies: {$elemMatch: {title: "Sports", frequency: {$gte: 3}}}}, {$set: {"hobbies.$.highFrequency": true}})

The query above adds the field only to the first array element that matches the condition in each matched document. To change every element of the array, use the following syntax.

db.users.updateMany({totalAge: {$gt: 30}}, {$inc: {"hobbies.$[].frequency": -1}})
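
The two positional forms can be compared with a plain-JavaScript sketch that operates on one hobbies array. setOnFirstMatch and incOnAll are hypothetical helpers that mimic "hobbies.$.field" and "hobbies.$[].field" respectively; the sample data follows the document shown earlier in this section.

```javascript
// "hobbies.$.field": update only the first element that matches the filter.
function setOnFirstMatch(hobbies, predicate, field, value) {
  const index = hobbies.findIndex(predicate);
  return hobbies.map((h, i) => (i === index ? { ...h, [field]: value } : h));
}

// "hobbies.$[].field": update every element of the array.
function incOnAll(hobbies, field, amount) {
  return hobbies.map((h) => ({ ...h, [field]: h[field] + amount }));
}

const hobbies = [
  { title: "Sports", frequency: 3 },
  { title: "Cooking", frequency: 6 },
];

const flagged = setOnFirstMatch(
  hobbies,
  (h) => h.title === "Sports" && h.frequency >= 3,
  "highFrequency",
  true
);
const decremented = incOnAll(hobbies, "frequency", -1);

console.log(flagged);
console.log(decremented);
```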

To add elements to an array, use $push.

db.users.updateOne({name: "Maria"}, {$push: {hobbies: {title: "Sports", frequency: 2}}})
db.users.updateOne({name: "Maria"}, {$push: {hobbies: {$each: [{title: "Sports", frequency: 2}, {title: "Cooking", frequency: 2}], $sort: {frequency: -1}}}})

Conversely, to remove elements from an array, use $pull.

// Removes the hobbies whose title is "Hiking".
db.users.updateOne({name: "Maria"}, {$pull: {hobbies: {title: "Hiking"}}})
// Removes the last element of hobbies.
db.users.updateOne({name: "Maria"}, {$pop: {hobbies: 1}})

Delete

For deletes, the main methods are deleteOne and deleteMany.

db.users.deleteOne({name: "Heejae"})
db.users.deleteMany({age: {$lt: 29}})

To delete all data in a collection, use either of the following (drop() also removes the collection itself and its indexes).

db.users.deleteMany({})
db.users.drop()

To delete a database, use the following.

db.dropDatabase()

Index

Single Field Index

// Create an ascending index on dob.age
db.contacts.createIndex({"dob.age": 1})
// Create a descending index on dob.age
db.contacts.createIndex({"dob.age": -1})
// Drop the ascending index on dob.age
db.contacts.dropIndex({"dob.age": 1})
// Create a unique index (UNIQUE constraint) on email
db.contacts.createIndex({"email": 1}, {unique: true})

One thing to watch out for: with a unique index, a document without an email is indexed as if its email were null, so at most one such document can be inserted. If email is optional but must be unique among the documents that do have it, create the index like this:
db.contacts.createIndex({email: 1}, {unique: true, partialFilterExpression: {email: {$exists: true}}})
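
The interaction between unique and partialFilterExpression can be modelled with a small in-memory check. canInsert is a hypothetical helper and the contacts are made up; the sketch assumes, as described above, that a missing field is indexed as null by a plain unique index and is not indexed at all by the partial one.

```javascript
// Stand-in for a unique index on `email`.
function canInsert(existing, doc, { partial = false } = {}) {
  const key = (d) => ("email" in d ? d.email : null);
  // Partial index on {email: {$exists: true}}: documents without the field
  // are not indexed, so they can never collide.
  if (partial && !("email" in doc)) return true;
  const indexed = partial ? existing.filter((d) => "email" in d) : existing;
  return !indexed.some((d) => key(d) === key(doc));
}

const contacts = [{ name: "A", email: "a@example.com" }, { name: "B" }];

// Plain unique index: a second document without email collides on null.
const plainAllowsMissing = canInsert(contacts, { name: "C" });
// Partial unique index: documents without email are allowed...
const partialAllowsMissing = canInsert(contacts, { name: "C" }, { partial: true });
// ...but duplicate emails are still rejected.
const partialAllowsDuplicate = canInsert(
  contacts,
  { name: "D", email: "a@example.com" },
  { partial: true }
);

console.log(plainAllowsMissing, partialAllowsMissing, partialAllowsDuplicate);
```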

Compound Index

An index built from two or more fields. For a single-field index the direction (ascending or descending) does not matter, but for a compound index the sort direction of each field does. For example, if the index is created with gender ascending and age descending, the index is used when the query sorts by gender ascending and age descending. Interestingly, the exact inverse (gender descending, age ascending) also uses the index, because the index can be scanned backwards.

db.contacts.createIndex({"gender": 1, "dob.age": 1})
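
The direction rule for compound indexes can be written down as a small predicate. This is a simplified sketch, not the query planner: indexSupportsSort is a hypothetical function that only considers sorts on a prefix of the index keys and ignores cases where equality filters on leading fields make other sorts possible.

```javascript
// A sort can walk the index in order if its keys are a prefix of the index
// keys and the directions either all match or are all reversed.
function indexSupportsSort(indexSpec, sortSpec) {
  const indexKeys = Object.entries(indexSpec);
  const sortKeys = Object.entries(sortSpec);
  if (sortKeys.length === 0 || sortKeys.length > indexKeys.length) return false;
  const sameFields = sortKeys.every(([field], i) => indexKeys[i][0] === field);
  if (!sameFields) return false;
  const allSame = sortKeys.every(([, dir], i) => dir === indexKeys[i][1]);
  const allFlipped = sortKeys.every(([, dir], i) => dir === -indexKeys[i][1]);
  return allSame || allFlipped;
}

// Index from the prose example: gender ascending, age descending.
const index = { gender: 1, "dob.age": -1 };

console.log(indexSupportsSort(index, { gender: 1, "dob.age": -1 })); // same directions
console.log(indexSupportsSort(index, { gender: -1, "dob.age": 1 })); // fully inverted
console.log(indexSupportsSort(index, { gender: 1, "dob.age": 1 }));  // mixed
```

In a real deployment, explain() on the query is the way to confirm whether the index is actually used.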

Default Index

// _id is indexed by default.
db.contacts.getIndexes()

Partial Filters

A partial index is smaller in total size than a compound index over the same data. For example, the first index below indexes dob.age only for documents whose gender is male, so it can be more efficient than a compound index on dob.age and gender.

db.contacts.createIndex({"dob.age": 1}, {partialFilterExpression: {gender: "male"}})
db.contacts.createIndex({"dob.age": 1}, {partialFilterExpression: {"dob.age": {$gt: 60}}})

Time To Live

db.sessions.createIndex({createdAt: 1}, {expireAfterSeconds: 10})

Multi Key Index

Arrays can be indexed too. In that case MongoDB internally creates a multikey index, which pulls out every value of the array and stores each one as its own index entry.

db.contacts.createIndex({hobbies: 1})
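
As a rough picture of what "one entry per array value" means, the following plain-JavaScript sketch expands each document's array into separate index entries. multikeyEntries and the sample contacts are assumptions for illustration; the real on-disk index structure is a sorted tree, not a flat list.

```javascript
// One index entry per array element, each pointing back to its document.
function multikeyEntries(docs, field) {
  return docs.flatMap((doc) =>
    doc[field].map((value) => ({ key: value, docId: doc._id }))
  );
}

const contacts = [
  { _id: 1, hobbies: ["sports", "cooking"] },
  { _id: 2, hobbies: ["sports"] },
];

const entries = multikeyEntries(contacts, "hobbies");
// Looking up "sports" finds both documents through their own entries.
const sportsDocs = entries.filter((e) => e.key === "sports").map((e) => e.docId);

console.log(entries.length, sportsDocs);
```

This also hints at the cost: a collection with long arrays produces many more index entries than documents.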

Text Index

A text index is also a kind of multikey index. It splits the text into individual words and stores them in a form that is easy to search. A text index is created with createIndex({fieldName: "text"}).

db.products.insertMany([{title: "A Book", description: "This is an awesome book about a young artist"},{title: "Red T-Shirt", description: "This T-shirt is Red and awesome"}])
db.products.createIndex({description: "text"})

Searching works like this.

// A collection can have only one text index (text indexing is an expensive operation).
db.products.find({$text: {$search: "awesome"}})

Aggregation

A MongoDB aggregation can be broadly divided into match, group, and sort stages.

match The condition that filters the documents to aggregate.

db.contacts.aggregate([{$match: {gender: "female"}}])

group

// Groups by location.state (like GROUP BY) and counts the persons in each group as totalPersons.
db.persons.aggregate([
    {$match: {gender: "female"}},
    {$group: {_id: {state: "$location.state"}, totalPersons: {$sum: 1}}},
])

sort

// Same grouping as above, then sorts the groups by totalPersons in descending order.
db.persons.aggregate([
    {$match: {gender: "female"}},
    {$group: {_id: {state: "$location.state"}, totalPersons: {$sum: 1}}},
    {$sort: {totalPersons: -1}}
])    

db.persons.aggregate([
    {$match: {"dob.age": {$gt: 50}}},
    {
        $group: {
            _id: {gender: "$gender"},
            numPersons: {$sum: 1},
            avgAge: {$avg: "$dob.age"}
        }
    },
    {$sort: {numPersons: -1}}
])
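
For readers who think in terms of array operations, the match, group, sort pipeline above corresponds roughly to filter, reduce, and sort in plain JavaScript. The femalePerState function and the sample persons are assumptions for illustration; the pipeline itself runs inside the database rather than in application memory.

```javascript
function femalePerState(persons) {
  const counts = new Map();
  for (const p of persons) {
    if (p.gender !== "female") continue;              // $match
    const state = p.location.state;                   // $group _id
    counts.set(state, (counts.get(state) ?? 0) + 1);  // $sum: 1
  }
  return [...counts]
    .map(([state, totalPersons]) => ({ _id: { state }, totalPersons }))
    .sort((a, b) => b.totalPersons - a.totalPersons); // $sort descending
}

const persons = [
  { gender: "female", location: { state: "ohio" } },
  { gender: "female", location: { state: "texas" } },
  { gender: "male", location: { state: "texas" } },
  { gender: "female", location: { state: "texas" } },
];

const summary = femalePerState(persons);
console.log(summary);
```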

project

db.persons.aggregate([
    {$project: {_id: 0, gender: 1, fullName: {$concat: ["$name.first", " ", "$name.last"]}}}
])

db.persons.aggregate([
    {$project: {_id: 0, gender: 1, fullName: {$concat: [{$toUpper: "$name.first"}, " ", {$toUpper: "$name.last"}]}}}
])

Transaction

While studying MongoDB, I looked into transactions: handling them directly with the MongoDB driver, and handling them with Mongoose.

MongoDB Driver (Node.js)

// For a replica set, include the replica set name and a seedlist of the members in the URI string; e.g.
// const uri = 'mongodb://mongodb0.example.com:27017,mongodb1.example.com:27017/?replicaSet=myRepl'
// For a sharded cluster, connect to the mongos instances; e.g.
// const uri = 'mongodb://mongos0.example.com:27017,mongos1.example.com:27017/'
const client = new MongoClient(uri);
await client.connect();

// Prereq: the collections must already exist.
await client
    .db('mydb1')
    .collection('foo')
    .insertOne({
        abc: 0
    }, {
        writeConcern: {
            w: 'majority'
        }
    });

await client
    .db('mydb2')
    .collection('bar')
    .insertOne({
        xyz: 0
    }, {
        writeConcern: {
            w: 'majority'
        }
    });

// Step 1: Create a client session for the transaction.
const session = client.startSession();

// Step 2: (Optional) Define the transaction options.
const transactionOptions = {
    readPreference: 'primary',
    readConcern: {
        level: 'local'
    },
    writeConcern: {
        w: 'majority'
    }
};

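// Step 3: Start the transaction with session.withTransaction. The callback holds the transaction's business logic; once it finishes, the transaction is committed, or rolled back if it throws.
// Note: the withTransaction callback must be an async function or return a Promise.
// Note: inside the callback, each DB write must still be given the session in its options argument to be part of the transaction. Without it, the operation runs as ordinary non-transactional DML.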
try {
    await session.withTransaction(async () => {
        const coll1 = client.db('mydb1').collection('foo');
        const coll2 = client.db('mydb2').collection('bar');

        // Important:: You must pass the session to the operations

        await coll1.insertOne({
            abc: 1
        }, {
            session
        });
        await coll2.insertOne({
            xyz: 999
        }, {
            session
        });
    }, transactionOptions);
} finally {
    await session.endSession();
    await client.close();
}

Mongoose

What bothered me about the MongoDB driver was having to pass the session object on every DML call, even inside the withTransaction callback. In TypeORM, declaring a transaction creates an EntityManager bound to it, and any DML issued through that EntityManager always runs inside the transaction. In Mongoose, however, just as with the MongoDB driver, you declare a session and then pass the session object along with each DML statement.

const Customer = db.model('Customer', new Schema({
    name: String
}));

let session = null;
return Customer.createCollection().
then(() => db.startSession()).
then(_session => {
    session = _session;
    // Start a transaction
    session.startTransaction();
    // This `create()` is part of the transaction because of the `session`
    // option.
    return Customer.create([{
        name: 'Test'
    }], {
        session: session
    });
}).
// Transactions execute in isolation, so unless you pass a `session`
// to `findOne()` you won't see the document until the transaction
// is committed.
then(() => Customer.findOne({
    name: 'Test'
})).
then(doc => assert.ok(!doc)).
// This `findOne()` will return the doc, because passing the `session`
// means this `findOne()` will run as part of the transaction.
then(() => Customer.findOne({
    name: 'Test'
}).session(session)).
then(doc => assert.ok(doc)).
// Once the transaction is committed, the write operation becomes
// visible outside of the transaction.
then(() => session.commitTransaction()).
then(() => Customer.findOne({
    name: 'Test'
})).
then(doc => assert.ok(doc)).
then(() => session.endSession());

The Mongoose documentation gives the example above, but personally I think using withTransaction() rather than startTransaction() would lead to fewer bugs. Instead of opening and closing the transaction by hand, withTransaction leaves the transaction's state management to Mongoose, so the code can stay concise and focus on business logic. (It is still a pity that the session object has to be closed manually.)
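
One way to soften the "close the session manually" annoyance is a small wrapper. The sketch below is a hypothetical helper, not part of the driver or of Mongoose: runInTransaction is a made-up name, and it only assumes that the object passed in exposes startSession() and that the session exposes withTransaction() and endSession(), as in the examples above.

```javascript
// Hypothetical helper: runs `work(session)` inside withTransaction and always
// ends the session afterwards. `client` is anything exposing startSession().
async function runInTransaction(client, work, options) {
  const session = await client.startSession();
  try {
    let result;
    // The callback may be retried, so `result` holds the last attempt's value.
    await session.withTransaction(async () => {
      result = await work(session);
    }, options);
    return result;
  } finally {
    await session.endSession();
  }
}

// Usage sketch (collection names are placeholders):
// await runInTransaction(client, async (session) => {
//   await client.db("mydb1").collection("foo").insertOne({ abc: 1 }, { session });
// });
```

The session still has to be passed to every operation inside the callback; the helper only removes the try/finally boilerplate around it.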

Transaction references

MongoDB Driver documentation, Mongoose official documentation, TypeORM official documentation
