MongoDB - Introduction

MongoDB is a cross-platform, document oriented database that provides, high performance, high availability, and easy scalability. MongoDB works on concept of collection and document.

Database

Database is a physical container for collections. Each database gets its own set of files on the file system. A single MongoDB server typically has multiple databases.

Collection

Collection is a group of MongoDB documents. It is the equivalent of an RDBMS table. A collection exists within a single database. Collections do not enforce a schema. Documents within a collection can have different fields. Typically, all documents in a collection are of similar or related purpose.

Document

A document is a set of key-value pairs. Documents have dynamic schema. Dynamic schema means that documents in the same collection do not need to have the same set of fields or structure, and common fields in a collection's documents may hold different types of data.

The following table shows the relationship of RDBMS terminology with MongoDB.

RDBMSMongoDB
DatabaseDatabase
TableCollection
Tuple/RowDocument
columnField
Table JoinEmbedded Documents
Primary KeyPrimary Key (Default key _id provided by MongoDB itself)
Database Server and Client
mysqld/Oraclemongod
mysql/sqlplusmongo

Sample Document

Following example shows the document structure of a blog site, which is simply a comma separated key value pair.

{
   _id: ObjectId(7df78ad8902c)
   title: 'MongoDB Overview', 
   description: 'MongoDB is no sql database',
   by: 'tutorials point',
   url: 'http://www.tutorialspoint.com',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 100, 
   comments: [	
      {
         user:'user1',
         message: 'My first comment',
         dateCreated: new Date(2011,1,20,2,15),
         like: 0 
      },
      {
         user:'user2',
         message: 'My second comments',
         dateCreated: new Date(2011,1,25,7,45),
         like: 5
      }
   ]
}

_id is a 12 bytes hexadecimal number which assures the uniqueness of every document. You can provide _id while inserting the document. If you don’t provide then MongoDB provides a unique id for every document. These 12 bytes first 4 bytes for the current timestamp, next 3 bytes for machine id, next 2 bytes for process id of MongoDB server and remaining 3 bytes are simple incremental VALUE.

Any relational database has a typical schema design that shows number of tables and the relationship between these tables. While in MongoDB, there is no concept of relationship.

Advantages of MongoDB over RDBMS

  • Schema less − MongoDB is a document database in which one collection holds different documents. Number of fields, content and size of the document can differ from one document to another.

  • Structure of a single object is clear.

  • No complex joins.

  • Deep query-ability. MongoDB supports dynamic queries on documents using a document-based query language that's nearly as powerful as SQL.

  • Tuning.

  • Ease of scale-out − MongoDB is easy to scale.

  • Conversion/mapping of application objects to database objects not needed.

  • Uses internal memory for storing the (windowed) working set, enabling faster access of data.

Why Use MongoDB?

  • Document Oriented Storage − Data is stored in the form of JSON style documents.

  • Index on any attribute

  • Replication and high availability

  • Auto-Sharding

  • Rich queries

  • Fast in-place updates

  • Professional support by MongoDB

Where to Use MongoDB?

  • Big Data

  • Content Management and Delivery

  • Mobile and Social Infrastructure

  • User Data Management

  • Data Hub

Comments

Popular posts from this blog

MongoDB - Data Modelling

SPARK - Deployment

SQOOP