Notes on MongoDB / Quickstart Guide to MongoDB

  • large system (multiple shards, config servers and mongos routers)
  • data model
    • database holds collections
    • collection holds documents
    • document is a set of fields
    • field is a key-value pair
    • key is a name
    • value is:
      • basic type
      • a document
      • or an array of values
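To make the data model above concrete, here is a minimal sketch in plain JavaScript (no server needed). The field names and the kindOf helper are made up for illustration; they are not part of MongoDB.

```javascript
// A document is a set of fields; each field is a key-value pair.
const doc = {
  name: "mongo",                // basic type: string
  version: 1,                   // basic type: number
  owner: { city: "somewhere" }, // value that is itself a document
  tags: ["nosql", "document"],  // value that is an array of values
};

// Classify each value the way the list above does (hypothetical helper).
function kindOf(value) {
  if (Array.isArray(value)) return "array";
  if (value !== null && typeof value === "object") return "document";
  return "basic";
}

const kinds = Object.fromEntries(
  Object.entries(doc).map(([key, value]) => [key, kindOf(value)])
);
console.log(kinds);
```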
  • mongo query language
  • BSON is a binary-encoded serialization of JSON-like documents. BSON was designed to be lightweight, traversable and efficient. Like JSON, BSON supports embedding objects and arrays within other objects and arrays.
  • MongoDB uses BSON as its data storage and network transfer format for “documents”, e.g. from PHP:
    $bson = bson_encode(null);
    $bson = bson_encode(array('a' => 10));
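To show what "binary encoded" means in practice, here is a rough sketch of a BSON encoder in JavaScript (Node, for Buffer). It is an illustration only: it assumes flat documents whose values are 32-bit integers or strings, and encodeBson is a made-up name, not a driver function.

```javascript
// Minimal BSON encoder sketch: flat documents, int32 and string values only.
function encodeBson(doc) {
  const parts = [];
  for (const [key, value] of Object.entries(doc)) {
    const name = Buffer.from(key + "\0", "utf8"); // element name as a cstring
    if (Number.isInteger(value) && value >= -(2 ** 31) && value < 2 ** 31) {
      const body = Buffer.alloc(4);
      body.writeInt32LE(value, 0);
      parts.push(Buffer.from([0x10]), name, body); // 0x10 = int32 element
    } else if (typeof value === "string") {
      const text = Buffer.from(value + "\0", "utf8");
      const len = Buffer.alloc(4);
      len.writeInt32LE(text.length, 0); // length includes the trailing NUL
      parts.push(Buffer.from([0x02]), name, len, text); // 0x02 = string element
    } else {
      throw new TypeError("unsupported value for key " + key);
    }
  }
  const elements = Buffer.concat(parts);
  const size = Buffer.alloc(4);
  size.writeInt32LE(4 + elements.length + 1, 0); // total size, including itself
  return Buffer.concat([size, elements, Buffer.from([0x00])]); // 0x00 ends the document
}

const encoded = encodeBson({ a: 10 });
console.log(encoded.toString("hex")); // 0c0000001061000a00000000
```

The 12 bytes are: a 4-byte length, one type byte, the key "a" plus its NUL, the 4-byte value, and the closing NUL.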
    
  • using a database
    	> use mydb
    

    switched to db mydb

  • Insert data inside a collection
    	> j = {name: 'mongo'}
    	> db.things.save(j);
    	> db.things.find();
    
  • no collection was predefined; it was created lazily on the first save
  • no structure (schema) had to be declared for the stored document
  • mongo adds an ObjectId in the field _id
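The three observations above (lazy collections, no fixed structure, automatic _id) can be mimicked with a small in-memory stand-in. This is only a sketch: SketchDb is a made-up class, and a simple counter stands in for a real ObjectId.

```javascript
// In-memory stand-in for a database; not the real server or a driver.
class SketchDb {
  constructor() {
    this.collections = new Map();
    this.nextId = 1; // assumption: a counter instead of a real ObjectId
  }
  // Return the collection, creating it on first use (lazy creation).
  collection(name) {
    if (!this.collections.has(name)) this.collections.set(name, []);
    return this.collections.get(name);
  }
  // Store a document of any shape; add an _id if the caller gave none.
  insert(name, doc) {
    const stored = { _id: doc._id ?? this.nextId++, ...doc };
    this.collection(name).push(stored);
    return stored;
  }
}

const sketchDb = new SketchDb();
const existedBefore = sketchDb.collections.has("things"); // false: nothing predefined
const saved = sketchDb.insert("things", { name: "mongo" });
sketchDb.insert("things", { x: 4, j: 1 }); // a different shape is fine
console.log(existedBefore, saved, sketchDb.collection("things").length);
```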
  • do stuff in a loop
    	> for(var i=1;i<=20;i++) db.things.save({x:4, j:i});
    	> db.things.find();
    
  • find returns a cursor; the shell prints the first batch of results and the ‘it’ command pages through the rest. a cursor can also be iterated explicitly:
    	> var c = db.things.find();
    	> while (c.hasNext()) printjson(c.next());
    	> db.things.find().forEach(printjson);
    
  • a cursor can also be indexed like an array:
    	> var c = db.things.find();
    	> printjson(c[4]);
    

    this shows the 5th document. it is not recommended, because all results up to index 4 are loaded into memory, which can cause scalability problems on large result sets


  • > db.things.find().toArray();
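The difference between stepping through a cursor and array-style access can be sketched with a generator. Everything here (fetchResults, SketchCursor, the at method) is hypothetical and only models the behaviour described above; it is not how the shell implements cursors.

```javascript
// Generator-backed stand-in for a server-side result stream.
function* fetchResults(total, log) {
  for (let i = 1; i <= total; i++) {
    log.push(i); // record that document i was pulled from the "server"
    yield { x: 4, j: i };
  }
}

class SketchCursor {
  constructor(source) {
    this.source = source;
    this.buffered = null;
  }
  hasNext() {
    if (this.buffered === null) this.buffered = this.source.next();
    return !this.buffered.done;
  }
  next() {
    if (!this.hasNext()) throw new Error("cursor exhausted");
    const value = this.buffered.value;
    this.buffered = null;
    return value;
  }
  // Array-style access: everything up to the index is held in memory.
  at(index) {
    const held = [];
    while (held.length <= index && this.hasNext()) held.push(this.next());
    return held[index];
  }
  // toArray pulls every remaining document at once.
  toArray() {
    const all = [];
    while (this.hasNext()) all.push(this.next());
    return all;
  }
}

const pulled = [];
const cursor = new SketchCursor(fetchResults(20, pulled));
const firstDoc = cursor.next();
const pulledAfterFirst = pulled.length; // only one document fetched so far
const fifth = new SketchCursor(fetchResults(20, [])).at(4);
const rest = cursor.toArray();
console.log(firstDoc, pulledAfterFirst, fifth, rest.length);
```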
    
  • more queries
    select * from things where name = "mongo"
    > db.things.find({name: 'mongo'}).forEach(printjson);
    select * from things where x = 4;
    > db.things.find({x:4}).forEach(printjson);
    NOTE: the _id field is returned by default
    
  • find can also return partial documents. for example, for documents with x = 4, return only the key j:
    	> db.things.find({x:4}, {j:true}).forEach(printjson);
  • findOne() - syntactic sugar
    	> printjson(db.things.findOne({name: 'mongo'}));
    this is roughly the same as
    	> db.things.find({name: 'mongo'}).limit(1);
    except that findOne() returns the document itself rather than a cursor
  • printjson is itself defined in the shell as:
    > printjson
    function(x) {
    	print(tojson(x));
    }
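Partial documents and findOne can be imitated on a plain array. The sketch below assumes inclusion-style field lists only; project and findOne are illustrative helpers, not shell functions.

```javascript
// Keep only the requested keys; _id is kept unless explicitly excluded.
function project(doc, fields) {
  const out = {};
  if (fields._id !== false && fields._id !== 0 && "_id" in doc) out._id = doc._id;
  for (const [key, wanted] of Object.entries(fields)) {
    if (key !== "_id" && wanted && key in doc) out[key] = doc[key];
  }
  return out;
}

// findOne as sugar for "first match, or null".
const findOne = (docs, predicate) => docs.find(predicate) ?? null;

const things = [
  { _id: 1, x: 4, j: 1 },
  { _id: 2, x: 4, j: 2 },
  { _id: 3, name: "mongo" },
];

const partial = things.filter((d) => d.x === 4).map((d) => project(d, { j: true }));
const one = findOne(things, (d) => d.name === "mongo");
const none = findOne(things, (d) => d.name === "missing");
console.log(partial, one, none);
```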
    
  • mongo is a full JavaScript shell, so any JavaScript function, syntax or class can be used in it. it also defines some globals of its own (db etc.). you can see the full API at http://api.mongodb.org/js/
  • select * from users where age=33 order by name
    db.users.find({age: 33}).sort({name: 1});
    
    select * from users where age>33
    db.users.find({'age': {$gt: 33}})
    $gt = greater than, $gte = greater than or equal to, $lt = less than, $lte = less than or equal to
    
    select * from users where name like '%Joe%';
    db.users.find({name:/Joe/});
    select * from users where name like 'Joe%';
    db.users.find({name:/^Joe/});
    
    sort order desc: .sort({name: -1});
    
    create index myindexname on users(name);
    db.users.ensureIndex({name: 1});
    
    create index myindex on users(name, ts desc)
    db.users.ensureIndex({name:1, ts:-1});
    
    select * from users where a=1 or b = 2
    db.users.find({$or: [{a:1}, {b:2}]});
    
    explain select * from users where z=2;
    db.users.find({z:2}).explain();
    
    select distinct last_name from users;
    db.users.distinct('last_name');
    
    select count(*) from users;
    db.users.count();
    
    select count(age) from users;
    db.users.find({age: {$exists: true}}).count();
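To see how the query documents above select rows, here is a small in-memory matcher in JavaScript. It is a sketch that covers only the operators used in these notes (equality, regular expressions, $gt, $lt, $lte, $exists, $or); the sample users and the function names are made up.

```javascript
// Test one field against a condition: a literal, a RegExp, or an operator object.
function matchesField(doc, key, condition) {
  const value = doc[key];
  if (condition instanceof RegExp) {
    return typeof value === "string" && condition.test(value);
  }
  if (condition !== null && typeof condition === "object") {
    return Object.entries(condition).every(([op, arg]) => {
      switch (op) {
        case "$gt": return value > arg;
        case "$lt": return value < arg;
        case "$lte": return value <= arg;
        case "$exists": return (key in doc) === arg;
        default: throw new Error("unsupported operator " + op);
      }
    });
  }
  return value === condition;
}

// A query matches when every top-level condition holds; $or needs any sub-query.
function matches(doc, query) {
  return Object.entries(query).every(([key, condition]) =>
    key === "$or"
      ? condition.some((sub) => matches(doc, sub))
      : matchesField(doc, key, condition)
  );
}

const users = [
  { name: "Joe", age: 33, a: 1 },
  { name: "Ann", age: 40, b: 2 },
  { name: "Joey", a: 5 },
];
const find = (query) => users.filter((u) => matches(u, query));

console.log(find({ age: 33 }).length);                  // where age = 33
console.log(find({ age: { $gt: 33 } }).length);         // where age > 33
console.log(find({ name: /^Joe/ }).length);             // where name like 'Joe%'
console.log(find({ $or: [{ a: 1 }, { b: 2 }] }).length); // where a = 1 or b = 2
console.log(find({ age: { $exists: true } }).length);   // count(age)
```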
    
  • Commands:
    MongoDB has the concept of a database command: a way to perform special operations or to request information about its current operational status.
    A command is sent to the database as a query to a special collection namespace called $cmd. The database returns a single document with the command results; use findOne() for that if your driver has it.
    the general syntax is:

    db.$cmd.findOne({<commandname>: <value>[, options]});
    the shell provides a helper function for this:
    db.runCommand({<commandname>: <value>[, options]});
    

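    for example, to check our database's current profile level setting:

    > db.runCommand({profile: -1});

    privileged commands: certain operations are for the database administrator only and must be run against the special admin database: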

    > use admin
    > db.runCommand('shutdown');
    

    getting help info for a command

    > db.commandHelp('datasize');
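The relationship between $cmd.findOne and runCommand can be sketched without a server. In this illustration makeDb, the echo command and the error text are all made up; only the idea (the helper builds a command document and queries $cmd) comes from the notes above.

```javascript
// Build a fake db object whose $cmd namespace dispatches to local handlers.
function makeDb(handlers) {
  const cmd = {
    // A command is a query whose first key names the command.
    findOne(query) {
      const [name] = Object.keys(query);
      const handler = handlers[name];
      if (!handler) return { ok: 0, errmsg: "no such cmd: " + name };
      return { ...handler(query[name], query), ok: 1 };
    },
  };
  return {
    $cmd: cmd,
    // Helper: accept a bare command name or a full command document.
    runCommand(command) {
      const query = typeof command === "string" ? { [command]: 1 } : command;
      return cmd.findOne(query);
    },
  };
}

// Hypothetical command that just echoes its value back.
const fakeDb = makeDb({ echo: (value) => ({ echoed: value }) });

const viaHelper = fakeDb.runCommand({ echo: 5 });
const viaCmd = fakeDb.$cmd.findOne({ echo: 5 });
const viaString = fakeDb.runCommand("echo");
const unknown = fakeDb.runCommand("nope");
console.log(viaHelper, viaCmd, viaString, unknown);
```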
    
  • Clone Database
    // copy an entire database from one name on one server to another name on this server
    db.copyDatabase(<from-db>, <to-db>, <from-hostname>);
    
  • Lock, Snapshot and unlock
    the fsync command supports a lock option that allows you to safely snapshot the database's datafiles. while locked, all write operations are blocked, although read operations are still allowed. after snapshotting, use the unlock command to unlock the database and allow writes again.

    > use admin
    > db.runCommand({fsync: 1, lock: 1});
    unlock command:
    > db.$cmd.sys.unlock.findOne();
    

    although the database can be read while locked for snapshotting, an attempted write will block readers as well, because the database uses a read/write lock (issue 1423).
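The lock, snapshot, unlock sequence can be modelled as a tiny state machine. This is a simplified sketch: LockableStore is a made-up class, held writes are queued to stand in for "blocked", and the reader-blocking caveat just mentioned is not modelled.

```javascript
class LockableStore {
  constructor() {
    this.data = [];
    this.locked = false;
    this.pending = []; // writes that arrived while locked
  }
  lock() { this.locked = true; } // stands in for the lock command
  unlock() {
    this.locked = false;
    for (const doc of this.pending.splice(0)) this.data.push(doc); // apply held writes
  }
  read() { return [...this.data]; } // reads are still allowed while locked
  write(doc) {
    if (this.locked) {
      this.pending.push(doc);
      return "blocked";
    }
    this.data.push(doc);
    return "applied";
  }
  // A snapshot is only safe while writes are held back.
  snapshot() {
    if (!this.locked) throw new Error("lock before taking a snapshot");
    return JSON.parse(JSON.stringify(this.data));
  }
}

const store = new LockableStore();
const beforeLock = store.write({ a: 1 });
store.lock();
const duringLock = store.write({ a: 2 });
const snap = store.snapshot();
const readsWhileLocked = store.read().length;
store.unlock();
console.log(beforeLock, duringLock, snap.length, readsWhileLocked, store.read().length);
```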

  • Index related commands: ensureIndex() is a helper function. its implementation creates an index by adding its info to the system.indexes collection
    > use test
    > db.mycollection.ensureIndex(<keypattern>);
    // same as
    > db.system.indexes.insert({name: 'name', ns: 'namespaceToIndex', key: <keypattern>});
    
  • you can query system.indexes to see all indexes for a collection foo in db test.
    > db.system.indexes.find({ns: 'test.foo'});
    
  • dropping indexes
    db.mycollection.dropIndex(<name_or_pattern>);
    db.mycollection.dropIndexes(); // drops all the indexes
    e.g.
    t.dropIndex({name: 1});
    
  • index namespace:
    each index has a namespace of its own for its btree buckets. the namespace is:
    <collectionnamespace>.$<indexname>
    this is an internal namespace that cannot be queried directly.
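The index bookkeeping from the last few bullets (the system.indexes entry and the index namespace) can be sketched as plain string building. The function names are hypothetical, and the default index name used here (each key and direction joined by underscores) is stated as an assumption rather than taken from the notes.

```javascript
// Assumed default index name: each key and its direction joined by underscores.
function indexName(keyPattern) {
  return Object.entries(keyPattern)
    .map(([key, direction]) => key + "_" + direction)
    .join("_");
}

// The document that would be added to system.indexes for this index.
function indexSpec(dbName, collection, keyPattern) {
  return {
    name: indexName(keyPattern),
    ns: dbName + "." + collection,
    key: keyPattern,
  };
}

// <collectionnamespace>.$<indexname>
function indexNamespace(spec) {
  return spec.ns + ".$" + spec.name;
}

const spec = indexSpec("test", "users", { name: 1, ts: -1 });
console.log(spec.name);            // name_1_ts_-1
console.log(indexNamespace(spec)); // test.users.$name_1_ts_-1
```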
  • Last error commands: since MongoDB doesn’t wait for a response by default when writing to the database, a couple of commands exist for ensuring that these operations have succeeded. these commands can be invoked automatically by many of the drivers when saving and updating in ‘safe’ mode.
    db.$cmd.findOne({getlasterror:1});
    or
    db.runCommand('getlasterror');
    or helper
    db.getLastError();
    
  • getlasterror is primarily useful for write operations (although it is set after a command or query too). write operations by default do not have a return code; this saves the client from waiting for client/server turnarounds during write operations. one can always call getlasterror if a return code is wanted.
  • db.resetError(); removes any error that might have happened in the past; it is called before a command is executed.
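The fire-and-forget write model and ‘safe’ mode can be sketched with a fake connection. SketchConnection and safeInsert are made-up names, and the duplicate-key failure is just a convenient way to trigger an error in the example.

```javascript
class SketchConnection {
  constructor() {
    this.docs = new Map();
    this.lastError = null;
  }
  // Fire-and-forget write: nothing is returned; a failure is only recorded.
  insert(doc) {
    if (this.docs.has(doc._id)) {
      this.lastError = "duplicate key: " + doc._id;
      return;
    }
    this.lastError = null;
    this.docs.set(doc._id, doc);
  }
  getLastError() { return this.lastError; }
  resetError() { this.lastError = null; }
  // "Safe" mode: write, then immediately ask whether it succeeded.
  safeInsert(doc) {
    this.insert(doc);
    const err = this.getLastError();
    if (err !== null) throw new Error(err);
  }
}

const conn = new SketchConnection();
conn.insert({ _id: 1 });
const silent = conn.insert({ _id: 1 }); // fails, but the caller sees nothing
const recorded = conn.getLastError();   // the failure is visible only on request
conn.resetError();
const afterReset = conn.getLastError();

let safeFailure = null;
try {
  conn.safeInsert({ _id: 1 });
} catch (e) {
  safeFailure = e.message;
}
console.log(silent, recorded, afterReset, safeFailure);
```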
  • stats for a collection
    > use comedy
    > db.cartoon.validate();
    

    returns stats about the collection, including the number of records

  • Mongo Metadata:
    The <dbname>.system.* namespaces in MongoDB are special and contain database-specific information. the system collections include:

    • system.namespaces lists all namespaces
    • system.indexes lists all indexes
    • system.profile stores database profiling information.
    • system.users lists users who may access the database
    • local.sources stores replica slave configuration data and state.
    • information on the structure of a stored object is stored within the object itself.
  • Collections:
    mongo collections are essentially named groupings of documents. you can think of them as roughly equivalent to relational database tables. a collection holds BSON documents.
    - collection names should begin with a letter or an underscore. $ is reserved.
    - the max size of a collection name is 128 characters
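The naming rules above fit in a one-line check. This is a sketch of just those three rules as stated in the notes; isValidCollectionName is a made-up helper, and the server may enforce further restrictions.

```javascript
// Begins with a letter or underscore, contains no "$", at most 128 characters.
function isValidCollectionName(name) {
  return /^[A-Za-z_]/.test(name) && !name.includes("$") && name.length <= 128;
}

console.log(isValidCollectionName("things"));     // true
console.log(isValidCollectionName("_tmp"));       // true
console.log(isValidCollectionName("1things"));    // false
console.log(isValidCollectionName("things$old")); // false
```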

About rp

Architect for large, highly scalable LAMP applications and Technical Manager with a special focus on metrics-based continuous improvement of teams and products. Rajat has close to a decade of experience across a very wide range of skills, from infrastructure, middleware and app servers all the way to front-end technologies, and with software development methodologies including agile, iterative waterfall, waterfall and ad hoc startup, using the right approach in the right context to reduce time to market.