I have a collection of people, named peeps:
| db.peeps.insert({UserName: 'BOB'}) | 
I want to find the user named “Bob”, but I don’t know whether the user name is stored in lower, upper, or mixed case:
| db.peeps.find({UserName:'bob'}).count() // 1 result | 
But I don’t really want to require my users to type the name in the same case it was entered into the database…
| db.peeps.find({UserName:/bob/i}).count() // 3 results | 
Me: Yey, Regex!!!
MongoDB: Ahem! Not so fast… Look at the query plan.
| db.peeps.find({UserName:/bob/}).explain() | 
Me: Oh, I’ll create an index!
Me: Yey!!!
MongoDB: Dude, dig deeper… and don’t forget to left-anchor your query.
| // Run explain(true) to get full blown details: | 
Me: Yey?
MongoDB: Each key in the index was examined! That’s not scalable… for a million documents, mongo will have to evaluate a million keys.
Me: But, but, but…
| db.peeps.find({UserName:/^bob/}).explain(true) | 
Me: This is back to exact match :-) Only one document returned. I want a case-insensitive match!
Old MongoDB: ¯\_(ツ)_/¯… Normalize string case for that field, or add another field where you store a lowercase version just for this comparison, then do an exact match?
Me:
New MongoDB: Dude: Collation!
Me: Oh?
Me: (Googles MongoDB Collation frantically…)
Me: Ahh!
| db.peeps.createIndex({UserName:-1}, { collation: { locale: 'en', strength: 2 } }) | 
Me: Squee!
MongoDB: Indeed.
Collation is a very welcome addition to MongoDB.
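You can set a default collation on a whole collection, or specify one on individual indexes. Keep in mind that a query only uses a collated index for string comparisons when it asks for the same collation, for example by chaining .collation({ locale: 'en', strength: 2 }) onto find().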
The main pain point it solves for me is the case-insensitive string match, which previously required either changing the schema just for that (ick!), or using regex (index supported, but not nearly as efficient as exact match).
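To make the regex cost a bit more concrete, here is a rough sketch in plain JavaScript (no MongoDB needed). It uses a sorted array as a stand-in for the index, which is purely an assumption for illustration: real MongoDB indexes are B-trees, and explain() counts examined keys in its own way. The helper names scanAll and scanPrefix are made up for this example.

```javascript
// Hypothetical stand-in for an index: a sorted array of UserName keys.
const indexKeys = ['BOB', 'Bob', 'alice', 'bob', 'bobby', 'carol', 'dave'].sort();

// Case-insensitive regex: there is no usable prefix, so every key gets tested.
function scanAll(keys, regex) {
  let examined = 0;
  const matches = [];
  for (const key of keys) {
    examined++;
    if (regex.test(key)) matches.push(key);
  }
  return { examined, matches };
}

// Left-anchored, case-sensitive prefix: binary-search to the first key >= prefix,
// then walk forward only while keys still start with the prefix.
// (The binary-search probes are not counted as "examined" in this sketch.)
function scanPrefix(keys, prefix) {
  let lo = 0;
  let hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (keys[mid] < prefix) lo = mid + 1;
    else hi = mid;
  }
  let examined = 0;
  const matches = [];
  for (let i = lo; i < keys.length; i++) {
    examined++;
    if (!keys[i].startsWith(prefix)) break;
    matches.push(keys[i]);
  }
  return { examined, matches };
}

const insensitive = scanAll(indexKeys, /bob/i);
const anchored = scanPrefix(indexKeys, 'bob');
console.log(insensitive.examined, insensitive.matches); // 7 [ 'BOB', 'Bob', 'bob', 'bobby' ]
console.log(anchored.examined, anchored.matches);       // 3 [ 'bob', 'bobby' ]
```

In this toy version the case-insensitive scan touches all seven keys, while the anchored prefix scan touches three, but it also misses 'BOB' and 'Bob'. That is the trade-off collation removes.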
Beyond case-sensitivity, collation also addresses character variants, diacritics, and sorting concerns. This is a very important addition to the engine, and critical for wide adoption in many languages.
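To get a feel for what strength 2 means, here is a small sketch that uses JavaScript's built-in Intl.Collator rather than MongoDB itself. The mapping is my assumption for illustration: sensitivity 'accent' stands in for strength 2 (case ignored, diacritics still significant) and sensitivity 'base' stands in for strength 1 (both ignored). The sample names and the sameUnder helper are made up, and results depend on the ICU data shipped with your JavaScript runtime.

```javascript
// Stand-in for collation { locale: 'en', strength: 2 }: ignore case, keep diacritics.
const secondary = new Intl.Collator('en', { sensitivity: 'accent' });
// Stand-in for strength 1: ignore both case and diacritics.
const primary = new Intl.Collator('en', { sensitivity: 'base' });

// Two strings "match" under a collator when compare() returns 0.
function sameUnder(collator, a, b) {
  return collator.compare(a, b) === 0;
}

const names = ['BOB', 'Bob', 'bob', 'bób', 'alice'];
const caseInsensitiveHits = names.filter((n) => sameUnder(secondary, n, 'bob'));
const accentInsensitiveHits = names.filter((n) => sameUnder(primary, n, 'bob'));

console.log(caseInsensitiveHits);   // [ 'BOB', 'Bob', 'bob' ]
console.log(accentInsensitiveHits); // [ 'BOB', 'Bob', 'bob', 'bób' ]
```

The same collator objects can be passed to Array.prototype.sort via their compare method, which is roughly the sorting side of what collation gives you.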
Check out the docs: Collation
