I have a collection of people, named peeps:
I want to find the user named “Bob”, except I don’t know whether the name was stored in lower, upper, or mixed case:
But I don’t really want to require my users to type the name in the same case it was entered into the database…
Me: Yay, Regex!!!
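A regex attempt along these lines might look like the sketch below. It is only an illustration: the `name` field and the helper names (`escapeRegex`, `nameRegexFilter`, `findByNameRegex`) are assumptions, not something from the original post, and `db` is expected to be the database handle you get in `mongosh`.

```javascript
// Escape regex metacharacters so user input is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a filter that matches the whole `name` field, ignoring case.
// The field name `name` is an assumption about the document shape.
function nameRegexFilter(name) {
  return { name: new RegExp("^" + escapeRegex(name) + "$", "i") };
}

// Hypothetical helper: run the case-insensitive regex query against `peeps`.
function findByNameRegex(db, name) {
  return db.getCollection("peeps").find(nameRegexFilter(name));
}
```

In `mongosh`, `findByNameRegex(db, "bob")` should return the document whether the stored name is “Bob”, “BOB”, or “bob”.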
MongoDB: Ahem! Not so fast… Look at the query plan.
Me: Oh, I’ll create an index!
MongoDB: Dude, dig deeper… and don’t forget to left-anchor your query.
MongoDB: Every key in the index was examined! That’s not scalable… for a million documents, MongoDB will have to examine a million keys.
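To see what MongoDB is complaining about, you can pull the counters out of the query plan. The helper below is a rough sketch, assuming a plain index on a `name` field; `regexScanStats` is a made-up name, while `createIndex`, `explain("executionStats")`, and the `executionStats` fields are the standard shell API.

```javascript
// Hypothetical helper: report how much work a regex query on `name` did.
function regexScanStats(db, pattern) {
  const peeps = db.getCollection("peeps");

  // Plain index on name (simple binary comparison, no collation).
  peeps.createIndex({ name: 1 });

  // Ask the server to run the query and report execution statistics.
  const plan = peeps.find({ name: pattern }).explain("executionStats");
  const stats = plan.executionStats;

  return {
    keysExamined: stats.totalKeysExamined, // index keys looked at
    docsExamined: stats.totalDocsExamined, // documents fetched
    returned: stats.nReturned,             // documents that matched
  };
}
```

Comparing `regexScanStats(db, /^bob$/i)` with `regexScanStats(db, /^Bob/)` in `mongosh` shows the difference: a case-sensitive prefix expression lets MongoDB bound the index scan, while the `i` flag does not.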
Me: But, but, but…
Me: This is back to an exact match :-) Only one document is returned. I want a case-insensitive match!
Old MongoDB: ¯\_(ツ)_/¯… Normalize the string case for that field, or add another field that stores a lowercase version just for this comparison, then do an exact match?
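The “extra lowercase field” workaround could be sketched roughly like this. The field name `nameLower` and all three helper names are hypothetical, picked only for illustration; the point is that both the stored value and the search term get lowercased, so the query is a plain exact match.

```javascript
// Return a copy of the person with a lowercase shadow of `name`.
// `nameLower` is a hypothetical field name, not from the original post.
function withLowercaseName(person) {
  return { ...person, nameLower: person.name.toLowerCase() };
}

// Store the shadow field alongside the original name.
function insertPerson(db, person) {
  return db.getCollection("peeps").insertOne(withLowercaseName(person));
}

// Exact match on the shadow field; lowercase the search term the same way.
function findByNameLower(db, name) {
  return db.getCollection("peeps").find({ nameLower: name.toLowerCase() });
}
```

With an ordinary index on `nameLower` this is an efficient exact match, but the schema now carries a field that exists only to serve one query.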
New MongoDB: Dude: Collation!
Me: (Googles MongoDB Collation frantically…)
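What the frantic googling turns up is something like the following. Treat it as a sketch under assumptions: the `en` locale is my pick for the example, and the helper names are made up. The `collation` index option and the cursor's `collation()` method are the real shell API (MongoDB 3.4 and later).

```javascript
// Strength 2 compares base letters and diacritics but ignores case.
// The "en" locale is an assumption made for this example.
const CASE_INSENSITIVE = { locale: "en", strength: 2 };

// Hypothetical helper: build an index on `name` that uses the collation.
function createCaseInsensitiveNameIndex(db) {
  return db
    .getCollection("peeps")
    .createIndex({ name: 1 }, { collation: CASE_INSENSITIVE });
}

// Hypothetical helper: exact match on `name`, compared under the collation.
// The query has to specify the same collation as the index for that
// index to support the string comparison.
function findByNameIgnoringCase(db, name) {
  return db
    .getCollection("peeps")
    .find({ name: name })
    .collation(CASE_INSENSITIVE);
}
```

In `mongosh`, `findByNameIgnoringCase(db, "bob")` is still an equality query, so it can be answered with a tight index lookup instead of a walk over every key.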
Collation is a very welcome addition to MongoDB.
You can set a collation on a whole collection, or apply it to specific indexes and queries.
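For the whole-collection flavor, a minimal sketch might look like this. The helper name is hypothetical and the `en` locale is again an assumption. A default collation has to be given when the collection is created, so this only works if `peeps` does not exist yet.

```javascript
// Hypothetical helper: create `peeps` with a case-insensitive default
// collation. Indexes and queries on the collection inherit this default
// unless they specify a collation of their own.
function createPeepsWithDefaultCollation(db) {
  return db.createCollection("peeps", {
    collation: { locale: "en", strength: 2 },
  });
}
```

With that default in place, a plain `db.peeps.find({ name: "bob" })` compares names case-insensitively without any extra options on the query.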
The main pain point it solves for me is the case-insensitive string match, which previously required either changing the schema just for that (ick!), or using a case-insensitive regex (which can use an index, but has to examine every key, so it is nowhere near as efficient as an exact match).
Beyond case sensitivity, collation also handles character variants, diacritics, and sort order. This is a very important addition to the engine, and critical for wide adoption across many languages.
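MongoDB's collation support is built on ICU, and JavaScript's `Intl.Collator` exposes similar comparison levels, so you can get a feel for the behavior without a server. The snippet below is an analogy rather than MongoDB's API: I am treating `sensitivity: "accent"` as roughly what strength 2 does and `sensitivity: "base"` as roughly strength 1, and the names are invented sample data.

```javascript
// Ignores case, still distinguishes diacritics (roughly strength 2).
const ignoreCase = new Intl.Collator("en", { sensitivity: "accent" });

// Ignores both case and diacritics (roughly strength 1).
const ignoreCaseAndAccents = new Intl.Collator("en", { sensitivity: "base" });

// Two strings are "the same name" when the collator says they compare equal.
function sameName(collator, a, b) {
  return collator.compare(a, b) === 0;
}

// Invented sample names, used only to show locale-aware ordering.
const names = ["Zoë", "bob", "Émile", "alice"];

// Plain sort() orders by code unit, so uppercase and accented letters
// end up in surprising places; the collator sorts the way a reader expects.
const byCodeUnit = [...names].sort();
const byCollation = [...names].sort(ignoreCase.compare);

console.log(sameName(ignoreCase, "bob", "BOB"));           // true
console.log(sameName(ignoreCase, "Zoe", "Zoë"));           // false
console.log(sameName(ignoreCaseAndAccents, "zoe", "ZOË")); // true
console.log(byCollation.join(", "));                       // alice, bob, Émile, Zoë
```

In MongoDB terms, the second collator corresponds to something like `{ locale: "en", strength: 1 }`, which would let a search for “zoe” find “Zoë”.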
Check out the docs: Collation