MongoDB and Cosmos DB audit log and logging
Configurator
MongoDB
When creating a new Identify instance, you can use MongoDB to store audit logs:
You can also use the audit log database for other types of logging data by selecting the SerilogSinks option for the Log target. SerilogSinks is described in more detail later in this document.
Cosmos DB for MongoDB API
You can also use Cosmos DB for MongoDB API instead of MongoDB for logging. We chose Cosmos DB for MongoDB API because the same code base, which uses the MongoDB driver, works for both MongoDB and Cosmos DB.
To use Cosmos DB, you must create a Cosmos DB account on the Azure portal:
Note that you must choose the latest version of Cosmos DB for MongoDB API. At the time of writing, the latest version was 3.6.
Navigate to the Connection String tab to get the required information for the Identify Configurator:
Enter the required information in the Configurator's Audit log settings step:
Currently, we only support using the master key to access Cosmos DB because Cosmos DB does not yet support users and roles for the MongoDB API.
Note: This limitation means that if multiple instances share the same master key, each instance can access the databases of the other instances.
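As a rough illustration of what the connection information amounts to, the sketch below assembles a MongoDB-style URI from an account name and a master key. The function name, host suffix, port and query options are assumptions for illustration only; the authoritative values are the ones shown on the Connection String tab of your own account.

```python
from urllib.parse import quote_plus


def build_cosmos_mongo_uri(account: str, master_key: str,
                           host_suffix: str = "mongo.cosmos.azure.com",
                           port: int = 10255) -> str:
    """Assemble a MongoDB-style URI for a Cosmos DB for MongoDB API account.

    host_suffix, port and the query options are assumed defaults; copy the
    real values from the Connection String tab.
    """
    # Master keys are base64 text and may contain '+', '/' and '=',
    # so they are percent-encoded before being placed in the URI.
    user = quote_plus(account)
    password = quote_plus(master_key)
    return (
        f"mongodb://{user}:{password}@{account}.{host_suffix}:{port}/"
        "?ssl=true&replicaSet=globaldb&retrywrites=false"
    )


# Hypothetical account name and key, for illustration only.
uri = build_cosmos_mongo_uri("example-account", "abc+def/ghi==")
print(uri)
```

Because the master key grants access to the whole account, a URI like this should be treated as a secret and kept out of source control.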
Audit log collections
Because a Cosmos DB database allows a maximum of 25 containers, storing each type of audit data in a separate container, as is done with SQL Server and its tables, doesn't work. Storing them in separate databases, meanwhile, is very costly. Fortunately, because both MongoDB and Cosmos DB are schema-free databases, we can store many types of audit data in a single container. The configurator therefore creates only four containers: auditLogs, userAuditLogs, claimAuditLogs and logs.
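To make the single-container idea concrete, here is a minimal sketch of how differently shaped audit records could share one container by carrying a type discriminator. The field names (auditType, date, data) and the audit type names are hypothetical and are not taken from Identify's actual schema.

```python
from datetime import datetime, timezone
from typing import Any


def to_audit_document(audit_type: str, payload: dict[str, Any],
                      created: datetime) -> dict[str, Any]:
    """Wrap a type-specific payload in a common envelope (hypothetical fields)."""
    return {
        "auditType": audit_type,      # discriminator: which kind of record this is
        "date": created.isoformat(),  # common field shared by every record
        "data": payload,              # free-form part that differs per audit type
    }


now = datetime(2020, 1, 1, tzinfo=timezone.utc)
docs = [
    to_audit_document("LoginSucceeded", {"userName": "alice"}, now),
    to_audit_document("ConnectionUpdated", {"connectionId": 7, "changedBy": "bob"}, now),
]

# Both documents can live in the same container even though their payloads
# differ; readers select a type by filtering on the discriminator.
login_docs = [d for d in docs if d["auditType"] == "LoginSucceeded"]
print(len(login_docs))
```

A schema-free store accepts both documents as they are, which is what makes one container per group of audit types workable.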
Users and Roles for MongoDB
Unlike Cosmos DB for MongoDB API, MongoDB does support user and role management. When creating a new Identify instance with a MongoDB audit log, the Identify configurator creates a database user that has the "insert", "update" and "find" permissions on all collections.
The Identify configurator uses the newly created user to access the audit log data of the new Identify instance. Such a dedicated user ensures that audit log data is isolated between instances.
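The following sketch shows one way a user with only those three permissions could be described using MongoDB's createRole and createUser database commands. It is not the configurator's actual code; the database, role and user names and the password are hypothetical placeholders.

```python
def build_role_command(db_name: str, role_name: str) -> dict:
    """Build a createRole command limited to insert, update and find."""
    return {
        "createRole": role_name,
        "privileges": [
            {
                # An empty collection name means "all collections in db_name".
                "resource": {"db": db_name, "collection": ""},
                "actions": ["insert", "update", "find"],
            }
        ],
        "roles": [],  # no inherited roles
    }


def build_user_command(db_name: str, user_name: str,
                       password: str, role_name: str) -> dict:
    """Build a createUser command that grants only the custom role."""
    return {
        "createUser": user_name,
        "pwd": password,
        "roles": [{"role": role_name, "db": db_name}],
    }


# Hypothetical names, for illustration only.
role_cmd = build_role_command("identify_audit_tenant1", "auditWriter")
user_cmd = build_user_command("identify_audit_tenant1", "tenant1_audit",
                              "change-me", "auditWriter")
print(role_cmd)
print(user_cmd)
```

With a driver such as PyMongo, command documents like these could be passed to Database.command by a sufficiently privileged administrator. Because the role is scoped to one database, a user created this way cannot read another instance's audit data.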
Support Serilog Sinks
When you use MongoDB or Cosmos DB for the audit log, you can also send all other types of logging data there. Specifically, this data is logged to a dedicated collection named "logs". The result looks like this:
For MongoDB
For Cosmos DB
Note that, to reduce cost, we use throughput shared at the database level instead of throughput provisioned per container. Per the Azure Cosmos DB service quotas, a collection requires a minimum of 100 RU/s. However, because we only have 4 containers, the minimum throughput of the Identify audit log database is the minimum that Cosmos DB requires, 400 RU/s, rather than the number of collections × 100. By using a single collection for audit log data, we reduce the minimum throughput from (32 × 100) = 3200 RU/s to 400 RU/s, where 32 is the number of different audit log types that Identify has at the time of writing. You can see the result on the Scale tab in Cosmos DB Data Explorer:
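The arithmetic behind these numbers can be checked with a few lines. The sketch below encodes only the two figures quoted in this section (100 RU/s per container and a 400 RU/s database minimum); Azure's quotas may change, so treat the constants as assumptions rather than current limits.

```python
MIN_RU_PER_CONTAINER = 100  # per-container minimum quoted in this section
MIN_RU_PER_DATABASE = 400   # database-level minimum quoted in this section


def min_shared_throughput(container_count: int) -> int:
    """Minimum RU/s for a shared-throughput database, per the figures above."""
    return max(MIN_RU_PER_DATABASE, container_count * MIN_RU_PER_CONTAINER)


four_containers = min_shared_throughput(4)          # the layout Identify uses
one_per_audit_type = min_shared_throughput(32)      # one container per audit type
print(four_containers, one_per_audit_type)          # prints: 400 3200
```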
Enable database logging
You can store all logging data in this container by selecting the "SerilogSinks" option for the Log target setting found on the Logging/Settings page:
Audit log viewer: technical details
Currently, the audit log viewer feature is available for the old Identify Admin. Although it is partially available on the new Safewhere Admin, you can only view user-related audit log data.
On Identify Admin, the viewer works with both MongoDB and Cosmos DB.
When it comes to viewing log data, performance is key. The audit log viewer supports filtering data with a date range and a text filter.
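As an illustration of the kind of query such a viewer might issue, the sketch below builds a MongoDB filter document from a date range and an optional text filter. The field names date and filter, and the choice of a case-insensitive substring match, are assumptions; the viewer's real query may differ.

```python
import re
from datetime import datetime, timezone
from typing import Optional


def build_audit_filter(start: datetime, end: datetime,
                       text: Optional[str] = None) -> dict:
    """Build a MongoDB query filter for a date range plus optional text."""
    query: dict = {"date": {"$gte": start, "$lte": end}}
    if text:
        # Escape the user's input so it is matched literally, not as a pattern.
        query["filter"] = {"$regex": re.escape(text), "$options": "i"}
    return query


start = datetime(2020, 1, 1, tzinfo=timezone.utc)
end = datetime(2020, 1, 31, tzinfo=timezone.utc)
date_only = build_audit_filter(start, end)
with_text = build_audit_filter(start, end, "alice")
print(with_text)
```

A filter like this would typically be paired with a descending sort on the date field so that the newest entries appear first.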
From Microsoft's documentation:
Azure Cosmos DB is a schema-agnostic database that allows you to iterate on your application without having to deal with schema or index management. By default, Azure Cosmos DB automatically indexes every property for all items in your container without having to define any schema or configure secondary indexes.
This means that, by default, Identify's Cosmos DB audit log database contains all the indexes required for fast querying. The default automatic range index on the date field is sufficient for all searches where Identify orders search results by the date field. However, to speed up searches that also use the filter field, we define a composite index for every audit log container, which reduces the RU cost of the query.
For more detail about indexing policy for queries with filters on multiple properties, see Microsoft's Azure Cosmos DB indexing policy documentation.
If you do not define a composite index for a query with filters on multiple properties, the query will still succeed. However, the RU cost of the query can be reduced with a composite index.
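Through the MongoDB API, a composite index of this kind corresponds to a compound index. The sketch below builds createIndexes command documents for the three audit log containers named earlier. The field names filter and date and their sort directions are assumptions; Identify's actual index definition is not shown in this document.

```python
def build_compound_index_command(collection: str,
                                 fields: list[tuple[str, int]]) -> dict:
    """Build a createIndexes command for one compound index.

    fields is an ordered list of (field name, direction) pairs,
    where direction is 1 (ascending) or -1 (descending).
    """
    key = {name: direction for name, direction in fields}
    # Same naming pattern MongoDB uses by default, e.g. "filter_1_date_-1".
    index_name = "_".join(f"{name}_{direction}" for name, direction in fields)
    return {
        "createIndexes": collection,
        "indexes": [{"key": key, "name": index_name}],
    }


# Hypothetical field names; container names are those listed in this document.
index_fields = [("filter", 1), ("date", -1)]
commands = [
    build_compound_index_command(name, index_fields)
    for name in ("auditLogs", "userAuditLogs", "claimAuditLogs")
]
print(commands[0])
```

Field order matters in a compound index, so the sketch keeps the pairs in an ordered list rather than relying on the caller to build the key document directly.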