Monday, February 13, 2012

Duplicate Detection in CRM 2011

The Duplicate Detection feature has improved considerably since CRM 4.0, but it is often overlooked. In this post I would like to highlight what has improved in Duplicate Detection, how to use it, and what is still missing (areas for improvement in future releases).

Synchronous MatchCode Generation
MatchCodes allow the CRM platform to detect duplicates during create/update operations. In CRM 4.0, MatchCode generation was asynchronous: a recurring system job generated the MatchCodes. As a result, if you created a record and then created a duplicate of it within a few seconds, the platform would not detect the duplicate because the MatchCode for the first record had not yet been generated. Another blog post describes how to work around this limitation by using the PersistInSyncOptionalParameter; however, when creating or updating a record through the UI, that parameter was not enabled by default, so you had to write code to force synchronous MatchCode generation. In CRM 2011 the PersistInSyncOptionalParameter (or CalculateMatchCodeSynchronously) parameter no longer applies because MatchCode generation is always synchronous. This means that regardless of how quickly you create one record after another (and, theoretically, how many threads you use), the CRM platform can always detect immediately whether you are creating a duplicate. Being able to rely on the Duplicate Detection Engine to capture all your duplicates without having to account for timing is a huge relief in my opinion.

Different Way to Specify Whether to Run Duplicate Detection on Record Create/Update
Not much has changed here except perhaps the syntax. By default, when you execute a create or an update request against the CRM Web Service, the record is created/updated without checking for duplicates. To force the platform to check for duplicates, you need to set the SuppressDuplicateDetection optional parameter to “false”. Not specifying this optional parameter has the same effect as setting it to “true” (duplicate detection is bypassed):
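// Requires: System.ServiceModel, Microsoft.Xrm.Sdk, Microsoft.Xrm.Sdk.Messages
CreateRequest req = new CreateRequest();
Account acc = new Account { Name = name };
req.Target = acc;
req.Parameters["SuppressDuplicateDetection"] = false; // run duplicate detection for this request

try
{
    service.Execute(req); // service: your IOrganizationService instance
}
catch (FaultException<OrganizationServiceFault> ex)
{
    if (ex.Detail.ErrorCode != -2147220685) throw; // some other fault: rethrow
    // duplicate detected: Handle it here
}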
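
One way to handle a detected duplicate is to ask the platform which existing records matched. The following is only a minimal sketch, assuming the RetrieveDuplicatesRequest message from Microsoft.Crm.Sdk.Messages and a late-bound account entity; the class name, the method name and the page size of 50 are hypothetical choices for illustration, not something taken from this post.

```csharp
using Microsoft.Crm.Sdk.Messages;
using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Query;

public static class DuplicateLookup
{
    // Hypothetical helper: returns the existing accounts that the published
    // duplicate detection rules consider duplicates of the candidate record.
    public static EntityCollection FindDuplicateAccounts(IOrganizationService service, string name)
    {
        // Late-bound candidate record; it does not need to exist in CRM.
        var candidate = new Entity("account");
        candidate["name"] = name;

        var request = new RetrieveDuplicatesRequest
        {
            BusinessEntity = candidate,
            MatchingEntityName = "account",
            // Arbitrary page size chosen for this sketch.
            PagingInfo = new PagingInfo { PageNumber = 1, Count = 50 }
        };

        var response = (RetrieveDuplicatesResponse)service.Execute(request);
        return response.DuplicateCollection;
    }
}
```

Calling a helper like this from the catch block would let you show the matching records to the user, or pick one of them to update instead of creating a new record.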

Enhanced Duplicate Detection Processing (UR5+)
Update Rollup 5 (UR5) brought some further improvements to Duplicate Detection:
  1. You can specify in a rule whether or not duplicate detection should consider two null (blank) values as identical.
  2. You can choose to detect duplicates only among active records, which can be useful to improve performance.
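
If you create rules through the SDK rather than the UI, the two options above can probably be set in code as well. The sketch below rests on assumptions: the attribute names excludeinactiverecords (on the rule) and ignoreblankvalues (on the rule condition) are my best understanding of how these options are exposed and should be checked against your SDK version. The class name, method name and rule name are hypothetical.

```csharp
using System;
using Microsoft.Crm.Sdk.Messages;
using Microsoft.Xrm.Sdk;

public static class DuplicateRuleSetup
{
    // Hypothetical helper: creates and publishes an account rule that matches
    // on the account name and uses both UR5 options.
    public static Guid CreateAccountNameRule(IOrganizationService service)
    {
        var rule = new Entity("duplicaterule");
        rule["name"] = "Accounts with the same name";
        rule["baseentityname"] = "account";
        rule["matchingentityname"] = "account";
        // Assumed attribute name: only compare against active records.
        rule["excludeinactiverecords"] = true;
        Guid ruleId = service.Create(rule);

        var condition = new Entity("duplicaterulecondition");
        condition["baseattributename"] = "name";
        condition["matchingattributename"] = "name";
        condition["operatorcode"] = new OptionSetValue(0); // 0 = exact match
        // Assumed attribute name: do not treat two blank values as a match.
        condition["ignoreblankvalues"] = true;
        condition["regardingobjectid"] = new EntityReference("duplicaterule", ruleId);
        service.Create(condition);

        // Publishing is carried out by an asynchronous system job, so the rule
        // only takes effect once that job has completed.
        service.Execute(new PublishDuplicateRuleRequest { DuplicateRuleId = ruleId });
        return ruleId;
    }
}
```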

What is Still Missing?
In my opinion, the most fundamental feature that Duplicate Detection still lacks is the ability to package duplicate detection rules as part of a solution. It is quite painful to have to replicate the rules in every environment, and having to do it directly in production can be scary: some rules are quite complex and recreating them by hand is error-prone. I hope Microsoft has plans to improve the solution framework in the future to include “solutionizing” duplicate detection rules!


    1. Hi,

      Great blog!

      I have found an odd issue with CRM 2011 duplicate detection. I have merged records in CRM, but when I run the duplicate detection job again, it still shows me the records that I have merged, with inactive status. Can I show only the records that have not been merged? I have 1,000 duplicate records, and it is ridiculous to have to see records that are already merged in the duplicate detection job results.

    2. In the duplicate detection rule record, you can tick the option to ignore inactive records.