
iManage and Email Duplicate Detection

Brian Podolsky


Autonomy (HP) recently released WorkSite Server 8.5 SP1 Update 6, which adds enhanced server-side email duplicate detection. Previously, the FileSite client evaluated duplicates based on the MSG_ID value from Exchange. This often caused problems with forwarded messages and with certain Outlook forms that share a MSG_ID. Update 6 enables server-side duplicate detection based on the message send date and message subject, in addition to the MSG_ID. This means that forwarded messages are NOT treated as duplicates and will also be filed, which is worth keeping in mind when estimating storage on your file servers and indexers: more emails will be filed with the new algorithm.
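To make the difference concrete, here is a minimal sketch of the two comparison strategies. It is not Autonomy's implementation: the `Email` class, the function names, and the idea that the server simply compares the three fields as a tuple are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Email:
    """Hypothetical stand-in for a filed message; not a WorkSite type."""
    msg_id: str
    sent: datetime
    subject: str


def client_side_key(email):
    # Old behavior as described in this post: MSG_ID only.
    return email.msg_id


def server_side_key(email):
    # New behavior as described: MSG_ID plus send date and subject.
    # Combining them into a tuple is an assumption for illustration.
    return (email.msg_id, email.sent, email.subject)


def count_filed(emails, key):
    """Count how many messages survive duplicate detection under `key`."""
    seen = set()
    for email in emails:
        seen.add(key(email))
    return len(seen)


original = Email("ABC123", datetime(2012, 5, 1, 9, 0), "Engagement letter")
# A forward that (per the problem described above) shares the MSG_ID.
forwarded = Email("ABC123", datetime(2012, 5, 1, 9, 30), "FW: Engagement letter")
# The same original message filed a second time.
refiled = Email("ABC123", datetime(2012, 5, 1, 9, 0), "Engagement letter")

mailbox = [original, forwarded, refiled]
print(count_filed(mailbox, client_side_key))  # 1
print(count_filed(mailbox, server_side_key))  # 2
```

Under the MSG_ID-only key the forward is swallowed as a duplicate. Under the three-field key it is filed alongside the original, while the true re-file is still caught.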

According to the release notes, three registry keys need to be added to each WorkSite Server in order to use server-side duplicate detection. The first one disables client-side email duplicate detection:

HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Interwoven\WorkSite\imDmsSvc
Name: Duplicate Detection Type
DWORD Value: 2 – Disables Client-side E-mail Duplicate Detection

The second key enables the server-side detection:

HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Interwoven\WorkSite\imDmsSvc
Name: Server-Side Duplicate Detection
String Value: Y – Server performs duplicate detection

The third key determines the scope of the detection, and, as I’ll explain afterwards, could be the most dangerous key you could configure:

HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Interwoven\WorkSite\imDmsSvc
Name: Server-Side Duplicate Detection Type
DWORD Values:
0 – [default] Searches for duplicates within the target folder or workspace.
1 – Searches for duplicates across the entire target library.
2 – Disables duplicate detection.
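If you would rather import these settings than type them into regedit on every server, the three values can be rendered as a standard .reg file. The sketch below only builds the file text; it does not touch the registry. The backslash-separated key path is my reading of the path listed above, the third value is left at its default of 0, and the helper names are made up for this example. Check the release notes before importing anything.

```python
# Key path as listed above, with backslash separators assumed.
KEY_PATH = r"HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Interwoven\WorkSite\imDmsSvc"

# (value name, value kind, data) for the three settings in this post.
SETTINGS = [
    ("Duplicate Detection Type", "dword", 2),               # disable client-side detection
    ("Server-Side Duplicate Detection", "string", "Y"),     # enable server-side detection
    ("Server-Side Duplicate Detection Type", "dword", 0),   # 0 = folder/workspace scope
]


def format_value(kind, value):
    """Render one value in .reg syntax (DWORDs are 8 hex digits)."""
    if kind == "dword":
        return "dword:{:08x}".format(value)
    # Plain strings are quoted; "Y" needs no escaping.
    return '"{}"'.format(value)


def build_reg_file(key_path, settings):
    """Return the text of a .reg file that sets `settings` under `key_path`."""
    lines = ["Windows Registry Editor Version 5.00", "", "[{}]".format(key_path)]
    for name, kind, value in settings:
        lines.append('"{}"={}'.format(name, format_value(kind, value)))
    return "\n".join(lines) + "\n"


reg_text = build_reg_file(KEY_PATH, SETTINGS)
print(reg_text)
```

Save the output as something like `dedupe.reg` and review it before importing. The scope value is the one to double-check.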

Why might this key be dangerous? With the right combination of supported configurations, it could spell disaster. First, imagine you have Email Management for Outlook installed, and have users who have linked folders within their Outlook inbox to matter Workspaces. Next, imagine that when they linked their Outlook folders, they chose the option to keep the email in Outlook after it is filed. Next, imagine each folder has 5,000 emails, and some of these emails have attachments that are 1 MB each. See where I’m going with this? OK, well, now imagine you set that Server-Side Duplicate Detection Type registry key to 2, disabling duplicate detection.

With this combination, the Email Filing Service will cycle through the linked folders and file thousands of emails over and over again, until you run out of space on either your file server or your SQL data/log volumes. Or until you notice that your new document number is all of a sudden 300,000 more than it was yesterday.
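A toy model shows how quickly this adds up. It is only a back-of-the-envelope simulation, not how the Email Filing Service works internally: the function name, the single linked folder, and the figure of 60 passes are arbitrary assumptions, since I don't know how often the service actually cycles.

```python
def documents_created(emails_per_folder, linked_folders, passes, dedup_enabled):
    """Count documents created when every pass re-reads every linked folder.

    Emails stay in Outlook after filing, so each pass sees all of them again.
    """
    already_filed = set()
    created = 0
    for _ in range(passes):
        for folder in range(linked_folders):
            for email in range(emails_per_folder):
                key = (folder, email)
                if dedup_enabled and key in already_filed:
                    continue  # duplicate detected, nothing new is filed
                already_filed.add(key)
                created += 1
    return created


# One linked folder of 5,000 emails, 60 passes (an arbitrary number).
print(documents_created(5000, 1, 60, dedup_enabled=False))  # 300000
print(documents_created(5000, 1, 60, dedup_enabled=True))   # 5000
```

With detection on, the count stops at the number of distinct emails. With it off, the count grows by a full folder's worth on every pass.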

I contacted Autonomy about this potential situation, and they confirmed that there is no way to force de-duplication for linked folders.  I’ve submitted a new feature request for this behavior to Autonomy.  I hope they read those things.

All that being said, I have successfully performed this upgrade and enabled these features at a client.  The enhanced duplicate detection functions properly, and even adds links to the original email if a user attempts to file it to a different Workspace folder.  This is yet another step in the right direction for iManage’s email-filing capabilities.