Sonoma Partners Microsoft CRM and Salesforce Blog

OCR in CRM

Today's blog post was written by Angel Shishkov, Principal Developer at Sonoma Partners.

OCR (Optical Character Recognition) technology has been quietly getting better over the years, while everyone has been distracted by voice recognition (Hey Cortana! Cortana? Hey? Oh, fine, I’ll just click it). Generic solutions exist for character recognition, and specialized solutions exist for more specific tasks, like reading receipts. CRM can also benefit from OCR, especially for turning scanned paper forms into structured CRM records.

I will demonstrate how OCR can be used in CRM by setting up a simple integration between CRM and an OCR API.

Requirement

A hypothetical customer uses paper forms to send in requisition orders for equipment. The forms are very basic and contain the equipment serial number or SKU, the amount to be requisitioned, and the name of the person requesting the equipment. The equipment exists in CRM as native Products and the people exist as native Contacts. A custom entity called Requisition was created to store the SKU, Amount and Requisition By fields for each request.

The current requisition process works as follows:

  1. Requisition form is filled in and printed/scanned.
  2. Requisition form is attached to an email and sent to a CRM mailbox.
  3. A CRM user receives the email and opens the attachment.
  4. A CRM user manually creates a new Requisition record and enters the information from the attachment.

Design

We will build a custom integration with an OCR service called OCR.Space. This is a web service API, making it ideal for use from within CRM Online, and it has a free pricing tier so we can experiment with it without obligation to buy.

We will have a plugin that fires on new Emails. It will use the OCR.Space service to read the requisition form attachment and then automatically create a new Requisition record in CRM with those values.

The new requisition process will work as follows:

  1. Requisition form is filled in and printed/scanned.
  2. Requisition form is attached to an email and sent to a CRM mailbox.
  3. Email arrives into CRM and fires a custom plugin.
  4. Custom plugin reads the email attachment and sends it to the OCR service to be interpreted.
  5. Custom plugin reads the values returned from OCR and creates a new Requisition record with those values.

Everything on the CRM side will be automated, with no manual work.

Implementation

The custom plugin will fire on Email create. It checks whether the email is a requisition (based on the address it was sent to, the subject line, or some other rule that fits your process) and, if so, reads the attachment from the email. Attachments are retrieved from CRM as Base64-encoded strings, which is convenient because the OCR.Space service accepts images in that form.
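As a rough illustration of the "is this a requisition?" check, here is a minimal sketch. It is written in Python for brevity (the real plugin would be C#), and the mailbox address and subject prefix are made-up placeholders, not values from the actual implementation.

```python
from typing import Iterable, Optional

# Hypothetical values: replace with whatever identifies requisitions in your org.
REQUISITION_MAILBOX = "requisitions@example.com"
SUBJECT_PREFIX = "requisition"


def is_requisition_email(to_addresses: Iterable[str], subject: Optional[str]) -> bool:
    """Return True if the email looks like a requisition request."""
    # Rule 1: the email was sent to the dedicated requisition mailbox.
    sent_to_mailbox = any(
        address.strip().lower() == REQUISITION_MAILBOX for address in to_addresses
    )
    # Rule 2: the subject line starts with the agreed prefix (case-insensitive).
    subject_matches = (subject or "").strip().lower().startswith(SUBJECT_PREFIX)
    return sent_to_mailbox or subject_matches
```

Either rule on its own is enough here; a stricter process might require both.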

The OCR.Space service accepts an HTTP “POST” request with an “apikey” header containing the API key we were given when we registered. The image to read is sent in a “base64Image” parameter in the request body.

The service returns a JSON object that describes the text which was read from the image. The service documentation for the request and response parameters can be found here: https://ocr.space/ocrapi
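To make the request format concrete, here is a minimal sketch of building (but not sending) the POST request. It is in Python rather than the plugin's C#, purely for illustration. The endpoint URL and the data-URI prefix on the image are taken from my reading of the OCR.Space documentation and should be checked against the link above; the function name and default MIME type are my own.

```python
import urllib.parse
import urllib.request

# Assumption: endpoint as listed in the OCR.Space docs; verify before use.
OCR_URL = "https://api.ocr.space/parse/image"


def build_ocr_request(
    api_key: str, attachment_b64: str, mime_type: str = "image/png"
) -> urllib.request.Request:
    """Build the POST request for one Base64-encoded attachment."""
    # Assumption: the service expects the Base64 body wrapped in a data URI.
    data_uri = f"data:{mime_type};base64,{attachment_b64}"
    # Form-encode the body; '+', '/' and '=' in the Base64 text are escaped.
    body = urllib.parse.urlencode({"base64Image": data_uri}).encode("ascii")
    return urllib.request.Request(
        OCR_URL,
        data=body,
        headers={
            "apikey": api_key,  # API key travels in a header, not the body
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit it; that step is left out so the sketch can be run without an API key.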

Here is an example (oversimplified) Requisition Form that would be attached to an incoming email.

[Image: example requisition form]

First, we will create some classes to hold the deserialized JSON response. They are decorated with [DataContract] and [DataMember] so we can deserialize the JSON string into them using a DataContractJsonSerializer. 
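The same idea, sketched in Python instead of C# data contracts: small typed classes that hold only the parts of the response we care about. The JSON field names used here (ParsedResults, ParsedText, IsErroredOnProcessing) are my understanding of the OCR.Space response format and should be confirmed against the documentation; the class and function names are illustrative.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class ParsedResult:
    # Text recognized in one image or page.
    parsed_text: str = ""


@dataclass
class OcrResponse:
    is_errored: bool = False
    parsed_results: List[ParsedResult] = field(default_factory=list)


def parse_ocr_response(raw_json: str) -> OcrResponse:
    """Deserialize the OCR service's JSON string into typed objects."""
    payload = json.loads(raw_json)
    # Tolerate a missing or null ParsedResults array (e.g. on errors).
    results = [
        ParsedResult(parsed_text=item.get("ParsedText") or "")
        for item in (payload.get("ParsedResults") or [])
    ]
    return OcrResponse(
        is_errored=bool(payload.get("IsErroredOnProcessing", False)),
        parsed_results=results,
    )
```

Fields the plugin does not need are simply ignored, which keeps the classes small.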

Next, we set up the base plugin that will read the email attachment. That is standard CRM work, so I will omit it here. Finally, we create a method to take the email attachment as a Base64 string, send it to OCR and create a new Requisition record from the results.

Here is what the code does:

  • Start with the URL to the OCR service and the API key. These can come from the plugin configuration or a config entity.
  • Construct the POST request by adding the “base64Image” parameter to the data payload and the “apikey” to the header.
  • Submit the request and retrieve the response from the OCR service.
  • Deserialize the response JSON string using a DataContractJsonSerializer into the classes we created earlier.
  • Read the text returned by OCR and split it up into lines. Find the lines that start with “Product”, “Amount” and “Requisition By” and parse out the values following the line labels.
  • Fill in each parsed value into the appropriate attribute on the new Requisition record.
  • Call the OrgService to create the new Requisition record.
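The line-parsing step in the list above can be sketched as follows. This is Python for illustration only (the plugin would do the same in C#), and it assumes each form line looks like "Label: value" or "Label value"; the exact layout of the form is an assumption on my part.

```python
from typing import Dict, Optional

# Line labels named in the steps above.
LABELS = ("Product", "Amount", "Requisition By")


def parse_requisition_text(text: str) -> Dict[str, Optional[str]]:
    """Extract the value following each known label from OCR output."""
    values: Dict[str, Optional[str]] = {label: None for label in LABELS}
    for line in text.splitlines():  # handles both \n and \r\n line endings
        stripped = line.strip()
        for label in LABELS:
            # Case-insensitive match, since OCR can get capitalization wrong.
            if stripped.lower().startswith(label.lower()):
                remainder = stripped[len(label):]
                # Drop the separator (colon and/or whitespace) after the label.
                values[label] = remainder.lstrip(" :\t").strip() or None
                break
    return values
```

Labels that are not found stay as None, so the caller can decide whether to create the Requisition record anyway or flag the email for manual review.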

Conclusion

Automating processes in CRM through OCR can save a lot of repetitive, manual work that is prone to data entry errors. While OCR is not perfect and will make the occasional mistake, it tends to be very good when the input image is clean, in a standard format and uses a consistent text font. If you are looking to build an OCR integration for your CRM system and need some help, give us a call. Thanks for reading!

Topics: Microsoft Dynamics 365