Today's guest blogger is BJ Dibbern, a Developer at Sonoma Partners.
Just the other day a coworker and I were trying to track down a request to CRM 2011’s OData service from Silverlight that simply would not complete. We first noticed that a very simple update request was taking forever to finish, and we assumed some sort of error was happening that was not bubbling up through the user interface. We went down a long path to troubleshoot this issue, and this is our tale.
We started by opening Fiddler to check whether an error was actually occurring but not bubbling up correctly. We’ve previously covered using Fiddler to help with troubleshooting problems in CRM, and you can reference that here (http://blog.sonomapartners.com/2012/01/fiddling-with-crm.html). However, we sat and watched the request, and it never completed, just as in the UI.
This was really strange, as one would assume the request would eventually time out, but in this case it never did.
For our next step, we headed over to the CRM web server to check its Event Viewer for any error or warning logged around the time of the request. While nothing was explicitly flagged as an error, we did find an event logged around that time. When we looked at its attached log, we found a detailed error stating: “Generic SQL Error”.
If you ever find yourself faced with a “Generic SQL Error”, it is far better to start with a SQL trace directly on the database than with a CRM trace. Conceptually this makes sense, as the error itself is actually occurring in the database.
We turned to SQL Server Profiler to run a trace against the database. This can be done by following the steps below.
1. Open the SQL Server Profiler. This can be found in the start menu under Microsoft SQL Server 2008 –> Performance Tools.
2. Select File –> New Trace.
3. Connect to the SQL Server in question.
4. On this screen’s General tab you can enter a name for the trace if you desire, and use a template if you have created one previously. Note that these instructions assume that you do not have a template setup already.
5. Go to the Events Selection tab.
6. Here you can start to select the events you would like to monitor. Filtering this down and checking individual events one at a time will help you hone in on your problem, but for now, we’ll select a set of events that are relevant. Note that for each event, if you check the box on the far left, the profiler will automatically select all the other columns on the right. Below you will find some suggested events from each category to monitor.
a. Errors and Warnings: Exception, User Error Message.
b. Stored Procedures: RPC Completed, SP:StmtCompleted, SP:StmtStarting.
c. Locks: Lock:Deadlock, Lock:Deadlock Chain, Lock:Timeout, Lock:Timeout (timeout > 0).
d. TSQL: SQL:StmtCompleted, SQL:StmtStarting.
7. After selecting all your desired events to monitor, select the Column Filters button in the lower right hand corner of the window.
8. In the Column Filters window, select DatabaseName from the list on the left, and under Like enter the name of the desired CRM Organization’s database to be monitored. Performing this step will make sure you are only monitoring the data that is relevant to what you’re trying to troubleshoot.
9. Next you can click OK to get out of the Column Filters window.
10. Before you actually click Run to start the trace, make sure you’re set up to reproduce the error message. Once ready, click Run in the SQL Server Profiler, then perform the action in CRM that causes the error. You will be able to Start, Pause, and Stop the trace from the toolbar in the main window of SQL Server Profiler.
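As a quicker sanity check before (or alongside) a full Profiler trace, you can ask SQL Server directly whether any request is currently blocked. This is a hypothetical helper query, not part of the original walkthrough; it relies on the sys.dm_exec_requests DMV, whose blocking_session_id column is 0 when a request is not blocked.

```sql
-- Sketch: list requests that are currently blocked and the session
-- blocking each of them. blocking_session_id = 0 means "not blocked".
SELECT
    r.session_id,
    r.blocking_session_id,
    r.wait_type,
    r.wait_time,        -- milliseconds spent waiting so far
    r.command
FROM sys.dm_exec_requests AS r
WHERE r.blocking_session_id <> 0;
```

If this returns rows during your repro, the blocking_session_id values point at the sessions worth investigating in the trace.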
We were able to monitor the errors and watch them happen during the trace. From here we correlated what was being logged and determined that a still-running process held a lock on the table we were trying to access. We found a Lock:Timeout (http://msdn.microsoft.com/en-us/library/ms189107.aspx) event in the trace, and this led us to the fact that, as the event indicates, a currently running process had a lock on the table and was causing our OData queries from Silverlight to fail.
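To see for yourself which sessions hold locks in the organization database, you can query the sys.dm_tran_locks DMV. This is a hedged sketch, not the query we used in the original investigation; 'YourOrg_MSCRM' is a placeholder for your CRM organization database name, and the join to sys.partitions resolves object-level lock resources to a table name.

```sql
-- Sketch: sessions holding or waiting on locks in the CRM org database,
-- with the lock mode and (where resolvable) the locked object's name.
SELECT
    l.request_session_id,
    l.resource_type,
    l.request_mode,        -- e.g. S, X, IX
    l.request_status,      -- GRANT vs. WAIT
    [object] = OBJECT_NAME(p.[object_id])
FROM sys.dm_tran_locks AS l
LEFT OUTER JOIN sys.partitions AS p
    ON p.hobt_id = l.resource_associated_entity_id
WHERE l.resource_database_id = DB_ID('YourOrg_MSCRM');  -- assumed db name
```

Note that OBJECT_NAME resolves names in the current database, so run this while connected to the organization database itself.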
Fair warning here that these methodologies were used on a development environment, and thus data integrity was not a huge concern. These methods should be used with caution and should be thoroughly evaluated by all parties involved before applying to a production environment.
After running a simple query, we verified that a session did have a lock on the table and had been running for several hours. Since we knew the table we were looking at did not have any long-running processes that should have been occurring, we knew we could kill the process. After we killed it, things thankfully returned to normal. We used the following steps to retrieve the currently running process and then kill it.
1. Open SQL Server Management Studio.
2. Connect to the SQL Server in question.
3. Create a new Query.
4. Execute the SQL statement below to display a list of currently running sessions and the statement each one is executing.
5. If you encounter a long-running transaction that you’d like to end, just run a new statement, “KILL [sessionId]”, where [sessionId] is the numeric identifier of the session you’d like to end.
SELECT TOP 100
    [session_id] = s.[session_id],
    [request_start_time] = s.[last_request_start_time],
    [current_time] = CURRENT_TIMESTAMP,
    [estimated_finish_time] = DATEADD(MILLISECOND,
        r.[estimated_completion_time], CURRENT_TIMESTAMP),
    current_command = SUBSTRING(t.[text], r.[statement_start_offset]/2 + 1,
        COALESCE(NULLIF(r.[statement_end_offset], -1)/2, 2147483647)),
    module = COALESCE(QUOTENAME(OBJECT_SCHEMA_NAME(t.[objectid], t.[dbid]))
        + '.' + QUOTENAME(OBJECT_NAME(t.[objectid], t.[dbid])), '<ad hoc>'),
    [status] = UPPER(s.[status])
FROM
    sys.dm_exec_connections AS c
INNER JOIN
    sys.dm_exec_sessions AS s
        ON c.session_id = s.session_id
LEFT OUTER JOIN
    sys.dm_exec_requests AS r
        ON r.[session_id] = s.[session_id]
OUTER APPLY
    sys.dm_exec_sql_text(r.[sql_handle]) AS t
WHERE
    s.[status] LIKE '%running%'
ORDER BY
    s.[session_id] DESC;
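Once you have identified the offending session from the query output, ending it is a one-liner. The session number below is only a placeholder for illustration; substitute the session_id you actually found.

```sql
-- Example only: 57 is a placeholder session_id from the query above.
-- KILL rolls back the session's open transaction, and that rollback
-- can itself take a while for a long-running transaction.
KILL 57;
```

You can watch the rollback's progress with KILL 57 WITH STATUSONLY if it takes a long time.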
While we weren’t able to uncover why that process had never completed, we were at least able to end the process, and we haven’t had the problem since.