This post is more technical than my usual ones, but I thought I would share it (and also make a note to myself for future reference).
What the application is for
I’ve been working on an ASP.NET application written in C# with a SQL database.
It is designed for the management of applicant interviews. At present it allows interviewers to record the scores from each interview, and administrators to then see an overview and make offers / holds / rejects based on the applicant’s final score (each applicant has four mini interviews). The system is being expanded to allow for the management of the timetable for the interview days and for running statistical analysis.
The site is on a secure server that uses Shibboleth authentication for all users.
What we saw happen
The day’s interviews were running well and all records were being submitted without much issue. One record had to be re-entered because the interviewer seemed to have managed to bypass the submission process; how that was possible is being looked at. Thanks to my monitoring, it was spotted and re-entered within minutes.
At around 16:30 I noticed that there were several missing records. I asked all interviewers to check their last candidate and re-enter the record if it was missing. Situation resolved.
But why did this happen? It can’t have been user error, as it happened to several interviewers all at approximately the same time.
Not being one to let things go and move on, I needed to know why this happened and how to prevent it from happening again.
My first thought was that another administrator had been working on the server at the time and had caused a service to restart. A quick instant message ruled this out.
Next I turned to a table in the database that I created to record the pages people access. At 16:19:00 a user’s page request was logged as normal. From 16:19:53 onwards, the first page request from each user was sent to default.aspx, the page a user is sent to when their session variable ‘username’ is null.
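For anyone unfamiliar with this pattern, here is a minimal sketch of the kind of session check that produces that redirect. It assumes the check lives in each page's Page_Load; the class name InterviewPage is made up for illustration, and the real application may do this differently.

```csharp
using System;
using System.Web.UI;

// Hypothetical page class; only the session key 'username' and
// default.aspx come from the application described in this post.
public partial class InterviewPage : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // If the session has been lost (for example, because the worker
        // process restarted), this lookup returns null even though the
        // user is still authenticated.
        if (Session["username"] == null)
        {
            Response.Redirect("~/default.aspx");
        }
    }
}
```

This is why nobody appeared to be logged out: the authentication was untouched, and only the application's own session data had disappeared.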
No one had reported being logged out of the system, so I didn’t think it was a Shibboleth issue.
I started digging into other logs and didn’t find much, so I reached out to colleagues for advice. They suggested that the application might have been recycled, and gave several reasons why that could happen.
I could now go back to the system logs of the server with a targeted view.
Sure enough the app pool had recycled at 16:19:06!
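Some further digging revealed that the default app pool recycle interval is 1740 minutes (29 hours). I compared this to the time of the previous recycle event and it matched up. A recycle restarts the worker process, so session variables held in memory (the ASP.NET default) are lost, which would explain why every user’s ‘username’ was suddenly null. No error, just bad timing!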
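The arithmetic is easy to check with a few lines of C#. This is only a sketch: the date is a placeholder, since the only thing taken from the logs is the time of day (16:19:06), and the class name is made up.

```csharp
using System;
using System.Globalization;

class RecycleIntervalCheck
{
    static void Main()
    {
        // Default IIS periodic recycle interval, in minutes.
        TimeSpan interval = TimeSpan.FromMinutes(1740);

        Console.WriteLine(interval.TotalHours); // 29
        Console.WriteLine(interval);            // 1.05:00:00 (1 day, 5 hours)

        // Placeholder date: only the time of day comes from the server logs.
        DateTime recycle = new DateTime(2000, 1, 2, 16, 19, 6);
        DateTime previous = recycle - interval;

        // Prints 2000-01-01 11:19:06
        Console.WriteLine(previous.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture));
    }
}
```

So a recycle at 16:19:06 implies the previous one was at 11:19:06 the day before, and the next would fall at 21:19:06 the day after. The 1.05:00:00 form is also how the same interval is written in the IIS configuration.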
I’m going to resolve this in several ways.
- Set the recycle to occur at a set time (early morning, when no one will be on the system)
- Look at storing my session variables in a table on the database instead
- Look at recreating the variables on the page rather than redirecting to default.aspx
- Finally, make sure that the session doesn’t time out on longer interviews, using this suggestion from a colleague:
Add an Ajax script manager to your form. Add an update panel and within that add an Ajax timer and a label. Set the timer interval to a minute (value is in milliseconds) and on tick update the label text to an empty string. Will keep your session alive indefinitely.
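For the first item in the list, a scheduled recycle can be set in IIS Manager under the app pool's recycling settings, or directly in applicationHost.config. Here is a sketch of the config version; the pool name InterviewAppPool and the 04:00 time are assumptions for illustration, not values from my server.

```xml
<!-- Fragment of %windir%\System32\inetsrv\config\applicationHost.config -->
<system.applicationHost>
  <applicationPools>
    <!-- "InterviewAppPool" is a hypothetical pool name -->
    <add name="InterviewAppPool">
      <recycling>
        <!-- time="00:00:00" turns off the interval-based recycle
             (the default is 1.05:00:00, i.e. 1740 minutes) -->
        <periodicRestart time="00:00:00">
          <schedule>
            <!-- Recycle once a day at 04:00 (example value) -->
            <add value="04:00:00" />
          </schedule>
        </periodicRestart>
      </recycling>
    </add>
  </applicationPools>
</system.applicationHost>
```

With the interval turned off and a fixed time set, the recycle no longer drifts by five hours each day into the middle of an interview session.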
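For the second item, ASP.NET has a built-in option for keeping session state in SQL Server, which may save building a custom table. A sketch of the web.config setting, assuming Windows authentication to the database; the server name and timeout are placeholders.

```xml
<!-- Fragment of the application's web.config -->
<configuration>
  <system.web>
    <!-- Server name is a placeholder; timeout is in minutes (example value) -->
    <sessionState mode="SQLServer"
                  sqlConnectionString="Data Source=YOUR_SQL_SERVER;Integrated Security=SSPI"
                  timeout="60" />
  </system.web>
</configuration>
```

The session state database has to be created first (the aspnet_regsql.exe tool has an -ssadd option for this), and anything stored in the session must be serializable. Once the session is held outside the worker process, it should survive an app pool recycle.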
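My colleague's keep-alive suggestion might look something like the single-file page below. This is a sketch rather than the code from the application: the control IDs and the handler name are my own choices.

```aspx
<%@ Page Language="C#" %>

<script runat="server">
    // Runs on every timer tick. The async postback itself is what
    // renews the session's sliding timeout; clearing the label just
    // gives the update panel something to refresh.
    protected void KeepAliveTimer_Tick(object sender, EventArgs e)
    {
        KeepAliveLabel.Text = string.Empty;
    }
</script>

<html>
<body>
    <form id="form1" runat="server">
        <asp:ScriptManager ID="ScriptManager1" runat="server" />
        <%-- Timer interval is in milliseconds: 60000 = one minute --%>
        <asp:UpdatePanel ID="KeepAlivePanel" runat="server">
            <ContentTemplate>
                <asp:Timer ID="KeepAliveTimer" runat="server"
                           Interval="60000" OnTick="KeepAliveTimer_Tick" />
                <asp:Label ID="KeepAliveLabel" runat="server" />
            </ContentTemplate>
        </asp:UpdatePanel>
    </form>
</body>
</html>
```

This only stops the session timing out while the page is open in the browser. On its own it does not protect in-memory session data from an app pool recycle, which is why the other items in the list are still needed.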