Tuesday, December 11, 2012

A CAPTCHA Server Control for ASP.NET (WebControl Captcha)





Implementation

The first thing I had to deal with was the image generated by the CAPTCHA class. This was originally done with a dedicated .aspx form -- something that won't exist for a server control. How could I generate an image on the fly? After some research, I was introduced to the world of HttpModules and HttpHandlers. They are extremely powerful -- and a single HttpHandler solves this problem neatly.
All we need is a small Web.config modification in the <system.web> section:
<httpHandlers>
    <add verb="GET" path="CaptchaImage.aspx" 
       type="WebControlCaptcha.CaptchaImageHandler, WebControlCaptcha" />
</httpHandlers>
This handler defines a special page named CaptchaImage.aspx. Now, this "page" doesn't actually exist. When a request for CaptchaImage.aspx occurs, it will be intercepted and handled by a class that implements the IHttpHandler interface: CaptchaImageHandler. Here's the relevant code section:
 
 
Public Sub ProcessRequest(ByVal context As System.Web.HttpContext) _
       Implements System.Web.IHttpHandler.ProcessRequest
    Dim app As HttpApplication = context.ApplicationInstance

    '-- get the unique GUID of the captcha;
    '   this must be passed in via querystring
    Dim strGuid As String = Convert.ToString(app.Request.QueryString("guid"))

    Dim ci As CaptchaImage
    If strGuid = "" Then
        '-- mostly for display purposes when in design mode
        '-- builds a CAPTCHA image with all default settings 
        '-- (this won't reflect any design time changes)
        ci = New CaptchaImage
    Else
        '-- get the CAPTCHA from the ASP.NET cache by GUID
        ci = CType(app.Context.Cache(strGuid), CaptchaImage)
        app.Context.Cache.Remove(strGuid)
    End If

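    '-- the cache entry may have expired or already been served;
    '-- without this check, ci.Image would throw on Nothing
    If ci Is Nothing Then
        app.Response.StatusCode = 404
        app.Response.End()
        Return
    End If

    '-- let the browser know we are sending an image,
    '-- and that things are 200 A-OK
    app.Response.ContentType = "image/jpeg"
    app.Response.StatusCode = 200

    '-- write the image to the HTTP output stream as an array of bytes
    ci.Image.Save(app.Context.Response.OutputStream, _
                  Drawing.Imaging.ImageFormat.Jpeg)
    app.Response.End()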

End Sub
 
 
A new CAPTCHA image will be generated, and the image streamed directly to the browser from memory. Problem solved!
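One caveat on the Web.config registration shown earlier: the <httpHandlers> section under <system.web> is read by IIS 6 and by IIS 7 in Classic mode. Under the IIS 7 integrated pipeline, handlers are registered in <system.webServer> instead. A sketch of the equivalent entry, assuming the same type and assembly names as above; the name attribute value is made up for illustration:

```xml
<system.webServer>
    <handlers>
        <!-- "CaptchaImage" is a hypothetical name; the integrated pipeline
             requires each handler entry to carry some unique name -->
        <add name="CaptchaImage" verb="GET" path="CaptchaImage.aspx"
             type="WebControlCaptcha.CaptchaImageHandler, WebControlCaptcha" />
    </handlers>
</system.webServer>
```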
However, there's another problem. There has to be communication between the HttpHandler responsible for displaying the image, and the web page hosting the control -- otherwise, how would the calling control know what the randomly generated CAPTCHA text was? If you view source on the rendered control, you'll see that a GUID is passed in through the querystring:
<img src="CaptchaImage.aspx?guid=99fecb18-ba00-4b60-9783-37225179a704" 
     border='0'>
 
This GUID (globally unique identifier) is a key used to access a CAPTCHA object that was originally stored in the ASP.NET Cache by the control. Take a look at the CaptchaControl.GenerateNewCaptcha method:
Private Sub GenerateNewCaptcha()
    LocalGuid = Guid.NewGuid.ToString
    If Not IsDesignMode Then
        HttpContext.Current.Cache.Add(LocalGuid, _captcha, Nothing, _
            DateTime.Now.AddMinutes(HttpContext.Current.Session.Timeout), _
            TimeSpan.Zero, Caching.CacheItemPriority.NotRemovable, Nothing)
    End If
    Me.CaptchaText = _captcha.Text
    Me.GeneratedAt = Now
End Sub
 
 
It may seem a little strange, but it works great! The sequence of ASP.NET events is as follows:
  1. Page is rendered.
  2. Page calls CaptchaControl1.OnPreRender. This generates a new GUID and a new CAPTCHA object reflecting the control properties. The resulting CAPTCHA object is stored in the Cache by GUID.
  3. Page calls CaptchaControl1.Render; the special <img> tag URL is written to the browser.
  4. Browser attempts to retrieve the special <img> tag URL.
  5. CaptchaImageHandler.ProcessRequest fires. It retrieves the GUID from the querystring, the CAPTCHA object from the Cache, and renders the CAPTCHA image. It then removes the Cache object.
Note that there is a little cleanup involved at the end. If, for some reason, the control renders but the image URL is never retrieved, there would be an orphan CAPTCHA object in the Cache. This can happen, but should be rare in practice -- and our Cache entry only has a 20-minute lifetime anyway.
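The store-once, read-once handshake described above can be sketched outside ASP.NET. This is only an illustration, not the control's code: a Dictionary stands in for the ASP.NET Cache (no expiry is modeled), a plain string stands in for the CAPTCHA object, and StoreCaptcha and TakeCaptcha are hypothetical names.

```vbnet
Imports System
Imports System.Collections.Generic

Module CaptchaHandshakeSketch
    ' Stand-in for the ASP.NET Cache (assumption: expiry is not modeled here).
    Private ReadOnly store As New Dictionary(Of String, String)

    ' Control side: store the CAPTCHA under a fresh GUID and return the key.
    Function StoreCaptcha(ByVal captchaText As String) As String
        Dim key As String = Guid.NewGuid().ToString()
        store(key) = captchaText
        Return key
    End Function

    ' Handler side: fetch the CAPTCHA by GUID, then remove it.
    ' Returns Nothing if the key is unknown or was already consumed.
    Function TakeCaptcha(ByVal key As String) As String
        Dim value As String = Nothing
        If store.TryGetValue(key, value) Then store.Remove(key)
        Return value
    End Function

    Sub Main()
        Dim key As String = StoreCaptcha("XK7PQ")
        Console.WriteLine("CaptchaImage.aspx?guid=" & key)
        Console.WriteLine(TakeCaptcha(key))              ' first request finds the entry
        Console.WriteLine(TakeCaptcha(key) Is Nothing)   ' second request finds nothing
    End Sub
End Module
```

Because the entry is removed on first read, a second request for the same image URL comes back empty, which is why a handler built this way needs to cope with a missing entry.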
One mistake I made early on was storing the actual CAPTCHA text in the ViewState. The ViewState is not encrypted by default and can be easily decoded! I now store only the GUID, in ControlState. The GUID is essential for retrieving the shared CAPTCHA object from the Cache -- but by itself, it is useless to an attacker.
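To make the ViewState point concrete, here is a minimal sketch. It does not parse a real ViewState payload (which is a serialized object graph, not a bare string); under that simplification, it only shows that Base64 is an encoding anyone can reverse without a key. The module and function names are hypothetical.

```vbnet
Imports System
Imports System.Text

Module ViewStateDemo
    ' Hypothetical helper: Base64-encodes a string. Real ViewState applies
    ' Base64 to serialized page state; a plain string is used here for brevity.
    Function EncodeLikeViewState(ByVal text As String) As String
        Return Convert.ToBase64String(Encoding.UTF8.GetBytes(text))
    End Function

    ' Hypothetical helper: reverses the encoding. No key or secret is needed.
    Function DecodeLikeViewState(ByVal encoded As String) As String
        Return Encoding.UTF8.GetString(Convert.FromBase64String(encoded))
    End Function

    Sub Main()
        Dim hidden As String = EncodeLikeViewState("XK7PQ")
        Console.WriteLine(hidden)                       ' what a hidden field would carry
        Console.WriteLine(DecodeLikeViewState(hidden))  ' the original text, recovered
    End Sub
End Module
```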

CaptchaControl Properties

The CaptchaControl is a good ASP.NET citizen, and properly implements all the default ASP.NET Server Control properties. It also has a few properties of its own:
CAPTCHA control properties

| Property | Default | Description |
|---|---|---|
| CacheStrategy | HttpRuntime | For security reasons, the CAPTCHA text is never sent to the client; it is only stored on the server. It can be stored in Session (web-farm friendly) or HttpRuntime (very fast, but local to one webserver). |
| CaptchaBackgroundNoise | Low | Amount of background noise to add to the CAPTCHA image. Ranges from None to Extreme. |
| CaptchaChars | A-Z, 1-9 | A whitelist of characters to use when building CAPTCHA text. Each character is picked randomly from this string. By default, I omit some characters likely to be confused, such as O, 0, I, 1, 8, B, and so on. |
| CaptchaFont | "" | Font family to use for the CAPTCHA text. If not provided, a random installed font will be chosen for each character. A font whitelist is maintained internally so only known legible fonts will be used (e.g., not WingDings). |
| CaptchaFontWarping | Low | Level of warping used on each character of the CAPTCHA text. Ranges from None to Extreme. |
| CaptchaHeight | 50 | Default height of the CAPTCHA image, in pixels. |
| CaptchaLength | 5 | Number of characters used in the randomly generated CAPTCHA text. |
| CaptchaLineNoise | None | Amount of "scribble" line noise to add to the CAPTCHA image. Ranges from None to Extreme. |
| CaptchaMaxTimeout | 90 | Number of seconds that the CAPTCHA will remain valid and stored in the cache after it is generated. |
| CaptchaMinTimeout | 3 | Minimum number of seconds the user must wait before entering a CAPTCHA. |
| CaptchaWidth | 180 | Default width of the CAPTCHA image, in pixels. |
| UserValidated | False | After postback, returns True if the user entered text that matches the randomly generated CAPTCHA text. Note that the standard IValidator interface is implemented as well. |
| LayoutStyle | Horizontal | Determines if the text and input box are to the right of, or below, the image. Allows greater layout flexibility. |
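The interplay of CaptchaMinTimeout, CaptchaMaxTimeout, and UserValidated can be sketched as follows. This is an assumption about how such a check might look, not the control's actual code: the function name IsCaptchaValid, its parameters, and the case-insensitive comparison are all hypothetical choices.

```vbnet
Imports System

Module CaptchaValidationSketch
    ' Hypothetical check: the text must match, and the answer must arrive
    ' no sooner than minTimeout and no later than maxTimeout seconds
    ' after the CAPTCHA was generated.
    Function IsCaptchaValid(ByVal expected As String, ByVal entered As String, _
                            ByVal generatedAt As DateTime, ByVal answeredAt As DateTime, _
                            ByVal minTimeout As Integer, ByVal maxTimeout As Integer) As Boolean
        Dim elapsed As Double = (answeredAt - generatedAt).TotalSeconds
        If elapsed < minTimeout Then Return False   ' answered suspiciously fast
        If elapsed > maxTimeout Then Return False   ' CAPTCHA has expired
        ' Case-insensitive comparison is an assumption made for this sketch.
        Return String.Equals(expected, entered, StringComparison.OrdinalIgnoreCase)
    End Function

    Sub Main()
        Dim t0 As New DateTime(2012, 12, 11, 9, 0, 0)
        Console.WriteLine(IsCaptchaValid("XK7PQ", "xk7pq", t0, t0.AddSeconds(10), 3, 90))  ' True
        Console.WriteLine(IsCaptchaValid("XK7PQ", "XK7PQ", t0, t0.AddSeconds(1), 3, 90))   ' False
        Console.WriteLine(IsCaptchaValid("XK7PQ", "XK7PQ", t0, t0.AddSeconds(120), 3, 90)) ' False
    End Sub
End Module
```

The minimum timeout rejects answers that arrive faster than a person could plausibly type them; the maximum timeout bounds how long a given image stays usable.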

Many of these properties have to do with the inherent tradeoff between human readability and machine readability. The harder a CAPTCHA is for OCR software to read, the harder it will be for us human beings, too! For illustration, compare these two CAPTCHA images:
