How to use Google reCaptcha

12/16/2014 01:04:00 PM
I recently added a link so readers can email me. To prevent spam, however, I made the email link a web form protected by Google's reCaptcha system, which Google recently revamped to be (allegedly) easier for humans and harder for spambots. It now looks like this:
Now you just have to check the box, and it may or may not prompt you with a challenge word depending on whether Google thinks you might be a robot.
The system really is pretty easy to use, but even so I found a frustrating lack of up-to-date information on how to do this in the ASP.NET framework with C#. All of the forum posts I found were either far more complicated than they needed to be, out of date, or incomplete. So here's my attempt to fill that gap:

To start, you need an aspx page containing your web form. I'll assume that you already know the basics of how to make an aspx web form (in Visual Studio, most likely), set up event handlers, do postbacks, etc. from the code-behind. I won't assume anything more than a beginner's understanding of these, but if none of that sounds familiar, you need to start with an ASP.NET tutorial instead of what follows. Also, I'm doing this all in Visual Studio 2010 with C#. So our aspx page code looks like this:
A basic webform with three server controls: a label that says "label:", a text box where users can enter in some text, and a submit button.
The idea with this form is that users will type something in (an email message, for example), click submit, and then you can do something with that user input on the server side (send it as an email to someone, perhaps, or maybe save the text into a database).
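The page listing for this basic form is not reproduced above, so here is a minimal sketch of what such a page could look like. The control IDs Label1 and Button1 and the page title are assumptions made for illustration; TextBox1, the emailApp namespace, and the WebForm1 class name match the code-behind shown later in this post.

```aspx
<%@ Page Language="C#" AutoEventWireup="true" CodeBehind="WebForm1.aspx.cs" Inherits="emailApp.WebForm1" %>
<!DOCTYPE html>
<html>
<head runat="server">
    <title>Contact form</title>
</head>
<body>
    <form id="form1" runat="server">
        <%-- Three server controls: a label, a text box, and a submit button --%>
        <asp:Label ID="Label1" runat="server" Text="label:" AssociatedControlID="TextBox1" />
        <asp:TextBox ID="TextBox1" runat="server" />
        <asp:Button ID="Button1" runat="server" Text="Submit" />
    </form>
</body>
</html>
```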

The problem is that this form is accessible to both humans (good) and spambots (bad), so we need to add a reCaptcha to prevent spammers from being able to programmatically use these controls. To do that, you first need to go to Google to get set up with reCaptcha. It is a free service, but you need to create an account and get three things: a script tag that looks like this:
<script src="https://www.google.com/recaptcha/api.js" async defer></script>
a div that looks like this:
<div class="g-recaptcha" data-sitekey="your_site_key"></div>
(but will have your site key, which is public, instead of your_site_key), and of course, a secret key (Google's name for the private key), which you will need for the server-side code. We insert the script tag into the head of our aspx page and the div into the form like so:
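The markup for this step is not reproduced here, so the following is a minimal sketch of the head and form with the two reCaptcha pieces in place. The control IDs Label1 and Button1 and the page title are assumptions; the script tag, the div, TextBox1, the myFunction handler name, and the runat="server" attribute on the div come from this post. Replace your_site_key with the site key from your own account.

```aspx
<head runat="server">
    <title>Contact form</title>
    <%-- Loads Google's reCaptcha widget script --%>
    <script src="https://www.google.com/recaptcha/api.js" async defer></script>
</head>
<body>
    <form id="form1" runat="server">
        <asp:Label ID="Label1" runat="server" Text="label:" AssociatedControlID="TextBox1" />
        <asp:TextBox ID="TextBox1" runat="server" />
        <%-- The widget renders inside this div; data-sitekey takes the public site key --%>
        <div class="g-recaptcha" data-sitekey="your_site_key" runat="server"></div>
        <asp:Button ID="Button1" runat="server" Text="Submit" OnClick="myFunction" />
    </form>
</body>
```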
Note: Visual Studio may complain about the div above, which has an attribute the server won't recognize. Ignore it--the user's browser will know what to do.
It's important to note that I've added an OnClick event handler to the Submit button, which calls the function myFunction, which we will be adding to the code-behind file shortly. Also note that I set the reCaptcha div to runat="server". This is what our aspx page will look like to users:
An aspx form with a reCaptcha
Now we need to head to the code-behind file.

The code-behind has a Page_Load event handler by default. We won't be using it. Below it, we'll add three functions: one is the myFunction that is called when the Submit button is clicked, and the other two get the user's IP address and check whether the reCaptcha validated, respectively. Additionally, we will be adding three using statements to the top: two are for System.Net and System.IO, which are part of the standard library, and the third is Recaptcha, which is not. You'll need to download the Recaptcha library here, extract it from the zip file, and add a reference to the library from your IDE, which is different from just adding the using statement (in Visual Studio, right click References in the Solution Explorer, click Add Reference, go to the Browse tab, and point it to the location of the file you just downloaded). So we have a code-behind skeleton that looks like this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using Recaptcha;
using System.Net;
using System.IO;

namespace emailApp
{
    public partial class WebForm1 : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {

        }

        bool reCaptchaValidate(string ipAddress)
        {
            bool Valid = false;
            return Valid;
        }

        string getIpAddress()
        {
            return null; // placeholder so the skeleton compiles; the real body follows below
        }

        protected void myFunction(object sender, EventArgs e)
        {

        }
    }
}
Here's the guts of the method to get the user's IP address:
        string getIpAddress()
        {
            System.Web.HttpContext context = System.Web.HttpContext.Current;
            string ipAddress = context.Request.ServerVariables["HTTP_X_FORWARDED_FOR"];

            if (!string.IsNullOrEmpty(ipAddress))
            {
                string[] addresses = ipAddress.Split(',');
                if (addresses.Length != 0)
                {
                    return addresses[0];
                }
            }

            return context.Request.ServerVariables["REMOTE_ADDR"];
        }
You needn't worry too much about how this works. It is just a generic method that attempts to grab the user's IP address and returns it as a string. It's a bit messy because it isn't always possible to get the user's real IP address if, for example, the user is behind a proxy server. Some proxy servers forward the original address in the X-Forwarded-For header, so the method looks for that first and falls back to REMOTE_ADDR otherwise. It won't always work, and that's ok. ReCaptcha uses the IP address as one of its criteria to determine if someone is a bot, but your app will still work fine even when we can't obtain the correct IP for some users.
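For anyone who wants to poke at that logic outside of a web request, here is a small sketch of the same idea as a standalone helper. The class name ClientIp and the method name Resolve are made up for this example. It also trims whitespace around the first entry, since the forwarded header is usually written as a comma-and-space separated list. Keep in mind that the header is set by whoever sends the request, so it can be faked.

```csharp
using System;

public static class ClientIp
{
    // Hypothetical helper: returns the first address listed in an
    // X-Forwarded-For style header value, or remoteAddr when the
    // header is missing or contains no usable entry.
    public static string Resolve(string forwardedFor, string remoteAddr)
    {
        if (!string.IsNullOrEmpty(forwardedFor))
        {
            string[] addresses = forwardedFor.Split(
                new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
            if (addresses.Length != 0)
            {
                string first = addresses[0].Trim();
                if (first.Length != 0)
                {
                    return first;
                }
            }
        }

        return remoteAddr;
    }
}
```

Inside the page, this could be called with the two ServerVariables values used in getIpAddress() above.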

Next, we fill in the method that will check the reCaptcha to see if the user passed the test. It is as follows:
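        public bool reCaptchaValidate(string ipAddress)
        {
            bool Valid = false;
            // The reCaptcha widget posts the user's response token with the form in this field
            string Response = Request["g-recaptcha-response"];
            // Replace your_secret_key with the secret (private) key from Google, not the site key
            string url = @"https://www.google.com/recaptcha/api/siteverify?secret=your_secret_key&response=" + Response + @"&remoteip=" + ipAddress;
            //Request to Google Server
            HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
            try
            {
                //Google recaptcha Response
                using (WebResponse wResponse = req.GetResponse())
                {
                    using (StreamReader readStream = new StreamReader(wResponse.GetResponseStream()))
                    {
                        string jsonResponse = readStream.ReadToEnd();
                        // Relies on the exact layout of Google's reply, where the value of "success" starts at index 15
                        if (jsonResponse.Length >= 19 && jsonResponse.Substring(15, 4) == "true")
                        {
                            Valid = true;
                        }
                    }
                }

                return Valid;
            }
            catch (WebException)
            {
                // If Google cannot be reached, treat the check as failed
                return false;
            }
        }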
Ok, there's some stuff going on in there. The first thing that happens is we get the user's response to the reCaptcha using Request["g-recaptcha-response"]. This doesn't send anything anywhere: it reads the g-recaptcha-response field, which Google's script adds to the form and which the browser posts back to our server along with the other form values. The next thing to note is the URL. We are performing a standard GET http web request, and the way that works is we send out a URL loaded with data, which will find its way to that server, which will then send back a response based on the data we included in the URL (side note: the + signs concatenate strings, which is how we insert variables into other strings). You'll notice a question mark in the middle of the URL above--everything before the ? is the address of the server we want the response from, which is Google in our case, and everything after the ? specifies parameters which Google will use to determine what response to send. There are three parameters. The first, secret, is the private key which Google gave you earlier when you signed up for reCaptcha: enter that into the URL string in place of the placeholder. The second, response, is the user's reCaptcha response, which we've already inserted as a variable. The third, remoteip, is the IP address we pass to this function, which also gets spliced into the URL.
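One thing the plain string concatenation above does not do is escape the values, which matters if a value ever contains characters like & or a space. Here is a minimal sketch of building the same URL with each parameter escaped using Uri.EscapeDataString. The class name VerifyUrl and the method name Build are invented for this example and are not part of any reCaptcha library.

```csharp
using System;

public static class VerifyUrl
{
    // Hypothetical helper: builds the siteverify GET URL from the three
    // parameters described above, escaping each value for use in a query string.
    public static string Build(string secret, string response, string remoteIp)
    {
        // Null values become empty strings so EscapeDataString never receives null
        return "https://www.google.com/recaptcha/api/siteverify"
            + "?secret=" + Uri.EscapeDataString(secret ?? "")
            + "&response=" + Uri.EscapeDataString(response ?? "")
            + "&remoteip=" + Uri.EscapeDataString(remoteIp ?? "");
    }
}
```

In reCaptchaValidate, the url variable could then be set with something like VerifyUrl.Build("your_secret_key", Response, ipAddress).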

The next statement after the URL in our function preps an http request with the URL we've specified, and GetResponse() fires off our request and returns the server's reply as a WebResponse object. We use the StreamReader to read the reply out of the WebResponse as a string. The using() syntax isn't strictly necessary, and neither is the try{}catch{} syntax. The using() blocks make sure the response and the reader get closed even if something goes wrong, and the try{}catch{} deals with errors such as the request failing, which may or may not be an issue depending on how your application works. The response string will be formatted as a JSON object, like so:
{
  "success": true|false,
  "error-codes": [...]   // optional
}
But all we want is whether success: is followed by true or false, so we extract that using the Substring() method and test whether the result is equal to "true". If it is, then the user is probably not a bot so we set Valid=true which will allow the rest of the application to execute. Otherwise, we will assume it's a bot, and refuse to execute the rest of the code.
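The Substring() check works only as long as the reply keeps exactly the layout shown above. As a sturdier alternative, here is a sketch that actually parses the JSON using JavaScriptSerializer, which ships with the .NET Framework in the System.Web.Extensions assembly (I'm assuming your project references it, which web projects normally do). The class name RecaptchaJson and the method name IsSuccess are invented for this example.

```csharp
using System.Collections.Generic;
using System.Web.Script.Serialization;

public static class RecaptchaJson
{
    // Hypothetical helper: returns true only when the reply is a JSON object
    // whose "success" field is the boolean true. Deserialize throws if the
    // text is not valid JSON, so callers may want to wrap this in try/catch.
    public static bool IsSuccess(string jsonResponse)
    {
        var serializer = new JavaScriptSerializer();
        var fields = serializer.Deserialize<Dictionary<string, object>>(jsonResponse);

        object success;
        return fields != null
            && fields.TryGetValue("success", out success)
            && success is bool
            && (bool)success;
    }
}
```

With this in place, the if statement in reCaptchaValidate could test RecaptchaJson.IsSuccess(jsonResponse) instead of comparing a substring.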

But so far, our webform does nothing. That's because we've not put anything into the myFunction method, and thus neither of the methods above is being called. So here's how we do that:
protected void myFunction(object sender, EventArgs e)
        {
            if (reCaptchaValidate(getIpAddress()))
            {
                string userInput=Server.HtmlEncode(TextBox1.Text);
                //do some stuff with userInput
            }

        }
myFunction will be called when the user clicks the Submit button. In the if() statement, the method calls getIpAddress() and passes the returned value as the input parameter to reCaptchaValidate(). If that returns true, the code inside the curly brackets is executed. So far, the only thing happening in there is that we grab the user's text from TextBox1 and scrub it using HtmlEncode, which converts special characters like angle brackets into HTML "entities". This is a useful step in making sure that users cannot inject malicious markup through your input box. After that, you can do whatever you want with the data. Anything inside the if statement will be executed only if the user passes the reCaptcha test.
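To see what that scrubbing step does to a piece of input, here is a tiny sketch you can run outside of a page. It uses WebUtility.HtmlEncode from System.Net (available since .NET 4.0) rather than Server.HtmlEncode, on the assumption that the two encode angle brackets and ampersands the same way. The class name InputScrubber and the method name Scrub are made up for this example.

```csharp
using System.Net;

public static class InputScrubber
{
    // Hypothetical helper: converts characters such as <, > and & into
    // HTML entities so the text is displayed rather than interpreted as markup.
    public static string Scrub(string userInput)
    {
        return WebUtility.HtmlEncode(userInput);
    }
}
```

For example, InputScrubber.Scrub("<b>hi</b> & bye") returns "&lt;b&gt;hi&lt;/b&gt; &amp; bye".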