Wednesday, March 5, 2014

Is the spam protection on your site enough?


      For many years, CAPTCHA has been the standard mechanism for preventing bots from spamming a site. This post shows that it is the standard for a reason: some alternative protection mechanisms are not strong enough.

    For example, I recently saw a site that uses the following challenge, an image of a simple arithmetic problem such as "5 x 2", to differentiate between human visitors and bots.



    While it does stop some simple bots from spamming the site, it is still quite easy to break. Let's start by studying the HTML code for the above image.

    As you can observe, the above image is made of multiple pieces:
  • the piece for "5" in the image corresponds to the file name "5.png",
  • the piece for "x" (multiplication) corresponds to the file name "u.png",
  • the piece for "2" in the image corresponds to the file name "2.png".
  After trying some other combinations, it turned out that "+" (addition) corresponds to the file name "g.png". With this knowledge, it's not hard to write a simple script that runs on the NetGend platform to break the protection.

      // Fetch the page that contains the challenge.
      action(http,"http://www.example.com/comments");
      // Collect every image name between 'ArithmeticChallenge/' and '.png'.
      a = substring(http.replyBody, 'ArithmeticChallenge/', '.png', "all");
      if (a[1] == "g") { // "g" means "+" (this site's convention)
           result = a[0] + a[2];
      } else if (a[1] == "u") { // "u" means "x" (multiplication)
           result = a[0] * a[2];
      }
      // Submit the form with the computed answer.
      http.POSTData = "jsmith&city=chicago&challenge=${result}...."; //many fields...
      action(http,"http://www.example.com/comments");

     Here are the steps in the above script:
  • Do an HTTP transaction to retrieve the HTML page that contains the challenge. The content is saved in the variable http.replyBody.
  • Use the function "substring(..)" to get all the image file names (excluding the suffix ".png"). This function takes 4 parameters and returns an array of all matched patterns ("a" in this case).
    • The first parameter is the string to search within (http.replyBody).
    • The second parameter is the left boundary.
    • The third parameter is the right boundary.
    • The fourth parameter, "all", means to grab every pattern between the left and right boundaries. It can also be a number, such as "2" (0-based), which means grabbing the 3rd occurrence of the pattern.
  • Note that a[1] (the second element in the array) holds the operator ("u" for multiplication and "g" for addition), while a[0] and a[2] hold the operands.
  • Do the simple calculation and post the result back with the rest of the form fields.
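
     For readers who don't use NetGend, the extraction and calculation steps above can be sketched in plain Python. This is only an illustration, not the NetGend implementation: the helper names (substring_all, solve_challenge) and the sample markup are made up for this example, since the site's real HTML is not reproduced here, and it assumes single-image operands as in the "5 x 2" challenge.

```python
import re

# Operator codes observed in the post: "g.png" is "+", "u.png" is "x".
OPERATORS = {
    "g": lambda x, y: x + y,
    "u": lambda x, y: x * y,
}


def substring_all(text, left, right):
    """Return every shortest piece of text found between left and right.

    A rough stand-in for substring(..., "all") as described above.
    """
    pattern = re.escape(left) + r"(.*?)" + re.escape(right)
    return re.findall(pattern, text)


def solve_challenge(html):
    """Compute the answer from the three image file names in the page."""
    pieces = substring_all(html, "ArithmeticChallenge/", ".png")
    if len(pieces) != 3 or pieces[1] not in OPERATORS:
        raise ValueError("unexpected challenge layout: %r" % (pieces,))
    left, op, right = pieces
    # Convert the operands to integers before doing the arithmetic.
    return OPERATORS[op](int(left), int(right))


# Hypothetical markup that mimics the file naming described in the post.
sample_html = (
    '<img src="/ArithmeticChallenge/5.png">'
    '<img src="/ArithmeticChallenge/u.png">'
    '<img src="/ArithmeticChallenge/2.png">'
)

print(solve_challenge(sample_html))  # prints 10
```

     Note that the sketch converts the operands with int() before computing; a script that works on the raw matched strings depends on how its language treats "+" between strings.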
     Now that we have seen how easy it is to break this protection, we should think twice before deviating from standard spam protection such as CAPTCHA.
