Friday, February 28, 2014

Multiple downloading made easy

    Most test platforms can emulate a user downloading a (big) file.  What if we need to emulate a user downloading multiple (big) files at the same time?   That was the question asked in a forum thread on implementing asynchronous HTTP downloads.

    This nice tutorial gave an implementation on the JMeter platform.  The test scenario is about getting to the main download page and downloading multiple software images simultaneously.   It's related to the test scenario of downloading all the resource files on an HTML page (such as images, CSS and JS), but that is typically handled directly by a test platform.  What's new here is that the tester needs to explicitly control which HTTP requests are made simultaneously.

     Here is the key part of the HTML download page: a dropdown box and its HTML code:
 

 <select name="current_os" id="current_os" class="select stacked" onchange="this.form.submit()" >  
 <option label="Select Platform&hellip;" value="0">Select Platform&hellip;</option>  
 <option label="Microsoft Windows" value="3" selected="selected">Microsoft Windows</option>  
 <option label="Ubuntu Linux" value="22">Ubuntu Linux</option>  
 <option label="Fedora" value="20">Fedora</option>  
 <option label="Oracle &amp; Red Hat Linux 6" value="31">Oracle &amp; Red Hat Linux 6</option>  
 <option label="Mac OS X" value="5">Mac OS X</option>  
 <option label="Source Code" value="src">Source Code</option>  
 </select>  
    We need to extract the option values (the contents of the value attributes) and use them to start downloads.

    The solution given in the above tutorial is quite detailed, with long instructions and many screenshots. Part of it is about using regular expressions to extract the values; the part on the JMeter plugin for inter-thread communication is new and more technical.  For those who are less familiar with operating system concepts, inter-thread communication passes information (like a URL) from one execution thread to other execution threads so that they can process it (downloading, in this case) at about the same time. Without this plugin, JMeter would have to do all the downloads sequentially.

  This made me wonder how to implement it on the NetGend test platform. It turns out to be pretty simple:

 function VUSER () {  
      action(http,"http://www.example.com/downloadpage");  
      values = substring(http.replyBody, 'value="', '"', "all");  
      urls = [];
      for (i=0; i < length(values); i ++) {  
           if (match(/^[1-9]/, values[i])) {  //keep values starting with a non-zero digit; skips "0" and "src"
                 push(urls, "http://www.example.com/downloadpage?id=${values[i]}");  
           }
      }  
      spawn(urls);  
 }   

     In the above script, we use the function substring() to grab the values.  This function takes a left boundary and a right boundary and grabs the part in between. The last parameter ("all" in our case) tells the function to grab all such parts, so not surprisingly it returns an array ("values" in our case).   Next, we use a simple loop to create an array of URLs (the variable "urls") out of the values.  Note that we use a regular expression function, match(), to keep only those values that are non-zero numbers (it skips values such as "0" or "src").   The part that does the job of the JMeter inter-thread communication plugin is the spawn() function: it takes a list of URLs, starts the HTTP transactions on them at the same time, and waits for all of them to finish before moving on.
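
     For readers who don't have NetGend at hand, the same three steps (grab the values, build the URLs, run the downloads together) can be sketched in Python. This is only an illustration: the function names extract_values, build_urls and download_all are made up for this sketch, and the fetch function is passed in by the caller so that the sketch itself makes no network calls.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def extract_values(html):
    """Return every string found between value=" and the next double quote."""
    return re.findall(r'value="([^"]*)"', html)


def build_urls(values, base="http://www.example.com/downloadpage"):
    """Keep only values that start with a non-zero digit and turn them into URLs."""
    return [f"{base}?id={v}" for v in values if re.match(r"[1-9]", v)]


def download_all(urls, fetch):
    """Call fetch(url) for all URLs concurrently and wait for every result."""
    if not urls:
        return []
    # One worker thread per URL, so all downloads start at about the same time.
    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        return list(pool.map(fetch, urls))
```

     In a real run, fetch could be something like lambda u: urllib.request.urlopen(u).read(); the thread pool plays the role that spawn() plays in the NetGend script.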

    Note that the extraction of values can be improved by using this XPath:
 values = fromHtml(http.replyBody, '//select/option/@value', "text");   
It requires a little more knowledge, but it's not too difficult to learn, and it's more robust against variations in the HTML where the double quotes are omitted.  For example, in the equivalent HTML code <option label="Microsoft Windows" value=3...    the double quotes around the value are omitted.
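
     The robustness argument can be seen outside NetGend as well. The sketch below is an assumption-laden illustration in Python: it uses the standard library HTML parser rather than XPath, and the names OptionValueParser and option_values are invented for this example. Because it parses tags instead of matching raw text, it picks up value=3 and value="3" alike.

```python
from html.parser import HTMLParser


class OptionValueParser(HTMLParser):
    """Collect the value attribute of every <option> inside a <select>."""

    def __init__(self):
        super().__init__()
        self.in_select = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag == "select":
            self.in_select = True
        elif tag == "option" and self.in_select:
            value = dict(attrs).get("value")
            if value is not None:
                self.values.append(value)

    def handle_endtag(self, tag):
        if tag == "select":
            self.in_select = False


def option_values(html):
    """Return the option values, whether or not they were quoted in the HTML."""
    parser = OptionValueParser()
    parser.feed(html)
    parser.close()
    return parser.values
```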

    We are happy not only because NetGend can implement the test scenario, but also because our solution is so simple that users do not need to understand the concept of "inter-thread communication" and do not need to install a plugin to do it.

Thursday, February 20, 2014

What chrome browser developer console can do for performance testing

     As professionals in performance testing, we always strive to
  1. find the capacity of a web site and
  2. pinpoint the bottleneck. 
    While we can find the capacity of a web site by emulating enough virtual clients on a test platform,  it's less obvious how to find the bottleneck - especially when you don't have access to the internals of the web site.  A web infrastructure typically consists of multiple components:
  • web servers, 
  • application servers, 
  • database servers. 
    Being able to identify the bottleneck helps a lot when it comes to improving the performance of the system or deciding which hardware resource to add.

     In this blog, we give an example showing that we can get some idea of the performance bottleneck using a tool that is surprisingly simple and common:  the developer console of the Chrome browser.  I am sure there are other such tools/plugins that can help as well.  To use the developer console in Chrome, all you need to do is right-click, select "Inspect Element" in the popup menu, and then open the "Network" tab.  Here is something you may see:


    Recently I tested a web site hosted in the cloud that consists of a web server and a database server.  Here are some statistics on the response times of various pages, collected by the developer console.  The response time is defined as the time between sending an HTTP request and receiving the entire HTTP response.
 web page                                                        response time
 /  (main page)                                                  0.31s
 /projects/abc                                                   4.97s
 /projects/abc/filter?kind=45&actors=47&means=1&page=0           1.91s
 /projects/abc/filter?kind=45&actors=46,47&means=1&page=0        2.06s
 /projects/abc/filter?kind=45&actors=46,47,48&means=1&page=0     2.68s

    As we can observe, the 3 HTTP requests for "/projects/abc/filter" took progressively more time as the query grew more complex (more values in the "actors" parameter).   These HTTP requests are generated by Ajax calls and the responses are pretty simple: just a few blocks of HTML code.

<div class='results'>  
      <div class='tags'>Tags:   
           <a href="/projects/abc/subjects/52">cars</a>  
           <a href="/projects/bc/subjects/82">Flowers</a>  
      </div>  
 </div>       

    The processing done by the web server (Apache on Ubuntu in this case) is pretty light. Judging by the simple structure of the response, the time spent in the web server could be as low as a few milliseconds, so most of the time was probably spent on database operations.  Before we conclude that the bottleneck in this case is the database, someone may ask: what about round trip time?  Round trip time can be high and can vary a lot, making it harder to draw a conclusion.

  The round trip time can best be estimated by pinging the site. Here is a simple example.
 ping www.yahoo.com  
 PING ds-any-fp3-real.wa1.b.yahoo.com (98.138.253.109) 56(84) bytes of data.  
 64 bytes from ir1.fp.vip.ne1.yahoo.com (98.138.253.109): icmp_req=1 ttl=48 time=85.3 ms  
 64 bytes from ir1.fp.vip.ne1.yahoo.com (98.138.253.109): icmp_req=2 ttl=48 time=80.7 ms  
 64 bytes from ir1.fp.vip.ne1.yahoo.com (98.138.253.109): icmp_req=3 ttl=48 time=79.4 ms  

    However, it doesn't work all the time, since some sites (including the one we were testing) have turned off ICMP echo replies, so ping will simply time out and not give any information.   The developer console has another piece of information that can help: TCP connection time.   In the above snapshot, the yellow popup window has a line "Connecting"; that is the TCP connection time.

    In my experience, this is a good approximation of the round trip time (in theory, it also includes the time the OS TCP/IP stack spends creating the connection and sending the TCP SYN-ACK, but the extra time is typically less than a millisecond).   Back to our test: the round trip times were typically around 50ms, which is small compared with response times of roughly 2 to 5 seconds.  So we know the round trip time doesn't affect the total response time much, and our conclusion stands that the database side needs some optimization.
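
    If you would rather measure the TCP connection time from a script than read it off the developer console, a rough equivalent can be sketched in Python with the standard socket module. The function name tcp_connect_time_ms is made up for this sketch, and the result is only an estimate of the round trip time, for the reasons given above.

```python
import socket
import time


def tcp_connect_time_ms(host, port, timeout=3.0):
    """Return the time, in milliseconds, taken to establish a TCP connection.

    Pass an IP address rather than a host name if you want to keep
    DNS lookup time out of the measurement.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0
```

    Calling it a few times against the server's IP address on port 80 and taking the smallest result gives a reasonable figure to compare with the page response times.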

    The developer console is a great companion for professional performance testers. I hope you will like it too.
    

Saturday, February 15, 2014

Fast TCP port scanning


    One of the first tests that a security professional may do on a target is to scan for its open TCP ports (a.k.a. TCP listening ports).  From there, he/she may infer what applications/servers are running on the target and move on to discover vulnerabilities.  You can find out if a port is open by using the "telnet" command (there are other commands and tools that can do this too).

    If a port (say, 1000) is not open on target "10.3.0.3", here is what you may get:
 $ telnet 10.3.0.3 1000  
 Trying 10.3.0.3...  
 telnet: Unable to connect to remote host: Connection refused  

    For a port that's open (say, port 80),  here is what you get:
 $ telnet 10.3.0.3 80  
 Trying 10.3.0.3...  
 Connected to 10.3.0.3.  
 Escape character is '^]'.  

    But if you want to find all the open TCP ports on a target, you don't want to do it manually, since there are 65535 possible ports.  You could script the above step, but it may be slow too. In this blog, we show that it's very easy to do port scanning on the NetGend platform, and to do it fast!  Here is the little script for this:

 function userInit() {  
      var port = 1;  
 }  
 function VUSER() {  
      myport = port ++;  
      if (myport > 65535) {  exit(0);  }  
      connect("10.3.0.3", myport);  
      if (sock !== "") {  //connection got established
           println(myport);  
      }   
 }  

    I ran it at a rate of 10,000 new VUsers/second and in less than 7 seconds, it scanned through all the possible ports and gave me the following report:

22
80
111
443
902

   I knew what TCP ports 22, 80, 111 and 443 are used for, but was curious about which server/application uses port 902. A quick Google search showed that some VMware software uses it.
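
    For comparison, here is how the same idea (try to connect to each port, report the ones that accept) might look as a plain Python sketch. It is not how NetGend does it internally; the names is_open and scan and the default of 200 worker threads are assumptions made for this example, and a thread pool will not come close to 10,000 new connections per second.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def is_open(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise.
            return s.connect_ex((host, port)) == 0
    except OSError:
        return False


def scan(host, ports, workers=200):
    """Check the given ports concurrently and return the open ones in order."""
    ports = list(ports)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda p: is_open(host, p), ports)
        return [p for p, ok in zip(ports, results) if ok]
```

    A full scan would then be scan("10.3.0.3", range(1, 65536)).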

    NetGend is a very scalable performance test platform, and the above example shows that it has the potential to do some fast security tests too.

Wednesday, February 12, 2014

Random (II)

    Randomness is important for performance testing: real users behave randomly to some degree, so we need some randomness to emulate them better.  In one of the earlier blogs, we covered some randomness-related functions; here we show a more complete list.

 randNumber(<minNum>, <maxNum>, ["float"])  
    This is the most commonly used function. It gives a random number between min and max. The third parameter is optional; if it's omitted, the function returns an integer.   Here is a simple but interesting example:   randomly pick a product from a web page and view it.
 numOfItems = //get number of items from a web page  
 id = randNumber(1, numOfItems);  
 action(http, "http://www.example.com/viewProduct?id=${id}");  

 randString(<min>, <max>)  
     This function returns a random string whose length is between min and max. I haven't had a chance to use it in real testing yet.

 rolldice(<percentage0>, <percentage1>,<percentage2>...);   
    This function is covered in detail in one of the earlier blogs.

 randElement(<arrayVariable>)  
    This function is what makes it so easy to pick an item from a set of items on a web page.  As an example, suppose you want to emulate a virtual user who clicks a link in the navigation region of a web page. The following code snippet will suffice:
 action(http,"http://www.example.com/mainPage/");  
 links = fromHtml(http.replyBody, '//div[@class="navigation"]//a/@href', "text");  
 url = randElement(links);  
 action(http,url);  
    If you have used some other test platform(s), you may remember how hard it is to do something this simple -- you may have to write a long script laden with API calls.

 randSequence(<min>, <max>)  
    This function returns a random subset of the numbers between min and max.  For example,  randSequence(1,10) returns a subsequence of 1,2,3,...8,9,10.   Here is where it can be useful:  suppose a web page contains n items and you want to emulate a user who clicks a random subsequence of items 1,2,3,...n.
 action(http, "http://www.example.com");  
 items = fromHtml(http.replyBody, '//a/@href', "text");  
 sequence = randSequence(1, length(items));  
 for (i=0; i < length(sequence); i++) {  
    action(http, items[sequence[i] - 1]);  //sequence holds positions 1..n, array indexes start at 0
 }  

 randSubset(<array>)  
    This function is the latest in this series. It's more general than the previous function (randSequence) in that the input does not have to be a sequence of consecutive numbers.  It treats the input array as a set and returns a random subset of it.   Here is a test scenario where it can be quite handy:   on a social media site, a user can look for news of interest about a subset of people (the celebrities). Using this function, it is easy to emulate such user behavior.

 action(http, "http://www.example.com");   
 celebrities = fromHtml(http.replyBody, '//span[@class="keypeople"]', "text");   
 subset = randSubset(celebrities);  
 str = join(subset, ",");  //concatenate the people with "," as the separator
 action(http, "http://www.example.com/track?people=${str}");   
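
    To make the semantics of the last two functions concrete, here is a small Python sketch. It is a guess at the behavior, not NetGend's implementation: the names rand_subset and rand_sequence are invented here, and the assumption that every element is included independently with probability 1/2 is mine (the blog does not say how the subset is chosen).

```python
import random


def rand_subset(items, rng=random):
    """Return a random subset of items, keeping their original order.

    Assumption: each element is included independently with probability 1/2.
    """
    return [x for x in items if rng.random() < 0.5]


def rand_sequence(lo, hi, rng=random):
    """Return a random subsequence of lo, lo+1, ..., hi."""
    return rand_subset(range(lo, hi + 1), rng)
```

    Passing a seeded random.Random instance as rng makes a test run repeatable, which is handy when a script needs to be debugged.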

    Are these all the random functions we will ever need?  Absolutely not, but given the flexibility of the NetGend platform, I am sure adding support for new ones will be easy.

Saturday, February 8, 2014

Selectively log http transactions


    Just as in software development, developing a performance test script may require debugging or troubleshooting.  One of the common debugging techniques in software development is "printf".  For performance testing,  it's the logging of messages.  We don't want to log all messages in a test, since that would impact the performance of the platform and make debugging harder.

    Recently I saw a question by ShurupuS about logging request data only when there is an error, and it made me realize that it may not be easy to log only the relevant messages on many test platforms.    In this blog, we will show how easy it is to do so on the NetGend platform.

   Our flexibility comes from the design that all the per-instance/per-VUser variables are easily accessible, and also from the function "writeFile".  This function takes two parameters: the first is the name of a file and the second is the data to be logged.   Here is a quick example that logs the HTTP requests that caused the server to respond with code 500 to one file, and the requests that got any other non-200 response code to another.

 function VUSER() {  
      id = randNumber(1,100000);
      action(http, "http://www.example.com/viewproducts/${id}");  
      if (http.respCode == 500) {
           writeFile("criticalerror.log", http.request); 
      } else if (http.respCode != 200) {  
           writeFile("error.log", http.request);  
      }  
 }  
     Of course, the HTTP request can be much more dynamic and complex, and logging only the requests that cause errors can save a lot of debugging time.  
     What if you want to check whether a response contains a certain pattern (a string)? You can do it using the match function.  In the following example, we check whether the server reply contains the string "success".
 function VUSER() {  
      //create a POST message
      action(http, "http://www.example.com/add2cart");  
      if (!match(http.replyBody, /success/) ) {  
           writeFile("error.log", http.replyBody);  
      }  
 }  
     This check helped me a lot when I was testing an e-commerce site. Guess what the reason for the failure was?  The product was sold out :-)  You may be laughing at me, but I was scratching my head before I found the cause.

    What if you need to log multiple pieces of data in a file, like the HTTP request, the reply body or the reply header?   The writeFile() function has an optional third parameter, "append". When it's present, the function appends the data to the file rather than overwriting the existing file.
 writeFile("tmp.html", http.replyBody);  
 writeFile("tmp.html", http.request, "append");  
 writeFile("tmp.html", http.replyHeader, "append");  
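
    The same pattern (pick a log file from the response code, append rather than overwrite) is easy to express in other languages too. The Python sketch below only illustrates the idea; write_file mimics the behavior described for writeFile(), and log_failed_request and its parameters are hypothetical names, not part of NetGend.

```python
from pathlib import Path


def write_file(name, data, mode=None):
    """Write data to the file; pass mode="append" to add to an existing file."""
    with open(name, "a" if mode == "append" else "w", encoding="utf-8") as f:
        f.write(data)


def log_failed_request(status, request_text, log_dir="."):
    """Log the request only when the response code signals a problem.

    Returns the path of the log file written, or None for a 200 response.
    """
    if status == 200:
        return None
    name = "criticalerror.log" if status == 500 else "error.log"
    path = Path(log_dir) / name
    write_file(path, request_text + "\n", "append")
    return path
```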

    Since this happens so often, we created a new function logHtml() that will log the reply, request and reply header.
 logHtml("1.html");  
On the UI, you can quickly view the logs in a browser and/or a plain text editor.

    You may wonder if it works for HTTPS.   Absolutely!  Everything remains the same except that the URL in the action statement uses https instead of http.
 //...  
 action(http, "https://www.example.com/viewproducts");  
 if (http.respCode == 500) {  
    logHtml("tmp.html");  
 }  
   This makes it handy to troubleshoot HTTPS transactions, which typically can't be done with network sniffers because the traffic is encrypted!

   In my experience, being able to get relevant logs can cut the time spent on debugging scripts by half.  I hope your favorite test platform also supports flexible logging like the NetGend platform does.
 

Tuesday, February 4, 2014

1 Million concurrent connections from one load generator


    One million concurrent TCP connections is a great performance milestone both for a web server and for a performance test platform.  It's not trivial to process 1M concurrent connections, and it may be even harder to generate that many.   No wonder a performance test platform vendor is so proud that they put up ads claiming that their system can generate 1,000,000 concurrent connections - on hundreds of load generators.   In this blog, we are going to show that the NetGend test platform can do that too - on a single load generator.

   Here is the simple script that our test platform uses to talk to our server.
 function VUSER() {   
      http.POSTData = "username=user${userId}&password=letmein";   
      action(http, "http://10.3.0.6/smartlogin");  
      sleep(500000);  
 }   
     Effectively, each client sets up a TCP connection to the server, does an HTTP transaction and keeps the connection open.  
     Here is the graph that shows that we have reached 1 M concurrent TCP Connections.  ("TCP open session" means established sessions).
 

   You may be thinking it's impossible to have that many TCP connections from the same client IP address to a web server (with one IP address), since there are at most 65535 TCP ports on the client side.  You are right: with only one client IP address, we can't get there.  Luckily, the NetGend platform can emulate a range of IP addresses. In fact, it's not hard to emulate a million IP addresses on this platform.
The following is the set of parameters we used for our test; there are about 25K IP addresses in the range.

   When you click the "run" button, the test platform immediately starts generating TCP sessions; in other words, it does not wait for minutes as is the case with the other test platforms.  At a rate of 5000 new sessions per second, it takes about 200 seconds to reach 1 million open sessions.
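
   The numbers above are easy to check. The short Python calculation below uses only figures quoted in this post (65535 ports per client IP, about 25K emulated IP addresses, 5000 new sessions per second); the variable names are just labels for this sketch.

```python
import math

MAX_PORTS_PER_IP = 65535        # upper bound on client ports per source IP
TARGET_CONNECTIONS = 1_000_000  # the 1M milestone
CLIENT_IPS = 25_000             # approximate size of the emulated IP range
RAMP_RATE = 5_000               # new sessions per second

# Fewest client IPs that could hold 1M connections to one server IP and port.
min_ips_needed = math.ceil(TARGET_CONNECTIONS / MAX_PORTS_PER_IP)

# Average number of connections each emulated IP carries in this test.
connections_per_ip = TARGET_CONNECTIONS / CLIENT_IPS

# Time needed to open all sessions at the configured rate.
ramp_seconds = TARGET_CONNECTIONS / RAMP_RATE

print(min_ips_needed, connections_per_ip, ramp_seconds)
```

   So at least 16 client IP addresses are needed in principle, and with about 25K addresses each one carries only around 40 connections.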

   In case you are wondering about the flexibility of the test platform,  here is a short list of what it supports:

  • Processing/parsing of JSON message, JSON Path,  JSONP messages,
  • JSON-RPC,
  • Processing of XML messages and XPath,
  • HTML message processing by XPath, regular expression, string search,
  • HTML form filling that takes care of hidden fields and internal field names.
  • Combinations of (nested) conditions and/or loops
  • Rendezvous point
  • ...

     Note that parameterization and correlation, which are very complex on the other test platforms, are made extremely simple (see other blogs on this site), even for those who don't have much programming background.

     If you ever need to test a server on its ability to handle 1M connections, you may want to give NetGend a try. It will show you the answer in a few minutes.

Saturday, February 1, 2014

Shortest script to performance-test a HTML form


    One of the guiding principles for the NetGend performance test platform is to make complex things easier - so easy that it requires no special knowledge to do complex tasks, such as performance testing an HTML form.

    An HTML form may appear simple to an average online user, but it's fairly complicated for us, the performance testers.  We have to take care of the hidden fields, parameterize the user-visible fields, create the HTTP query and send it to the URL specified in the action attribute of the form.

    As we discussed in earlier blogs, on the other platforms the tester has to extract the hidden fields in the HTML form, probably with regular expressions, to construct the HTTP query.  On the NetGend platform, a tester just needs to supply the user-visible fields (their names and values), thanks to our function fillHtmlForm, which takes care of the hidden fields automatically.   While that is a big improvement over the other platforms, it still requires the tester to find the internal names of the visible fields by looking at the HTML data or by using tools such as the Chrome developer console or Firebug.   Can we do better?

    In this blog, we are going to show how to test an HTML form without having to know the internal names of the user-visible fields.  All that a tester needs to provide in the script is:

  • The URL for the form
  • The list of values that a normal user will fill.
  • The name of a label that helps to identify the form - useful when there are multiple forms.

 This information can be obtained without special knowledge of HTML and without using any developer tools.

    Here is an example of a common registration form that you may find on many web pages.



      If you are familiar with the other test platforms and have struggled with regular expressions used to extract hidden values,  you may be shocked at how short the following script is!


 function VUSER() {   
    quickForm("https://www.example.com/", ["John", "Smith", "jsmith@example.com", "abc123"], "First Name:");  
 }   

     This script essentially consists of one line, a call to the function quickForm.  The first parameter is the URL of the registration form. The second parameter is an array of the values ("John", "Smith", "jsmith@example.com", "abc123") that you would otherwise fill in manually as a typical online user. The last parameter, "First Name:", is used to identify the registration form.

    The function does a lot of things internally:

  • It does the HTTP transaction to get the HTML page that contains the form,
  • It uses the third parameter to identify the right form,
  • It scans through all the fields in the form, constructing the HTTP query along the way using the values provided in the second parameter,
  • It uses the URL in the action attribute of the form to make the HTTP request and wait for the server reply. 
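
     To give a feel for what the middle two steps involve, here is a rough Python sketch. It is not NetGend's code: FormParser and build_form_request are names invented for this illustration, it only looks at <input> fields, and it assumes the visible fields should be filled in document order. It does not send anything; it just returns the target URL and the encoded query.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class FormParser(HTMLParser):
    """Collect each form's action, its named inputs and its visible text."""

    def __init__(self):
        super().__init__()
        self.forms = []      # one dict per form: action, fields, text
        self.current = None  # the form being parsed, if any

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form":
            self.current = {"action": a.get("action") or "", "fields": [], "text": ""}
            self.forms.append(self.current)
        elif tag == "input" and self.current is not None and a.get("name"):
            kind = (a.get("type") or "text").lower()
            self.current["fields"].append((a["name"], kind, a.get("value") or ""))

    def handle_endtag(self, tag):
        if tag == "form":
            self.current = None

    def handle_data(self, data):
        if self.current is not None:
            self.current["text"] += data


def build_form_request(page_url, html, values, label):
    """Return (target URL, encoded query) for the form whose text contains label."""
    parser = FormParser()
    parser.feed(html)
    parser.close()
    form = next(f for f in parser.forms if label in f["text"])
    remaining = iter(values)
    pairs = []
    for name, kind, default in form["fields"]:
        if kind in ("hidden", "submit"):
            pairs.append((name, default))                   # keep the page's own value
        else:
            pairs.append((name, next(remaining, default)))  # fill a visible field
    return urljoin(page_url, form["action"]), urlencode(pairs)
```

     Even this simplified version shows why hiding the details behind one function call is attractive: the tester never has to look up a field name such as "fn" or a hidden token.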

     If you want to check if the server reply is as expected, you can add a line after the function quickForm like
 quickForm(....);
 if (match(http.replyBody, /welcome/)) {  
    return 1;  
 }  

     If all the above looks easy, imagine 50,000 such clients performing this action. I am sure the server will get a good workout.

     We know that it's not always possible to make complex tasks simple, but when the opportunity does present itself, NetGend will not let it slip by.