Monday, January 27, 2014

The power of simplicity

    In an earlier blog, we talked about the story behind the nickname "ghetto coder".  Today I saw an interesting test scenario in a Google group that reminded me of it yet again (I have lost track of how many times).  In this scenario, we need to emulate multiple clients, each of which performs multiple transactions with a server.  Each time, the server sends an HTML response containing a dropdown list (a.k.a. a "select" tag with options); the item names stay the same, but the item values may change.  Here are code snippets from 3 HTML responses.  In the emulation, we need to consistently select one user, say "Smith", when interacting with the server.

 1st iteration   
 <option value="456983">Jack</option> ------>456983[Dynamic value changes every time]   
 <option value="456984">Adam</option>   
 <option value="456985">John</option>   
 <option value="456987">Smith</option>   
 2nd iteration   
 <option value="456974">Jack</option>   
 <option value="456975">Smith</option>   
 <option value="456976">John</option>   
 <option value="456977">Adam</option>   
 3rd iteration   
 <option value="456654">Smith</option>   
 <option value="456655">John</option>   
 <option value="456656">Adam</option>   
 <option value="456657">Jack</option>   

    Here is one of the solutions proposed in the discussion thread. Note that the item values are extracted by a left boundary and a right boundary, and note how cumbersome it is to walk through the list of item values:
 web_reg_save_param("IDValues", "LB= value=\"", "RB=\"", "Ord=All", LAST);   
 // get number of matches   
 nCount = atoi(lr_eval_string("{IDValues_count}"));   
 for (i = 1; i <= nCount; i++) {   
   // create full name of the current parameter   
   sprintf(szParamName, "{IDValues_%d}", i);   
   // output the value of the current parameter   
   lr_output_message("Value of %s: %s", szParamName,   
        lr_eval_string(szParamName));   
 }   

    In addition to being not so "simple", it may not even work in some scenarios:  what if there are other tags (like "<input>") in the HTML file that also have a "value" attribute? What if some attribute values are not surrounded by double quotes (like value=1234)?  We have seen both on real-world sites like Walmart's.
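To make the fragility concrete, here is a small plain-JavaScript illustration (not NetGend or LoadRunner code; the HTML string is made up) of how a bare left/right-boundary scan both over-matches and under-matches:

```javascript
// A naive left/right-boundary scan for value="..." over the raw HTML text.
const html = '<input type="hidden" value="token123">' +
             '<option value="456987">Smith</option>' +
             '<option value=456988>John</option>';

const matches = [];
const re = /value="([^"]*)"/g;   // boundary pair: LB = value=", RB = "
let m;
while ((m = re.exec(html)) !== null) {
  matches.push(m[1]);
}

console.log(matches);
// Picks up the <input> token as well, and misses the unquoted
// value=456988 entirely.
```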

    In contrast, the script on NetGend is much simpler.
 function VUSER() {  
      action(http, "");  
      mapping = mapFormOptions(http.replyBody, "Jack");  
      println(mapping.Jack);  //prints the value of Jack
      println(mapping.Smith); //prints the value of Smith
 }  
    Here we use the function "mapFormOptions" to grab the options; it was covered in a previous blog.  This function finds the "select" node that has an item name matching the second parameter ("Jack" in this case) and returns a mapping from item names to item values.  With the mapping, it's easy to find the value of any item.
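For illustration, here is a rough plain-JavaScript sketch of the kind of mapping "mapFormOptions" returns (the regex-based parsing below is an assumption for brevity; NetGend works on the parsed HTML tree, not the raw text):

```javascript
// Build an item-name -> item-value mapping from the option tags.
function mapOptions(html) {
  const mapping = {};
  const re = /<option\s+value="([^"]*)"\s*>([^<]*)<\/option>/g;
  let m;
  while ((m = re.exec(html)) !== null) {
    mapping[m[2]] = m[1];   // item name -> item value
  }
  return mapping;
}

const reply = '<option value="456974">Jack</option>' +
              '<option value="456975">Smith</option>';
const mapping = mapOptions(reply);
console.log(mapping.Smith);   // the value of Smith, wherever it appears
```

With the mapping, the position of "Smith" in the list no longer matters.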

   It's pretty easy, too, if you need to iterate over all the items:
 items = keys(mapping);  
 for (i=0; i<length(items); i++) {  
    item = items[i];  
    value = mapping.$item;  
 }  
     What's new here is the function "keys()", which takes a mapping as input and returns its list of keys ("item names" in our case).  The code above goes over all the values sequentially; in a real-world test scenario, you may need to randomly choose an item and stick to it during the course of the interactions with the server.  The implementation is left to the reader as an exercise.
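One possible shape of that exercise, sketched in plain JavaScript (the variable names are illustrative, not NetGend built-ins):

```javascript
// The mapping values here are made up for illustration.
const mapping = { Jack: '456974', Smith: '456975', John: '456976' };
const items = Object.keys(mapping);

// Choose an item once, before the transaction loop...
const chosen = items[Math.floor(Math.random() * items.length)];

// ...then reuse the same item name on every iteration; only its
// value changes between server responses.
for (let iter = 1; iter <= 3; iter++) {
  console.log(`iteration ${iter}: ${chosen} -> ${mapping[chosen]}`);
}
```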

    By being simple, the code is also more resistant to the variations that will likely come in the future, so the cost of maintaining the script is greatly reduced.

Thursday, January 23, 2014

Test search performance on the Walmart site

   Walmart is the largest retailer in the world: it employs 2 million people and has 8,900 stores worldwide.  Its online presence is quite impressive too.  I used it to print digital pictures from a cruise trip and was happy with both the quality and the price.  One of the main things people do on the web site is search for items to buy, so search performance is absolutely important.

    A few points to notice about Walmart's search form:
  • it uses HTTP GET instead of POST,
  • it has multiple hidden fields,
  • sometimes the HTML is minified, which makes it hard to read and to run regular expressions on,
  • attribute values are not always surrounded by double quotes.
   Here is a simple script to test the searching functionality.

 function VUSER() {   
      action(http, "");  
      a.search_query = "candy";  
      b = fillHtmlForm(http.replyBody, a, "id", "searchbox");  
 }  
    It's not quite realistic, due to the hard-coded query "candy".  Yes, we can remedy that by reading from a file that contains a list of queries, but it is still not realistic enough: when a real user does a search, he/she may limit it to a particular department; someone who wants to buy a DVD player may search in the "Electronics" department. A more realistic test uses a CSV file that contains a list of items, where each item has a department name and a query phrase, like the following:
 Music,The Bad  
 Health,vitamin E  
 Music,Say you say me  
 Books,the blood line  
    The challenge is: how do we convert department names such as "Health", "Music" and "Photo Center" into the numbers the form expects (a field called "search_constraint")?  This is something the NetGend platform supports well. To get the mapping, you just need the function "mapFormOptions".  Here is the syntax:
      mapFormOptions(<htmlString>,  <optionName>)

    The first parameter is an HTML string (the server response); the second parameter is the name of an item in the (dropdown) list of options.  Since this name is used to identify the dropdown list, you may want to pick a name that's unique to that list, as there may be multiple dropdown lists on the page.  This function returns a mapping from option names to their values.
     With the mapping in, say, the variable "options", we can easily find the value of an option: the value of "Music" is options.Music, and the value of "Photo Center" is options."Photo Center" (note that we need quotes around "Photo Center" because of the space). In general, if a variable "opt" contains an option name, the value of that option is options.$opt.
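In plain-JavaScript terms, options.$opt corresponds to a bracket lookup with a dynamic key (the option values below are made up for illustration):

```javascript
// A mapping from option names to values, as mapFormOptions would return.
const options = { Music: '4104', Health: '976760', 'Photo Center': '5426' };

const opt = 'Photo Center';
console.log(options[opt]);        // dynamic key, like options.$opt
console.log(options['Music']);    // quoted literal, like options."Music"
```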

    Here is the script that implements the search.
 function userInit() {  
         var options = {};  
         var db = fromCSV("searchs.csv");  
 }  
 function VUSER() {   
         record = getNext(db);  //record[0] is department name, record[1] is query 
    action(http, "");    
    if (hasElement(options) == 0) {   
      options = mapFormOptions(http.replyBody, "Health"); //"Health" is unique enough  
    }   
    a.search_query = record[1];   
    opt = record[0];  //department name
    a.search_constraint = options.$opt;   
    b = fillHtmlForm(http.replyBody, a, "id", "searchbox");   
 }   
     The script is still fairly simple (given all the mapping work it has to do!). With the super scalability of 50,000 VUsers emulated on one box, it can potentially give a good test of the performance of the search functionality (we would, of course, need permission to run a load test).

     Walmart's web site is a typical e-Commerce site; if a performance test platform/tool can test it well, chances are it can test other e-Commerce sites well too.

Monday, January 20, 2014

Performance testing on JSON RPC based server

    Just as JSON is overtaking XML as the dominant data format for communication between clients and servers, JSON-RPC will eventually top the other RPC mechanisms because of its flexibility and light weight.  For a performance test professional, the natural question is: how do we run a performance test against a JSON-RPC based server?

    NetGend, thanks to its great support for processing JSON messages, is a good choice for the job.  In a typical script, you start by using the function createJSONRPCObject to create a JSON-RPC object.  After that, you can use the function jsonrpc2string to serialize the JSON object into a string to be sent to the server.  Here is the syntax:
 jsonrpc2string(<rpcObject>, <method>, <parameters>);  

    Let's look at a concrete example, taken from an article on simple-is-better.
 function VUSER() {  
      a = createJSONRPCObject();  
      http.POSTData = jsonrpc2string(a, "echo", ["Hello World"]);  
 }  
    Here the method is "echo" and it has one parameter ("Hello World" in this case). If there are more parameters, you can add them to the array, like:
    http.POSTData = jsonrpc2string(a, "echo", ["Hello World", "please discard"]);

     According to the same article, JSON-RPC excels in its support for named parameters. Let's look at such an example (also taken from the above-mentioned article).

 function VUSER() {  
      a = createJSONRPCObject();  
      http.POSTData = jsonrpc2string(a, "search", {"last_name": "Python"});  
 }  
    In this example, the parameter field is {"last_name": "Python"}.  It's not an array but a dictionary/hash table where each field has a name (hence "named parameters").  It's very flexible in that it can easily support optional parameters and nested structures.
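For readers unfamiliar with the wire format: a JSON-RPC 2.0 request is just a small JSON object. Here is a plain-JavaScript sketch of what a serializer like jsonrpc2string presumably produces (the incrementing id is an assumption, not NetGend's documented behavior):

```javascript
// Minimal JSON-RPC 2.0 request serializer (illustrative sketch).
let nextId = 0;
function rpcToString(method, params) {
  return JSON.stringify({
    jsonrpc: '2.0',        // protocol version, per the JSON-RPC 2.0 spec
    method: method,
    params: params,        // array = positional, object = named parameters
    id: nextId++           // lets the client match responses to requests
  });
}

console.log(rpcToString('echo', ['Hello World']));
console.log(rpcToString('search', { last_name: 'Python' }));
```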

    We have talked about parsing and processing JSON messages in earlier blogs; let's just look at a quick example.  Suppose the server response is like the following and we need to print the number of results in the response.
 {"jsonrpc": "2.0", "result": [  
      {"first_name": "Brian", "last_name": "Python", "id": 1979, "number": 42},   
      {"first_name": "Monty", "last_name": "Python", "id": 4, "number": 1}  
      ], "id": 0}  
    Here is the script for it:
 function VUSER() {   
      a = createJSONRPCObject();   
      lastName = "Python";  
      http.POSTData = jsonrpc2string(a, "search", {"last_name": "${lastName}"});   
      action(http, "");   
      response = fromJson(http.replyBody); //http.replyBody has the server response, a JSON message
      size = length(response.result);  
      println("there are ${size} results in search for ${lastName}");  
 }   
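The response handling maps directly to plain JavaScript, if that helps to see what "fromJson" and "length" are doing (sketch only; the reply string is the sample response shown above):

```javascript
// Parse the server reply and count the results.
const replyBody = '{"jsonrpc": "2.0", "result": [' +
  '{"first_name": "Brian", "last_name": "Python", "id": 1979, "number": 42},' +
  '{"first_name": "Monty", "last_name": "Python", "id": 4, "number": 1}], "id": 0}';

const response = JSON.parse(replyBody);   // fromJson counterpart
const size = response.result.length;      // length(response.result) counterpart
console.log(`there are ${size} results in the search`);
```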
   This example can be made a little more advanced by reading a list of last names from, say, a CSV file.  We have blogs covering this topic.

    As the last example, let's take a look at an interesting question on Stack Overflow.  The user needs to parameterize some fields in the named parameters, like the values of x and y in
{ "userid": 123456, "x": 1, "y": 3 } 

     Here is a possible implementation.
 function VUSER() {   
      a = createJSONRPCObject();  
      x = randNumber(100, 300);  
      y = randNumber(200, 500);  
      http.POSTData = jsonrpc2string(a, "move_to_tile", { "userid": 123456, "x": "${x}", "y": "${y}" });  
      action(http, "");  
 }  
    It's fairly simple, isn't it?

    JSON is clearly the winner among data formats for the internet, so a good performance test platform/tool should make handling JSON messages easy.  That's exactly what NetGend is designed to do.

Saturday, January 18, 2014

Web site scalability

   It's many a web site owner's dream to have millions of visitors.  Along with that comes the need to scale the site to handle that many visitors.  To that end, let's look at a simplified web site infrastructure diagram (thanks to a blog by Ryan).



     In this diagram there are two web servers and one database server.  Many of us may think scalability can be achieved simply by adding more instances of web servers and database servers, and that in the cloud, where instance cost is so low (sometimes less than $0.01/hour), capacity can be scaled almost infinitely.  Unfortunately this is not true in many cases.

    While most of the time web servers can be scaled by adding more instances, database servers are much less amenable.  Some don't support multiple instances/shards at all.  Even when a database does support multiple instances/shards, it typically doesn't scale linearly with the number of instances, and the scalability diminishes quickly when the data schema is more complex and the degree of correlation among the data is higher.

    Many client-server interactions touch the database, whether it's creating a new session, registering a user, logging in or viewing a product.  In some sense, the database is the most important part of the infrastructure (that's why Oracle made a lot of $$ :-)), but unfortunately it's also the part that's hardest to scale.  Depending on the application, some data structures may have such a high degree of correlation that they are virtually impossible to scale.

    From a performance-testing point of view, it's important to find the degree of scalability of the database.  If the database is scalable, then it may be OK to run a test with less load and infer the capacity of a given system by extrapolation; otherwise, it's probably better to crank up the load to find the true capacity of the system - the point where the system can't keep up with demands/requests.
    One possible way to find the degree of database scalability is to compare the performance of one web server vs. two web servers.  It's fairly easy to run a test pointing at two servers on the NetGend platform:
 function userInit() {  
      var hosts = ["", ""];  
 }  
 function VUSER() {   
      host = randElement(hosts);  
      action(http, "http://${host}/");  
 }  
    If the response time dramatically improves with 2 servers over 1 server, the bottleneck is on the web server side. Otherwise, it probably indicates that the database doesn't scale as much as you would like. You will need to emulate many virtual clients on the test platform to find the capacity of the system - a job where the NetGend platform excels, with 50,000 virtual clients emulated on one system.

Monday, January 13, 2014

Test a performance test tool with JSON

    In earlier blogs, we covered various cases of handling JSON messages and saw how easy it is to extract any field(s) from a JSON message on the NetGend platform.  I saw this Stack Overflow question; it's quite interesting because it's more than just value extraction.

   The task is to analyze server responses (which are in JSON format) and find out how many unique types each message contains.  In the sample server response from that question (a JSON array of objects, each with "type" and "id" fields), there are 3 unique types: 8, 9 and 14.


   A complex JMeter solution was suggested; it involved multiple regular expressions, a loop controller and complicated manipulations of variables.  It could well be above the level of a typical user.  It's quite easy on the NetGend platform:

 function VUSER() {   
      action(http, "");  
      a = fromJson(http.replyBody); //reply may be different for different users 
      types = {};  
      count = 0;  
      for (i=0; i<length(a); i++) {  
           type = a[i].type;  
           if (types.$type != 1) {  
                types.$type = 1;  
                count ++;  
           }  
      }  
      println(count);  
 }  
    The script is pretty simple. Here is a summary of what it does:

  • it parses the HTTP response "http.replyBody" using the "fromJson" function,
  • since we know this JSON message is an array, it uses a "for" loop to go through the elements, each of which has the fields "type" and "id", and assigns the type of the element ("a[i].type" in our case) to the variable "type",
  • for each element, it uses "if (types.$type != 1)" to check whether we have seen this "type" before (i.e. whether we have seen the value contained in the variable "type"); if not, it sets "types.$type = 1" to mark this "type" as seen (so we don't count it again when we see it next) and increments the count.
    The only part that needs a little explanation is "types.$type".  It's slightly different from an expression like "types.XYZ", which means the value of the field "XYZ" of the compound variable "types".  "types.$type" refers to the field whose name is the value of the variable "type" (assigned from "a[i].type").  In other words, if the variable "type" has the value "ABC", "types.$type" means "types.ABC".
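The same counting logic, written in plain JavaScript for comparison (the sample array is made up so that it contains exactly the three unique types 8, 9 and 14):

```javascript
// Made-up sample response: an array of objects with "type" and "id".
const a = [
  { type: 8, id: 1 }, { type: 9, id: 2 }, { type: 8, id: 3 },
  { type: 14, id: 4 }, { type: 9, id: 5 }
];

const seen = {};        // plays the role of the "types" compound variable
let count = 0;
for (let i = 0; i < a.length; i++) {
  const type = a[i].type;
  if (seen[type] !== 1) {   // first time we meet this type?
    seen[type] = 1;         // mark it as seen
    count++;
  }
}
console.log(count);   // -> 3
```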

   As we have been emphasizing, using regular expressions to process JSON messages is not a good solution: it's hard, cumbersome and error-prone.  With the growing popularity of JSON messages in web applications, it's high time for other test platforms to implement good support for them.  If there were a set of questions to test the flexibility of a performance test platform, this one should be among them.

Friday, January 10, 2014

Scrape content from web pages

   Scraping web content is rewarding - you get the content you need and at the same time you sharpen your technical skills.  One of our earlier blogs was on downloading a list of MP3 files.  This blog is based on an interesting blog by Stu on scraping the list of WordPress plugins from web pages.  Stu has created many creative scripts even within the limitations posed by LoadRunner, and this is not the first time I have been inspired by his blogs!

   Things have changed a lot since Stu wrote his blog 3 years ago. Here is the structure of a plugin block in the web page as of now.  The fields that can be scraped are plugin name, version, last-updated date, downloads and stars.

 <div class="plugin-block">  
      <h3><a href=""> Url Shorter</a></h3>  
      This WordPress plugin can short your long url via       
        <ul class="plugin-meta">  
           <li><span class="info-marker">Version</span> 1.0.1</li>  
           <li><span class="info-marker">Updated</span> 2011-11-16</li>  
           <li><span class="info-marker">Downloads</span> 330</li>  
           <li>  
                <span class="info-marker left">Average Rating</span>  
                <div class="star-holder">  
                     <div class="star-rating" style="width: 0px">0 stars</div>  
                </div>  
           </li>  
        </ul>  
      <br class="clear" />  
 </div>  

    The following is a script on the NetGend platform.  It's interesting to compare it to the LoadRunner script in Stu's blog.  Note how short the NetGend script is.

 function userInit() {  
      var page = 1;  
      var maxPage = 2000;  
 }  
 function VUSER() {  
      currentPage = page ++;  
      if (currentPage > maxPage) { exit(0); }  
      action(http, "${currentPage}");  
      b = fromHtml(http.replyBody, '//div[@class="plugin-block"]', "node");  
      for (i=0; i<length(b); i++) {  
           c = fromXml(b[i]);  
           a = [ c.div.h3.a.value,[0].value,[1].value,[2].value,[3].div.div.value];
           println(join(a, ","));
      }  
 }  

    Why is the NetGend script so short?  If you compare it to the LoadRunner script, you will see that the NetGend script doesn't extract values by the left/right-boundary method.  That method may be convenient, but it's error-prone.  Instead, the script:

  • uses XPath (via the function "fromHtml") to extract an array of HTML blocks for the plugins, where each HTML block is an XML message,
  • parses each XML message using the function "fromXml" and accesses its fields.

   It's true that the part //div[@class="plugin-block"] requires a little extra knowledge, namely XPath; however, accessing the fields of an XML message is actually straightforward. Take c.div.h3.a.value as an example: "c" is the variable name, and the part "div.h3.a" follows the structure of the HTML block in an obvious manner.  You add "value" at the end of the expression to get the value of the field.

    Note that the NetGend script is immune to changes in the HTML tags - adding extra spaces or adding more tags will not affect the script.  The same can't be said of the left/right-boundary method.

    By the way, for those who truly want to extract values by left boundary and right boundary, we do support it on the NetGend platform.

   Scraping content from a web site may or may not generate a lot of web traffic, but it definitely gives a good clue about how flexible a performance test platform is.

Wednesday, January 8, 2014

Early performance testing, 1 oz prevention >= 1 lb cure

   We all know testing is an essential part of product development, and performance testing is a big part of testing - especially for web-based applications, where customers will be lost if the server crashes under high volume or the response time becomes too long.  Unfortunately, many projects involve performance testing only at the end, as part of acceptance testing.  Common as it is, there are serious problems with this approach.

   In some cases, the performance test results show that a considerable performance improvement is needed before the product can be released. Now developers have to search all over the code for places that can be optimized. It may seem like a bug hunt, but it's different because it is not about logic errors - after all, the logic of the code may be perfectly fine after going through rounds of unit testing and functional testing. Developers may have to proceed by trial and error, making changes and measuring how much performance they gain. The performance target may be so high that it takes weeks if not months to get there.  I have seen developers stare at the code so hard looking for optimizations that they found real bugs missed in unit tests or functional tests :-).  Obviously that's not a reason to leave performance testing to the end.

   In other cases, performance testing shows that the performance numbers are seriously lower than expected, and after some experiments a harsh reality surfaces: part of the design is fundamentally wrong.  After months of development and testing, things have to go back to square one. This may seem impossible, but it does happen from time to time. I was involved in a project where the performance bottleneck turned out to be disk IO. Simple as it may appear, it was found only after
  • trying different web servers, thinking an asynchronous server could solve the problem,
  • trying various database optimizations, hoping one silver bullet could save the day.
The cold, hard fact found by the performance testing was that the bottleneck was in retrieving records from disk. How fast records can be retrieved from disk depends on the access pattern, the database and the disk. A totally different database had to be used for that project.

     Now, why isn't early performance testing commonly practiced? One of the reasons is that it requires a non-trivial amount of resources. Sometimes such performance tests involve some programming, so developer resources have to be reallocated to performance test development.  Is it worth it?  Very likely.  It can be a perfect application of the classic principle that one ounce of prevention may be worth a pound of cure.  An added benefit is that the testing team may also benefit from the tools/procedures created by the developers.
    Another possible reason is that performance testing requires a tool that is neither simple to use nor cheap/free. In today's explosion of software products, new tools are coming up like mushrooms after a rain; just check around and you will find pleasant surprises. For example, in the area of application performance testing, new tools and low-cost/free services like Gatling,, and NetGend are all new kids on the block, waiting to help you do early performance testing.
     Looking back on some of my projects, I feel they could have benefited tremendously if we had done performance testing from the start - when the project was still in its infancy. We could have found out early on whether the major decisions (like the choice of database) were fundamentally wrong.  Just as with curing a major disease (think cancer), making changes in the early stage is much easier than in the late stage:
  • there is not a lot of time and effort invested yet,
  • the code structure is relatively simple - layers of patches on the code base during the course of a project can make it very hard, if not impossible, to hunt bugs, especially performance bugs,
  • if a performance bug is introduced in a build, it's much quicker to find the delta between builds and zero in on the bug by process of elimination on the delta.  It often happens that when the delta is reasonably small, developers can spot the performance bug immediately.

     Is early performance testing possible when development has just started?  Not always, but it should be done as soon as it becomes possible - definitely before the end of the project.  Once it is possible, performance tests should be run regularly, just like regular tests, to find bugs introduced in the various builds. Performance bugs are just as nasty as functional bugs, and they can sometimes be harder to hunt since the logic is typically correct.

     A serious performance issue can derail a product or cause a project to miss its deadline.  Are you ready to use one ounce of performance testing to save a pound of cure?

Monday, January 6, 2014

Browser is getting smarter!

    For performance testing professionals, a packet sniffer (like Wireshark) is a good friend.  It can help us troubleshoot issues that are otherwise hard to diagnose. For example, in an earlier blog, we showed how to use Wireshark to find out whether a delay in a server response is due to the network or to the server infrastructure.

   Today I want to share how a packet sniffer helped me realize that my browser has become smarter.  It all started with the surprise of seeing my Chrome browser set up two TCP connections simultaneously when I visited a web site.  I had expected the browser to do the following:

  • start with only one TCP connection,
  • send an HTTP request and download the main page,
  • set up more concurrent TCP connections to download the other resources.

    I did some searching and found this message thread, which shed some light.  To paraphrase what was mentioned there, setting up multiple TCP connections can have two benefits:
  • in case one connection takes longer to set up (say, due to packet drops), the other connection(s) can be used to download the main HTML page,
  • the extra TCP connection(s) can be used to download resource files (CSS, JS, images etc).
    While both help to improve page load time, the first can help more in some cases: when a TCP SYN or SYN-ACK packet is dropped, retransmission may kick in only after 3 seconds on some platforms, making the TCP setup time much longer.

    The following is a summary of the packet capture.  We filtered out the packets that are not relevant.

 Timestamp       Source IP      srcPort      Dest IP  dstPort   Protocol msg      
 0.000000000     54486     80     SYN  
 0.000230000     54487     80     SYN  
 0.066502000     80     54486     SYN, ACK  
 0.066862000     54486     80     GET / HTTP/1.1   
 0.070695000     80     54487     SYN, ACK  
 0.188546000     54488     80     SYN  
 0.189143000     54487     80     GET /css/screen.css HTTP/1.1   
 0.255478000     80     54488     SYN, ACK  
 0.255785000     54488     80     GET /css/jquery.fancybox.css HTTP/1.1   
 0.348672000     80     54486     HTTP/1.1 200 OK (text/html)  
 0.351507000     54490     80     SYN  
 0.353380000     54491     80     SYN  
 0.389387000     80     54488     HTTP/1.1 200 OK (text/css)  
 0.421846000     80     54490     SYN, ACK  
 0.422441000     54490     80     GET /js/site.js HTTP/1.1   
 0.425096000     80     54491     SYN, ACK  
 0.425724000     54491     80     GET /js/homepage.js HTTP/1.1   
 0.494837000     80     54491     HTTP/1.1 200 OK (application/x-javascript)  
 0.750929000     80     54487     HTTP/1.1 200 OK (text/css)  
 1.014032000     80     54490     HTTP/1.1 200 OK (application/x-javascript)  
 1.027851000     54496     80     SYN  
 1.027970000     54497     80     SYN  
 1.028081000     54498     80     SYN  
 1.028166000     54499     80     SYN  
 1.028249000     54500     80     SYN  
 1.048127000     54501     80     SYN  
 1.096886000     80     54496     SYN, ACK  
 1.097185000     54496     80     GET /images/oldBrowser-bg.png HTTP/1.1   
 1.098732000     80     54498     SYN, ACK  
 1.098952000     54498     80     GET /images/logo.jpg HTTP/1.1   
 1.100747000     80     54497     SYN, ACK  
 1.100963000     54497     80     GET /images/body_bg.jpg HTTP/1.1   
 1.102099000     80     54499     SYN, ACK  
 1.102248000     54499     80     GET /images/header_bg.png HTTP/1.1   
 1.104004000     80     54500     SYN, ACK  
 1.104228000     54500     80     GET /images/utility_divider.gif HTTP/1.1   
 1.118830000     80     54501     SYN, ACK  
 1.119053000     54501     80     GET /css/linux.css HTTP/1.1   
 1.165477000     80     54496     HTTP/1.1 200 OK (PNG)  
 1.166331000     54503     80     SYN  
 1.179595000     80     54500     HTTP/1.1 200 OK (GIF89a) (GIF89a) (image/gif)  
 1.180519000     54504     80     SYN  
 1.239261000     80     54498     HTTP/1.1 200 OK (JPEG JFIF image)  
 1.239636000     54505     80     SYN  
 1.249838000     80     54499     HTTP/1.1 200 OK (PNG)  
 1.250503000     54506     80     SYN  
 1.300834000     80     54501     HTTP/1.1 200 OK (text/css)  
 1.302668000     54509     80     SYN  
 1.306116000     80     54503     SYN, ACK  
 1.306922000     54503     80     GET /images/sprite/icons.png HTTP/1.1   
 1.308188000     80     54504     SYN, ACK  
 1.308516000     54504     80     GET /images/secondary_nav_arrow.png HTTP/1.1   
 1.310150000     80     54505     SYN, ACK  
 1.310330000     54505     80     GET /images/resource-nav-bdr.png HTTP/1.1   
 1.318582000     80     54506     SYN, ACK  
 1.318908000     54506     80     GET /js/trackalyze.js HTTP/1.1   
 1.371181000     80     54509     SYN, ACK  
 1.371670000     54509     80     GET /images/resources/videos/small-pen-testing-sap.jpg HTTP/1.1   
 1.378268000     80     54504     HTTP/1.1 200 OK (PNG)  
 1.380136000     54511     80     SYN  
 1.384338000     80     54505     HTTP/1.1 200 OK (JPEG JFIF image)  
 1.385775000     80     54506     HTTP/1.1 200 OK (application/x-javascript)  
 1.385953000     54512     80     SYN  
 1.387611000     54513     80     SYN  
 1.442571000     80     54509     HTTP/1.1 200 OK (JPEG JFIF image)  
 1.442854000     54516     80     SYN  
 1.451614000     80     54511     SYN, ACK  
 1.451893000     54511     80     GET /images/slider_bg.png HTTP/1.1   
 1.456305000     80     54512     SYN, ACK  
 1.456679000     54512     80     GET /images/slides/slide_1.png HTTP/1.1   
 1.460844000     80     54513     SYN, ACK  
 1.461305000     54513     80     GET /images/slides/slide_8.png HTTP/1.1   
 1.519412000     80     54516     SYN, ACK  
 1.519879000     54516     80     GET /images/slides/slide_2.png HTTP/1.1   
 1.669946000     80     54516     HTTP/1.1 200 OK (PNG)  
 1.673090000     54547     80     SYN  
 1.678979000     80     54503     HTTP/1.1 200 OK (PNG)  
 1.679257000     54549     80     SYN  
 1.695996000     80     54513     HTTP/1.1 200 OK (PNG)  
 1.696478000     54551     80     SYN  
 1.715088000     80     54497     HTTP/1.1 200 OK (JPEG JFIF image)  
 1.715935000     54552     80     SYN  
 1.741428000     80     54547     SYN, ACK  
 1.741790000     54547     80     GET /images/slides/slide_3.png HTTP/1.1   
 1.753093000     80     54549     SYN, ACK  
 1.753296000     54549     80     GET /images/slides/slide_4.png HTTP/1.1   
 1.762919000     80     54551     SYN, ACK  
 1.763151000     54551     80     GET /images/sec-nav-outer-bg.png HTTP/1.1   
 1.787654000     80     54552     SYN, ACK  
 1.787985000     54552     80     GET /images/torso_bg.jpg HTTP/1.1   
 1.798214000     80     54512     HTTP/1.1 200 OK (PNG)  
 1.798825000     54556     80     SYN  
 1.837648000     80     54551     HTTP/1.1 200 OK (PNG)  
 1.839507000     54559     80     SYN  
 1.869998000     80     54556     SYN, ACK  
 1.872545000     54556     80     GET /images/torso_bg.png HTTP/1.1   
 1.875444000     80     54547     HTTP/1.1 200 OK (PNG)  
 1.875845000     54560     80     SYN  
 1.927286000     80     54549     HTTP/1.1 200 OK (PNG)  
 1.928296000     80     54559     SYN, ACK  
 1.928822000     54559     80     GET /images/home/nexpose.png HTTP/1.1   
 1.929061000     54561     80     SYN  
 1.944633000     80     54560     SYN, ACK  
 1.945062000     54560     80     GET /images/blue_button_bg.gif HTTP/1.1   
 2.004818000     80     54559     HTTP/1.1 200 OK (PNG)  
 2.007275000     54562     80     SYN  
 2.008405000     80     54561     SYN, ACK  
 2.009310000     54561     80     GET /images/forward-separator.png HTTP/1.1   
 2.010527000     80     54552     HTTP/1.1 200 OK (JPEG JFIF image)  
 2.017098000     80     54556     HTTP/1.1 200 OK (PNG)  
 2.018256000     54563     80     SYN  
 2.019943000     80     54560     HTTP/1.1 200 OK (GIF89a) (image/gif)  
 2.021165000     54564     80     SYN  
 2.023154000     54565     80     SYN  
 2.049434000     80     54511     HTTP/1.1 200 OK (PNG)  
 2.051055000     54566     80     SYN  
 2.079561000     80     54562     SYN, ACK  
 2.079953000     54562     80     GET /images/home/metasploit.png HTTP/1.1   
 2.082981000     80     54563     SYN, ACK  
 2.085419000     80     54561     HTTP/1.1 200 OK (PNG)  
 2.089268000     80     54564     SYN, ACK  
 2.091658000     54563     80     GET /images/home/control-insight-logo.png HTTP/1.1   
 2.091795000     54564     80     GET /images/home/risk-rater-logo.png HTTP/1.1   
 2.092606000     54567     80     SYN  
 2.094013000     80     54565     SYN, ACK  
 2.096614000     54565     80     GET /images/grey_btn_bg.jpg HTTP/1.1   
 2.118005000     80     54566     SYN, ACK  
 2.118322000     54566     80     GET /images/home/user-insight-logo.png HTTP/1.1   
 2.146947000     80     54562     HTTP/1.1 200 OK (PNG)  
 2.149641000     54568     80     SYN  
 2.159017000     80     54563     HTTP/1.1 200 OK (PNG)  
 2.162342000     80     54567     SYN, ACK  
 2.162830000     54569     80     SYN  
 2.163137000     80     54564     HTTP/1.1 200 OK (PNG)  
 2.164078000     54567     80     GET /images/customer_bg.jpg HTTP/1.1   
 2.166080000     54570     80     SYN  
 2.167307000     80     54565     HTTP/1.1 200 OK (JPEG JFIF image)  
 2.169436000     54571     80     SYN  
 2.184629000     80     54566     HTTP/1.1 200 OK (PNG)  
 2.186997000     54572     80     SYN  
 2.219819000     80     54568     SYN, ACK  
 2.220139000     54568     80     GET /images/customers/carnegie_mellon.png HTTP/1.1   
 2.229254000     80     54569     SYN, ACK  
 2.229573000     54569     80     GET /images/customers/bcbs.png HTTP/1.1   
 2.242789000     80     54570     SYN, ACK  
 2.243155000     54570     80     GET /images/customers/lizclaiborne.png HTTP/1.1   
 2.244391000     80     54571     SYN, ACK  
 2.244762000     54571     80     GET /images/customers/usps.png HTTP/1.1   
 2.256889000     80     54572     SYN, ACK  
 2.257271000     54572     80     GET /images/customers/teradyne.png HTTP/1.1   
 2.287763000     80     54568     HTTP/1.1 200 OK (PNG)  
 2.289285000     54573     80     SYN  
 2.299719000     80     54569     HTTP/1.1 200 OK (PNG)  
 2.301075000     54574     80     SYN  
 2.314113000     80     54571     HTTP/1.1 200 OK (PNG)  
 2.315071000     80     54570     HTTP/1.1 200 OK (PNG)  
 2.329298000     80     54572     HTTP/1.1 200 OK (PNG)  
 2.358191000     80     54573     SYN, ACK  
 2.358506000     54573     80     GET /images/footer_bg.gif HTTP/1.1   
 2.369443000     80     54574     SYN, ACK  
 2.370632000     54574     80     GET /images/trustee_logo.jpg HTTP/1.1   
 2.429726000     80     54573     HTTP/1.1 200 OK (GIF89a) (image/gif)  
 2.441633000     80     54574     HTTP/1.1 200 OK (JPEG JFIF image)  
 2.531537000     80     54567     HTTP/1.1 200 OK (JPEG JFIF image)  
 2.538506000     54575     80     SYN  
 2.605481000     80     54575     SYN, ACK  
 2.605965000     54575     80     GET /css/print.css HTTP/1.1   
 2.677186000     80     54575     HTTP/1.1 200 OK (text/css)  
 2.890407000     54577     80     SYN  
 2.961928000     80     54577     SYN, ACK  
 2.962359000     54577     80     GET /favicon.ico HTTP/1.1   
 3.030690000     80     54577     HTTP/1.1 200 OK (image/x-icon)  

     From this packet capture, we can observe a few things that performance test professionals should take note of:

     First, as mentioned earlier, a browser may start more than one TCP connection; whichever handshake completes first is used for the HTTP transaction that downloads the main HTML page.

     Secondly, the browser does not wait for the main HTTP transaction to complete before sending HTTP requests for the resource files. A request can go out as soon as the URL of a resource file is available from the partial HTML page. See the packet at timestamp 0.189143000.

     Thirdly, the browser can start up to 6 connections at the same time to download resource files; see the TCP SYN packets starting at timestamp 1.027851000. The number 6 may be configurable, but it shows that a test platform needs to be able to use multiple concurrent connections to download resource files. The browser keeps 6 requests outstanding by starting a new request as soon as one of the 6 transactions completes; see the connection started at timestamp 1.166331000.
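The "keep 6 requests outstanding, refill a slot as soon as one completes" behavior can be sketched outside of any particular test tool. Here is a minimal Python sketch (the resource names are made up) that uses a thread pool capped at 6 workers, mirroring the per-host connection limit seen in the capture:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Stand-ins for the image URLs seen in the capture (names are hypothetical).
resources = ["/images/slides/slide_%d.png" % i for i in range(20)]

MAX_CONNECTIONS = 6          # the per-host connection limit observed above
lock = threading.Lock()
in_flight = 0
peak = 0

def fetch(url):
    """Placeholder for a real HTTP GET; it only tracks concurrency."""
    global in_flight, peak
    with lock:
        in_flight += 1
        peak = max(peak, in_flight)
    # a real fetch would open a connection and GET `url` here
    with lock:
        in_flight -= 1
    return url

# The pool starts a new request as soon as one of the 6 workers frees up,
# matching the browser behavior in the capture.
with ThreadPoolExecutor(max_workers=MAX_CONNECTIONS) as pool:
    results = list(pool.map(fetch, resources))
```

The pool never has more than 6 fetches in flight, yet all 20 resources are downloaded, which is exactly the refill pattern the SYN packets show.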

     Now that browsers have become smarter, we, the performance testing professionals, need to be smarter too.


Saturday, January 4, 2014

Testing Elasticsearch

    Real-time analytics on real-time data is increasingly used in today's e-commerce, and Elasticsearch is a great tool for it. Its distributed architecture, high availability and full-text search capability have gained a lot of followers. Not only does it have a long list of partners (27 and growing), it also has many thriving communities.

    For a customer who runs Elasticsearch on cloud infrastructure, it's important to know the capacity of the reserved instances. This helps the customer reserve fewer instances and still meet his/her business needs. In this blog, we are going to show how easy it is to run some performance tests on Elasticsearch with the NetGend platform. I would like to thank Joel Abrahamsson for his excellent tutorial on the basic Elasticsearch commands to insert/view records and run queries.

    We are going to run some simple tests on the Elasticsearch software running on my Ubuntu 12.04 server with an Intel(R) Xeon(R) CPU E3-1240 v3 @ 3.40GHz. First, let's insert a comprehensive DVD list into the database so that we can run some searches on it. The DVD list is in the format of

 !!!! Beat, Vol. 1: Shows 01 - 05|Bear Family||Discontinued|2.0|4:3|45.98|NR|UNK|Music|1.33:1|4000127201263|2005-05-10 00:00:00|61689|2012-12-24 00:00:00  
 !!!! Beat, Vol. 2: Shows 06 - 09|Bear Family||Discontinued|2.0|4:3|45.98|NR|UNK|Music|1.33:1|4000127201270|2005-05-10 00:00:00|61690|2012-12-24 00:00:00  
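This pipe-delimited format is easy to split by hand. A quick Python sketch, using the first sample line, pulls out the title, status and price fields (indices 0, 3 and 6, the same indices the insertion script uses):

```python
# First sample line of the DVD list (pipe-delimited, 15 fields).
line = ("!!!! Beat, Vol. 1: Shows 01 - 05|Bear Family||Discontinued|2.0|4:3|"
        "45.98|NR|UNK|Music|1.33:1|4000127201263|2005-05-10 00:00:00|61689|"
        "2012-12-24 00:00:00")

fields = line.split("|")
# field 0 is the title, field 3 the status, field 6 the price
title, status, price = fields[0], fields[3], fields[6]
```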
    Here is the NetGend script that does insertions.
 //Insert DVD records into the database  
 function userInit() {  
      var db = fromCSV("dvdlist.csv", "|");  
      rec = getNext(db); //skip the first record, which is the header  
      var gId = 1;  
 }  
 function VUSER() {  
      id = gId ++;  
      rec = getNext(db);  
      http.POSTData = q|{  
 "title": "${rec[0]}",  
 "price": ${rec[6]},  
 "status": "${rec[3]}"  
 }|;  
      http.method = "PUT";  
      //... (the HTTP transaction to the Elasticsearch URL is elided here)  
      res = fromJson(http.replyBody);  
      if (res.ok != true) {  
            println("failed to insert rec ${id} ${res.error}");  
      }  
 }  
     We were able to achieve 2500+ inserts/second with about 80% CPU usage on the Elasticsearch process.

     Note that when constructing a JSON message, we don't have to escape the many double quotes in the template, as we would on some other test platforms; doing so by hand can be quite tedious. In the last example of this blog, we will see an even simpler way of constructing a JSON message.
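For comparison, the same "no manual escaping" convenience exists in any language with a JSON library. A small Python sketch (the record values are hypothetical) builds the message from a plain dictionary and lets the library handle the quoting:

```python
import json

# Hypothetical record values, mirroring the fields templated into JSON above.
rec = {"title": 'A DVD with "quotes" in its title',
       "price": 45.98,
       "status": "Discontinued"}

body = json.dumps(rec)      # the inner double quotes are escaped for us
parsed = json.loads(body)   # round-trips back to the same record
```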

     Now let's make sure the records exist.
 function userInit() {  
      var gId = 1;  
 }  
 function VUSER() {  
      id = gId ++;  
      //... (the HTTP GET for record ${id} is elided here)  
      res = fromJson(http.replyBody);  
      if (res.exists != true) {  
            println("record ${id} does not exist");  
      }  
 }  
    This is quite light on CPU.

    Finally, let's run some simple searches. Our search terms come from a comprehensive list of 2000+ nouns, a simple file with one noun per line.
 function userInit() {  
      var db = fromCSV("nounlist.txt", "\n");  
 }  
 function VUSER() {  
      rec = getNext(db); //each record corresponds to a row in the file, so it's an array  
      q.query.query_string.query = rec[0];  
      http.POSTData = toJson(q);  
      //... (the HTTP transaction for the search query is elided here)  
      res = fromJson(http.replyBody);  
      if (res.took > 15) {  
            println("query for ${rec[0]} took ${res.took} ms");  
      }  
 }  
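The nested assignment `q.query.query_string.query = rec[0]` maps onto an ordinary nested dictionary. A small Python sketch (the noun is hypothetical) builds the same query body:

```python
import json

noun = "orange"   # a hypothetical entry from the 2000+ noun list

# Equivalent of: q.query.query_string.query = rec[0]; http.POSTData = toJson(q)
q = {"query": {"query_string": {"query": noun}}}
body = json.dumps(q)
```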
    When we send requests at a rate of 1000/second, the CPU usage of the Elasticsearch process is 134%, so searching can be heavier than insertion.

    This blog shows some simple examples of testing an Elasticsearch system. It's clear that the NetGend platform greatly simplifies the handling of JSON messages, which is essential for testing Elasticsearch.

    We will be using it to do some real world testing, so stay tuned.

Wednesday, January 1, 2014

Performance test with limited bandwidth

    This Q&A from stackoverflow asks an interesting question: how do you run a performance test when there is not enough bandwidth between the test platform and the server? In a typical HTTP transaction, the server response can be big, say 1000KB, so a bandwidth bottleneck may force the client side to do fewer transactions and hence generate less load on the server. How do we work around this limitation? The question was asked about JMeter, but it applies to other test platforms as well.

    One idea is to perform HTTP transactions in a special way: send HTTP requests so that the server does all the processing to generate the responses, but cannot send much of the response data. This bandwidth-conservation idea sounds easy, but it is not trivial to implement. Simply closing a TCP connection right after sending an HTTP request may not work, since the server may detect that the connection was torn down and abort the processing. If you close the connection after receiving the first data packet from the server, you may already have received too much data, because the server sends its data in a burst.

   Luckily, it's fairly easy to implement this idea on the NetGend platform because it supports limiting the TCP receive window size (tcpRecvWinSize). This parameter controls how much data a server can send to a client in a burst. In addition, we can close the connection upon receiving the first data packet from the server, because NetGend can send an HTTP request without doing a full HTTP transaction. Note that on some test platforms, sending a desired HTTP request requires a full HTTP transaction, which involves waiting for the entire HTTP response to arrive.

    This effectively ensures that the server does all the heavy lifting in generating the responses but cannot send enough data to consume the bandwidth. Here is the simple script:
 function userInit() {  
      tcpRecvWinSize = 100;  //limit the tcp receive window size  
 }  
 function VUSER() {  
      connect("", 80);  
      msg = createHttpRequest("");  
      //... (sending the request and closing on the first data packet is elided)  
 }  
     In this script, we set the TCP receive window size to a very small value; the setting applies to all TCP connections in this test. A VUser connects to the server, creates an HTTP request and sends it to the server. Finally, it closes the connection upon receiving the first data packet.
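The same window-shrinking idea can be sketched with raw sockets. This Python sketch (the server address is omitted, as in the script above) shrinks SO_RCVBUF, which caps the TCP receive window the client advertises; note that the kernel may round the value up to its own minimum:

```python
import socket

WINDOW_BYTES = 100   # deliberately tiny, like tcpRecvWinSize in the script

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# A small SO_RCVBUF caps the TCP receive window the client advertises,
# throttling how much data the server can have in flight toward us.
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, WINDOW_BYTES)
effective = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)  # kernel may round up

# A real test would then connect, send the GET request, read the first
# burst of data and close the connection, e.g.:
#   s.connect((server, 80))
#   s.sendall(b"GET / HTTP/1.1\r\nHost: example\r\n\r\n")
#   s.recv(4096)
#   s.close()
s.close()
```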

     As we discussed in earlier blogs, a performance test platform can also be used for security testing such as DDoS. The above script will "torture" a server so much that it can serve as both a good performance test and a good security test.