Monday, May 25, 2015

Common performance test metrics



There are two types of application performance test metrics –
1)      Client side metrics – These are metrics we can collect without installing any software or tool on the server. They tell us how the user experience degrades as the load changes.
Some of the common client side metrics are – number of Vusers, transactions per second, transaction response time, throughput, errors, and hits per second.

2)      Server side metrics – These are metrics collected from the servers. Collecting them needs some kind of software or tool installed on the server; the other way is to log into the server directly and read them. Of course there are many types of servers (Windows, Unix, AIX), and we need to know how to collect these metrics on each one if we are logging in directly.
Some of the common server side metrics are – CPU utilization, memory utilization and % Disk time.

There are network resource metrics too, which do not fit into either of the two categories above. These metrics tell us about the performance of the network and any performance issues caused by it.

Some of the network resource metrics are – network latency, network round trip and data transfer.

Explanation of metrics: 

Number of Vusers – The number of virtual users running during the performance test period.

Transactions per second – The number of transactions completed per second during the performance test. This can be measured for each individual transaction or as a total for the whole test.
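
As a rough illustration of the transactions per second metric, here is a minimal Python sketch. The function name, the transaction names and all the counts are made up for the example; real tools report this figure for you.

```python
def transactions_per_second(completed_count, duration_seconds):
    """Average number of completed transactions per second of the test."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return completed_count / duration_seconds


# Hypothetical figures: a 10 minute test with three scripted transactions.
test_duration = 10 * 60
completed = {"login": 6_000, "search": 9_000, "checkout": 3_000}

# Individual transaction level.
for name, count in completed.items():
    print(name, transactions_per_second(count, test_duration))

# Total for the whole test.
print("total", transactions_per_second(sum(completed.values()), test_duration))
```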

Transaction response time – The time taken by a transaction from the moment it is initiated until the first byte of its response is received.

When a transaction is initiated, it goes through a number of layers in the application architecture.

For example, a request (transaction) in a 3-tier application typically passes through the web, application and database tiers before its response travels back to the client.

Typically in all the performance test result reports we will see Average and 90th percentile response times.

Average transaction response time – The average of all the response times for a particular transaction over a specified time period.
Example: if we have 5 transactions taking 2, 3, 2, 3 and 5 seconds, the average transaction response time is (2+3+2+3+5)/5 = 3 seconds.
90th percentile transaction response time – The value at the 90 percent position when we arrange the response times in increasing order. What this means is that 90 percent of the transactions took no longer than this value.
Example: if we have 10 transactions taking 2, 3, 2, 3, 1, 5, 4, 2, 4 and 6 seconds, we first arrange these values in increasing order: 1, 2, 2, 2, 3, 3, 4, 4, 5, 6. The 9th value (90 percent of 10) is 5 seconds, so 5 seconds is the 90th percentile value.
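
The two examples above can be reproduced with a few lines of Python. This is only a sketch: the function names are my own, and the percentile uses the simple "nearest rank" method described above. Reporting tools may interpolate between values and give slightly different numbers.

```python
import math


def average(values):
    """Arithmetic mean of the response times."""
    return sum(values) / len(values)


def percentile_nearest_rank(values, pct):
    """Value at position ceil(pct% of N) once the values are sorted."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based position
    return ordered[max(rank, 1) - 1]


print(average([2, 3, 2, 3, 5]))                                      # 3.0
print(percentile_nearest_rank([2, 3, 2, 3, 1, 5, 4, 2, 4, 6], 90))   # 5
```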

Hits per second – The number of hits made on the web servers per second of the test. A hit is different from a page request, since a page contains many resources like images, videos or other files. A hit is a request for one resource from the server. So if one page request involves 3 resources (1 image, 1 video and 1 graphic), it means there are 3 hits on the web server.

Throughput – This metric represents the number of bytes received by the client from the web server per unit of time. Throughput reflects the load generated on the web server by the hits the client makes on it.
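
To make hits per second and throughput concrete, here is a small Python sketch. The function names and all the numbers (page requests, bytes received, test length) are invented for illustration; only the "3 resources per page request" figure comes from the example above.

```python
def hits_per_second(page_requests, resources_per_page, duration_seconds):
    """Each resource fetched counts as one hit on the web server."""
    return page_requests * resources_per_page / duration_seconds


def throughput_bytes_per_second(bytes_received, duration_seconds):
    """Bytes received by the client from the web server per second."""
    return bytes_received / duration_seconds


# Hypothetical 60 second window: 200 page requests, 3 resources each,
# 90,000,000 bytes received in total.
print(hits_per_second(200, 3, 60))                  # 10.0
print(throughput_bytes_per_second(90_000_000, 60))  # 1500000.0
```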
 
CPU utilization – Average utilization of all the processors in the system during the test period, or average utilization of any particular processor during the test.

On Windows platform – We can use Perfmon counters to look into CPU utilization.
On UNIX platform – We can use the vmstat command to look into CPU utilization.

Processor counters can be divided into the following two types:

1)      % User time – Percentage of time the processor spent executing user mode code.

2)      % Privileged time – Percentage of time the processor spent executing kernel mode code.
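
As a sketch of how these counters combine into an average CPU utilization figure, the Python below assumes that total processor time in each sample is the sum of % User time and % Privileged time. The sample values and function names are hypothetical; in practice the samples would come from Perfmon or vmstat output.

```python
def total_cpu_percent(user_pct, privileged_pct):
    """Assumption: total busy time = user mode time + kernel mode time."""
    return user_pct + privileged_pct


def average_cpu_utilization(samples):
    """Average total CPU utilization over (user %, privileged %) samples."""
    totals = [total_cpu_percent(user, priv) for user, priv in samples]
    return sum(totals) / len(totals)


# Hypothetical samples taken at regular intervals during a test.
samples = [(30.0, 10.0), (45.0, 15.0), (50.0, 30.0)]
print(average_cpu_utilization(samples))  # 60.0
```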
 

Memory utilization – Average memory consumed on the servers during the performance test.
There are many memory metrics, such as the following:

Available kilobytes – Average memory, in kilobytes, available during the performance test. Some tools report this in other units such as MB or bytes; the only difference in those cases is the unit and the level of detail available.

Pages/Sec – Number of virtual memory pages which are read or written per second. From this metric we can also derive the amount of data moved between RAM and disk per second. To do this, just multiply the Pages/Sec number by the page size (4 KB on most machines).
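
The Pages/Sec multiplication is simple enough to show in Python. This is a sketch only: the 4 KB page size is the assumption stated above, and the 250 pages/sec reading is a made-up value.

```python
PAGE_SIZE_BYTES = 4 * 1024  # assumed page size: 4 KB


def paging_bytes_per_second(pages_per_sec, page_size_bytes=PAGE_SIZE_BYTES):
    """Data moved between RAM and disk per second, in bytes."""
    return pages_per_sec * page_size_bytes


# Hypothetical reading of 250 pages/sec.
rate = paging_bytes_per_second(250)
print(rate)         # 1024000 bytes per second
print(rate / 1024)  # 1000.0 KB per second
```

If the page size on your machine differs from 4 KB, pass it as the second argument. In Python, `mmap.PAGESIZE` reports the page size of the machine running the script.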

% Disk time – This metric can be calculated by multiplying the “Average Disk Queue Length” counter by 100. For this reason we can sometimes see a % Disk time value of more than 100%, which happens when Average Disk Queue Length is more than 1.
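
The relationship described above between % Disk time and Average Disk Queue Length can be written out as a short Python sketch. The function name and the queue length readings are hypothetical.

```python
def disk_time_percent(avg_disk_queue_length):
    """% Disk time derived from the Average Disk Queue Length counter."""
    return avg_disk_queue_length * 100


print(disk_time_percent(0.5))  # 50.0
print(disk_time_percent(2.5))  # 250.0, above 100% because the queue length is over 1
```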

% Disk time can be measured separately for disk reads (% Disk read time) and disk writes (% Disk write time).

% Disk idle time – The percentage of time during which the disk has no requests to process from the operating system. 0 means the disk was always busy, while 100 means the disk was always idle.

There are many other disk metrics like Disk transfers/second, Disk bytes/second, Average Disk bytes/transfer, Average Disk Seconds/transfer etc. Their names are self-explanatory, so I am not explaining them here.

Errors – As the name suggests, these are the error messages or exceptions thrown by the application while it is under test. Many kinds of errors come from the application side, but some errors can be caused by our own scripts or scenarios. We need to check why these errors are thrown while executing a test, and we also need to keep an eye on errors which increase with load, because most performance issues creep in when we increase the load.

Network latency – How much time it takes for a packet of data to get from one designated point to another.
Latency can be due to many factors like –
Propagation – the time one packet of data takes to travel from one point to another. So data sent from India to China will take less time than data sent from India to the US (assuming, of course, that the medium is direct).
Transmission – every type of network medium has a different transmission delay.
Hop processing – There may be some processing time taken by network devices like routers, bridges etc.
Network round trip – This is a kind of latency measure where we measure latency from one point to another and back.
Network utilization – This metric tells us how loaded our network is. A high value (close to 100%) indicates network congestion.
Data transfer – Bytes received/sec, Bytes sent/sec, Bytes total/sec – These metrics tell us about the amount of data sent over the network per second. The names are self-explanatory.
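
To show how the network metrics above fit together, here is a rough Python sketch. It relies on assumptions that are not from any particular tool: propagation delay is taken as distance divided by signal speed (about 2 x 10^8 m/s in cable), transmission delay as packet size divided by link bandwidth, and the round trip as twice the one-way latency on a symmetric path. All names and numbers are hypothetical.

```python
def propagation_delay_s(distance_m, signal_speed_m_per_s=2e8):
    """Time for the signal to cover the distance (assumed speed in cable)."""
    return distance_m / signal_speed_m_per_s


def transmission_delay_s(packet_bytes, bandwidth_bits_per_s):
    """Time to push all bits of the packet onto the link."""
    return packet_bytes * 8 / bandwidth_bits_per_s


def one_way_latency_s(distance_m, packet_bytes, bandwidth_bits_per_s,
                      hop_processing_s=0.0):
    """Propagation + transmission + total hop processing time."""
    return (propagation_delay_s(distance_m)
            + transmission_delay_s(packet_bytes, bandwidth_bits_per_s)
            + hop_processing_s)


def round_trip_s(one_way_s):
    """Assumes the return path has the same latency as the forward path."""
    return 2 * one_way_s


def network_utilization_percent(bytes_total_per_s, link_capacity_bits_per_s):
    """Bytes total/sec expressed as a percentage of the link capacity."""
    return bytes_total_per_s * 8 * 100 / link_capacity_bits_per_s


# Hypothetical link: 4,000 km, 1,500 byte packet, 100 Mbit/s, 1 ms in routers.
latency = one_way_latency_s(4_000_000, 1500, 100_000_000, hop_processing_s=0.001)
print(latency, round_trip_s(latency))
print(network_utilization_percent(10_000_000, 100_000_000))  # 80.0
```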
