Friday, January 23, 2015

DynaTrace Monitoring Dashboard Metrics Explained


All of these metrics are shown on the monitoring dashboard. To open it, click the “Monitoring” button at the top left of the open dashboard, as shown below.




 
1) Throughput (Requests/Minute): The “Throughput” chart shows the throughput count and the expected range, based on data from the last week.
 


 




 

 





Unlike “Failure Rate” and “Response Time”, Throughput has no violation alerts, because a variation in Throughput does not automatically signal an issue in the application.
However, if Throughput = 0, the application under test may not be available.
How does dynaTrace create the baseline for “Throughput”?
dynaTrace calculates the expected range for this metric from historical data. The time frame for the historical data is the previous week. If data for the previous week is not available, dynaTrace may use a sample set from the previous day, the previous hour, or some time frame since the start of the current baselining.
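The fallback order described above can be sketched in a few lines of Python. This is only an illustration of the idea, not dynaTrace's actual implementation: the window names and the `pick_baseline_samples` helper are hypothetical, and the sketch does not try to reproduce how the expected range itself is computed.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical window names, most preferred first (order taken from the text above).
WINDOW_PREFERENCE = [
    "previous_week",
    "previous_day",
    "previous_hour",
    "since_baselining_start",
]


def pick_baseline_samples(
    history: Dict[str, List[float]]
) -> Optional[Tuple[str, List[float]]]:
    """Return (window name, samples) for the first preferred window with data."""
    for window in WINDOW_PREFERENCE:
        samples = history.get(window, [])
        if samples:
            return window, samples
    return None  # no historical data yet, so no baseline can be built


# Made-up requests/minute samples for illustration only.
history = {
    "previous_week": [],                    # no data for the previous week
    "previous_day": [118.0, 131.0, 125.0],
    "previous_hour": [122.0],
}
print(pick_baseline_samples(history))
```

Here the previous week has no samples, so the sketch falls back to the previous day.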
2) Failure Rate: dynaTrace can detect all kinds of unexpected events while an application is running. Common errors include HTTP errors such as 4XX and 500 errors, Java exceptions, etc.
dynaTrace creates a violation when at least one business transaction of the application under test currently produces more failures than the historical values of the previous week (as shown in the snapshot below).
 


 
 

If you click “Failure Rate” in the “Business Transactions” section, a new dashlet opens that focuses only on failure rates, as shown below.

 
 




 
 
 
 
 

 


Here –
Red line – Indicates Violations.
Blue Line – Indicates Failure Rate Trends.
Green Line – Indicates Failure Rate baseline.

How does dynaTrace create the baseline for “Failure Rate”?
dynaTrace uses a binomial distribution to create the baseline for the failure rate. Failure “Violations” are reported only if the failure rate measured at a particular time is statistically significant. Statistically significant means there are enough requests/minute (Throughput) to be certain that the data is reliable.
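To see why throughput matters for significance, here is a minimal sketch of a one-sided binomial test in Python. It is an assumption about how such a check could work, not dynaTrace's actual algorithm: the function names, the baseline rate of 5% and the significance level `alpha` are all made up for illustration.

```python
from math import comb


def binomial_tail(failures: int, requests: int, baseline_rate: float) -> float:
    """P(X >= failures) for X ~ Binomial(requests, baseline_rate)."""
    return sum(
        comb(requests, k) * baseline_rate**k * (1 - baseline_rate) ** (requests - k)
        for k in range(failures, requests + 1)
    )


def is_violation(
    failures: int, requests: int, baseline_rate: float, alpha: float = 0.01
) -> bool:
    """Flag a violation only if this many failures would be very unlikely
    under the baseline rate (alpha is an assumed significance level)."""
    if requests == 0:
        return False  # no traffic, nothing to judge
    return binomial_tail(failures, requests, baseline_rate) < alpha


# Both samples show a 50% failure rate against an assumed 5% baseline.
print(is_violation(failures=1, requests=2, baseline_rate=0.05))     # too few requests
print(is_violation(failures=50, requests=100, baseline_rate=0.05))  # enough requests
```

With only 2 requests, one failure is not surprising enough to count, while the same failure rate over 100 requests is.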

3) Response Time: dynaTrace shows two response time values on the monitoring dashboard.
a) Median response time: This is the median of the current response times.
*Median - For a data set with an odd number of elements: the value of the middle element.
- For a data set with an even number of elements: the mean of the two middle elements.
b) Response time for slowest 10%: This is the average of all response times that are greater than the 90th percentile response time.
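Both values can be worked out by hand on a small sample. The Python sketch below uses `statistics.median` from the standard library, which follows the odd/even rule given above. The `slowest_10_percent_average` helper is hypothetical, and the nearest-rank definition of the 90th percentile is an assumption, since percentile definitions differ between tools.

```python
from statistics import median
from typing import List


def slowest_10_percent_average(response_times: List[float]) -> float:
    """Average of the response times above the 90th percentile.

    Uses the nearest-rank 90th percentile (an assumed definition).
    """
    if not response_times:
        raise ValueError("need at least one response time")
    ordered = sorted(response_times)
    n = len(ordered)
    rank = (90 * n + 99) // 100          # ceil(0.9 * n) in integer arithmetic
    p90 = ordered[rank - 1]
    slowest = [t for t in ordered if t > p90]
    if not slowest:                      # e.g. all the top values are tied
        return p90
    return sum(slowest) / len(slowest)


# Made-up response times in milliseconds.
times_ms = [120, 95, 110, 480, 105, 100, 130, 115, 125, 90]
print(median(times_ms))                      # mean of the two middle elements
print(slowest_10_percent_average(times_ms))  # only 480 lies above the 90th percentile
```

A single slow request barely moves the median but dominates the slowest 10% value, which is why it helps to watch both.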
If you click the “Response Time” entry in the “Business Transactions” section of the monitoring dashboard, you land on the “Response time” dashlet, as shown below.
 
 
 


 
 


In the right-hand pane of this page there is another dashlet section called “Response time (slowest 10%)”. Clicking that section minimizes the current dashlet and opens the “Response time (slowest 10%)” dashlet.

 
 
 


 
 


How does dynaTrace create the baseline for “Response Time”?
Unlike other monitoring tools, dynaTrace does not use averages or standard deviations to create baselines.
Median – The baseline value is the 50th percentile of all response times.
Slowest 10% - Transactions whose response times are above the 90th percentile of all transactions.

4) System Health: This portion of the monitoring dashboard displays the state of the “Processes” that are running and the “Hosts” on which they run.
The following statuses can be shown:

GREEN – OK
RED – UNHEALTHY
Percentage – The percentage by which a Process or Host is negatively affected.
GREY – N/A

 
 
 


If you hover over the status color for “Process” or “Hosts”, you can see the “Application Process” and “Hosts” hyperlinks respectively. Clicking those links takes you to the “Application Process” details and “Hosts” details pages, as shown below.


 
 



 

 

 
 
 
 
