The Performance Massager

At Riverbed, we strive to tackle all aspects of network and application performance management, from improving performance with application acceleration and bandwidth optimization to monitoring performance with advanced analytics and deep application drill-downs. Now, Riverbed is tackling the last link in the performance management puzzle: the final report to your boss.

 

Consider this scenario: you know the network and all the applications that run on it like the back of your hand. It's a well-oiled machine running beautifully, and applications are buzzing along with the help of all your favorite Riverbed products.

 

But you've got a boss who's always looking for more. The boss demands, "I want to see that Response Time come down!" You explain that the speed of light is constant and the laws of physics dictate... The boss interrupts you right there with "Just make it look better!" and storms off.

 

The Performance Massager

 

We took that message back to our engineering team and after a few very long days, they devised the perfect solution to this problem:

The Performance Massager takes report data as input, along with an improvement factor, and generates better looking report data as output. A prototype version of the tool, called 'perf-massager.py' (see below), is written in Python and leverages the FlyScript SDK. As such, the tool can run on just about any platform and can be customized as needed.

 

Options

 

Option            Description
--factor <float>  The "improvement" factor to apply to the data. Use a negative number for reductions and a positive number for gains.
--start <float>   A fraction between 0 and 1 indicating when in the data stream the improvement becomes apparent. Defaults to 0.5 (50%).
--mid <float>     A fraction between 0 and 1 indicating when the improvement reaches its full effect. Defaults to 0.7 (70%).
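
To make the interaction of the three options concrete, here is a minimal standalone sketch of the ramp they describe. The function name `massage` and the sample values are hypothetical and chosen only for illustration; the sketch needs neither FlyScript nor pandas, and it is not the tool itself.

```python
def massage(values, factor, start=0.5, mid=0.7):
    """Return a copy of values with the ramped "improvement" applied.

    Hypothetical helper mirroring the --factor, --start and --mid options.
    """
    n = len(values)
    start_idx = int(n * start)
    mid_idx = int(n * mid)
    result = list(values)
    for i in range(start_idx, n):
        if i < mid_idx:
            # Ramp linearly from no change at start_idx toward the full factor at mid_idx
            scale = 1.0 + factor * (i - start_idx) / float(mid_idx - start_idx)
        else:
            # From mid_idx onward the full factor applies
            scale = 1.0 + factor
        result[i] = values[i] * scale
    return result


# Ten made-up samples of 200 ms, "improved" by a factor of -0.5 with the default start and mid
samples = [200.0] * 10
massaged = massage(samples, factor=-0.5)
print(massaged)
# [200.0, 200.0, 200.0, 200.0, 200.0, 200.0, 150.0, 100.0, 100.0, 100.0]
```

With the defaults, the first half of the data is left untouched, the change ramps in between the 50% and 70% marks, and everything after that carries the full factor.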

 

Sample Reports

 

Consider the two before and after reports:

 

[Figure, left: Report Data]
[Figure, right: Better Looking Report Data]

 

Comparing the two reports, you'll notice the following improvements:

  • The time-series graph of Response Time shows an improvement factor of -0.5: an improvement from ~230ms down to ~110ms
  • The bar chart of Response Time by location shows the same improvement.  Note in particular that the DataCenter response time has dropped to ~90ms despite the fact that the physical round-trip time to the data center is over 130ms
  • Even call quality can be improved, as shown in the VoIP MOS graph.  Over the same time period, MOS has improved from about 4.6 to a crisp and clear MOS of 6.26 (when the boss mentions that MOS is between 1 and 5, just point out the huge improvement in TCP resets in the next graph)
  • By using a nice large improvement factor of -0.9, TCP Resets can be dropped down to a mere 10% of what they were before
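
The numbers in the list above can be sanity-checked with a little arithmetic, assuming the full-effect value is simply the original multiplied by (1 + factor), as in the script at the end of this post. The helper names below are hypothetical. The MOS factor is not stated anywhere in this post; it is only what the quoted before and after values would imply.

```python
def massaged_value(original, factor):
    # Value once the improvement has reached its full effect
    return original * (1.0 + factor)


def implied_factor(before, after):
    # Factor that would turn `before` into `after`
    return after / float(before) - 1.0


response_ms = massaged_value(230.0, -0.5)    # 115.0 ms, in line with the ~110ms on the graph
resets_fraction = massaged_value(1.0, -0.9)  # roughly 0.1, i.e. 10% of the original resets
mos_factor = implied_factor(4.6, 6.26)       # roughly 0.36

print(response_ms, resets_fraction, mos_factor)
```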

 

This script operates on real network monitoring data, which greatly reduces the risk that the resulting reports look completely fabricated.  By carefully choosing appropriate improvement factors, you can fine-tune just how much better you want your data to look.  It is recommended that you start with small values and gradually increase the magnitude of the improvement over time.

 

Script

 

Notes:

  • This script relies on pandas, the Python Data Analysis Library, a package that makes it easy to manipulate arrays of data.  Install with 'pip install pandas' on most machines.
  • The only metric queried in the report is Response Time.  Modify the script as needed for additional data columns.
  • Run the script with -h or --help to see all options.

 

from rvbd.profiler import *
from rvbd.profiler.app import ProfilerApp
from rvbd.profiler.filters import TimeFilter, TrafficFilter

import pprint
import pandas

class ProfilerReport(ProfilerApp):

    def add_options(self, parser):
        parser.add_option('--factor', default=0.5, dest='factor', type='float')
        parser.add_option('--start', default=0.5, dest='start', type='float')
        parser.add_option('--mid', default=0.7, dest='mid', type='float')

            
    def main(self):
        p = self.profiler
        
        # Create and run an overall traffic time-series report
        report = TrafficOverallTimeSeriesReport(p)
        report.run(    
            columns = [p.columns.key.time,
                       p.columns.value.response_time],
            timefilter = TimeFilter.parse_range("last 60 m"),
            resolution = "1min")

        # Retrieve the data, and convert to a pandas DataFrame
        data = report.get_data()
        df = pandas.DataFrame(data)

        # Grab column 1 (Response Time); column 0 holds the timestamps
        s = df[1]
        start = int(len(s) * self.options.start)
        mid = int(len(s) * self.options.mid)
        end = len(s)

        # Ramp the factor in linearly between start and mid, then apply it in full through the end
        for i in range(start, end):
            if i < mid:
                s[i] = s[i] * (1.0 + self.options.factor * (i - start) / (mid - start))
            else:
                s[i] = s[i] * (1.0 + self.options.factor)

        # Write the adjusted series back, in case s is a copy rather than a view of the column
        df[1] = s

        # Finally print the result
        printer = pprint.PrettyPrinter(2)
        printer.pprint(df.values)

ProfilerReport().run()