Profiler Reporting via FlyScript

This document describes the basics of a script that runs a single query against a Profiler via the FlyScript SDK.  If you don't have the SDK installed yet, see Getting started with SteelScript for Python, which walks through the installation instructions.

 

The Functional Specification

 

Like good engineers, we first need a functional specification: what are we trying to accomplish?

 

  • Report on the top 20 client hosts that have been connecting to a set of servers
  • The set of servers should be a command line argument to the script
  • Time range of the last 15 minutes
  • Data columns of interest:
    • Client IP Address
    • Total Bytes
    • Total Packets
    • Network RTT
    • Response Time
    • % TCP Retransmissions
    • % TCP Resets
  • Sort by % TCP Retransmissions

 

Importing necessary modules

 

The first several lines of any script simply import the necessary modules that we'll use later throughout the script:

 

from rvbd.profiler.app import ProfilerApp
from rvbd.profiler.report import TrafficSummaryReport
from rvbd.profiler.filters import TimeFilter, TrafficFilter
from rvbd.common.utils import Formatter
import pprint
import optparse


 

There is not much magic here; it's pretty standard Python.  How do you know when to add new "import" lines?  If you're using the copy/paste/modify pattern and grab interesting-looking code from another script, watch for "NameError" exceptions when you run your script.  A NameError is your indication that you need to go back to the script you copied the code from and copy over the appropriate import lines as well.
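
As a minimal illustration of that failure mode (plain Python, nothing SDK-specific), the sketch below calls a function that uses 'pprint' without importing it and catches the resulting NameError.  The function name 'dump_rows' is hypothetical and exists only for this example.

```python
def dump_rows(rows):
    # 'pprint' is deliberately NOT imported in this sketch, so this line
    # raises NameError as soon as the function is called.
    pprint.pprint(rows)

try:
    dump_rows([["10.99.15.52", 3169308]])
    error_message = None
except NameError as exc:
    # The message names the missing module, which tells you which import to add.
    error_message = str(exc)

print(error_message)
```

Adding 'import pprint' at the top of the file is the fix; the name mentioned in the exception message is the one to look for among the import lines of the script you copied from.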

 

Basic structure of the app

 

The skeleton of the application is shown below.  For almost any script that connects to a single Profiler, this is a good model to follow:

 

class TopClientsApp(ProfilerApp):
    def main(self):
        # Code to run goes here!
        pass

TopClientsApp().run()



 

Lines 1-4 define our TopClientsApp class, based on ProfilerApp.  The 'main' method is where the body of your script goes.  Line 6 creates an instance of this class (the application) and runs it.

 

Using ProfilerApp gets you a number of features right out of the box:

  • Defines the command line arguments that are needed for any Profiler application, as well as some common optional ones:
    • hostname
    • username / password
    • logging
  • Establishes the connection to the target profiler
  • Sets up logging

 

For example, if you simply put 'pass' as the body of the 'main' function and run with -h, you'll see the list of command line arguments that are included simply by using ProfilerApp.  Save the following as 'top_clients.py':

 

from rvbd.profiler.app import ProfilerApp
from rvbd.profiler.report import TrafficSummaryReport
from rvbd.profiler.filters import TimeFilter, TrafficFilter
from rvbd.common.utils import Formatter
import pprint
import optparse

class TopClientsApp(ProfilerApp):

    def main(self):
        # Main code goes here
        pass

TopClientsApp().run()



 

Then run as follows:

$ python top_clients.py -h

Usage: top_clients.py PROFILER_HOSTNAME <options>

Options:
  -h, --help            show this help message and exit

  Connection Parameters:
    -P PORT, --port=PORT
                        connect on this port
    --force-ssl         force ssl to be used
    --force-no-ssl      force ssl to not be used
    -u USERNAME, --username=USERNAME
                        username to connect with
    -p PASSWORD, --password=PASSWORD
                        password to connect with
    --oauth=OAUTH       OAuth Access Code, in place of username/password
    -A API_VERSION, --api_version=API_VERSION
                        api version to use unconditionally
    --httplib-debuglevel=HTTPLIB_DEBUGLEVEL
                        set httplib debug
    --debug-msg-body=DEBUG_MSG_BODY
                        number of bytes of message body to log

  Logging Parameters:
    --loglevel=LOGLEVEL
                        log level
    --logfile=LOGFILE   log file

 

Adding a command line argument for 'servers'

 

Let's modify this script to add a command line argument for setting the servers of interest.  Add the following code right after the 'class' line:

 

    def add_options(self, parser):
        parser.add_option('--servers', default=None, dest='servers', help='Servers to query against')


 

If you now run with the '-h' option as above, you should see an additional optional argument listed: '--servers=SERVERS'.
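
To see what 'add_option' does with the flag, the sketch below exercises the same call on a stand-alone 'optparse.OptionParser'.  It assumes that the parser handed to 'add_options' behaves like a standard optparse parser, which the 'add_option' call and the help output above suggest but which is not confirmed here.

```python
import optparse

# Stand-alone parser, standing in for the one passed to add_options.
parser = optparse.OptionParser()
parser.add_option('--servers', default=None, dest='servers',
                  help='Servers to query against')

# Parse an explicit argument list instead of sys.argv so the sketch is repeatable.
options, args = parser.parse_args(['--servers', '10.100.201.0/24'])
print(options.servers)

# When the flag is omitted, the default (None) is what the script sees.
options_missing, _ = parser.parse_args([])
print(options_missing.servers)
```

The 'dest' value determines the attribute name, so the parsed value is read back as 'options.servers'.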

 

Creating a Profiler Report

 

The first step we need to accomplish is to create a report object.  Replace the 'pass' in the 'main' method with the following:

 

    def main(self):
        # Create a traffic summary report 
        report = TrafficSummaryReport(self.profiler)


 

This creates an instance of a TrafficSummaryReport.  Note that the report is tied to 'self.profiler', which is an instance of a profiler object.  By the time the main() routine is called, ProfilerApp has already parsed the command line arguments and established a connection to the Profiler specified by the user on the command line.  This connection is saved in the 'self.profiler' variable (note that 'self' was passed in to 'main').

 

There are a few types of reports available:

  • TrafficSummaryReport - this is the basic report that generates a table of data (similar to the summary table in a traffic report run via the web UI).
  • TrafficOverallTimeSeriesReport - generates a table indexed by time, similar to the time series graph in reports
  • TrafficFlowListReport - generates a table of flow list data

 

See FlyScript Documentation for more information on each of these types of reports, and the list of arguments.  There are also examples of each type in the 'examples/profiler' directory of the FlyScript SDK.

 

Running the report

 

The next step is to run the report.  This is where we must add in the necessary options for time range, filter, and columns:

 

    def main(self):
        # Create a traffic summary report
        report = TrafficSummaryReport(self.profiler)


        # shortcuts to the key and value column sets
        key = self.profiler.columns.key
        value = self.profiler.columns.value

        # Run the report
        report.run(
            groupby = self.profiler.groupbys.host,
            columns = [key.host_ip,
                       value.total_bytes,
                       value.total_pkts,
                       value.network_rtt,
                       value.response_time,
                       value.total_bytes_rtx_pct,
                       value.total_conns_rsts_pct
                       ],
            sort_col = value.total_bytes_rtx_pct,
            timefilter = TimeFilter.parse_range("last 15 m"),
            trafficexpr = TrafficFilter("")
            )


 

The run() method of the report object takes the following arguments of interest:

  • groupby -- this defines how the rows are grouped.  It is equivalent to the "Report By" drop down list when running traffic reports via the web UI
  • columns -- this is an array of key and value columns that will be queried
  • sort_col -- the column (from columns) that should be used for sorting the output
  • timefilter -- the time range of the report
  • trafficexpr -- a filter expression that uses the same syntax as the Traffic Expression text box when running advanced reports via the UI

The valid values for the 'groupby' and 'columns' arguments can be determined by using the 'utilities/profiler_columns.py' script:

$ python utilities/profiler_columns.py <profiler> -u <user> -p <pass> --list-groupbys

GroupBy                      Id    
------------------------------------
application                  app   
application_port             apt   
application_protoport_qos    apq   
device                       dev   
group_pair_protoport         gpr   
host                         hos   
host_and_vtep_pair           vhp   
host_group                   gro   
host_group_pair              gpp   
host_pair                    hop   
host_pair_protoport          hpr   
interface                    ifc   
interface_qos                ifq   
ip_mac                       ipm   
ip_mac_pair                  ipp   
ip_mac_pair_protoport        ipr   
peer                         per   
peer_group                   pgp   
peer_ip_mac                  pip   
port                         por   
port_group                   pgr   
protocol                     pro   
qos                          qos   
segment                      seg   
time                         tim   
time_host_user               thu   
total                        mzt   
vtep_pair                    vpa   
vxlan                        vxl  

 

Once you have chosen a groupby, you can use the same tool to list the columns associated with that groupby:

$ python utilities/profiler_columns.py <profiler> -u <user> -p <pass> -g hos -c hos -r traffic_summary | head -20

Key Columns        Label                  ID    
-------------------------------------------------
group_name         Group                  23    
host_dhcp_dns      Host                   72    
host_dns           Host                   6     
host_ip            Host IP                5     
host_mac           MAC                    7     
host_switch        Host Switch IP Info    8     
host_switch_dns    Host Switch Info       120   
idx                Index                  1     


Value Columns                    Label                               ID    
----------------------------------------------------------------------------
avg_bytes                        Avg Bytes/s                         33    
avg_bytes_app                    Avg App Bytes/s                     504   
avg_bytes_app_persecconn         Avg App Bytes/s per Conn            578   
avg_bytes_persecconn             Avg Bytes/s per Conn                556   
avg_bytes_rtx                    Avg Retrans Bytes/s                 391

... 

 

Pick the desired "key" and "value" columns for use in the columns and sort_col arguments.  (Notice the '-c hos' and '-r traffic_summary' arguments -- these define 'host' centricity and 'traffic_summary' as the realm, necessary arguments for picking the right column set).

 

The timefilter argument expects a TimeFilter() object.  Calling 'parse_range' as above is a simple way to select the last 15 minutes.

 

The trafficexpr argument similarly takes a TrafficFilter() object.  This is where we need to plug in our "--servers" command line argument.  All command line arguments are available via the 'self.options' variable:

 

        trafficexpr = TrafficFilter("srv peer " + self.options.servers)
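
Because '--servers' defaults to None, the string concatenation above raises a TypeError when the flag is omitted.  One way to guard against that is sketched below.  The helper name 'build_server_expr' is hypothetical, and the "srv peer " prefix is simply copied from the line above.

```python
def build_server_expr(servers):
    """Return the traffic expression string for the --servers value.

    'servers' is whatever was passed to --servers (None when omitted).
    The "srv peer " prefix is copied from the tutorial text.
    """
    if not servers or not servers.strip():
        raise ValueError("--servers is required, e.g. --servers 10.100.201.0/24")
    return "srv peer " + servers.strip()

print(build_server_expr("10.100.201.0/24"))
```

In the script, the result would then be wrapped as TrafficFilter(build_server_expr(self.options.servers)).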

 

Collecting Report Output

 

Finally, let's collect the report output and print it to the screen:

 

        # Retrieve and print data
        data = report.get_data()
        printer = pprint.PrettyPrinter(2)
        printer.pprint(data[:20])

 

The 'report.get_data()' call returns the data for all requested columns as a 2-dimensional array: a list of rows, each row a list of values in the same order as the requested columns.  The 'printer' is just a helper object that dumps it to the screen in a (somewhat) readable fashion.
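
If the pprint output is too raw, the rows can be laid out as an aligned table with plain Python.  The sketch below is one possible approach, not part of the SDK.  The two sample rows are taken from the example output at the end of this document, and the header labels and the 'format_table' helper are made up for illustration.

```python
# Two sample rows in the shape returned for the columns requested above.
rows = [
    ['10.99.15.52', 3169308, 3395, 0.165, 0.171, 38.618, 0.0],
    ['10.99.15.54', 5780357, 6223, 0.163, 0.743, 37.653, 0.0],
]
# Illustrative labels, one per requested column.
headers = ['Client IP', 'Bytes', 'Packets', 'Net RTT',
           'Resp Time', '% Retrans', '% Resets']

def format_table(headers, rows):
    """Return the rows as a list of aligned text lines, header first."""
    table = [headers] + [[str(cell) for cell in row] for row in rows]
    # Each column is as wide as its longest cell.
    widths = [max(len(line[i]) for line in table) for i in range(len(headers))]
    return ['  '.join(cell.ljust(width) for cell, width in zip(line, widths)).rstrip()
            for line in table]

lines = format_table(headers, rows)
for line in lines:
    print(line)
```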

 

Putting it all together, we have the following script 'top_clients.py':

 

from rvbd.profiler.app import ProfilerApp
from rvbd.profiler.report import TrafficSummaryReport
from rvbd.profiler.filters import TimeFilter, TrafficFilter
from rvbd.common.utils import Formatter
import pprint
import optparse

class TopClientsApp(ProfilerApp):

    def add_options(self, parser):
        parser.add_option('--servers', default=None, dest='servers', help='Servers to query against')

    def main(self):
        # Create a traffic summary report
        report = TrafficSummaryReport(self.profiler)

        # shortcuts to the key and value column sets
        key = self.profiler.columns.key
        value = self.profiler.columns.value

        # Run the report
        report.run(
            groupby = self.profiler.groupbys.host,
            columns = [key.host_ip,
                       value.total_bytes,
                       value.total_pkts,
                       value.network_rtt,
                       value.response_time,
                       value.total_bytes_rtx_pct,
                       value.total_conns_rsts_pct
                       ],
            sort_col = value.total_bytes_rtx_pct,
            timefilter = TimeFilter.parse_range("last 15 m"),
            trafficexpr = TrafficFilter("srv peer " + self.options.servers)
            )

        # Retrieve and print data
        data = report.get_data()
        printer = pprint.PrettyPrinter(2)
        printer.pprint(data[:20])

TopClientsApp().run()

 

When run, this generates the following output:

$ python top_clients.py <profiler> -u <user> -p <pass> --servers 10.100.201.0/24

[ ['10.99.15.52', 3169308, 3395, 0.165, 0.171, 38.618, 0.0],
  ['10.99.15.54', 5780357, 6223, 0.163, 0.743, 37.653, 0.0],
  ['10.99.15.51', 4672410, 4424, 0.165, 0.589, 36.947, 0.0],
  ['10.99.15.12', 5015508, 5225, 0.164, 0.334, 36.603, 0.0],
  ['10.99.15.42', 5064120, 5968, 0.164, 0.37, 36.517, 0.0],
  ['10.99.15.60', 4848440, 4883, 0.164, 0.374, 36.473, 0.0],
  ['10.99.15.29', 7583941, 7010, 0.166, 0.519, 36.357, 0.0],
  ['10.99.15.61', 4140881, 3726, 0.166, 0.379, 36.276, 0.0],
  ['10.99.15.27', 6565867, 6948, 0.167, 0.309, 36.172, 2.85714285714],
  ['10.99.15.39', 6039102, 7517, 0.167, 0.32, 35.631, 0.0],
  ['10.99.15.48', 5621391, 5331, 0.166, 0.607, 35.514, 0.0],
  ['10.99.15.41', 6911179, 6972, 0.164, 0.274, 35.401, 4.44444444444],
  ['10.99.15.57', 3721397, 3508, 0.167, 0.173, 35.205, 0.0],
  ['10.99.15.36', 5023039, 5743, 0.167, 0.172, 35.032, 0.0],
  ['10.99.15.10', 5044344, 5741, 0.165, 0.555, 34.142, 0.0],
  ['10.99.15.50', 4052030, 4304, 0.169, 0.682, 33.795, 0.0],
  ['10.99.15.18', 5369872, 5916, 0.165, 0.318, 33.298, 3.125],
  ['10.99.15.67', 3636975, 4034, 0.167, 0.172, 32.99, 0.0],
  ['10.99.15.24', 4172420, 4702, 0.167, 0.381, 32.713, 4.34782608696],
  ['10.99.15.59', 4540512, 4390, 0.165, 0.17, 32.64, 0.0]]

 

You should now be well on your way to running any type of Profiler report!