Search your desktop (or wherever) for old packet captures with Heartbleed OpenSSL exploits (Python script)

A question being asked repeatedly by the technology community is whether or not the Heartbleed OpenSSL exploit has been in use "in the wild" for an extended period of time, prior to the public disclosure last week. The bug was present in OpenSSL for over two years, but so far it is not known conclusively if attackers have been taking advantage of Heartbleed during some or most of that multi-year period.

 

One observation: there are a lot of packet captures sitting around on people's laptops, desktops, or archived on file servers, etc. For many people who work in networking, IT, or in the tech industry in general, you end up with a pile of packet captures stretching back many years, spread across many machines. This is in part because packets are so useful for solving problems, and in part because it's human nature not to always delete something when you are done with it.


Each of those packet captures sitting on someone's machine is a small snapshot in time of exactly what was happening at the moment the capture was taken.

 

The question, though, is this: can we make it more convenient for people to check the old packet captures they have "lying around" to see if they happened to accidentally capture evidence of an "old" Heartbleed exploit months or even years ago?

 

This post includes a general purpose Python script that scours a set of user-specified directories looking for any packet capture file it can find. For every packet capture file it finds, it reports on possible Heartbleed exploits found in the packet capture.

 

This is helpful both for looking at very recent captures, as well as possibly finding old exploits. This post is not specific to Riverbed products, aside from Riverbed products being a possible source of packets to examine with this new script.

 

Riverbed wrote a blog entry last week describing how to use a general purpose BPF packet filter to check for Heartbleed exploits in live traffic or in stored packets. BPF is broadly applicable for non-Riverbed customers as well as Riverbed customers, and BPF packet filters are understood by many operating systems and packet processing software. (For Riverbed customers, my colleague Chris White then wrote a nice script that adapted that general-purpose BPF filter from last week to be used with the Riverbed Shark REST API, to make it very easy to check for possible captured Heartbleed exploits that might be in the many terabytes of rolling packet data on a Riverbed Shark appliance.)

 

The new script described below improves on the BPF packet filter from last week in a few different ways, including:

  1. Automatically hunt for packet capture files by searching directories for packet captures located somewhere on your disk.
  2. Automatically correlate TLS heartbeat requests with TLS heartbeat responses on the same TCP connection in order to further reduce possible false positives.

 

An extremely quick overview of how Heartbleed works (which is important if you are going to look at heartbeat packet traces): this is the best visual representation of how the bug enables an attacker to "lie" about the size of a string in a heartbeat echo request in order to remotely download "random" memory (or, more accurately, adjacent memory) in the server. The server's memory contents transmitted to the attacker unfortunately can contain things like passwords, usernames, and even the "crown jewels" of the server's secret key, which can then be used for further exploits.
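To make the "lying about the size" idea concrete, here is a toy model in Python. It is only a sketch: the memory contents, the function names, and the flat `bytearray` standing in for server memory are all made up for illustration, and real OpenSSL memory layout is far more complicated.

```python
# Toy model of the Heartbleed over-read. Everything here is hypothetical:
# SERVER_MEMORY is a stand-in for process memory, where the client's 4-byte
# heartbeat payload ("bird") happens to sit next to unrelated secrets.
SERVER_MEMORY = bytearray(b"bird|user=alice|password=hunter2")


def vulnerable_echo(payload_offset, claimed_length):
    """Echo claimed_length bytes, trusting the length field sent by the client."""
    return bytes(SERVER_MEMORY[payload_offset:payload_offset + claimed_length])


def patched_echo(payload_offset, actual_length, claimed_length):
    """Drop the request when the claimed length exceeds what was really sent."""
    if claimed_length > actual_length:
        return None  # silently discard the malformed heartbeat
    return bytes(SERVER_MEMORY[payload_offset:payload_offset + claimed_length])


honest_reply = vulnerable_echo(0, 4)    # client sent 4 bytes and says so
leaky_reply = vulnerable_echo(0, 30)    # client sent 4 bytes but claims 30
```

The leaky reply is much larger than the request that triggered it, and that size mismatch is exactly what the script in this post looks for on the wire.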

 

The script below looks for each possible SSL/TLS heartbeat request and compares its size with the size of the next TLS heartbeat response that it finds on the same TCP connection. If the heartbeat response arrives within 1 second of the request (a configurable time threshold that helps to further reduce false positives), and if the heartbeat response payload is more than 10 bytes (also configurable) larger than the corresponding heartbeat request payload, then the script flags that as a possible Heartbleed exploit (and prints out details so that you can validate in Wireshark or a similar tool). The script ends up looking at 6 bytes across two close-in-time packets prior to reporting a possible exploit, which reduces false positives (and avoids the mistakes made last week by some IDS products and by other people who were looking for the wrong byte patterns).
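Stripped of all the parsing and bookkeeping, the decision rule described above boils down to something like the following sketch. The two constants mirror the defaults in the script; the function name `looks_like_exploit` is just for illustration and does not appear in the script itself.

```python
SIZE_INCREASE_THRESHOLD = 10   # bytes; same default as the script
TIME_THRESHOLD = 1.0           # seconds; same default as the script


def looks_like_exploit(request_size, response_size, time_delta):
    """Return True if a request/response pair on one TCP connection is suspicious.

    request_size / response_size are TCP payload sizes in bytes, and
    time_delta is (response timestamp - request timestamp) in seconds.
    """
    if time_delta < 0 or time_delta >= TIME_THRESHOLD:
        return False  # out of order, or too far apart to be a plausible pair
    return (response_size - request_size) > SIZE_INCREASE_THRESHOLD


# The two pairs reported in the sample output later in this post:
first_pair = looks_like_exploit(8, 1448, 0.079385)
second_pair = looks_like_exploit(65, 165, 0.076072)
```

A legitimate heartbeat response echoes the request payload, so the two sizes should be about equal; anything well beyond that is worth a look in Wireshark.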

 

This script could be improved, and feedback is welcome.

 

To help make it clear how this new script works, we'll first give a quick overview of how to invoke the script, then show some sample output from the script identifying two Heartbleed exploits, then show the corresponding Heartbleed packets in Wireshark, and then finally share the actual code at the bottom.

 

Running the Script

 

In this example, we specify two input directories (in this case, my Downloads folder as well as C:\temp, which are both places I have a pile of pcap files mixed in with other files).  The script uses tcpdump on UNIX, and on Windows requires you to have WinDump.exe downloaded from winpcap.org. When invoking the script, normally you will pass the path to windump.exe (Windows) or tcpdump (UNIX). I have also specified in the example below an output logging directory (that is, any directory that has write-permissions).

 

python heartbleed_bpf_checker_v1.py --windump C:\WinDump.exe --dir C:\users\pmalloy\Downloads --dir C:\temp --log-dir C:\temp

The execution starts by printing out an overview of the script:

 

---- OVERVIEW ----

Heartbleed checker. Given a directory, or set of directories, look for pcap

files and then identify possible Heartbleed exploits.

 

Note: a possible Heartbleed exploit found by this script should be manually

validated in Wireshark or similar tools.

 

Usage:

   Windows

     python heartbleed_bpf_checker_v1.py --windump C:\windump.exe

   UNIX:

     python heartbleed_bpf_checker_v1.py --tcpdump /usr/sbin/tcpdump

 

Run with --help for more info.

Then it describes some key aspects of what it is about to do:

---- INPUT ----

DIRECTORIES: This will recursively examine files in the following directories:

      C:\temp

      C:\users\pmalloy\Downloads

 

FILE EXTENSIONS: This will open & examine files with the following extensions:

      .cap

      .pcap

      .appcapture

      .dmp

      .pkt

 

server port:        443

windump.exe:        C:\temp\WinDump.exe

logging directory:  C:\temp

 

Continue? [y/n] y

It then asks you to confirm that you want to proceed.  It then starts visiting the specified directories (and all their sub-directories) looking for pcap files and then checking each pcap file for possible Heartbleed exploits.
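The directory-visiting step can be sketched on its own as follows. This is a simplified illustration rather than the script's actual code path: the helper name `find_pcap_files` is made up, and it only recognizes the two classic libpcap magic numbers that the script checks for.

```python
import os

# Classic libpcap magic numbers (big-endian and little-endian byte order).
PCAP_MAGICS = (b"\xa1\xb2\xc3\xd4", b"\xd4\xc3\xb2\xa1")


def find_pcap_files(directory):
    """Yield every file under directory whose first four bytes are a pcap magic number."""
    for dir_path, _dir_names, file_names in os.walk(directory):
        for file_name in file_names:
            path = os.path.join(dir_path, file_name)
            try:
                with open(path, "rb") as f:
                    if f.read(4) in PCAP_MAGICS:
                        yield path
            except (IOError, OSError):
                continue  # unreadable file; skip it and keep walking
```

Checking the magic number rather than trusting the file extension is what makes an option like --force-all-file-extensions practical: non-pcap files are rejected after reading only four bytes.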

 

Here's an example where two Heartbleed exploits were found. (Note: it flags likely successful exploits, rather than just attempts).

[PROGRESS] Starting to look in directory [C:\temp] and its sub-directories.

[EXPLOIT?] Found pcap file: C:\temp\riverbed_am_heartbleed_encrypted.pcap

[EXPLOIT?] Found apparent heartbeat echo request size [8] followed [0.079385] sec later

[EXPLOIT?] by apparent heartbeat echo response size [1448].

[EXPLOIT?]         Server IP.port:      54.184.217.42.443

[EXPLOIT?]         Client IP.port:      192.168.0.54.52220

[EXPLOIT?]         Heartbeat request:   2014-04-10 15:03:48.276788 TCP payload size: 8

[EXPLOIT?]         Heartbeat response:  2014-04-10 15:03:48.356173 TCP payload size: 1448

[EXPLOIT?] Found pcap file: C:\temp\riverbed_am_heartbleed_plaintext.pcap

[EXPLOIT?] Found apparent heartbeat echo request size [65] followed [0.076072] sec later

[EXPLOIT?] by apparent heartbeat echo response size [165].

[EXPLOIT?]         Server IP.port:      54.184.217.42.443

[EXPLOIT?]         Client IP.port:      192.168.0.54.52223

[EXPLOIT?]         Heartbeat request:   2014-04-10 15:05:01.126354 TCP payload size: 65

[EXPLOIT?]         Heartbeat response:  2014-04-10 15:05:01.202426 TCP payload size: 165

[EXPLOIT?] Please validate these possible Heartbleed exploit(s) in Wireshark or similar tool.

[PROGRESS] Starting to look in directory [C:\users\pmalloy\Downloads] and its sub-directories.

[PROGRESS] Done!

Resulting Packets in Wireshark - An Example Heartbleed Exploit

 

Let's look at the second example in more detail, where we can see in the script output above the exact exploit packet time stamps, along with the client/server IP and port numbers.

 

For the second example found above, the request packet that was identified by the script is displayed in Wireshark (below). Note that the TCP LEN field (along the right-hand side of the image) is 65 bytes (that is, 65 bytes of payload above TCP), and the TLS variable record length is 60 bytes (highlighted on the left-hand side); the 5-byte difference is the TLS record header. This is an example of an encrypted heartbeat request, and there is yet another record length field inside the "Encrypted Heartbeat" message that is not visible here because it is encrypted. (And why so many length fields?  To some degree, that's the heart of the Heartbleed bug...).  One thing to be mindful of when looking at other examples is that many of the toy / test exploits created immediately after the Heartbleed public disclosure last week used non-encrypted heartbeats (primarily because it was easier), but it seems reasonable to think that a "real" attacker would more often use encrypted heartbeats.

[Screenshot: Wireshark view of the encrypted heartbeat request, 65 bytes of TCP payload]

 

And now in the screenshot below, we can see the next packet is the corresponding heartbeat response packet that was identified by the script. Note that the TCP LEN field along the right is 165 bytes (that is, 165 bytes of TCP payload) and the TLS variable record length is 160 bytes on the left side.  Why would an 'echo' heartbeat response be so much larger than the request it is echoing? Well, this was of course an actual exploit, where we sent a malformed request to cause an extra 100 bytes of "random" server memory to be returned in the response.  This exploit was deliberately performed for experimental reasons.  In the other example exploit flagged by the script above, the request vs. response size difference is much more dramatic: 8 bytes of TCP payload in the 'echo' request, and 1448 bytes of TCP payload in the first 'echo' response segment.

[Screenshot: Wireshark view of the encrypted heartbeat response, 165 bytes of TCP payload]
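If you want to double-check the length fields that Wireshark shows, the TLS record header is easy to decode by hand: one byte of content type (0x18 for heartbeat), two bytes of protocol version, and a two-byte big-endian record length. The sketch below assumes you already have the raw TCP payload bytes; the helper name and the sample bytes are hypothetical (in particular, the TLS 1.1 version bytes are an arbitrary choice, not taken from the capture above).

```python
import struct

TLS_HEARTBEAT_CONTENT_TYPE = 0x18  # content type 24 = heartbeat


def parse_tls_record_header(tcp_payload):
    """Return (content_type, major, minor, record_length) from the first 5 bytes."""
    if len(tcp_payload) < 5:
        raise ValueError("need at least 5 bytes for a TLS record header")
    return struct.unpack("!BBBH", tcp_payload[:5])


# Hypothetical 65-byte request: 5-byte header announcing a 60-byte record.
sample_payload = b"\x18\x03\x02\x00\x3c" + b"\x00" * 60
content_type, major, minor, record_length = parse_tls_record_header(sample_payload)
```

The first three of these bytes are the ones that the BPF filter in the script tests, and the header size explains why the TCP LEN values in the screenshots are 5 bytes larger than the TLS record lengths.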

 

Script Command-Line Options

 

For completeness, here are the command-line arguments currently supported by the script:

% python heartbleed_bpf_checker_v1.py -h

 

usage: heartbleed_bpf_checker_v1.py [-h] [--debug] [--test] [-p SERVER_PORT]

                                    [-d DIR] [-e FILE_EXTENSION_LIST]

                                    [--force-all-file-extensions]

                                    [--log-dir LOG_DIR] [--windump WINDUMP]

optional arguments:

  -h, --help            show this help message and exit

  --debug               Outputs debugging info while running.

  --test                Executes some unit tests.

  -p SERVER_PORT, --server-port SERVER_PORT

                        Server port (e.g., 443 for HTTPS, etc.)

  -d DIR, --dir DIR     Directory to recursively descend. Defaults to working

                        directory. Can be specified multiple times, e.g., -d

                        my_directory1 -d my_directory2

  -e FILE_EXTENSION_LIST, --file-extension FILE_EXTENSION_LIST

                        Only open and examine specific file extensions (e.g.,

                        "-e .pcap"). Can be specified multiple times, e.g.,

                        "-e .pcap -e .appcapture". Defaults to supporting:

                        .cap, .pcap, .appcapture, .dmp, .pkt.

  --force-all-file-extensions

                        Open and examine ALL files found, regardless of file

                        extensions. Stops processing a file if the file does

                        not start with a pcap magic number.

  --log-dir LOG_DIR     Location of log files. Must have write access.

                        Defaults to working directory.

  --windump WINDUMP     Location of windump. Note you must download from

                        winpcap.org. This arg defaults to C:\Program

                        Files\Wireshark\windump.exe

  --tcpdump TCPDUMP     Location of tcpdump. This arg defaults to

                        /usr/sbin/tcpdump, but can vary by platform.

Source Code

 

Finally, here is the source code for the script itself:

 

# pmalloy: 3/13/2014: relatively quick and dirty Heartbleed checker, using tcpdump or windump to parse the binary packets.
# For more info, see: http://www.riverbed.com/blogs/Retroactively-detecting-a-prior-Heartbleed-exploitation-from-stored-packets-using-a-BPF-expression.html

overview_string = '''Heartbleed checker. Given a directory, or set of directories, look for pcap files, and
then identify possible Heartbleed exploits (by default on port 443).

Note: any possible Heartbleed exploit found by this should be manually validated in Wireshark or similar tools.

Usage:
   Windows
     python heartbleed_bpf_checker_v1.py --windump C:\windump.exe
   UNIX:
     python heartbleed_bpf_checker_v1.py --tcpdump `which tcpdump`

Run with --help for more info.
'''

import sys
if sys.version_info[0:2] != (2, 7):
    raise Exception('This program requires python version 2.7')
import os, subprocess, re, datetime, argparse
from collections import namedtuple

# ----- constants ------

SIZE_INCREASE_THRESHOLD = 10 # flag a heartbeat response that is greater than N bytes larger than the prior heartbeat request (as long as within time threshold)
TIME_THRESHOLD = 1  # 1.0 seconds. This is the threshold used to identify a heartbeat response on the same tcp connection.

# BPF filter for requests & responses (IPv4, non VLAN; we then construct the VLAN-friendly version below)
requests_filter_string = 'tcp and (dst port {server_port} and (tcp[((tcp[12] & 0xF0) >> 4 ) * 4] = 0x18) and (tcp[((tcp[12] & 0xF0) >> 4 ) * 4 + 1] = 0x03) and (tcp[((tcp[12] & 0xF0) >> 4 ) * 4 + 2] < 0x04))'
response_filter_string = 'tcp and (src port {server_port} and (tcp[((tcp[12] & 0xF0) >> 4 ) * 4] = 0x18) and (tcp[((tcp[12] & 0xF0) >> 4 ) * 4 + 1] = 0x03) and (tcp[((tcp[12] & 0xF0) >> 4 ) * 4 + 2] < 0x04) and ((ip[2:2] - 4 * (ip[0] & 0x0F)  - 4 * ((tcp[12] & 0xF0) >> 4) > 69)))'
# BPF filter for requests or responses
request_or_responses_filter_string = '(%s) or (%s)' % (requests_filter_string, response_filter_string)
# VLAN-friendly version
vlan_friendly_filter_string = '((not ether proto 0x8100) and (%s)) or (vlan and (%s))' % (request_or_responses_filter_string, request_or_responses_filter_string)


# tcpdump magic numbers (first four bytes of tcpdump file header)
MAGIC1 = '\xA1\xB2\xC3\xD4'
MAGIC2 = '\xD4\xC3\xB2\xA1'
# let's make sure we didn't fat finger our magic numbers.
assert len(MAGIC2) == 4 and len(MAGIC1) == 4 and MAGIC1 == MAGIC2[::-1]


# tcpdump arguments to use.
tcpdump_args = ['-tt', '-q', '-n', '-r'] # will append file name after -r
tcpdump_help_arg = ['-h']


# default set of pcap file extensions if the user doesn't specify any
default_file_extensions = ['.cap', '.pcap', '.appcapture', '.dmp', '.pkt']


# ----- start argument parsing --------
parser = argparse.ArgumentParser()
parser.add_argument('--debug', action='store_true', default=False,
                    dest='debug',
                    help='Outputs debugging info while running.')
parser.add_argument('--test', action='store_true', default=False,
                    dest='test',
                    help='Executes some unit tests.')
parser.add_argument('-p', '--server-port', action='store', default=443, type=int,
                    dest='server_port',
                    help=r'Server port (e.g., 443 for HTTPS, etc.)')
parser.add_argument('-d', '--dir', action='append', dest='dir_list',
                    default=[], metavar='DIR',
                    help='Directory to recursively descend. Defaults to working directory. Can be specified multiple times, e.g., -d my_directory1 -d my_directory2')
parser.add_argument('-e', '--file-extension', action='append',
                    default=[], dest='file_extension_list',
                    help='Only open and examine specific file extensions (e.g., "-e .pcap"). Can be specified multiple times, e.g., "-e .pcap -e .appcapture". Defaults to supporting: %s.'
                           % ', '.join(default_file_extensions))
parser.add_argument('--force-all-file-extensions', action='store_true',
                    default=False, dest='force_all_file_extensions',
                    help='Open and examine ALL files found, regardless of file extensions. Stops processing a file if the file does not start with a pcap magic number.')
parser.add_argument('--log-dir', action='store', default='.',
                    dest='log_dir',
                    help=r'Location of log files. Must have write access. Defaults to working directory.')


if os.name == 'nt':
    parser.add_argument('--windump', action='store', default=r'C:\Program Files\Wireshark\windump.exe',
                        dest='tcpdump', metavar='WINDUMP',
                        help=r'Location of windump. Note you must download from winpcap.org. This arg defaults to C:\Program Files\Wireshark\windump.exe')
else:
    parser.add_argument('--tcpdump', action='store', default='/usr/sbin/tcpdump',
                        dest='tcpdump',
                        help=r'Location of tcpdump. This arg defaults to /usr/sbin/tcpdump, but can vary by platform.')
# ----- end argument parsing --------


# Some quick module-level variables...
file_counter = {} # relative file name -> possible exploit count
no_trailing_newline = True  # track for our simple 'progress bar' whether or not there is a trailing new line.


# --------- logging utils ---------


# Logging intent is to only log possible exploits plus per file events by default. (Per packet log entries should only be output if --debug is set).
# (This started with "Let me do something very quick / simple for logging"; in retrospect, should have used a logging package... ;-)


def output_possible_exploit(full_file_path, connection, prior_heartbeat_request, heartbeat, time_delta):
    if full_file_path not in file_counter:
        file_counter[full_file_path] = 1 # one possible exploit so far
        exploit_log('pcap file: %s' % os.path.abspath(full_file_path))
        exploit_log('Please validate these possible Heartbleed exploit(s) in Wireshark or similar tool.')
    else:
        # already seen a possible exploit in this file; increment counter
        file_counter[full_file_path] += 1
    exploit_log('Found apparent heartbeat echo request size [%d] followed [%f] sec later by apparent heartbeat echo response size [%d].' %
            (prior_heartbeat_request.payload_size, time_delta, heartbeat.payload_size))
    exploit_log('        Server IP.port:      %s ' % connection.server_ip_port)
    exploit_log('        Client IP.port:      %s ' % connection.client_ip_port)
    exploit_log('        Heartbeat request:   %s TCP payload size: %d' % (format_timestamp(prior_heartbeat_request.timestamp), prior_heartbeat_request.payload_size))
    exploit_log('        Heartbeat response:  %s TCP payload size: %d' % (format_timestamp(heartbeat.timestamp), heartbeat.payload_size))


def exploit_log(msg):
    # stdout and exploit_log
    print_trailing_new_line_if_needed()
    print '[EXPLOIT?] %s' % msg
    # this is the good stuff, so put in 'progress' log and 'possible exploits' log (E.g., in case someone is tailing progress log, etc.)
    heartbleed_bpf_checker_possible_exploits_log_file.write('[EXPLOIT?] %s\n' % msg)
    heartbleed_bpf_checker_progress_log_file.write('[EXPLOIT?] %s\n' % msg)


def debug_log(msg):
    if DEBUG:
        print_trailing_new_line_if_needed()
        print '[DEBUG] %s' % msg
    progress_log('[DEBUG] %s' % msg)


def progress_log(msg, also_to_stdout=False):
    if also_to_stdout:
        print_trailing_new_line_if_needed()
        print '[PROGRESS] %s' % msg
    heartbleed_bpf_checker_progress_log_file.write('[PROGRESS] %s\n' % msg)


def warn_log(msg):
    if DEBUG:
        print_trailing_new_line_if_needed()
        print '[WARN] %s' % msg
    progress_log('[WARN] %s' % msg)


def print_trailing_new_line_if_needed():
    '''because of quick and dirty 'progress bar', we might need to put out a newline.'''
    global no_trailing_newline
    if no_trailing_newline:
        print ''
        no_trailing_newline = False


def format_timestamp(timestamp):
    ''' handles strings or float timestamps; formats in the machine's local time zone (the examples below assume US Eastern time)
    >>> format_timestamp(1391722819.085921)
    '2014-02-06 16:40:19.085921'
    >>> format_timestamp('1391722819.085921')
    '2014-02-06 16:40:19.085921'
    '''
    return str(datetime.datetime.fromtimestamp(float(timestamp)))


# ---- primary functions ----


def process_tcpdump_outout(child, full_file_path, server_port):
    '''Given an invoked tcpdump/windump process, loop over the output, processing each line, looking for possible heartbeat exploitations.
    This is where the main Heartbleed identification logic lives (in addition to the embedded logic in the BPF filters,
    which push down most of the computationally-intensive 'work' to tcpdump).
    '''
    request_count = 0
    response_count = 0
    possible_exploits = 0
    pending_heartbeat_request_map = {} #  (client_ip_port, server_ip_port) --> (timestamp, size)
    for line in child.stdout:
        try:
            connection, heartbeat = parse_line(line, server_port)
            if not heartbeat:
                # failed to parse; should have already logged, so just continue
                continue
            else:
                assert connection and heartbeat.request in [True, False]
                if heartbeat.request:
                    request_count += 1
                    if DEBUG: debug_log('Found a new heartbeat REQUEST [%s] on connection [%s]' % (heartbeat, connection))
                    if connection in pending_heartbeat_request_map:
                        # We found a request pending, as well as a new request (without having found a response yet for the earlier request).
                        # According to RFC-6520, there 'MUST' be at most one heartbeat request in flight at a time.
                        # This could be 'expected' for real heartbeat if there are retransmissions, 'observation drops', and/or crossing packets, etc.
                        # However, I've seen multiple heartbeat requests in flight (on a LAN environment without retransmissions), which might have been a bad client?
                        # This could also be due to having a false positive id of a heartbeat message (e.g., if a multi-segment message happens to
                        # look like a heartbeat based on the first 3 bytes), which is to be expected if examining millions of packets.
                        if DEBUG:
                            prior_heartbeat_request = pending_heartbeat_request_map[connection]
                            debug_log('Found two heartbeat REQUESTS in a row [%s] [%s] on connection [%s]' % (prior_heartbeat_request, heartbeat, connection))
                    # store this as the most recent pending heartbeat request on this connection
                    pending_heartbeat_request_map[connection] = heartbeat
                else:    # This is a response (heartbeat.request == False)
                    response_count += 1
                    if DEBUG: debug_log('Found a heartbeat RESPONSE [%s] on connection [%s]' % (heartbeat, connection))
                    prior_heartbeat_request = pending_heartbeat_request_map.get(connection, None) # default to None if not found
                    if prior_heartbeat_request is None:
                        # this heartbeat response may very well be a false positive (which is to be expected if examining millions of packets)
                        if DEBUG: debug_log('No prior request found on connection [%s]. Ignoring this heartbeat RESPONSE [%s]' % (connection, heartbeat))
                    else:
                        time_delta = heartbeat.timestamp - prior_heartbeat_request.timestamp
                        if time_delta < 0:
                            if DEBUG:
                                debug_log('Heartbeat REQUEST and RESPONSE seen out of time order [%s] [%s] with negative time delta [%f] on connection [%s]' %
                                    (prior_heartbeat_request, heartbeat, time_delta, connection))
                        elif time_delta < TIME_THRESHOLD:
                            size_increase = heartbeat.payload_size - prior_heartbeat_request.payload_size
                            if size_increase > SIZE_INCREASE_THRESHOLD:
                                possible_exploits += 1
                                output_possible_exploit(full_file_path, connection, prior_heartbeat_request, heartbeat, time_delta)
                            else: # no or only modest increase in size
                                if DEBUG: debug_log('Heartbeat response at time [%s] with payload size [%d] within size threshold of prior request payload size [%d].' %
                                             (format_timestamp(heartbeat.timestamp), heartbeat.payload_size, prior_heartbeat_request.payload_size))
        except:
            warn_log('Unexpected error in process_tcpdump_outout() for file [%s]. Error: [%s]' % (full_file_path, sys.exc_info()[0:2]))
            # If there seems to be a significant problem, run with --debug or make 'raise' unconditional here.
            if DEBUG:
                raise
            continue # if not --debug, soldier on...


Connection = namedtuple('Connection', ['client_ip_port', 'server_ip_port'])
HeartbeatMsg = namedtuple('HeartbeatMsg', ['timestamp', 'payload_size', 'request'])
regex = re.compile(r'(?P<timestamp>\d+\.\d+)[ ]+IP[ ]+(?P<src>(\d+\.){4}\d+)[ ]+[>][ ]+(?P<dest>(\d+\.){4}\d+)[:][ ]+tcp[ ]+(?P<payload_size>\d+)', re.IGNORECASE)


def parse_line(line, server_port):
    '''parse one line of output from tcpdump/windump. Return namedtuples representing connection and a heartbeat message.
    >>> parse_line('1397055473.533683 IP 192.168.86.128.40863 > 168.75.167.166.443: tcp 8', '443')
    (Connection(client_ip_port='192.168.86.128.40863', server_ip_port='168.75.167.166.443'), HeartbeatMsg(timestamp=1397055473.533683, payload_size=8, request=True))
    >>> parse_line('1397055473.587115 IP 168.75.167.166.443 > 192.168.86.128.40863: tcp 1380', '443')
    (Connection(client_ip_port='192.168.86.128.40863', server_ip_port='168.75.167.166.443'), HeartbeatMsg(timestamp=1397055473.587115, payload_size=1380, request=False))
    '''
    try:
        m = regex.search(line)
        if not m:
            if DEBUG: debug_log('Failed to match regex for output line [%s]' % line)
            return None, None
        else: # our regex matched
            timestamp = float(m.group('timestamp'))
            payload_size = int(m.group('payload_size'))
            src = m.group('src')
            dest = m.group('dest')
            server_port_end_str = '.%s' % (server_port) # e.g., '.443'
            if dest.endswith(server_port_end_str):
                request = True
                connection = Connection(client_ip_port=src, server_ip_port=dest)
            elif src.endswith(server_port_end_str):
                request = False
                connection = Connection(client_ip_port=dest, server_ip_port=src)
            else:
                if DEBUG: debug_log('Failed to find server port [%s] for output line [%s]' % (server_port, line))
                return None, None
            heartbeat = HeartbeatMsg(timestamp=timestamp, payload_size=payload_size, request=request)
            return connection, heartbeat
    except:
            if DEBUG: debug_log('Unexpected error [%s] when parsing line [%s]' % (sys.exc_info()[0], line))
            return None, None


def invoke_tcpdump(executable, arg_list, filter_string=None, stderr_mode='suppress', print_invocation=False):
    '''invoke tcpdump or windump, returning child process.'''
    if filter_string:
        arg_list.append(filter_string)
    try:
        if print_invocation:
            print_trailing_new_line_if_needed()
            print '[INVOKE]', ' '.join (executable + arg_list)
        if stderr_mode == 'suppress':
            stderr = open(os.devnull, 'w')  # must be writable, since the child process writes its stderr here
        elif stderr_mode == 'capture':
            stderr = subprocess.PIPE
        elif stderr_mode == 'print':
            stderr = sys.stderr
        else:
            assert False, 'Unexpected value for stderr_mode [%s]' % stderr_mode
        # do some simple checks for user errors...
        if not os.path.exists(executable[0]):
            error_string = '[ERROR] Unable to find tcpdump/windump: %s' % executable[0]
            progress_log(error_string, also_to_stdout=True)
            raise Exception(error_string)
        elif os.path.isdir(executable[0]):
            error_string = '[ERROR] Please specify executable for tcpdump/windump (not a directory). tcpdump/windump executable: [%s]' % executable[0]
            progress_log(error_string, also_to_stdout=True)
            raise Exception(error_string)
        # and now let's invoke tcpdump!
        child = subprocess.Popen(executable + arg_list, shell=False, stdout=subprocess.PIPE, stderr=stderr)
    except (OSError, ValueError) as e:
        progress_log('[ERROR] Execution failed: [%s] tcpdump/windump executable: [%s]' % (e, executable[0]), also_to_stdout=True)
        raise  # bare raise preserves the original traceback
    return child

def go(args):
    '''This is our main(). Finishes setting up args, walks directories recursively, invokes tcpdump if
    a file seems to be a pcap (based on magic number), then hands off to process_tcpdump_outout()
    to do the real work.'''


    # test to see if tcpdump is there
    tcpdump = [args.tcpdump] # list for passing to subprocess
    progress_log('Testing to see if tcpdump/windump is found:', also_to_stdout=True)
    child = invoke_tcpdump(tcpdump, tcpdump_help_arg, stderr_mode='capture')
    stdout_text, stderr_text = child.communicate()
    if stderr_text:
        progress_log('Found tcpdump or windump! %s\n' % tcpdump[0], also_to_stdout=True)
    else:
        progress_log('[ERROR] Unable to execute tcpdump/windump %s. Check location of tcpdump or windump. Re-run with --debug if needed.' % tcpdump[0],
                     also_to_stdout=True)
        return
    progress_log('Starting to look for pcap files. We will show a \'.\' for every 100 files examined, and a \'p\' for every pcap examined', also_to_stdout=True)


    for directory in args.dir_list:
        progress_log('Starting to look in directory [%s] and its sub-directories.' % os.path.abspath(directory), also_to_stdout=True)
        # quick fix to handle corner case where someone specifies overlapping directories and file is already present
        # reset the file_counter at the start of every new directory. (Better fix certainly possible)
        global file_counter
        file_counter = {} # relative file name -> possible exploit count
        total_file_count = 0
        global no_trailing_newline # quick hack to set a simple progress bar. Better ways to do this...
        for dir_path, dir_names, file_names in os.walk(directory):
            for file_name in file_names:
                full_file_path = os.path.join(dir_path, file_name)
                if total_file_count % 100 == 0:
                    sys.stdout.write('.') # simple 'progress bar'
                    no_trailing_newline = True
                total_file_count += 1

                # check to see if we've been asked to only look at certain file extensions.
                if args.file_extension_list:
                    # str.endswith accepts a tuple of suffixes, so one call covers the whole list
                    if not full_file_path.endswith(tuple(args.file_extension_list)):
                        continue # skip files that don't match our requested extensions
      
                try:
                    # 'with' closes the handle promptly; we may walk a very large number of files
                    with open(full_file_path, 'rb') as f:
                        first_4_bytes = f.read(4)
                except IOError as e:
                    warn_log('Unable to read file {0} I/O error({1}): {2}'.format(full_file_path, e.errno, e.strerror))
                    continue
                except Exception: # not a bare except, so Ctrl-C still stops the scan
                    warn_log('Unexpected error reading file {0} Error: {1}.'.format(full_file_path, sys.exc_info()[0:2]))
                    continue
                if first_4_bytes == MAGIC1 or first_4_bytes == MAGIC2:
                    sys.stdout.write('p') # simple 'progress bar' to stdout
                    progress_log('Starting to examine pcap file: %s' % os.path.abspath(full_file_path))
                    no_trailing_newline = True
                    if DEBUG:
                        debug_log('pcap file: %s' % full_file_path)
                        stderr_mode='print'
                    else:
                        stderr_mode='suppress'
                    final_filter_string = vlan_friendly_filter_string.format(server_port=args.server_port)
                    child = invoke_tcpdump(tcpdump, tcpdump_args + [full_file_path], final_filter_string, stderr_mode=stderr_mode)
                    process_tcpdump_outout(child, full_file_path, server_port=SERVER_PORT)   
                    child.wait()
                    progress_log('Finished examining pcap file.')
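

# ---- Illustrative sketch (not part of the original script; nothing above calls it) ----
# The magic-number test inside go() could be factored into a small helper like the
# one below, which makes that check easy to exercise on its own.
# ASSUMPTION: the constants here are the classic libpcap magic number 0xa1b2c3d4
# in both byte orders. The script's own MAGIC1/MAGIC2 are defined elsewhere and
# may differ. The names sketch_pcap_magics and sketch_looks_like_pcap are hypothetical.
sketch_pcap_magics = (b'\xa1\xb2\xc3\xd4', b'\xd4\xc3\xb2\xa1')

def sketch_looks_like_pcap(file_path):
    '''Return True if the first 4 bytes of file_path match a classic pcap magic number.
    Unreadable or missing files are reported as False rather than raising.'''
    try:
        with open(file_path, 'rb') as f:
            return f.read(4) in sketch_pcap_magics
    except (IOError, OSError):
        return False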


def describe_args_and_get_confirmation(args):
    '''Let user know what we're about to do, and get them to confirm!'''
    print '\n---- OVERVIEW ----\n'
    print overview_string
    # ------ dump key arguments ------
    print '\n---- INPUT ----\n'
    print 'DIRECTORIES: This will recursively examine files in the following directories:'
    for directory in args.dir_list:
        print ' '*5, os.path.abspath(directory)
    print ''
    if args.file_extension_list:
        print 'FILE EXTENSIONS: This will open & examine files that end with the following extensions:'
        for file_ext in args.file_extension_list:
            print ' '*5, file_ext
    else:
        print 'FILE EXTENSIONS: This will open ALL files found, but only deeply examine files that seem to be pcap files (based on file magic number).'
    print ''
    print 'server port:       ', args.server_port
    if os.name == 'nt':
        print 'windump.exe:       ', args.tcpdump
    else:
        print 'tcpdump:           ', args.tcpdump
    print 'logging directory: ', os.path.abspath(args.log_dir)
    # get confirmation!

    print '\nContinue? [y/n]',
    return raw_input().strip().lower() # tolerate stray whitespace around the answer



if __name__ == '__main__':
    # set up our args
    args = parser.parse_args()
    if args.test:
        # some very quick unit tests (mainly on the parsing).
        import doctest
        doctest.testmod(verbose=True, optionflags=doctest.NORMALIZE_WHITESPACE)
    else:
        DEBUG = bool(args.debug)
        SERVER_PORT = args.server_port
        if not args.dir_list:
            # default to working directory (assign rather than append, so this
            # also works if the parser's default for dir_list is None)
            args.dir_list = ['.']
        if args.force_all_file_extensions:
            args.file_extension_list = [] # This will be translated to 'all'
        else:
            if not args.file_extension_list:
                # nothing specified by user for file extensions, so
                # default to a set of 'common' pcap file extensions
                args.file_extension_list = default_file_extensions


        # simple loop to get confirmation user understands what we are about to do.
        while True:
            response = describe_args_and_get_confirmation(args)
            if response == 'y':
                # get ready to go!
                # open our logs
                progress_log_path = os.path.join(args.log_dir, 'heartbleed_bpf_checker_progress.log.txt')
                exploit_log_path = os.path.join(args.log_dir, 'heartbleed_bpf_checker_possible_exploits.log.txt')
                try:
                    heartbleed_bpf_checker_progress_log_file = open(progress_log_path, 'a') # make sure you have write permissions...
                    heartbleed_bpf_checker_possible_exploits_log_file = open(exploit_log_path, 'a') # make sure you have write permissions...
                except:
                    print '\n[ERROR] Unable to open log files [%s].' % os.path.abspath(progress_log_path)
                    print '[ERROR] Please set \'--log_dir\' to valid dir with write permissions.'
                    print '[ERROR] Error: [%s]' % ((sys.exc_info()[0:2],))
                    print '[ERROR] Terminating...\n'
                    raise
                # Things look good so far... let's go!
                go(args)
                progress_log('Done!', also_to_stdout=True)
                break
            elif response == 'n':
                break
            else:
                continue


# Snippet from RFC 6520. (Note, however, that I've seen multiple heartbeats in flight.)
##   There MUST NOT be more than one HeartbeatRequest message in flight at
##   a time.  A HeartbeatRequest message is considered to be in flight
##   until the corresponding HeartbeatResponse message is received, or
##   until the retransmit timer expires.