Chapter 3 - Windows Event Log Parsing

Section 3.1 - Using python-evtx

Example for opening EVTX files, iterating over events, and filtering events.

Demonstrates how to open an EVTX file and get basic details about the event log. This section makes use of python-evtx, a python library for reading event log files. To install, run pip install python-evtx.

Other libraries for parsing these event logs exist and we welcome others to add snippets that showcase how to make use of them in reading EVTX files.

Example Usage:

$ python using_python_evtx.py System.evtx

References:

Open Windows Event Logs (EVTX)

This function shows an example of opening an EVTX file and parsing out several header metadata parameters about the file.

def open_evtx(input_file):
    """Opens a Windows Event Log and displays common log parameters.

    Arguments:
        input_file (str): Path to evtx file to open

    Examples:
        >>> open_evtx("System.evtx")
        File version (major): 3
        File version (minor): 1
        File is ditry: True
        File is full: False
        Next record number: 10549

    """

    with evtx.Evtx(input_file) as open_log:
        header = open_log.get_file_header()
        properties = OrderedDict(
            [
                ("major_version", "File version (major)"),
                ("minor_version", "File version (minor)"),
                ("is_dirty", "File is dirty"),
                ("is_full", "File is full"),
                ("next_record_number", "Next record number"),
            ]
        )

        for key, value in properties.items():
            print(f"{value}: {getattr(header, key)()}")

Iterate over record XML data (EVTX)

In this function, we iterate over the records within an EVTX file and expose the raw XML. This leverages a yield generator for low impact on resources.

Additionally, if you would like to parse the XML, or interact with the child elements, you can enable it by assigning the parse_xml parameter as True, which will then call the .lxml() method on the individual event record. This requires the installation of the lxml Library, as it returns a lxml.etree object that you can interact with.

def get_events(input_file, parse_xml=False):
    """Opens a Windows Event Log and returns XML information from
    the event record.

    Arguments:
        input_file (str): Path to evtx file to open
        parse_xml (bool): If True, return an lxml object, otherwise a string

    Yields:
        (generator): XML information in object or string format

    Examples:
        >>> for event_xml in enumerate(get_events("System.evtx")):
        >>>     print(event_xml)

    """
    with evtx.Evtx(input_file) as event_log:
        for record in event_log.records():
            if parse_xml:
                yield record.lxml()
            else:
                yield record.xml()

Filtering records within events logs

Now that we have get_events(), we can begin to perform operations on the newly accessible data. In this function, we extract information from the LXML object, and use that to filter results based on Event ID and other fields within the results. You can easily extend this to support other fields, filters, and return values. Some examples include:

  • extracting all login and logoff events, with their session identifiers, then calculating the session durations

  • Identify PowerShell events and expose arguments for further processing (ie. Base64 decoding, shellcode analysis)

def filter_events_json(event_data, event_ids, fields=None):
    """Provide events where the event id is found within the provided list
    of event ids. If found, it will return a JSON formatted object per event.

    If a list of fields are provided, it will filter the resulting JSON event
    object to contain only those fields.

    Arguments:
        event_data (genertor): Iterable containing event data as XML. Preferably
            the result of the :func:`get_events()` method.
        event_ids (list): A list of event identifiers. Each element should be a
            string value, even though the identifier is an integer.
        fields (list): Collection of fields from the XML data to include in the
            JSON output. Only supports top-level fields.

    Yields:
        (dict): A dictionary containing the filtered record information

    Example:

        >>> filtered_logins = filter_events_json(
        >>>     get_events("System.evtx", parse_xml=True),
        >>>     event_ids=['4624', '4625'],
        >>>     fields=["SubjectUserName", "SubjectUserSid",
        >>>             "SubjectDomainName", "TargetUserName", "TargetUserSid",
        >>>             "TargetDomainName", "WorkstationName", "IpAddress",
        >>>             "IpPort", "ProcessName"]
        >>> )
        >>> for filtered_login in filtered_logins:
        >>>     print(json.dumps(filtered_login, indent=2))

    """
    for evt in event_data:
        system_tag = evt.find("System", evt.nsmap)
        event_id = system_tag.find("EventID", evt.nsmap)
        if event_id.text in event_ids:
            event_data = evt.find("EventData", evt.nsmap)
            json_data = {}
            for data in event_data.getchildren():
                if not fields or data.attrib["Name"] in fields:
                    # If we don't have a specified field filter list, print all
                    # Otherwise filter for only those fields within the list
                    json_data[data.attrib["Name"]] = data.text

            yield json_data

Docstring References

filter_events_json(event_data, event_ids, fields=None)

Provide events where the event id is found within the provided list of event ids. If found, it will return a JSON formatted object per event.

If a list of fields are provided, it will filter the resulting JSON event object to contain only those fields.

Parameters
  • event_data (genertor) – Iterable containing event data as XML. Preferably the result of the get_events() method.

  • event_ids (list) – A list of event identifiers. Each element should be a string value, even though the identifier is an integer.

  • fields (list) – Collection of fields from the XML data to include in the JSON output. Only supports top-level fields.

Yields

(dict) – A dictionary containing the filtered record information

Example

>>> filtered_logins = filter_events_json(
>>>     get_events("System.evtx", parse_xml=True),
>>>     event_ids=['4624', '4625'],
>>>     fields=["SubjectUserName", "SubjectUserSid",
>>>             "SubjectDomainName", "TargetUserName", "TargetUserSid",
>>>             "TargetDomainName", "WorkstationName", "IpAddress",
>>>             "IpPort", "ProcessName"]
>>> )
>>> for filtered_login in filtered_logins:
>>>     print(json.dumps(filtered_login, indent=2))
get_events(input_file, parse_xml=False)

Opens a Windows Event Log and returns XML information from the event record.

Parameters
  • input_file (str) – Path to evtx file to open

  • parse_xml (bool) – If True, return an lxml object, otherwise a string

Yields

(generator) – XML information in object or string format

Examples

>>> for event_xml in enumerate(get_events("System.evtx")):
>>>     print(event_xml)
open_evtx(input_file)

Opens a Windows Event Log and displays common log parameters.

Parameters

input_file (str) – Path to evtx file to open

Examples

>>> open_evtx("System.evtx")
File version (major): 3
File version (minor): 1
File is ditry: True
File is full: False
Next record number: 10549

Indices and tables