Quick Tip: Using PowerShell to generate a GPO report

Someone asked me today how to easily export a readable report of all GPOs applied to a system (they were performing a security audit and needed an easy way to script this).  Of course, I immediately thought of PowerShell!  So, here’s how you can export a readable report of all GPOs applied to a system in question in PowerShell:

> Import-Module GroupPolicy
> Get-GPOReport -All -ReportType Html -Path AllGPOsReport.htm

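Of course, you can also use Get-GPOReport to generate a report for a specific GPO (via -Name or -Guid) and/or export as XML (-ReportType Xml), if you prefer.  Note that -All reports on every GPO in the domain; if you need only the policies actually applied to a particular machine, take a look at Get-GPResultantSetOfPolicy in the same module.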

Is the sky falling? Yet another vulnerability reminds us that risks exist.

Many organizations choose to allow direct access to systems via Remote Desktop Protocol (RDP) from untrusted networks.  This carries inherent risk, as it opens the system up to direct exploitation via RDP.  With the recent announcement of MS12-020, we are reminded of this risk: it has been reported to Microsoft that vulnerabilities exist within RDP that may allow an attacker to send a sequence of specially crafted packets to gain the ability to remotely execute code on the system in question.

Let’s take a quick step back here and talk a bit about risks posed by exposing services.  Software is written (generally) by people and people make mistakes.  Therefore one should assume that allowing direct access to any service from an untrusted (or even trusted) network poses some risk.  The key here is making sure the benefits of allowing this access outweigh the risks posed by allowing the access (a formal risk assessment can help here).  Of course, one can also (and should) put processes in place to help further mitigate the risk, if possible.  In this specific example, you may have a legitimate business reason for directly exposing RDP to the Internet that outweighs the risk of allowing that direct access and that’s OK.  Hopefully you have systems (IDS, logging, FIM, etc) in place to help you figure out if something malicious is going on…but just because there is risk in exposing RDP to the world doesn’t mean that you should stop doing it if your company absolutely needs direct RDP access to do business.

One must always remember that the function of IT Security within an organization is to support the organization in its ability to do business.  From a purely technical security standpoint, is directly exposing RDP to untrusted networks a good idea?  No.  Should one present a case for requiring an additional layer of security (e.g. a VPN) prior to accessing an RDP connection?  Absolutely.  However, the same could be said for basically any network service – albeit less complex ones could be argued to have less risk of compromise due to the lack of complexity.  Either way, understanding that exposed services (yes, even VPN fits in this category) pose some risk is a good first step.

So where do you go from here?  First – patch your systems.  Once a vulnerability (like MS12-020) is publicly exposed, the likelihood of exploitation increases dramatically.  Second – understand that exposing network services poses some risk and take steps to determine whether the services need to be exposed.  Third – if a service needs to be exposed, assume that it is exploitable and put some layers of protection in place (proxies, VPN, IDS, logging, SIEM, etc) to help you mitigate some of the risk posed by the exposure.  Finally – continue to work side by side with the business to help promote a clear understanding of what risks presently exist and work together to come up with solutions to allow the business to continue to operate while accepting only the minimal amount of risk possible.

Oh yes – and did I mention that you should patch?

Registry Artifacts: Adobe Acrobat Reader

As you probably already know, the Windows Registry is a treasure trove of forensics artifacts that can come in quite handy during investigations and incident response.  Many applications leave quite the trail, and I’ve decided to start documenting these less common sections in the registry and sharing the information that I find on my blog.  We’ll start with Adobe Acrobat Reader:

In addition to recently accessed files showing up under the RecentDocs key, Acrobat Reader itself stores a list of the 5 most recently accessed PDF files in the user’s hive.  This information can be found in the subkeys under Software\Adobe\AVGeneral\cRecentFiles.  The subkeys found in this location are labeled cx (where x is replaced by the numbers 1 through 5), and under each of these subkeys you’ll find a value named tDIText which contains the full path and filename of the recently accessed pdf file.  Every time a new PDF file is opened in Reader, any existing values found in cx are copied to cx+1 and any values that were in c5 are lost (of course, keep in mind that you may be able to use VSS to recover old hives).  Unfortunately, Reader does not store date/time stamp values in these subkeys; however, you can get the date and time of the most recent file access (for the file information stored in c1) by reviewing the registry key’s last write time.  For all of the other files described in the other subkeys, given no other supporting data, you’ll only be able to state that the pdf file was accessed but will be unable to definitively state when.
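As a rough illustration of the paragraph above, here is a minimal Python sketch that reads the cx subkeys from a live hive on Windows.  It assumes the standard-library winreg module and takes the cRecentFiles key path as a parameter, since the exact location may differ on the system under review; the function names are hypothetical.  The key’s last write time comes back as a Windows FILETIME (100-nanosecond ticks since 1601-01-01 UTC), so a small helper converts it.

```python
from datetime import datetime, timedelta, timezone

# Key path as given in this post; adjust if the hive under review differs.
RECENT_FILES_KEY = r"Software\Adobe\AVGeneral\cRecentFiles"

def filetime_to_datetime(filetime):
    """Convert a Windows FILETIME (100 ns ticks since 1601-01-01 UTC) to a datetime."""
    epoch = datetime(1601, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(microseconds=filetime // 10)

def read_recent_files(key_path=RECENT_FILES_KEY, slots=5):
    """Return (subkey, path, last_write_time) for c1..c5 in the current user's hive."""
    import winreg  # Windows-only; imported here so the helper above works anywhere
    results = []
    for n in range(1, slots + 1):
        name = "c%d" % n
        try:
            with winreg.OpenKey(winreg.HKEY_CURRENT_USER, key_path + "\\" + name) as key:
                value, _value_type = winreg.QueryValueEx(key, "tDIText")
                last_write = winreg.QueryInfoKey(key)[2]
        except OSError:
            continue  # subkey or value not present
        if isinstance(value, bytes):  # assumption: the value may be stored as raw bytes
            value = value.decode("utf-8", errors="replace").rstrip("\x00")
        results.append((name, value, filetime_to_datetime(last_write)))
    return results
```

Per the caveat above, only the c1 timestamp should be read as an access time.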

If/when I discover any other interesting artifacts left by Adobe Acrobat Reader in the registry, I’ll make sure to update this post with my findings.  Feel free to leave me a comment as well if you have any additional Reader related artifacts that you review as part of your workflow…

Monitoring file system changes with PowerShell

I recently returned from facilitating Lenny Zeltser‘s excellent Reverse Engineering Malware course at SANS Security West.  One of the utilities covered in the course is called CaptureBAT, which is a useful utility for monitoring a system for changes while performing malware analysis.  Of course, given my ongoing interest in PowerShell, I decided to see if I could emulate some of CaptureBAT’s file system monitoring functionality natively in a PS script.

There are several strategies that you can use to monitor the file system in PowerShell and I decided to use the .Net framework to help accomplish this task.  Here’s how I did it:

  1. Create a new System.IO.FileSystemWatcher object, and set appropriate settings:
    $watcher = New-Object System.IO.FileSystemWatcher
    $watcher.Path = $searchPath
    $watcher.IncludeSubdirectories = $true
    $watcher.EnableRaisingEvents = $true

    .Path is the path that will be monitored, .IncludeSubdirectories tells the FileSystemWatcher to monitor all subdirectories of .Path, and .EnableRaisingEvents switches event notification on.

  2. Now we need to define some events that will fire when $watcher detects a filesystem change.  I’m going to define an event for Changed, Created, Deleted, and Renamed:
    $changed = Register-ObjectEvent $watcher "Changed" -Action {
       write-host "Changed: $($eventArgs.FullPath)"
    }
    $created = Register-ObjectEvent $watcher "Created" -Action {
       write-host "Created: $($eventArgs.FullPath)"
    }
    $deleted = Register-ObjectEvent $watcher "Deleted" -Action {
       write-host "Deleted: $($eventArgs.FullPath)"
    }
    $renamed = Register-ObjectEvent $watcher "Renamed" -Action {
       write-host "Renamed: $($eventArgs.FullPath)"
    }

    Within each event you can define code for what you want to happen when the event fires.  In this example I’m just directly outputting the type of action and the full path of the changed object on the filesystem.

  3. That’s pretty much it.  These events will hang around until you close your current PowerShell session or manually unregister the events.  You can unregister the events using the Unregister-Event command:
    Unregister-Event $changed.Id
    Unregister-Event $created.Id
    Unregister-Event $deleted.Id
    Unregister-Event $renamed.Id

As you can see – monitoring file system changes is actually quite easy with PowerShell.  You could easily take the output of this script and write it to a file (either using output redirection, or specifying output directly in the script), and with a little more code you could read in a CSV/structured file containing a list of exclusions (which is what CaptureBAT does) and filter out unwanted “noise” within each event.
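To make the exclusion idea concrete, here is a rough sketch of the filtering logic.  It is written in Python rather than PowerShell purely for illustration, and the two-column CSV layout (event type, wildcard pattern) is a made-up format, not CaptureBAT’s actual exclusion file format.

```python
import csv
import io
from fnmatch import fnmatchcase

def load_exclusions(csv_text):
    """Parse rows of 'event,pattern' into lowercase (event, pattern) pairs."""
    rows = csv.reader(io.StringIO(csv_text))
    return [(row[0].strip().lower(), row[1].strip().lower())
            for row in rows if len(row) == 2]

def is_noise(event, path, exclusions):
    """True if the event/path pair matches any exclusion ('*' matches any event)."""
    event, path = event.lower(), path.lower()
    return any(ex_event in ("*", event) and fnmatchcase(path, ex_pattern)
               for ex_event, ex_pattern in exclusions)

# Hypothetical exclusion list
EXCLUSIONS = load_exclusions(
    "changed,c:\\windows\\prefetch\\*\n"
    "*,*\\ntuser.dat.log*\n"
)

print(is_noise("Changed", r"C:\Windows\Prefetch\CMD.EXE-1234.pf", EXCLUSIONS))  # True
print(is_noise("Created", r"C:\Users\bob\evil.exe", EXCLUSIONS))                # False
```

Matching is done on lowercased strings since Windows paths are case-insensitive.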

The great thing about this method is that it just *works* on Windows systems that have PowerShell.  No need to install any applications or run any third-party utilities – which the malware that you’re looking at may be looking for (i.e. anti-analysis measures built into some malware).  My goal is to build a complete replacement for CaptureBAT within PowerShell in my spare time over the next few weeks.  Once done, I’ll post the complete script with explanations on my blog…

But for now, time to spend some quality time with the family.  Have a great weekend.

Forensic artifacts: Dropbox

I got a bit waylaid with how Dropbox performs host-level authentication while I was researching and documenting forensic artifacts that Dropbox leaves lying around, but finally have gotten the chance to come back around to finish my research/documentation.  Here’s a summary of my observations:

  • Dropbox binaries are installed into %AppData%\Dropbox\bin instead of the standard %PROGRAMFILES%.  During the install, a number of registry keys were added (13), although they contained no forensically useful data.
  • The Dropbox configuration and state is stored in SQLite files found in %AppData%\Dropbox
    • config.db: contains baseline configuration settings that the Dropbox client references in order to run in a table named config.   Records of interest include:
      • host_id: the authentication hash used by the Dropbox client to authenticate into the Dropbox “cloud.”  This hash is assigned upon initial install/authentication and does not change unless revoked from the Dropbox web interface.
      • email: account holder’s email address.  Can be changed to any value without consequence – set at install/authentication.
      • dropbox_path: actual path to the user’s Dropbox on the local system.
      • recently_changed3: lists the path/filename for the five most recently changed files; this includes files removed/deleted from the Dropbox.  This is probably the only truly useful forensic artifact produced by Dropbox (other than the usual filesystem related artifacts).  The BLOB for this record is text-based and is consistently formatted:
        • text begins with “lp1”, ends with “a.”
        • entries are in order of most recent to least recent, and in each entry the filename/path is followed by “I00” and “tp#” (replace # with the order that the file is in + 1, i.e. first entry is followed by “tp2”), separated by line breaks.
        • if the file has been removed/deleted from the Dropbox, the “I00” text is removed and an “N” is placed in front of the “tp#”.  So, an example of a removed/deleted file would be: (V41725479:/new file.txt
      • root_ns: appears to be used throughout the Dropbox DBs to reference the base Dropbox path/location.
    • filecache.db: contains a number of tables, but the primary focus is to describe all files actively in the Dropbox (deleted/removed files are removed from this table upon deletion/removal).  Tables and records of interest:
      • file_journal: includes the filename, path, size (in Bytes), mtime (file modified time, in Unix/POSIX format), ctime (file created time, in Unix/POSIX format), local_dir (flag indicating whether the entry is a directory), and more (mainly unpopulated).
      • block_ref: maps file IDs (fj_id) to file hashes (hash_id) found in the block_cache table.
      • block_cache: hash id (id) and hash.  Hash is of an unknown format and did not match up with anything I could generate using standard tools.
      • mount_table: appears to list folders that are shared with other Dropbox users.
    • host.db: actually not a SQLite database but contains what looks to be a hash of some sort (possibly SHA-1?) and the dropbox path (dropbox_path in config.db) encoded in base-64.  The entire file may be encoded in base-64 (basing this on a few Dropbox forum postings I read), but the first part of the file does not decode into anything human readable or match any other fields that I observed in the other DBs.
    • sigstore.db: stores hash values which correspond to the values found in the block_cache table in filecache.db.
    • unlink.db: appears to be a binary file and is not a SQLite database.  Format and purpose is unknown.
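The recently_changed3 layout described above looks a lot like Python’s text-based pickle format (protocol 0): a list of tuples, where I00 is False and N is None.  Assuming that is what it is, a short Python sketch can decode the BLOB.  The sample bytes below are made up to follow the format described in this post, the class and function names are hypothetical, and the unpickler refuses to load any classes since the data comes from an untrusted system.

```python
import io
import pickle

class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that rejects any class/function reference in the data."""
    def find_class(self, module, name):
        raise pickle.UnpicklingError("refusing to load %s.%s" % (module, name))

def parse_recently_changed(blob):
    """Decode the BLOB into (path, removed) pairs, most recent first."""
    entries = NoGlobalsUnpickler(io.BytesIO(blob)).load()
    # Per the observations above: I00 (False) = still present, N (None) = removed/deleted
    return [(path, flag is None) for path, flag in entries]

# Made-up sample following the described layout
SAMPLE = (b"(lp1\n"
          b"(V41725479:/new file.txt\nI00\ntp2\na"
          b"(V41725479:/old file.txt\nNtp3\na.")

for path, removed in parse_recently_changed(SAMPLE):
    print(path, "(removed)" if removed else "")
```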

Honestly, short of the recently_changed3 record in the config database, there really isn’t a significant number of useful forensic artifacts generated by Dropbox.  Given Dropbox writes to the local filesystem, your standard filesystem analysis steps will encompass files stored/synced into a subject’s Dropbox; but perhaps, under certain circumstances, the recently_changed3 record and/or the Dropbox ctime/mtime entries for files could come in handy…
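If the ctime/mtime entries do turn out to be useful, they need converting from Unix time.  Here’s a rough Python sketch using the standard sqlite3 module.  The column names are taken at face value from the list above and are an assumption; check the actual schema of your filecache.db before relying on them.  The function names are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone

def list_file_journal(conn):
    """Yield (path, size, mtime, ctime) with timestamps as UTC datetimes."""
    # Assumed column names; rows flagged as directories are skipped
    query = "SELECT path, size, mtime, ctime FROM file_journal WHERE local_dir = 0"
    for path, size, mtime, ctime in conn.execute(query):
        yield (path, size,
               datetime.fromtimestamp(mtime, tz=timezone.utc),
               datetime.fromtimestamp(ctime, tz=timezone.utc))

def open_read_only(db_path):
    """Open a working copy of filecache.db without allowing writes to it."""
    return sqlite3.connect("file:%s?mode=ro" % db_path, uri=True)
```

Usage would be along the lines of `for row in list_file_journal(open_read_only("filecache.db")): print(row)`, run against a copy of the database rather than the original.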

Happy Forensicating.

Dropbox authentication: insecure by design

For the past several days I have been focused on understanding the inner workings of several of the popular file synchronization tools with the purpose of finding useful forensics-related artifacts that may be left on a system as a result of using these tools.  Given the prevalence of Dropbox, I decided that it would be one of the first synchronization tools that I would analyze, and while working to better understand it I came across some interesting security related findings.  The basis for this finding has actually been briefly discussed in a number of forum posts in Dropbox’s official forum (here and here), but it doesn’t quite seem that people understand the significance of the way Dropbox is handling authentication.  So, I’m taking a brief break in my forensics-artifacts research to try to shed some light on what appears to be going on from an authentication standpoint and the significant security implications that the present implementation of Dropbox brings to the table.

To fully understand the security implications, you need to understand how Dropbox works (for those of you that aren’t familiar with what Dropbox is – a brief feature primer can be found on their official website).  Dropbox’s primary feature is the ability to sync files across systems and devices that you own, automatically.  In order to support this syncing process, a client (the Dropbox client) is installed on a system that you wish to participate in this synchronization.  At the end of the installation process the user is prompted to enter their Dropbox credentials (or create a new account) and then the Dropbox folder on your local system syncs up with the Dropbox “cloud.”  The client runs constantly looking for new changes locally in your designated Dropbox folder and/or in the cloud and syncs as required; there are versions that support a number of operating systems (Windows, Mac, and Linux) as well as a number of portable devices (iOS, Android, etc).  However, given my research is focusing on the use of Dropbox on a Windows system, the information I’ll be providing is Windows specific (but should be applicable on any platform).

Under Windows, Dropbox stores configuration data, file/directory listings, hashes, etc in a number of SQLite database files located in %APPDATA%\Dropbox.  We’re going to focus on the primary database relating to the client configuration: config.db.  Opening config.db with your favorite SQLite DB tool will show you that there is only one table contained in the database (config) with a number of rows, which the Dropbox client references to get its settings.  I’m going to focus on the following rows of interest:

  • email: this is the account holder’s email address.  Surprisingly, this does not appear to be used as part of the authentication process and can be changed to any value (formatted like an email address) without any ill-effects.
  • dropbox_path: defines where the root of Dropbox’s synchronized folder is on the system that the client is running on.
  • host_id: assigned to the system after initial authentication is performed, post-install.  Does not appear to change over time.

After some testing (modification of data within the config table, etc) it became clear that the Dropbox client uses only the host_id to authenticate.  Here’s the problem: the config.db file is completely portable and is *not* tied to the system in any way. This means that if you gain access to a person’s config.db file (or just the host_id), you gain complete access to the person’s Dropbox until such time that the person removes the host from the list of linked devices via the Dropbox web interface.  Taking the config.db file, copying it onto another system (you may need to modify the dropbox_path, to a valid path), and then starting the Dropbox client immediately joins that system into the synchronization group without notifying the authorized user, prompting for credentials, or even getting added to the list of linked devices within your Dropbox account (even though the new system has a completely different name) – this appears to be by design.  Additionally, the host_id is still valid even after the user changes their Dropbox password (thus a standard remediation step of changing credentials does not resolve this issue).

Of course, if an attacker has access to the config.db file (assuming that it wasn’t sent by the user as part of social engineering attack), the assumption is that the attacker most likely also has access to all of the files stored in your Dropbox, so what’s the big deal?  Well, there are a few significant security implications that come to mind:

  • Relatively simple targeted malware could be designed with the specific purpose of exfiltrating the Dropbox config.db files to “interested” parties who then could use the host_id to retrieve files, infect files, etc.
  • If the attacker/malware is detected in the system post-compromise, normal remediation steps (malware removal, system re-image, credential rotation, etc) will not prevent continued access to the user’s Dropbox.  The user would have to remember to purposefully remove the system from the list of authorized devices on the Dropbox website.  This means that access could be maintained without continued access/compromise of a system.
  • The host_id/config.db file is most likely much smaller than all of the data found within a Dropbox folder, so transmitting it is much less likely to set off any detective alarms.  Review/theft/etc of the data contained within the Dropbox could then be done at the attacker’s leisure from an external attacker-owned system.

So, given that Dropbox appears to utilize only the host_id for authentication by design, what can you do to protect yourself and/or your organization?

  1. Don’t use Dropbox and/or allow your users to use Dropbox.  This is the obvious remediating step, but is not always practical – I do think that Dropbox can be useful, if you take steps to protect your data…
  2. Protect your data: use strong encryption to protect sensitive data stored in your Dropbox and protect your passphrase (do not store your passphrase in your Dropbox or on the same system/device).
  3. Be diligent about removing old systems from your list of authorized systems within Dropbox.  Also, monitor the “Last Activity” time listed on the My Computers list within Dropbox.  If you see a system checking in that shouldn’t be, unlink it immediately.

Hopefully, Dropbox will recognize the need for additional security and add in protection mechanisms that will make it less trivial to gain long-term unauthorized access to a user’s Dropbox as well as provide better means to mitigate and detect an exposure.  Until such time, I’m hoping that this write-up helps bring to light how the authentication method used by Dropbox may not be as secure as previously assumed and that, as always, it is important to take steps to protect your data from compromise.

Update (10/31/2011): Dropbox has released version 1.2.48, which utilizes an encrypted local database and reportedly puts in place security enhancements to prevent theft of the machine credentials.  I have not personally re-tested this release – feel free to comment if you’ve validated that the new protection mechanisms operate as described.

Getting Things Done in the InfoSec world…

Many times individuals in an Information Security role feel that they’re treading water – running around fighting fires while the security posture of the organization that they work at remains unchanged and the fires just keep spreading.  Unfortunately, many organizations don’t encourage, reward, and/or provide time and resources for their InfoSec practitioners to actually improve security, and success is simply measured around how many fires were put out in the last x period of time (as well as how the team responded to said fires).  Now, I know that many organizations have teams (i.e. Incident Response Teams) that are dedicated to “fighting fires” and that’s a good thing.  However, many IT Security personnel (and IT workers in general) time after time find themselves in a position to make a real measurable impact and are stymied for one reason or another.  I’m writing this blog post to give you hope and hopefully help you deal with the major items that are, in all appearances, looming roadblocks to you actually making a meaningful impact.

Not Enough Money

This is a very common roadblock and is one that I’m sure you’ve all experienced at some point in your career.  Most companies have limited resources, especially when it comes to IT Security, and management generally prefers to accept risks rather than pay upfront to protect against the possibility of said risk occurring, especially when the cost to mitigate the risk is significant.  If you don’t take the right approach in your presentation of a solution, management will just end up seeing the dollar signs associated with a new technology that you’re proposing rather than the value the solution brings to the table.

So…you need to be strategic.  In my opinion, providing information to management so that they can make informed IT Security related spending decisions is relatively straightforward:

  1. Define your threat.
  2. Define the probability of the threat actually occurring.
  3. Define the probable impact that the threat would have on the organization, and the cost of the impact (if possible).
  4. Finally, define a cost to mitigate the threat (or a number of cost options for varying degrees of mitigation).

Of course, this is easier said than done – however, the bottom line is that you want to communicate to management how a proposed solution is going to cost-effectively mitigate a well-defined risk.
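As a worked example of the four steps, here is a rough Python sketch that compares expected yearly loss against the cost of mitigating.  Multiplying probability by impact is just one simple way to put a number on a risk, and all of the figures and names below are made up for illustration.

```python
def expected_annual_loss(probability_per_year, impact_cost):
    """Steps 2 and 3: likelihood of the threat times the cost if it happens."""
    return probability_per_year * impact_cost

def net_benefit(probability_per_year, impact_cost, mitigation_cost, risk_reduction):
    """Step 4: loss avoided by the mitigation minus what the mitigation costs."""
    avoided = expected_annual_loss(probability_per_year, impact_cost) * risk_reduction
    return avoided - mitigation_cost

# Hypothetical options for one threat: 10% yearly chance, $500,000 impact
options = {
    "full solution":   {"mitigation_cost": 30_000, "risk_reduction": 0.9},
    "partial/no-cost": {"mitigation_cost": 2_000,  "risk_reduction": 0.4},
}
for name, opt in options.items():
    print(name, round(net_benefit(0.10, 500_000, **opt)))
```

Laying the options side by side like this makes it easier for management to see what each level of spending actually buys.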

Sometimes, even though you may have convinced your management of the merits of incurring costs to protect against a defined risk, it is just the case that your company really doesn’t have the funds available.  So this is where you’ll need to be creative – if you truly/strongly believe that a risk needs to be mitigated and you have management buy-in, you can usually come up with some options to mitigate the risk that do not involve your company actually spending money (short of your or your team’s time).  Sure, these solutions most likely are not as effective as your originally proposed solution, but at this point a little protection is better than none (as a side note, remember that there is no such thing as realistic complete protection) and you can provide value to your organization.  Just make sure that you are very clear as to the limitations, etc of your new solution so that no one assumes protection exists where there may be none.

No matter the financial environment at your organization, it is always a good strategy to come into a presentation prepared with a number of solutions ranging in cost and accompany each solution with an explanation of what level of mitigation it provides.  The more prepared you are, the higher the chance of success and the greater the value you bring to your organization.

Not Enough Time

So you’ve gotten past the initial roadblock of not having enough money and management has tasked you with implementing your proposed solution.  The only problem is: you’re slammed with work (i.e. firefighting) and really don’t have the time to actually plan, test, implement, etc.  You could always go back to your management and ask for some professional services dollars, but even with help, you’re still going to have to dedicate some of your time (albeit less).  So, here you are with approval to make an impact on your organization’s security stance, but you don’t actually feel like you have the time to make that impact.

Obviously, working more hours gets you more time – but the principal problem here is probably not that you need more hours, but that you need to allocate/prioritize your time a bit differently to support doing more than simply firefighting.  You’re going to need management buy-in for this one and convincing management to allow you to step back from firefighting to do some project work can be similar to actually “selling” the initial project (and should probably be done at the same time).  But don’t worry – since your management has already bought into supporting the project (at least financially), it shouldn’t be that much of a stretch to allow you to re-prioritize your time to support the implementation of said project.  Here are a couple of key discussion points you probably want to focus on when talking around task/time prioritization:

  • Delayed completion of other tasks: once complete, the project should actually reduce the amount of time spent firefighting either by way of mitigating a threat and/or making you more efficient.  Depending on the benefit that the project is bringing to the table, some “fires” may be able to smolder while you work on moving things forward.
  • Task prioritization: continuing on with the previous point, it is better to have already defined what type of “fire” you can delay putting out versus one that needs immediate attention.  A clear understanding of task priorities will allow you to know where your project fits in alongside your operational/firefighting tasks; it will also become clear as to whether you can feasibly accomplish your project or not (i.e. you already have a full-time job’s worth of “Critical” tasks).  Finally, you may need to work with your management to further re-prioritize your tasks, allocate more resources, etc.  Don’t give up on your project – if you truly believe that it will bring a significant benefit to your organization, work to convince your management of that!

Like money, for most organizations (and people) time is a limited/scarce resource and getting management buy-in to use your time for a specific project can sometimes be even more difficult than getting money allocated.  I highly recommend that you combine the time and money discussions into one, so that when you get full buy-in you already know that you have both the time and money available to be successful.

Overwhelming Obstacles

OK – if you’ve gotten this far you’ve taken care of some significant roadblocks, so be encouraged!  But alas you’ll probably end up hitting a few more speed bumps, one of which is people.  There are a number of obstacles that can come up during a project, but one of the major ones that you’ll see time and time again is people within your organization (hopefully outside of your team/group).  They generally mean well and oppose your project for good reasons, so it is important that you keep that in mind when someone within your organization is either choosing to be unhelpful, actively opposing your project, etc.  Again, you’re going to need to put on your “sales” hat (sense the recurring theme? 😎 ) and work to convince people of the merits of what you’re doing.  You should focus on:

  • Understanding why they don’t think it is a good idea and then showing them that it is actually a good idea – keep an open mind though, as a lot of time as you’re discussing your project people will bring up very good points that you should seriously consider.
  • Keeping the end-goal in mind (what you’re actually *trying* to do) and being reasonable about *how* you’re going to accomplish this goal.  If people don’t agree with the way you’re trying to accomplish your goal, then perhaps you can come up with an alternative approach that the key players can agree on (while still accomplishing the same end result).
  • Showing how your project adds value to the organization and to their team.  Understanding what their team does and how they work can be helpful as part of this demonstration.

At some point, if you cannot get agreement between the key players, you’ll need to get management involved (hopefully upper-levels that you have convinced of the merits of your project) so that directives can be made to influence agreement at lower organization levels.  This is definitely a worst case scenario, and you should attempt to get reasonable buy-in before resorting to this strategy.

Overwhelming Situation

You most likely feel like you’re treading water for a reason – so many tasks, so many fires, so many holes, so little time.  There are times that you can feel completely overwhelmed by your situation and perhaps even your project itself could feel overwhelming.  Here’s a good key to actually making an impact: take small doable steps to incrementally improve your organization’s situation.  Identify an item that you can accomplish as part of your project and *do it*.  Don’t sit back and let the sheer weight of everything else going on overwhelm you.  Pick a bite-sized chunk of what you’re needing to work on and then *do it*.  Of course, it is important to keep the big picture in mind, but if you stay so focussed on everything that needs to be accomplished for that big picture to be done, you’ll end up feeling like you’re unable to make any impact at all.  You can make an impact!  Even completing a small task and moving your organization forward in the area of IT Security can have sizable long-term organizational benefits.

So – make it a goal to impact your organization, prepare and sell your ideas to your management and teams around you, and then take incremental steps to move your organization forward.  Before you know it, you’ll be floating down the river in a log raft – which is better than just treading water.


Searching and extracting data from PST files

Keyword searches can be a significant aspect of an investigation and given the prevalence of Microsoft Outlook you’ll most likely find yourself needing to search through PST files for data, be it a simple keyword or more complex pattern.  Even though you can use Outlook to open up a PST file, my personal preference is not to do the search within Outlook itself for two primary reasons:

  1. Outlook will change data within the PST file; of course, you’re working on a copy – but I prefer to not have dynamically changing data (i.e. unread/read status, etc) when I’m doing my analysis.
  2. If you’re wanting to find data matching a certain pattern (i.e. Regular Expressions) or data that is not within the message body (i.e. message header data), Outlook does not really have the facilities to support these kinds of searches.

Of course, there are several commercial investigative tools that will parse through and allow you to search PST files (FTK and Encase come to mind) but in this post I’m going to focus on performing the extraction and search with only free tools in a Linux environment.

What you’ll need:

  • A relatively up-to-date Linux system (be it physical or VM).
  • Readpst compiled/installed (in Ubuntu: apt-get install readpst) – readpst is a utility included with libpst which can be found here.

Also, I’m going to begin by assuming that you’ve acquired the PST file in a forensically sound fashion and that a copy of the file is accessible on your Linux system.  Let’s get started…

Extracting data from a PST file using readpst

Run readpst on the PST file to extract all objects within the PST (e.g. messages/attachments, calendar entries, contacts, etc).  By default, readpst exports data in mbox format, which places all of the extracted objects into a set of mbox files (one per subfolder) and can make pulling out the objects that match your search criteria a bit tedious.  Instead, we’re going to tell readpst to write each object into its own file.  The command looks like:

readpst -S -o out/ outlook.pst

Where out/ is the directory where you’d like readpst to output the files and outlook.pst is the PST file that you’re extracting data from.  The -S flag indicates that you’d like readpst to extract each object separately, rather than in mbox format.

Once readpst has finished, in your output directory you’ll find a directory structure that matches the folder structure of the PST (generally starting with a base directory of Outlook).  Within each of these folders you’ll find numerically named files that contain plain text representing the exported object (e.g. for an email message you’ll find the message body, headers, etc).
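
Since header data is one of the things Outlook makes hard to search, here is a minimal Python sketch of pulling a few headers out of every extracted file.  It assumes the files readpst wrote are RFC 822-style text (headers, a blank line, then the body); objects that are not email messages will simply come back with empty fields.  The function names and the choice of headers are mine, purely for illustration.

```python
from email import policy
from email.parser import BytesParser
from pathlib import Path


def read_headers(message_path, names=("From", "To", "Subject", "Date")):
    """Parse only the header block of one extracted file."""
    with open(message_path, "rb") as handle:
        message = BytesParser(policy=policy.default).parse(handle, headersonly=True)
    headers = {}
    for name in names:
        value = message.get(name)
        # Missing headers stay None so non-email objects are easy to spot.
        headers[name] = str(value) if value is not None else None
    return headers


def summarize_headers(extract_dir):
    """Return (relative path, headers) for every file under extract_dir."""
    extract_dir = Path(extract_dir)
    rows = []
    for path in sorted(extract_dir.rglob("*")):
        if path.is_file():
            rows.append((path.relative_to(extract_dir).as_posix(), read_headers(path)))
    return rows
```

Calling summarize_headers("out") on the readpst output directory gives you one row per extracted object, which you can print or dump to CSV for review.  Files are opened read-only, so the extracted copies are not altered.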

Working with the extracted data

Thanks to readpst, it is trivial to extract all of the data within a PST file into a nicely organized (and basically human readable) set of files, and at this point you can begin processing these files as you would any other text file.  For example, a common forensic task is to search all objects within a PST for certain keywords or a pattern.  As an example of pattern matching, let’s say you were investigating a PII incident and wanted to see whether a subject had used email to send or receive messages that appear to contain social security numbers.  You could use grep to search the files within the directory structure that readpst created with the following command:

grep -R -P '\b(?!000)(?!666)([0-6]\d{2}|7([0-6]\d|7[012]))([ -])?(?!00)\d\d([ -])?(?!0000)\d{4}\b' out/

This tells grep to run a recursive search (-R) of the readpst output directory using a Perl-compatible regular expression (-P) that matches numbers that look like SSNs.  From there, you could even script the process to move matching messages to a target folder that you could manually validate (or whatever the next step of your given workflow is).
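
As a rough idea of what that automation could look like, here is a small Python sketch.  The pattern is the SSN-style expression from the grep command, written with both separator classes as [ -] (space or hyphen).  The directory names and function names are hypothetical, and the script copies matches rather than moving them so the extracted set stays intact; adjust to suit your own workflow.

```python
import re
import shutil
from pathlib import Path

# SSN-style pattern; both optional separators are a space or a hyphen.
SSN_PATTERN = re.compile(
    r"\b(?!000)(?!666)([0-6]\d{2}|7([0-6]\d|7[012]))"
    r"([ -])?(?!00)\d\d([ -])?(?!0000)\d{4}\b"
)


def find_matches(extract_dir):
    """Return the files under extract_dir whose text matches SSN_PATTERN."""
    hits = []
    for path in sorted(Path(extract_dir).rglob("*")):
        if not path.is_file():
            continue
        # Undecodable bytes are replaced so odd attachments do not abort the run.
        text = path.read_text(encoding="utf-8", errors="replace")
        if SSN_PATTERN.search(text):
            hits.append(path)
    return hits


def copy_matches(extract_dir, review_dir):
    """Copy each matching file into review_dir, keeping its relative path."""
    extract_dir = Path(extract_dir)
    review_dir = Path(review_dir)
    copied = []
    for path in find_matches(extract_dir):
        target = review_dir / path.relative_to(extract_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)
        copied.append(target)
    return copied
```

Running copy_matches("out", "review") would leave you with a review/ directory that mirrors the PST folder structure but contains only the objects that need a manual look.  Keep the review directory outside of the extraction directory so a second run does not search its own output.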

As you can see, forensically analyzing PST files using freely available software is quite easy and can be a very powerful method for efficiently extracting case-pertinent data.  Give it a try sometime…

On a side note, I’ve added a new Resources section to my blog and one of the pages contained within this section is dedicated to listing useful regular expressions (such as the SSN matching regular expression I used above).  Right now, that is the only one I have up there, but I’ll keep adding to this page as I think of other useful regular expressions, so check back regularly.

Quick Tip: Meaning of MAC times in different file systems

Every file system handles MAC times slightly differently; however, sleuthkit (as well as other forensics software products) uses the same acronym/fields no matter which file system you’re analyzing.  Here’s a quick run-down of some popular file systems and what M, A, C, and B mean on each:

File System | m             | a        | c                                                 | b
Ext2/3      | Modified      | Accessed | Attribute modification and/or file content change | N/A
FAT         | File Modified | Accessed | N/A                                               | Created
NTFS        | File Modified | Accessed | MFT Modified                                      | Created
UFS         | Modified      | Accessed | Attribute modification and/or file content change | N/A

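
If you script around timeline output, the table above can be kept as a small lookup so that reports spell out what each letter means for the file system at hand.  This is just a sketch; the dictionary and function names are made up for illustration, and the values are transcribed straight from the table.

```python
# Meanings transcribed from the table above; None marks a field the
# file system does not record ("N/A").
MACB_MEANINGS = {
    "Ext2/3": {
        "m": "Modified",
        "a": "Accessed",
        "c": "Attribute modification and/or file content change",
        "b": None,
    },
    "FAT": {"m": "File Modified", "a": "Accessed", "c": None, "b": "Created"},
    "NTFS": {"m": "File Modified", "a": "Accessed", "c": "MFT Modified", "b": "Created"},
    "UFS": {
        "m": "Modified",
        "a": "Accessed",
        "c": "Attribute modification and/or file content change",
        "b": None,
    },
}


def describe_field(file_system, field):
    """Return what the m/a/c/b field means on the given file system."""
    meaning = MACB_MEANINGS[file_system][field.lower()]
    return meaning if meaning is not None else "N/A"
```

For example, describe_field("NTFS", "c") returns "MFT Modified", while the same letter on FAT comes back as "N/A".
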
And now, back to your regularly scheduled programming…

Volume Shadow Copies

Harlan Carvey recently wrote a post on his blog called Accessing Volume Shadow Copies, which provided some excellent instruction on how you can go about accessing Volume Shadow Copies (VSCs) from an existing image without having to use expensive tools (in fact, his solution uses completely free tools).  In Windows 7 and Vista, VSS is turned on by default, so additional artifacts may be just waiting to be discovered.  Accessing a system’s VSC(s) can be highly useful in an investigation: it can get you older copies of registry hives (e.g. for historical UserAssist data) as well as older snapshots of other files (pictures, etc), which can come in very handy.  So, needless to say, if you’re not presently looking for VSCs as part of your forensics workflow, you probably should be…

In the process of testing Harlan’s procedure, I started to wonder how Windows, by default, decides to generate these VSCs (and what is included).  I came up with some data and I thought that I’d go ahead and post my findings (feel free to comment if I’ve gotten anything wrong here):

  • Windows 7/Vista automatically (out of the box) create restore points at pseudo-random intervals:
    • A scheduled task (named SR in Win7) controls when a snapshot occurs.  By default, the task is set to run at 12:00AM every day and 30 minutes after every system startup, but will only execute when the system is plugged in and has been idle for 10 minutes.  If the system is not idle, the task will keep waiting for it to become idle for up to 23 hours.
    • If a restore point/snapshot has not been successfully created in the last seven days, system protection will create one automatically.
    • And finally, a restore point may be created “automagically” as part of certain software installation/driver installation processes.
    • The bottom line: it is basically impossible to predict with any degree of certainty when a snapshot will occur.
  • All files/folders are covered in a volume snapshot, except for those defined under the HKLM\System\CurrentControlSet\Control\BackupRestore\FilesNotToSnapshot registry key.
  • If a file is modified several times between snapshots, only the version that was current when the restore point/VSC was made will be available to you for analysis.  Mind you, there may be multiple VSCs available, so that can be helpful with getting further historical revisions.

Additional resources:

Wikipedia – System Restore
Wikipedia – Shadow Copy

QCCIS – Reliably recovering evidential data from Volume Shadow Copies Whitepaper