Channel: Windows Incident Response

There Are Four Lights: LNK Parsing tools

Based on the content of my last post regarding shell items, I wanted to take a look at some of the available tools for parsing Windows shortcuts/LNK files.  I started by asking folks what tools they used to parse LNK files, and then went looking for those, and other, tools.

The purpose of this blog post is to take a look at how effective some of the various tools used by analysts are at parsing shell item ID lists within Windows shortcut/LNK files.

This blog post contains an excellent description of what we're looking for and trying to achieve with the testing.

My previous post includes sample output from the tool I use to parse LNK files; one of the files I used for testing is an LNK file for a device that does not contain a LinkInfo block, but instead contains ONLY a shell item ID list.  I did not specifically craft this LNK file...it was taken from a Windows 2008 R2 system.  I copied this file to my desktop, and changed the extension from ".lnk" to ".txt".

For comparison purposes, the script I wrote parses the device test file as follows:

File: c:\users\harlan\desktop\camera.txt
shitemidlist       My Computer/DROID2/Removable Storage/dcim/Camera
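For those interested in what "parsing the shell item ID list" actually involves, the mechanics are straightforward per the MS-SHLLINK specification (see the Resources below): when the HasLinkTargetIDList flag is set, the list follows the 76-byte header as a sequence of size-prefixed items, terminated by a two-byte null.  A minimal sketch of the walk (this is not my script, just an illustration):

```python
import struct

def read_idlist(data):
    """Return the raw shell item payloads from LNK file bytes (per MS-SHLLINK)."""
    flags = struct.unpack_from('<I', data, 0x14)[0]     # LinkFlags at offset 0x14
    if not (flags & 0x01):                              # HasLinkTargetIDList not set
        return []
    idlist_size = struct.unpack_from('<H', data, 76)[0]  # IDListSize follows the header
    offset, end, items = 78, 78 + idlist_size, []
    while offset < end:
        item_size = struct.unpack_from('<H', data, offset)[0]
        if item_size == 0:                              # two-byte terminal ID
            break
        items.append(data[offset + 2:offset + item_size])
        offset += item_size
    return items
```

Each returned payload begins with a type indicator byte; interpreting what follows that byte is where the tools discussed below diverge.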

The other file I used in this testing is an LNK file created by the installation of Google Chrome.  All of the tools tested handled parsing this LNK file just fine, although not all of them parsed the shell item ID list.

Now, on to the tools themselves.  Some of the things I'm most interested in when looking at tools for parsing LNK files include completeness/correctness of output, ease of use, the ease with which I can incorporate the output into my analysis processes, etc.  I know that some of these aspects may mean different things to different people...for example, if you're not familiar with parsing shell item ID lists, how do you determine completeness/correctness?  Also, "ease of use" may mean "GUI" to some, but it may mean "CSV output" to others.  As such, I opted to not give any recommendations or ratings, but to instead just provide what I saw in the output from each tool.

TZWorks lp64 v0.55 - lp64 (the 64-bit version of the tool) handled the Google Chrome LNK file easily, as did the other tools included in this test.  Unlike some of the other tools, lp64 parsed the shell item ID list from the LNK file:

ID List:  {CLSID_UsersFiles}\Local\Google\Chrome\Application\chrome.exe

For the device test file described above, lp64 provided the following output:

ID List:  {CLSID_MyComputer}\{2006014e-0831-0003-0000-000000000000}

I'm not sure what the GUID refers to...I did a look up via Google and didn't find anything that would really give me an indication of what that meant.  Looking at the file itself in a hex editor (i.e., UltraEdit), I can see from where that data originated, and I can tell that the shell item was not parsed correctly; that is to say, the 16 bytes extracted from the file are NOT a GUID, yet lp64 parses them as such.

WoanWare LnkAnalyser v1.01 - This tool is a CLI utility that took me a couple of attempts to run, first because I had typed "lnkanalyzer" instead of "lnkanalyser".  ;-)  I then pointed it at the camera.txt file from the previous post (renamed from camera.lnk) and it did not display any shell item contents.  In fact, the tool listed several sections (i.e., Target Metadata, Volume ID, TrackerDataBlock, etc.), all of which were empty, with the exception of the time stamps, which were listed as "1/1/0001 12:00:00 AM".

LnkAnalyser did handle the Google Chrome LNK file just fine, but without parsing the shell item ID list.

Lnk_parser - The Google Code page states that this tool is "in beta" and should not be rehosted...I opted to include it in testing.  It turns out that this tool is very interactive (which I could have avoided, had I read the command line usage instructions), posting a list of questions to the console for the analyst to answer, with respect to where the target file is located, the type of output that you want, and where you want the output to go.  I chose CSV output, going to the current working directory, as well as to the console.  The output of the tool did include:

[Link Target ID List]
CLSID:    20d04fe0-3aea-1069-a2d8-08002b30309d = My Computer

This was followed by a number of "[Property Store]" entries that made little sense; that is to say, I am familiar with what these entries might represent from my research, but the data that they contain doesn't look as if it would be meaningful or usable to an analyst.  I did find a reference to one of the PROPERTYSTORAGE values from the lnk_parser output listed in the Cloud Storage Forensic Analysis PDF, reportedly as part of the output from XWays 16.5, but I'm not clear as to what it refers to.

Lnk_parser did not handle the Google Chrome LNK file at all.  I used the same settings/choices as I did for the previous file, and got no output at the console.  The resulting CSV file in the working directory had only one entry, and it was just some garbled data.

MiTeC Windows File Analyzer (WFA) - LNK files are just one of the file formats that WFA is capable of parsing.  WFA is GUI-based and works on directories (rather than individual files), so I had to rename the camera.txt file to camera.lnk.  WFA did not parse any data from the camera.lnk file, although it handled the Google Chrome LNK file just fine.  WFA did not, however, parse the shell item ID list from the Google Chrome LNK file.

Log2Timeline - a user over on the Win4n6 forum mentioned that log2timeline parses shell item ID lists in LNK files, but I verified with Kristinn that at the moment, it does not.  As such, log2timeline was not included in the test.  I am including it in this listing simply due to the fact that someone had mentioned that it does parse shell item ID lists.

Other tools - some others have mentioned using EnCase 6 and/or 7 for parsing LNK files; I do not have access to either one, so I cannot test them.

Results
The overall results of my (admittedly limited) testing indicate that the TZWorks lp64 tool does the best job of the available tools when it comes to parsing shell item ID lists within LNK files.  That being said, however, some shell items do not appear to be parsed correctly.

On a side note, something that I do like about lp64 is that it lists its output in an easy-to-parse format.  Each element is listed on a single line with an element ID, a colon, and then the data...this makes it easy to parse using Perl or Python.
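As a quick illustration of what I mean, splitting each line on the first colon is all it takes (the sample line in the usage below is taken from the lp64 output shown earlier):

```python
def parse_lp64_line(line):
    """Split an lp64 output line into (element ID, data) at the first colon."""
    key, _, value = line.partition(':')
    return key.strip(), value.strip()
```

For example, parse_lp64_line(r"ID List:  {CLSID_MyComputer}\...") returns the tuple ("ID List", r"{CLSID_MyComputer}\...").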

So What?
So, why is this important?  After all, who cares?  Well, to be honest, every analyst should, for the simple fact that shell items can be found in a number of artifacts besides just Windows LNK files.  They exist in shellbags artifacts, within the MenuOrder subkeys, they're embedded within Windows 7 and 8 Jump Lists, ComDlg32 subkey values (Vista+), and they can even be found in the Windows 8 Photo artifacts.  Being able to understand shell items can be important, and being able to correctly parse device shell items can be even more important; in CP cases, the use of devices may indicate production, and in IP theft cases, a device may have been used as the target of a copy or move operation.  Also, there is malware that is known to use specially-constructed LNK files (i.e., target.lnk) as part of their infection/propagation mechanism, so being able to accurately parse these files will be valuable to malware threat analysis.

Resources
ForensicsWiki page - LNK
LinuxSleuthing blog post

Reading

I wanted to share some of the interesting items I've read over the past couple of weeks, and in doing so, I think that it's important to share not just that it was read, but to also share any insights gleaned from the material.  It's one thing to provide a bunch of links...it's another thing entirely to share the impact that the material had on your thinking, and what insights you might have had.  I see this a good deal, not just in the DFIR community...someone posts links to material that they have read, as if to say, "hey, click on this and read it...", but doesn't share their insights as to why someone should do that.  If something is of value, I think that a quick synopsis of why it's of value would be useful to folks.

I look at it this way...have you ever looked up a restaurant on Yelp to read the reviews, and used what you saw to decide whether you wanted to go or not?  Or have you ever looked at reviews of movies in order to decide if you wanted to spend the money to see it in the theater now, or just wait until it hits the cable system, where you can see it for $5?  That's the approach I'm taking with this post...

Anyway, onward...

The Needs of the Many - this is an excellent blog post that presents and discusses the characteristics of a servant security leader.

This is an excellent read, not just for those seeking to understand how to be a servant leader, but also for any Star Trek fan, as Andrew uses not just quotes from the series and movies, but also scenes as metaphors for the topic.  It's one thing to write a paragraph and add a Wikipedia link for reference, but it's another thing entirely to use iconic movie characters and scenes to illustrate a point, such as three-dimensional thinking, giving the reader that, "oh, yeah" moment.

Survivorship Bias - this blog post was an excellent read that really opened my eyes to how we tend to view our efforts in tool testing, as well as in analysis.

A quote from the article that really caught my attention is:

Failure to look for what is missing is a common shortcoming, not just within yourself but also within the institutions that surround you.

This is very true in a lot of ways, especially within the DFIR community, which is the "institution" in the quote.  Training courses (vendor-specific and otherwise) tend to talk a lot about tools, and as such, many analysts focus on tools, rather than the analysis process.  Some analysts will use tools endorsed by others, never asking if the tools have been tested or validated, and simply trusting that they have been.  In other cases, analysts will use one tool to validate the output of another tool, without ever understanding the underlying data structures being parsed...this is what I refer to as the tool validation "myth-odology".

This focus on tools is taking analysts away from where they need to be focused...which should be on the analysis process.  I've seen analysts say that the tools allow non-experts to be useful, but how useful are they if there is no understanding of what the tool is parsing, or how the tool does what it does?  A non-expert finding a piece of "evidence" at a physical crime scene will not know to provide the context of that evidence, and the same is true in the digital realm, as well.  Tools should not be viewed as "making non-experts useful".  Again, this is part of the "institution".

What I've seen this lead to is the repeated endorsement and use of tools that do not completely parse data structures, and do not provide any indication that they have not parsed those structures.  As the tools are endorsed by "experts", and analysts find just those things that they are looking for (that "survivorship bias", where failures are not considered), the tools continue to be used, and this is something that appears to be institutional rather than an isolated, individual problem.

If you have the time, I highly recommend reading through other posts at the YouAreNotSoSmart blog.  There are some really good posts there that are definitely worth your time to not just read, but consider and ingest.  I'm a firm believer that anyone who wants to progress in a field needs to regularly seek knowledge outside of that field, and this is a good place to spend some time doing just that.

Paul Melson: GrrCON 2012 Forensic Challenge Write-up - Folks seem to really like stories about how others accomplished something...Paul provides how he answered the GrrCON 2012 Forensic Challenge.  This is actually from the end of last year, but that's fine...everything he did still holds. Paul walks through the entire process used, describing the tools used, providing command lines, and illustrating the output.  If you attempted this challenge, compare what you did to what Paul provided.

Within the DFIR community especially, I've found that analysts really tend to enjoy reading about or hearing how others have gone about solving problems.  However, one of the shortcomings of the community is that not a lot of folks within it like to share how they've gone about solving problems.  I know that last year, a friend of mine tried to set up a web site where folks could share those stories, but it never got off the ground.  It's unfortunate...none of us alone is as smart as all of us together, and there is a lot of great information sitting out there, untapped and unused, because we aren't willing or able to share it.

Digital Forensics Stream: Amazon Cloud Drive Forensics, pt I - Similar to Paul's post, this DFStream blog post wasn't about a challenge, but it did provide an end-to-end walk through of an important aspect of analysis...testing.  With the rise of cloud services, and an increased discussion of their use, one of the aspects that analysts will have to contend with is the use of desktop applications for accessing these services.  Access to these services is being built into platforms, and their use is going to become more transparent (to the user) over time.  As such, analysts are going to need to have an understanding of the effect of these applications on the systems being analyzed, and many analysts are in fact already asking those questions...and this post provides some answers.

GSN: Computer forensics experts discover how to determine how many times a hard drive had been turned on, and for how many hours it had run.  This is a very interesting find within the SMART info of some drives, and it can definitely be useful.

Crossing Streams

Sometimes, crossing the streams can be a good thing.  I was checking out some of the new posts on my RSS feed recently, and saw SQLite on the Case over on the LinuxSleuthing blog.

I'm not an anti-Linux, Windows-only guy.  I'm just a guy who's used Windows, done vulnerability assessments of Windows systems, and been asked to do IR and forensic analysis of Windows systems for a long time.  I also like looking in other places for something I can use to make my job easier, progress more smoothly, and allow me to perform a more comprehensive analysis of Windows systems.  Because you can find SQLite databases on Windows systems (usually associated with a third-party application, such as Firefox or Chrome, or found in iDevice backups), I like to see what's out there in the DFIR community with respect to this database structure, and the LinuxSleuthing blog has been both generous and valuable in that regard.

The blog as a whole, and this post specifically, contains a lot of great info, but this time, it wasn't the technical info within the post that caught my attention...it was something of a crossing of the streams.  What I mean by that is that I saw a couple of statements in the post that reminded me of things I'd said in my own blog, and it was as if two different people, with different backgrounds, interests, etc., were following the same (or very similar) reasoning.  

The first statement:

It is very common in SQLite databases for integers to represent a deeper meaning than their numeric value. 

This sentence took me back to my post on understanding data structures.  Much like many of the data structures on Windows systems, the SQLite table in question has a column that contains integers, which must be translated in order to be meaningful to the analyst.  Either the analyst does this automatically, or a software tool, which provides a layer of abstraction over the raw data, does it for us.  In this case, "4" refers to an "incoming" call...that's important because doing a text-based keyword search for "incoming" won't reveal anything directly pertinent to the table or column in question.  

In the case of Windows systems, a 4 byte DWORD value might be an identifier, providing information regarding the type of something, or it might be a flag value, containing multiple settings AND'd together.  Our tools tend to provide a layer of abstraction over the data, and many will translate the integers to their corresponding human-readable values.  As such, it is important that we understand the data structure and its constituent elements, rather than simply relying on tools to spit out the data for us, as this helps us better understand the context of the data that we're looking at.
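As a simple illustration of the flag case, here's a sketch using a small subset of the documented Windows file attribute bits; the point is that the raw DWORD means nothing to an analyst until each bit is translated to a name:

```python
# A small, illustrative subset of the Windows file attribute flag bits
FILE_ATTRIBUTES = {
    0x01: 'READONLY',
    0x02: 'HIDDEN',
    0x04: 'SYSTEM',
    0x10: 'DIRECTORY',
    0x20: 'ARCHIVE',
}

def decode_flags(value, names=FILE_ATTRIBUTES):
    """Translate a flag DWORD into the human-readable settings AND'd into it."""
    return [name for bit, name in sorted(names.items()) if value & bit]
```

So a value of 0x22 decodes to HIDDEN and ARCHIVE...which is the sort of translation a tool performs for us, whether we realize it or not.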

Consider the DOSDate time stamps found embedded in some shell items...what do they represent, and where do they come from?  Okay, we have something of an understanding of what they represent - the MAC times of the target resource (usually a folder or file).  If the file system in which the target resource resides is NTFS, we know that the values start as FILETIME objects with 100 nanosecond granularity, are truncated to the second, and then (per the publicly available MS API), the second value is multiplied by 2.  So, an NTFS time value of "23:15:05.657" becomes "23:15:10", and we have a considerable loss of granularity.  We also know that the embedded time stamps are a snapshot in time, and that the resources can be impacted by actions outside the purview of the shell items.  For example, after a shell item ID list within an LNK file is created, files can be added to or deleted from one of the constituent folders, updating the last modification time.  Finally, of what value is the last accessed time on Vista and above systems?
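For reference, the DOSDate format packs the date and time into two 16-bit values, with the seconds stored in 2-second increments...which is exactly where the loss of granularity comes from.  A minimal decoder:

```python
from datetime import datetime

def decode_dosdate(date16, time16):
    """Decode a DOSDate (two 16-bit words) into a datetime.

    Bit layout: date16 = year-since-1980 (7) | month (4) | day (5);
    time16 = hour (5) | minute (6) | seconds/2 (5)."""
    return datetime(
        year=((date16 >> 9) & 0x7F) + 1980,
        month=(date16 >> 5) & 0x0F,
        day=date16 & 0x1F,
        hour=(time16 >> 11) & 0x1F,
        minute=(time16 >> 5) & 0x3F,
        second=(time16 & 0x1F) * 2,   # 2-second granularity; odd seconds cannot exist
    )
```

Note that the decoded seconds value is always even...a "23:15:05" simply cannot be represented in this format.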

So, my (rather long winded) point is that when we see a date/time stamp labeled "last access time" in the output of a tool, do we really understand the context of that integer value?

Okay, on to the second statement...

If you thought use another tool and see what it says then go outside and drag your knuckles on the concrete for a bit.

This statement reminded me of my thoughts on what I refer to as the tool validation "myth-odology".

Consider another recent blog post regarding LNK file parsing tools; in that post, I described the issue I wanted to test for, and then after querying a forum for the tools folks currently used to parse these files, I ran several of them against some test data.  In this case, my goal was to see which of the tools correctly parsed shell item ID lists within the LNK files.  Given this goal, would you download a tool that does NOT parse/display the shell item ID list in an LNK file, and use it to validate the output of other tools?

Here's the issue...many of the tools recommended for parsing LNK files parse only the header and LinkInfo block, and not the shell item ID list.  Some that do parse the shell item ID lists do not appear to do so correctly.  The shell item data structures are not isolated to just LNK files, they're actually pretty pervasive on Windows systems, but I chose to start with LNK files as they've been around for a lot longer than other artifacts, and tools for parsing them might be more mature.  

So, why the interest in shell item ID lists?  Well, in most normal cases, the shell item ID lists might simply provide redundant information, replicating what's seen in the LinkInfo block.  However, there are legitimate cases where Windows will create an LNK file for a device (smart phone, digital camera, etc.) that consists solely of a header and a shell item ID list, and does not contain a LinkInfo block.  Not being able to accurately parse these LNK files can present an issue to IP theft (devices are great at storing files) and illicit images cases (might mean the difference between possession and production).  Also, there is malware out there that uses specially-crafted LNK files (i.e., "target.lnk") as a propagation mechanism.  

Given all this, if someone wants to use the output of a tool that does not parse the shell item ID list of an LNK file to validate the output of another tool, I think the above imagery of "drag your knuckles" is appropriate.  ;-)  Just sayin'.  

HowTo: Tie LNK Files to a Device

Based on commentary I've seen in a couple of online forums, I thought I'd resurrect the "HowTo" label from some previous blog posts, and share (for commentary, feedback and improvement) some of the analysis processes that I've used while examining images of Windows systems.  There is a good deal of information available regarding various Windows artifacts, and one of perhaps the most difficult aspects of analysis is to tie various disparate bits of information together, correlating the artifacts, and building a complete picture so that your analysis can be used to answer questions and provide solutions.

This particular topic was previously discussed in this blog (and here's another, much older post), but sometimes processes like this need to be revisited.  Before we start, however, it's important to point out that this process will work only on Windows Vista systems and above, due to the information that is required for the process to work properly.

LNK Files
A Windows shortcut/LNK file can contain a volume serial number, or VSN.  This is intended to be a unique 4-byte (DWORD) value that identifies the volume, and it is changed when the volume is reformatted.  Many tools that parse LNK files will display the VSN in their output, if one exists.

Note: Prefetch files include a volume information block which also contains a VSN.  If this information is different from the local system...that is, if a user launched an application from an external storage device...you can also use this process to correlate the VSN to the particular device.  You can view the VSN for a volume on a live system by navigating to the volume via the command prompt and typing the 'vol' command.

Registry
The EMDMgmt key (within the Software hive) contains information about USB external devices connected to the system.  This information is generated and used by the ReadyBoost service, at least in part to determine the suitability of the device for use as external RAM.

The path to the key in question is:
HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\EMDMgmt

This key will contain subkeys that pertain to and describe external storage media.  The subkeys that we're interested in are those that begin with "_??_USBSTOR#".  These subkey names are very similar to artifacts found in the System hive, particularly in the USBStor subkeys.  These subkey names include the device serial number, as well as a volume name (if one exists) and a VSN in decimal format.

An example of such a subkey name, with the VSN in bold, appears as follows:
HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\EMDMgmt\_??_USBSTOR#Disk&Ven_Best_Buy&Prod_Geek_Squad_U3&Rev_6.15#0C90195032E36889&0#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}TEST_1677970716

For those subkeys that pertain to USB thumb drives, the emdmgmt.pl RegRipper plugin will parse the subkey name, and display the VSN formatted in a usable, understandable manner.  That is to say that the plugin will translate the decimal value for the VSN into a hexadecimal format, and display it in the same manner as the VSN seen in LNK and Prefetch files, as well as what is displayed by the vol command on live systems.
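Using the example subkey name above, the conversion can be sketched as follows...split the trailing decimal value off of the subkey name, and render it in the familiar XXXX-XXXX format (this is an illustration of the logic, not the plugin code; it assumes the subkey name ends with the volume name, an underscore, and the decimal VSN):

```python
def emdmgmt_vsn(subkey_name):
    """Extract the volume name and VSN from an EMDMgmt subkey name,
    returning the VSN in the same XXXX-XXXX form displayed by 'vol'."""
    tail = subkey_name.rsplit('}', 1)[-1]          # text after the device class GUID
    label, _, decimal_vsn = tail.rpartition('_')   # "<volume name>_<VSN in decimal>"
    vsn = int(decimal_vsn)
    return label, '{:04X}-{:04X}'.format(vsn >> 16, vsn & 0xFFFF)
```

Run against the example subkey name above, this returns the volume name "TEST" and the VSN "6403-CD1C" (i.e., 1677970716 in decimal).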

Again, it is important to note the EMDMgmt key exists on Vista systems and above, but not on XP systems. As such, this technique will not work for XP/2003 systems.

Now that we have these two pieces of information, we can correlate LNK files (or Prefetch files, if necessary) to a particular device, based on the VSNs.  I've used this technique a number of times, most recently in an attempt to determine a user's access to a particular device (remember, LNK files are most often associated with a user, as they are often located within the user's profile).  If you know what it is that you're attempting to determine or demonstrate...that is, the goals of your analysis...then the tools and artifacts tend to fall right into place.  When I've had to perform this type of correlation of artifacts, because of the tools I have available, this analysis is complete in just a few minutes.

As a final note, do not forget the value of historical information on the system, particularly for the Registry.  The RegBack folder should contain a backed-up copy of the Software hive, and there is additional information available in VSCs.  Corey Harrell has a number of excellent posts on his blog that demonstrate how to use simple tools and processes...batch files...to exploit the information available in VSCs.

Resources
MS-SHLLINK file format specification
Description of EMDMgmt RegRipper plugin

HowTo: Correlate an Attached Device to a User

Not long after my previous post on correlating LNK files to an external device, I received a question regarding correlating a device to a particular user.  Some may look at this and think, well, that's easy...the LNK files in question are located in the user profile, so correlating the user to the device is actually pretty easy.

Okay, but what happens if there are no LNK files that point to the device?  After all, for a LNK file to be available, the user must either create it manually, or perform some action where the operating system will create it automatically, right?  Usually, this means that the user has opened a folder on the external device and double-clicked a non-executable file of some kind, such as a .txt file or a Word or PowerPoint document.  The resulting action is that the appropriate application is launched based on the file extension, and the file is opened in that application, and an LNK file is created.

So, if you're examining a system and you suspect (or can show) that a USB device had been connected to the system, then how would you go about associating the device with a particular user, in the absence of LNK files?

MountPoints2 Key
As part of your USB device discovery process, one of the places that you're going to look is in the MountedDevices key within the System hive, in order to map the devices you've found in the USBStor subkeys to the volume globally unique identifier (GUID).  Beneath the MountedDevices key, some of the value names will be the volume GUIDs, and their binary data will contain the device information, in Unicode format.  Parsing the data and mapping it back to the value name will provide you with the volume GUID.  You can then use this information to search the subkeys beneath the MountPoints2 key in the NTUSER.DAT hive files for the volume GUID.
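Decoding the binary value data is trivial, as it is simply a UTF-16LE (Unicode) string; a sketch (the sample device string in the usage below is illustrative, not from a specific case):

```python
def decode_mounted_device(value_data):
    """Decode the binary data of a MountedDevices value (a UTF-16LE device string)."""
    return value_data.decode('utf-16-le', errors='replace').rstrip('\x00')
```

For USB devices, the decoded string will contain the same vendor/product/serial number elements seen in the USBStor subkey names, which is what lets you make the mapping.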

The path to the MountPoints2 key is:
\Software\Microsoft\Windows\CurrentVersion\Explorer\MountPoints2

Volume GUID key LastWrite time
It's commonly accepted that the LastWrite time for the volume GUID subkey beneath the MountPoints2 key indicates when that device was last connected to the system.

Shellbags Artifacts
Another artifact that allows us to correlate an attached device to a user is the shellbags artifacts, which on Windows 7 systems are found in the USRCLASS.DAT hive file within the user profile.  This is how we can tie a device to a particular user.

The shellbags artifacts are simply paths to resources composed of shell items, the same types of data structures that can be found in LNK file shell item ID lists, Jump Lists, as well as other locations within the Registry.  On Vista and Windows 7 systems, the first time that the user opens an Explorer window to a folder on an attached USB device (external hard drive or thumb drive), one of the shell items in the path will represent the drive letter to which the device was mapped, followed by the folder paths.  The drive letter can be correlated to a particular USB device by mapping to the contents of the MountedDevices key values (or, for a device previously connected to the system, you may want to dig a bit into the MountedDevices key contents available in VSCs), or to the "Windows Portable Devices\Devices" subkeys in the Software hive (again, be sure to check VSCs, as well).

Historical Registry Information
If you're looking for historical information from within the Registry, be sure to check the contents of the C:\Windows\system32\config\RegBack folder for backed-up copies of the System and Software hives.

However...and this is very important...the shellbags artifacts may contain information about attached devices that are not immediately obvious via other means of analysis.  For example, USB external drives and thumb drives are usually represented on the system as a volume (i.e., F:\, G:\) or drive letter, whereas smartphones, digital cameras, and MP3 players, while storage devices, usually appear in the shellbags artifacts as a different type (type == 0x2e) of shell item.  Not all of the tools available for parsing shellbags will parse these types of shell items; in fact, those that don't seem to simply skip parsing the entire path, so shellbags with type 0x2e shell items at their root are not displayed.

Further, depending upon the version of Windows and the type of device, the shell items that comprise the path to the resource might be of type 0x00, or a variable type.  As the name implies, the routine for parsing these types of shell items varies, and in many instances, a great deal of information can be retrieved via knowledgeable manual analysis.

Devices connected via Bluetooth
Devices can be connected to a Windows system in other ways, including via Bluetooth. What's interesting about this type of connection is that smartphones are pretty ubiquitous at this point, and once the initial connection has been made, reconnecting to the device is trivial, and the device doesn't even have to be in view. The device can be reconnected as long as it is in range, and can be on a belt, or in a backpack or purse.

In my research, I found that a lot of users will connect to their smartphone via Bluetooth and use that connection to play music via their computer.  I also found out that MS provides a file called fsquirt.exe, which is loaded on a system during installation if the system is found to have a Bluetooth radio.

Use the bthport.pl RegRipper plugin to get some information about devices connected to a Windows system via Bluetooth.

HowTo: Determine Users on the System

Now and again, I see questions in various forums related to (or flat out asking) how to determine the users on a Windows system.  In several instances, I've seen the question posted to the EnCase User Forum, asking why the results of the "Initialize Case" EnScript are different from what is retrieved by other tools.  There are several locations within the Windows system that can contain information about accounts on the system.

SAM hive - The SAM hive maintains information about user accounts local to that system.  In a corporate environment, many times you won't find the user account for the active user listed in the SAM hive, as the user account was set up in Active Directory and managed from a domain controller.  In home environments, you're likely to see multiple user accounts listed in the SAM hive.

Tool: RegRipper samparse.pl plugin; the samparse_tln.pl plugin will parse the SAM hive and output various items (account creation date, last login date, etc.) in TLN format for inclusion in a timeline.

Software hive - Within the Software hive is a key named "ProfileList" that maintains a list of...you guessed it...user profiles on the system.  This information can then be correlated against what you find within the file system (see below).

Tool: RegRipper profilelist.pl and winlogon.pl plugins.  The winlogon.pl plugin checks for "Special Accounts", which are accounts that do not appear on the Welcome screen.  This technique is used by intruders in order to "hide" accounts from administrators.

File system - User information is maintained and "recorded" in the user's profile within the file system...on Vista+ systems, in the C:\Users folder.  There should be a correlation between what's in the ProfileList key, and what can be observed within the file system.

Tool: Any file viewer, or mount the image as a volume and use the dir command
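That correlation is easy to automate; here's a quick Python sketch (the function name and sample data are mine, for illustration, not from any particular tool) that flags mismatches between ProfileList entries and the folders under C:\Users:

```python
def orphaned_profiles(profile_image_paths, users_folders):
    """Compare ProfileImagePath entries from the ProfileList key against
    the folder names observed under C:\\Users; differences in either
    direction warrant a closer look."""
    reg = {p.rstrip("\\").split("\\")[-1].lower() for p in profile_image_paths}
    fs = {f.lower() for f in users_folders}
    return sorted(reg - fs), sorted(fs - reg)

# Hypothetical data: one profile registered but with no folder on disk
reg_only, fs_only = orphaned_profiles(
    [r"C:\Users\harlan", r"C:\Users\tempadmin"], ["harlan", "Public"])
print(reg_only)   # registered, but no profile folder on disk
print(fs_only)    # folder present, but no ProfileList entry
```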

Note
The LocalService or NetworkService accounts having a populated IE index.dat (web history) may be an indication of a malware infection.  I've examined systems infected with malware that is used for click-fraud and found an enormous index.dat file for one of these accounts.

Now, most analysts are aware that you can have an account listed in the SAM hive, but not have a user profile folder within the file system.  What this can indicate is that the user account was set up, but has not been used to access the system yet.  User profiles are not created until the user logs into the system using the account credentials.

Changing Settings
In order to determine if someone was accessing user settings (creating user accounts, or modifying account information), there are two places you can look.  First, examine the Windows Security Event Log for indications of events that pertain to user account management (see MS KB 977519 for a list of pertinent event IDs).

Tools: LogParser, evtxparse.pl

Second, look to the shellbags.  What?  That's right...look to the shellbag artifacts.  Event Logs can roll over, and sometimes quickly, depending upon activity and Event Log settings (events being audited, Event Log size, etc.).  If you suspect that a user has been creating user accounts, or if you just want to determine if that has been the case, check the shellbags artifacts, and you might see something similar to the following in the artifact listing:

Desktop\Control Panel\User Accounts
Desktop\Control Panel\User Accounts\Create Your Password
Desktop\Control Panel\User Accounts\Change Your Picture

The above listing is an extract that was pulled from the shellbags artifacts on my own system, but I should note that while investigating a system that had been compromised via Terminal Services, I parsed the shellbags artifacts for the compromised user account and found entries similar to those above, except that they indicated that a user account had been created.   The intruder had then attempted to "hide" the account from the Welcome screen by making the new account a "Special Account", but they had misspelled one of the keys in the path, so the functionality was not enabled for the account.

Tool: RegRipper shellbags.pl plugin

Note
If you have thoughts as to how to expand these "HowTo" posts, or questions regarding how to take the analysis further, please let me know. Also, if there's anything specific that you'd like to see addressed, please comment here or contact me at keydet89 at yahoo dot com.

HowTo: Correlate Files To An Application

Not long ago, I ran across a question on the ForensicFocus forum, in which the original poster (OP) said that
a number of files had been found in a user profile during an examination, and they wanted to know which application was "responsible for" these files.  There wasn't much of a description of the files (extension, content), so it wasn't as if someone could say, "oh, yeah...I know what those files are from."

There are a number of analysis techniques that you can use in an effort to determine the origin of a file.  My hope in sharing this information is to perhaps provide something you may not have seen or thought of before.  Also, I'm hoping that others will share their thoughts and experiences, as well.

What's in a name?
Some applications have a naming convention for their files.  For example, when you open MS Word and work on a document, there are temp files saved along the way while you edit the document that have a particular naming convention; using this naming convention, MS has advice for recovering lost MS Word documents.

Another example that I find to be useful is the naming convention used by digital cameras.  We see this many times when our friends post pictures to social media without changing the names of the files, and we'll recognize the naming convention of the files (i.e., file name starts with "IMG" or "DSC", or something similar) and know that the files were uploaded directly from a digital camera or smartphone.  This may also be true if the files were copied directly from the storage medium of the device to the computer system that you're examining.
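A quick check for these conventions can be scripted; the patterns below are illustrative (real cameras vary), not an exhaustive list:

```python
import re

# Common digital-camera filename conventions (illustrative, not exhaustive)
CAMERA_PATTERNS = [
    re.compile(r"^IMG_\d{4}\.(jpe?g)$", re.I),        # many cameras/smartphones
    re.compile(r"^DSC[_N]?\d{4,5}\.(jpe?g)$", re.I),  # Sony/Nikon-style
]

def looks_like_camera_file(name):
    return any(p.match(name) for p in CAMERA_PATTERNS)

print(looks_like_camera_file("IMG_0412.jpg"))   # True
print(looks_like_camera_file("report.docx"))    # False
```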

Location
Some applications will save various files in specific locations, which are not usually changed by the user.  However, in other instances, applications simply use the user or system %Temp% folder as a temporary storage location.  MS Office, as mentioned above, uses the current working directory to store its temp files, which are created (by default) at regular intervals while the application is open.  If you have an MS Word document open on your desktop, and you're editing it, you can see these files being created.

Content
Try opening the file in question in a viewer or editor of some kind.  Sometimes, a viewer like Notepad might be enough to see the contents of the file, and the file may contain contents that provide insight as to its origin.

Tip
I remember working on a case a long time ago, assisting another analyst. They'd sent me a file that contained several lines, including an IP address, and what looked like a user name and password.  I asked for the location of where the file was located on the system, but that wasn't much help to either of us.  As we dug into the examination, it turned out that the system had been subject to a SQL injection attack, and what we were looking at was an FTP batch script; we found the commands used to create the script embedded within the web server logs, and we found the file downloaded to the system, as well.

One aspect of file contents is the file signature.  File signature analysis is still in use, and most seasoned analysts are aware of the uses and limitations of this analysis technique.  However, it may be a good place to start by opening the file in a hex editor, and viewing the first 20 or so bytes of the file, comparing that to the file extension.
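A minimal signature check looks something like the following; the table is a tiny illustrative subset of well-known magic numbers:

```python
# A few well-known file signatures ("magic numbers"); illustrative subset only
SIGNATURES = {
    b"\xFF\xD8\xFF": "jpg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"%PDF": "pdf",
    b"PK\x03\x04": "zip/docx/pptx",  # OOXML files are ZIP containers
    b"MZ": "exe/dll",
}

def signature_matches(header, extension):
    """Compare the first bytes of a file against its claimed extension.
    Returns True/False on a known signature, None if unrecognized."""
    for magic, exts in SIGNATURES.items():
        if header.startswith(magic):
            return extension.lower() in exts
    return None  # unknown signature; needs a closer look

print(signature_matches(b"MZ\x90\x00", "exe"))          # True
print(signature_matches(b"\xFF\xD8\xFF\xE0", "docx"))   # False
```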

Another aspect of content is metadata.  Many file types...PDF, DOCX/PPTX, JPG, etc...have the capacity to store metadata within the file.  Metadata stays with the file, regardless of where the file goes or what the file name is changed to...as long as the format isn't modified (e.g., a .jpg file opened in MS Paint and saved in .gif format) and the file isn't otherwise manipulated, the metadata will remain.

Here's an excellent post that can provide some insight into where certain, specific files may have come from. This is a great example of how a file may be created as a result of a simple command line, rather than a full-blown GUI application.

While not specific to the contents of the file itself, look to see if the file has an associated alternate data stream.  When XP SP2 was rolled out, any file downloaded via IE or OutLook had a specific ADS associated with it, which was referred to as the file's "zoneID".  In many instances, I've seen the same sort of thing on Windows 7 systems, even though the browser was Firefox or Chrome.  If a file has an associated ADS, document the name and contents of the ADS, as it may provide a very good indication of the origin of the file, regardless of location.  Also, keep in mind that it is trivial to fake these ADSs.
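The Zone.Identifier stream itself is just a small INI-style text blob (typically "[ZoneTransfer]" followed by "ZoneId=3" for Internet-sourced files).  A simple parsing sketch, assuming you've already extracted the stream's text:

```python
def parse_zone_identifier(ads_text):
    """Parse the text of a :Zone.Identifier ADS into a dict.
    A typical stream reads:  [ZoneTransfer]\\r\\nZoneId=3"""
    fields = {}
    for line in ads_text.splitlines():
        if "=" in line:
            key, _, val = line.partition("=")
            fields[key.strip()] = val.strip()
    return fields

ZONES = {0: "Local machine", 1: "Local intranet", 2: "Trusted sites",
         3: "Internet", 4: "Restricted sites"}

info = parse_zone_identifier("[ZoneTransfer]\r\nZoneId=3\r\n")
print(ZONES[int(info["ZoneId"])])   # Internet
```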

Timelines
Timeline analysis is a fantastic analytic tool for determining where files "came from".  Timelines provide both context and granularity, and as such, can provide significant insight into what was happening on the system when the files were created (or modified).

Consider this...with just a file that you're curious about, you don't have much.  Sure, you can open the file in an editor, but what if the contents are simply a binary mess that makes no sense to you?  Okay, you check the creation date of the file, and then compare that to information you were able to pull together regarding the users logged on to the system, and you see that "cdavis" was logged on at the time in question.  What does that tell you?  I know...not a lot.  However, if you were to create a timeline of system and user activity, you would see who was logged into the system, what they were doing and possibly even additional details about what may have occurred "near" the file being created.  For example, you might have information about a user logging in and then sometime later, their UserAssist data shows that they launched an application, and this is followed by a Prefetch file being modified, which is followed by other activity, and then the file in question was created on the system.

If you're performing timeline analysis and suspect that the time stamps on the file in question may have been modified (this happens quite often, simply because it's so easy to do...), open the MFT and compare the creation date from the $FILE_NAME attribute to that of the $STANDARD_INFORMATION attribute; it may behoove you to include the $FILE_NAME attribute information in your timeline, as well.
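The $SI-versus-$FN comparison can be expressed very simply; this sketch (the function name and slack threshold are my own choices) flags the classic pattern where the easily-modified $STANDARD_INFORMATION creation time predates the $FILE_NAME creation time:

```python
from datetime import datetime, timedelta

def timestomp_indicator(si_created, fn_created, slack=timedelta(seconds=1)):
    """Flag when the $STANDARD_INFORMATION creation time predates the
    $FILE_NAME creation time by more than a small slack value -- a common
    (though not conclusive) sign of time stamp manipulation."""
    return si_created + slack < fn_created

si = datetime(2009, 3, 12, 8, 0, 0)   # $SI creation time (easily altered)
fn = datetime(2013, 7, 1, 14, 30, 0)  # $FN creation time
print(timestomp_indicator(si, fn))    # True -> examine more closely
```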

HowTo: Determine Program Execution

Sometimes during an examination, it can be important to determine what programs have been executed on a
system, and more specifically, when and by which user. Some of the artifacts on a system will provide us with indications of programs that have been executed, while others will provide information about which user launched the program, and when.  As such, some of this information can be included in a timeline.

Hopefully, something that will become evident throughout this post, as well as other HowTo posts, is that rather than focusing on individual artifacts, we're going to start putting various artifacts into "buckets" or categories.  The purpose for doing this is so that analysts don't get lost in a sea of artifacts, and are instead able to tailor their initial approach to an examination, possibly using an analysis matrix.

Okay, let's get started...

AutoStart Locations
Before we begin to look at the different artifacts that can be directly tied to a user (or not), I wanted to briefly discuss autostart locations.  These are locations within the system...file system, Registry...where references to programs can reside that allow programs to be executed automatically, without any interaction from the user beyond booting the system or logging in.  There are a number of such locations and techniques that can be used...Registry autostart locations, including the ubiquitous Run key, Windows services, the StartUp folder on the user's Program Menu, and even the use of the DLL Search Order functionality/vulnerability.  Each of these can be (and have been) discussed in multiple blog posts, so for now, I'm simply going to present them here, under this "umbrella" heading, for completeness.

Scheduled Tasks can be, and are, used as an autostart location.  Many of us may have QuickTime or iTunes installed on our system; during installation, a Scheduled Task to check for software updates is created, and we see the results of this task now and again.  Further, on Windows 7 systems, a Scheduled Task creates backups of the Software, System, Security, and SAM hive files into the C:\Windows\system32\config\RegBack folder every 10 days.  When considering autostart locations, be sure to check the Scheduled Tasks folder.

Tip
On a live system, you need to use both the schtasks.exe and at.exe commands to get a complete listing of all of the available Scheduled Tasks.

Tools: RegRipper plugins, MS/SysInternals AutoRuns; for XP/2003 Scheduled Task *.job files, jobparse.pl; on Vista+ systems, the files are XML

User
There are a number of artifacts within the user context that can indicate program execution.  This can be very useful, as it allows analysts to correlate program execution to the user context in which the program was executed.

UserAssist
The contents of value data within a user's UserAssist subkeys can provide an excellent view into what programs the user has launched via the Explorer shell...by double-clicking icons or shortcuts, as well as by navigating via the Program Menu.  Most analysts are aware that the value names are Rot-13 encoded (and hence, easily decoded), and folks like Didier Stevens have gone to great lengths to document the changes in what information is maintained within the value data, as versions of the operating systems have progressed from Windows 2000 to Windows 8.

Tools: RegRipper userassist.pl and userassist_tln.pl plugins
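Decoding the value names is a one-liner in most languages; the example below decodes a hypothetical XP-style UserAssist value name using Python's built-in ROT-13 codec:

```python
import codecs

# Hypothetical XP-era UserAssist value name; digits, backslashes, and
# punctuation pass through ROT-13 unchanged
name = "HRZR_EHACNGU:P:\\Jvaqbjf\\flfgrz32\\pnyp.rkr"
print(codecs.decode(name, "rot_13"))
# UEME_RUNPATH:C:\Windows\system32\calc.exe
```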

RunMRU
When a user clicks on the Start button on their Windows XP desktop, and then types a command into the Run box that appears, that command is added to the RunMRU key.

Interestingly, I have not found this key to be populated on Windows 7 systems, even though the key does exist.  For example, I continually use the Run box to launch tools such as RegEdit and the calculator, but when I dump the hive file and run the runmru.pl RegRipper plugin against it, I don't see any entries.  I have found the same to be true for other hives retrieved from Windows 7 systems.

Tools: RegRipper runmru.pl plugin

ComDlg32\CIDSizeMRU Values
The binary values located beneath this key appear to contain names of applications that the user recently launched.  From my experience, the majority of the content of these values, following the name of the executable file, is largely zeros, with some remnant data (possibly window position/size settings?) at the end of the value data.  As one of the values is named MRUListEx, we can not only see (via a timeline) when the most recent application was launched, but we can also see when other applications were launched by examining available VSCs.
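Since the executable name sits at the front of the value data as a NULL-terminated UTF-16LE string, extracting it is straightforward; a sketch (the sample data below is synthetic):

```python
def cidsize_exe_name(value_data):
    """Pull the leading UTF-16LE executable name from a CIDSizeMRU value.
    The name is NULL-terminated; the remainder of the data is mostly
    zeros plus a few trailing bytes of remnant data."""
    raw = value_data.decode("utf-16-le", errors="ignore")
    return raw.split("\x00", 1)[0]

# Hypothetical value data for notepad.exe
data = "notepad.exe".encode("utf-16-le") + b"\x00" * 42
print(cidsize_exe_name(data))   # notepad.exe
```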

AppCompatFlags
According to MS, the Program Compatibility Assistant is used to determine if a program needs to be run in XP Compatibility Mode.  Further, "PCA stores a list of programs for which it came up...even if no compatibility modes were applied", under the Software\Microsoft\Windows NT\CurrentVersion\AppCompatFlags\Compatibility Assistant\Persisted key in the user's NTUSER.DAT hive.  As such, we can query these values and retrieve a list of programs run by the user.

Tools: RegRipper appcompatflags.pl plugin (I updated the plugin, originally written by Brendan Cole, to include retrieving the values beneath the Persisted key, on 6 July 2013; as such, the plugin will be included in the next rollout)

MUICache
The contents of this key within the user hives (NTUSER.DAT for XP/2003, USRCLASS.DAT for Win7) often contain references to applications that were launched within the user context.  Often times, these applications will include command line interface (CLI) utilities.

Windows shortcuts/LNK files and Jump Lists
You're probably thinking..."huh?"  Most analysts are familiar with how shortcuts/LNK files (and Jump Lists) can be used to demonstrate access to files or external storage devices, but they can also be used to demonstrate program execution within the context of a user.

Most of us are familiar with the LNK files found in the ..\Windows\Recent and ..\Office\Recent folders within the user profile...so, think about how those shortcuts are created.  What usually happens is that the user double-clicks a file, the OS will read the file extension from the file, and then query the Registry to determine which application to launch in order to open the file.  Windows will then launch the application...and this is where we have program execution.

Many times when a user installs an application on their system, a desktop shortcut may be created so that the user can easily launch the application.  The presence of an icon on the desktop may indicate that the user launched an installer application.

Tools: custom Perl script, tools to parse LNK files

Java Deployment Cache Index (*.idx) Files
The beginning of 2013 saw a lot of discussion about vulnerabilities to Java, as well as reports of 0-days, and as a result, there was a small number of folks within the community looking into the use of Java deployment cache index (*.idx) files during analysis.  The use of these files as artifacts during an investigation goes back to well before then, thanks to Corey Harrell.  These files provide indications of downloads to the system via Java, and in some cases, those downloads might be malicious in nature.  These artifacts are related specifically to Java being executed, and may lead to indications of additional programs being executed.  Further, given that the path to the files is within the user profile folder, we can associate the launch of Java with a specific user context.

Tools: idxparse.pl parser

Browser History
A user's browser history not only indicates that they were browsing the web (i.e., executing the browser program), but the history can also be correlated to the *.idx files discussed above in order to determine which site they were visiting that caused Java to be launched.

System
There are a number of artifacts on the system that can provide indications of program execution.

Prefetch File Analysis
Most analysts are aware of some of the metadata found within Prefetch files.  Application prefetch files include metadata indicating when the application was last launched, as well as how many times it has been launched.  This can provide some excellent information.

Tools: pref.pl, or any other tools/scripts that parse the embedded module strings.  Recent versions of scripts I've written and use incorporate an alerting mechanism to identify items within the strings and string paths found to be "suspicious" or "unusual".

AppCompatCache
This value within the System hive in the Registry was first discussed publicly by Mandiant, and has proven to be a treasure trove of information, particularly when it comes to malware detection and determining program execution, in general.

Tools: Mandiant's shim cache parser, RegRipper appcompatcache.pl plugin (appcompatcache_tln.pl plugin outputs in TLN format, for inclusion in timelines).

Legacy_* Keys
Within the System hive, most of us are familiar with the Windows services keys.  What you may not realize is that there is another set of keys that can be very valuable when it comes to understanding when Windows services were run...the Enum\Root\Legacy_* keys.  Beneath the ControlSet00n\Enum\Root key in the System hive, there are a number of subkeys whose names begin with LEGACY_, and include the names of services.

There are a number of variants of malware (Win32/Alman.NAD, for example) that install as a service, or driver, and when launched, the operating system will create the Enum\Root\Legacy_* key for the service/driver.  Also, these keys persist after the service or driver is no longer used, or even removed from the system.  Malware writeups by AV vendors will indicate that the keys are created when the malware is run (in a sandbox), but it is more correct to say that the OS creates the key(s) automatically as a result of the execution of the malware.  This can be an important distinction, which is better addressed in another blog post.

Tools: RegRipper legacy.pl plugin

Direct* and Tracing Keys
These keys within the Software hive can provide information regarding program execution.

The "Direct*" keys are found beneath the Microsoft key, and are keys whose names start with "Direct", such as Direct3D, DirectDraw, etc.  Beneath each of these keys, you may find a MostRecentApplication key, which contains a value named Name, the data of which indicates an application that used the particular graphics functionality.  Many times during an exam, I'll see "iexplore.exe" listed in the data, but during one particular exam, I found "DVDMaker.exe" listed beneath the DirectDraw key.  In another case, I found "mmc.exe" listed beneath the same key.

I've found during exams that the Microsoft\Tracing key contains references to some applications that appear to have networking capabilities.  I do not have any references that indicate which applications are subject to tracing and appear beneath this key, but I have found references to interesting applications that were installed on systems, such as Juniper and Kiwi Syslog tools (during incident response engagements, this can be very helpful, and allow you to collect Event Log records that have since been overwritten and include them in a timeline...).  Unfortunately, these artifacts contain nothing more than the EXE name (no path or other information is included or available), but adding the information to a timeline can provide a bit of context and granularity for analysis.

Tip
When examining these and other keys, do not forget to check the corresponding key beneath the Wow6432Node key within the Software hive.  The RegRipper plugins address this automatically.

Tools: RegRipper direct.pl and tracing.pl plugins

Event Logs

Service Control Manager events within the System Event Log, particularly those with event IDs 7035 and 7036, provide indications of services that were successfully sent controls, for either starting or stopping the service.  Most often within the System Event Log, you'll see these types of events clustered around a system start or shutdown.  During DFIR analysis, you're likely going to be interested in either oddly named services, services that only appear recently, or services that are started well after a boot or system startup.  Also, you may want to pay close attention to services such as "PSExeSvc", "XCmdSvc", "RCmdSvc", and "AtSvc", as they may indicate lateral movement within the infrastructure.

On Windows 2008 R2 systems, I've found indications of program execution in the Application Experience Event Logs; specifically, I was examining a system that had been compromised via an easily-guessed Terminal Services password, and one of the intruders had installed Havij (and other tools) on the system.  The Application-Experience/Program-Inventory Event Log contained a number of events associated with program installation (event IDs 903 and 904), application updates (event ID 905), and application removal (event IDs 907 and 908).  While this doesn't provide a direct indication of a program executing, it does illustrate that the program was installed, and that an installer of some kind was run.

On my own Windows 7 system, I can open the Event Viewer, navigate to the Event Log, and view the records that illustrate when I have installed various programs knowingly (FTK Imager) and unknowingly (Google+ Chat).  There are even a number of application updates to things like my ActiveState Perl and Python installations.

Tools: LogParser, evtxparse.pl

Other Indirect Artifacts
Many times, we may be able to determine program execution through the use of indirect artifacts, particularly those that persist well after the application has finished executing, or even been deleted.  Many of the artifacts that we've discussed are, in fact, indirect artifacts, but there may still be others available, depending upon the program that was executed.

A number of years ago, I was...and I don't like to admit this...certified to perform PCI forensic audits.  On one case, I ran into my first instance of a RAM scraper...this was a bit of malware that was installed on a point-of-sale (POS) back office server (running Windows) as a Windows service.  After the system was booted, this instance of the malware would read the contents of a register, do some math, and use that value as a seed to wait a random amount of time before waking up and dumping the virtual memory from one of eight named processes (the names were listed in the executable file).  The next step was to parse the memory dump for track data, and this was accomplished via a Perl script that had been "compiled" via Perl2Exe.  I'm somewhat familiar with such executables, and one of the artifacts we used to validate our findings with respect to the actual execution of the malicious code was the temporary directories created by the "compiled" script.  When executables "compiled" with Perl2Exe are run, any of the Perl modules (including the runtime) packed into the executable are extracted as DLLs into a temporary directory, at which time they are "available" to the running code.  As the code was launched by a Windows service, the "temp" directories were found in the C:\Windows\Temp folder.  The interesting thing that we found was that the temp directories used to hold the modules/DLLs are not deleted after the code completes, and they persist even if the program itself is removed from the system.  In short, we had a pretty good timeline for each time the parsing code was launched.

On my own Windows 7 system, because I run a number of Perl scripts that were "compiled" with Perl2Exe within the context of my user account, the temp directories are found in the path C:\Users\harlan\AppData\Local\Temp...the subdirectories themselves are named "p2xtmp-" followed by an integer, and themselves contain subdirectories that represent the Perl runtime namespace.  The time stamps (creation dates) for these subdirectories provide indications of when I executed scripts that had been "compiled" via Perl2Exe.
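Enumerating those directories and their time stamps is trivial to script; a sketch (the function name is mine; note that st_ctime reflects creation time on Windows/NTFS):

```python
import time
from pathlib import Path

def p2x_run_times(temp_dir):
    """List p2xtmp-* directories left behind by Perl2Exe-'compiled'
    executables, with their creation times (st_ctime is creation time
    on Windows)."""
    hits = []
    for d in Path(temp_dir).glob("p2xtmp-*"):
        if d.is_dir():
            hits.append((d.name, time.ctime(d.stat().st_ctime)))
    return sorted(hits)

# e.g., p2x_run_times(r"C:\Users\harlan\AppData\Local\Temp")
```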

Memory Dumps
During dead box analysis, memory dumps can be an excellent source of information.  When an application crashes, a memory dump is created, and a log file containing information (including a process list) is also created.  When another application crash occurs, the memory dump is overwritten, but the log file is appended to, meaning that you can have a number of crash events available for analysis.  I have found this historical information to be very useful during examinations because, while the information is somewhat limited, it can illustrate whether or not a program was running at some point in the past.

We're not going to discuss hibernation files here, as once you access a hibernation file and begin analysis, there really is very little difference between analyzing the hibernation file and analyzing a memory dump for a live system.  Many of the techniques that you'd use, and the artifacts that you would look for, are pretty much the same.

Tools: text viewer

Malware Detection
Another use of this artifact category is that it can be extremely valuable in detecting the presence of malware on a system.  However, malware detection is a topic that is best addressed in another post, as there is simply too much information to limit the topic to just a portion of a blog post.

Resources
This idea of determining program execution has been discussed before:
Timeline Analysis, and Program Execution
There Are Four Lights: Program Execution



HowTo: Determine User Access To Files


Sometimes during an examination, it is important for the analyst to determine files that the user may have accessed, or at least had knowledge of.  There are a number of artifacts that can be used to determine which files a user accessed.  As of Vista, the Windows operating systems, by default, do not update the last accessed times on files when normal system activity occurs, so some other method of determining the user (or process) that accessed files, and when, needs to be developed.  Fortunately, Windows systems maintain a good deal of information regarding files users have accessed, and some artifacts may be of value, particularly in the face of the use of anti- or counter-forensics techniques.

Files on the Desktop
By default, Windows does not normally place files on a user's desktop.  Most often, installation programs provide an option for putting a shortcut to the application on the desktop, but files themselves usually end up on the desktop as a result of direct and explicit actions taken by the user.  The presence of the files themselves on the desktop can be correlated with other artifacts in order to determine when the user may have accessed those files.

Recycle Bin
Clearly, if a user deleted files, they had knowledge of them, and accessed them to the point that they deleted those files.  However, keep in mind that this is not definitive...for example, if the user uses the Shift+Del key combination, or the NukeOnDelete value has been set, the files will bypass the Recycle Bin.  As such, an empty Recycle Bin should not lead to the assumption that the user took the explicit action to empty it.  It's trivial to check artifacts that can significantly impact your findings, so do so.

Tools: Various, including custom Perl script (recbin.pl)

LNK Files/Jump Lists
Most analysts are familiar with Windows shortcut/LNK files, particularly as a means of demonstrating user knowledge of and access to files.  LNK files within the user's Windows\Recent and Office\Recent folders are created automatically when the user double-clicks the files (and the application to view the file is launched), and will contain information about the location of the file, etc.

AutomaticDestinations Jump Lists are similarly created automatically by the operating system, as a result of user actions.  These files consist of a compound binary format, containing two types of streams...LNK streams, and a DestList stream, both of which are documented and easily parsed.  The DestList stream serves the purpose of an MRU list.

Browser History
Browser history, in general, illustrates access to pages found on the Internet, but can also provide information about other files that a user has accessed.  Browser history, particularly for IE, can contain file:/// entries, indicating access to local, rather than web-based, resources.

Tip
If you find that the LocalService, NetworkService, or Default User accounts have an IE browser history (i.e., the index.dat is populated...), this may be an indication of a process running with System level privileges that is accessing the Internet via the WinInet API.

When it comes to browser history, you will also want to look at file downloads.  Beginning with Windows XP SP2, files downloaded via IE or as OutLook attachments had a Zone Identifier NTFS alternate data stream attached to them by the operating system.  I have seen this, as well, on Windows 7 systems, but irrespective of the browser used.

To illustrate this, open a command prompt (should open to your user profile directory) and type the following command:

dir /s /r * | find "$data" /i | more

This can be very revealing.  Of course, it doesn't illustrate where the files are located, as you're applying a filter to the output of 'dir' and only showing specific lines, but it can provide a good indication of files that you've downloaded.  Also, I'd be very wary of any Zone Identifier ADSs that are more than 26 or 28 bytes in size.

Prefetch Files
It's common knowledge amongst analysts that application prefetch files provide information regarding when an application was run, and how many times that application was run.  These files also contain module paths, which are essentially embedded strings that point to various files, usually DLLs.  Many times, however, the module strings may point to other files.  For example, an application prefetch file for IE will include strings pointing to the index.dat files within the user profile (for the user who launched it).  The application prefetch file for sms.exe in Lance Mueller's first practical contains a path to "del10.bat", as well as a path to the sms.exe file itself, both of which are very telling.  I've seen application prefetch files for keystroke loggers that have contained the path to the file where the keystrokes are recorded, which just goes to show how useful these files can sometimes be, and that there's a great deal of information that can be pulled from these files.  Application prefetch files can be tied to applications launched by a particular user (i.e., correlate the last run time embedded within the file to program execution information for the user), which can then provide information regarding how the files were accessed.

Tools: Custom Perl script (pref.pl)
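To illustrate pulling the last run time and run count from a prefetch file, here's a hedged Python sketch; the offsets shown are those commonly documented for Windows XP/2003 (version 0x11) prefetch files, and should be treated as assumptions to verify against known-good test data, since Vista and Windows 7 store these fields at different offsets:

```python
import struct
from datetime import datetime, timedelta, timezone

# Assumed offsets for Windows XP/2003 (.pf version 0x11) prefetch files;
# Vista/Win7 (version 0x17) files store these fields elsewhere.
XP_LAST_RUN_OFF = 0x78   # 64-bit FILETIME
XP_RUN_COUNT_OFF = 0x90  # 32-bit run count

def filetime_to_dt(ft: int) -> datetime:
    # FILETIME counts 100-ns intervals since 1601-01-01 UTC
    return datetime(1601, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=ft // 10)

def parse_pf_header(data: bytes):
    if data[4:8] != b"SCCA":  # prefetch file signature
        raise ValueError("not a prefetch file")
    last_run = struct.unpack_from("<Q", data, XP_LAST_RUN_OFF)[0]
    run_count = struct.unpack_from("<I", data, XP_RUN_COUNT_OFF)[0]
    return filetime_to_dt(last_run), run_count
```

The module paths discussed above live further into the file and require version-specific parsing, which is omitted here.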

Registry
The Registry is well-known for being a veritable treasure trove of information, particularly when combined with other analysis techniques (timeline analysis, parsing VSCs, etc.).  While not everything can be found in the Registry (for example, Windows does not maintain records of files copied...), there is a great deal that can be found within the Registry.

RecentDocs
Most of us are familiar with the RecentDocs key within the user hive.  This is one of the classic MRU keys; the key itself and all of its subkeys contain values, and on Windows 7 systems, one of those values is named MRUListEx and contains the MRU order of the other values.  The other values beneath each key are numbered, and the data is in a binary format that contains the name of the file accessed, as well as the name of an associated shortcut/LNK file.

Each of the subkeys beneath this key is named for a file extension, and as such, these subkeys not only provide information about which files the user may have accessed, but also which applications the user may have had installed.

A means for determining the possible use of counter-forensics techniques is to compare the list of value names against the contents of the MRUListEx value; numbers in this value that do not have corresponding value names may indicate attempts to delete individual values.

Tools: RegRipper recentdocs.pl plugin
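The comparison described above can be sketched in a few lines of Python; the MRUListEx format (4-byte little-endian integers terminated by 0xFFFFFFFF) is as described in the text, while the function names are illustrative:

```python
import struct

def parse_mrulistex(data: bytes):
    """MRUListEx is a run of 4-byte little-endian ints, most recently
    used first, terminated by 0xFFFFFFFF."""
    order = []
    for (n,) in struct.iter_unpack("<I", data):
        if n == 0xFFFFFFFF:
            break
        order.append(n)
    return order

def possibly_deleted(mrulistex: bytes, value_names):
    # Numbers referenced in MRUListEx with no matching numbered value
    # may indicate individually deleted entries.
    present = {int(v) for v in value_names if v.isdigit()}
    return [n for n in parse_mrulistex(mrulistex) if n not in present]
```

For example, an MRUListEx of 2, 0, 1 alongside values named only "0" and "2" would flag entry 1 as possibly deleted.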

ComDlg32
The user's ComDlg32 key contains information related to common dialogs used within Windows, and can provide specific indications regarding how the user interacted with the files in question.

Some of the subkeys beneath the ComDlg32 key are...

CIDSizeMRU
As described in a previous post, this key provides indications of program execution.

LastVisitedPidlMRU and LastVisitedPidlMRULegacy
These keys contain MRUListEx values, as well as numbered values.  The numbered values contain the name of the executable from which the common dialog was launched, followed by a shell item ID list of the path that was accessed.
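A minimal sketch of pulling the executable name out of one of these numbered values, assuming (per the description above) that the name is stored as a null-terminated UTF-16LE string at the start of the data; parsing the shell item ID list that follows is left out:

```python
def exe_name(value_data: bytes) -> str:
    """Return the leading null-terminated UTF-16LE executable name."""
    for i in range(0, len(value_data) - 1, 2):
        if value_data[i:i + 2] == b"\x00\x00":  # UTF-16 terminator
            return value_data[:i].decode("utf-16-le")
    return value_data.decode("utf-16-le", errors="replace")
```

The bytes after the terminator are the shell item ID list, which would be handed off to a shell item parser.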

OpenSavePidlMRU
This key is found on Vista+ systems, and corresponds to the OpenSaveMRU key found on XP systems.  The subkeys beneath this key correspond to the extensions of opened or saved files.  Each of the numbered values beneath the subkeys consist of shell item ID lists, and there's an MRUListEx value that provides the MRU functionality for each key.

On Windows XP, value data beneath the ComDlg32 subkeys consist of ASCII strings, whereas on Vista and Windows 7, the value data consists of shell items.  This means that in some cases, those data structures will contain time stamps.  However, to be of use during an examination, the analyst needs to understand how those time stamps are created, and maintained.

Tools: RegRipper comdlg32.pl plugin

MS Office File/Place MRU Values
Each of the applications within MS Office 2010 maintains an MRU list of not only files accessed, but places from which files have been accessed (in separate keys).  In addition to the paths to the files or folders, respectively, the value string data contain entries that look like, "[T01CD76253F25ECD0]", which is a string representation of a 64-bit FILETIME time stamp.  As such, these keys aren't MRU keys in the more traditional sense of having an MRUList or MRUListEx value.

Tools: RegRipper office2010.pl plugin
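Decoding the "[T...]" token described above is straightforward; this sketch treats the 16 hex digits as a 64-bit FILETIME (the surrounding value string in the usage note is hypothetical):

```python
import re
from datetime import datetime, timedelta, timezone

def office_mru_time(value: str):
    """Decode the [T<16 hex digits>] token as a 64-bit FILETIME."""
    m = re.search(r"\[T([0-9A-Fa-f]{16})\]", value)
    if not m:
        return None
    ft = int(m.group(1), 16)
    # FILETIME: 100-ns intervals since 1601-01-01 UTC
    return datetime(1601, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=ft // 10)
```

For example, the "[T01CD76253F25ECD0]" token shown above decodes to a date in mid-2012.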

TrustRecords
A while back, Andrew Case pointed out this interesting artifact.  When an MS Office document is downloaded or accessed from the network, there's a yellow bar that appears at the top of the application window, just below the menu bar.  Most often, this bar contains a button, which the user must click in order to enable editing of the document.  Within the TrustRecords key, the value names are the paths and names of the files accessed, and the first 8 bytes of the binary data appears to correlate to the time when the user clicked the "Enable Editing" button.

Tools: RegRipper trustrecords.pl plugin
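Extracting that time stamp is a one-liner once the value data is in hand; a sketch, assuming (as noted above, and hedged the same way) that the first 8 bytes hold a little-endian FILETIME:

```python
import struct
from datetime import datetime, timedelta, timezone

def trustrecord_time(value_data: bytes) -> datetime:
    # The first 8 bytes of a TrustRecords value appear to hold a
    # little-endian FILETIME recording when "Enable Editing" was clicked.
    (ft,) = struct.unpack_from("<Q", value_data, 0)
    return datetime(1601, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=ft // 10)
```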

Application-specific MRUs
A number of file viewers (Adobe Reader, MS Paint, etc.) maintain their own MRU lists.  Most often when interacting with the application, if you click on File in the menu bar of the app, the drop-down menu will contain (usually toward the bottom) a list of recently accessed files.  Many times, that information can be found in the Registry.

Tools:  RegRipper applets.pl and adoberdr.pl plugins

On Windows 8, the Photos key in the user's USRCLASS.DAT hive is used to track photos opened via the Photos app on the Windows 8 desktop (many thanks to Jason Hale for sharing his research on this topic).

Tools: RegRipper photos.pl plugin

ShellBags
Ah, there they are again!  Shellbags...can shellbags artifacts be used to help determine files that a user may have accessed?  Yes, they can.  On Windows XP, correlating the shellbags artifacts includes mapping the NodeSlot value in the BagMRU key to the Bags keys, where in some cases you may find what amounts to a directory listing of files; this does not appear to be the case on Windows 7.  However, the shellbags artifacts can illustrate directories that the user accessed, both locally and on network resources, and can therefore provide some information regarding files that the user may have accessed.

In addition to folders on local and removable drives, shellbags artifacts can also illustrate access to mapped network drives, Internet resources (i.e., accessing FTP sites via Windows Explorer), as well as to zipped archives.  Most of us don't usually think twice when we click on a zipped archive and it opens up just like a regular folder, in an Explorer window...however, it is important to note that this does appear as a shellbag artifact.

Tools: RegRipper shellbags.pl plugin

TypedPaths
Similar to the shellbags artifacts, the TypedPaths key in the user's NTUSER.DAT hive maintains a list of folders that the user accessed; however, for this artifact, the paths were typed into the Windows Explorer Address Bar.

Users can also disable this feature, so if you find no values in the TypedPaths key, check for the AutoSuggest value.

Tools: RegRipper typedpaths.pl plugin

Searches
With Windows XP, user search terms were maintained in the ACMru key; on Windows 7, they're found in the WordWheelQuery key.  These values can provide information regarding what the user was looking for, and provide some indications of files they may have accessed (often determined through timeline analysis).

Tools: RegRipper acmru.pl and wordwheelquery.pl plugins

A good deal of the information that provides indications of a user's access to files has time stamps associated with it, and as such, can be included in a timeline.  This timeline can provide context and granularity to your analysis, particularly when correlated with other artifacts.

HowTo: Track Lateral Movement

A reader recently commented and asked that the topic of scoping an incident and tracking lateral movement be addressed.  I've performed incident response for some time and been involved in a wide variety of cases, so I thought I'd present something about the types of lateral movement I've encountered and how these types of cases were scoped. Some types of lateral movement are easier to scope than others, so YMMV.

When there's been lateral movement there are usually two systems involved; system A (the source) and system B (the destination).  Throughout this post, I'll try to point out which system the artifacts would appear on, as this can be a very important distinction.

SQL Injection
SQL injection (SQLi) is an interesting beast, as this exploit, if successful, allows an unprivileged user to access a system on your network, often with System-level privileges.  Not all SQLi cases necessarily involve lateral movement, specifically if the web and database servers are on the same system.  However, I have been involved in cases in which the web server was in the DMZ, but the database server was situated on the internal infrastructure.

After gaining access to a system via this type of exploit, the next thing we tended to see was that a reverse shell tool was downloaded and launched on the system, providing shell-based access to the attacker.  Very often, this can be achieved through the use of a modified version of VNC, of which there are several variants (OpenVNC, TightVNC, RealVNC, etc.). It was usually at this point that the intruder was able to orient themselves, perform recon, and then 'hop' to other systems, as necessary.

Tools: Editor, coding skillz

Note
I remember in one case, an intruder had installed a reverse shell on a system (we found this later during our investigation) and had gone undetected until they found that they were on the system at the same time as one of the admins...at which point, they opened up Notepad and typed a message to the admin. It was only after this event that we were called. ;-)

Terminal Services
I was once engaged in a case where an employee's home system had been compromised, and a keystroke logger installed.  The intruder found that the user used their home system to access their employer's infrastructure via Terminal Services, and took advantage of this to use the stolen credentials to access the infrastructure themselves.  The access was via RDP, and after initial access to the infrastructure, the intruder continued to use RDP to access other systems.  Further, none of the systems that the intruder logged into had previously been accessed with those credentials.  As such, it was a simple matter to examine a few artifacts on each of the "compromised" systems in turn, and then to identify other systems on which the user profile was found.

The systems that we dealt with were a mix of Windows XP, and Windows 2000 and 2003 for servers.  As such, the artifacts we were interested in were found in the user profile's NTUSER.DAT hive file.  If the workstation systems had been Windows 7 systems, we would have included Jump Lists (specifically, those with the AppID specific to Terminal Services) in our examination.

The system A artifacts would include Registry values, and for Windows 7 systems, Jump Lists.  Depending upon the actions the user took (i.e., double-clicking a file), there may also be shortcuts/LNK files that point to the remote system.

System B artifacts would include logins found in the Security Event Log...on Win2003 systems, look for event ID 528 type 10 logins.  On Win2008 R2, look for event ID 4624 events, as well as TerminalServices-LocalSessionManager events with ID 21.

Tools: RegRipper tsclient.pl plugin, Jump List parser (system A), Logparser (system B)
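If the Security Event Log records have been exported to XML (e.g., via wevtutil), a short Python sketch can filter for the system B artifacts described above; the namespace is the standard Windows event schema, while the function name and sample fields are illustrative:

```python
import xml.etree.ElementTree as ET

NS = "{http://schemas.microsoft.com/win/2004/08/events/event}"

def rdp_logons(xml_text: str):
    """Yield (target user, source IP) for 4624 events with LogonType 10."""
    root = ET.fromstring(xml_text)
    events = [root] if root.tag == NS + "Event" else root.iter(NS + "Event")
    for event in events:
        if event.findtext(f".//{NS}EventID") != "4624":
            continue
        data = {d.get("Name"): d.text for d in event.iter(NS + "Data")}
        if data.get("LogonType") == "10":
            yield data.get("TargetUserName"), data.get("IpAddress")
```

The same pattern extends to the TerminalServices-LocalSessionManager ID 21 events mentioned above, with the field names adjusted accordingly.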

Mapping Shares
While the interaction with shares is somewhat limited, it can still be seen as a form of "lateral movement".  Shares mapped via the Map Network Drive wizard appear within the user's Registry hive, in the Map Network Drive MRU key, on system A.

On system B, the connection would appear as a login, accessing resources in the Event Log.

As Event Logs tend to rotate, the artifacts on system A may persist for much longer than those on system B.

Note
It is important to make the distinction between GUI and CLI artifacts. Many of the artifacts that we see in the user's hive that are associated with accessing other systems are the result of the user interacting via the Windows Explorer shell, which is why the path where they can be found is Windows\CurrentVersion\Explorer.  Access via CLI tools such as mapping/accessing a remote share via net.exe does not produce a similar set of artifacts.

Tools: RegRipper mndmru.pl plugin (system A); Logparser (system B)

Shellbags
It's funny that when I sit down to outline some of these HowTo blog posts, I start by typing in subheaders (like this one, which I then italicize) for topics to include in the overall subject of the post.  It's interesting that the shellbags artifacts tend to appear in so many of these posts!  Similar to mapping shares, these artifacts can provide indications of access to other systems.  There are specific top-level shell items that indicate network-based resources, and will illustrate (on system A) a user accessing those remote resources.

Tools: RegRipper shellbags.pl plugin (system A)

Scheduled Tasks
Scheduled Tasks can easily be created through the use of one of two CLI tools on systems:  schtasks.exe and at.exe.  Both tools utilize switches for creating scheduled tasks on remote systems.  On system A, you may find indications of the use of these CLI applications in the Prefetch files or other artifacts of program execution.  On system B, you may find indications of a service being started in the System Event Log (event ID 7035/7036), and on WinXP and 2003 systems, you may find indications of the task being run in the SchedLgU.txt file (although this file, like the Event Logs, tends to roll-over...).  On Windows 2008 R2, you should look to the Microsoft-Windows-TaskScheduler/Operational Event Log...event ID 102 indicates a completed task, and event ID 129 indicates that a task process was created.

PSExec
PSExec and other similar services (I've seen rcmd and xcmd used) can be used to execute processes on remote systems.  The artifacts you would look for would be similar to those for Scheduled Tasks, with the exception of the specific event records on system B.

Artifacts on system A might include Prefetch files and other artifacts of program execution.

Artifacts on system B might include System Event Log entries, specifically those with event source Service Control Manager and event IDs of 7035 (service sent a start command) and 7036 (service entered a running state).

Tools: LogParser, TZWorks evtwalk or evtx_view (system B)

Testing, and Artifacts
In order to see for yourself what these artifacts "look like", try running these tools on your own.  You can do so fairly easily by setting up a virtual system and using any of these methods to access the "remote" system.

Programming and DFIR

I was browsing through an online list recently and I came across an older post that I'd written, that had to do with tools.  In it, I'd made the statement, "Tweaked my browser history parser to add other available data to the events, giving me additional context."  This brought to mind just how valuable even the smallest modicum of programming skill can be to an analyst.

This statement takes understanding data structures a step further because we're not simply recognizing that, say, a particular data structure contains a time stamp.  In this case, we're modifying code to meet the needs of a specific task.  However, simply understanding basic programming principles can be a very valuable skill for DFIR work, in general, as the foundational concepts behind programming teach us a lot about scoping, and programming in practice allows us to move into task automation and eventually code customization.

Note
David Cowen has been doing very well on his own blog-a-day-for-a-year challenge, and recently posted a blog regarding some DFIR analyst milestones that he outlined. In this post, David mentions that milestone 11 includes "basic programming".  This could include batch file programming, which is still alive and well, and extremely valuable...just ask Corey Harrell.  Corey's done some great things, such as automating exploiting VSCs, through batch files.

Scoping
My programming background goes back to the early '80s, programming BASIC on the Timex-Sinclair 1000 and Apple IIe.  In high school, I learned some basic Pascal on the TRS-80, and then in college, moved on to BASIC on the same platform.  Then in graduate school, I picked up some C (one course), some M68K assembly, and a LOT of Java and MatLab, to the point that I used both in my thesis.  This may seem like a lot, but none of it was really very extensive.  For example, when I was programming BASIC in college, my programs included one that displayed the Punisher skull on the screen and played the "Peter Gunn theme" in the background, and another that interfaced with a temperature sensor to display fluctuations on the screen.  In graduate school, the C programming course required as part of the MSEE curriculum really didn't have us do much more than open, write to or read from, and then close a file.  Some of the MatLab stuff was a bit more extensive, as we used it in linear algebra, digital signal processing, and neural network courses.  But we weren't doing DFIR work, nor anything close to it.

The result of this is not that I became an expert programmer...rather, take a look at something David said in a recent blog post, specifically that an understanding of programming helps you put your goals into perspective and reduce the scope of the problem you are trying to solve.  This is the single most valuable aspect of programming experience...being able to look at the goals of a case, and break them down into compartmentalized, achievable tasks.  Far too many times, I have seen analysts simply overwhelmed by goals such as, "Find all bad stuff", and even when going back to the customer to get clarification as to what the goals of the case should be, they are still unable to compartmentalize the tasks necessary to complete the examination.

Task Automation
There's a lot that we do that is repetitive...not just in a single case, but if you really sit down and think about the things you do during a typical exam, I'm sure that you'll come across tasks that you perform over and over again.  One of the questions I've heard at conferences, as well as while conducting training courses, is, "How do I fully exploit VSCs?"  My response to that is usually, "what do you want to do?"  If your goal is to run all the tools that you ran against the base portion of the image against the available VSCs, then you should consider taking a look at what Corey did early in 2012...as far as I can see, and from my experience, batch scripting such as this is still one of the most effective means of automating tasks such as this, and there is a LOT of information and sample code freely available on the Interwebs for automating an almost infinite number of tasks.

If batch scripting doesn't provide the necessary flexibility, there are scripting languages (Python, Perl) that might be more suitable, and there are a number of folks in the DFIR community with varying levels of experience using these languages...so don't be afraid to reach out for assistance.

Code Customization
There's a good deal of open source code out there that allows us to do the things we do.  In other cases, a tool that we use may not be open source, but we do have open source code that allows us to manipulate the output of the tool into a format that is more useful, and more easily incorporated into our analysis process.  Going back to the intro paragraph to this post, sometimes we may need to tweak some code, even if it's to simply change one small portion of the output from a decimal to hex when displaying a number.  Understanding some basic coding lets us not only be able to see what a tool is doing, but it also allows us to adjust that code when necessary.

Being able to customize code as needed also means that we can complete our analysis tasks in a much more thorough and timely manner.  After all, for "heavy lifting", or highly repetitive tasks, why not let the computer do most of the work?  Computers are really good at doing the same thing, over and over again, really fast...so why not take advantage of that?

Summary
While there is no requirement within the DFIR community (at large) to be able to write code, programming principles can go a long way toward developing our individual skills, as well as developing each of us into better analysts.  My advice to you is:

Don't be overwhelmed when you see code...try opening the code in a text viewer and just reading it.  Sure, you may not understand Perl or C or Python, but most times, you don't need to understand the actual code to figure out what it's doing.

Don't be afraid to reach out for help and ask a question.  Have a question about some code?  Reach out to the author.  Many times, folks crowdsource their questions, reaching to the "community" as a whole, and that may work for some.  However, I've had much better success by reaching directly to the coder...I can usually find their contact info in the headers of the code they wrote.  Who better to answer a question about some code than the person who wrote it?

Don't be afraid to ask for assistance in writing or modifying code.  From the very beginning (circa 2008), I've had a standing offer to modify RegRipper plugins or create custom plugins...all you gotta do is ask (provide a concise description of what's needed, and perhaps some sample data...).  That's it.  I've found that in most cases, getting an update/modification is as simple as asking.

Make the effort to learn some basic coding, even if it's batch scripting.  Program flow control structures are pretty consistent...a for loop is a for loop.  Just understanding programming can be so much more valuable than simply allowing you to write a program.

HowTo: Malware Detection, pt I

Many times we'll come across a case where we need to determine the presence of malware on a system.  As many of us are aware, AV products don't always work the way we hope they would...they don't provide us with 100% coverage and detect everything that could possibly affect our systems.

This post is NOT about malware analysis.  This post addresses malware detection during dead box analysis.  Malware detection is pretty expansive, so to really address the topic, I'm going to spread this one out across several blog posts.

Malware Detection
Malware detection during dead box analysis can be really easy, or it can be very hard.  I say this because we can mount an image as a read-only volume and run several (or more) AV scanners against the volume, and keep track of all the malware found.  Or, we can run several AV scanners against the volume, and they will all find nothing - but does that mean that there isn't any malware on the system?

This post is the first of several that I will write, in an attempt to fully address this issue.

Characteristics
Before we start digging into the guts of detecting malware during dead box analysis, it is important to understand the four characteristics of malware, specifically the initial infection vector, the propagation mechanism, the persistence mechanism, and artifacts of the malware.  I originally developed these characteristics as a way of helping new analysts develop a confident, professional demeanor when engaging with customers; rather than reacting like a deer in the headlights, my hope was to help these new analysts understand malware itself to the point where they could respond to a customer request in a professional and confident manner.  Understanding and applying these characteristics enables analysts to understand, detect, locate, and mitigate malware within an infrastructure.

Initial Infection Vector
The initial infection vector (or IIV) refers to how the malware originally made its way on to the system.  Worms like SQL Slammer took advantage of poor configuration of systems; other malware gets on to systems as a result of exploiting vulnerabilities in browsers.

Understanding malware IIV mechanisms not only provides us with a starting point for beginning our investigation, but also allows analysts to go from "I think it got on the system..." or "...if I had written the malware...", to actually being able to demonstrate, through empirical data, how the malware ended up on the system.  Too many times, the IIV is assumed, and that assumption is passed on to the customer, who uses that information to make critical business decisions, possibly even validating (or invalidating) compliance.

Also, keep in mind that many of the more recent malware samples appear on systems as a result of a multi-stage delivery mechanism.  For example, a user may open an email attachment, and the document will contain embedded malicious code that exploits a vulnerability in the target application and may reach out to a server for additional instructions; the second stage then reaches out to another server and downloads the actual malware.  As such, the malware does not simply appear on the system, and the IIV is actually much more complex than one might expect.

Determining the IIV can be an important factor in a number of exams.  For example, PCI exams require that the analyst determine the window of compromise, which is essentially the time from when the system was first compromised or infected, to when the incident was discovered and the system taken offline.  While this is done for fairly obvious reasons, other non-PCI cases ultimately have similar requirements.  Being able to accurately determine when the system was first infected or compromised can be a critical part of an exam, and as such should not be left to speculation and guesswork, particularly when this can be determined through the use of well-thought-out processes.

Propagation Mechanism
This characteristic refers to how the malware moves between systems, if it does.  Some malware doesn't move between systems on its own...instead, it infects one system and doesn't move on to other systems.  For example, RAM scrapers found during PCI cases don't infect one system and then propagate to another...rather, the malware is usually placed on specific systems by an intruder who has unrestricted access to the entire infrastructure.

Some malware will specifically infect removable devices as a means of propagation.  Worms are known primarily for their ability to propagate via the network.  Other malware is known to infect network shares, in the hopes that by infecting files on network shares, the malware will spread through the infrastructure as users access the infected files.

It's important to note that the malware propagation mechanism may be the same as the IIV, but analysts should not assume that this is the case.  Some malware may get onto a system within an infrastructure as a result of a spear-phishing campaign, and once on the internal infrastructure, propagate via network or removable drives.

Artifacts
According to Jesse Kornblum's Rootkit Paradox, rootkits want to remain hidden, and they want to run.  The paradox exists in the fact that by running, there will be ways to detect rootkits, even though they want to remain hidden.  The same is true with malware in general, although malware authors are not as interested in remaining hidden as rootkit authors.  However, the fact is that as malware interacts with its environment, it will leave artifacts.

Self-Inflicted Artifacts
As malware interacts with its environment, artifacts will be created.  Some of these artifacts may be extremely transient, while others, being created by the environment itself, may be much more persistent.

One of the issues I've seen over the years when AV vendors have produced technical reports regarding malware is that these reports include a number of self-inflicted artifacts; that is, artifacts created as a result of how the malware is launched in the testing environment.  One of the best examples of this occurs when the IIV of a malware sample is a multi-stage delivery mechanism, and the analyst only has a copy of the executable delivered in the final stage.  When this occurs, the malware report will contain artifacts of the analyst launching the malware, which will not show up on a system that was infected in the wild.

Looking for malware artifacts is a lot like using those "expert eyes" that Chris Pogue talks about in his Sniper Forensics presentations.  When malware executes, it interacts with its environment, and the great thing about the Windows environment is that it records a lot of stuff.  In that way, it's a lot like using 'expert eyes' to look for deer in Manassas Battlefield Park...as deer move through the park, they leave signs or 'artifacts' of their presence.  They're grass eaters, and they leave scat that is different from foxes, dogs, and bears (as well as from other grass eaters, like our horses).  They leave tracks, whether it's in the mud, sand, soft dirt or snow.  When they move through tall grass, they leave trails that are easily visible from horseback.  When they lay down for the night, they leave matted grass.  Sometimes the bucks will leave velvet from their horns, or if a couple of bucks are sparring, you may actually find a horn or two (we have a couple hanging in the barn).  My point is that I don't have to actually see a deer standing in a field or along a stream to know that there are deer in the area, as I can see their "artifacts".  The same thing is true for a lot of the malware out there...we just need to know what artifacts to look for, and how to look for them.  This is how we're able to detect malware during dead box analysis, particularly when that malware is not detected by AV scanners.

Some of the more popular locations where I've found artifacts include AV logs, and on Windows XP systems in particular, within the Windows firewall configuration.  I've seen a number of instances where malware has been detected by AV, but according to the logs, the AV was configured to take no action.  As such, as strange as it may seem, the malware infection is clearly visible in the AV logs.  In another instance, I was reviewing the on-demand scan AV logs and found an instance of malware being detected.  A closer examination of the system indicated that malware had originally been created on the system less than two days prior to the log entries, and I was able to verify that the malware had indeed been quarantined and then deleted.  About six weeks later, a new variant of the malware, with the same name, was deposited on the system, and throughout the course of the timeline, none of the subsequent scans detected this new variant.  Previously-detected malware can provide valuable clues of an intruder attempting to get their malware on to a system.

In other instances, the malware infection process creates a rule in the Windows firewall configuration to allow itself out onto the network.  Finding a reference to a suspicious executable in the firewall configuration can point directly to the malware.

In another instance, the "malware" we found consisted of three components...a Windows service, the "compiled" Perl script, and a utility that would shuttle the collected track data off of the system to a waiting server.  The first time that the malware components were installed on the system, within two days, an on-demand AV scan detected one of the components and removed it from the system (as indicated by the AV logs and subsequent lack of Perl2Exe temp folders).  Several weeks later, the intruder installed a new set of components, which ran happily...the installed AV no longer detected the first component.  This can happen a lot, and even when the system has been acquired, AV scans may not detect the malware within the mounted image.  As such, we very often have to rely on other artifacts...unusual entries in autostart locations, indirect artifacts, and the like.

The use of artifacts to detect the presence of malware is similar to the use of rods and cones in the human eye to detect objects at night or in low light.  Essentially, the way our vision works, we do not see an object clearly at night by looking directly at it...rather, we have to look to the side of the object.  AV scanners tend to "look directly at" malware, and may not detect new variants (or, in some cases, older variants); looking off to the side, at the artifacts the malware leaves behind, is often what finds what the scanners miss.

Persistence Mechanism
The persistence mechanism refers to the means that malware utilizes to survive reboots.  Some persistence mechanisms may allow the malware to launch as soon as the system is booted, while others will not start the malware until a user logs in, or until a user launches a specific application.  Persistence mechanisms can be found in Registry locations...during the 2012 SANS Forensic Summit, Beth (a Google engineer) described writing a web scraper to run across an AV vendor's web site, and around 50% of the Registry keys she found pointed to the ubiquitous Run key.  Persistence can also be achieved through a variety of other means...locations within the file system, DLL search order parsing, etc.  Determining the persistence mechanism can be a very valuable part of gaining intelligence from your examination, and detecting new persistence mechanisms (to be covered in a future post) can be equally important.

Mis-Identified Persistence
Sometimes when reading a malware write-up from a vendor, particularly Microsoft, I will see that the persistence mechanism is listed as being in the HKCU\..\Run key, and the write-up goes on to say, "...so that the malware will start at system boot."  This is incorrect, and the distinction can be critical to your examination; for example, if a user's context on a system is infected, but another user logs in after the system is rebooted, the malware is not active.

As you may already see, the persistence mechanism of malware is also an artifact of that malware; persistence is a subset of the artifact set.  In many ways, this can help us to a great extent when it comes to detecting malware, particularly malware missed by AV scanners.

HowTo: Detecting Persistence Mechanisms

This post is about actually detecting persistence mechanisms...not querying them, but detecting them.  There's a difference between querying known persistence mechanisms, and detecting previously unknown persistence mechanisms used by malware; the former we can do with tools such as AutoRuns and RegRipper, but the latter requires a bit more work.

Detecting the persistence mechanism used by malware can be a critical component of an investigation; for one, it helps us determine the window of compromise, or how long it's been since the system was infected (or compromised).  For PCI exams in particular, this is important because many organizations know approximately how many credit card transactions they process on a daily or weekly basis, and combining this information with the window of compromise can help them estimate their exposure.  If malware infects a system in a user context but does not escalate its privileges, then it will most likely start back up after a reboot only after that user logs back into the system.  If the system is rebooted and another user logs in (or, in the case of a server, no user logs in...), then the malware will remain dormant.

Detecting Persistence Mechanisms
Most often, we can determine a malware persistence mechanism by querying the system with tools such as those mentioned previously in this post.  However, neither of these tools is comprehensive enough to cover other possible persistence mechanisms, and as such, we need to seek other processes or methods of analysis and detection.

One process that I've found to be very useful is timeline analysis.  Timelines provide us with context and an increased relative confidence in our data, and depending upon which data we include in our timeline, an unparalleled level of granularity.

Several years ago, I determined the existence of a variant of W32/Crimea on a system (used in online banking fraud) by creating a timeline of system activity.  I had started by reviewing the AV logs from the installed application, and then moved on to scanning the image (mounted as a volume) with several licensed commercial AV scanners, none of which located any malware.  I finally used an AV scanner called "a-squared" (now apparently owned by Emsisoft), and it found a malicious DLL.  Using that DLL name as a pivot point within my timeline, I saw that relatively close to the creation date of the malicious DLL, the file C:\Windows\system32\imm32.dll was modified; parsing the file with a PE analysis tool, I could see that the PE Import Table had been modified to point to the malicious DLL.  The persistence mechanism employed by the malware was to 'attach' to a DLL that is loaded by user processes that interact with the keyboard, in particular web browsers.  It appeared to be a keystroke logger that was only interested in information entered into form fields in web pages for online banking sites.
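The pivot-point technique described above can be sketched in a few lines.  This is a hypothetical illustration (the `pivot` function and the sample five-field TLN lines are invented for the example), not output from any actual case or tool:

```python
def pivot(lines, term, window=3600):
    """Return all TLN events within `window` seconds of any event whose
    text contains the pivot term (case-insensitive)."""
    events = []
    for line in lines:
        # TLN format: time|source|host|user|description
        fields = line.strip().split("|", 4)
        if len(fields) == 5 and fields[0].isdigit():
            events.append((int(fields[0]), line.strip()))
    hits = [t for t, text in events if term.lower() in text.lower()]
    return [text for t, text in events
            if any(abs(t - h) <= window for h in hits)]

# Invented sample timeline; times are Unix epoch values
timeline = [
    "1300000000|FILE|HOST1||M... C:/Windows/system32/evil.dll",
    "1300000100|FILE|HOST1||M... C:/Windows/system32/imm32.dll",
    "1300900000|REG|HOST1|user|[Run key] LastWrite",
]
for event in pivot(timeline, "evil.dll"):
    print(event)
```

Run against a real timeline, a query like this surfaces the imm32.dll modification occurring near the creation of the malicious DLL, which is exactly the kind of relationship described above.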

Interestingly enough, this particular malware was very well designed, in that it did not write the information it collected to a file on the system.  Instead, it immediately sent the information off of the system to a waiting server, and the only artifacts that we could find of that communication were web server responses embedded in the pagefile.

Something else to consider is the DLL search order "issue", often referred to as DLL hijacking.  This has been discussed at length, and likely still remains an issue because it's not so much a specific vulnerability that can be patched or fixed, but more a matter of functionality provided by the architecture of the operating system.

In the case of ntshrui.dll (discussed here by Nick Harbour, while he was still with Mandiant), this is how it worked...ntshrui.dll is listed in the Windows Registry as an approved shell extension for Windows Explorer.  In the Registry, many of the approved shell extensions have explicit paths listed...that is, the value is C:\Windows\system32\some_dll.dll, and Windows knows to go load that file.  Other shell extensions, however, are listed with implicit paths; that is, only the name of the DLL is provided, and when the executable (explorer.exe) loads, it has to go search for that DLL.  In the case of ntshrui.dll, the legitimate copy of the DLL is located in the system32 folder, but another file of the same name had been created in the C:\Windows folder, right next to the explorer.exe file.  When explorer.exe started searching for the DLL in its own directory, it happily loaded the malicious DLL without any sort of checking, and therefore, no errors were thrown.
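The search-order behavior can be modeled with a small sketch.  The `resolve_dll` function and the file sets below are hypothetical, but they illustrate why the copy planted next to explorer.exe wins over the legitimate copy in system32:

```python
import ntpath

def resolve_dll(dll_name, search_dirs, present_files):
    """Return the first path, in search order, where dll_name exists.
    search_dirs mimics the order used for an implicitly specified DLL:
    the executable's own directory is searched before system32."""
    present = {p.lower() for p in present_files}
    for d in search_dirs:
        candidate = ntpath.join(d, dll_name)
        if candidate.lower() in present:
            return candidate
    return None

# Legitimate copy lives in system32, but a planted copy sits next to
# explorer.exe; the planted copy resolves first.
files = {r"c:\windows\ntshrui.dll", r"c:\windows\system32\ntshrui.dll"}
order = [r"c:\windows", r"c:\windows\system32"]
print(resolve_dll("ntshrui.dll", order, files))
```

With the planted file absent, the same call resolves to the system32 copy, which is why the hijack produces no errors and nothing obviously "breaks" on the system.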

Around the time that Nick was writing up his blog post, I'd run across a Windows 2003 system that had been compromised, and fortunately for me, the sysadmins had a policy for a bit more extensive logging enabled on systems.  As I was examining the timeline, starting from the most recent events to occur, I marveled at how the additional logging really added a great deal of granularity to things such as a user logging in; I could see where the system assigned a token to the user, and then transferred the security context of the login to that user.  I then saw a number of DLLs being accessed (that is, their last accessed times were modified) from the system32 folder...and then I saw one (ntshrui.dll) from the C:\Windows folder.  This stood out to me as strange, particularly when I ran a search across the timeline for that file name, and found another file of the same name in the system32 folder.  I began researching the issue, and was able to determine that the persistence mechanism of the malware was indeed the use of the DLL search order "vulnerability".

Creating Timelines
Several years ago, I was asked to write a Perl script that would list all Registry keys within a hive file, along with their LastWrite times, in bodyfile format.  Seeing the utility of this information, I also wrote a version that would output to TLN format, for inclusion in the timelines I create and use for analysis.  This allows for significant information that I might not otherwise see to be included in the timeline; once suspicious activity has been found, or a pivot point located, finding unusual Registry keys (such as those beneath the CLSID subkey) can lead to identification of a persistence mechanism.
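For reference, events in the TLN format mentioned above are five pipe-delimited fields (time|source|host|user|description).  A minimal sketch of emitting a Registry key LastWrite time as a TLN event, in Python rather than the Perl of the original script and with a hypothetical helper name, might look like:

```python
def reg_key_to_tln(epoch, key_path, host="", user=""):
    """Format a Registry key LastWrite time as a five-field TLN event:
    time|source|host|user|description.  epoch is a Unix time value."""
    desc = "M... [Key LastWrite] {}".format(key_path)
    return "|".join([str(epoch), "REG", host, user, desc])

print(reg_key_to_tln(1300000000,
                     r"Software\Microsoft\Windows\CurrentVersion\Run",
                     host="HOST1"))
```

Events in this form drop straight into the same timeline as file system and Event Log data, which is what allows a suspicious key's LastWrite time to appear next to the file activity around it.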

Additional levels of granularity can be achieved in timelines through the incorporation of intelligence into the tools used to create timelines, something that I started adding to RegRipper with the release of version 2.8. One of the drawbacks to timelines is that they will show the creation, last accessed, and last modification times of files, but not incorporate any sort of information regarding the contents of that file into the timeline.  For example, a timeline will show a file with a ".tmp" extension in the user's Temp folder, but little beyond that; incorporating additional functionality for accessing such files would allow us to include intelligence from previous analyses into our parsing routines, and hence, into our timelines.  As such, we may want to generate an alert for that ".tmp" file, specifically if the binary contents indicate that it is an executable file, or a PDF, or some other form of document.
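A sketch of the kind of content-based alert described above might look like the following; the function name is hypothetical and the signature list is trimmed to a few well-known magic numbers:

```python
def alert_on_tmp(path_name, first_bytes):
    """Flag a '.tmp' file whose content signature indicates an
    executable or a document.  first_bytes: the file's leading bytes."""
    sigs = {
        b"MZ": "executable",                 # PE/DOS header
        b"%PDF": "PDF document",
        b"\xd0\xcf\x11\xe0": "OLE document", # legacy Office formats
    }
    if path_name.lower().endswith(".tmp"):
        for magic, kind in sigs.items():
            if first_bytes.startswith(magic):
                return "ALERT: {} has {} signature".format(path_name, kind)
    return None

print(alert_on_tmp(r"C:\Users\user\AppData\Local\Temp\abc.tmp",
                   b"MZ\x90\x00"))
```

The point isn't the specific signatures; it's that the timeline tool, having the file open anyway, can fold a previous engagement's lessons into an alert rather than leaving the analyst to notice the file manually.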

Another example of how this functionality can be incorporated into timelines and assist us in detecting persistence mechanisms might be to add grep() statements to RegRipper plugins that parse file paths from values.  For example, your timeline would include the LastWrite time for a user's Run key as an event, but because the values for this key are not maintained in any MRU order, there's really nothing else to add.  However, if your experience were to show that file paths that include "AppData", "Application Data", or "Temp" might be suspicious, why not add checks for these to the RegRipper plugin, and generate an alert if one is found?  Would you normally expect to see a program being automatically launched from the user's "Temporary Internet Files" folder, or is that something that you'd like to be alerted on?  The same sort of thing applies to values listed in the InProcServer keys beneath the CLSID key in the Software hive.
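The grep()-style checks described might look like the following in Python (the function name and the sample Run key data are invented for illustration):

```python
import re

# Flag autostart value data whose path includes commonly abused folders
SUSPECT = re.compile(
    r"(appdata|application data|\btemp\b|temporary internet files)",
    re.IGNORECASE)

def check_run_values(values):
    """values: dict of Run key value name -> command/path data.
    Returns only the entries matching a suspicious-path pattern."""
    return {name: data for name, data in values.items()
            if SUSPECT.search(data)}

run = {
    "Updater": r"C:\Users\bob\AppData\Roaming\svchost.exe",
    "Sound":   r"C:\Windows\system32\sound.exe",
}
for name, data in check_run_values(run).items():
    print("ALERT:", name, "->", data)
```

A pattern list like this grows with each engagement, which is how alerting turns one analyst's prior findings into something every future timeline benefits from.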

Adding this alerting functionality to tools that parse data into timeline formats can significantly increase the level of granularity in our timelines, and help us to detect previously unknown persistence mechanisms.

Resources
Mandiant: Malware Persistence without the Windows Registry
Mandiant: What the fxsst?
jIIR: Finding Malware like Iron Man
jIIR: Tracking down persistence mechanisms


HowTo: Data Exfiltration

One of the questions I see time and again, in forums as well as from customers, is "what data was taken from the system?"  Sometimes, an organization will find out what data was taken when they get a call from an outside third party (just review any of the annual reports from Verizon, Mandiant, or TrustWave); if this is the case, they may have a pretty good idea of what data was taken.

This post is not intended to be totally comprehensive; rather, the idea here is to present artifacts that you can look for/at that you may not have seen before, and provide other opportunities for finding indications of data exfiltration.  Many of these artifacts are very simple to check for and analyze, and provide for a more thorough and complete examination, even if nothing is found.  After all, simply illustrating to the customer that you checked all of these possibilities provides value, regardless of whether any useful evidence was turned up.

It's also very important to point out that these artifacts may provide an indication of data exfiltration of some kind; the only way to determine if data was exfiltrated at a particular time, and what that data might have been, in a definitive manner is to have a full packet capture from the time of exfiltration.  That way, you can see exactly what was exfiltrated.

Attachments
Attachments are an easy means for getting files off of a system.  Attachments can be made to email, web mail, as well as to chat/IM programs.  Files can be uploaded via the web to sites like Twitter, Yahoo Groups, Google Docs, etc.  The use of social media should be examined closely.

Program Execution
One of the things you'll want to look for is artifacts of program execution...many times, exfiltrating data requires that a program of some type be executed; whether it's launching a standalone application or uploading something via a browser, a program must be running for data exfil to occur.

Programs you might want to look for include the Windows native ftp.exe command line utility, third-party FTP utilities, etc.  Also, you might consider looking for the use of archiving utilities, such as rar.exe, particularly in cases where files may have been archived prior to transmittal.  As stated in the previous blog post, you'll want to look for artifacts such as application Prefetch files, etc.  You might also want to look to user-specific Registry values, such as:

User's UserAssist (GUI) or MUICache (CLI) - RegRipper userassist.pl or muicache.pl plugins, respectively

Tracing key - RegRipper tracing.pl plugin; this key contains subkeys that I have found refer to applications with networking capabilities.  I say this in part due to observation, but also because during one investigation where an intruder had installed and run a vulnerability exploitation tool (specifically, Havij), I found references to this tool beneath this key.

BlueTooth
The first time I ran across the use of the Windows native fsquirt.exe utility, I found an entry for the utility in the user's MUICache data. The path pointed to the file in the system32 folder, which, after an initial investigation, appeared to be correct. I then found a Prefetch file for the utility, as well.  The utility is actually a wizard with a GUI, and as such, uses common dialogs to allow the user to select files to send to the device; analysis of values in the ComDlg32\OpenSavePidlMRU key provided indications of files that might have been copied to the device.

Shellbags
You're probably thinking, "Really?" Well, you can find indications of possible data exfiltration (or infiltration) within the shellbags artifacts.

Shellbag artifacts can provide indications of access to network resources (such as shares), not only within the network infrastructure, but also beyond the borders of the infrastructure.

Shellbags can show indications of access to resources for data exfiltration through different types of shell items.  When I first started working with my publisher, I was provided with instructions for accessing their FTP site via Windows Explorer, which would allow me to drag-and-drop files between folders.  This method for accessing an FTP server does not leave what one would expect to be the "normal" artifacts...there is no Prefetch file (or other artifact) created for the use of the native ftp.exe utility, nor are there any UserAssist artifacts created.  However, as this method of exchanging files requires interaction with Windows Explorer, shellbag artifacts are created, and as a result, artifacts are also created beneath the Software\Microsoft\FTP\Accounts key within the user's NTUSER.DAT hive.

As mentioned previously, shellbags artifacts can also provide indications of access not only to traditional USB storage devices (i.e., thumb drives), but also to other devices (smartphones, MP3 players, and digital cameras) that can be connected to the system via a USB cable.  This is important to understand, as a number of the available tools for parsing shellbag artifacts do not parse the shell items for these devices; as such, access to these devices will not be apparent when some of the popular tools are used to parse and display these artifacts.

Cloud Services
I purposely haven't addressed cloud services in this post, as this topic should likely be addressed on its own.  I have provided some resources (below) that may be of value.  As there are a number of different types of cloud services available, I would like to get input from the DFIR community regarding this topic, specifically. Many of these cloud services can be accessed via the web, or by specific applications installed on the system, and as such, artifacts may vary depending upon how the user accessed those services.

I've provided links to some interesting resources on this topic below.

Resources
Jad Saliba's presentation
Mary Hughes' capstone project blog; this site hasn't been updated since Feb, 2013, but it does have a number of very useful links
Derek Newton's Forensic Artifacts: Dropbox blog post
Some Carbonite artifacts - lists some Registry keys and files, not much explanation or detail

Other Means
Not all means of exfiltrating data out of an infrastructure are really very sophisticated.  I once worked for a company that was going out of business, and was involved with providing physical security when offices were being shut down.  At one point, we were contacted by HR, as they suspected that the office closure schedule had been obtained after someone "hacked" one of their computers.  A short investigation determined that someone had printed the schedule, and left it on the printer, where someone else had found the schedule and faxed it to all of the involved offices.  I've also seen where information has been pasted into an AIM chat window.

Examples
A number of years ago, I had a couple of data exfil/exposure incidents while I was filling an FTE position at a now-defunct telecom company.  In 2001, the official story was that the company was going to go through bankruptcy, and morale was very low.  The security team was contacted by a member of senior management, as apparently company memos regarding the proceedings of meetings and the direction of the company were being posted to a site called "doomedcompany.com" (this site had a sister site, whose name was a bit more vulgar).  In many cases, these memos were being posted before offices outside of the corporate headquarters received the memos via email.  We knew that a number of employees in the company were visiting the site, and that most were doing so in order to read the memos and commentary.  We had to create an account on the site, and then upload a file in order to be able to differentiate between an employee reading the memo, and someone uploading a file.  By identifying those artifacts, we were able to incorporate that information into searches.

In another case, the security team was contacted by members of HR, with the complaint that their computers were 'hacked'.  The issue centered around the impending shutdown of a call center office in Arizona, and the fact that the list of the employees being laid off was made available to that entire office.  We sat down with the HR associate who had put the list together, and asked questions, such as "...did you email this list to anyone?" and "...did you place this list on a file server?"  Ultimately, we found out that she'd sent the list to the printer, and then stepped out for a meeting.  Apparently, another employee had collected their printed document from the printer, found the list, and then faxed the list to the Arizona office.

The point is that sometimes, data exfiltration is as simple as picking up something off of the printer.

Summary
Data can be exfiltrated from a system in a number of ways, but those methods generally come down to using the network (moving data to a file share, attaching files to web-based emails, etc.), or copying files to an "attached" device (attached physically via USB, or via Bluetooth).  If data exfiltration is suspected, then the steps that you might want to take include:

  1. Attempt to determine with a greater level of certainty or clarity why data exfil is suspected; what details are available?  Is this based on third-party notification, or the result of monitoring?
  2. Determine where the data that was thought to have been exfil'd originally or usually could be found; was the data on a file or database server, or did it reside on a user's system?
  3. Did the user have access to that data?  Are there indications that the user interacted with or accessed that data?  
  4. Determine indications of programs that could be used for data exfil being executed.
One final thought...Windows systems do NOT maintain artifacts of copy operations.  You will not find any logs or Registry values that indicate files that were copied to a removable device, for example, particularly if all you have available for analysis is an image of the system.  If the user were to copy a file to an external resource, such as a thumb drive or remote file share, and then open the file from there, a Windows shortcut/LNK file would be created on the user's system.  However, by itself, all that LNK file shows is that the user opened a file...it does not provide an indication that the user explicitly copied the file to the external resource.  In the absence of some sort of monitoring agent, additional analysis, particularly of file system time stamps on both resources, would be required in order to more closely determine if the user copied the file.
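One commonly cited heuristic for that time stamp analysis: on NTFS, a file copied to a new volume typically receives a fresh creation date while retaining the source file's last-modified date, so a creation time later than the modification time can suggest the file arrived via a copy.  A trivial sketch (hypothetical helper; the heuristic is indicative, not conclusive, as other operations can produce the same pattern):

```python
from datetime import datetime

def copy_indicator(created, modified):
    """On NTFS, a copied file typically gets a fresh creation time but
    retains the source's last-modified time; created > modified can
    therefore suggest the file was copied there rather than authored
    in place.  This is a hint, not proof."""
    return created > modified

# File "created" in 2013 but last modified in 2012: consistent with a copy
print(copy_indicator(datetime(2013, 5, 1), datetime(2012, 1, 1)))
```

This is exactly why examining time stamps on both the source and destination resources, together, is more reliable than any single timestamp in isolation.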

HowTos

I've offered up a number of HowTo blog posts thus far, and hopefully DFIR folks out there have found use in them.  In the comments of one of the posts, a reader offered up a list of proposed HowTo topics which he would like to see addressed.  As many of you may have noticed, most of my current posts have been technical in nature...specific artifacts to look for, specific tools to use, etc.  My hope has been to enable folks to expand their own analysis processes, either through the content I have provided, or by any exchange or discussion that occurs as a result of the material posted.  Most of the requested topics are either very general, or refer to soft-skills topics, so I wanted to take the opportunity to address them and see what others might have to add...

How to do a root cause analysis

This is an interesting question, in part because there's been discussion in the past regarding the need for conducting a root cause analysis, or "RCA".

An excellent resource for this topic is Corey Harrell's jIIr blog, as he's written blog posts regarding malware RCA, compromise RCA, and there are other posts that discuss topics associated with root cause analysis.

How to work with clients who are stressed, want answers now, point fingers, or heads to roll.

I think that like any other type of incident, it depends, and it's simply something that you need to be prepared for.

I remember one engagement that I was attempting to scope.  The customer who called was not part of the IT staff, and during the initial discussions, it was clear that there was a good deal of stress involved in this incident.  At one point, it came down to the customer simply wanting us to get someone on site immediately, while we were trying to determine which site we needed to send someone to...the corporate offices were located in a different city than the data center, and as such, anyone sent might need to fly into a different airport.  If the responder flew into the wrong one, they'd have to drive several hours to the correct location, further delaying response.  The more the question was asked of the customer, the more frustrated they became, and they just didn't answer the question.

In my experience, the key to surviving trying times such as these is process and documentation.  Process provides analysts with a starting point, particularly during stressful times when everything seems like a whirlwind and you're being pulled in different directions.  Documenting what you did, and why, can save your butt after the fact, as well.

When I was in the military, like many military units, we'd go on training exercises.  During one exercise, we came back to base and during our "hot washup" after-action meeting, one of the operators made the statement that throughout the exercise, "comm sucked", indicating that communications was inadequate.  During the next training exercise, we instituted a problem reporting and resolution process, and maintained detailed records in a log book.  Operators would call into a central point and the problem would be logged, reported to the appropriate section (we had tactical data systems, radar, and communications sections), and the troubleshooting and resolution of the issue would be logged, as well.  After the exercise, we were in our "hot washup" when one of the operators got up and said "comm sucked", at which point we pushed the log book across the table and said, "show us where and why...".  The operators changed their tune after that.  Without the process and documentation, however, we would have been left with commanders asking us to explain an issue that didn't have any data to back it up.  The same thing can occur during an incident response engagement in the private sector.

How to hit the ground running when you arrive at a client with little information.

During my time as an emergency incident responder, this happened often...a customer would call, and want someone on-site immediately.  We'd start to ask questions regarding the nature of the incident (helped us determine staffing levels and required skill sets), and all we would hear back is, "Send someone...NOW!"

The key to this is having a process that responders use in order to get started.  For instance, I like to have a list of questions available when a customer calls (referred to as a triage worksheet); these are questions that are asked of all customers, and during the triage process the analyst will rely on their experience to ask more probing questions and obtain additional information, as necessary.  The responder going on-site is given the completed questionnaire, and one of the first things they do is meet with the customer point of contact (PoC) and go through the questions again, to see if any new information has been developed.

One of the first things I tend to do during this process is ask the PoC to describe the incident, and I'll ask questions regarding the data that was used to arrive at various conclusions.  For example, if the customer says that they're suffering from a malware infection, I would ask what they saw that indicated a malware infection...AV alerts or logs, network traffic logged/blocked at the firewall, etc.

Generally speaking, my next step would be to either ask for a network diagram, or work with the PoC to document a diagram of the affected network (or portion thereof) on a white board.  This not only provides situational awareness, but allows me to start asking about network devices and available logs.

So, I guess the short answer is, in order to "hit the ground running" under those circumstances, have a process in place for collecting information, and document your steps.

How to communicate during an incident with respect to security and synergy with other IRT members.

As with many aspects of incident response, it depends.  It depends on the type and extent of incident, who's involved, etc. Most of all, it depends upon the preparedness of the organization experiencing the incident.  I've seen organizations with Nextel phones, and the walkie-talkie functionality was used for communications.

Some organizations will use the Remedy trouble-ticketing system, or something similar.  Most organizations will stay off of email altogether, assuming that this has been 'hacked', and may even move to having key personnel meet in a war room.  In this way, communications are handled face-to-face, and where applicable, I've found this to be very effective.  For example, if someone adds "it's a virus" to an email thread, it may be hard to track that person down and get specific details, particularly when that information is critical to the response.  I have been in a war room when someone has made that statement, and then been asked very pointed questions about the data used to arrive at that statement.  Those who have that data are willing to share it for the benefit of the entire response team, and those who don't learn an important lesson.

How to detect and deal with timestomping, data wiping, or some other antiforensic [sic] technique.

I'm not really sure how to address this one, in part because I'm not really sure what value I could add to what's already out there.  The topic of time stomping, using either timestomp.exe or some other means, such as copying the time stamps from kernel32.dll via the GetFileTime/SetFileTime API calls, and how to detect their use has been addressed at a number of sites, including on the ForensicsWiki, as well as on Chris Pogue's blog.
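As a sketch of the comparison documented at those resources...checking the $STANDARD_INFORMATION timestamps against the $FILE_NAME timestamps from an MFT record.  The function name and the timestamp values are invented for illustration, and neither check below is conclusive on its own (legitimate operations can produce either pattern), but together they're the classic starting point:

```python
from datetime import datetime

def timestomp_indicators(si_times, fn_times):
    """Compare $STANDARD_INFORMATION and $FILE_NAME timestamps from an
    MFT record; both are dicts of name -> datetime.  Returns a list of
    indicator strings.  Indicators, not proof: OS activity can also
    produce these patterns."""
    hits = []
    for name in si_times:
        si, fn = si_times[name], fn_times.get(name)
        if fn is not None and si < fn:
            # $SI earlier than $FN is anomalous for most normal activity
            hits.append("{}: $SI ({}) predates $FN ({})".format(name, si, fn))
        if si.microsecond == 0:
            # classic timestomp.exe had only second-level granularity
            hits.append("{}: $SI has zeroed sub-second value".format(name))
    return hits

si = {"created": datetime(2009, 1, 1, 12, 0, 0)}
fn = {"created": datetime(2012, 6, 1, 9, 30, 15, 123456)}
for h in timestomp_indicators(si, fn):
    print(h)
```

In a timeline, records flagged this way make good pivot points: the $FN time is frequently closer to the actual file creation than the stomped $SI time.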

How to "deal with" data wiping is an interesting question...I suppose that if the issue is one of spoliation, then being able to determine the difference between an automated process, and one launched specifically by a user (and when) may be a critical component of the case.

As far as "some other antiforensic[sic] technique", I would say again, it depends.  However, I will say that the use of anti-forensic techniques should never be assumed, simply because one artifact is found, or as the case may be, not found.  More than once, I've been in a meeting when someone said, "...it was ...", but when asked for specific artifacts to support that finding, none were available.

How to get a DFIR job, and keep it.

I think that to some degree, any response to this question would be very dependent upon where you're located, and if you're willing to relocate.

My experience has been that applying for jobs found online rarely works, particularly for those sites that link to an automated application/submission process.  I've found that it's a matter of who you know, or who knows you.  The best way to achieve this level of recognition, particularly in the wider community, is to engage with other analysts and responders, through online meetings, blogging, etc.  Be willing to open yourself up to peer review, and ignore the haters, simply because haters gonna hate.

How to make sure management understands and applies your recomendations [sic] after an incident when they're most likely to listen.

Honestly, I have no idea.  Our job as analysts and responders is to present facts, and if asked, possibly make recommendations, but there's nothing that I'm aware of that can make sure that management applies those recommendations.  After all, look at a lot of the compliance and legislative regulatory requirements that have been published (PCI, HIPAA, NCUA, etc.) and then look at the news.  You'll see a number of these bodies setting forth requirements that are not followed.

How to find hidden data; in registry, outside of the partition, ADS, or if you've seen data hidden in the MFT, slackspace, steganography, etc.

Good question...if something is hidden, how do you find it...and by extension, if you follow a thorough, documented process to attempt to detect data hidden by any of these means and don't find anything, does that necessarily mean that the data wasn't there?

Notice that I used the words "process" and "documented" together.  This is the most critical part of any analysis...if you don't document what you did, did it really happen?

Let's take a look at each of the items requested, in order:

Registry - my first impression of this is that 'hiding' data in the Registry amounts to creating keys and/or values that an analyst is not aware of.  I'm familiar with some techniques used to hide data from RegEdit on a live system, but those tend to not work when you acquire an image of the system and use a tool other than RegEdit, so the data really isn't "hidden", per se.  I have seen instances where searches have revealed hits "in" the Registry, and then searching the Registry itself via a viewer has not turned up those same items, but as addressed in Windows Registry Forensics, this data really isn't "hidden", and it's pretty easy to identify if the hits are in unallocated space within the hive file, or in slackspace.

Outside the partition - it depends where outside the partition that you're referring.  I've written tools to start at the beginning of a physical image and look for indications of the use of MBR infectors; while not definitive, it did help me narrow the scope of what I was looking at and for.  For this one, I'd suggest looking outside the partition as a solution.  ;-)

ADS - NTFS alternate data streams really aren't hidden, per se, once you have an image of the system.  Some commercial frameworks even highlight ADSs by printing the stream names in red.

MFT - There've been a number of articles written on residual data found in MFT records, specifically associated with files transitioning from resident to non-resident data.  I'm not specifically aware of an intruder hiding data in an MFT record...to me, it sounds like something that would not be too persistent unless the intruder had complete control of the system, to a very low level.  If someone has seen this used, I would greatly appreciate seeing the data.

Slackspace - there are tools that let you access the contents of slackspace, but one of the things to consider is, if an intruder or user 'hides' something in slackspace, what is the likelihood that the data will remain available and accessible to them, at a later date?  After all, the word "hiding" has connotations of accessing the data at a later date...by definition, slackspace may not be available.  Choosing a file at random and hiding data in the slackspace associated with that file may not be a good choice; how would you guarantee that the file would not grow, or that the file would not be deleted?  This is not to say that someone hasn't purposely hidden data in file slackspace; rather, I'm simply trying to reason through the motivations.  If you've seen this technique used, I'd greatly appreciate seeing the data.

Steganography - I generally wouldn't consider looking for this sort of hidden data unless there was a compelling reason to do so, such as searches in the user's web history, tool downloads, and indications of the user actually using tools for this.  

How to contain an incident.

Once again, the best answer I can give is, it depends.  It depends on the type of incident, the infrastructure affected, as well as the culture of the affected organization.  I've seen incidents in which the issue has been easy to contain, but I've also been involved in response engagements where we couldn't contain the issue because of cultural issues.  I'm aware of times where a customer has asked the response team to monitor the issue, rather than contain it.

Again, many of the topics that the reader listed were more on the "soft" side of skills, and it's important that responders and analysts alike have those skills.  In many cases, the way to address this is to have a process in place for responders to use, particularly during stressful times, and to require analysts to maintain documentation of what they do.  Yes, I know...no one likes to write, particularly if someone else is going to read it, but you'll wish you had kept it when those times come.

HowTo: Add Intelligence to Analysis Processes


How many times do we launch a tool to parse some data, and then sit there looking at the output, wondering how someone would see something "suspicious" or "malicious" in the output?  How many times do we look at lines of data, wondering how someone else could easily look at the same data and say, "there it is...there's the malware"?  I've done IR engagements where I could look at the output of a couple of tools and identify the "bad" stuff, after someone else had spent several days trying to find out what was going wrong with their systems.  How do we go about doing this?

The best and most effective way I've found to get to this point is to take what I learned on one engagement and roll it into the next.  If I find something unusual...a file path of interest, something particular within the binary contents of a file, etc...I'll attempt to incorporate that information into my overall analysis process and use it during future engagements.  Anything that's interesting, as a result of either direct or ancillary analysis, will be incorporated into my analysis process.  Over time, I've found that some things keep coming back, while other artifacts are only seen every now and then.  Those artifacts that are less frequent are no less important, not simply because of the specific artifacts themselves, but also for the trends that they illustrate over time.

Before too long, the analysis process includes, "access this data, run this tool, and look for these things..."; we can then make this process easier on ourselves by taking the "look for these things" section of the process and automating it.  After all, we're human, we get tired from looking at a lot of data, and we can make mistakes, particularly when there is a LOT of data.  By automating what we look for (or what we've found before), we can speed up those searches and reduce the potential for mistakes.

Okay, I know what you're going to say..."I already do keyword searches, so I'm good".  Great, that's fine...but what I'm talking about goes beyond keyword searches.  Sure, I'll open up a lot of lines of output (RegRipper output, web server logs) in UltraEdit or Notepad++, and search for specific items, based on information I have about the particular analysis that I'm working on (what are my goals, etc.).  However, more often than not, I tend to take that keyword search one step further...the keyword itself will indicate items of interest, but will be loose enough that I'm going to have a number of false positives.  Once I locate a hit, I'll look for other items in the same line that are of interest.

For example, let's take a look at Corey Harrell's recent post regarding locating an injected iframe.  This is an excellent, very detailed post where Corey walks through his analysis process, and at one point, locates two 'suspicious' process names in the output of a volatile data collection script.  The names of the processes themselves are likely random, and therefore difficult to include in a keyword list when conducting a search.  However, what we can take away from just that section of the blog post is that executable files located in the root of the ProgramData folder would be suspicious, and potentially malicious.  Therefore, a script that parses the file path and looks for that condition would be extremely useful, and written in Perl, might look something like this:

# Flag executable files sitting in the root of the ProgramData folder
my @path = split(/\\/,$filepath);
my $len = scalar(@path);
if (lc($path[$len - 2]) eq "programdata" && lc($path[$len - 1]) =~ m/\.exe$/) {
  print "Suspicious path found: ".$filepath."\n";
}

Similar paths of interest might include "AppData\Local\Temp"; we see this one and the previous one in one of the images that Corey posted of his timeline later in the blog post, specifically associated with the AppCompatCache data output.

Java *.idx files
A while back, I posted about parsing Java deployment cache index (*.idx) files, and incorporating the information into a timeline.  One of the items I'd seen during analysis that might indicate something suspicious is the last modified time embedded in the server response being relatively close (in time) to when the file was actually sent to the client (indicated by the "date:" field).  As such, I added a rule to my own code, and had the script generate an alert if the "last modified" field was within 5 days of the "date" field; this value was purely arbitrary, but it would've thrown an alert when parsing the files that Corey ran across and discussed in his blog.
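That rule is simple to express in code.  My own tools are written in Perl, but here's a minimal Python sketch of the same check; the function name, the date format, and the 5-day threshold are my own choices for illustration, not taken from the original script:

```python
# Hedged sketch: flag a *.idx entry whose server "last-modified" time falls
# within THRESHOLD of the "date" (delivery) time, per the heuristic above.
from datetime import datetime, timedelta

THRESHOLD = timedelta(days=5)  # purely arbitrary, as noted in the text

def idx_alert(date_str, last_modified_str, fmt="%Y-%m-%d %H:%M:%S"):
    """Return True if the gap between delivery and last-modified is suspiciously small."""
    date = datetime.strptime(date_str, fmt)
    last_mod = datetime.strptime(last_modified_str, fmt)
    return (date - last_mod) <= THRESHOLD
```

In use, the parser would call this for each record and emit an alert line alongside the normal timeline output.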

Adding intel is generally difficult to do with third-party, closed source tools that we download from someone else's web site, particularly GUI tools.  In such cases, we have to access the data in question, export that data out to a different format, and then run our analysis process against that data.  This is why I recommend that DFIR analysts develop some modicum of programming skill...you can either modify someone else's open source code, or write your own parsing tool to meet your own specific needs.  I tend to do this...many of the tools I've written and use, including those for creating timelines, incorporate some measure of alerting functionality.  For example, RegRipper version 2.8 incorporates alerting functionality directly into the plugins.  This alerting functionality can greatly enhance our analysis processes when it comes to detecting persistence mechanisms, as well as illustrating suspicious artifacts as a result of program execution.

Writing Tools
I tend to write my own tools for two basic reasons:

First, doing so allows me to develop a better understanding of the data being parsed or analyzed.  Prior to writing the first version of RegRipper, I had written a Registry hive file parser; as such, I had a very deep understanding of the data being parsed.  That way, I'm better able to troubleshoot an issue with any similar tool, rather than simply saying, "it doesn't work", and not being able to describe what that means.  Around the time that Mandiant released their shim cache parsing script, I found that the Perl module used by RegRipper was not able to parse value "big data"; rather than contacting the author and saying simply, "it doesn't work", I was able to determine what about the code wasn't working, and provide a fix.  A side effect of having this level of insight into data structures is that you're able to recognize which tools work correctly, and select the proper tool for the job.

Second, I'm able to update and make changes to the scripts I write in pretty short order, and don't have to rely on someone else's schedule to allow me to get the data that I'm interested in or need.  I've been able to create or update RegRipper plugins in around 10 - 15 minutes, and when needed, create new tools in an hour or so.

We don't always have to get our intelligence just from our own analysis. For example, this morning on Twitter, I saw a tweet from +Chris Obscuresec indicating that he'd found another DLL search order issue, this one on Windows 8 (application looked for cryptbase.dll in the ehome folder before looking in system32); as soon as I saw that, I thought, "note to self: add checking for this specific issue to my Win8 analysis process, and incorporate it into my overall DLL search order analysis process".

The key here is that no one of us knows everything, but together, we're smarter than any one of us.

I know that what we've discussed so far in this post sounds a lot like the purpose behind the OpenIOC framework.  I agree that there needs to be a common framework or "language" for representing and sharing this sort of information, but it would appear that some of the available frameworks may be too stringent, not offer enough flexibility, or are simply internal to some organizations.  Or, the issue may be as Chris Pogue mentioned during the 2012 SANS DFIR Summit..."no one is going to share their secret sauce."  I still believe that this is the case, but I also believe that there are some fantastic opportunities being missed because so much is being incorporated under the umbrella of "secret sauce"; sometimes, simply sharing that you're seeing something similar to what others are seeing can be a very powerful data point.

Regardless of the reason, we need to overcome our own (possibly self-imposed) roadblocks for sharing those things that we learn, as sharing information between analysts has considerable value.  Consider this post...who had heard of the issue with imm32.dll prior to reading that post?  We all become smarter through sharing information and intelligence.  This way, we're able to incorporate not just our own intelligence into our analysis processes, but we're also able to extend our capabilities by adding intelligence derived and shared by others.

HowTo: Determine/Detect the use of Anti-Forensics Techniques

The use of anti-forensics techniques to hide malicious activity (malware installation, intrusion, data theft, etc.) can be something of a concern during an examination; in fact, in some cases, it's simply assumed when particular data or artifacts can't be found.  It's easy to assume that these techniques were used when we look at a very limited range of artifacts; however, as we begin to incorporate additional and more comprehensive data sources into our analysis processes, we begin to be able to separate out the anti-forensics signal from the noise.

The term "anti-forensics" can refer to a lot of different things.  When someone asks me about this topic, I generally try to get them to describe to me what they're referring to, and to be more specific.  As with anything else, nomenclature can be important, and messages get scrambled when the use of terms becomes too loose.  Rather than address this as a broad topic, I thought we'd take a look at some of the common techniques used to hide evidence on or remove it from a system...

TimeStomp
One of perhaps the most publicly discussed anti-forensic techniques is referred to as time stomping, in part due to the name of the tool used to demonstrate this capability.  While this initially threw a monkey wrench into our analysis processes, it was quickly realized that the use of this sort of technique (and tool) could be detected.  Then, as things tend to go in any eco-system, there was an adaptation to the technique...rather than modifying a 64-bit time stamp with a 32-bit value, the technique was adapted to copy the file times from kernel32.dll onto the target file, preserving 64-bit granularity.  Once again, analysis techniques were updated.  For example, in 2009, Lance Mueller talked about detecting the use of time changing utilities in his blog.  There's been discussion regarding techniques for changing the $FILE_NAME attribute time stamps, as well as those within the $STANDARD_INFORMATION attribute, just as there have been techniques for detecting those changes.  Direct and thorough analysis of the MFT (predicated on a thorough understanding of the MFT records themselves) can be revealing, and techniques such as detecting program execution and David Cowen's NTFS TriForce can provide valuable insight, as well.

Tools: MFT parser, knowledge of MFT records
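As an illustration of the sort of checks involved, here is a Python sketch of two heuristics commonly described for detecting time stomping; the function names and inputs are illustrative (a real check would pull these values from an MFT parser), and neither heuristic is conclusive on its own:

```python
# Hedged sketch: two heuristic timestomp checks against raw 64-bit FILETIME
# values pulled from an MFT record.  Field names here are illustrative.

def low_dword_zeroed(filetime):
    """Tools that write 32-bit values leave the low-order DWORD of the
    64-bit FILETIME zeroed -- a possible stomping indicator."""
    return filetime != 0 and (filetime & 0xFFFFFFFF) == 0

def si_precedes_fn(si_created, fn_created):
    """$STANDARD_INFORMATION creation earlier than $FILE_NAME creation is
    anomalous for most files and worth a closer look."""
    return si_created < fn_created
```

Either condition firing is a flag for deeper analysis, not proof by itself; legitimate system activity can produce anomalies, too.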

Changing the System Time
Okay, let's say that rather than changing the times of specific files, an intruder changes the system time itself.  This would mean that, after that change, the times recorded by the system would be different...so how could we detect this?  One way is to list the available Event Log records by sequence number and generated time...if the system time were rolled back, the change becomes evident at the point where the sequence numbers continue to increase but the generated time is earlier than that of the previous record.  Lance Mueller's first forensic practical exercise provided a great example of how to detect system time changes using this technique.

Tools: evtparse.pl ('-s' switch)
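The sequence-number check can be sketched as follows; the input format (tuples of sequence number and generated time) is an assumption on my part, standing in for whatever an EVT parser emits:

```python
# Hedged sketch: walk event records in sequence-number order and report any
# point where the generated time moves backward -- a possible system time change.

def find_time_rollbacks(records):
    """records: iterable of (sequence_number, generated_epoch_time) tuples.
    Returns a list of (previous_record, offending_record) pairs."""
    rollbacks = []
    prev = None
    for seq, gen_time in sorted(records):
        if prev is not None and gen_time < prev[1]:
            rollbacks.append((prev, (seq, gen_time)))
        prev = (seq, gen_time)
    return rollbacks
```

Each pair returned marks a point in the log where the clock appears to have jumped backward between consecutive records.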

Zapping Event Records
I've heard analysts state that there were gaps in the available Event Logs, so an intruder must have been able to remove specific event records from the log.  Again, I've heard this claimed, but I've never seen the data to support this sort of thing.  Writing a tool to do this is hazardous to the intruder...it may not work, and may instead crash the system.  Why not just do something much simpler, such as (given the appropriate privileges) clearing the Event Log and disabling auditing altogether?

I've had to analyze a number of Windows systems where the Event Logs have been cleared, and with Windows XP and 2003 systems in particular, it's been pretty trivial to recover a good deal of those deleted event records.

Checking the LastWrite time of a Registry key within the Security hive file (see the auditpol.pl RegRipper plugin) will help you determine when the audit policy of the system was last modified.

Multiple Techniques
What we've discussed thus far was not intended to be a comprehensive listing of all anti-forensics techniques; rather, I wanted to look at a couple and point out analysis processes that you could employ to detect the use of such techniques.  The thing about using anti-forensics techniques is that less is better; the fewer and simpler the techniques used, the harder they are to address.  For example, simply deleting a file...downloader, executable file, etc...after use is perhaps the simplest technique, as it prevents an analyst from obtaining a copy of the file for analysis.  Say a script downloads and executes a file, then deletes it...the analyst may still find artifacts to indicate that the file was executed (i.e., Prefetch file, AppCompatCache artifacts, etc.) but not be able to determine explicitly what the file was designed to do.

However, to use multiple techniques requires additional planning and effort.  If this is done automatically, then either a larger application, or multiple applications will need to be downloaded to the system.  The problem that the intruder then runs up against is that the applications have to be tested specifically against the version of Windows that has been compromised...different versions of Windows may have different functionality behind the API, and the applications may not work correctly, or may even crash the system.  The more "sophisticated" the technique used, the more planning and effort is required.  If multiple applications are used, it's more likely that indications of program execution will be found.  If a more manual approach is used, then the intruder must spend more time and engage with the system more, again leaving more tracks and artifacts as they interact with the environment.

Summary
The key things to remember with respect to determining or detecting the use of anti-forensics techniques are:

1.  If you suspect it, prove it.  Find the evidence.  If you suspect that a particular technique has been used, gather the data that supports, or ultimately disproves, your theory.  Don't just wave your hand and suggest that "anti-forensics techniques were used."  If you suspect that one or more techniques were used, identify them explicitly.  Then, you can pursue demonstrating or disproving your theory.

2.  Remember that you're not only on the same battlefield as the bad guy, but you actually have an advantage.  You're examining an acquired image, which is the "scene of the crime", frozen in time and unchanging.  You can go back and start your analysis over again, without the fear of losing any of your previous artifacts.

3.  Document your analysis; if you didn't document it, it didn't happen.  Once you've documented your analysis, including both what worked and what didn't, you can then incorporate your findings into future analysis, as well as share your finding with other analysts.

HowTo: Investigate an Online Banking Fraud Incident

A recent comment over on Google Plus caught my attention, and I thought it was important enough to incorporate into a HowTo post.  The comment was made with respect to the HowTo: Detecting Persistence Mechanisms post, and had to do with another means of persistence associated specifically with (according to the person who left the comment) online banking fraud.

Online banking fraud is a significant issue.  It's long been known that criminals go where the money is, and there are some interesting aspects with regards to this criminal activity.  For example, by targeting the computers used by small businesses to perform online banking, very often no investigation is done.  After all, a small business has just lost a significant amount of money, and possibly gone out of business...who's going to pay to have a thorough examination of the system performed?  Further, in the US, small businesses are not protected in the same manner as individuals, so once the money's gone, it's gone.

Small businesses can take a number of steps in order to protect themselves from online banking fraud; however, there is a lack of information and intel available that law enforcement can use to pursue the criminals, simply due to the fact that the systems used do not seem to be examined.  A thorough examination can be, and should be, conducted immediately so that law enforcement has the information that they need to pursue an investigation.

Whodunit?
Brian Krebs has discussed the Zeus malware quite extensively in his blog, particularly with respect to online banking fraud. More often than not, once Brian has been contacted, the systems have likely already been wiped and repurposed, without any sort of examination having been conducted.

However, there are more ways of committing this sort of criminal activity than simply using Zeus; W32/Crimea is one example.  During the infection process, the PE header of imm32.dll is modified (for Windows XP, Windows File Protection is disabled temporarily) to point to a malicious DLL, which is loaded when imm32.dll is loaded.  Imm32.dll is associated with keyboard interaction, so it's loaded by a number of processes, including the web browser.  The malicious DLL focuses specifically on retaining information entered into browser form fields associated with specific sites (i.e., online banking sites).  The collected information is not stored locally; instead, it is immediately shuttled off of the system.  As such, the artifacts are minimal, at best.  The most significant artifacts were found through examination of the malicious DLL itself, which then led to findings in the pagefile.

Another example is one pointed out by Sandro Suffert on Google Plus...he mentioned that a "downloader" had modified two Registry settings (modified or created two values):

- Within the HKLM\Software\Microsoft\Windows\CurrentVersion\Internet Settings key, the AutoConfigURL value pointed to either a local or remote .pac file

- Within the HKCU\Software\Policies\Microsoft\Internet Explorer key, the Autoconfig value was set to 0x1.

I cannot attest to these key paths or values being correct, as I have not seen the data.  However, this is an interesting technique to use, as Sandro pointed out that particularly with a remote .pac file, there's no actual malware on the system, and therefore no file for AV to alert on.  Yet, this technique allows the bad guys to capture information using a man-in-the-middle attack.
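As a triage aid, the two settings Sandro described could be checked along these lines; the value names come from his description (which, again, I haven't verified), and the dictionary inputs here simply stand in for the value data a tool like RegRipper would hand a plugin:

```python
# Hedged sketch: flag the proxy auto-config (.pac) persistence described above.
# `inet_settings` and `ie_policy` stand in for the value lists of the
# Internet Settings key (HKLM) and the Internet Explorer policy key (HKCU).

def check_pac_hijack(inet_settings, ie_policy):
    """Inputs are {value_name: data} dictionaries; returns a list of findings."""
    findings = []
    url = inet_settings.get("AutoConfigURL")
    if url:
        findings.append("AutoConfigURL set: %s" % url)
    if ie_policy.get("Autoconfig") == 1:
        findings.append("Policy Autoconfig enabled")
    return findings
```

A remote URL in AutoConfigURL is the more interesting case, since (as noted above) there may be no malware left on the system at all.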

Similar techniques are used by Trojan-Banker.Win32.Banbra, Troj-MereDrop, as well as this baddie from ThreatExpert.

As Hamlet said to Horatio, "...there are more things in heaven and earth than are dreamt of in your philosophy..."; Win32\Theola is a Chrome plugin that is used to commit online banking fraud.

Examination
So...in order to investigate a potential online banking fraud issue, as soon as this issue is suspected (or a small business is notified of such an issue), immediately sit down with the employee responsible for conducting online banking and determine all of the systems that they used for this activity.  You may find that they have one system from which they conduct this activity, or you will find out that they had an issue at some point and used another system.  Immediately isolate that system, and depending upon the timeframe of the fraudulent activity, acquire a dump of physical memory from the system.  Then, acquire an image of the system and conduct a thorough examination, or contact someone who can.

If you create a timeline of system activity, it should go without saying that you should focus your attention on activity that occurred prior to the date of the fraudulent transaction (or the first one, if there are several).

Resources
MS KB: How to reset your IE proxy settings

Data Structures, Revisited

A while back, I wrote this article regarding understanding data structures.  The importance of this topic has not diminished with time; if anything, it deserves much more visibility.  Understanding data structures provides analysts with insight into the nature and context of artifacts, which in turn provides a better picture of their overall case.

First off, what am I talking about?  When I say, "data structures", I'm referring to the stuff that makes up files.  Most of us probably tend to visualize files on a system as being either lines of ASCII text (*.txt files, some log files, etc.), or an amorphous blob of binary data.  We may sometimes even visualize these blobs of binary data as text files, because of how our tools present the information found in those blobs.  However, as we've seen over time, there are parts of these blobs that can be extremely meaningful to us, particularly during an examination.  For example, in some of these blobs, there may be an 8-byte sequence that is the FILETIME format time stamp that represents when a file was accessed, or when a device was installed on a system.
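As a concrete example, converting one of those 8-byte FILETIME values (a count of 100-nanosecond intervals since January 1, 1601) into a Unix epoch time is a few lines of Python:

```python
# Convert a 64-bit FILETIME (100ns ticks since 1601-01-01 UTC) to Unix epoch.
import struct

EPOCH_DELTA = 11644473600  # seconds between 1601-01-01 and 1970-01-01

def filetime_to_epoch(raw8):
    """raw8: 8 bytes, little-endian, as read straight out of a binary structure."""
    (ft,) = struct.unpack("<Q", raw8)
    return ft // 10000000 - EPOCH_DELTA
```

Once you can pick these values out of a blob of binary data, a lot of "amorphous" files suddenly have meaningful time stamps in them.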

A while back, as an exercise to learn more about the format of the IE (version 5 - 9) index.dat file, I wrote a script that would parse the file based on the contents of the header, which includes a directory table that points to all of the valid records within the file, according to information available on the ForensicsWiki (thanks to Joachim Metz for documenting the format, the PDF of which can be found here).  Again, this was purely an exercise for me, and not something monumentally astounding...I'm sure that we're all familiar with pasco.  Using what I'd learned, I wrote another script that I could use to parse just the headers of the index.dat as part of malware detection, the idea being that if a user account such as "Default User", LocalService, or NetworkService has a populated index.dat file, this would be an indication that malware on the system is running with System-level privileges and communicating off-system via the WinInet API.  I've not only discussed this technique on this blog and in my books, but I've also used this technique quite successfully a number of times, most recently to quickly identify a system infected with ZeroAccess.

More recently, I was analyzing a user's index.dat, as I'd confirmed that the user was using IE during the time frame in question.  I parsed the index.dat with pasco, and did not find any indication of a specific domain in which I was interested.  I tried my script again...same results.  Exactly.  I then mounted the image as a read-only volume and ran strings across the user's "Temporary Internet Files" subfolders (with the '-o' switch), looking specifically for the domain name...that command looked like this:

C:\tools>strings -o -n 4 -s | find "domain" /i

Interestingly enough, I got 14 hits for the domain name in the index.dat file.  Hhhhmmmm....that got me to thinking.  Since I had used the '-o' switch in the strings command, the output included the offsets within the file to the hits, so I opened the index.dat in a hex editor and manually scrolled on down to one of the offsets; in the first case, I found full records (based on the format specification that Joachim had published).  In another case, there was only a partial record, but the string I was looking for was right there.  So, I wrote another script that would parse through the file, from beginning to end, and locate records without using the directory table.  When the script finds a complete record, it will parse it and display the record contents.  If the record is not complete, the script will dump the bytes in a hex dump so that I could see the contents.  In this way, I was able to retrieve 10 complete records that were not listed in the directory table (and were essentially deleted), and 4 partial records, all of which contained the domain that I was looking for.
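For anyone wanting to experiment with the same approach, the carving loop can be sketched as follows; the record signatures and the 0x80-byte block size come from the format documentation mentioned above, while the sanity limit on the length field is my own choice:

```python
# Hedged sketch: carve index.dat activity records by signature rather than by
# the directory table, so deleted/unlisted records are found as well.
import struct

SIGS = (b"URL ", b"REDR", b"LEAK")  # record types per the format documentation
BLOCK = 0x80                        # record length is a count of 0x80-byte blocks

def carve_records(data):
    """Return sorted (offset, record_bytes) pairs for every signature hit."""
    hits = []
    for sig in SIGS:
        pos = data.find(sig)
        while pos != -1:
            if pos + 8 <= len(data):
                (nblocks,) = struct.unpack("<I", data[pos + 4:pos + 8])
                if 0 < nblocks < 0x1000:  # sanity check on the length field
                    hits.append((pos, data[pos:pos + nblocks * BLOCK]))
            pos = data.find(sig, pos + 1)
    return sorted(hits)
```

Each carved record would then be handed to the normal record parser; anything that fails to parse cleanly gets the hex-dump treatment described above.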

Microsoft refers to the compound file binary file format as a "file system within a file", and if you dig into the format document just a bit, you'll start to see why...the specification details sectors of two sizes, not all of which are necessarily allocated.  This means that you can have strings and other data buried within the file that are not part of the file when viewed through the appropriate application.
CFB Format
The Compound File Binary Format document available from MS specifies the use of a sector allocation table, as well as a small sector allocation table. For Jump Lists in particular, these structures specify which sectors are in use; mapping the ones that are in use, and targeting just those sectors within the file that are not in use can allow you to recover potentially deleted information.
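A sketch of that classification step, assuming the FAT entries have already been read out of the file as 32-bit values (the sector-chaining logic is omitted, and the default 512-byte sector size is an assumption):

```python
# Hedged sketch: classify FAT entries from a compound file binary (CFB);
# FREESECT entries mark sectors not in use -- candidates for residual data.
# Other sentinels (ENDOFCHAIN 0xFFFFFFFE, FATSECT 0xFFFFFFFD) mark in-use sectors.
FREESECT = 0xFFFFFFFF

def free_sectors(fat_entries):
    """fat_entries: list of 32-bit FAT values; returns indices of free sectors."""
    return [i for i, entry in enumerate(fat_entries) if entry == FREESECT]

def sector_offset(index, sector_shift=9):
    """Byte offset of sector `index`; the header occupies the first
    sector-sized chunk of the file, hence the +1."""
    return (index + 1) << sector_shift
```

With the free sector indices in hand, `sector_offset()` tells you exactly which byte ranges of the file to examine for residual content.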

MS Office documents no longer use this file format specification, but it is used in *.automaticDestinations-ms Jump Lists on Windows 7 and 8.  The Registry is similar, in that the various "cells" that comprise a hive file can allow for a good bit of unallocated or "deleted" data...either deleted keys and values, or residual information in sectors that were allocated to the hive file as it continued to grow in size.  MS does a very good job of making the Windows XP/2003 Event Log record format available; as such, not only can Event Logs from these systems be parsed on a binary basis (locating valid records within the .evt file that are "hidden" by the information in the header), but records can also be recovered from unallocated space and other unstructured data.  MFT records have been shown to contain useful data, particularly as a file moves from being resident to non-resident (specific to the $DATA attribute), and that can be particularly true for systems on which MFT records are 4K in size (rather than the 1K that most of us are familiar with).

Understanding data structures can help us develop greater detail and additional context with respect to the available data during an examination.  We can recover data from within files that is not "visible" in a file by going beyond the API.  Several years ago, I was conducting a PCI forensic audit, and found several potential credit card numbers "in" a Registry hive...understanding the structures within the file, and a bit of a closer look revealed that what I was seeing wasn't part of the Registry structure, but instead part of the sectors allocated to the hive file as it grew...they simply hadn't been overwritten with key and value cells yet.  This information had a significant impact on the examination.  In another instance, I was trying to determine which files a user had accessed, and found that the user did not have a RecentDocs key within their NTUSER.DAT; I found this to be odd, as even a newly-created profile will have a RecentDocs key.  Using regslack.exe, I was able to retrieve the deleted RecentDocs key, as well as several subkeys and values.
Summary
Understanding the nature of the data that we're looking at is critical, as it directs our interpretations of that data. This interpretation will not only direct subsequent analysis, but also significantly impact our conclusions. If we don't understand the nature of the data and the underlying data structures, our interpretation can be significantly impacted. Is that credit card number, which we found via a search, actually stored in the Registry as value data? Just because our search utility located it within the physical sectors associated with a particular file name, do we understand enough about the file's underlying data structures to understand the true nature and context of the data?

Links


Artifacts
Jason Hale has a new post over on the Digital Forensics Stream blog, this one going into detail regarding the Search History artifacts associated with Windows 8.1.  In this post, Jason points out a number of artifacts, so it's a good idea to read it closely.  Apparently, with Windows 8.1, LNK files are used to maintain records of searches. Jason also brought us this blog post describing the artifacts of a user viewing images via the Photos tile in Windows 8 (which, by the way, also makes use of LNK streams...).

Claus is back with another interesting post, this one regarding Microsoft's Security Essentials download.  One of the things I've always found useful about Claus's blog posts is that I can usually go to his blog and see links to some of the latest options with respect to anti-virus applications, including portable options.

Speaking of artifacts, David Cowen's Daily Blog #81 serves as the initiation of the Encyclopedia Forensica project.  David's ultimate goal with this project is to document what we know, from a forensic analysis perspective, about major operating systems so that we can then determine what we don't know.  I think that this is a very interesting project, and one well worth getting involved in, but my fear is that it will die off too soon, from nothing more than lack of involvement.  There are a LOT of folks in the DFIR community, many of whom would never contribute to a project of this nature.

One of perhaps the biggest issues regarding knowledge and information sharing within the community, that I've heard, going back as far as WACCI 2010 and beyond, is that too many practitioners simply feel that they don't have any means for contributing to the community in a manner that allows them to do so.  Some want to, but can't be publicly linked to what they share.  Whatever the reason, there are always ways to contribute.  For example, if you don't want to request login credentials on the ForensicsWiki and actually write something, how about suggesting content (or clarity or elaboration on content) or modifications via social media (Twitter, G+, whatever...even directly emailing someone who has edited pages)?

Challenges
Like working forensic challenges, or just trying to expand your skills?  I caught this new DFIR challenge this morning via Twitter, complete with an ISO download.  This one involves a web server, and comes with 25 questions to answer.  I also have some links to other resources on the FOSS Tools page for this blog.

Speaking of challenges, David Cowen's been continuing his blog-a-day challenge, keeping with the Sunday Funday challenges that he posts.  These are always interesting, and come with prizes for the best, most complete answers.  These generally don't include images, and are mostly based on scenarios, but they can also be very informative.  It can be very beneficial to read winning answers come Monday morning.

Academia
I ran across this extremely interesting paper authored by Dr. Joshua James and Pavel Gladyshev, titled Challenges with Automation in Digital Forensics Investigations.  It's a bit long, with the Conclusions paragraph on pg. 14, but it is an interesting read.  The paper starts off by discussing "push-button forensics" (PBF), then delves into the topics of training, education, licensing, knowledge retention, etc., all issues that are an integral part of the PBF topic.

I fully agree that there is a need for intelligent automation in what we do.  Automation should NOT be used to make "non-experts useful"...any use of automation should be accompanied with an understanding of why the button is being pushed, as well as what the expected results should be so that anomalies can be recognized.

It's also clear that some of what's in the paper relates back to Corey's post about his journey into academia, where he points out the difference between training and education.

Video
I ran across a link to Mudge's comments at DefCon21.  I don't know Mudge, and have never had the honor of meeting him...about all I can say is that a company I used to work for used the original L0phtCrack...a lot.  Watching the video and listening to the stories he shared was very interesting, in part because one of the points he made was getting out and engaging with others, so that you can see their perspectives.
