7. Memory Acquisition (OBJ 4.4)
Memory acquisition. Now, as we go through our order of volatility, one of the things we need to collect very early on is our system memory, the stuff that’s stored inside of RAM. The way we do this is through system memory image acquisition. This is a process that creates an image file of the system memory that can be analysed to identify the processes that were running, the contents of temporary file systems, registry data, network connections, cryptographic keys, and much more.
Now, there are lots of different ways to do this, and we’re going to talk about the four most common categories in this lesson. These four categories are live acquisition, crash dumps, hibernation files, and page files. First, live acquisition. Now, “live acquisition” is the process of capturing the contents of memory while the computer is running, and you do this using specialised hardware or software tools. Now, there are lots of different tools out there, but unfortunately, all of these tools need to be preinstalled because they need kernel-mode drivers to be able to function and get the data out of the system. There are many of them available, but the two we’ll discuss here are Memoryze from FireEye and F-Response TACTICAL. Both of these are good tools that you can use to capture the memory contents.
But again, they need to be included as part of your build for your workstations across your enterprise, so they’re ready and waiting when you need to use them. Now, the second way you can capture memory is by performing a crash dump. Essentially, a “crash dump” is when the contents of memory are written to a dump file when Windows encounters some kind of unrecoverable kernel error. These days, most modern computers have a lot of memory. If you look at your standard computer nowadays, it has 8 GB, 16 GB, or 32 GB of RAM. When a crash dump occurs, it does not have time to write all of that to disk. So instead, you’ll usually get the results in a mini-dump file. This may contain some valuable information and potential evidence, but it won’t be a full copy of the memory because there’s simply not enough time to write all of that to disk before the system crashes. The third method we have is known as the “hibernation file.” Now, on every computer, there is a file that will be written to disk whenever the workstation is put into a sleep state.
This is known as the “hibernation file.” Essentially, if you take your laptop and shut the lid, it creates this hibernation file so that when you open the lid again and turn on the computer, it can wake up and go right back to what it was doing. Now, that hibernation file can be read and analysed using different forensic tools, and this will allow you to go through it and find any information that may have been written there. Now, this is a great way to get information, but there are some drawbacks, too. If you’re trying to hunt for malware, some malware is capable of detecting the use of a sleep state. As a result, it will perform anti-forensics to conceal its activity. So a hibernation file won’t be a foolproof way to find everything you’re looking for. Now, our fourth method is looking at the page file. A page file is a file that stores pages of memory that were in use but exceeded the capacity of the host’s physical RAM modules. So let’s say you’re using a little netbook, and it only has 4 GB of RAM. Well, on most Windows systems, that’s pretty minimal these days. And so what will end up happening is that there’ll be this page file or swap file on the computer, and as the system needs to move things into and out of memory, it will use this temporary memory, this page file, which is actually written on the hard drive.
Now, this slows down the overall system, which is something we cover in A+. But for our purposes as digital forensic analysts, this is actually a good thing, because it means there’s data that was written to the hard drive. When we turn off the computer, what’s on the hard drive still stays there, and so we can analyse that information. Now, the bad thing is that this is one page of memory at a time. It’s not the whole thing. As a result, it will appear very random. So maybe you’re not going to pull out entire files, but you can search for strings and try to find interesting things that may help you in your investigation. All right, so those are the four methods that we can use. Now, again, when I’m talking about memory, I’m talking about RAM. This is all something that’s going to lose its contents whenever you power off the system, and so you have to be able to capture that data. As we’ve seen here, you’ll often be able to capture it in a variety of ways. Now, remember, even when you’re dealing with a specialised tool, live acquisition will just generate a snapshot of the data at that moment. This data is changing second by second. RAM changes very quickly, and that’s why it’s high in our order of volatility, and why we want to collect it very early in our collection cycle.
Remember, if I have a computer and it’s been sitting here for 3 hours, whatever was on the computer 3 hours ago in RAM may not be there anymore. It may be that whatever I did five minutes ago is not there either. The more your computer is in use, the more things are swapped in and out of RAM, and that makes it even harder to collect. So what kind of things can you expect to find when you start analysing the image from memory? Well, there are lots of things you can find. You might be able to find a list of the running processes at the time of collection. You may find password hashes that could be useful. You might find cryptographic keys that can help you unlock encrypted hard drives that you won’t be able to access when you shut down the computer if you don’t have those keys. You might be able to find registry keys that were useful. You might find cache files, and you might find strings in open files. All of these are things that are going to be useful to you as you’re going through your analysis and your investigation. And so it’s important to collect that information by acquiring your memory image early on in the process.
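To make the idea of searching a memory image or page file for strings a bit more concrete, here is a minimal Python sketch. It is only an illustration, not how any particular forensic tool works: the function name `extract_strings`, the six-character minimum, and the sample bytes are all assumptions made up for this example. It pulls out runs of printable ASCII from raw bytes, which is roughly what a strings search does.

```python
import re


def extract_strings(data: bytes, min_len: int = 6) -> list[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    # 0x20-0x7e is the printable ASCII range (space through tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Hypothetical stand-in for a chunk read from a memory image or page file.
sample = b"\x00\x01password=hunter2\x00\xff\xfeabc\x00http://example.test/c2\x90"

for found in extract_strings(sample):
    print(found)
```

On a real image you would read the file in chunks rather than loading it all at once, and you would also look for UTF-16 strings, which Windows uses heavily. This sketch skips both to stay short.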
8. Disk Image Acquisition (OBJ 4.4)
Forensic tools. In this lesson, we are going to talk about forensic tools, which are the specialised applications and hardware that we use for data collection, data acquisition, and data analysis. Now, a digital forensics kit is something we’re going to put together with a lot of different tools in it. This is a kit containing the software and hardware tools required for us to acquire and analyse evidence from system memory dumps and mass storage file systems.
Now this is important because digital forensics software is specialised software that’s designed to assist in the collection and analysis of this digital evidence. You can’t just copy evidence like you would a file from your hard drive onto a USB drive. You can’t just drag and drop it. There are special ways you have to do this to make sure it’s forensically sound. Now, there are lots of different tools out there, but the ones we’re going to focus on in this lesson are EnCase, the Forensic Toolkit, also known as FTK, and The Sleuth Kit. These are the three big ones we’re going to cover inside the CySA+ curriculum. Now, just a quick exam note. You don’t need to know how to use these tools for the exam. You should know what they are, and if you see their names, you should know they’re digital forensic tools. First, we have EnCase. EnCase is a digital forensic case management product that was created by Guidance Software. It uses built-in pathways, or workflow templates, that show you the key steps in many types of investigations. Remember how I said it was really important to follow a written process?
Well, these pathways and workflows help you do that. They give you all the key steps that you need as you’re going through your process, almost like a checklist. Now, when you look at EnCase, it is a graphical user environment, and it runs on Windows. The great thing about it is that you can use it for both acquisition and analysis. EnCase is a very powerful tool. It can read bit-by-bit copies of the hard drive, do analysis inside the slack space and the deleted files, and bring those files back to life. It can also help you with things like timeline generation and lots more. We’re going to talk about more of these features later on in this section. Next, we have FTK, the Forensic Toolkit. This is a digital forensic investigation suite from AccessData, and it runs on Windows servers or server clusters, which allows for faster searching and analysis because of the way it indexes data whenever you import evidence. Now, most of the features you’re going to find in EnCase, you’re going to find in FTK as well. They are really the two big competitors in the digital forensic software market, and they are both commercial solutions that cost a lot of money.
When you look at FTK, you see a lot of the same things that you see in EnCase. It has the same kind of style inside the windows. You can see the binary data written in hexadecimal with the ASCII at the bottom. Off to the side, you can see the list of files that it has found as it has gone through this hard drive. And you can see that it has basically the same type of stuff that you found in EnCase. Next, I want to talk about The Sleuth Kit. Now this is a good one for you to start with when learning digital forensics. The reason for this is that it’s an open-source collection of a lot of different command-line tools and programming libraries for disk imaging and file analysis. And it interfaces with a program called Autopsy, which is the graphical user interface for this kit. Now, the great thing about The Sleuth Kit is that it is a completely free and open-source solution. So you can go to Google, search for The Sleuth Kit, download it, install it on your machine, and start playing with it right now.
Now The Sleuth Kit, through Autopsy, looks a lot like the other ones when you’re using it on Windows. Again, it’s a graphical environment, and it’s basically made to be a clone of FTK or EnCase, but in the open-source, free-for-you-to-use market. So you may be wondering, “Jason, which one should I learn to use?” Well, it really comes down to which one your organisation uses. Now, I like to start out with The Sleuth Kit because, again, it’s free and open source. However, if you have access to EnCase or FTK, you should try learning those as well, and both of those do have free demos that you can download and use. Now, as an analyst, which one are you going to become proficient in and use? Well, most likely you’re going to be using EnCase or FTK. Why? Because if you’re doing forensic analysis, you’re probably working for a corporation or for law enforcement. Most law enforcement agencies use either FTK or EnCase, and which one you’re going to use is going to be based on the place that hires you. If they’re already using EnCase, that’s what you’re going to use. If they use FTK, that’s what you’re going to use.
And that’s the idea here. As a student, I would go ahead and download The Sleuth Kit and start learning how it works. I’ll actually show you how I use The Sleuth Kit a little bit later on in this section as I do a demonstration for you. Now, in addition to having the software, you also need hardware, and one of the biggest pieces of hardware you need is a digital forensics workstation. A digital forensics workstation is going to be a very powerful computer, and you can see a couple of them here on the screen. These are standalone forensic systems with lots of power behind them. You’re going to have multiple processors, multiple cores, and lots of main memory, usually 32, 64, or 128 GB. You’re going to need a very large hard drive to store the data internally as you’re working on it, and you need to have access to data stored offsite as well.
You’re going to need fast SSDs to be able to run all this stuff. And you’re going to have a wide variety of drive host bus adapters, things like EIDE, SATA, SCSI, SAS, USB, FireWire, Thunderbolt, and pretty much any other connection mechanism you may need, because you might need to connect an external drive of some kind to your system to import that evidence. So you want to make sure you have access to all of that. In addition to that, you’re going to have optical drives for CD, DVD, and Blu-ray, and even memory card readers, all of this in one machine. Now, in addition to all this, your forensic workstation also needs access to a high-capacity disk array subsystem, like a RAID or a storage area network. And the reason for this is simple: the evidence files are huge. If you took an evidence collection on my personal computer right now, you would have two terabytes’ worth of data, and that’s just on one of my machines. I have four or five machines sitting around my office right now. If you came in here and collected data on each one of those, you’d have 5, 10, or 15 terabytes of data to store. If you came in to look at my server, that’s 40 terabytes of data. And so you need to have access to some place to put these huge evidence files as you’re collecting them. Another thing I mentioned before was that we always want to do our analysis on copies of our acquired images. You never do it on the actual drives themselves.
So the way this works is that you’ll have the original evidence, which could be a hard drive or an SSD. You’re going to make a bit-by-bit copy of that and acquire it using your digital forensic tools. Now that you have that acquired image, we’re not going to do the analysis on that image either. We’re going to make a copy of that image, and that’s what we’re going to do our analysis on. This way, we can always go back to the acquired image, make another copy, and do more analysis without affecting the original. One last big warning here in this lesson: as an analyst, you should always make sure your forensic workstation is prohibited from accessing the Internet. You don’t want to connect to the Internet. Why? Because if you can connect to the Internet, that means the Internet can connect to you. And if the Internet connects to you, there’s a possibility your forensic workstation could be compromised with malware. It could get a remote access Trojan, and then a bad guy could get in and start manipulating your evidence, making it say something other than what it’s supposed to say. So you always want to make sure that your forensic workstation is cut off from the Internet. This way, your evidence stays pure, as does your machine.
9. Hashing (OBJ 4.4)
Hashing. Now, I’ve mentioned a couple of times the idea of a hash or a hash digest, and this is an important concept that we have to cover. Now, if you’ve taken Security+, this lesson is going to be a review for you. But that’s okay, because it’s really important when you’re dealing with digital forensics to ensure you have a good hash. When you look at a hash, all it is is a digital fingerprint. It uniquely identifies a file, a folder, a drive image, or anything like that. Now a hash is produced by a cryptographic function that converts an arbitrary-length input into a fixed-length output. There are a couple of different hash algorithms out there, things like SHA and MD5, and we’re going to talk about both of them as we go through this lesson. Now, the first one I want to talk about is SHA, because that is the standard these days. SHA is the Secure Hash Algorithm, a cryptographic hashing algorithm that was created to address the possible weaknesses in older algorithms like the MD5 hashing algorithm. Unlike the older MD5, SHA-1 uses a larger bit size. So if you’re using SHA-1, it’s going to have a 160-bit hash digest.
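As a quick illustration of that arbitrary-length-in, fixed-length-out idea, here is a small Python sketch using the standard `hashlib` module. The helper name `digest_bits` and the sample inputs are just assumptions for this example. It hashes inputs of very different sizes and reports the digest length in bits for each algorithm.

```python
import hashlib


def digest_bits(algorithm: str, data: bytes) -> int:
    """Return the length of the hash digest, in bits, for the given input."""
    return hashlib.new(algorithm, data).digest_size * 8


# Inputs ranging from one byte to roughly a megabyte.
inputs = [b"a", b"a whole sentence of input", b"x" * 1_000_000]

for algorithm in ("md5", "sha1", "sha256", "sha512"):
    sizes = {digest_bits(algorithm, data) for data in inputs}
    print(algorithm, sizes)
```

Each algorithm should report a single digest size no matter how large the input is: 128 bits for MD5, 160 for SHA-1, and 256 or 512 for the two SHA-2 variants shown.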
But SHA-1 isn’t really considered strong these days. It was in 1995 when it was released, but algorithms get weaker over time as computers get stronger. So instead they brought out SHA-2. SHA-2 uses a 256- or 512-bit hash digest, and it’s the current version that’s used in modern forensics. Now, before that became the most popular, we used to use something called MD5. MD5 is the Message Digest algorithm. This was a cryptographic hashing algorithm created all the way back in 1991, so over 30 years ago. Within the Message Digest family, MD5 was the most commonly used variant. So you might hear it called the Message Digest Algorithm, or more specifically, MD5, which is the one that we use most of the time. Now the problem with MD5 is that it uses a 128-bit hash digest. This makes it susceptible to collisions, and therefore it should only be used as a second factor of integrity checking. Now, what do I mean by a collision? Well, a collision occurs when two different inputs that we feed into an algorithm give us the same output. Remember that all of these hash algorithms can take an arbitrary input. So whether I give them a word, a sentence, a paragraph, a book, or a library, I’m still only going to get a hash digest that’s the same fixed length.
In this case, MD5 is only 128 bits long. Now that means there are only so many possible outputs, and eventually there are going to be two inputs that give you the same output. That’s what a collision is, and we want to avoid those because collisions mean we can have multiple files with the same digital fingerprint, and that would be bad, especially in forensics. So you may be asking yourself, “What tools can actually create these hash values and calculate them for us?” Well, I already mentioned the fact that EnCase and FTK will do this for you during your image acquisition. However, there are numerous other options. For instance, in the Windows operating system, there is a built-in command known as certutil. You give certutil a file and tell it what algorithm you want, such as MD5, SHA-1, SHA-256, or SHA-512, and it will give you that hash digest. Another one you can use is the File Checksum Integrity Verifier, or FCIV. This is a downloadable utility that you can use as an alternative to certutil, and again, it works on Windows. If you happen to be running Linux, there are lots of great utilities for this too, such as md5sum, sha1sum, sha256sum, and sha512sum. All of these are tools that calculate those particular algorithms and give you that particular hash digest. So with all that discussion around hashing, what do we really want to use hashing for?
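If you want to see roughly what utilities like certutil or sha256sum are doing under the hood, here is a minimal Python sketch. It is not the implementation of any of those tools. The function name `hash_file`, the one-megabyte chunk size, and the temporary file standing in for a disk image are assumptions for this example. The file is read in chunks so that a multi-terabyte image never has to fit in memory.

```python
import hashlib
import os
import tempfile


def hash_file(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks and return the hex digest."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


# Hypothetical stand-in for an acquired disk image.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"pretend these bytes are a disk image")
    image_path = tmp.name

hash_at_acquisition = hash_file(image_path)
hash_before_analysis = hash_file(image_path)
print("image unchanged:", hash_at_acquisition == hash_before_analysis)

os.remove(image_path)
```

The point of hashing twice is the forensic habit: record the digest when you acquire the image, then recompute it before analysis to show the image has not changed.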
Well, it provides us with that digital fingerprint, and so it can be used to prove the file integrity of your operating system files and application files. This can be done using a software utility that provides File Integrity Monitoring, or FIM. This is a type of software that reviews your system files to ensure they haven’t been tampered with. How does it do this? Well, when you install some kind of software, for instance, the Windows operating system, Microsoft gives you a list of authorised and approved hash values for those files. Anytime the File Integrity Monitoring program thinks that a file may have changed, it can run a hash on that file and compare it to the known-good file hash. If they don’t match, that will be flagged as something bad because somebody has modified this file. This is a great way to check whether some kind of malware has modified your system files or any of your known applications. This can also be done against your data files, and some third-party utilities will do that, but it’s a lot less common. Most of the time, you’re going to see this used against system files and application files.
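The compare-against-known-good idea behind File Integrity Monitoring can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any real FIM product works: the file names are hypothetical, the file contents are in-memory byte strings rather than files on disk, and the baseline is computed in the script instead of coming from a vendor.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hex digest of some bytes."""
    return hashlib.sha256(data).hexdigest()


def find_modified(baseline: dict[str, str], current: dict[str, bytes]) -> list[str]:
    """Return the names whose current hash does not match the known-good hash.

    Files that are missing from the baseline are flagged as well.
    """
    return sorted(
        name
        for name, content in current.items()
        if baseline.get(name) != sha256_hex(content)
    )


# Hypothetical known-good system files and their baseline hashes.
known_good = {"system_a.dll": b"original bytes", "app_b.exe": b"application bytes"}
baseline = {name: sha256_hex(content) for name, content in known_good.items()}

# Simulate malware changing one of the files.
current = {**known_good, "system_a.dll": b"tampered bytes"}

print(find_modified(baseline, current))
```

With the simulated change above, only the tampered file should be reported.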
10. Timeline Generation (OBJ 4.4)
Timeline generation. So at this point, we’ve gone through a lot of our process of acquiring things. We’ve collected memory, we’ve collected the disk image, and we’ve hashed it. But now we begin all of our analysis. Now, a significant portion of our investigation will be devoted to locating that needle in a haystack, determining who touched what files, when, and for what reason.
Now, as we start gathering all that information, there should be a good way to present that information as part of our analysis and our report. How do you want to do that? Well, one of the best ways is to use a timeline. Now a timeline is a tool that will show the sequence of file system events within a source image in a graphical format. There are a lot of different ways I could say, “Here are all the different files I found, when they were touched, and who touched them.” I might put that in a spreadsheet. I might put it in a Word document. But one of the best ways is to graphically depict it, and that’s what the timeline allows you to do. Now some of your tools will help you do this automatically. For instance, here’s an image from EnCase. Notice on the left side of the screen that we can see the file system for the target disk that we’ve been doing our analysis on.
On the right side, you’ll see a timeline view of all the files that were touched at a particular time. For instance, you can see here every minute within this hour of this day. And so we can see here, highlighted in red, that in the 19th minute, all of those files were touched. Now, if you’re on a server, this can be a lot of files. So it’s critical to be able to state that we know bad thing X occurred at this time based on our SIEM data or log data. Once we know that, we can then correlate it with the analysis that we’re looking at, especially if we’re trying to track down an intruder. Now, in the forensic world, this is important as well, because if I’m trying to put a bad guy away for something he did, I need to prove he did it and that he had means, motive, and opportunity. And one of the things that can help me is if I know he was on this computer at this time and this activity happened during that time frame. That’s going to help me create the evidence I need to punish that bad guy. So once you start constructing your timeline in this graphical format, you’ll also back it up with a written report. Now, a lot of things have to go into this report.
And one of the things you’re aiming for as you’re constructing this timeline and your report is to answer a lot of different questions. For instance, you might want to answer: How was access to the system obtained? Was it done remotely or locally? Did they guess somebody’s password or did they steal it? Were they able to break in using some type of exploit? You also want to figure out what kind of tools might have been installed. If it was a bad actor who hacked into your system over the Internet, did they install a Trojan? Did they install some kind of remote access tool? All those are things you want to identify as part of the analysis. Then you also want to think about what changes were made to the files. Again, your analysis and your timeline are going to help you here because you’ll be able to see which files were touched at which particular time. You also want to figure out what data has been retrieved. Were you the victim of a data exfiltration? If that’s the case, you should see the files being read and transmitted over the network, and again, all that can be documented as part of your timeline. And finally, was the data actually exfiltrated? Just because somebody accessed the data doesn’t mean they took it with them when they left the network. And so you need to be able to prove that as well.
All of these are things you’re going to do as part of your analysis as you’re going through your timeline generation. Now, many forensic tools can generate a timeline based on the evidence you’ve been scanning and analysing. For instance, I showed you the one for EnCase earlier. You can also do this with the open-source Sleuth Kit and its companion tool, Autopsy. Here on the screen, you can see the timeline editor within Autopsy. Notice it looks a little different than the way we saw it inside of EnCase. Here, we have the option of specifying which time units we want. Do we want to look at it as years, days, or minutes? How far out or how far in do we want to zoom? We can look at the event type. Is it a base type or a subtype? We can look at the description. Is it going to be short or long? We can apply different filters to only show certain events. For instance, show me all the files somebody touched that were PNG files, which are graphics files.
I can then see a table view or a thumbnail view of all the files that were touched. On the bottom left and on the right side, I can see some data associated with different files that are being highlighted. And up on the top right, you can also see the counts, the details, or a list view of all the things that have been touched. Again, you can start playing with this tool on your own because it is open source, so you can feel free to download it, load it up, and try it out yourself. Now, one of the questions I often get is, “What if your tool doesn’t support creating your own timeline?” Well, if that happens, you can create a sequence of events within a spreadsheet to serve as your timeline. Now, because this is a manual process, it is more time-intensive and requires more work on your part. But it is still helpful to have some sort of timeline included in your report, so people can see exactly what happened, when, and who did it.
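If you do end up building the timeline by hand, the spreadsheet version is really just a list of events sorted by time. Here is a minimal Python sketch of that idea. Everything in it is made up for illustration: the events, paths, user names, column names, and the helper names `build_timeline` and `to_csv` are all assumptions. It sorts the events chronologically and writes them out as CSV text that a spreadsheet can open.

```python
import csv
import io
from datetime import datetime

# Hypothetical events pulled from file system metadata and logs.
events = [
    {"time": "2024-03-01T14:21:07+00:00", "path": "/srv/data/payroll.xlsx", "action": "read", "user": "jsmith"},
    {"time": "2024-03-01T14:19:42+00:00", "path": "/tmp/tool.bin", "action": "created", "user": "jsmith"},
    {"time": "2024-03-01T14:25:30+00:00", "path": "/srv/data/payroll.xlsx", "action": "deleted", "user": "jsmith"},
]


def build_timeline(rows: list[dict]) -> list[dict]:
    """Return the events sorted from earliest to latest."""
    return sorted(rows, key=lambda row: datetime.fromisoformat(row["time"]))


def to_csv(rows: list[dict]) -> str:
    """Render the events as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["time", "path", "action", "user"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(to_csv(build_timeline(events)))
```

Parsing the timestamps before sorting matters once your sources use different time zones, because sorting the raw text would then put events in the wrong order.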
11. Carving (OBJ 4.4)
Carving. I mentioned before that when you delete a file, it’s not really gone from the hard drive. It’s still stored on there, and we need to find a way to get it back. That’s what we’re going to talk about in this lesson when we talk about carving. But before we talk about carving, you have to understand how a hard drive works. If you look at a hard drive, you may think, “Well, that’s just one single platter, and we just store information on it.” But that’s not really true. We actually take that disk and divide it up into smaller areas. Whether you’re using a hard drive or an SSD, the storage is going to be divided into sectors.
These sectors are either 512 bytes of data, which is the standard size, or 4096 bytes of data, which is the Advanced Format size. This allows us to identify individual parts of that disk where we’re going to store files. Now, again, this seems really small when we talk about something like a video file that may be a gigabyte in size. And so what ends up happening is that we end up taking these files and breaking them apart to fit inside these sectors.
Now, in addition to these sectors, we start using blocks and clusters. Now, a block or cluster is the smallest unit a filesystem can address, and by default, this is 4096 bytes. So if you’re using the advanced format, one block or cluster is equal to one sector. But if you’re using the standard sector size of 512 bytes, you’re going to have multiple sectors making up that one block or cluster. Now, because of the way these hard drives work and because we start breaking things up into these blocks, clusters, and sectors, I can take a really big file and make it into a lot of different pieces, and then I can store those pieces all over the hard drive.
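The arithmetic behind sectors and clusters is easy to check for yourself. Here is a small Python sketch using the sizes from this lesson (512-byte and 4096-byte sectors, 4096-byte clusters). The function names are just made up for this example.

```python
def sectors_per_cluster(cluster_size: int = 4096, sector_size: int = 512) -> int:
    """Return how many sectors make up one cluster."""
    if cluster_size % sector_size != 0:
        raise ValueError("cluster size must be a whole multiple of the sector size")
    return cluster_size // sector_size


def clusters_needed(file_size: int, cluster_size: int = 4096) -> int:
    """Return how many clusters a file occupies, rounding up to a whole cluster."""
    return -(-file_size // cluster_size)  # ceiling division using integers only


print(sectors_per_cluster(4096, 512))   # standard 512-byte sectors
print(sectors_per_cluster(4096, 4096))  # Advanced Format 4096-byte sectors
print(clusters_needed(1024 ** 3))       # a 1 GiB video file
print(clusters_needed(10_000))          # a small 10,000-byte file
```

With 512-byte sectors, one 4096-byte cluster is eight sectors, and a 1 GiB file is split across 262,144 clusters. The small file needs three clusters, so part of its last cluster goes unused.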
Now, by default, the file system will begin at sector one and continue writing linearly: 1, 2, 3, 4, 5. But eventually, you delete a file. And so when the file system wants to fill that hole, it’s going to start breaking up your files and putting them across the hard drive in any holes it finds. That’s the idea of how this works. Now, to be able to identify where all these files are, we have to have something called the master file table. The master file table is a table that contains metadata with the location of each file in terms of the blocks or clusters on the disk. This is used within NTFS file systems. If you’re using FAT, it uses a file allocation table instead. But most Windows machines these days use NTFS, so we’ll just talk about it in terms of the master file table for our purposes here in this lesson. Now, when you delete a file, you really aren’t deleting the file. As I said before, you’re deleting the entry for that file. You’re going into the MFT, the master file table, and you’re erasing the metadata that says where that file is located. That marks that spot on the hard drive as “open” and available for another file to be written there.
Again, this can lead to little blocks and little sectors all over the hard drive in various places that you start putting data in. So when you’re deleting a file as a user, you’re actually only deleting the reference in the table, and that converts that previous location to free, or unallocated, space, which you’ll often hear mentioned alongside slack space. This is an important concept to understand because, as forensic analysts, it’s our job to find those files that are deleted and bring them back to life, especially when they contain evidence of a crime. This brings us to the idea of file carving, which was the title for this lesson. Now, when we talk about “file carving,” this is the process of extracting data from a computer when the data has no associated file system metadata. Essentially, somebody has deleted the file, and I want to get that file back. Now again, because there are little pieces of the file all over the hard drive, and those spots are now marked as “open,” pieces could have been overwritten, and so you may not be able to get the entire file back. Instead, by doing file carving, I can attempt to piece together those data fragments from all the unallocated and slack space to reconstruct the deleted files. And if I can’t get the whole file, I can at least get parts of those files.
This is why, if your hard drive fails and you want someone to analyse it and recover lost files for you, or if you accidentally delete something, the first thing we recommend is that you stop using the computer right away. Don’t touch it anymore. Because the more you use it, the more chance there is that something is going to get overwritten and we won’t be able to recover your files. But if you just stop what you’re doing and bring us that computer immediately, we can bring those files back to life for you. Now, there are lots and lots of tools out there that can do file carving. EnCase can do it, FTK can do it, and, of course, Autopsy can do it using The Sleuth Kit. For instance, here on the screen, you’ll see the graphical user interface for Autopsy. We went in here, and we are carving a file. In the upper right-hand corner, you see that PDF file. It says “ebook PDF.” There are two or three of those, and then there’s another PDF file under that with a red X over the file icon. That means this was in unallocated space on the hard drive, and we found something that looks like a PDF file. This appears to be file zero zero plus ebook PDF.
And if you look at the bottom, you can see the metadata associated with it, and we can go through a recovery operation to put that piece back together and, hopefully, get that file back and use it as evidence. Now, the great thing about using something like Autopsy is that it has a graphical user interface, so it’s really easy to point and click and recover your files. Now, another tool you can use is Scalpel. Scalpel is actually the tool that’s being used by Autopsy, but it’s a command-line tool. This is an open-source command-line tool that is part of The Sleuth Kit and is used to conduct file carving on Linux and Windows systems. When you’re using Autopsy to do file carving, you’re actually using Scalpel. You can also use Scalpel directly from the command-line environment, but it is a lot more complex. For the exam, you don’t need to know how to use Scalpel from the command line or even from the user interface. You just need to know the concept that file carving is used to bring back pieces of a file or, if you’re lucky, the full file, as evidence in forensic investigations.
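To give you a feel for what carving means when there is no file system metadata to rely on, here is a deliberately simplified Python sketch of header-and-footer carving. It is not how Scalpel or Autopsy is implemented. The function name `carve_pdfs` and the fake raw bytes are assumptions for this example, and it assumes each file sits in one contiguous run, which real fragmented files often do not. It scans raw bytes for the PDF signature `%PDF-` and the end marker `%%EOF` and cuts out whatever lies between them.

```python
def carve_pdfs(raw: bytes) -> list[bytes]:
    """Carve candidate PDF files out of raw bytes using header and footer markers."""
    header, footer = b"%PDF-", b"%%EOF"
    carved = []
    start = raw.find(header)
    while start != -1:
        end = raw.find(footer, start)
        if end == -1:
            break  # header with no footer: the tail was probably overwritten
        stop = end + len(footer)
        carved.append(raw[start:stop])
        start = raw.find(header, stop)
    return carved


# Hypothetical stand-in for unallocated space containing two deleted PDFs.
raw = (
    b"\x00" * 16
    + b"%PDF-1.4 fake body %%EOF"
    + b"\xff" * 8
    + b"%PDF-1.7 second %%EOF"
    + b"\x00" * 4
)

for index, blob in enumerate(carve_pdfs(raw)):
    print(index, len(blob), blob[:8])
```

A real PDF can contain more than one `%%EOF` marker, so stopping at the first one may cut a file short. Production carvers handle that, along with fragmentation and size limits, which this sketch leaves out.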
12. Chain of Custody (OBJ 4.4)
Chain of custody. Now, this is a term that was around long before we had computers and long before we had digital forensics. But the same principle applies in digital forensics, too, because digital forensics produces evidence for us, and evidence has to maintain a chain of custody. When I talk about a chain of custody, I mean the record of the evidence’s history from collection, to presentation in court, all the way through disposal. Everything we do with that piece of evidence, from birth to death, has to be covered.
If you’re like most of us, you’ve probably seen TV shows or movies in which a police officer enters a scene and discovers a smoking gun that they want to collect as evidence. They pick up the gun using a pen. They don’t get their fingerprints on it because they don’t want to contaminate the evidence, right? Then they place it in an evidence bag. Now, on that bag, they will sign their name to show that the chain of custody started at this point: I collected this gun, at this time, from this place. That’s what they’ll write on the bag. Well, that’s the whole idea when we start dealing with the chain of custody, and it’s the same thing for us in the computer world. For instance, if you’re working for law enforcement and you go into a suspect’s home and collect their computer, you’re going to bag it, tag it, and collect the physical machine as evidence. After that, you’ll take the hard drive and begin analysing the data on it. And when you take that hard drive out of evidence, you have to log that you have that hard drive.
That’s because every place that hard drive goes, and every place that computer goes, has to be logged as it moves from place to place. Now, if you’re dealing with something sensitive like a hard drive, some kind of circuit board, a solid-state device, or something like that, you want to put it in a specialised evidence bag, not the big brown paper bag I just showed you. These specialised evidence bags are used for electronic media to ensure that it cannot be damaged or corrupted by electrostatic discharge. Another type of specialised bag you might see is what’s called a Faraday bag. Normally, if we’re taking something like a cell phone or anything that has a radio chip in it, we will put it in an evidence bag and then put that evidence bag inside the Faraday bag to block any external signals from getting in. I wouldn’t want somebody to send a remote wipe to their cell phone when I have that phone in custody and haven’t finished collecting evidence from it yet. So that’s the idea of a Faraday bag. Now, when we deal with criminal cases or internal security audits, these things can take a long time to resolve, sometimes months or years. The entire time this is happening, we have to hold on to this evidence and keep track of it. So we need to make sure we identify it, bag it, seal it, label it, and know exactly where it is at all times.
Now, one of the reasons I bring up this lengthy period of time is that you have to be careful to take care of the information that you’re collecting and the evidence that you’re collecting. For example, if I have seized a bunch of backup tapes as evidence, that is magnetic media, and I need to protect it to make sure it doesn’t fail over time or become corrupted based on where we store it. I actually have a sad example of storing information incorrectly. When I was a child, my father and mother had filmed me as a baby—you know, baby videos. And they put this on VHS tapes. Well, those VHS tapes are made out of essentially the same stuff as a backup tape for a computer. They didn’t store it properly. And so when we tried to watch it about 20 years later, what happened was that it jammed up the VCR and ate the tape. And so we can’t watch that baby footage anymore. If that happened in a court case, that case could actually get thrown out of court. And some court cases do go on for 10, 15, or 20 years, or the person goes to jail, and they go on appeal time and time again. And it may be something that was a life sentence, for instance. And they can now leave because the evidence has been tainted. So you need to make sure you’re taking care of this and think about humidity, temperature, and where all this stuff is going to be stored securely.
Now, the other problem with evidence is that it can become a lot of information, right? The amount of evidence collected can become extremely large. If you think about some of the big white-collar crimes like Enron or WorldCom that happened in the early 2000s, they had boxes and boxes and boxes of paperwork and files and hard drives and everything else. And all that has to be stored someplace. So you need to make sure you have an adequate place to do it that is safe and secure. Also, when you’re dealing with large amounts of information, you need a way to organise it so you can identify what it is. If you get called up to court and they say, “Hey, I need the evidence on John Smith’s trial, and I need this particular piece of evidence that you collected,” you now need to be able to find it in this mountain of stuff. One of the ways we do this is by having metadata, which is data about the data. This is a way for the evidence to describe itself using some kind of code or some kind of system. Now, the things I like to include on all my evidence are a proper label of what it is, a short description of what is inside of it and why it’s important to me, and then some kind of date code, such as year, year, year, year, dash, month, month, dash, day, day, then hour, hour, minute, minute (that is, YYYY-MM-DD followed by HHMM).
So, for example, when I filmed this video in 2020, it was July 12 at 9:45 p.m. And you can tell that based on the date code that I would have. And so if I were collecting this evidence, I would attach that time-date code so I’d know when I collected that evidence. The final thing we need to think about when we store evidence is: where are we going to store it that is safe and secure? So in addition to things like temperature, humidity, and moisture, we also want to make sure it’s secure from somebody trying to steal the evidence. Again, I’m sure you’ve seen all the movies where the bad guys go into the police station to destroy the evidence, and they break into the evidence lockup and take the drive and corrupt it, and now the bad guy is going to get out on the street. We don’t want that to happen. So when you’re dealing with law enforcement, they’re going to have secure places to lock it up. If you’re dealing with employees and insider threats, you need to have a secure place to lock up your information as well. So these are things you need to think about as an organization. And if you’re working in the forensic world, where you’re going to store all this information, remember to keep a record of it from cradle to grave, because that’s your chain of custody.
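As a rough illustration of the labelling and record-keeping described in this lesson, the sketch below builds an evidence tag with a date code and logs each hand-off. Everything here is an assumption made for the example: the class names, the fields, the sample names and locations, and the exact date-code layout (a space between the date and the time). A real organisation would follow its own chain-of-custody form.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


def date_code(moment: datetime) -> str:
    """Format a timestamp as YYYY-MM-DD HHMM (the space separator is an assumption)."""
    return moment.strftime("%Y-%m-%d %H%M")


@dataclass
class CustodyEvent:
    """One hand-off of the evidence from one person to another."""
    when: datetime
    released_by: str
    received_by: str
    location: str
    purpose: str


@dataclass
class EvidenceItem:
    """Hypothetical evidence record: a label, a description, and its history."""
    label: str
    description: str
    collected_at: datetime
    collected_by: str
    history: List[CustodyEvent] = field(default_factory=list)

    def tag(self) -> str:
        """Text to write on the evidence bag."""
        return f"{self.label} | {self.description} | {date_code(self.collected_at)}"

    def current_holder(self) -> str:
        """Whoever received the item last, or the collector if it never moved."""
        return self.history[-1].received_by if self.history else self.collected_by

    def transfer(self, when: datetime, received_by: str, location: str, purpose: str) -> None:
        """Log a hand-off from the current holder to someone else."""
        event = CustodyEvent(when, self.current_holder(), received_by, location, purpose)
        self.history.append(event)


if __name__ == "__main__":
    item = EvidenceItem("CASE0-01", "USB 2 GB thumb drive",
                        datetime(2020, 7, 12, 21, 45), "Examiner A")
    item.transfer(datetime(2020, 7, 13, 9, 0), "Analyst B", "Forensics lab", "imaging")
    print(item.tag())
    for event in item.history:
        print(date_code(event.when), event.released_by, "->", event.received_by, event.purpose)
```

With the July 12, 2020, 9:45 p.m. example from above, `date_code` produces `2020-07-12 2145`.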
13. Collecting and Validating Evidence (OBJ 4.4)
In this lesson, I’m going to show you how to create a disk image from a USB thumb drive using dd on a Linux, Unix, or Macintosh system. So the first thing you need to do is connect your USB thumb stick through the write blocker. Now, if you don’t have a write blocker, for the purposes of this lab you really don’t need one because, again, we’re not necessarily doing forensic imaging here; we’re not law enforcement. But if you are working in the field, you really do want to use a write blocker.
So I’m going to plug it into my machine, and I’m going to use fdisk -l to list the devices that I have. You can see I have an internal hard drive of 8 GB in size. I have a virtual hard disk with a capacity of two and a half gigabytes. And then this is my two-gigabyte thumb stick that I’m looking for. sdb is the device name, and it has one partition on it that is 1.9 GB in size, and it is a FAT32 or Windows-type partition. So to create the disk image, what I’m going to do is run dd with a block size of 64 KB (bs=64k). My input file (if=) is whatever that disk is. Now this is where you have to decide what you want to copy. Do I want to copy the entire disk or just the one partition?
Now, in my case, I do want the entire disk, and the reason is that I want everything that is partitioned and everything that’s not partitioned. If I’m going through as a legal investigator looking for hidden things, those things may exist outside the partition. So by using /dev/sdb, I get the entire disk drive, all of the partitions, and any of the blank space. Then comes the output file (of=), which in my case I’m just going to call usb2gb.dd, and I hit Enter, and it will start copying that drive. This will usually take about 30 seconds for every gigabyte you’re going to copy, depending on your system speed. So I’m going to go ahead and fast-forward to the end of this copy, and we’ll pick it up from there. All right, we’re back, and the disk has finished copying, which we can see here in the directory. The file is usb2gb.dd. So what I want to do at this point is create a hash because, again, that chain of custody is very important. In Linux, simply type md5sum, which computes an MD5 hash, followed by the filename, in our case usb2gb.dd, and press Enter. It will calculate the hash and output it there on the screen. At that point, I would enter that into my log, and that would become part of the chain of custody.
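If you want to see what those two steps are doing under the hood, here is a small Python sketch, not a replacement for dd or md5sum. It copies a source to an image file in 64 KB blocks and then hashes the result with both MD5 and SHA-256. The function names `copy_blocks` and `hash_file` are invented for this example, and reading a real device such as /dev/sdb would normally need root privileges and a write blocker.

```python
import hashlib

BLOCK_SIZE = 64 * 1024  # same 64 KB block size used in the dd example


def copy_blocks(source_path: str, image_path: str, block_size: int = BLOCK_SIZE) -> int:
    """Copy source to image one block at a time; return the bytes copied."""
    copied = 0
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            dst.write(block)
            copied += len(block)
    return copied


def hash_file(path: str, block_size: int = BLOCK_SIZE) -> dict:
    """Return the MD5 and SHA-256 hex digests of a file, read block by block."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        while True:
            block = handle.read(block_size)
            if not block:
                break
            md5.update(block)
            sha256.update(block)
    return {"md5": md5.hexdigest(), "sha256": sha256.hexdigest()}
```

Calling `hash_file` on the finished image gives the values you would write into the chain-of-custody log; hashing the source and the image and comparing the two is what shows the copy is faithful.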
I would also want to use SHA-1 or SHA-256 because MD5 is considered a little bit weak in this day and age.
Now, to do the same thing with FTK Imager, the first thing you’re going to do is download and install FTK Imager. Once you have that done, you’ll go ahead and open it. You will have to give it administrative permissions, as I did just there, and it will open up the program. Now, FTK Imager does allow you to do some browsing through the files. But in this case, we’re going to first look at collecting a forensic image. So you’re going to go to File and then go down to Create Disk Image. Then you’ll select whether the source is a physical drive, a logical drive, an image file, the contents of a folder, or multiple CDs and DVDs. In my case, it is a physical drive because it’s a USB thumb drive. I’ll click Next, and then I’m going to select the drive, which in my case is physical drive 1. It is a Memorex 2-gigabyte USB thumb stick, and I’ll hit Finish. At this point, it will ask what type of image I want to create. I’m going to go ahead and do it as a raw dd image because any forensic tool can use that, whereas E01 is the EnCase format and AFF is the Advanced Forensic Format. Then select Next. You can give it the information you want. In my case, I’m just going to call this case number zero, and for the evidence number, since it’s the first piece, we’ll call that zero-one.
Unique description: USB 2-gigabyte drive. The examiner is Jason Dion, and then you add any notes you may have. Then click Next, and select where you want the image to be stored. In my case, I’m just going to save it directly to my desktop so I can find it easily. And then what is the file name going to be? I’m going to call it usb2gb.dd for the dd image, and then I will hit Finish. Before I start, I’m going to have it verify the images after they’re created, which will create the hash for me. And we will go ahead and hit Start, and off it will go. Now, this will take a couple of minutes because it is a two-gigabyte thumb stick, and 2 GB is quite a bit of data to be imaging. It will probably take us about five minutes. So I will speed up the video so you don’t just sit here and watch it for five minutes, and then I’ll come back and we’ll talk about it. So as you can see, it took about two minutes for it to copy the two-gigabyte drive. And now it’s going through and doing a verification, which is creating the hash. This is going to take us maybe 20 seconds, and as soon as it’s done, we get our drive results. So let’s scroll up here, and we can look at this.
So you’ll have the name of the image, which in my case is usb2gb.dd.001, which is the first file, and the sector count that’s involved. You’ll see the MD5 hash, both the reported hash and the computed hash, and they match, as they should, because that’s what we just did. Then it will also give you a SHA-1 hash, again computed and reported, and there were no bad blocks in the image. So we can go ahead and hit Close, and then we can hit Close again. Now let me minimise this so you’ll be able to see the disk image. It is going to sit here inside the Jason.Dion folder, which is my administrative account. And on the desktop, you’ll see that there are a couple of files here: .001 and .002 are present. Now, what is the difference? Well, if you’ll notice, the first one is only a 1.5-gigabyte image.
That’s because the software, by default, is going to break the image into chunks. So, if you have a 1 TB hard drive, every 1.5 GB or so will be chunked into its own file. That’s okay, because it’s going to be able to read them all as I bring the image back into the software. And then you’ll see this text file, which is just going to have the summary contents for us. It tells us it was created by FTK Imager. This is part of our chain of custody. It’s going to tell us what the drive looked like, what the device was, and its serial number, and it’s going to give us the computed hashes and the reported hashes, as well as our verified hashes. So this was the hash before we took the image, and this is the hash after we copied the image. Now, if you want to open this image, we’re going to do that inside of FTK Imager, and we can analyse it. So we’ll go to File and choose Add Evidence Item. It’s going to be an image file this time because we just created the image, and then we’re going to find it, which is sitting inside my Jason.Dion folder on the desktop. You’ll open up the first segment, .001, and it will open .001 and .002 together for me. The drive shows up here in the evidence tree.
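The verification step can be sketched in a few lines of Python. This is not FTK Imager’s own code; it simply assumes the image was saved as raw dd segments, so that reading the segments in order (.001, then .002, and so on) reproduces the original byte stream. The function names `hash_segments` and `matches_reported` are hypothetical.

```python
import hashlib
from typing import Iterable


def hash_segments(segment_paths: Iterable[str], algorithm: str = "md5",
                  block_size: int = 64 * 1024) -> str:
    """Hash several raw image segments as if they were one continuous file.

    The caller must pass the segments in order (.001, .002, ...).
    """
    digest = hashlib.new(algorithm)
    for path in segment_paths:
        with open(path, "rb") as segment:
            while True:
                block = segment.read(block_size)
                if not block:
                    break
                digest.update(block)
    return digest.hexdigest()


def matches_reported(segment_paths: Iterable[str], reported_hash: str,
                     algorithm: str = "md5") -> bool:
    """Compare the computed hash with the value recorded in the log."""
    computed = hash_segments(segment_paths, algorithm)
    return computed.lower() == reported_hash.strip().lower()
```

If `matches_reported` returns False for the hash written in the summary text file, the image has changed since it was collected and should not be relied on as evidence.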
Now, as you open it, you’ll see the partitioned and unpartitioned spaces. Any files that may have been hidden would show up in this unallocated space. Now, if I open up the drive itself, it was formatted as FAT32, and I can look at the root of the drive, and you will see the different types of files on it. Take note of the ones with the Xs here, like this MP3. This is a deleted file, but I can see it because of the forensic techniques that we’re using. And you can see all sorts of different music that I used to have on this thumb drive that has been deleted at some point. Some of these files can be restored using this forensic software. The other thing we can do here is scroll down and see anything that’s been deleted. You can see all of those files, and you’ll be able to see the ones that have not been deleted. So let’s look at the date modified. What were the most recently touched items on the system? Well, this deleted folder was, so maybe the bad guy was trying to hide something from us. And so I can actually go in and restore that and look at it. Then you can see these other files that are sitting here. Again, these are in the slack space because they were deleted a long time ago.
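One reason a tool can still list deleted files on a FAT32 drive is that deleting a file only marks its 32-byte directory entry; the first byte of the name is replaced with 0xE5 and the rest of the entry stays in place. The Python sketch below parses raw directory bytes on that basis. It is a simplified illustration only: it handles short 8.3 names, skips long-filename entries, and the function name `list_entries` is invented for the example.

```python
import struct
from typing import List, Tuple

ENTRY_SIZE = 32        # every FAT directory entry is 32 bytes long
DELETED_MARK = 0xE5    # first byte of a deleted entry
END_MARK = 0x00        # first byte when no more entries follow
LONG_NAME_ATTR = 0x0F  # attribute value used by long-filename entries


def list_entries(directory_bytes: bytes) -> List[Tuple[str, bool, int]]:
    """Return (name, is_deleted, size_in_bytes) for each short-name entry."""
    entries = []
    last_start = len(directory_bytes) - ENTRY_SIZE
    for offset in range(0, last_start + 1, ENTRY_SIZE):
        entry = directory_bytes[offset:offset + ENTRY_SIZE]
        first = entry[0]
        if first == END_MARK:
            break
        if entry[11] == LONG_NAME_ATTR:
            continue  # long-filename entries are ignored in this sketch
        deleted = first == DELETED_MARK
        raw_name = entry[0:8]
        if deleted:
            # The real first character is lost; show a placeholder instead.
            raw_name = b"?" + raw_name[1:]
        name = raw_name.decode("ascii", "replace").rstrip()
        extension = entry[8:11].decode("ascii", "replace").rstrip()
        size = struct.unpack_from("<I", entry, 28)[0]
        full_name = f"{name}.{extension}" if extension else name
        entries.append((full_name, deleted, size))
    return entries
```

Whether the file’s contents can actually be restored depends on whether its clusters have been reused since the deletion.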
Again, this is not a forensics course in which I will teach you how to do everything. I just wanted to show you some of the tools you can use to go back and pull some of this information. So if we open this, we can see that inside this folder there were all of these different slides. And so maybe if I open this slide, oh, look, we can find this deleted folder and see what it looked like. It looks like an embryo for some sort of operation. What exactly is this? Well, this was something I did for my church. We did a spy night for the kids. And these are some old files from that “spy night” folder that we used. But that’s the idea here, is that you can go back and restore some of these things and be able to see what the bad guy was trying to hide as you go through and do the analysis. That’s the benefit of this. And we’re doing this off the disc image, not the drive we originally collected, because that USB drive isn’t even plugged into the computer anymore because we don’t need it.