# Malware Analysis - 13cubed

## Some Assembly required

## Payload Distribution Format

### pdfid

* analyze PDF files

<mark style="color:red;">`pdfid.py {file_name}`</mark>

**look out for:**

* JavaScript / JS
* OpenAction - take some action upon opening a file
* RichMedia - embedded Flash content
* Launch - external or embedded software
* URI - interaction with a website
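To make the idea behind this triage step concrete, here is a minimal Python sketch that counts the names listed above in the raw bytes of a PDF. It is not how pdfid itself is implemented: the function name `count_pdf_names` and the sample bytes are made up for illustration, and the sketch ignores obfuscated names and anything hidden inside compressed streams.

```python
import re

# Names from the list above; pdfid itself reports more keywords than these.
SUSPICIOUS_NAMES = [
    b"/JavaScript", b"/JS", b"/OpenAction",
    b"/RichMedia", b"/Launch", b"/URI",
]

def count_pdf_names(data: bytes) -> dict:
    """Count occurrences of each name in raw PDF bytes (hypothetical helper)."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # The lookahead stops a short name such as /URI from matching
        # inside a longer name that merely starts with the same letters.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode()] = len(re.findall(pattern, data))
    return counts

if __name__ == "__main__":
    # Made-up PDF fragment, just enough to trigger a few counters.
    sample = (
        b"1 0 obj << /OpenAction 2 0 R >> endobj "
        b"2 0 obj << /S /JavaScript /JS (app.alert(1)) >> endobj"
    )
    for name, hits in count_pdf_names(sample).items():
        print(f"{name:12} {hits}")
```

A non-zero count is only a reason to look closer with pdf-parser, not proof that the file is malicious.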

### pdf-parser

* extracts files embedded within a PDF

### **PDF Structure**

<div align="left"><img src="https://935233489-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUGyWMBzIPoBvXNyR5UOW%2Fuploads%2FTOJjsYmiKHLT9H4EgJwW%2FUntitled(1).png?alt=media&#x26;token=506db986-1e50-474e-b1f8-09c69fefa3f7" alt=""></div>

* The objects section in the body contains text, fonts, graphics, etc.
* The xref table maps the byte offsets of the file's objects.

<mark style="color:red;">`pdf-parser.py {file_name} | more`</mark>

* The object with Type: /EmbeddedFile has a specified length and a Filter of FlateDecode (zlib compression). This is the compressed file (an MS Word document) embedded within the PDF.

<mark style="color:red;">`pdf-parser.py {file_name} --object 8 --filter --raw --dump out.doc`</mark>

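* filter: pass the stream through its decoding filter (FlateDecode here)
* raw: output the data without escaping special characters
* dump: write the stream content to the given file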

<mark style="color:red;">`file out.doc`</mark>

* the output identifies it as a Composite Document File, i.e. an OLE file (older MS Office format)
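As a rough illustration of what the two commands above do, the Python sketch below inflates a FlateDecode stream with `zlib` and then checks for the OLE magic bytes that `file` keys on. The helper names `inflate_stream` and `looks_like_ole` are hypothetical, and the "document" is a made-up stand-in, not a real Word file.

```python
import zlib

# Magic bytes at the start of an OLE2 / Compound File Binary document.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")

def inflate_stream(raw_stream: bytes) -> bytes:
    """Undo a FlateDecode filter: the stream body is zlib-compressed data."""
    return zlib.decompress(raw_stream)

def looks_like_ole(data: bytes) -> bool:
    """Return True if the data starts with the OLE2 magic bytes."""
    return data.startswith(OLE_MAGIC)

if __name__ == "__main__":
    # Stand-in for an embedded .doc: OLE header followed by filler bytes.
    fake_doc = OLE_MAGIC + b"\x00" * 64
    stored = zlib.compress(fake_doc)    # roughly what the PDF stream holds
    decoded = inflate_stream(stored)    # roughly what --filter recovers
    print(looks_like_ole(decoded))
```

This assumes a single FlateDecode filter with no extra decode parameters, which is the case described in these notes.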

### **oledump**

* extracts embedded VBA macros from OLE (Composite Document) files, the older MS Office file format

<mark style="color:red;">`oledump.py {file_name}`</mark>

* "m" - indicates VBA macros are present, but only composed of attribute or option statements
* "M" - contains macros

<mark style="color:red;">`oledump.py {file_name} --select 7 --vbadecompress | more`</mark>

* in the decompressed macro, you can see a file being opened and a sequence of bytes being written to it, which could be malicious code

## Juicy PDFs

### pdfextract

Part of the Origami framework.

<mark style="color:red;">`pdfextract {file_name}`</mark>

* streams, scripts, fonts, attachments, and images are extracted to a folder
* among the attachments, you can see the doc file that we extracted manually using pdf-parser

If a PDF contains images, the EXIF data of those images is still attached to them within the PDF.

<mark style="color:red;">`exiftool *.jpg | more`</mark>

## Visual analysis with ProcDOT

### ProcMon

* use filters to narrow the captured events down to the activity of interest
* to export the data to ProcDot, use the following settings:
  * disable “Enable Advanced Output” in Filter
  * disable “Show Resolved Network Addresses” in Options
  * under “Select columns” in Options, enable “Thread ID” and disable “Sequence Number”
* File → Save and export as CSV
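Before loading the export into ProcDOT, it can help to sanity-check the CSV. The sketch below counts operations per process with Python's `csv` module. The column names ("Process Name", "Operation", "TID") are assumptions about the export layout, and the rows are invented sample data, so adjust both to match your own file.

```python
import csv
import io
from collections import Counter

def operations_for_process(csv_text: str, process_name: str) -> Counter:
    """Count operations logged for one process in a ProcMon-style CSV export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    wanted = process_name.lower()
    return Counter(
        row["Operation"]
        for row in reader
        if row["Process Name"].lower() == wanted
    )

if __name__ == "__main__":
    # Made-up rows; the header is assumed to match the ProcMon export.
    sample = (
        '"Time of Day","Process Name","PID","Operation","Path","Result","Detail","TID"\n'
        '"10:00:01","evil.exe","4242","CreateFile","C:\\Temp\\a.bin","SUCCESS","","100"\n'
        '"10:00:02","evil.exe","4242","WriteFile","C:\\Temp\\a.bin","SUCCESS","","100"\n'
        '"10:00:03","explorer.exe","900","RegOpenKey","HKCU\\Software","SUCCESS","","12"\n'
    )
    for operation, hits in operations_for_process(sample, "evil.exe").items():
        print(operation, hits)
```

For a real export, read the file from disk instead of a string; opening it with `encoding="utf-8-sig"` is a safe choice in case the file starts with a byte-order mark.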

### ProcDOT

* import the ProcMon log file and the Wireshark pcap file
* select the process on which the graph should be focused

At the bottom, the FilmStrip option lets you choose the frames per second and replay the process activity over time.

## Finding Evil with YARA

* Rule-based approach to create descriptions of malware families based on textual or binary patterns.
* Rules can be run against many files on many endpoints to quickly look for IOCs.
* Can be used alongside Volatility to analyze a memory image. YARA rules are also supported by other tools such as CrowdStrike.

### Example rules

```c
rule Example1 {
	meta:
		description = "description"
		author = "Neeraj"
		date = "2023-06-16"

	strings:
		$domain1 = "badsite1.com" nocase
		$domain2 = "badsite2.com" nocase
		$domain3 = "badsite3.com" nocase
		// dotted-quad pattern, matched in both ascii and wide form
		$ip = /([0-9]{1,3}\.){3}[0-9]{1,3}/ wide ascii

	condition:
		2 of ($domain1, $domain2, $domain3) and 
		$ip and 
		filesize < 1KB
}
```

* meta: descriptive information about the rule (description, author, date); it does not affect matching
* wide: search for strings encoded with 2 bytes per char
* wide ascii: search for the string in both its ascii form and its 2-bytes-per-char form
* fullword: the string will match only if it appears in the file delimited by non-alphanumeric characters
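The difference between the ascii and wide forms is easiest to see as bytes. The following Python sketch is only an illustration of the two encodings, not YARA's matching engine; the helper names `ascii_pattern`, `wide_pattern`, and `matches` are hypothetical.

```python
def ascii_pattern(text: str) -> bytes:
    """One byte per character, as the string is written in the rule."""
    return text.encode("ascii")

def wide_pattern(text: str) -> bytes:
    """Two bytes per character: each ASCII byte followed by a zero byte."""
    return text.encode("utf-16-le")

def matches(data: bytes, text: str, use_ascii: bool = True, use_wide: bool = False) -> bool:
    """Check whether data contains text in any of the requested forms."""
    found = False
    if use_ascii:
        found = found or ascii_pattern(text) in data
    if use_wide:
        found = found or wide_pattern(text) in data
    return found

if __name__ == "__main__":
    print(ascii_pattern("bad").hex())   # bytes of the ascii form
    print(wide_pattern("bad").hex())    # bytes of the wide form
    blob = b"\x00\x01" + wide_pattern("badsite1.com") + b"\xff"
    print(matches(blob, "badsite1.com", use_ascii=True, use_wide=False))
    print(matches(blob, "badsite1.com", use_ascii=True, use_wide=True))
```

Windows binaries often store strings in the 2-bytes-per-char form, which is why combining both modifiers is a common default for text patterns.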

```c
rule upx_packed {
	meta:
		description = "detect a UPX packed file"

	strings:
		$mz = "MZ"
		// section names "UPX0" and "UPX1", each followed by three null bytes
		$upx0 = {55505830000000}
		$upx1 = {55505831000000}
		$upx_sig = "UPX!"

	condition:
		$mz at 0 and $upx0 in (0..1024) and $upx1 in (0..1024) 
			and $upx_sig in (0..1024)
}
```
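To show what the `at` and `in (0..1024)` operators are checking, here is a Python sketch of the condition this rule is meant to express: "MZ" at offset 0, and the two section names plus the "UPX!" signature each starting within the first 1024 bytes. The function names are hypothetical and the sample buffer is synthetic, not a real packed executable.

```python
def find_in_range(data: bytes, pattern: bytes, start: int, end: int) -> bool:
    """True if pattern begins at an offset between start and end, inclusive."""
    offset = data.find(pattern, start)
    return offset != -1 and offset <= end

def looks_upx_packed(data: bytes) -> bool:
    """Approximate the upx_packed rule's condition in plain Python."""
    return (
        data.startswith(b"MZ")                                    # $mz at 0
        and find_in_range(data, b"UPX0\x00\x00\x00", 0, 1024)     # $upx0
        and find_in_range(data, b"UPX1\x00\x00\x00", 0, 1024)     # $upx1
        and find_in_range(data, b"UPX!", 0, 1024)                 # $upx_sig
    )

if __name__ == "__main__":
    # Synthetic buffer with the markers placed near the start.
    sample = (
        b"MZ" + b"\x00" * 100
        + b"UPX0\x00\x00\x00" + b"\x00" * 33
        + b"UPX1\x00\x00\x00" + b"\x00" * 33
        + b"UPX!"
    )
    print(looks_upx_packed(sample))
    print(looks_upx_packed(b"MZ" + b"\x00" * 2000 + b"UPX!"))
```

Restricting the search to the start of the file keeps the check cheap and avoids matching stray "UPX!" bytes deep inside unrelated data.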

### Running YARA rules

<mark style="color:red;">`yara {yara_rule_file} {binary_file} -s`</mark>

* <mark style="color:red;">`-s`</mark> : show the strings matched
* <mark style="color:red;">`-r`</mark> : recursively search through a directory

### Memory Forensics

<mark style="color:red;">`yarascan`</mark> : built-in plugin for Volatility

<mark style="color:red;">`python volatility/vol.py -f {memory_image} --profile=Win10x64_17763 yarascan -y {rule_file}`</mark>
