How to analyze a malicious PDF document using tools available in REMnux?
In this post I explain some of the useful tools available in REMnux that can be used for analyzing a malicious PDF.
Pdfid will scan a PDF document for the following strings and count their occurrences (total and obfuscated):
Almost every PDF document will contain the first 7 words (obj through startxref). The more interesting fields to analyze when identifying a malicious PDF are:
/Page gives an indication of the number of pages in the PDF document. Most malicious PDF documents have only one page.
/Encrypt indicates that the PDF document has DRM or needs a password to be read.
/ObjStm counts the number of object streams. An object stream can contain other objects and can therefore be used to obfuscate objects.
/AcroForm indicates that the document contains a PDF form.
/JBIG2Decode indicates if the PDF document uses JBIG2 compression.
/RichMedia is for embedded Flash.
/Launch counts launch actions.
/XFA is for XML Forms Architecture.
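To make the "total and obfuscated" counts more concrete, here is a minimal Python sketch of the idea. It is not pdfid's actual implementation, only a simplified illustration: it assumes that an "obfuscated" name is one written with #xx hex escapes (for example /L#61unch for /Launch), and the function names count_names and decode_name are my own.

```python
import re
from collections import Counter

# The names discussed above (only a subset of what pdfid reports).
INTERESTING_NAMES = ["/Page", "/Encrypt", "/ObjStm", "/AcroForm",
                     "/JBIG2Decode", "/RichMedia", "/Launch", "/XFA"]

# Simplified name token: a slash followed by letters, digits or '#'.
NAME_TOKEN = re.compile(rb"/[A-Za-z0-9#]+")
HEX_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")


def decode_name(raw: bytes) -> str:
    """Resolve #xx hex escapes in a PDF name, e.g. /L#61unch becomes /Launch."""
    decoded = HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)
    return decoded.decode("latin-1")


def count_names(data: bytes):
    """Return (total, obfuscated) counters for the interesting names."""
    total, obfuscated = Counter(), Counter()
    for match in NAME_TOKEN.finditer(data):
        raw = match.group(0)
        name = decode_name(raw)
        if name in INTERESTING_NAMES:
            total[name] += 1
            if b"#" in raw:  # the name was written with at least one hex escape
                obfuscated[name] += 1
    return total, obfuscated


if __name__ == "__main__":
    sample = b"<< /Type /Page /Pages /L#61unch /Launch >>"
    total, obfuscated = count_names(sample)
    for name in INTERESTING_NAMES:
        print(f"{name:<14}{total[name]:>4} ({obfuscated[name]})")
```

Because whole name tokens are compared, /Pages is not counted as /Page. A real parser also has to deal with streams, comments and strings, which this sketch ignores.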
Ex: We have a PDF file named ‘malicious.pdf’ that we need to analyse on our REMnux box, say ‘vmstation’. Running pdfid on the document gives us the following output:
root@vmstation:/home/remnux/# pdfid malicious.pdf
PDFiD 0.2.1 malicious.pdf
PDF Header: %PDF-1.7
Ex. Running peepdf on the same file shows not only the suspicious objects but also the heap corruption vulnerability (CVE-2013-2729) that it triggers.
root@vmstation:/home/remnux# peepdf malicious.pdf
Size: 50717 bytes
Objects (6): [1, 2, 3, 4, 5, 6]
Errors (2): [1, 6]
Streams (2): [1, 6]
Encoded (1): 
Objects with JS code (1): 
BMP/RLE heap corruption (CVE-2013-2729): 
Pdf-parser is a command-line program that parses and analyses PDF documents. It can be used to identify PDF documents with unusual/unexpected objects.
Ex. Running pdf-parser on the malicious PDF shows object 1, which contains a compressed stream and looks rather suspicious.
root@vmstation:/home/remnux/# pdf-parser malicious.pdf
PDF Comment ‘%PDF-1.7\n’
PDF Comment ‘%\xc0\xff\xee\xfa\xba\xda\n’
obj 1 0
/Filter [ /Fl /Fl ]
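The /Filter [ /Fl /Fl ] entry suggests that the stream in object 1 has been passed through the Flate filter twice (/Fl being the abbreviated form of /FlateDecode). As a rough sketch of what undoing such a filter chain looks like, assuming the raw stream bytes have already been carved out of the file, Python's zlib module can be applied once per filter. The function name apply_flate_filters and the demo payload are my own and are not taken from the sample.

```python
import zlib


def apply_flate_filters(raw: bytes, filter_count: int) -> bytes:
    """Decompress a stream that was Flate-compressed filter_count times."""
    data = raw
    for _ in range(filter_count):
        data = zlib.decompress(data)
    return data


if __name__ == "__main__":
    # Build a demo stream compressed twice, mimicking /Filter [ /Fl /Fl ].
    payload = b"app.alert('demo');"
    stream = zlib.compress(zlib.compress(payload))
    print(apply_flate_filters(stream, 2))
```

Malformed or deliberately truncated streams will make zlib.decompress raise zlib.error, so a real analysis script would need error handling around the loop.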
Ex. Running pdfextract on malicious.pdf extracts 4 scripts and dumps them to ‘malicious.pdf.dump/scripts’:
root@vmstation:/home/remnux# pdfextract malicious.pdf
Extracted 2 PDF streams to ‘malicious.pdf.dump/streams’.
Extracted 4 scripts to ‘malicious.pdf.dump/scripts’.
Extracted 0 attachments to ‘malicious.pdf.dump/attachments’.
Extracted 0 fonts to ‘malicious.pdf.dump/fonts’.
Extracted 0 images to ‘malicious.pdf.dump/images’.
root@vmstation:/home/remnux# cd malicious.pdf.dump/
root@vmstation:/home/remnux/malicious.pdf.dump# cd scripts/
root@vmstation:/home/remnux/malicious.pdf.dump/scripts# ls
script_1208899462995164754.js script_-3196157284528695661.js script_3802492399520803490.js script_537885029827703918.js
script_2339165404470982253.js script_-3507872836391180146.js script_-4152317962809273.js script_566555573402854993.js
We now have the malicious scripts at hand.
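As a possible next step, the extracted scripts can be inventoried before reading them one by one. The sketch below is only a heuristic of my own and not part of any REMnux tool: it walks a scripts directory, records the size and SHA-256 of each .js file, and flags a few strings (eval(, unescape(, String.fromCharCode) that often show up in obfuscated JavaScript. The function name triage_scripts and the marker list are assumptions for illustration.

```python
import hashlib
from pathlib import Path

# Hypothetical marker list; these strings are common in obfuscated JavaScript.
MARKERS = ("eval(", "unescape(", "String.fromCharCode")


def triage_scripts(scripts_dir):
    """Return one dict per .js file with its name, size, hash and markers found."""
    report = []
    for path in sorted(Path(scripts_dir).glob("*.js")):
        data = path.read_bytes()
        text = data.decode("latin-1")  # latin-1 never fails on arbitrary bytes
        report.append({
            "name": path.name,
            "size": len(data),
            "sha256": hashlib.sha256(data).hexdigest(),
            "markers": [m for m in MARKERS if m in text],
        })
    return report


if __name__ == "__main__":
    for entry in triage_scripts("malicious.pdf.dump/scripts"):
        print(entry["name"], entry["size"], entry["sha256"], entry["markers"])
```

A hit on one of these markers does not prove a script is malicious, and a clean result does not prove it is safe. It only helps to decide which script to open first.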