Python program for analysis of email messages


Cyber Forensics

Email Forensics

Dr. Rami 1

Agenda

• Why Email?

• Email

– Content, Technology, Addresses, Protocols

• Email – Clients & Servers

• Header and MIME header

• Tracing Email

• Search

• Advanced Search


Why Email?

• Email is Often the Best Evidence, Why?

– Contents can demonstrate intent

– Header data can demonstrate the source/trace

– Timestamps can show intent to mislead

– It can be used as evidence in a lot of cases


Why Email?

• Understanding the Email technology can help in:

– Locating e-mail messages thought to be destroyed and

– Proving the source of a message.


Email Content

• An email can:

– Be plain text (old days)

• No support for graphics

– Be HTML structured (currently)

• Supports graphics and embedded content/formatting

– Have attachments

• Each carried as a separate file


Email Technology

• Email technology has 2 main parts:

– Client side: (Outlook Express, Outlook, Thunderbird, ...)

• Mail User Agent (MUA)

– Software interface that represents the end user

– The application that provides end user support

– Server side: (Microsoft Exchange Server, the Linux-based Sendmail, etc.)

• Mail Transfer Agent (MTA)

– Moves messages from point A to point B

• Mail Delivery Agent (MDA)

– Sorts received emails

– Gets each message to the correct recipient


Email Trip

1. Run/access the client application

1. Local application on the user’s computer (Outlook, Thunderbird, ...), or

2. Web application (Gmail, Hotmail, Yahoo, ...)

2. Compose the message

3. Click the Send button

4. The message hops across several e-mail servers

5. The message arrives at the recipient’s server

6. The recipient’s client asks the server to download all new messages

7. The server sends all messages that have accumulated

1. The server can keep copies, or

2. The server deletes them after download (depending on the configuration and protocols used)


Email Addresses

• User ID

– Must be unique within a particular domain

– The same user ID on a different domain may or may not represent the same user

– You can choose your user ID

• Issue?

– User IDs can be spoofed with the right software

– Spoofing is altering/modifying information so that a message appears to have been sent from somewhere else


[Figure: an e-mail address in the form User ID @ Domain, with the User ID and Domain parts labeled]

Email Addresses

• Domain:

– The domain name that hosts the user account

• yahoo.com, gmail.com, saumag.edu, ...
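The User ID / Domain split can be tried out in a few lines of Python. This is only a minimal sketch: the function name `split_address` and the example.com address are made up for illustration, and the standard library's `email.utils.parseaddr` is used to strip any display name first.

```python
from email.utils import parseaddr


def split_address(raw: str) -> tuple[str, str]:
    """Return (user_id, domain) for an address such as 'Name <user@host>'."""
    # parseaddr separates an optional display name from the bare address.
    _display_name, addr = parseaddr(raw)
    # Split on the LAST '@' so the domain is everything after it.
    user_id, sep, domain = addr.rpartition("@")
    if not sep or not user_id or not domain:
        raise ValueError(f"not a user@domain address: {raw!r}")
    # Domain names are case-insensitive, so normalise them for comparison.
    return user_id, domain.lower()


# Hypothetical address, used only for illustration.
print(split_address("Alice Example <alice@example.com>"))
```

Note that this only splits the text of an address. It says nothing about whether the address is genuine, which is exactly the spoofing issue raised above.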


Email Protocols

• Two types of Protocols: (for Sending & Receiving)

– Transport protocols (Sending)

• Simple Mail Transfer Protocol (SMTP)

– Mailbox protocols (Receiving)

• Post Office Protocol, ver. 3 (POP3)

• Internet Message Access Protocol (IMAP)


Email Protocols

• SMTP

– Client

• Connects to the server application over TCP/IP port 25 or 587

• Sends a simple handshaking HELO command, then gives the sender and recipient addresses (MAIL FROM and RCPT TO)

– To tell the server that [email protected] wants to send a message to [email protected]

– Server

• Examines both addresses

• Tries to send the message
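The client side of this exchange can be sketched as the list of SMTP commands a client sends, in order. This is a simplified illustration only: the helper name `smtp_commands` and the example.com/example.org addresses are hypothetical, no network connection is made, and server replies are left out.

```python
def smtp_commands(client_host: str, sender: str, recipient: str) -> list[str]:
    """Return the basic SMTP commands a client sends for one message."""
    return [
        f"HELO {client_host}",       # greeting: the client identifies itself
        f"MAIL FROM:<{sender}>",     # envelope sender
        f"RCPT TO:<{recipient}>",    # envelope recipient
        "DATA",                      # header and body follow, ended by a line with only "."
        "QUIT",                      # close the session
    ]


# Hypothetical hosts and addresses, used only for illustration.
for command in smtp_commands("client.example.com", "alice@example.com", "bob@example.org"):
    print(command)
```

In practice Python's `smtplib.SMTP` class issues these commands for you. The envelope addresses given in MAIL FROM and RCPT TO are separate from the From: and To: header fields, which is one reason spoofing is possible.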


Email Servers

• SMTP Servers:

– handles all outgoing messages

• An SMTP server address, sometimes, looks like: smtp.xyz.com

– verifies the sender and target addresses

– If domain is the same between sender and receiver:

• A function of the SMTP server called the delivery agent hands the message to the POP3/IMAP server in the same domain


Email Servers

• SMTP Servers cont.:

– If the domain is different:

• The SMTP server sends a request to a DNS server to resolve the domain name to the IP address of a target SMTP server

• When the IP address is known, the message is sent to that address

• Send success:

– A positive acknowledgment (ACK; in SMTP, a 2xx reply code) is sent back

• Send failure:

– A negative acknowledgment (NACK; in SMTP, a 4xx or 5xx reply code) is returned

– A failure message is passed from the SMTP server to the POP3/IMAP server and on to the sender’s client


Email Servers

• SMTP Servers cont.:

– Each intermediate server adds a Received: line to the header of the message.
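Because every server adds its own Received: line, the lines can be read back to list the hops a message took. The sketch below uses Python's standard `email` package on a made-up message: all host names and addresses in `RAW_MESSAGE` are hypothetical. Each server puts its line at the top of the header, so the list is reversed to get travel order.

```python
from email import message_from_string

# Hypothetical two-hop message, written only for illustration.
RAW_MESSAGE = """\
Received: from mx.example.net (mx.example.net [203.0.113.7])
\tby mail.example.org with ESMTP; Tue, 02 Jan 2024 10:00:05 +0000
Received: from client.example.com (client.example.com [198.51.100.23])
\tby mx.example.net with ESMTP; Tue, 02 Jan 2024 10:00:02 +0000
From: alice@example.com
To: bob@example.org
Subject: hello

Hi Bob
"""


def hops_in_travel_order(raw_message: str) -> list[str]:
    """Return the Received: lines, first hop first."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received", [])
    # The newest line is at the top, so reverse to follow the message's trip.
    # " ".join(line.split()) collapses the folded (multi-line) header text.
    return [" ".join(line.split()) for line in reversed(received)]


for number, hop in enumerate(hops_in_travel_order(RAW_MESSAGE), start=1):
    print(number, hop)
```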


Email Protocols

• POP3:

– Client:

• Connects to the server over TCP/IP port 110

– Server:

• Downloads messages to the client

• POP3

– Allows for

• Standard text messages, attachments, and HTML-encoded messages

– Messages can be configured to:

• Remain on the server, or

• Be deleted

Email Protocols

• IMAP (similar to POP3 but with two main differences):

– Leaves all messages on the server after downloading (Good)

– Can be configured so that multiple users can administer the same mailbox (Not Good)

– Client:

• Connects to the server over TCP/IP port 143

– Server:

• Downloads messages to the client


Email Servers

• POP and IMAP servers:

– Act as the post office for the network

– Store incoming messages

– Wait for users to access and download them

• SMTP server:

– Receives the message arriving from the Internet

– Puts it in the message queue

– Notifies the Delivery Agent (DA) of the SMTP server

– The DA transfers the message to the mail storage folder of POP3/IMAP

• The IMAP folder is on the server, not on the client


Email Servers

• Important for Forensics

• POP3:

– Default config.:

• Delete messages from server after download

• IMAP:

– Default config.:

• Do not delete messages from server after download


Email Clients

• Usually pre-installed with every OS

• Perform some basic functions

– Send messages

– Receive messages

– Manage content (including attachments)

– Display list of messages in inbox by header

– Open a message

• And associated attachments

– Add attachments to outgoing messages

• And receive attachments with incoming messages


Note: the provider can apply some restrictions.

Email Clients

• Email Clients:

– Are Operating System specific

– Determine how information is archived on the system

– May be a local client or web-based

• How to archive emails on Outlook 2013 and 2016: https://www.youtube.com/watch?v=CFmegBWHmXQ

Information Stores

• Acts as a cabinet for the information stored by the client

– Sent/Received messages

– Address books

– Calendars

• Each client has a specific format for storing data

• For Microsoft Outlook, for example, see:

– https://support.office.com/en-us/article/Locating-the-Outlook-data-files-0996ece3-57c6-49bc-977b-0d1892e2aacc


Email Servers

• In carrying messages among clients, there could be:

– At least one e-mail server (example?)

– In most cases, two or more

• The servers act as relay agents for moving messages across the Internet

• SMTP servers

– Handle all outgoing messages

– Through the Internet

• IMAP/POP3 servers

– Handle all incoming messages

– From Internet to client

• Some server applications combine SMTP with POP/IMAP

– Such as Microsoft Exchange


Standard Header Information

• The structure of an e-mail is based on the Internet Message Format (RFC 5322), which is extended by a standard called the Multipurpose Internet Mail Extensions (MIME)

• Together they define a message to have

– Header:

• Contains control information used by servers to identify and direct the message

– Body:

• Content created by the author of the message
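The header/body split can be seen by parsing a raw message with Python's standard `email` package. This is a minimal sketch on a made-up message: the addresses and text in `RAW_MESSAGE` are hypothetical. The blank line is what separates the header from the body.

```python
from email import message_from_string

# Hypothetical message, written only for illustration.
RAW_MESSAGE = """\
From: alice@example.com
To: bob@example.org
Subject: Quarterly report
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The report is attached to my next message.
"""

msg = message_from_string(RAW_MESSAGE)

# Header: control information as name/value pairs.
headers = dict(msg.items())
# Body: the content written by the author (a plain string for a single-part message).
body = msg.get_payload()

for name, value in headers.items():
    print(f"{name}: {value}")
print("Content type:", msg.get_content_type())
print("Body:", body.strip())
```

For a multipart message (for example, one with attachments), `msg.walk()` visits each part in turn.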


Standard Header Information

• Header:

– Metadata fields contained in every message

– Fields mainly used by email clients are:

• TO:

– Contains the names of the addressees, separated by , or ;

– Issues?

» A message can reach several people though only a few are targeted

• FROM:

– Sender of the message

– Issues?

» Email spoofing is more likely to happen

» Some viruses can send messages in the owner’s name


Standard Header Information

• Header cont.:

• SUBJECT:

– It is optional

– Can start with Re: or Fw:

• DATE:

– When the message was sent

– Generated by the e-mail client

– Depends on the time and date of the client machine

– Issues?

» Can be modified

» How to reveal the truth?

• Compare with the time/date stamps found in the header fields generated by intermediate transport servers
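One way to check a suspicious Date: field is to compare it with the time stamp the first relay server wrote in its Received: line. The sketch below is an illustration under assumptions: the message in `RAW_MESSAGE` is made up, the function name `date_vs_first_relay` is hypothetical, and both time stamps are assumed to carry a time zone offset.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical message whose Date: claims a time a day before the first relay saw it.
RAW_MESSAGE = """\
Received: from client.example.com (client.example.com [10.0.0.5])
\tby mx.example.net with ESMTP; Tue, 02 Jan 2024 10:00:02 +0000
Date: Mon, 01 Jan 2024 09:00:00 +0000
From: alice@example.com
To: bob@example.org
Subject: hello

Hi Bob
"""


def date_vs_first_relay(raw_message: str):
    """Return (claimed time, first relay time, gap); assumes a Date: field exists."""
    msg = message_from_string(raw_message)
    claimed = parsedate_to_datetime(msg["Date"])
    received = msg.get_all("Received", [])
    if not received:
        return claimed, None, None
    # The first relay's line is the LAST Received: line; its time stamp
    # follows the final ';' in that line.
    _info, sep, stamp = received[-1].rpartition(";")
    if not sep:
        return claimed, None, None
    relay_time = parsedate_to_datetime(stamp.strip())
    return claimed, relay_time, relay_time - claimed


claimed, relay_time, gap = date_vs_first_relay(RAW_MESSAGE)
print("Date header :", claimed)
print("First relay :", relay_time)
print("Gap         :", gap)
```

A large gap does not prove tampering by itself (a wrong client clock gives the same result), but it marks the message for a closer look.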


Standard Header Information

• There are other metadata fields populated by e-mail clients as well as servers along the path of the message.

– This header information can be extracted easily from the e-mail client


MIME Header Information

• Information stored in the header that includes:

– Time/Date stamps for various actions along the way

– Server information

• for relay servers along the way

– A message ID

• It is unique to this message across the Internet

– Versions of software used along the way

– IDs of recipients

– A Return path
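Several of these fields can be pulled from a raw message with Python's standard `email` package. This is a minimal sketch: the message in `RAW_MESSAGE` is made up, the function name `tracing_fields` is hypothetical, and the chosen field list is an assumption. X-Mailer and User-Agent are commonly seen software fields, but a given message may lack them.

```python
from email import message_from_string

# Hypothetical message, written only for illustration.
RAW_MESSAGE = """\
Return-Path: <alice@example.com>
Received: from client.example.com (client.example.com [10.0.0.5])
\tby mx.example.net with ESMTP; Tue, 02 Jan 2024 10:00:02 +0000
Message-ID: <20240102100000.1234@example.com>
Date: Tue, 02 Jan 2024 10:00:00 +0000
X-Mailer: ExampleMailer 1.0
From: alice@example.com
To: bob@example.org
Subject: hello

Hi Bob
"""

# Field names to look for (an illustrative selection, not a complete list).
FIELDS = ("Message-ID", "Return-Path", "Date", "X-Mailer", "User-Agent")


def tracing_fields(raw_message: str) -> dict:
    """Collect the tracing-related header fields that are present."""
    msg = message_from_string(raw_message)
    found = {name: msg[name] for name in FIELDS if msg[name] is not None}
    # One Received: line per relay server along the way.
    found["Received-count"] = len(msg.get_all("Received", []))
    return found


for name, value in tracing_fields(RAW_MESSAGE).items():
    print(f"{name}: {value}")
```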


[Figure: SAU Outlook header example, with the start servers, intermediate servers, and end servers marked]

MIME Header Information

• Gmail:

[Figure: Gmail header example]

Tracing the Origin of a Message

• Each server that relays the message adds its IP address

• Each relay server maintains logs for a certain period of time that indicate the IP address of the sender as well as the intended recipient

• While the time stamp can be manipulated at the origin, the ones added along the way are likely real
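The IP addresses added by the relay servers can be extracted from the Received: lines with a regular expression. The sketch below is an illustration under assumptions: the host names in `RAW_MESSAGE` are made up (the public IP is the one used in the ipstack example later in these slides), only bracketed IPv4 addresses are matched, and the function name `relay_ips` is hypothetical.

```python
import ipaddress
import re
from email import message_from_string

# Hypothetical message, written only for illustration.
RAW_MESSAGE = """\
Received: from mx.example.net (mx.example.net [175.216.169.45])
\tby mail.example.org with ESMTP; Tue, 02 Jan 2024 10:00:05 +0000
Received: from client.example.com (client.example.com [10.0.0.5])
\tby mx.example.net with ESMTP; Tue, 02 Jan 2024 10:00:02 +0000
From: alice@example.com
To: bob@example.org
Subject: hello

Hi Bob
"""

# Matches an IPv4 address written in square brackets, e.g. [10.0.0.5].
IPV4_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def relay_ips(raw_message: str) -> list[tuple[str, bool]]:
    """Return (ip, is_private) pairs in travel order, first hop first."""
    msg = message_from_string(raw_message)
    ips = []
    for line in reversed(msg.get_all("Received", [])):
        for text in IPV4_IN_BRACKETS.findall(line):
            try:
                ip = ipaddress.ip_address(text)
            except ValueError:
                continue  # looked like an IP but is not valid (e.g. 999.1.1.1)
            ips.append((str(ip), ip.is_private))
    return ips


for ip, is_private in relay_ips(RAW_MESSAGE):
    print(ip, "(private network)" if is_private else "(public)")
```

Private addresses such as 10.x.x.x identify a machine only inside its own network, so the first public address is usually the one worth looking up.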


Tracing the Origin of a Message

• Online tracking example (ip2location.com)

– Copy the header and go to the website

– Paste it in the box


Track an email account with security concerns

• On Parrot Linux


Tracing the Origin of a Message

• Track an IP address (Linux command line):

– Install curl (if not installed already)

• sudo apt install curl

– First: sign up and get an access key from ipstack.com

– Then run the command, which looks like:

• curl "http://api.ipstack.com/175.216.169.45?access_key=YOUR_ACCESS_KEY"


Tracing the Origin of a Message

• The data is sent in JSON format


Tracing the Origin of a Message

• Try the location information: (Google Maps)

– "latitude": 37.5112

– "longitude": 126.9741
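The same lookup can be prepared in Python. The sketch below only builds the request URL and reads the coordinates out of a JSON reply; it does not contact ipstack. The helper names are hypothetical, and `SAMPLE_REPLY` is a cut-down stand-in holding only the values shown on these slides, not a full ipstack response.

```python
import json
from urllib.parse import quote, urlencode


def ipstack_url(ip: str, access_key: str) -> str:
    """Build the lookup URL used in the curl command."""
    return f"http://api.ipstack.com/{quote(ip)}?{urlencode({'access_key': access_key})}"


def coordinates(reply_text: str) -> tuple[float, float]:
    """Return (latitude, longitude) from a JSON reply."""
    data = json.loads(reply_text)
    return data["latitude"], data["longitude"]


def maps_url(latitude: float, longitude: float) -> str:
    """Build a Google Maps search link for the coordinates."""
    query = urlencode({"api": 1, "query": f"{latitude},{longitude}"})
    return f"https://www.google.com/maps/search/?{query}"


# Cut-down stand-in for a reply; only the fields needed here.
SAMPLE_REPLY = '{"ip": "175.216.169.45", "latitude": 37.5112, "longitude": 126.9741}'

print(ipstack_url("175.216.169.45", "YOUR_ACCESS_KEY"))
lat, lon = coordinates(SAMPLE_REPLY)
print(maps_url(lat, lon))
```

To fetch a live reply, the URL can be passed to `urllib.request.urlopen` once a real access key replaces YOUR_ACCESS_KEY.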


Some Email Search Tools

• Commercial:

– Clearwell

– Paraben

• Free:

– grep:

• Well-known Linux command


• Download and extract Enron1

• Change into the enron1 folder
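A grep-style search over the extracted folder can also be sketched in Python. This is a minimal illustration, not a replacement for grep: the function name `grep_folder` is hypothetical, the search is a plain case-insensitive substring match, and it assumes the messages are stored as .txt files somewhere under the folder.

```python
from pathlib import Path


def grep_folder(folder, term):
    """Yield (path, line_number, line) for every line containing term."""
    needle = term.lower()
    # rglob searches the folder and all of its subfolders.
    for path in sorted(Path(folder).rglob("*.txt")):
        # latin-1 can decode any byte, so odd characters never stop the search.
        text = path.read_text(encoding="latin-1")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                yield path, number, line
```

From inside the enron1 folder, `for path, number, line in grep_folder(".", "meeting"): print(path, number, line)` lists every matching line, roughly what `grep -rin meeting .` does.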


More on grep

Some Email Search Tools

• Terms are usually connected with operators to improve/narrow results:

– AND:

• The search must include both words (or both phrases enclosed in quotes).

– Honda AND ford

– OR:

• The search must include either of the words or quoted phrases, but not necessarily both.

– Honda OR ford

– + or “”:

• Search for the phrase exactly as typed (do not put a space between + and the first term of the search string).

– Honda+ford

– - or NOT:

• Do not include any entity that contains the following string along with the defined search

– Honda -Ford
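The four operators can be imitated with a small Python function. This is a simplified sketch, not how any particular search tool works: the function name `matches`, its parameter names, and the sample emails are all made up for illustration, and matching is case-insensitive on whole words.

```python
import re


def matches(text, all_of=(), any_of=(), none_of=(), phrase=None):
    """Return True if text satisfies the AND / OR / NOT / exact-phrase conditions."""
    lowered = text.lower()
    words = set(re.findall(r"\w+", lowered))
    # AND: every term must be present.
    if not all(term.lower() in words for term in all_of):
        return False
    # OR: at least one term must be present (skipped when no terms are given).
    if any_of and not any(term.lower() in words for term in any_of):
        return False
    # NOT: none of these terms may be present.
    if any(term.lower() in words for term in none_of):
        return False
    # Exact phrase: the words must appear together, in this order.
    if phrase is not None and phrase.lower() not in lowered:
        return False
    return True


# Made-up sample emails.
emails = [
    "I test drove a Honda and a Ford",
    "The Honda dealer called",
    "Ford recall notice",
]

print([e for e in emails if matches(e, all_of=["Honda", "ford"])])       # Honda AND ford
print([e for e in emails if matches(e, any_of=["Honda", "ford"])])       # Honda OR ford
print([e for e in emails if matches(e, all_of=["Honda"], none_of=["Ford"])])  # Honda -Ford
print([e for e in emails if matches(e, phrase="Honda dealer")])          # exact phrase
```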


Search Results

• Searching emails can result in:

– False positives [a hit, but a wrong hit]

• Retrieved but not relevant

– False negatives [a miss, but a wrong miss]

• Not retrieved but relevant

– True positives [a hit, and a right hit]

• Retrieved and relevant

– True negatives [a miss, and a right miss]

• Not retrieved and not relevant


Search Results

• Precision:

– The fraction of retrieved emails that are relevant to the search query

• Example:

– Entire data: 1000 emails

– Search hit: 100 emails

• 90 are relevant (true positive, tp)

• 10 are irrelevant (false positive, fp)

• Precision: tp / (tp + fp) = 90 / (90 + 10) = 0.9


Search Results

• Recall:

– The fraction of emails relevant to the query that are successfully retrieved

• Example:

– Entire data: 1000 emails

– Search hit: 100 emails

• 90 are relevant (true positive, tp)

• 10 are irrelevant (false positive, fp)

• But the dataset has 500 relevant emails that were not retrieved (false negative, fn)

• Recall: tp / (tp + fn) = 90 / (90 + 500) ≈ 0.15


Search Results

• Search Accuracy:
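Accuracy is usually defined as the share of all emails that the search classified correctly, (tp + tn) / (tp + tn + fp + fn); that common definition is assumed here. The sketch below recomputes the precision and recall examples from the previous slides and adds accuracy for the same numbers, taking the true negatives as whatever is left of the 1000 emails.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of retrieved emails that are relevant."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Fraction of relevant emails that were retrieved."""
    return tp / (tp + fn)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all emails classified correctly (assumed definition)."""
    return (tp + tn) / (tp + tn + fp + fn)


# Numbers from the precision and recall examples.
total = 1000
tp, fp, fn = 90, 10, 500
tn = total - tp - fp - fn  # emails that were neither retrieved nor relevant

print("precision:", round(precision(tp, fp), 2))
print("recall   :", round(recall(tp, fn), 2))
print("accuracy :", round(accuracy(tp, tn, fp, fn), 2))
```

With these numbers the search is precise (most hits are relevant) but has low recall (most relevant emails were missed), which is why the measures are reported together.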


Advanced Search Methods

• Advanced Analysis of User Email Behavior:

– Stationary User Profiles

• A method of determining if a user makes use of multiple accounts

– Similar Users

• A way of determining if what appears to be a single user is actually multiple users

– Attachment Statistics

• A user’s typical behavior regarding attachments is analyzed

– Recipient Frequency

• What types of messages a specific user usually receives
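As a small taste of this kind of analysis, the sketch below counts how often each address appears as a recipient across a set of messages. This is only one simple reading of recipient frequency, not the method used by any particular tool; the function name `recipient_frequency` and the sample messages are made up for illustration.

```python
from collections import Counter
from email import message_from_string
from email.utils import getaddresses


def recipient_frequency(raw_messages) -> Counter:
    """Count how many times each address appears in To: or Cc:."""
    counts = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        fields = msg.get_all("To", []) + msg.get_all("Cc", [])
        # getaddresses splits address lists and drops display names.
        for _name, addr in getaddresses(fields):
            if addr:
                counts[addr.lower()] += 1
    return counts


# Made-up sample messages.
SAMPLE_MESSAGES = [
    "From: alice@example.com\nTo: bob@example.org, carol@example.org\n\nHi both\n",
    "From: alice@example.com\nTo: Bob <bob@example.org>\n\nHi Bob\n",
    "From: dave@example.net\nTo: alice@example.com\nCc: BOB@example.org\n\nFYI\n",
]

for address, count in recipient_frequency(SAMPLE_MESSAGES).most_common():
    print(address, count)
```

A sudden change in who a user writes to, or how often, is the kind of behavior shift these methods look for.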





• Email Tracker Pro:


Smart-ip.net
