Security Operations
Incident Response and Handling
Learning objectives
Apply incident response models to manage network security incidents.
Be familiar with the people, processes, and technology used in incident handling within a SOC.
Be familiar with the NIST incident handling guide.
Given a scenario of an incident, be able to provide appropriate suggestions on how to respond.
Incident response team
CSIRT Overview
A Computer Security Incident Response Team (CSIRT) is a group commonly found within an organization that provides services and functions to secure the assets of that organization.
A CSIRT:
Responds to incidents that have already happened.
Provides proactive services and functions such as penetration testing, intrusion detection, or even security awareness training.
Is authorized by the incident response policy which is approved by the highest possible level of authority within an organization.
The NIST “Computer Security Incident Handling Guide”
Incident response aims to limit the impact of the attack, assess the damage caused, and implement recovery procedures.
The NIST “Computer Security Incident Handling Guide” Special Publication 800-61, revision 2 (800-61r2) provides guidelines for:
Incident handling
Analyzing incident-related data
Determining the appropriate response to each incident
Incident Response involves the plans, policies, and procedures that are used by an organization to respond to a cyber attack.
NIST 800-61r2 Incident Response Stakeholders
Management - Managers create the policies that everyone must follow. They also design the budget and are in charge of staffing all of the departments. Management must coordinate the incident response with other stakeholders and minimize the damage of an incident.
Information Assurance - This group may need to be called in to change things such as firewall rules during some stages of incident management such as containment or recovery.
IT Support - This is the group that works with the technology in the organization and understands it the most. Because IT support has a deeper understanding, it is more likely that they will perform the correct action to minimize the effectiveness of the attack or preserve evidence properly.
Legal Department - It is a best practice to have the legal department review the incident policies, plans, and procedures to make sure that they do not violate any local or federal guidelines. Also, if any incident has legal implications, a legal expert will need to become involved. This might include prosecution, evidence collection, or lawsuits.
Public Affairs and Media Relations - There are times when the media and the public might need to be informed of an incident, such as when their personal information has been compromised during an incident.
Human Resources - The human resources department might need to perform disciplinary measures if an incident caused by an employee occurs.
Business Continuity Planners - Security incidents may alter an organization’s business continuity. It is important that those in charge of business continuity planning are aware of security incidents and the impact they have had on the organization as a whole. This will allow them to make any changes in plans and risk assessments.
Physical Security and Facilities Management - When a security incident happens because of a physical attack, such as tailgating or shoulder surfing, these teams might need to be informed and involved. It is also their responsibility to secure facilities that contain evidence from an investigation.
NIST 800-61r2 Incident Response Life Cycle
NIST defines the following four phases in the incident response process life cycle:
Preparation – The CSIRT is trained in how to respond to an incident, and a set of controls is selected and implemented based on the results of risk assessments.
Detection and Analysis – Through continuous monitoring, the CSIRT quickly identifies, analyzes, and validates an incident.
Containment, Eradication, and Recovery – The CSIRT implements procedures to contain the threat, eradicate the impact on organizational assets, and use backups to restore data and software.
Post-Incident Activities – The CSIRT then documents how the incident was handled, recommends changes for future response, and specifies how to avoid a reoccurrence.
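The life cycle above can be sketched as a small state machine. This is only an illustrative sketch: the names `Phase`, `TRANSITIONS`, and `advance` are hypothetical, and the two backward transitions are assumptions of this sketch (containment work can surface findings that need further analysis, and post-incident findings feed back into preparation).

```python
from enum import Enum


class Phase(Enum):
    PREPARATION = "Preparation"
    DETECTION_AND_ANALYSIS = "Detection and Analysis"
    CONTAINMENT_ERADICATION_RECOVERY = "Containment, Eradication, and Recovery"
    POST_INCIDENT_ACTIVITIES = "Post-Incident Activities"


# Allowed moves between phases. The backward edges are assumptions of this
# sketch, not a rule stated in the slides.
TRANSITIONS = {
    Phase.PREPARATION: {Phase.DETECTION_AND_ANALYSIS},
    Phase.DETECTION_AND_ANALYSIS: {Phase.CONTAINMENT_ERADICATION_RECOVERY},
    Phase.CONTAINMENT_ERADICATION_RECOVERY: {
        Phase.DETECTION_AND_ANALYSIS,
        Phase.POST_INCIDENT_ACTIVITIES,
    },
    Phase.POST_INCIDENT_ACTIVITIES: {Phase.PREPARATION},
}


def advance(current: Phase, target: Phase) -> Phase:
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Tracking the current phase this way makes it harder to skip a phase by accident, for example jumping from preparation straight to post-incident activities.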
NIST 800-61r2 Preparation
The preparation phase is when the CSIRT is created and trained.
This phase is also when the tools and assets that will be needed by the team to investigate incidents are acquired and deployed.
Often, the CSIRT may have a jump kit prepared.
Additional incident analysis resources might be required. Examples of these resources are a list of critical assets, network diagrams, port lists, hashes of critical files, and baseline readings of system and network activity. Mitigation software is also an important item when preparing to handle a security incident. An image of a clean OS and application installation files may be needed to recover a computer from an incident.
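As one illustration of the "hashes of critical files" resource, the sketch below records SHA-256 digests during preparation and later reports files whose contents differ. The function names (`sha256_of`, `build_baseline`, `changed_files`) are hypothetical, and the choice of SHA-256 is an assumption; the slides do not name a hash algorithm.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_baseline(paths):
    """Record a known-good digest for each critical file (preparation phase)."""
    return {str(p): sha256_of(Path(p)) for p in paths}


def changed_files(baseline):
    """Return the files that are missing or whose digest no longer matches."""
    changed = []
    for name, expected in baseline.items():
        path = Path(name)
        if not path.is_file() or sha256_of(path) != expected:
            changed.append(name)
    return sorted(changed)
```

For the comparison to be trustworthy, the baseline itself would need to be stored somewhere an attacker cannot modify it, such as offline media in the jump kit.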
Often, the CSIRT may have a jump kit prepared. This is a portable box with many of the items listed above to help in establishing a swift response. Some of these items may be a laptop with appropriate software installed, backup media, and any other hardware, software, or information to help in the investigation. It is important to inspect the jump kit on a regular basis to install updates and make sure that all the necessary elements are available and ready for use. It is helpful to practice deploying the jump kit with the CSIRT to ensure that the team members know how to use its contents properly.
NIST 800-61r2 Detection and Analysis
Different types of incidents will require different responses.
Attack Vectors: Web, Email, Loss or Theft, Impersonation, Attrition and Media.
Detection: Automated detection (antivirus software, IDS) and manual detection (user reports).
Analysis: Use Network and System Profiling to determine the validity of security incidents.
Scoping: Provide information on the containment of the incident and deeper analysis of the effects of the incident.
Detection
Some incidents are easy to detect, while others may go undetected for months. The detection of security incidents might be the most difficult phase in the incident response process. Incidents are detected in many different ways, and not all of them provide detailed or clear information. Automated detection relies on tools such as antivirus software or an IDS. Manual detection relies on user reports.
Analysis
Incident analysis is difficult because not all indicators are accurate. In a perfect world, each indicator would be analyzed to find out whether it is accurate. This is nearly impossible due to the number and variety of logged and reported incidents. Complex algorithms and machine learning often help to determine the validity of security incidents. This is more prevalent in large organizations that have thousands or even millions of incidents daily. One method that can be used is network and system profiling. Profiling is measuring the characteristics of expected activity in networking devices and systems so that changes to that activity can be more easily identified.
Even when an indicator is found to be accurate, it does not necessarily mean that a security incident has occurred. Some indicators have causes unrelated to security. A server that continually crashes, for example, may have bad RAM rather than be the target of a buffer overflow attack. To be safe, even ambiguous or contradictory symptoms must be analyzed to determine whether a legitimate security incident has taken place. The CSIRT must react quickly to validate and analyze incidents by following a predefined process and documenting each step.
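Network and system profiling can be illustrated with a very small statistical baseline. The sketch below is an assumption-laden example, not a method prescribed by NIST: the function names, the use of mean and standard deviation, and the default threshold of three standard deviations are all hypothetical choices.

```python
from statistics import mean, stdev


def build_profile(samples):
    """Summarize expected activity as (mean, standard deviation).

    samples needs at least two measurements, e.g. connections per minute
    recorded during normal operation.
    """
    return mean(samples), stdev(samples)


def deviates(profile, observed, threshold=3.0):
    """Return True when observed lies more than threshold standard
    deviations away from the profiled mean."""
    avg, spread = profile
    if spread == 0:
        # A perfectly flat baseline: any different value is a change.
        return observed != avg
    return abs(observed - avg) / spread > threshold
```

A deviation found this way is only an indicator. As noted above, it still has to be validated before it is treated as a security incident.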
Scoping
When the CSIRT believes that an incident has occurred, it should immediately perform an initial analysis to determine the incident’s scope, such as which networks, systems, or applications are affected, who or what originated the incident, and how the incident is occurring. This scoping activity should provide enough information for the team to prioritize subsequent activities, such as containment of the incident and deeper analysis of the effects of the incident.
NIST 800-61r2 Common attack vectors
External/Removable Media: An attack executed from removable media (e.g., flash drive, CD) or a peripheral device.
Attrition: An attack that employs brute force methods to compromise, degrade, or destroy systems, networks, or services.
Web: An attack executed from a website or web-based application.
Email: An attack executed via an email message or attachment.
Impersonation: An attack involving replacement of something benign with something malicious, such as spoofing, man-in-the-middle attacks, or rogue wireless access points.
Improper Usage: Any incident resulting from violation of an organization’s acceptable usage policies by an authorized user, excluding the above categories.
Loss or Theft of Equipment: The loss or theft of a computing device or media used by the organization, such as a laptop or smartphone.
Other: An attack that does not fit into any of the other categories.
NIST Computer Security Incident Handling Guide – page 2
NIST 800-61r2 Containment, Eradication, and Recovery
Containment ensures the incident does not continue.
For every type of incident, a containment strategy should be created and enforced.
During an incident, evidence must be gathered and documented in a clear and concise manner for subsequent investigation by authorities.
Eradication means identifying all of the hosts that need remediation and eliminating all of the effects of the security incident.
Recovery of hosts requires clean and recent backups, or they will have to be rebuilt with installation media.
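The idea that every type of incident should have a predefined containment strategy can be sketched as a simple lookup keyed by attack vector. Everything in this example is hypothetical: the `AttackVector` enum mirrors the vector names used in these slides, and the listed steps are placeholder strategies that a real CSIRT would replace with its own.

```python
from enum import Enum


class AttackVector(Enum):
    REMOVABLE_MEDIA = "External/Removable Media"
    ATTRITION = "Attrition"
    WEB = "Web"
    EMAIL = "Email"
    IMPERSONATION = "Impersonation"
    IMPROPER_USAGE = "Improper Usage"
    LOSS_OR_THEFT = "Loss or Theft of Equipment"
    OTHER = "Other"


# Placeholder strategies for illustration only.
CONTAINMENT_STRATEGIES = {
    AttackVector.EMAIL: ["quarantine the message", "block the sender"],
    AttackVector.REMOVABLE_MEDIA: ["disconnect the device", "isolate the host"],
    AttackVector.LOSS_OR_THEFT: ["revoke credentials", "request a remote wipe"],
}

# Fallback used when no specific strategy has been written yet.
DEFAULT_STRATEGY = ["isolate affected hosts", "escalate to the CSIRT lead"]


def containment_steps(vector: AttackVector) -> list:
    """Return the predefined steps for a vector, or the default fallback."""
    return CONTAINMENT_STRATEGIES.get(vector, DEFAULT_STRATEGY)
```

A vector that falls through to the default is a useful signal during preparation that a strategy still needs to be written.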
NIST 800-61r2 Post-Incident Activity Phase
Root-cause analysis: A systematic process to identify the initial source of the incident and how to prevent it from occurring again.
Lessons learned: Review the effectiveness of the incident handling process and identify hardening needed for existing security controls and practices.
After-action report: A report that details the attackers’ activities, a summary of the incident, procedures for remediation, and steps to prevent a future incident.
After the incident response team has conducted its assessment, performed its actions, and worked with the system administrators to correct the incident, the organization is still not done with the incident. Three more things have to be completed: the root-cause analysis, the lessons learned process, and the after-action report. The root-cause analysis is a systematic process to identify the initial source of the incident and how to prevent it from occurring again.
After incident response activities have eradicated the threats and the organization has begun to recover from the effects of the attack, it is important to take a step back and periodically meet with all of the parties involved to discuss the events that took place and the actions of all of the individuals while handling the incident. This will provide a platform to learn what was done right, what was done wrong, what could be changed, and what should be improved upon.
Lessons-based hardening
After a major incident has been handled, the organization should hold a “lessons learned” meeting to review the effectiveness of the incident handling process and identify hardening needed for existing security controls and practices. Examples of good questions to answer during the meeting include the following:
Exactly what happened, and when?
How well did the staff and management perform while dealing with the incident?
Were the documented procedures followed? Were they adequate?
What information was needed sooner?
Were any steps or actions taken that might have inhibited the recovery?
What would the staff and management do differently the next time a similar incident occurs?
How could information sharing with other organizations be improved?
What corrective actions can prevent similar incidents in the future?
What precursors or indicators should be watched for in the future to detect similar incidents?
What additional tools or resources are needed to detect, analyze, and mitigate future incidents?
Playbook
After the incident is adequately handled, the organization should prepare a report that details the attackers’ activities, a summary of the incident, procedures for remediation, and the steps the organization should take to prevent a future incident.
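The four report elements named above can be captured in a small record type so that an incomplete report is easy to spot. This is a minimal sketch: the class name `AfterActionReport`, its field names, and the `missing_sections` helper are hypothetical and not part of any standard template.

```python
from dataclasses import dataclass, field


@dataclass
class AfterActionReport:
    summary: str = ""
    attacker_activities: list = field(default_factory=list)
    remediation_procedures: list = field(default_factory=list)
    prevention_steps: list = field(default_factory=list)

    def missing_sections(self) -> list:
        """Return the names of sections that are still empty."""
        sections = {
            "summary": self.summary.strip(),
            "attacker_activities": self.attacker_activities,
            "remediation_procedures": self.remediation_procedures,
            "prevention_steps": self.prevention_steps,
        }
        return [name for name, value in sections.items() if not value]
```

A report could be treated as ready for review once `missing_sections()` returns an empty list.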
https://www.incidentresponse.com/playbooks
https://www.incidentresponse.com/playbooks/malware-outbreak