Technologies may have moved on a long way since Marty Roesch worked on a pet project (Snort) at home in 1998, but our need for instructions and procedures that explain how to use these tools and make them effective has not. I started my infosec journey in the UK’s Royal Air Force (RAF) and, if there is one thing that the military loves, it’s planning. In fact, the title of this blog post, “plan the flight, fly the plan”, is a pilot’s saying, similar to one familiar to runners: “train hard, run easy”.

In other words, if you put the effort into identifying what you have to do when times are easy, you will have a better time of it when times are hard. I have also worked in SOCs and other security monitoring teams that took an ad-hoc approach to security monitoring and incident response. None of these teams was bad at what it did, but each relied wholly upon the technical skillsets of key members of staff (which is all well and good, until they move on to another role).

So why do we use processes and procedures? In short, so that all of our staff know what to do, and when. However, this would be a very short blog post if we left it there…

Let’s start off with an example. Recently, I was at an NTT Security conference in London, ISW 2018, where John Volanthen gave the closing keynote speech on the rescue of the Thai football team, in which he led the UK’s cave diving rescue effort. John talked at length about the importance of having procedures in place to ensure that everyone on the team knew what they were doing and that safety (which was of the highest importance here) was maintained. John was always seen on site walking around with his blue procedures folder.

As an example of just how effective prior planning was in this situation, take a look at the picture below, which shows all of John's personal dive equipment at Heathrow waiting to be loaded onto the aircraft. He and the team received two hours' notice before leaving for the airport! Without planning, assembling everything required (equipment, permissions for gas tanks etc.) in a two-hour turnaround would have been impossible. This relates directly (albeit in less dramatic fashion) to what security monitoring and incident response staff need to do on a daily basis.

Photo with kind permission of John Volanthen

For security monitoring, this means having an incident response plan in place. This is the high-level plan for a SOC/CDC/CIRT which dictates the actions to be taken when an incident occurs. It comprises (at a minimum) the following:

  • Workflows: These are typically swim-lanes showing areas of responsibility and decision points for escalation, involving external agencies, declaring breaches, gathering intelligence and closing down completed incidents.
  • Communications: Quite simply, who to talk to when something happens. This can be other members of the SOC team but more typically involves IT operations (server team, gateway team, architects etc.), physical security, HR, the media team and, via the SOC manager, senior management. There is nothing worse than being in the middle of a major incident and not knowing who to talk to! In the case of the media, who within the team (or company) is authorized to speak to them about an incident, and under what circumstances?
  • Escalation: Escalation was mentioned in 'Workflows' above but, from this perspective, we are looking at the criteria for escalation. Is the incident for a piece of malware on a single non-critical asset or has it spread across half of the production network and is destructive in nature? Both examples are simple to define here, but at which point should the SOC staff escalate the incident or declare a crisis?
  • Sharing: Any security team is going to be constrained by the nature of the information it is protecting (especially in the new world of the General Data Protection Regulation), so it is important that decisions be made about what information (if any) the SOC wants to share with peer groups, national agencies or other organizations. Many SOC teams use the Traffic Light Protocol (TLP) to identify the level of confidentiality associated with security incidents and their indicators of compromise. Defining ahead of time what information can be shared, and who is authorized to share it, reduces the risk of leaking confidential data.
  • Incident Response Procedures (IRP): These are closely linked to John’s blue book mentioned earlier. When a security incident happens, the SOC staff have to know what to do at each point of an investigation. Whilst no two incidents are going to be the same, an IRP can be created for each high-level attack type, e.g. phishing, DDoS, web defacement etc. An appropriate IRP gives the analyst guidance on what steps to take to ensure that nothing is missed, actions are taken rapidly, and all containment and remediation activities are followed for a given threat. Many incident management or orchestration tools add to and complement this approach.
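
To make the escalation and sharing points a little more concrete, here is a minimal Python sketch of how such criteria might be written down so that every analyst applies them the same way. Everything in it is an assumption for illustration only: the `Incident` fields, the `CRISIS_ASSET_THRESHOLD` value and the rule that only TLP:CLEAR and TLP:GREEN material may go to peer groups are hypothetical simplifications, not a recommended policy (the real TLP definitions are more nuanced, and each SOC has to agree its own thresholds).

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1      # handled by the analyst on shift
    HIGH = 2     # escalate to the SOC manager
    CRISIS = 3   # declare a crisis and involve senior management


class TLP(IntEnum):
    # Traffic Light Protocol labels, ordered from least to most restrictive.
    CLEAR = 0
    GREEN = 1
    AMBER = 2
    RED = 3


@dataclass
class Incident:
    affected_assets: int       # number of hosts known to be affected
    critical_asset_hit: bool   # is any affected host business-critical?
    destructive: bool          # e.g. wiper or ransomware behaviour
    tlp: TLP                   # confidentiality label on the incident data


# Hypothetical threshold; a real SOC would agree this in its own plan.
CRISIS_ASSET_THRESHOLD = 50


def escalation_level(incident: Incident) -> Severity:
    """Map an incident to an escalation level using the illustrative rules."""
    if incident.destructive and incident.affected_assets >= CRISIS_ASSET_THRESHOLD:
        return Severity.CRISIS
    if incident.critical_asset_hit or incident.affected_assets > 1:
        return Severity.HIGH
    return Severity.LOW


def may_share_with_peers(incident: Incident) -> bool:
    """Simplified assumption: only CLEAR and GREEN data goes to peer groups."""
    return incident.tlp <= TLP.GREEN
```

With these illustrative rules, a single non-critical infected laptop stays at `Severity.LOW`, while destructive malware across 200 hosts returns `Severity.CRISIS`. The point is not the code itself but that the decision is agreed and written down before the incident, rather than argued about during it.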

From an incident response point of view, the team also has to consider additional components related to the actual deployment: equipment, visas, flights, SLAs and site plans for the customer environment/network. From a managerial perspective, this also means ensuring that enough staff are located in the right geographical positions to support ongoing incident response activities.

All this is well and good, but without actually testing these plans there is no way of confirming that they will ultimately work and, untested, they give organizations a false sense of reassurance. Therefore it is essential that these plans, processes and procedures are all tested regularly. Most of these scenarios can be tested locally with tabletop exercises, which can be delivered to remote teams as well.

These should be designed to test the security team, both against its own procedures and against its technical capabilities, the idea being to verify that the team functions as expected. These exercises can be broadened to include senior management, as these individuals typically do not receive any guidance or awareness training on cybersecurity incidents, e.g. dealing with the media, crisis decisions for IT etc. In addition, there are options to test the whole solution in a real-world scenario, with the offensive security team gathering threat intelligence on the organization, launching a phishing attack and delivering first-stage malware to gain a foothold in the network. This is as real as testing gets, and the security team is thus able to hone its skills under realistic conditions.

So stop and look to your processes. Do you have everything covered, and do you have a plan in place? Do you have a blue procedures book and, if not, what are you waiting for?