“Where does a wise man hide a leaf? In the forest,” G.K. Chesterton wrote. The same holds in cybersecurity: attackers hide malicious sites and pages among the more than 1.5 billion websites on the internet, using increasingly sophisticated evasion techniques. As a countermeasure, NTT Security is enhancing its domain and URL inspection capabilities in close collaboration with NTT Secure Platform Laboratories through LRR, a cloud-based platform that provides domain and URL inspection features.

LRR is unique in two ways: (1) its cutting-edge domain/URL inspection features are powered by machine learning, and (2) its agile operation model accelerates collaboration between production and R&D teams.

Figure 1: LRR provides analysts with actionable analysis of websites by leveraging the latest NTT R&D

Let’s begin with the unique features that only NTT can provide. LRR is currently providing the following features that automate and enhance security analysis of domains and URLs:

  • MineSpider [1], which automatically crawls the web to discover malicious pages hidden under legitimate-looking URLs
  • RedChainer [2], which detects malicious sites through machine learning-based analysis of web redirection patterns
  • DomainProfiler [3], which assesses the maliciousness of domains and URLs based on changes in domain registration status over time
  • DomainChroma [4], which conducts “chromatographic” analysis of domains and URLs to tell analysts what kinds of action to take against them

MineSpider is an automated tool for detecting web pages used in drive-by download attacks. In particular, MineSpider exhaustively analyzes JavaScript code and detects malicious pages even when they are hidden with evasion techniques such as code obfuscation and environment-dependent redirection. An experiment showed that these anti-evasion techniques enable MineSpider to detect 32% more malicious URLs that would otherwise remain hidden (123,397 with MineSpider versus 93,386 with conventional tools).
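The 32% figure follows directly from the two counts quoted above. A quick check in Python (the variable names are ours, chosen only for illustration):

```python
# Counts quoted in the text above.
minespider_urls = 123_397
conventional_urls = 93_386

# Relative increase of MineSpider over conventional tools.
extra_urls = minespider_urls - conventional_urls
relative_increase = extra_urls / conventional_urls

print(f"{extra_urls} additional URLs, {relative_increase:.1%} more")
# prints "30011 additional URLs, 32.1% more"
```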

Figure 2: MineSpider automatically crawls and detects malicious pages and scripts

RedChainer discovers malicious websites by applying machine learning to analyze the distinct redirection patterns of malicious and compromised sites. In an evaluation, RedChainer achieved a 91.7% true positive rate for malicious websites containing exploit URLs at a low false positive rate of 0.1%. Moreover, it detected 143 more evasive malicious websites than the best previous research efforts.
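For readers less familiar with these metrics, the sketch below shows how a true positive rate and a false positive rate are computed from a classifier's confusion counts. The function name and the counts are hypothetical. The counts are chosen only so that the ratios match the rates quoted above, and they do not reflect the size of the actual evaluation dataset.

```python
def detection_rates(tp: int, fn: int, fp: int, tn: int) -> tuple[float, float]:
    """Return (true positive rate, false positive rate)."""
    tpr = tp / (tp + fn)  # share of malicious sites correctly flagged
    fpr = fp / (fp + tn)  # share of benign sites wrongly flagged
    return tpr, fpr


# Hypothetical counts, chosen only so the ratios match the quoted rates.
tpr, fpr = detection_rates(tp=917, fn=83, fp=1, tn=999)
print(f"TPR = {tpr:.1%}, FPR = {fpr:.1%}")
```

A low false positive rate matters as much as a high true positive rate here, because benign sites vastly outnumber malicious ones and even a small error rate on them can flood analysts with false alarms.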

Figure 3: RedChainer discovers malicious redirection patterns by using machine learning

DomainProfiler discovers domain names that are likely to be abused in the future by analyzing changes in domain name registration status over time, known as Temporal Variation Patterns (TVPs). Our evaluation revealed that DomainProfiler can predict malicious domain names 220 days beforehand with a true positive rate of 0.985.

Figure 4: DomainProfiler analyzes the maliciousness of domains based on the registration history

DomainChroma is a unified tool that provides actionable threat intelligence on websites based on the “chromatographic” concept. It not only tells whether websites are malicious but also recommends actions against malicious sites by categorizing them into two types: compromised and dedicated. Compromised domain names are those that abuse legitimate services. These sites can be blocked at the URL level by network security appliances on the user perimeter, such as intrusion detection systems. Dedicated domain names are those prepared exclusively for malicious purposes. They should be taken down at the DNS level by those responsible for the DNS, such as web hosting service providers. We evaluated DomainChroma on a large real-world dataset and found that over 70% of domain names require only DNS-level defense, with no collateral damage to legitimate access.
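The mapping from category to recommended action described above can be sketched as follows. This is a minimal illustration, not DomainChroma's actual interface: the type names, the `recommend` function, and the example domains are all hypothetical, and the classification step itself is assumed to have already happened.

```python
from enum import Enum


class DomainType(Enum):
    COMPROMISED = "compromised"  # legitimate service abused by attackers
    DEDICATED = "dedicated"      # prepared exclusively for malicious use


# Recommended defense level per category, as described in the text.
RECOMMENDED_ACTION = {
    DomainType.COMPROMISED: "block at URL level on perimeter appliances (e.g. IDS)",
    DomainType.DEDICATED: "take down at DNS level",
}


def recommend(domain: str, domain_type: DomainType) -> str:
    """Format the recommended action for an already-classified domain."""
    return f"{domain}: {RECOMMENDED_ACTION[domain_type]}"


print(recommend("shop.example", DomainType.COMPROMISED))
print(recommend("malware.example", DomainType.DEDICATED))
```

Blocking a compromised site at the DNS level would also cut off its legitimate content, which is why the coarser action is reserved for dedicated domains.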



Figure 5: DomainChroma recommends actions against attacks based on the “chromatography” analysis


All in all, these four tools, integrated as LRR, automate and enhance domain and URL inspection at NTT Security, resulting in faster and stronger threat detection services for customers.

Furthermore, LRR serves as a venue where NTT Security analysts and NTT researchers work together, directly exchanging feedback from production environments. To cross the “valley of death” of innovation, LRR gives NTT’s security professionals access to early R&D outcomes on the cloud and lets them pass feedback from hands-on experience to the researchers. The R&D team quickly responds to the feedback, making the improved technologies immediately available on the LRR platform. In this way, R&D and production teams work together to productize innovations quickly and efficiently.

To introduce innovations faster with minimal impact on current production systems, LRR facilitates their “canary” deployment. Canary deployment enables the production team to apply an innovation to a small portion of the production data and share the results with the R&D team. In turn, the R&D team analyzes the results and improves the innovation. In addition, the usage data reveals attack trends to the R&D team. For example, the URLs and domains analyzed most often by the production team are a clear indicator of the level of attack activity involving those resources.
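One common way to pick the “small portion” for a canary is deterministic hash-based sampling, sketched below. This is an assumption about how such routing could be done, not a description of how LRR does it. The 5% fraction, the function names, and the two pipeline labels are hypothetical.

```python
import hashlib

CANARY_FRACTION = 0.05  # assumed share of inputs sent to the new feature


def in_canary(url: str, fraction: float = CANARY_FRACTION) -> bool:
    """Deterministically assign a URL to the canary group by hashing it."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
    return bucket < fraction


def route(url: str) -> str:
    """Choose the canary or the stable pipeline (both hypothetical labels)."""
    return "canary" if in_canary(url) else "stable"


urls = [f"http://site{i}.example/" for i in range(1000)]
canary_share = sum(in_canary(u) for u in urls) / len(urls)
print(f"{canary_share:.1%} of URLs routed to the canary")
```

Hashing the URL rather than drawing a random number means the same URL always lands in the same group, which makes results shared between the production and R&D teams reproducible.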

There are more innovations in the pipeline beyond the four features described above, including deep learning-based log analysis and integration with Internet backbone traffic analysis.

Incidentally, the quote at the beginning continues: “But what does he do if there is no forest? He grows a forest to hide it in.” We must not let this happen. NTT’s global team is working around the clock to prevent it.

 

References

[1] Takata, Yuta, et al. "MineSpider: Extracting URLs from Environment-Dependent Drive-by Download Attacks." Computer Software and Applications Conference (COMPSAC), 2015 IEEE 39th Annual. Vol. 2. IEEE, 2015.

[2] Shibahara, Toshiki, et al. "Detecting Malicious Websites by Integrating Malicious, Benign, and Compromised Redirection Subgraph Similarities." Computer Software and Applications Conference (COMPSAC), 2017 IEEE 41st Annual. Vol. 1. IEEE, 2017.

[3] Chiba, Daiki, et al. "DomainProfiler: Discovering domain names abused in future." 2016 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). IEEE, 2016.

[4] Chiba, Daiki, et al. "DomainChroma: Building Actionable Threat Intelligence from Malicious Domain Names." Computers & Security. 2018.