This week, we have a guest blog post from Dr. Christoph Hönscheid, Manager Digital Workplace at NTT Security.

Encryption is a cornerstone of IT security. Without end-to-end encryption of data in use, in transit, and at rest, data can hardly be protected from unauthorized access. With IT increasingly running in the cloud or in insecure environments, encryption has become far more important in recent years, especially when mobile devices such as smartphones and tablets carry critical corporate data outside the company's own infrastructure. In such scenarios, the relevant regulations - recently extended by the General Data Protection Regulation (GDPR) - can only be complied with at all if encryption is used.


However, for some applications encryption needs to be complemented by related technologies. One problem is that encrypted data cannot be processed. Ciphertext is produced by mathematical methods that turn clear data into opaque strings. With the key, these strings can be converted back, but as long as the data remains encrypted it cannot be processed or evaluated further. Looking at such a string, it is impossible to tell whether the original clear data was a postal code, a credit card number or a weather report. In the context of cryptographic methods this is of course intended; if content types were identifiable, that would be a serious weakness that could compromise the entire encryption scheme. A famous example is the Enigma machine, with which the German Wehrmacht encrypted its messages in World War II. The British codebreakers around Alan Turing knew that encrypted weather reports were transmitted at certain times, and the corresponding code segments formed the starting point for breaking the entire Enigma system.


Business data processing in the cloud faces the opposite problem: China-based companies that process CRM data with Salesforce, for example, have to encrypt it because the cloud servers in question are located in Japan and certain data may not leave the country unencrypted. However, the resulting cryptic strings are no longer accepted by the application. An email address, for example, must still look like an email address, but any name it contains must no longer be recognizable.
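
To make that requirement concrete, here is a minimal, hypothetical Python sketch of format-preserving masking of an email address (the function mask_email and its pseudonym rule are my own illustration, not part of Salesforce or any gateway product): the address keeps its "@domain" structure, so the application still accepts it, but the name it contains is no longer recognizable.

```python
import hashlib

def mask_email(address: str, secret: str) -> str:
    """Replace the local part of an email address with a pseudonym
    while keeping the '@domain' structure intact, so the value still
    looks like an email address to the application."""
    local, _, domain = address.partition("@")
    # Derive a stable pseudonym from the original value plus a secret,
    # so equal inputs map to equal values but the name is hidden.
    digest = hashlib.sha256((secret + local).encode()).hexdigest()[:10]
    return f"user-{digest}@{domain}"

print(mask_email("jane.doe@example.com", secret="gateway-secret"))
# something like user-3f9c1a2b4d@example.com - still a valid address
# format, but the name is gone
```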


Obfuscating instead of encrypting


Data obfuscation (data masking) meets this requirement. The data is obfuscated or masked in such a way that its structure remains intact: a postcode still looks like a postcode, only its content has been changed so that no conclusions can be drawn about the original record. The method can be refined with more specific masking rules so that, for example, the first two digits of the real postcode are preserved and the associated region remains visible.
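
A minimal sketch of such a refined masking rule (the function name and the choice to keep two digits are illustrative assumptions, not a specific product feature):

```python
import random

def mask_postcode(postcode: str, keep: int = 2) -> str:
    """Keep the first `keep` digits (the region) and replace the rest
    with random digits, so the value still looks like a postcode but
    no longer identifies the original record."""
    kept, rest = postcode[:keep], postcode[keep:]
    masked = "".join(random.choice("0123456789") for _ in rest)
    return kept + masked

print(mask_postcode("80331"))  # e.g. "80758" - the region "80..." is preserved
```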


The disguised placeholder (token) is produced by a rule-based substitution with similar-looking values that, unlike Enigma or modern encryption, cannot be recalculated with a key. The mapping between clear data and tokens is kept in a table that is specially secured; the clear data can therefore only be recovered by accessing this table. The data is obfuscated on the fly by an anonymization gateway that sits between the endpoint and the application, and the same gateway resolves the tokens via the table when the data is retrieved again. The advantage over encryption is that the applications and databases in question keep working, because postal codes, email addresses, and credit card numbers still look like postal codes, email addresses, and credit card numbers. At the same time, data protection regulations are satisfied, because no conclusions can be drawn about the clear data.
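
The principle can be sketched in a few lines of Python. The TokenVault class below is a hypothetical illustration of a token table with rule-based, structure-preserving substitution; it is not the interface of any real anonymization gateway.

```python
import secrets

class TokenVault:
    """Sketch of a gateway's token table: clear values are mapped to
    random, structure-preserving tokens, and only this table allows
    the mapping to be reversed."""

    def __init__(self):
        self._clear_to_token = {}
        self._token_to_clear = {}

    def tokenize(self, value: str) -> str:
        if value not in self._clear_to_token:
            # Rule-based substitution: digits stay digits, letters stay
            # letters, everything else is kept, so the structure survives.
            token = "".join(
                secrets.choice("0123456789") if c.isdigit()
                else secrets.choice("abcdefghijklmnopqrstuvwxyz") if c.isalpha()
                else c
                for c in value
            )
            self._clear_to_token[value] = token
            self._token_to_clear[token] = value
        return self._clear_to_token[value]

    def detokenize(self, token: str) -> str:
        # Recovering clear data is possible only with access to the
        # (specially secured) table.
        return self._token_to_clear[token]

vault = TokenVault()
t = vault.tokenize("4111 1111 1111 1111")   # still looks like a card number
print(t, vault.detokenize(t))
```

In a real gateway the table would of course live in a hardened, access-controlled store rather than in memory; the sketch only shows the mapping principle.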


A common use case for this technology is the preparation of test data. By tokenizing, highly sensitive data can be made available for representative tests by third parties - mostly external service providers - who are not allowed to see the original data.


Data obfuscation cannot and does not aim to replace encryption: encryption is more efficient than maintaining tables of clear-data/token pairs. For specific applications, however, data obfuscation is a useful supplement, which is why anonymization gateways usually offer both functions. Encrypt where possible, tokenize where necessary.