Public Statement in Relation to Data Briefly Exposed on an ElasticSearch Database

This statement is to provide readers with more information and context in relation to a cyber incident Keepnet Labs experienced in March 2020. There are articles online connected to this event that contain inaccuracies which could be misleading – many of these posts have now been amended, but we would like to set the record straight.

Background and Summary of Incident

As part of the Keepnet Labs Solution, we provide a “compromised email credentials” threat intelligence service. To provide this service, we continuously collect publicly known data-breach data from public online resources. We then store this data in our own secure Elasticsearch database and provide companies with the information relating to their business email domains via our Keepnet platform. We do not process this data for any other purpose. To pre-empt any questions, we collect only the following data categories: (1) source of the breach; (2) year the breach was made public; (3) breached email address; (4) breached password or hash; and (5) format of the breached password (e.g. plaintext, encrypted or hashed). We do not collect any other data categories, including any other personal data that might be included in a breach, such as names, addresses or phone numbers. Similar threat intelligence services use the same methodology and collection methods. The overall purpose is to inform companies (or individuals) of a breach that they may be unaware of so that they can take action, beginning by changing their password(s).
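To illustrate the five data categories listed above, the following is a minimal sketch of what one stored record and a per-domain lookup might look like. It is not Keepnet's actual schema or code: the names BreachRecord, PasswordFormat and records_for_domain, and all field names, are assumptions made for illustration only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class PasswordFormat(Enum):
    """Category (5): format of the breached password (hypothetical enum)."""
    PLAINTEXT = "plaintext"
    ENCRYPTED = "encrypted"
    HASH = "hash"


@dataclass(frozen=True)
class BreachRecord:
    """One record holding only the five categories named in the statement."""
    source: str                      # (1) source of the breach
    year_public: int                 # (2) year the breach was made public
    email: str                       # (3) breached email address
    password_or_hash: str            # (4) breached password or hash
    password_format: PasswordFormat  # (5) format of the breached password

    @property
    def domain(self) -> str:
        """Email domain, lower-cased, used to match a company's domain."""
        return self.email.rsplit("@", 1)[-1].lower()


def records_for_domain(records: Iterable[BreachRecord], domain: str) -> List[BreachRecord]:
    """Return only the records whose email belongs to the given domain."""
    wanted = domain.lower()
    return [r for r in records if r.domain == wanted]
```

A company would then only ever be shown the records returned for its own business email domain, for example records_for_domain(all_records, "example.com").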

The integration and database management services have been outsourced to IT service providers since February 2018, and these providers have also been responsible for the maintenance of the servers and systems, including databases.

In March 2020, we started to work with a new service provider, which was performing scheduled maintenance and migrating the Elasticsearch database. For context, the migration was intended to address significant performance issues caused by increasing customer numbers.

Regrettably, the engineer responsible later reported that, during this operation, he had disabled the firewall for approximately 10 minutes to speed up the process. During this window, the Internet indexing service BinaryEdge indexed this data. A security researcher, Mr. Bob Diachenko, found this indexed data and could access the Elasticsearch database via an unprotected port. He could query the data and see high-level information regarding the dataset. The screenshot below shows how he was able to discover how many records were stored in that database.

Fig 1: Previously unpublished screenshot of ElasticSearch database query taken by Bob Diachenko (researcher).

Given the performance issues we were experiencing during the operation, the researcher only managed to extract approximately 2 megabytes of data, as he later confirmed. He also took a screenshot, which was redacted and published in his original blog post. For clarity, at the time we were storing more than 867 gigabytes (5+ billion records), and it would not have been possible to extract this amount of data during the window of downtime, given the significant performance issues that were being experienced. When the researcher tried to access the database again after sending us the email, he reported that it was unavailable “an hour later”. This would be expected, as it was only exposed for a very short period of time during the migration.
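As a rough check of the figures above, the sketch below works out the sustained transfer rate that would be needed to copy the whole dataset within the exposure window, and how small the extracted sample was by comparison. It assumes decimal units (1 gigabyte = 10^9 bytes), treats “approximately 10 minutes” as exactly 600 seconds and “more than 867 gigabytes” as exactly 867 gigabytes, so the results are indicative only. The function name is hypothetical.

```python
# Figures taken from the statement; exact values are simplifying assumptions.
DATASET_BYTES = 867 * 10**9     # "more than 867 gigabytes", decimal units assumed
EXTRACTED_BYTES = 2 * 10**6     # "approximately 2 megabytes"
WINDOW_SECONDS = 10 * 60        # "approximately 10 minutes"


def required_throughput(total_bytes: float, seconds: float) -> float:
    """Sustained bytes per second needed to copy total_bytes in the given time."""
    return total_bytes / seconds


rate = required_throughput(DATASET_BYTES, WINDOW_SECONDS)
fraction_extracted = EXTRACTED_BYTES / DATASET_BYTES

print(f"Required rate: {rate / 10**9:.3f} GB/s ({rate * 8 / 10**9:.2f} Gbit/s)")
print(f"Fraction of the dataset actually extracted: {fraction_extracted:.2e}")
```

Under these assumptions, a full copy would need well over 1 GB/s sustained for the entire window, while the roughly 2 megabytes extracted correspond to only a few millionths of the dataset.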

The Original Source Article

The researcher did contact us via email but unfortunately, the email landed in our spam folder. Hours after the researcher’s blog post was published, we were approached by a media outlet for comment, quoting the link to the original post. Naturally, as soon as we were aware of this, we read the article, saw that he claimed he’d tried to contact us and upon checking all inbox folders, we found the message in spam. We swiftly contacted the researcher and opened an investigation.

We thanked the researcher for bringing this to our attention and fully support the proactive investigation activities he performs. As a community, we need more expert white hat security researchers to help us all to identify vulnerabilities to mitigate future cyber incidents. 

Together we discussed what he found and how he found it, and he confirmed that he did not make a copy of the data other than the screenshot for his article. However, unfortunately, the original blog post contained inaccuracies and could be considered misleading. We raised this with the researcher and he agreed with our observations. He volunteered to amend his post (see Fig 2 below), including removing some inaccurate statements and our business name – primarily because, other than him accessing the database during the firewall downtime, there is no evidence, technical or otherwise, to suggest that any other party could feasibly have extracted any data whatsoever.

Fig 2: Extract from the email received from Bob Diachenko (researcher) on 19th March 2020.

The researcher agreed that the original post could mislead someone to suspect that Keepnet Labs were responsible for an actual data breach involving the loss of 5+ billion records, which is not the case. He understood that a third-party service provider made a mistake and no client data was exposed at any point (a statement he later added to his article). 

Fig 3: An amendment added to Mr Diachenko’s article, to clarify that no customer data was exposed.

Client data is not stored on the database in question and was at no time vulnerable during the incident. He confirmed that the data on that Elasticsearch database was already publicly exposed data. Please note, all of this data is easily available via a web search (e.g. Google).

The researcher also identified that the SSL certificate was owned by Keepnet Labs, and so believed we were directly managing the environment. We would like to clarify that we do own the SSL certificate: Keepnet had created a subdomain on Cloudflare, secured by Cloudflare’s SSL certificate, that pointed to the IP address of the Elasticsearch service (owned and managed by the service provider). For example, Keepnet can create a subdomain https://elasticsearch.keepnetlabs.com, secure it with an SSL certificate and point it to www.anydomain.com. That does not mean that www.anydomain.com is owned by Keepnet Labs, and the same applied in this case.

We performed a detailed investigation and concluded that mistakes were made but that there was no material impact or damage caused to any party resulting from the incident. We considered our obligations under the General Data Protection Regulation (GDPR). We followed ICO guidelines and performed a self-assessment to help determine whether our company needed to report to the ICO. The ICO website tool concluded that the incident did not meet the requirements to report. Our legal basis for collecting and processing the data is detailed in our privacy policy (https://www.keepnetlabs.com/privacy-policy/). All of the threat intelligence data is already publicly exposed and associated with known data breaches, for example, Adobe or Last.FM. Regarding the obligation to notify the data subject(s) about the breached data, we understand that the individuals will already have the information from the original source of the breach, e.g. having been notified by Adobe or Last.FM. Furthermore, providing the information to the individuals would involve a disproportionate effort: there is a very large volume of data and much of it is old (dating back to 2012).

Lessons Learned

Keepnet Labs accepts full accountability for this incident and, after completing a detailed investigation, has implemented a number of changes to our business and technology to ensure this will not happen again. We have also taken proactive steps to reduce the data we process. We accept that, for the period the firewall was disabled by the service provider, the individuals whose data was in the database were at increased risk, on the basis that there was another copy of the data online for those with the technical skills to access it. For this, on behalf of Keepnet Labs and the service provider, we are very sorry.

Our actions:

  • We have brought the IT service management of all services in-house and under our direct control. We no longer work with the third party.
  • We have added the threat intelligence service to our 24/7 monitoring systems and conduct continuous vulnerability scanning.
  • We considered the volume of data we were processing and took the view that we could reduce it substantially. We have deleted 94% of the records (4.8 billion records) and all data that we reasonably believe may not be a ‘business email address’. We have also deleted older data and continue to manage data retention in accordance with our policy.
  • All breached passwords are now obfuscated in the database (both plaintext and encrypted/hashed). We recognise that even though the passwords are available on the internet, we do not actually require them and the threat intelligence service is not degraded as a result.
  • We have updated our incident response procedures and run similar disaster recovery (DR) and business continuity (BC) simulations on other parts of the Keepnet Labs platform.
  • We have engaged a third-party security consultancy to conduct an audit on our security processes and perform a threat perimeter assessment. 
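The deletion figures in the list above can be cross-checked against the “5+ billion records” mentioned earlier. The sketch below takes the two published figures (94% and 4.8 billion) at face value and derives the implied totals. Both inputs are rounded in the statement, so the outputs are approximations, and the function name is hypothetical.

```python
# Published figures, treated as exact for the purpose of this estimate.
DELETED_RECORDS = 4_800_000_000   # "4.8 billion records"
DELETED_FRACTION = 0.94           # "94% of the records"


def implied_totals(deleted: float, fraction: float) -> tuple:
    """Return (records before deletion, records remaining) implied by the inputs."""
    total = deleted / fraction
    return total, total - deleted


total_before, remaining = implied_totals(DELETED_RECORDS, DELETED_FRACTION)

print(f"Implied records before deletion: {total_before / 10**9:.2f} billion")
print(f"Implied records remaining:       {remaining / 10**6:.0f} million")
```

On these assumptions, the implied starting total is a little over 5 billion records, which is consistent with the “5+ billion” figure, and roughly 300 million records remain.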

Other Stories and Posts Online – Debunked

The ‘breaking news’ was that Keepnet Labs was responsible for the “world’s biggest data breach with over 5+ billion records”, which is simply not true. Stories suggested that the exposed database “could have included data that has not been previously breached”, that “Keepnet ignored the warnings from the researcher”, that Keepnet “clearly do not take security seriously”, and so on. These claims are all inaccurate and could be misleading, causing damage to our brand and reputation.

Whilst the underlying story is important to publish, i.e. that Elasticsearch databases are vulnerable to exploitation if left publicly accessible, the insinuation that Keepnet Labs was grossly negligent or exposed previously uncompromised data is untrue. We are a small start-up cyber business that is striving to improve every day. As part of the cyber community, we take our responsibilities very seriously and do accept that errors were made, and we are sorry for letting the community down.

We have been working over the past few months to get in contact with the authors of posts who have shared inaccurate aspects of this story and have politely asked them to update their articles. We feel that whilst it may be a potentially newsworthy story, it is important that the details are correct and the reader is not misled over this incident, however unintentional that may have been. We have been courteous and cooperative for several months but regrettably, as a last resort and upon taking legal advice, our legal representatives contacted one company. 

We are a start-up business and like many others, we are struggling during these difficult economic times. Any inaccurate statements could have a significant impact and cause damage to our business and unfortunately, we need to defend ourselves if polite requests are ignored, particularly when these statements damage our reputation. 

We sincerely hope that this full statement explains the context and clarifies the facts about what happened.

For further information

If you have any other questions, please email them to us (info@keepnetlabs.com) and we will do our best to respond.