The utility industry has undergone a remarkable shift over the past 10 years. What was traditionally a one-way commodity electron-flow to customers has now become bidirectional traffic of both electrons and bytes.

With the rise of distributed energy resources (DERs) and customer demands for improved energy efficiency, utilities are handling an increasing amount of useful data, especially from consumers. Today, there are few new appliances without a tiny computer and network connection. Having data from devices connected to a smart home service can create a detailed behavioral pattern of daily usage, which – if used correctly – could be a basis for providing demand response and other services that benefit both consumers and the utility. It can also open new markets for utilities to offer new services. However, handling consumer behavioral data raises serious issues around ensuring data privacy.

Intertrust has been privileged to work on a data platform for a German smart home service operated by a major European utility. As utilities around the world consider utilizing smart home data for any number of services, here are some observations and lessons around ensuring data privacy and complying with European data privacy regulations gleaned from this work.

Data Flows in Smart Home Services

The value of smart home services actually comes from data analysis and machine learning. For the most part, smart home devices send data to a smart home hub. The hub then sends this data to the service provider, where it is enriched with useful external data and then analyzed.

Smart home data itself is a collection of seemingly insignificant information. This is all wrapped in a layer of metadata from your devices (serial numbers, timestamps) and sent to a cloud computing system that analyzes your data and provides educated feedback to your devices. This is done either directly or via third parties that are involved in providing your smart home services.

At first glance, this may not seem like a lot of information, but the technical data that devices generate, such as temperature readings in different rooms and power consumption patterns, can reveal all kinds of personal preferences. If this data can be connected to your identity, it will result in a creepily detailed profile of your everyday life at home.

There are a number of ways to harm someone by misusing smart home data. One of the simplest malicious ‘use cases’ is tracking electricity consumption patterns to detect when people are away from home. Unless you are planning on becoming a reality TV show star, you probably do not want these sorts of datasets leaking out.
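To show how little effort such misuse takes, here is a minimal sketch of absence detection on hourly consumption readings. The function name, the threshold ratio, and the minimum run length are illustrative assumptions, not part of any real service; real data would also need to account for nights and standby loads.

```python
from statistics import median


def likely_absent_hours(hourly_kwh, ratio=0.5, min_run=6):
    """Return (start, end) index ranges where consumption stays low.

    hourly_kwh: list of hourly readings (hypothetical data).
    ratio: a reading below ratio * median counts as "low" (assumed value).
    min_run: minimum number of consecutive low hours to report.
    The end index is exclusive.
    """
    threshold = ratio * median(hourly_kwh)
    runs, start = [], None
    for i, value in enumerate(hourly_kwh):
        if value < threshold:
            if start is None:
                start = i  # a low-consumption run begins here
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i))
            start = None
    # Close a run that lasts until the end of the series.
    if start is not None and len(hourly_kwh) - start >= min_run:
        runs.append((start, len(hourly_kwh)))
    return runs


# One day of made-up readings: active morning, quiet midday, active evening.
readings = [0.8] * 8 + [0.1] * 9 + [0.9] * 7
print(likely_absent_hours(readings))
```

A dozen lines of logic is enough to flag a long quiet period, which is why raw consumption series deserve the same care as more obviously personal data.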

Designing For Data Privacy

Any decent smart home hub comes with access control measures to keep hackers out. But to make sure the entire smart home data cycle is secured, data routed through the hub must be encrypted – even if it does not contain any personal data. Otherwise, it is only a matter of time until a bad actor exploits vulnerabilities, which can exist in any system, and gets their hands on the unencrypted data.

This also points to the issue of your data being stored somewhere in the cloud and out of your control. While cloud data storage can be secure, it’s not perfect and data has been known to leak. Modern computing technology has reached the point where the personal data in the above-mentioned data package could be analyzed locally without using cloud computing. At least from the data privacy design standpoint, this would be the preferred way of handling smart home data.

Another principle is how to handle identity information. When you sign up for smart home services, you obviously commit to an agreement between you and the service provider. This gives the service provider the right to know your identity and other related details to be able to provide the service. However, these personal details don’t need to be included in the smart home data loop, namely the data packages that move between your smart home hub and the cloud-based analytics service. According to accepted good practice, the only connection between your identity and the data transmitted by the smart home service should be a customer-specific identification code sent from the smart home hub. This ID code should be connected to your identity only in a separate secure database that has very limited access rights.
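The separation described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: `IdentityVault` is a hypothetical in-memory stand-in for the separate, access-restricted database, and the field names in the payload are invented for the example.

```python
import secrets


class IdentityVault:
    """Hypothetical stand-in for the separate identity database.

    In a real deployment this would be a dedicated store with very
    limited access rights, not an in-memory dictionary.
    """

    def __init__(self):
        self._by_code = {}

    def enroll(self, name, address):
        # A random code reveals nothing about the customer by itself.
        code = secrets.token_hex(16)
        self._by_code[code] = {"name": name, "address": address}
        return code

    def resolve(self, code):
        # Only billing or support processes should ever call this.
        return self._by_code[code]


def build_payload(customer_code, readings):
    """Build the data package the hub sends to the analytics service.

    It carries the customer code and measurements, never the identity.
    """
    return {"customer_code": customer_code, "readings": readings}


vault = IdentityVault()
code = vault.enroll("Erika Mustermann", "Musterstrasse 1")
payload = build_payload(code, [{"sensor": "living_room_temp", "value": 21.5}])
print(payload)
```

The design choice here is that the analytics side only ever sees `customer_code`; linking a code back to a person requires access to the vault.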

As a result of this technique, the data looping through the smart home services should essentially be a bunch of completely anonymous numbers. If implemented properly, even if there should be a security breach and the service provider’s smart home database leaks to the internet, no one could connect any of these numbers to your identity.

Even if your smart home data is made public, it’s statistically nearly impossible to use this data for identifying your home based just on power consumption patterns. Of course, there may be some unusual cases – say your private residence is the only building in the city having a large swimming pool, your pool heating consumption data can stand out.
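One rough way to check for such stand-out cases is to group households by a coarse consumption bucket and flag any bucket with too few members. The sketch below is only an illustration: the bucket width, the minimum group size `k`, and the household labels are assumptions, and a real analysis would look at far more than one average value.

```python
from collections import Counter


def standout_households(daily_kwh_by_home, bucket_kwh=5.0, k=3):
    """Return homes whose consumption bucket has fewer than k members.

    daily_kwh_by_home: mapping of pseudonymous home label -> average
    daily consumption in kWh (hypothetical data).
    """
    buckets = {
        home: int(kwh // bucket_kwh)
        for home, kwh in daily_kwh_by_home.items()
    }
    counts = Counter(buckets.values())
    return sorted(home for home, b in buckets.items() if counts[b] < k)


homes = {"a": 11.0, "b": 12.5, "c": 14.0, "d": 13.0, "pool": 48.0}
print(standout_households(homes))
```

Here the home with the heated pool is the only one in its bucket, so it is the only one flagged.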

However, there are some rather peculiar loopholes in smart home data privacy.

True privacy risks can lie in the appliances and devices connected to your smart home hub. If you purchased any of your devices from a third party, it is possible that the device retailer has both your personal details and the serial number of the device. The very same serial number is usually stored within the smart device and used as an identifier for the device when it connects to the network. Now, if you connect the device to your smart home hub and the service’s query format happens to include the device serial number, it will become a part of the dataset that is moved to the cloud. If your smart home data leaks and there is a dedicated malicious actor who is determined to identify the users behind the data, they could start targeting electronics retailers with the goal of extracting all CMS (contact management system) data listing the devices sold and corresponding customers.

Leaving this dark scenario aside, the fact that there are third parties (device sellers, repair shops etc.) that may be able to connect your identity to your devices makes your privacy vulnerable. Any system that handles smart home data should be designed to mitigate these vulnerabilities to the fullest extent possible.
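One possible mitigation, sketched here as an assumption rather than a description of any particular product, is to have the hub replace the raw serial number with a keyed hash before anything leaves the home. The record layout and the names `tokenize_serial` and `scrub_device_record` are hypothetical; the secret is assumed to stay on the hub.

```python
import hashlib
import hmac


def tokenize_serial(serial, hub_secret):
    """Derive a stable token from a serial number with HMAC-SHA256.

    Without hub_secret, a retailer's list of serial numbers cannot be
    matched against the tokens in a leaked dataset.
    """
    return hmac.new(hub_secret, serial.encode("utf-8"), hashlib.sha256).hexdigest()


def scrub_device_record(record, hub_secret):
    """Return a copy of record with the serial number replaced by a token."""
    scrubbed = dict(record)  # leave the caller's record untouched
    serial = scrubbed.pop("serial_number", None)
    if serial is not None:
        scrubbed["device_token"] = tokenize_serial(serial, hub_secret)
    return scrubbed


secret = b"hub-local-secret"  # assumption: generated and kept on the hub
record = {"serial_number": "SN-12345", "power_w": 42}
print(scrub_device_record(record, secret))
```

The token stays the same for a given device, so analytics can still follow one appliance over time, but the serial number known to the retailer never reaches the cloud.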

Regulations and Handling Personal Data

Data privacy regulations, such as the European Union’s GDPR, have a direct impact on smart home services.

Under the GDPR, processing personal data requires a lawful basis, first and foremost the data subject’s consent (the regulation also lists other bases, such as performance of a contract). A valid consent needs to be specific and informed: it should list the data variables, describe how the data is used, and state who can access it.

Furthermore, personal data should be pseudonymized when being processed. Pseudonymization is a process where identifying fields in a data set are replaced with artificial identifiers, making it difficult, but not impossible, to tie the data to an identity. The key that links these identifiers back to identities has to be stored separately, and access to it must be limited.

Any use of personal data, and this also applies to pseudonymized data, must serve a specified, explicit and legitimate purpose. This is certainly highly relevant to smart home data. Note that pseudonymized data is still subject to data protection rules, while properly anonymized data is not and can be used without any privacy-related restrictions.

As you can see, there are a number of key aspects to getting smart home data privacy right. The system needs to be designed so that smart home data is kept completely separate from any personal data. Each data variable included in the smart home service needs to be analyzed thoroughly to be sure that it cannot be connected to the customer, even indirectly. If there is any doubt whether a specific data variable could be connected to an individual under certain circumstances, then it is certainly a good idea to pseudonymize the variable.
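A variable-by-variable review like this can be captured in a very small table-driven check. The sketch below is an assumption about how one might record the outcome, not the procedure used in the project described here; the risk categories, action strings, and variable names are all illustrative.

```python
from enum import Enum


class Risk(Enum):
    DIRECT = "direct identifier"
    INDIRECT = "indirect identifier"
    NONE = "no known link"


# Assumed policy: what to do with a variable of each risk class.
ACTIONS = {
    Risk.DIRECT: "keep out of the smart home data loop",
    Risk.INDIRECT: "pseudonymize",
    Risk.NONE: "allow",
}


def audit(variables):
    """Map each variable name to an action.

    variables: mapping of variable name -> Risk, or None if the review
    could not settle the question. Unsettled variables are treated as
    INDIRECT, following the "if in doubt, pseudonymize" rule.
    """
    return {
        name: ACTIONS[risk if risk is not None else Risk.INDIRECT]
        for name, risk in variables.items()
    }


review = {
    "customer_name": Risk.DIRECT,
    "device_serial": Risk.INDIRECT,
    "room_temp_c": Risk.NONE,
    "firmware_build": None,  # reviewers were unsure
}
for name, action in audit(review).items():
    print(f"{name}: {action}")
```

Defaulting unsettled variables to pseudonymization keeps the safe choice automatic rather than relying on someone remembering to apply it.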

When Intertrust worked with the German smart home service provider, we went the extra mile by completing a comprehensive privacy analysis of all the smart home data variables that were to be processed. The investigation revealed a few surprising findings. So, to be sure that the whole data processing cycle was strictly compliant with privacy regulations, some adjustments were made to the database designs. The experiences gathered were an inspiration for this article. I hope the ideas addressed here are useful for your next encounter with smart home data.

Kalle Kägi is Vice President of European Operations and Corporate Development at Intertrust. He has more than 25 years of experience in creating infrastructure for trusted computing and data governance. He is the founder of Planet OS, a data infrastructure company for geospatial IoT, and served in various founding and executive roles at Flydog Design, Baker Tilly Baltics, and BDO Eesti AS. A graduate of the University of Tartu, Estonia, Kägi is a past co-chair of the Smart Ocean and Smart Industries Working Group of the World Ocean Council.