The Hated One

How To Protect Your Online Privacy With A Threat Model | Complete Tutorial

Added 2022-07-22 12:30:42 +0000 UTC

Privacy tools are inconsistent. The inventory of recommended countermeasures changes all the time. DuckDuckGo was recently exposed for allowing Microsoft trackers in their mobile browser. Their search engine is still used and recommended for privacy, but the browser can’t be trusted.

This happens all the time. An emerging privacy tool is recommended for a long while only to be advised against overnight. Companies drop support, they change ownership, or security research invalidates their merits.

On a long enough timeline, you can’t trust any single product. So what if, instead of focusing on the tools, you learned a consistent method that would help you proactively mitigate privacy threats as they arise?

You would be able to actually evaluate your own exposure and adapt your privacy strategy to your needs, at your own pace, as the threats emerge. You wouldn’t be using privacy tools improperly and you wouldn’t get a false sense of security, because every decision would have a reason.

This is exactly what this guide aims to give you. It will introduce you to a tested methodology with which you will be able to achieve strong privacy. The secret is to start with a threat model.

Threat modeling

Privacy engineers have been researching a methodology that would encapsulate all privacy goals in one comprehensive model. They’ve found one by compiling a list of seven threat categories that lead to privacy violations in modern systems. These categories are:

Linkability, identifiability, non-repudiation, detectability, disclosure of information, unawareness and non-compliance. This methodology is called the LINDDUN privacy threat model. LINDDUN is a mnemonic for the seven threat categories.

For our threat modeling exercise, we will use LINDDUN Go cards to help us elicit and mitigate all the threats we could face in our time. LINDDUN will give us the consistent method of finding the right tools to protect our privacy now and in the future. Let’s begin.

Identifying assets

The first step of our threat modeling journey will be to identify our assets – that is, what we are trying to protect. This is all the personal information created by us or generated by our usage of apps and services.

Transactional data vs contextual data

Information can be broadly split into two categories – transactional data and contextual data. You want to document the collection of both categories. Transactional data refers to the content of the communication while contextual data mostly refers to the metadata.

They are both equally sensitive and can share the same data types. For example, you can privately share your location with a contact in an encrypted message but your location will also be embedded in the message metadata in the form of an IP address. This IP address will be visible as metadata to organizations carrying your traffic, such as your ISP, service provider or the government.

Metadata attributes usually inform about the date and time of creation of the data, date and time of sending and receiving, original location, author, source, size and quality of the data. All of this metadata is unique, open and traceable. It is rarely protected and people get killed based on metadata. When you are building your data inventory, you want to document both of these data types.

Data inventory

But we don’t want to just know what our data assets are, we also want to know where they travel and reside. We want to know if our data is collected, how it is stored and whether it is shared with third parties.

To do this, we start by compiling a list of all apps, services and devices we use. Then, for every service or organization behind those services, we want to create a table of data flow components to estimate where our information assets exist.

The first row is reserved for our credentials. These are any username-password combinations, biometrics, phone numbers or email addresses used to log in to online accounts.

The second row belongs to data collected by organizations providing us apps and services. This is where we can identify transactional and contextual data assets that leave our devices.

The third row is for data shared by service providers with third parties. We also want to know whether they share the content of our data or just the context.

The fourth row will document every data type retained by organizations and entities for any period of time. This should also include client side data stores such as a password manager database on mobile phone storage.

The fifth row is for data processing. Companies often process your data, such as your transactions, billing information or booking details. If an organization processes your data, making an inventory will help you see if it is proportionate to the required purpose or not.
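One way to keep this inventory is as a small structured record per service. The following is a minimal Python sketch, not a format prescribed by LINDDUN: the class name, the field names and the example messenger entry are all hypothetical and only mirror the five rows described above.

```python
from dataclasses import dataclass, field

# The five data flow components (rows) described in the text.
COMPONENTS = ("credentials", "collected", "shared", "retained", "processed")
CATEGORIES = ("transactional", "contextual", "unspecified")


@dataclass
class ServiceInventory:
    """Inventory for one app, service or device (hypothetical structure)."""
    service: str
    # Each row holds (data type, category) pairs. "transactional" means
    # content, "contextual" means metadata.
    rows: dict = field(default_factory=lambda: {c: [] for c in COMPONENTS})

    def add(self, component: str, data_type: str,
            category: str = "unspecified") -> None:
        if component not in COMPONENTS:
            raise ValueError(f"unknown component: {component}")
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        self.rows[component].append((data_type, category))

    def contextual(self) -> list:
        """All metadata entries across every row."""
        return [
            (component, data_type)
            for component, entries in self.rows.items()
            for data_type, category in entries
            if category == "contextual"
        ]


# Made-up example entry for an imaginary messaging app.
messenger = ServiceInventory("ExampleMessenger")
messenger.add("credentials", "phone number")
messenger.add("collected", "message text", "transactional")
messenger.add("collected", "IP address", "contextual")
messenger.add("shared", "contact list", "contextual")
print(messenger.contextual())
```

Keeping one such record per service makes it easy to list the metadata you would otherwise overlook, which is the part the later threat questions rely on most.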

Identifying adversaries

With the data inventory nicely organized, we are ready to move on to the next threat modeling step – identifying adversaries. That is – who we are protecting our data from.

Because privacy is a different goal than security, our adversaries aren’t just gonna be external attackers. Even fully authorized entities with legitimate access to our data are a privacy threat. In our method, we will consider three main threat sources:

External attacker – an adversary with unauthorized access to communication or stored data. These are your malicious hackers, intelligence agencies or criminal groups.
Organizational source – a company or an entity that handles your data in a privacy-violating way. This also includes a malicious employee abusing your data. The list of apps and services in your data inventory will help you identify these entities. This is useful to know when avoiding these organizations for privacy reasons or seeking retribution for violation of your privacy rights.
Receiving party – a third party that an organization shares your data with, or the receiving end of your communications. Tracking down who holds copies of your data will come in handy when we try to eliminate our data footprint during threat mitigation.

Threat elicitation

Now that we know the who, it is time to identify the threats to our privacy – that is, what we want to protect our data from.

We will use the LINDDUN threat list of seven privacy threats and ask corresponding questions about every service in our data inventory. We will know how to answer these questions because our data inventory will tell us what every service does with our data.

For each system from our data inventory, we will create a mapping table. This is a table with the same rows for data flow components as we had for our inventory but we add seven new columns, one for each privacy threat.

When an answer to a question leads to a threat for a given category and a data component in our inventory, we will mark it in our mapping table. If you want to follow this exercise thoroughly, you can download or play LINDDUN Go cards from linddun.org/go.
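The mapping table can be sketched as a small grid. This is a minimal Python illustration under my own assumptions: the function names and the two marked cells are hypothetical, and the row names simply reuse the five data flow components from the inventory.

```python
THREATS = (
    "linkability", "identifiability", "non-repudiation", "detectability",
    "disclosure of information", "unawareness", "non-compliance",
)
COMPONENTS = ("credentials", "collected", "shared", "retained", "processed")


def new_mapping_table() -> dict:
    """One row per data flow component, one column per LINDDUN threat."""
    return {c: {t: False for t in THREATS} for c in COMPONENTS}


def mark(table: dict, component: str, threat: str) -> None:
    """Record that a question revealed a threat for this component."""
    table[component][threat] = True  # raises KeyError on a misspelled name


def render(table: dict) -> str:
    """Plain-text view whose column headers are the LINDDUN initials."""
    lines = [" " * 12 + "L I N D D U N"]
    for component, row in table.items():
        cells = " ".join("x" if row[t] else "." for t in THREATS)
        lines.append(f"{component:<12}{cells}")
    return "\n".join(lines)


# Made-up answers for one imaginary service.
table = new_mapping_table()
mark(table, "credentials", "linkability")
mark(table, "collected", "detectability")
print(render(table))
```

One table per service keeps the answers comparable, so the marked cells can later be matched against mitigation strategies.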

The first threat on the list is linkability.

Linkability

Linkability is the ability of an attacker to find the link between two items of interest, even without knowing the actual identity of the subject. Linkability can lead to identifiability and inference and is impacted by data minimization and anonymization.

The questions you want to ask are:

Are you re-using the same credentials and is the service using them to track you?
Is your behavior creating a sufficiently unique pattern as to be linkable?
Do you submit personal data that the provider can link together?
Does the service collect linkable metadata?
Does the service share your data with third parties that can be linked?
Is the service minimizing or anonymizing stored data, e.g. by storing aggregate sets instead of individual profiles?

Linkability often leads to inference and profiling that can have discriminatory impacts on data subjects. For example, if your neighborhood has a higher percentage of people with a certain disease, your insurance premium might be higher than in the surrounding areas.
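To see why linkable metadata matters, consider two data sets that each look harmless on their own. This is a minimal Python sketch with made-up records and hypothetical field names: neither data set contains a name, yet they can still be joined on the attributes they share.

```python
# Two made-up data sets held by different parties. Neither contains a name.
pharmacy_orders = [
    {"zip": "10115", "birth_year": 1987, "item": "insulin"},
    {"zip": "20095", "birth_year": 1990, "item": "bandages"},
]
fitness_app_users = [
    {"zip": "10115", "birth_year": 1987, "device_id": "a1f3"},
    {"zip": "80331", "birth_year": 1975, "device_id": "77c0"},
]


def link(left: list, right: list, keys: tuple) -> list:
    """Join two record lists on the given shared attributes."""
    index = {}
    for record in right:
        index.setdefault(tuple(record[k] for k in keys), []).append(record)
    linked = []
    for record in left:
        for match in index.get(tuple(record[k] for k in keys), []):
            linked.append({**record, **match})
    return linked


links = link(pharmacy_orders, fitness_app_users, keys=("zip", "birth_year"))
print(links)
```

The joined record ties a purchase to a device without anyone knowing who the person is, which is the situation the linkability questions are meant to uncover.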

Identifiability

Identifiability is the ability of an adversary to identify a subject within a data set. It happens when you can’t hide the link between an item of interest and your identity. Identifiability leads to severe privacy violation. It is impacted by data minimization and linkability and it’s mitigated by anonymization.

Identifiability questions include:

Do the credentials contain identifiable information (e.g. email address with real name, e-ID, biometrics)?
Are your behavioral patterns sufficiently unique as to be identifiable?
Do you submit identifiable personal data to the provider?
Is the provider collecting identifying metadata?
Is your identifiable data shared with third parties?
Is the service storing identified data without minimization or de-identification? Can a retrieving party request identifiers from the stored data?

A modern example of identifiability is pattern recognition AI. Your behavioral patterns are often so unique you can be identified even if you are anonymous. Browsing, listening or viewing habits are common usage patterns that lead to identifiability with a high degree of certainty.
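A rough way to reason about this is to count how many records share your exact combination of habits, sometimes called the anonymity set. The sketch below is a simplified Python illustration with made-up profiles and hypothetical attribute names; real pattern recognition systems are far more elaborate.

```python
from collections import Counter


def anonymity_set_sizes(records: list, attributes: tuple) -> Counter:
    """Count how many records share each combination of attribute values."""
    return Counter(tuple(r[a] for a in attributes) for r in records)


def unique_records(records: list, attributes: tuple) -> list:
    """Records whose attribute combination appears exactly once."""
    sizes = anonymity_set_sizes(records, attributes)
    return [r for r in records if sizes[tuple(r[a] for a in attributes)] == 1]


# Made-up "anonymous" usage profiles: no names, only habits.
profiles = [
    {"browser": "Firefox", "timezone": "UTC+1", "top_genre": "jazz"},
    {"browser": "Firefox", "timezone": "UTC+1", "top_genre": "jazz"},
    {"browser": "Chrome", "timezone": "UTC+1", "top_genre": "jazz"},
    {"browser": "Firefox", "timezone": "UTC-5", "top_genre": "metal"},
]

exposed = unique_records(profiles, ("browser", "timezone", "top_genre"))
print(len(exposed), "of", len(profiles), "profiles are unique")
```

The more attributes a service collects, the smaller each group becomes, and a group of one is effectively an identity.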

Non-repudiation

Sometimes you will have to make a trade-off between mutually exclusive privacy and security goals. That is the case with non-repudiation. Non-repudiation means not being able to deny a claim or an event. In terms of privacy, it’s the opposite of plausible deniability. Non-repudiation is a useful security property for e-commerce application systems, where a receipt is used as evidence a user received an item from the vendor. But for systems such as online voting, off-the-record messaging or whistle-blowing, non-repudiation is a severe privacy threat.

If you need plausible deniability, ask these questions about a system:

Do you have to log in to a system with identifiable credentials?
Can the communication be traced back to its origin, e.g. your location?
Can you plausibly deny having received a message?
Can you plausibly deny an encrypted storage?
Can future parties retrieve data that contains undeniable information?

Detectability

Protecting the content of your information is rarely enough. In many cases, the mere existence of an item of interest can pose a threat. That’s the case for detectability. Detectability is the ability to distinguish whether an item of interest exists or not. Due to the open nature of the internet, detectability is a very persistent threat. External adversaries will almost always be able to detect contextual data. Detectability leads to inference and is impacted by data anonymization.

You can find out whether you are exposed to this threat by answering the following questions:

Can your credentials be discovered by an external threat source, e.g. through the “forgot password” prompt?
Is communication between you and the service hidden and anonymous, e.g. routed through the Tor network?
Can additional information be inferred from communication behavior?
Can storage actions and data retrieval be detected by an external threat actor?

Common examples of detectability are knowing whether someone or no one is at a given location or whether an entry in a database belongs to a real person. For example, if you know a celebrity has a file at a rehab facility, you can deduce they have an addiction problem without having access to the actual content of the record.

Disclosure of information

Our privacy threat model cannot be complete without a complementing security analysis. Your privacy solutions need to have a security model. Without it, you are exposed to the threat of information disclosure. Services from big to small fall victim to data breaches on a daily basis. The longer an organization retains your data, the higher the likelihood it will be leaked. If you are storing your data in the cloud, you should always assume a breach and prepare accordingly.

Disclosure of information happens as a breach of confidentiality and can be impacted by authentication and authorization. Common vectors for an attacker to gain access to data are improperly implemented or missing encryption, systems not patched with security updates, privileged access acquired by the attacker, and compromised login credentials or device secrets. Other attempts can include spoofing, where a phishing attack impersonates a legitimate entity to trick a target into revealing information.

Disclosure of information is most likely to happen when data flows into and out of a system and while it is stored. Keep in mind that data isn’t stored just in the cloud but also on your devices.

Unawareness

One of the reasons privacy tools often don’t work to protect people’s data is that users are unaware of how to use them properly. Unawareness is a seriously neglected privacy threat. When you are not aware of the impacts of sharing your data, that’s when you are most exposed. Your privacy routine needs to be supported with at least basic research and understanding. You need to carefully evaluate the following:

Are you sufficiently informed about your data being collected and/or processed?
Does the system provide user friendly privacy controls and are the default settings privacy preserving?
Does the system that stores your personal data provide an easy mechanism for requesting a copy of the data?
Can you request to remove or rectify your personal data from the service?
Does the service require an informed consent before data is processed? Can you easily withdraw your consent?

Unawareness is, in my opinion, the most common problem of modern day privacy issues. Users tend to submit way too much information to service providers without a second thought. Much of privacy violation could be easily avoided by merely giving less data about yourself and tightening your privacy settings.

Non-compliance

A large portion of the blame lies with service providers. Organizations or malicious employees can often violate privacy rights by breaking regulations and corporate policies. Non-compliance is the last threat in our threat model, but it is probably the most rampant. Non-compliance shouldn’t just be about what’s in the legislation. It should also be about upholding the best data protection principles – such as purpose limitation, proportionality, storage limitation or the principle of least privilege. You can benchmark apps and systems with read access to your inventory by asking these questions about them:

Do they collect more personal data than required for the purpose?
Is the data being processed without your consent?
Is more of your data processed than required for the purpose?
Does the process make decisions without human verification? Can you object to the automated decision?
Is the service storing more personal data than required for the purpose?

Once you have gone through all privacy threats for every service in your inventory, you will have a reliable view of what threats your personal data is exposed to. This will help you tremendously with the next and final step of our threat modeling exercise: threat mitigation.

Threat mitigation

Each threat to your assets should be mitigated with a countermeasure. But how do you know which countermeasures to choose and when to replace them with better ones?

Our threat model gives us a consistent methodology to choose the best strategy for our needs and tweak it when the situation requires it.

Our list of mitigation strategies and privacy enhancing techniques will include: Protecting our identity, Protecting data, Guarding exposure and Maximizing accuracy. Each of these strategies is designed to mitigate several privacy threats at once. Let’s dive in.

Protect ID

Identifiability and linkability lead to the most severe privacy violations. So we want to tackle these first.

Linkability of credentials is best mitigated with the use of anonymous one-time credentials wherever possible. Never use the same username and password combination for more than one service.
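Checking your own inventory for reuse can be sketched in a few lines. This is a minimal Python example with made-up services and hypothetical names; it compares labels (for instance password manager entry IDs) rather than real passwords, so that no secret has to be written down anywhere.

```python
from collections import defaultdict

# Made-up inventory: service -> (username, password label).
# The label stands in for the password so the secret itself is never stored here.
credentials = {
    "shop.example": ("alex87", "pw-1"),
    "forum.example": ("alex87", "pw-1"),
    "bank.example": ("a.smith", "pw-2"),
    "mail.example": ("quietfox", "pw-1"),
}


def reused(creds: dict, part: int) -> dict:
    """Group services by username (part=0) or password label (part=1)."""
    groups = defaultdict(list)
    for service, pair in creds.items():
        groups[pair[part]].append(service)
    return {value: sorted(s) for value, s in groups.items() if len(s) > 1}


print("shared usernames:", reused(credentials, 0))
print("shared passwords:", reused(credentials, 1))
```

Any group with more than one service is a link an adversary could follow from one account to the next.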

To manage unlinkable identities, use a secure password manager. To mitigate information disclosure threats, create an encrypted offline password database and protect it with a long passphrase. Currently, good password managers for this are KeePassXC and Bitwarden.
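What “long” means for a passphrase can be made concrete. The sketch below is an illustration in Python, assuming words are picked uniformly at random from a fixed list; the eight-word list is a stand-in so the example is self-contained, and the function names are hypothetical.

```python
import math
import secrets

# Tiny stand-in word list so the example runs on its own. A real list with
# several thousand words (for example a diceware-style list) is assumed.
WORDS = ["amber", "breeze", "cactus", "delta", "ember", "fjord", "glacier", "harbor"]


def make_passphrase(words: list, count: int = 6, sep: str = "-") -> str:
    """Pick words with a cryptographically secure random source."""
    return sep.join(secrets.choice(words) for _ in range(count))


def entropy_bits(list_size: int, count: int) -> float:
    """Bits of entropy when each word is chosen uniformly and independently."""
    return count * math.log2(list_size)


print(make_passphrase(WORDS))
print(round(entropy_bits(7776, 6), 1), "bits for six words from a 7776-word list")
```

Six words from a 7,776-word list give roughly 77.5 bits, while six words from the eight-word stand-in give only 18, so the size of the list matters as much as the number of words.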

Never rely on passwords only, even if they are strong and unique. Always request two-factor authentication. Use a non-identifiable method such as an OTP app or FIDO security tokens. FIDO keys are your best option as they also defend against phishing attacks on your accounts. Avoid using your phone number for SMS codes to mitigate identifiability, non-repudiation and detectability threats.
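An OTP app needs no phone number because the code is computed locally from a shared secret and the current time. The following Python sketch follows the time-based one-time password scheme from RFC 6238 with HMAC-SHA1; it is meant to show the mechanism, not to replace an audited authenticator app, and the function name is my own.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32: str, at=None, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238, HMAC-SHA1 variant)."""
    key = base64.b32decode(secret_b32.upper())
    # Number of whole time steps since the Unix epoch.
    counter = int((time.time() if at is None else at) // step)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation from RFC 4226
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# Shared secret used by the RFC 6238 test vectors, base32 encoded.
RFC_SECRET = base64.b32encode(b"12345678901234567890").decode()

print(totp(RFC_SECRET, at=59, digits=8))  # RFC 6238 lists 94287082 for this input
```

Both sides only need the secret and a clock, so nothing about your phone line or carrier enters the login.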

Protect your anonymity as much as possible. Avoid identifiable login credentials where allowed. Find out if you are legally allowed to give false information to a service. Oftentimes, you can get away with it (this is not legal advice). Get an anonymous email account and sign up for an email alias service to generate new aliases for every use of a new service.

Services that don’t allow you to dissociate their use from your identity should be treated as privacy-violating. If you use these apps on your phone, try to compartmentalize them in a separate user profile so that they can’t track your personal data and link it to your identity. The most secure way of doing this is to create new profiles on GrapheneOS or Android. iPhones do not have this feature built in.

Protect transactional data

Confidentiality of your content is essential. Services that don’t encrypt your messages end-to-end should be treated as untrusted. Do not share sensitive content on these apps if you have to use them. Never use a website or a server that doesn’t encrypt data in transit with HTTPS or TLS.

Find alternatives with stronger encryption protocols. For instant messaging, the best options are Signal and Briar. For email, use an encrypted provider such as Tutanota or Protonmail.

Keep in mind that your messages are gonna be stored on your devices and this could lead to an information disclosure threat. Always download apps from official repositories such as Google Play Store or App Store. Keep your system and apps updated at all times.

Get the most secure device you can afford. Budget phones from the Pixel lineup are currently the most secure on the market, with iPhones being their only real competitor when it comes to security. Mobile phones will however require you to sign in with your real phone number and will collect your location and usage data. This leads to linkability, identifiability, non-repudiation and detectability threats introduced by your Google account or Apple ID.

This is why Pixel phones are the best option for privacy because they allow you to install GrapheneOS and completely remove all Google services from your phone, thus mitigating all the aforementioned privacy threats.

For device security, don’t trust biometrics for authentication – avoid unlocking your phone with a fingerprint reader, as fingerprints can be easily copied from the prints you leave on the phone itself. Set your devices to always request a passcode or PIN to unlock your phone or laptop.

Protect contextual data

Context is just as sensitive as content. Contextual metadata makes you easily identifiable and your activities linkable. Metadata often serves as evidence of sensitive information and can be detected at virtually all times. The best mitigation strategies to protect contextual data are to anonymize your traffic and minimize your footprint.

The best anonymization tool available is Tor. On your phone, you can use Orbot either system-wide or for specific sensitive apps. For example, you can use Orbot to anonymize your messaging apps and use Tor Browser to anonymize browsing and search traffic. Your IP address will lead to an approximate location and is unique enough to identify you. If you can’t use Tor, you can obfuscate your IP address with a reputable VPN service.

Apps that offer end-to-end encryption of content often do nothing to protect your metadata. Services like iMessage from Apple and WhatsApp from Facebook have privacy-invasive metadata policies that pose significant linkability, identifiability, non-repudiation and detectability threats. This is where minimization comes to the rescue.

Use a messenger that implements a version of the Off-the-record protocol or sealed sender. These were designed to make communication deniable from an external observer’s perspective. Signal and Briar both implement metadata protection. Briar’s approach is significantly stronger than Signal’s since it requires no identifiable usernames or phone numbers and by default anonymizes metadata via the Tor network.

Platforms that you operate on will collect your app usage data. This means Apple, Google or Microsoft will know what apps you install on your devices, how you use them or how often you open them. This is a severe linkability, non-repudiation and detectability threat. Use F-Droid or Aurora store on a de-Googled GrapheneOS phone if you need to mitigate these threats.

Awareness

For every system in your data inventory, go through the settings and find anything related to privacy and security. Use relevant data removal tools to delete image metadata, clear your browser history and erase data from hard drives. Check for any privacy controls and feedback tools you can take advantage of and be aware of what can and can’t be controlled.
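To show what a metadata removal tool does to a photo, here is a deliberately simplified Python sketch. It assumes a plain JPEG file and drops only the APP1 segments, which is where Exif data such as GPS coordinates is stored; other metadata containers and other file formats are not handled, so a maintained tool is the better choice in practice. The function name is hypothetical.

```python
import struct


def strip_app1_segments(jpeg: bytes) -> bytes:
    """Return a copy of a JPEG byte string without APP1 (Exif/XMP) segments."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("unexpected byte where a marker was expected")
        marker = jpeg[pos + 1]
        if marker == 0xDA:  # start of scan: the rest is image data
            break
        # The two-byte length counts itself but not the marker bytes.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        if marker != 0xE1:  # keep every segment except APP1
            out += jpeg[pos:pos + 2 + length]
        pos += 2 + length
    out += jpeg[pos:]
    return bytes(out)


# Synthetic byte string laid out like a JPEG, only to exercise the function.
app0 = b"\xff\xe0" + struct.pack(">H", 7) + b"JFIF\x00"
app1 = b"\xff\xe1" + struct.pack(">H", 12) + b"Exif\x00\x00GPS!"
scan = b"\xff\xda\x00\x02\x12\x34\xff\xd9"
cleaned = strip_app1_segments(b"\xff\xd8" + app0 + app1 + scan)
print(b"Exif" in cleaned)
```

The image data itself is untouched; only the segment carrying the embedded details is left out of the copy.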

Compliance

Take advantage of the privacy controls presented to you. Don’t just dismiss the pop-up windows with a quick “I agree” toggle. Reject cookies and trackers at every opportunity. Force services to respect your consent by routing your traffic through VPN servers in the European Union where privacy legislation is stricter than elsewhere. Don’t underestimate this step. Hardening your privacy settings will legally bind companies to protect your data. If they violate your consent, you could at least have legal grounds to get them to compensate you for damages. (This is not legal advice.)

Confidentiality

Whatever data ends up in a database will be exposed in a breach sooner or later. Either that, or someone with enough money will strike a lucrative deal to get it. Encrypt all the data you send to the cloud with the strongest encryption protocol available and a key only you own. This means avoiding iCloud, Dropbox or Google Drive and using end-to-end encrypted storage such as ProtonDrive, or using Cryptomator to set up encrypted vaults only you will have access to.

Preferably, avoid cloud backups altogether and make an encrypted SD card on which to store all your important files. If you set up a hidden volume, you could also gain plausible deniability if deniability of stored data is what you need. I have a dedicated tutorial explaining how to create an encrypted offline backup.

Data store isn’t just what’s in the cloud but also what’s on your devices. Make sure they are all encrypted with a strong passphrase. Pick a device with the strongest Hardware Security Module, such as a Pixel phone with the Titan M chip. These chips were built to make all known attacks on your device secrets unfeasible. Your phone is likely gonna have more secure encryption than your laptop so take advantage of that.

Harden access controls on your devices. This means revoking and limiting permissions that grant invasive access to your sensitive files and data. Sensitive information isn’t just your location, camera or microphone. It’s also your contacts, files and media, calendar, messages and sensors. Go to the privacy manager of your phone, review these permissions and revoke access that is not necessary. GrapheneOS hardens privacy and security settings further than any other phone available.

Minimization

Use apps that don’t need to collect your data to monetize their service. Much of machine learning that improves user experience can be done on device with federated learning. Minimize how much data you share with services and make sure they collect and store as little as possible about you. Disinform where possible.

Sensitive location information can be revealed through more vectors than just location services. Radio triangulation is a cheap practice that correlates cellular connections to your location. This can be mitigated with airplane mode. Bluetooth and WiFi scanning often runs in the background by default. This needs to be disabled manually and only GrapheneOS keeps it off by default. iPhones run low energy Bluetooth scanning permanently.

Maximize accuracy

Request a copy of your data from all of your service providers on a regular basis so that you know exactly what info they collect. Review how much data they have on file about you and request full deletion. There is usually gonna be a process for this at most major services.

There is a list of people search databases that will harvest your personal details for sale. Contact these services one by one with a request to remove all of your data in their files.

Send these requests routinely every few months or years depending on the amount of data they collect. Take advantage of relevant privacy legislation in your area. Even the biggest surveillance states are gonna have some privacy rules.
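Keeping track of when the next request is due can be sketched as a simple log. This Python example uses made-up services, dates and intervals, and the function name is hypothetical; pick the intervals according to how much data each service collects.

```python
from datetime import date, timedelta

# Made-up log: service -> (date of last request, days between requests).
request_log = {
    "social.example": (date(2022, 1, 10), 90),
    "shop.example": (date(2022, 6, 1), 365),
    "peoplesearch.example": (date(2021, 12, 1), 180),
}


def due_requests(log: dict, today: date) -> list:
    """Services whose next access or deletion request is due, oldest first."""
    due = []
    for service, (last, interval_days) in log.items():
        next_due = last + timedelta(days=interval_days)
        if next_due <= today:
            due.append((next_due, service))
    return [service for _, service in sorted(due)]


print(due_requests(request_log, date(2022, 7, 22)))
```

Updating the date after every request turns the routine into a short checklist instead of something you have to remember.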

End

The list of recommended tools is not exhaustive. But the mitigation strategies are. Use them to continue your research and build your own privacy strategy.

I am not affiliated with any of the recommended services. Doing so would mean I am not giving you advice but selling you an ad. I don’t have any sponsors for the same reason either.

This is a pure and full guide to privacy. It will actually show you a methodology you can consistently adopt to fit your specific case. Privacy is a long-term process. From here, you need to continue this practice until it becomes your routine.

The only way you can support this channel is at https://patreon.com/thehatedone. I don’t play the algorithm game and I don’t bullshit you with sensational content. If you want to keep seeing more of my work, support the channel.