Privacy Requirements

Discovering Privacy Requirements

Antti Vähä-Sipilä, F-Secure
antti.vaha-sipila@f-secure.com / avs@iki.fi
Twitter: @anttivs
Available online at https://www.fokkusu.fi/privacy-requirements-slides/

Hey! This is old, pre-GDPR material from 2015!

This presentation has extra notes for non-interactive viewing. You can access the notes by pressing the down arrow key, or by clicking on the arrow in the bottom right corner.

You are standing in the basement. There is an arrow pointing up. You hear the muffled noise of the audience upstairs. There is a sign here that reads:

Congratulations! You found the explanations track. Each slide has a 'down' arrow, and pressing it will show some more detail of what I will be talking about. This is intended for those who are reading the slides outside the live presentation. Press the 'up' arrow to get back on the main slides track.

Types of privacy requirements

Regulatory requirements
Privacy Enhancing Technologies (PETs)
(Security) controls for either

I'm differentiating between legal requirements, which essentially are things that have to be done in order to comply with the regulations, and "privacy enhancing" functionality, which increases privacy without being a legal requirement. Often there is a difference in how these are discovered: when a PET is wanted, it is usually some sort of user promise - perhaps a market differentiator.

Legal requirements in particular usually start as a policy. Policies describe how something should work. However, policies are not program code, and the system that is going to be implemented is governed only by its program code. If the system does not itself control its privacy aspects, and they are instead left at the policy (or human behaviour) level, then from an engineering perspective there is no "privacy" being built into the system.

The closer you get to the code, then, the more any privacy aspects start to look like security controls. Are you using random identifiers for your customers? This means you have to have a cryptographically strong random number generator and sufficiently large identifiers - pure security engineering. Do you have requirements for data deletion after a retention period? The deletion aspects are purely security engineering.
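As a rough illustration of how these two privacy requirements turn into ordinary security engineering, here is a minimal Python sketch. The identifier size (128 bits), the 90-day retention period, and the function names are assumptions made up for this example, not values from any regulation or from the talk.

```python
import secrets
from datetime import datetime, timedelta, timezone

ID_BYTES = 16                   # 128 bits; assumed "sufficiently large" for this sketch
RETENTION = timedelta(days=90)  # hypothetical retention period


def new_customer_id() -> str:
    """Return a random identifier drawn from the operating system's CSPRNG."""
    return secrets.token_hex(ID_BYTES)


def purge_expired(records: dict[str, datetime], now: datetime) -> list[str]:
    """Remove records older than RETENTION and return the removed ids."""
    expired = [cid for cid, created in records.items() if now - created > RETENTION]
    for cid in expired:
        del records[cid]
    return expired
```

Removing an entry from an in-memory dictionary only stands in for real deletion here. In a real system, deletion would also have to reach backups, logs, and replicas, which is exactly where it becomes a security engineering problem.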

Privacy Impact Assessment (PIA)

Assessment happens before implementation

Timing & scheduling depend on your work management system

Do not confuse with a "security assessment" that often happens just before release
"[a] tool that you can use to identify and reduce the privacy risks" (ICO)

A Privacy Impact Assessment (PIA) is an activity that happens at some point before implementation takes place. Typically, lawyers would approach a PIA very early in the process, which has the challenge that not everything that the system does is yet known - especially if the requirements are managed using an agile methodology.

However, a PIA should not be performed after the fact. The term "assessment" is somewhat tricky, because a "security assessment" (or even "audit") is usually something that happens late in the lifecycle. It is fairly common to confuse the terms.

A PIA is a risk management exercise, so it is justified to say that a PIA is "threat modelling for privacy" or "privacy risk analysis". As we will see in a moment, most PIA activities fit in nicely with security-related risk modelling activities.

PIA building blocks

(Paraphrasing ICO)

Identify business goals
Describe information flows
Identify stakeholders
Identify risks & controls
Implement controls & accept residual risk

As is pretty apparent, on the surface, a PIA does not really differ from a security threat modelling exercise. ICO even mentions information flows, which neatly corresponds to data flow based threat modelling.

The difficult part in this process is actually identifying risks. If you are a privacy (or security) professional, you probably have a pretty good intuitive grasp on what to look for. However, in most organisations, privacy and security experts won't scale. There needs to be some way of finding privacy risks. The most important consideration for identifying risks is that it is systematic. Systematic work provides a scaffolding where people with less experience can arrive at a good enough result. Leaving risk analysis as completely ad hoc or open ended may be risky in itself.

So how can you go about it?

Side note: PIA in Agile

Run PIA against your backlog items

Iteratively, as they come

If design is still open, conduct "business level" PIA with lawyers

Results in privacy user stories and changed functional features

If you already know the design, conduct "technical" PIA using threat modelling

Results in new features (PETs) or acceptance criteria

Intermission: Security threat modelling

Many schools of thought; I do data flow analysis
Use a DFD or MSC diagram, consider all flows and storage
Use a framework such as STRIDE (Microsoft)

There are many ways to do threat modelling. Some people prefer threat trees, some have unstructured discussion. In my line of work, teaching engineers to do it is just as important as the results themselves, so I am using a data flow based threat modelling technique. I have successfully used it in dozens of facilitated sessions for components ranging from embedded device drivers to cloud-deployed web services.

In doing this sort of threat model, we would start with a Data Flow Diagram (DFD) or a Message Sequence Chart (MSC), depending on the complexity of the interactions. Each data flow, data store, and processing entity is then discussed from the six aspects that make up the acronym STRIDE (Spoofing, Tampering [Integrity], (non-)Repudiation, Information disclosure [Confidentiality], Denial of Service [Availability], and Elevation of Privilege).

Findings are stored on the product backlog as tasks.
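The "every element against every STRIDE letter" walk-through described above can be sketched as a simple checklist generator. This is only an illustration: the diagram elements below are hypothetical, and in a facilitated session the prompts are discussed by people rather than printed by a script.

```python
from itertools import product

STRIDE = [
    "Spoofing",
    "Tampering",
    "Repudiation",
    "Information disclosure",
    "Denial of service",
    "Elevation of privilege",
]

# Hypothetical DFD elements for an imaginary web service.
ELEMENTS = [
    ("flow", "browser -> web server"),
    ("store", "customer database"),
    ("process", "web server"),
]


def stride_checklist(elements, threats=STRIDE):
    """Yield one discussion prompt per (element, threat) pair."""
    for (kind, name), threat in product(elements, threats):
        yield f"[{kind}] {name}: {threat}?"


for prompt in stride_checklist(ELEMENTS):
    print(prompt)
```

Each prompt that leads to a finding would then become a task on the product backlog.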

For more information on STRIDE, and its nuances, have a look at Threat Modeling: Designing for Security by Adam Shostack, or my Software Security course at Aalto University in 2015 (see 2014 course at University of Helsinki).

(On how to run this in an agile / Continuous Delivery project, see https://www.fokkusu.fi/security-in-ci-slides/.)

Your reading list

Threat Modeling Chapter 6 (Adam Shostack)

Contextual Integrity Heuristic (Nissenbaum), LINDDUN

Privacy Engineering (Ian Oliver)

Data flow based; classifications and ontologies to understand what happens

The Privacy Engineer's Manifesto Chapter 5

Emphasises 'use cases' ('user stories for privacy')

TRIM, the easy STRIDE add-on (by me)

If you do STRIDE threat modelling, just add four more letters

ENISA's PbD report Section 3

Data flow modelling "acronym extensions"

Extend a data flow diagram analysis (like STRIDE)
LINDDUN: Linkability, Identifiability, Non-Repudiation, Detectability, Disclosure of information, Unawareness of disclosing info, Non-Compliance
TRIM: Transfer, Retention, Informed disclosure, Minimisation

If you have tried STRIDE and find that it works, you can pimp the method by adding more letters (i.e., more points to consider). Sometimes this works, but admittedly this may be seen as somewhat mechanistic.

LINDDUN, as described in Shostack's book and academic papers, discusses a number of aspects that apply to data (linkability, identifiability, detectability), interaction logic (non-repudiation), user experience (unawareness), or compliance. It is interesting to note that non-repudiation is often a good thing for security and a bad thing for privacy.

TRIM is my own set of considerations, which I tried to make as simple and small as possible, and which I came up with independently before I knew of LINDDUN. For each data flow, you consider whether you are allowed to transfer the data over a regulatory or contractual boundary; you discuss how long you retain the data and how you will delete it; you have the same user experience (UX) discussion about informed disclosure as in LINDDUN's unawareness; and finally, you determine whether you only transfer the minimum set of data that is technically required.
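To make the four TRIM considerations concrete, the sketch below turns them into per-flow discussion prompts. The question wording paraphrases the paragraph above; the `DataFlow` type and the example flow are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass

TRIM_QUESTIONS = {
    "Transfer": "Does the data cross a regulatory or contractual boundary, and is that allowed?",
    "Retention": "How long is the data retained, and how will it be deleted?",
    "Informed disclosure": "Does the user understand what they are disclosing?",
    "Minimisation": "Is only the technically required minimum set of data transferred?",
}


@dataclass
class DataFlow:
    source: str
    destination: str
    data: str


def trim_prompts(flow: DataFlow) -> list[str]:
    """Return one TRIM discussion prompt per letter for a single data flow."""
    label = f"{flow.source} -> {flow.destination} ({flow.data})"
    return [f"{label} | {letter}: {question}" for letter, question in TRIM_QUESTIONS.items()]


for line in trim_prompts(DataFlow("mobile app", "analytics backend", "device id")):
    print(line)
```

The same flows that are walked through for STRIDE can be fed through these four extra questions, which is what makes this an add-on rather than a separate exercise.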

Problems with "acronym extensions"

You have to know what you're looking for

Example: "Linkability": You have to understand the concepts of anonymity and pseudonymity on an information-theoretic level

You may miss the big picture (business level flaws)
If you don't know all your data assets and their metadata, the analysis fails

Don't skip the asset discovery phase!
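One way to notice a skipped asset discovery phase is to check every data flow against an asset inventory before the analysis starts. The sketch below assumes a very small inventory format; the metadata fields (`personal`, `owner`) and all example names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    personal: bool  # whether this counts as personal data
    owner: str      # who is accountable for the asset


# Hypothetical inventory and data flows.
INVENTORY = {
    asset.name: asset
    for asset in [
        Asset("email address", personal=True, owner="CRM team"),
        Asset("crash log", personal=False, owner="client team"),
    ]
}

FLOWS = {
    "signup form -> CRM": ["email address"],
    "client -> telemetry": ["crash log", "device id"],
}


def unknown_assets(flows, inventory):
    """Map each flow to the data items it carries that are not in the inventory."""
    gaps = {}
    for flow, items in flows.items():
        missing = [item for item in items if item not in inventory]
        if missing:
            gaps[flow] = missing
    return gaps


print(unknown_assets(FLOWS, INVENTORY))
```

Any flow that shows up in the result carries data nobody has classified yet, so a LINDDUN or TRIM discussion of that flow would be working from incomplete information.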

Contextual Integrity

Specify context of use of personal data, and an "integrity promise"
Context: Actors, type of data, data transmission method, and (societal) norms
Evaluate: If the context or "promise" changes, trigger activities

Helen Nissenbaum's Contextual Integrity is described in her book Privacy in Context. I had a look, and for the purposes of this discussion, Shostack's summary in the Threat Modeling book probably suffices. The method ("Contextual Integrity Heuristic") could be shoehorned into data flow analysis, but I believe it would serve better as a repository of "approved" privacy contexts that business, legal, and security people have agreed on.

You could use the Contextual Integrity Heuristic as a "triage tool" to determine which new functionality would benefit from a more specific PIA. For example, if you do the same old stuff again and again, perhaps you don't need to trigger a PIA. If you break the integrity, then you will have to do a full PIA.

A good thing about this is that the discussion also includes the set of societal norms. This is beyond the bare technical and legal necessities, and takes privacy discussion into the user experience (UX) realm. (LINDDUN and TRIM also have one UX-related consideration, but fall short of considering ethical, political and societal norms.)
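The triage idea can be sketched as a lookup against a library of approved contexts. This is a deliberately naive illustration: the `Context` fields follow the four elements listed on the slide, but the exact-match comparison, the example context, and the function name are assumptions. Real norms are unlikely to reduce to a string comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    actors: frozenset
    data_type: str
    transmission: str
    norm: str


# Hypothetical library of contexts that business, legal, and security have agreed on.
APPROVED = {
    Context(
        actors=frozenset({"customer", "support agent"}),
        data_type="email address",
        transmission="support ticket form",
        norm="used only to answer the ticket",
    ),
}


def needs_full_pia(proposed: Context, approved=APPROVED) -> bool:
    """Return True when the proposed use does not match any approved context."""
    return proposed not in approved
```

A feature that reuses an approved context unchanged would pass triage, while any change in actors, data, transmission method, or norm would trigger a full PIA.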

Problems with Contextual Integrity

If used as a "context library", someone needs to maintain it
Evaluation still needs deep privacy area expertise

Potential solution?

Ensure you know your information assets
Convey privacy needs as "privacy user stories", PETs or acceptance criteria

Depending on how well you already know your upcoming features

Use Contextual Integrity Heuristic to triage new feature requests for PIA treatment
Use a data flow based threat modelling technique and standardise on a privacy add-on

TRIM - easy and quick
LINDDUN - more comprehensive
Classifications from Ian's book - when the area is new, and experts are present

Thank you

If you try any of this out, please let me know!

Twitter: @anttivs

Email: avs@iki.fi

Available online at https://www.fokkusu.fi/privacy-requirements-slides/
