Antti Vähä-Sipilä, F-Secure
antti.vaha-sipila@f-secure.com / avs@iki.fi
Twitter: @anttivs
Available online at https://www.fokkusu.fi/privacy-requirements-slides/
Hey! This is old, pre-GDPR material from 2015!
You are standing in the basement. There is an arrow pointing up. You hear the muffled noise of the audience upstairs. There is a sign here that reads:
Congratulations! You found the explanations track. Each slide has a 'down' arrow, and pressing it will show some more detail of what I will be talking about. This is intended for those who are reading the slides outside the live presentation. Press the 'up' arrow to get back on the main slides track.
I'm differentiating between legal requirements, which essentially are things that have to be done in order to comply with regulations, and "privacy enhancing" functionality (a PET, or privacy-enhancing technology), which increases privacy without being a legal requirement. Often there is a difference in how these are discovered: when a PET is wanted, it is usually some sort of user promise - perhaps a market differentiator.
Usually, legal requirements in particular start as a policy. Policies describe how something should work. However, policies are not program code, and the system that is going to be implemented is governed only by its program code. If the system does not itself control its privacy aspects, and privacy is handled only at the policy (or human behaviour) level, then from an engineering perspective there is no "privacy" built into the system.
The closer you get to the code, the more any privacy aspects start to look like security controls. Are you using random identifiers for your customers? Then you need a cryptographically strong random number generator and sufficiently large identifiers - pure security engineering. Do you have a requirement to delete data after its retention period? The deletion mechanics are likewise pure security engineering.
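As a concrete illustration of that "random identifiers" example turning into security engineering: a minimal sketch (my own hypothetical code, not from the slides) using Python's standard CSPRNG-backed `secrets` module.

```python
# Sketch: generating unguessable customer identifiers. The privacy policy
# requirement ("use random identifiers") becomes security engineering:
# a cryptographically strong RNG and enough bits that identifiers cannot
# be guessed or enumerated.
import secrets

def new_customer_id() -> str:
    # 128 bits from the OS CSPRNG, hex-encoded -> a 32-character string.
    return secrets.token_hex(16)
```

A sequential or timestamp-based identifier would be enumerable and would leak ordering information; `secrets` draws from the operating system's CSPRNG, which is what the requirement actually demands.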
A Privacy Impact Assessment (PIA) is an activity that happens at some point before implementation takes place. Typically, lawyers would approach a PIA very early in the process, which has the challenge that not everything that the system does is yet known - especially if the requirements are managed using an agile methodology.
However, a PIA should not be performed after the fact. The term "assessment" is somewhat tricky, because a "security assessment" (or even "audit") is usually something that happens late in the lifecycle. It is fairly common to confuse the terms.
A PIA is a risk management exercise, so it is justified to say that PIA is "threat modelling for privacy" or "privacy risk analysis". As we will see in a moment, most PIA activities fit in nicely with security-related risk modelling activities.
(Paraphrasing the ICO, the UK Information Commissioner's Office)
As is pretty apparent, on the surface a PIA does not really differ from a security threat modelling exercise. The ICO even mentions information flows, which correspond neatly to data-flow-based threat modelling.
The difficult part of this process is actually identifying the risks. If you are a privacy (or security) professional, you probably have a pretty good intuitive grasp of what to look for. However, in most organisations, privacy and security experts won't scale. There needs to be some other way of finding privacy risks. The most important consideration is that the method is systematic. Systematic work provides a scaffolding that lets people with less experience arrive at a good-enough result. Leaving risk analysis completely ad hoc or open-ended is risky in itself.
There are many ways to do threat modelling. Some people prefer threat trees, some have unstructured discussion. In my line of work, teaching engineers to do it is just as important as the results themselves, so I am using a data flow based threat modelling technique. I have successfully used it in dozens of facilitated sessions for components ranging from embedded device drivers to cloud-deployed web services.
In doing this sort of threat model, we would start with a Data Flow Diagram (DFD) or a Message Sequence Chart (MSC), depending on the complexity of interactions. Each data flow, data store, and processing entity is then discussed from the six aspects that make up the acronym STRIDE: Spoofing, Tampering [integrity], (non-)Repudiation, Information disclosure [confidentiality], Denial of Service [availability], and Elevation of Privilege.
Findings are stored on the product backlog as tasks.
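The systematic part of the exercise can be sketched in a few lines: every element of the diagram is crossed with every STRIDE category so that nothing is skipped ad hoc. (This is purely illustrative structure of my own; the element names are made up.)

```python
# The six STRIDE threat categories, applied to every element of the DFD.
STRIDE = ["Spoofing", "Tampering", "Repudiation",
          "Information disclosure", "Denial of service",
          "Elevation of privilege"]

# A toy diagram: (kind, name) pairs for flows, stores, and processes.
elements = [
    ("data flow", "browser -> web frontend"),
    ("data store", "customer database"),
    ("process", "web frontend"),
]

def discussion_agenda(elements):
    # One discussion item per (element, threat category) pair; in a
    # facilitated session, each finding becomes a backlog task.
    for kind, name in elements:
        for threat in STRIDE:
            yield f"{threat} against {kind} '{name}'?"

# 3 elements x 6 categories = 18 items to walk through systematically.
```

The point of the exhaustive cross-product is exactly the scaffolding mentioned above: less experienced participants still cover every element against every threat category.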
For more information on STRIDE, and its nuances, have a look at Threat Modeling: Designing for Security by Adam Shostack, or my Software Security course at Aalto University in 2015 (see 2014 course at University of Helsinki).
(On how to run this in an agile / Continuous Delivery project, see https://www.fokkusu.fi/security-in-ci-slides/.)
If you have tried STRIDE and found that it works, you can extend the method by adding more letters (i.e., more points to consider). Sometimes this works, but admittedly it may be seen as somewhat mechanistic.
LINDDUN, as described in academic papers and summarised in Shostack's book, covers a number of aspects that apply to data (linkability, identifiability, detectability, disclosure of information), interaction logic (non-repudiation), user experience (unawareness), and compliance (non-compliance). It is interesting to note that non-repudiation is often a good thing for security and a bad thing for privacy.
TRIM is my own set of considerations, which I tried to make as simple and small as possible, and which I came up with independently before I knew of LINDDUN. For each data flow, you consider whether you are allowed to Transfer the data over a regulatory or contractual boundary; how long you Retain the data and how you will delete it; whether the user has made an Informed disclosure (the same user experience discussion as LINDDUN's unawareness); and whether you transfer only the Minimum set of data that is technically required.
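TRIM is a discussion aid rather than code, but a hypothetical sketch of it as a per-data-flow checklist shows how small the method is, with the four questions derived from the description above:

```python
# TRIM as a checklist: four questions asked of every data flow in the
# diagram (illustrative structure only; wording is mine).
TRIM_QUESTIONS = {
    "Transfer":     "May this data cross this regulatory/contractual boundary?",
    "Retention":    "How long is the data kept, and how is it deleted?",
    "Informed":     "Has the user knowingly agreed to this disclosure?",
    "Minimisation": "Is only the technically required minimum transferred?",
}

def trim_review(flow_name):
    # Returns the open questions to resolve for one data flow.
    return [f"{flow_name}: [{aspect}] {question}"
            for aspect, question in TRIM_QUESTIONS.items()]
```

As with the STRIDE cross-product, the value is in the systematic repetition: every flow gets the same four questions, regardless of who is running the session.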
Helen Nissenbaum's Contextual Integrity is described in her book Privacy in Context. I had a look, and for the purposes of this discussion, Shostack's summary in the Threat Modeling book probably suffices. The method ("Contextual Integrity Heuristic") could be shoehorned into data flow analysis, but I believe it would serve better as a repository of "approved" privacy contexts that business, legal, and security people have agreed on.
You could use the Contextual Integrity Heuristic as a "triage tool" to determine which new functionality would benefit from a more specific PIA. For example, if you do the same old stuff again and again, perhaps you don't need to trigger a PIA. If you break contextual integrity, then you have to do a full PIA.
A good thing about this approach is that the discussion also covers societal norms. This goes beyond the bare technical and legal necessities, and takes the privacy discussion into the user experience (UX) realm. (LINDDUN and TRIM each have one UX-related consideration, but fall short of considering ethical, political, and societal norms.)
If you try any of this out, please let me know!
Twitter: @anttivs
Email: avs@iki.fi
Available online at https://www.fokkusu.fi/privacy-requirements-slides/