Information Protection in Windows 10 and OneDrive

Paul Thurrott’s analysis of the soon-to-be-available Windows 10 update – Version 1809 (Redstone 5) – included this snippet that caught my eye:

Storage Sense now integrates with OneDrive and can automatically change any downloaded files to online-only if you haven’t used them in a configurable number of days (in Settings > System > Storage > Storage Sense).

Every vendor struggles with the balance between releasing tools that enable productivity through information availability and protecting information from too much disclosure / availability. What a person should have access to, based on their job role and their tasks, is a governance question for organisations, and one that is enabled by technical capabilities offered by vendors. Data loss prevention stops people from flowing information to other people when it’s sensitive or confidential and the other party doesn’t have access rights. Access control lists on collaborative workspaces, shared folders, and systems of all kinds provide another form of information protection – they let those who need the content in, and keep those who don’t have the right to the content out. Role-based access control goes a step further and adds the nuance of who can and cannot take specific actions within a system.

Choosing to sync your OneDrive contents to a local machine is great for productivity – everything is immediately available whether you are connected to the network or not. But the risk is that unauthorised access to your machine – directly by a person or indirectly by a security threat executing and exfiltrating the data on your disk – will expose information that is sensitive, confidential, or in need of special protections to people who do not have authorisation. The forthcoming integration with Storage Sense in Windows 10 will mean that content from OneDrive that is not used often can be removed from local storage, reducing the potential disclosure surface. If it’s not there directly, it can’t be accessed directly … and thus there’s another action required to gain access, which can be evaluated against up-to-the-second security policies.

Information Protection: The What – Office 365 DLP

As mentioned the other day, Microsoft uses two specific products to deal with the what of information protection: Office 365 DLP and Azure Information Protection. There are similarities between the two, but some fundamental differences as well. Let’s focus on Office 365 DLP today.

DLP is all about know and flow. Both are done specifically within the context of the DLP policies you have configured. Know is about the what – the specific sensitive information types or labeled content that exist within an email being written, a document attached to an email being written, or a document being shared from SharePoint or OneDrive.

But the know is only enacted at the point of flow, such as when the user is writing an email that has been addressed to someone not authorised to receive it (e.g., an external recipient), or a document sharing action that would share the document with someone not authorised to view or edit it.

This core idea – know and flow – aligns with the specific protection mandate of Office 365 DLP – to “prevent loss” by stopping an unauthorised someone from gaining access.

Thus DLP policies – as set up in the Security & Compliance Centre – are intended for:
– preventing an internal user from sending content in an email or attached document to a recipient who should not receive it.
– preventing the sharing of a document with someone who should not receive it.
In both cases, the action must be taken within and through Office 365.
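
The know-and-flow idea can be sketched as a toy policy check. This is only an illustration of the concept, not how Office 365 DLP is implemented or configured: the `Message` class, the `evaluate` function, the crude digit pattern, and the `example.com` internal domain are all assumptions made up for this sketch.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical stand-in for a "sensitive information type":
# a crude pattern for 13 to 16 digits, optionally separated by spaces or hyphens.
SENSITIVE_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")


@dataclass
class Message:
    body: str
    recipients: List[str]


def knows_sensitive(body: str) -> bool:
    """'Know': does the content contain something the policy cares about?"""
    return SENSITIVE_PATTERN.search(body) is not None


def flows_outside(recipients: List[str], internal_domain: str) -> bool:
    """'Flow': is the content about to go to someone outside the organisation?"""
    suffix = "@" + internal_domain.lower()
    return any(not r.lower().endswith(suffix) for r in recipients)


def evaluate(message: Message, internal_domain: str = "example.com") -> str:
    """Block only when sensitive content (know) meets an unauthorised flow."""
    if knows_sensitive(message.body) and flows_outside(message.recipients, internal_domain):
        return "block"
    return "allow"


if __name__ == "__main__":
    risky = Message("Card: 4111 1111 1111 1111", ["bob@partner.org"])
    print(evaluate(risky))
```

Note that the same sensitive content sent to an internal recipient is allowed in this sketch: the know is only enacted at the point of flow.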

DLP will not prevent loss in all situations, unless there are other parts of the Information Protection portfolio in use. For example, if a user downloads a file with sensitive data and then syncs it with Dropbox (or some other cloud sharing service), that content has just disappeared. It has been taken out of the boundary of Office 365, and loss prevention capabilities are blind to what happens. Ditto if it is put onto a USB thumbdrive. There are other solutions in the portfolio – Microsoft Cloud App Security and Windows Information Protection for example – that can address most of these challenges, and Azure Information Protection to a degree as well (in conjunction with those other two). We’ll leave that complexity for another day.

But for now – DLP is all about know and flow.

Information Protection: The What

When thinking about information protection, one of the key questions is what: what specific information should be protected? Some information doesn’t need to be protected at all, such as when it is common knowledge (2+2=4) or easily available (the name of the current leader of a country).

Other information does need to be protected – for a variety of reasons (the why, which we’ll talk about more fully later). Broadly speaking, information needs to be protected when its inappropriate use or disclosure could cause harm to a person, entity, or organisation. For example, disclosing someone’s credit card number and expiry date to the wrong person could result in financial harm (unauthorised transactions, lost funds, decimated credit rating, etc.). Disclosing someone’s name, address, national ID number and similar data could result in harm through identity theft; an unauthorised actor uses that valid data to masquerade as the other person, receiving benefits that the other party is entitled to or is forced to pay for without receiving the benefit. In an organisational context, disclosing financial planning documents or explanations of forthcoming business strategy moves to a competitor can result in a weakened market position, reduced market valuation, and in the worst case, outright business failure.

The potential to cause harm is what drives the need to create mitigations through information protection, and in Microsoft’s perspective on information protection, there are two general classes:

  1. General and generic types of information that are sensitive, and that can be computationally discovered. For example, a credit card number is a credit card number is a credit card number, and if you can work out the identifying characteristics of credit card numbers, you can detect the presence of one or more. Likewise for social security numbers (US), tax numbers (pretty much everywhere), health identification numbers (ditto), and more. Information in this class exists generally, and a specific organisation could (or may have to) protect such information if they collect or handle it.
  2. Specific types of information that could cause harm to a specific business (or government agency, organisation, non-profit, etc.) if these were to fall into the wrong hands. For example, strategy documents, financial plans, employee lists, expansion ideas, current M&A targets, and more. Information in this class exists in customised forms for specific entities, so the definitions need to be set up separately for each business or organisation. There are of course general classes of these types of information across most entities, but the specific realisation of that is up to the specific entity.
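
To make the first class concrete, here is a minimal sketch of how a credit card number can be "computationally discovered": find runs of digits that look like a card number, then keep only those that pass the Luhn checksum used by payment card numbers. The function names and the candidate pattern are illustrative assumptions, not the detection logic of any Microsoft product.

```python
import re

# Candidate pattern (an assumption for this sketch): 13 to 19 digits,
# optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return the candidate strings in `text` that pass the Luhn check."""
    return [m.group() for m in CANDIDATE.finditer(text) if luhn_valid(m.group())]


if __name__ == "__main__":
    sample = "Pay with 4111 1111 1111 1111 today, ref 1234 5678 9012 3456"
    print(find_card_numbers(sample))
```

The checksum step is what separates "any long number" from "something that could really be a card number"; in the sample, only the well-known test number 4111 1111 1111 1111 passes.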

Microsoft deals with the above through two specific products in its information protection solutions portfolio: Office 365 data loss prevention (DLP) and Azure Information Protection (AIP). Both products can work with the generic sensitive information types as well as specific types of information that could cause harm. DLP always works automatically (scanning, analysing, thinking), and AIP can work either by user choice (manual labeling of a document or email) or based on automated content analysis. And if something is found that goes against a policy, an automated action can be triggered – such as a user notification, an alert to an administrator, or a block action that prevents the message or document from being sent / saved / shared.

“Information Protection”

If you thought “collaboration” was a wiggly word with lots of definitions and places it could be used, you should try the phrase “information protection” on for size. Once you start enumerating the types and styles and approaches and consequences and implications and gotchas, you start to build a complex picture of requirements. Which is why Microsoft doesn’t offer an “information protection” product as such, but rather a set of solutions that apply in different situations. I need to get my head around what is actually on offer in Microsoft’s Information Protection Solutions catalog, so let’s have a talk about it. And probably not just today.

The diagram above is a common one used by Microsoft to show the breadth of its solution set. The four blue circles in the middle express the generic commonalities – detect, classify, protect, and monitor. The 11 solutions around the outside are the specific products that are [1] part of the solution set, and [2] in adherence with one or more of the four blue circles.

One immediate conclusion based on the breadth of these capabilities is that information protection is complex. There’s a lot to understand when you are dealing with a product set in Office 365 for productivity and collaboration that is as broad and deep as what Microsoft is attempting. To be the company that helps “everyone to achieve more” – a broad and all-encompassing vision if ever there was one – you have to safeguard and protect the means of achieving as much as providing tools to help with the achieving.

A second observation from the diagram is that not all of these capabilities are in Office 365. Some are: Office 365 Message Encryption, Office 365 Advanced Security Management (now called Office 365 Cloud App Security), and Office 365 DLP are three obvious inclusions. And of the capabilities that are in Office 365, not all are in all plans; essentially, if you want all the Office 365 capabilities, you’ll need to purchase the E5 license. Lower licensing levels have a diminishing number of capabilities. The rest of the capabilities come from the Enterprise Mobility + Security plan – this is where you get the full version of Microsoft Cloud App Security, Conditional Access (from Azure Active Directory), Azure Information Protection, and more. One way of thinking about it is that you buy Office 365 E5 for productivity and collaboration and Enterprise Mobility + Security E5 for safeguarding that productivity and collaboration. It’s not a fully correct differentiation, but it’s a broadly accurate distinction. And if you buy the Microsoft 365 plan, you get both the Office 365 capabilities and Enterprise Mobility + Security capabilities, along with Windows 10.

So what do the above capabilities actually do? Let’s talk about that another day.

Data Breaches – It’s Not Just Hackers

In the General Data Protection Regulation and other data protection regulations around the world, data breaches are a topic of concern. In all cases, the regulators do not want data breaches to happen (because they go against the data protection mandate), and generally speaking, there is a requirement to notify a given authority when a data breach is detected. But despite the general expectation that data breaches are caused by nefarious external agents acting with malicious intent, there are many other types.

Here are some:

  • An employee who accesses personal data records on customers or patients that are outside his or her task domain, or otherwise beyond what they need to access for their job. The ICO in the UK prosecutes people when this happens, such as a hospital worker, a housing worker, and a council worker, among many others profiled on the ICO blog.
  • An organisation that should know better didn’t scrub the metadata on its published research, legal advice and reports, thereby disclosing details of employee names when its policy is to not disclose employee names.
  • An employee leaves a firm and takes details on customers to a competitor or to their own new firm in the same market space. Again, the ICO prosecutes people for breaches of this nature, such as a recruitment consultant who stole the details of 272 individuals.
  • A county council didn’t put appropriate access security on a database containing personal and sensitive information, which meant that members of the public could access the data with a search engine.

Beware your actions.

GDPR: To Whom Does GDPR Apply?

Article 3 of the General Data Protection Regulation (GDPR) states:

Territorial Scope
1. This Regulation applies to the processing of personal data in the context of the activities of an establishment of a controller or a processor in the Union, regardless of whether the processing takes place in the Union or not.

2. This Regulation applies to the processing of personal data of data subjects who are in the Union by a controller or processor not established in the Union, where the processing activities are related to:

(a) the offering of goods or services, irrespective of whether a payment of the data subject is required, to such data subjects in the Union; or

(b) the monitoring of their behaviour as far as their behaviour takes place within the Union.

3. This Regulation applies to the processing of personal data by a controller not established in the Union, but in a place where Member State law applies by virtue of public international law.

The key phrase for applicability is “in the Union,” in the physical sense:
– Any organisation based in the Union must comply with GDPR, for all data processes that make use of personal data (employees, customers, supply chain, etc.).
– Any organisation not based in the Union but which offers goods or services to, or monitors the behaviour of, data subjects in the Union, must also comply with GDPR.

– An organisation outside the Union offering goods or services to, or tracking the behaviour of, an individual who is physically in the Union, must comply.
– An organisation outside the Union offering goods or services to, or tracking the behaviour of, an individual who is not physically in the Union, does not have to comply.
– An organisation outside the Union hiring an individual with EU citizenship for a job role outside of the Union, does not have to comply. GDPR is blind to the idea of “EU citizenship;” it is not on this basis that GDPR applies.
– A visitor to the EU, while in the EU, is afforded the same rights of data protection as anyone who lives in the EU, whether dealing with organisations inside the EU (per Article 3(1)) or those outside offering products and services to individuals in the EU, or tracking the behaviour of individuals physically in the EU (per Article 3(2)).
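
The scenarios above can be expressed as a small decision function. This is a simplified sketch of the reading given in this post, covering only Article 3(1) and 3(2); Article 3(3) is not modelled, and the `Scenario` fields and `gdpr_applies` name are invented for illustration. Note that there is deliberately no citizenship field, because citizenship plays no part in the test.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    org_established_in_union: bool
    subject_physically_in_union: bool
    offers_goods_or_services: bool = False
    monitors_behaviour: bool = False


def gdpr_applies(s: Scenario) -> bool:
    """Simplified territorial-scope test following Article 3(1) and 3(2)."""
    # Article 3(1): an establishment in the Union is enough on its own,
    # regardless of where the processing takes place.
    if s.org_established_in_union:
        return True
    # Article 3(2): outside the Union, the data subject must be in the Union
    # and be offered goods or services, or have their behaviour monitored.
    if s.subject_physically_in_union and (s.offers_goods_or_services or s.monitors_behaviour):
        return True
    return False


if __name__ == "__main__":
    # A non-EU retailer selling to a tourist who is currently in the EU.
    print(gdpr_applies(Scenario(False, True, offers_goods_or_services=True)))
    # The same retailer selling to someone who is not physically in the EU.
    print(gdpr_applies(Scenario(False, False, offers_goods_or_services=True)))
```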

And So It Begins … By Shutting the Gate

That’s one way to deal with the requirements of GDPR (hat tip, BBC):

A number of high-profile US news websites are temporarily unavailable in Europe after new European Union rules on data protection came into effect.

The Chicago Tribune and LA Times were among those posting messages saying they were currently unavailable in most European countries.


Lee Enterprises publishes 46 daily newspapers across 21 states.

Its statement read: “We’re sorry. This site is temporarily unavailable. We recognise you are attempting to access this website from a country belonging to the European Economic Area (EEA) including the EU which enforces the General Data Protection Regulation (GDPR) and therefore cannot grant you access at this time.”

Read more: GDPR: US news sites blocked to EU users over data protection rules

And So It Begins

May 25, 2018. It’s a date that’s always been “coming” and recently “coming quickly.” While all future dates are like that, May 25 this year was particularly interesting because that’s when the new European data protection law, the General Data Protection Regulation (GDPR), switched into enforcement mode. Organisations have had just over two years of grace since GDPR was ratified by the 28 members of the EU (specifically, by being published in the Official Journal, etc.), and now its requirements are supposed to be met by all organisations to whom it applies.

I have just attended a two-day workshop on GDPR in the context of how a tech vendor can help customer organisations become compliant. There was lots of cool technology on show, but answers to the fundamental questions were elusive. For example, to which organisations does it apply (in New Zealand)? To which people does it apply, and when? How do you meet the critical requirement of differentiating personal data according to the legal basis under which it was collected … and how does this flow through into data subjects’ rights to access their data, have it deleted, and more?

Interesting questions. Challenging times. And so the real work now begins … what exactly is expected, what will be taken to court, who will be fined and under what conditions, and more.

Identifying Dark Data in non-OCR’d Images

The new European data protection legislation goes into effect next Friday. It’s going to be incredibly interesting to see how militantly its enforcers play the game.

There are many aspects required in complying with GDPR, but one core tenet is knowing where you are storing personal data. A challenging data type in this respect is images that contain personal data but where the image has not been converted to readable text. One law firm in Sweden is doing something about its storage of such content:

Delphi, one of Sweden’s top commercial law firms, has chosen DocsCorp’s contentCrawler as part of its General Data Protection Regulation (GDPR) compliance strategy. The firm selected the contentCrawler OCR module to help address the “dark data” issue that was discovered after an audit of their file systems.

The audit found that 30% of the documents in the firm’s iManage Document Management System (DMS) were non-searchable. Nearly 70% of these were image-based PDF files, undermining the firm’s ability to manage clients’ personal data and to adequately respond to a Data Subject Access Request (DSAR).

For an organization to comply fully with DSARS or data return, erasure or portability requests, it needs to be able to search its DMS for all relevant documents. In the case of Delphi, it scanned driver licences and passports for identification purposes without OCR’ing the resulting image documents. The firm ended up storing large amounts of personal data that was effectively invisible to search technology, putting the firm at risk of non-compliance.
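
The kind of audit described above can be approximated with a simple heuristic: a document whose text extraction yields almost nothing is probably an un-OCR'd image and therefore invisible to search. The sketch below assumes the text has already been extracted by some other tool; the function names, the sample file names, and the 20-character threshold are all hypothetical, and this is not how contentCrawler works.

```python
def is_dark(extracted_text: str, min_chars: int = 20) -> bool:
    """Treat a document as 'dark' if it has fewer than min_chars letters or digits."""
    meaningful = sum(1 for ch in extracted_text if ch.isalnum())
    return meaningful < min_chars


def dark_documents(text_by_document: dict, min_chars: int = 20) -> list:
    """Return the sorted names of documents that search technology cannot see into."""
    return sorted(
        name for name, text in text_by_document.items() if is_dark(text, min_chars)
    )


if __name__ == "__main__":
    # Hypothetical extraction results: scanned IDs come back empty or nearly so.
    extracted = {
        "engagement-letter.pdf": "This letter confirms the terms of our engagement with the client.",
        "passport-scan.pdf": "",
        "driver-licence-scan.pdf": " \n ",
    }
    print(dark_documents(extracted))
```

Documents flagged this way are the candidates for OCR, after which their contents become searchable.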

Identification of personal data is important. But so is knowing the legal basis under which it was held in the first place. That has tremendous implications for what organisations must do with discrete elements of personal data. Welcome to the new world, now just 8 days away.

Read more: Law firm chooses contentCrawler for GDPR compliance