Who has your data? Researchers scrutinize apps for undisclosed ties to advertisers, analytics companies

Nearly 60 per cent of apps collected more information than declared in their privacy policies, according to a recent study that compared the stated practices of hundreds of apps with how they actually behaved.

A study looked at hundreds of apps' privacy policies — then compared them to the data actually collected

If you want to better understand how an app or a service plans to use your personal information, its privacy policy is often a good place to start. But a recent study found there can be a gap between what's described in that privacy policy, and what the app actually collects and shares.

An analysis by University of Toronto researchers found hundreds of Android apps that disclosed the collection of personal information for the app developer's own purposes — but, at the same time, didn't disclose the presence of third-party advertising or analytics services that were collecting the personal information, too.

"This is one of the ways in which you're getting tracked through your use of apps," said Lisa Austin, a law professor and one of the study's co-authors.

To generate revenue, app developers often embed software code, known as ad libraries, allowing them to display ads within their app. Because they want to make the ads relevant to individual users, ad libraries often want specific information about those users.

For those who may be more familiar with the cookies that track your online browsing habits, Austin says that on mobile devices, "you're being tracked through these ad libraries and these analytics libraries in a very similar way."

The researchers have been working on a software project called AppTrans, with the goal of making undisclosed data collection practices more transparent. The software looks for evidence of data collection that isn't spelled out in a privacy policy by comparing the policy's language with an analysis of the app's code.

It does this, in part, using machine learning — artificial intelligence — to automatically scour privacy policies for language that points to the collection of location data, contact information or unique device identifiers. Such data can be useful for targeted advertising or be used to build profiles of user behaviour.
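
As a rough illustration of the comparison described above, the following Python sketch uses simple keyword matching in place of a machine-learning model, and a hard-coded set in place of real code analysis. Every name, keyword list and sample policy sentence here is hypothetical and is not taken from AppTrans itself.

```python
import re

# Hypothetical keyword patterns per data category; a stand-in for the
# machine-learning classifier described in the article.
POLICY_PATTERNS = {
    "location": r"\b(location|gps|geolocation)\b",
    "contacts": r"\b(contacts?|address book)\b",
    "device_id": r"\b(device identifiers?|advertising id|imei)\b",
}

def disclosed_categories(policy_text):
    """Return the data categories the policy text appears to mention."""
    text = policy_text.lower()
    return {name for name, pattern in POLICY_PATTERNS.items()
            if re.search(pattern, text)}

def undisclosed_collection(policy_text, observed):
    """Return categories seen in the app's code but absent from the policy."""
    return set(observed) - disclosed_categories(policy_text)

policy = "We use your location to show local forecasts."
observed_in_code = {"location", "device_id"}  # assumed output of code analysis
print(sorted(undisclosed_collection(policy, observed_in_code)))
```

In this made-up case the policy mentions location but not device identifiers, so the sketch flags `device_id` as collected but undisclosed.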

And handing this data to a third party is also a way for developers to monetize free apps.

Of the 757 apps analyzed, the researchers found nearly 60 per cent of apps collected more information than stated in their privacy policies.

Austin called the finding "eye-opening."

"It was so bad," said David Lie, a computer science and engineering professor and another of the study's co-authors.

The team's findings were published in June, and are based partly on work done by one of Lie's graduate students, Peter Yi Ping Sun. The project was funded by the Office of the Privacy Commissioner of Canada.

All or nothing

Under Canada's privacy laws, developers should have to disclose both the information they collect themselves and information collected by third-party services embedded in the app's code.

"You can't have informed consent if you don't know that your information is being collected by these third parties," said Austin.

Part of the problem is that while apps typically have to ask the user for permission before accessing sensitive data, such as location or a person's contact list, granting access is all or nothing.

If you give a weather app access to your location for a more accurate forecast, for example, a third-party advertising service embedded in the app could access it, too. It would be up to the app developer to make that possibility clear.
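
The all-or-nothing behaviour can be pictured with a toy model in which a permission is granted to the app as a whole, so any code running inside it, including an embedded ad library, passes the same check. This is a simplified illustration and not Android's actual API; the class, method and library names are invented.

```python
class App:
    """Toy model: permissions belong to the app, not to individual libraries."""

    def __init__(self, name, embedded_libraries):
        self.name = name
        self.embedded_libraries = list(embedded_libraries)
        self.granted = set()

    def grant(self, permission):
        # The user grants the permission once, for the whole app.
        self.granted.add(permission)

    def can_access(self, component, permission):
        # The check does not distinguish between the app's own code and
        # an embedded library: both get the same answer.
        known = component == self.name or component in self.embedded_libraries
        return known and permission in self.granted

weather = App("weather_app", ["hypothetical_ad_library"])
weather.grant("location")
print(weather.can_access("weather_app", "location"))
print(weather.can_access("hypothetical_ad_library", "location"))
print(weather.can_access("hypothetical_ad_library", "contacts"))
```

Once location is granted, the weather app and the ad library can both read it, while contacts stay off limits to both because that permission was never granted.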

The U.S. Federal Trade Commission has warned app developers that they must clearly explain to users how they plan to share personal information with ad libraries and seek consent before doing so — or face potential legal repercussions.

Google similarly expects developers to disclose any data shared with third parties in an app's privacy policy, including marketing partners or service providers — as do many third-party libraries themselves.

So why aren't developers doing it in practice? Lie wondered that, too.

The researchers knew it was unlikely that most developers were intentionally lying to their users, he said. Instead, they concluded that app developers are likely just as bad at reading the privacy policies of the third-party libraries they embed as users are at reading the policies of the apps they install.

"We can surmise that, similar to how end users often do not read the privacy policies of the applications they use, application developers do not read or properly incorporate the privacy policies of third-party libraries and tools they use to build their applications," they wrote in the study.

Re-imagining the privacy policy

The researchers see automated solutions, such as AppTrans, as one way to help regulators better scrutinize the deluge of apps built and added to mobile-app stores each year.

They also imagine their project could eventually help developers analyze their own apps for non-compliance.

For users, the hope is that software like AppTrans could shed more light on how their personal information is collected and shared.

Lie says the team is already at work on a second, more polished version of AppTrans that would recognize when an app is attempting to access sensitive personal information — a person's contact list, for example — and then display the relevant part of the app's privacy policy that explains the reason why.

He described it as part of a broader effort to reimagine the privacy policy as a more dynamic tool for transparency, rather than a static document that no one reads again after they've installed an app.

For Austin, it's also an example of how artificial intelligence can be used to society's benefit, in spite of very legitimate concerns about algorithmic bias and automated decision-making run amok.

"This is technology that we can use to make the digital world more transparent," she said. "And that's a real win."