How health websites share your data with Facebook and Google
The FT's European technology correspondent Madhumita Murgia and senior newsroom developer Max Harlow look at how the WebMD symptom checker passes your information to dozens of companies and third parties
Produced by Joe Sinclair; filmed by Joe Sinclair and Petros Gioumpasis
Transcript
Every day thousands of people look at health websites to find out information about their health or put in their symptoms to figure out if they can get a diagnosis. So we decided to look behind the scenes and look at really what happens to the data that we put into health websites. So Max here is going to take us through one specific website that we decided to analyse and show us really what's going on: the hidden tracking that happens behind the scenes.
We're going to be going to the WebMD Symptom Checker. What might not be clear when we're opening up a web page is that there is a lot of stuff happening behind the scenes. Though normally this is hidden from us, we've got a tool here called HTTP Toolkit that will let us see exactly what's going on.
So here, really, we should be able to see a list of companies or trackers that are looking at what we're doing on this website, right?
Yes. What we're interested in is third parties. Domain names, websites potentially owned by other companies, that are being communicated with by this page. These are the requests that our computer made when we went to this URL and pressed Enter.
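To make the idea of a "third party" concrete, here is a minimal Python sketch of how a list of captured request URLs could be filtered down to hosts that belong to a different site than the page itself. It is not how HTTP Toolkit works internally. The `site_of` helper, the "last two labels" rule, and every URL in the example capture are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def site_of(host: str) -> str:
    """Crude 'site' for a hostname: its last two labels.

    Assumption: good enough for .com-style names. A real tool would
    consult the Public Suffix List to handle suffixes such as .co.uk.
    """
    return ".".join(host.lower().split(".")[-2:])


def third_party_hosts(page_url: str, request_urls: list) -> list:
    """Return the sorted hosts that belong to a different site than the page."""
    first_party = site_of(urlsplit(page_url).hostname or "")
    hosts = set()
    for url in request_urls:
        host = urlsplit(url).hostname
        if host and site_of(host) != first_party:
            hosts.add(host)
    return sorted(hosts)


# Hypothetical capture, for illustration only (paths are made up).
page = "https://symptoms.webmd.com/"
requests_seen = [
    "https://symptoms.webmd.com/app.js",
    "https://img.webmd.com/logo.png",
    "https://www.facebook.com/example-path",
    "https://ib.adnxs.com/example-path",
]

print(third_party_hosts(page, requests_seen))
```

In this made-up capture the two webmd.com hosts count as first party, so only the facebook.com and adnxs.com hosts are reported.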
And down here we've got this bit of information. It says set-cookie. And we've got this thing. It says VisitorId=, and then we've got a long string of numbers and letters.
Right.
We can look at what other requests this page has made, which has also included this specific ID.
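The check described here, taking the ID from the set-cookie header and looking for it in other requests, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cookie value, the `tracker.example` and `cdn.example` URLs, and both helper names are hypothetical, and the header parsing is deliberately simplified to the first `name=value` pair.

```python
from typing import List, Optional


def cookie_value(set_cookie: str, name: str = "VisitorId") -> Optional[str]:
    """Return the value of the named cookie from one Set-Cookie header.

    Simplified: only the first name=value pair is read; attributes such
    as Path or Expires that follow the first ';' are ignored.
    """
    first_pair = set_cookie.split(";", 1)[0]
    key, _, value = first_pair.partition("=")
    return value.strip() if key.strip() == name else None


def requests_carrying(identifier: str, request_urls: List[str]) -> List[str]:
    """Return the request URLs that contain the identifier anywhere."""
    return [url for url in request_urls if identifier in url]


# Hypothetical header and capture, for illustration only.
header = "VisitorId=abc123def456; Path=/"
captured = [
    "https://tracker.example/sync?uid=abc123def456",
    "https://cdn.example/style.css",
]

visitor_id = cookie_value(header)
if visitor_id is not None:
    for url in requests_carrying(visitor_id, captured):
        print(url)
```

A fuller version would also search request headers and bodies, since an identifier does not have to travel in the URL.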
What's a request, Max?
So this is just communicating with another website, normally fetching a bit of information. So fetching a whole web page or fetching an image or fetching a script, a bit of code that's then going to be used on the page.
And the thing that's interesting here is, this is even before we've clicked "I agree" to their privacy policy, right? So at this point, we haven't agreed to being tracked.
We haven't agreed to anything.
So these companies that are getting the ID, what does it mean? What can they do with that ID?
Well, they can associate that ID with this web address. But also, we're about to start entering some personal information. And if there were further requests sending that information to them, they'll have a personal identifier, and then they'll also have some protected category data about us.
MADHUMITA MURGIA: OK, so shall we go in and try and put in some medical symptoms and see who ends up getting a peek at them?
So let's agree first.
Should we be 31?
Are we going to be male or female?
Male.
OK. So back on this list. If I click here, we can see what happens when I press that Male button. So it went to facebook.com.
So as soon as we clicked Male that information went to Facebook and a bunch of other people.
So what's our main symptom?
What do we want to start with?
So let's start with some diarrhoea.
OK, pleasant.
OK, so if I search for diarrhoea, we can see another very similar request going out.
A fever?
Yeah, let's have a fever.
There we are again.
And there we are again. If we look in here we can see other bits of information. So we've clicked a Common Symptom button, one of these buttons down here. And the button text is "fever."
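As a rough illustration of how details like the button type and button text can be read out of a single tracking request, the Python sketch below decodes the query string of a URL. The `tracker.example` host and the parameter names `event`, `buttonType` and `buttonText` are hypothetical stand-ins, not the actual fields seen in the video.

```python
from urllib.parse import parse_qs, urlsplit


def event_fields(request_url: str) -> dict:
    """Decode a request URL's query string into a flat dict.

    If a parameter is repeated, only its first value is kept.
    """
    query = urlsplit(request_url).query
    return {key: values[0] for key, values in parse_qs(query).items()}


# Hypothetical tracking request, for illustration only.
url = (
    "https://tracker.example/collect"
    "?event=ButtonClick&buttonType=Common+Symptom&buttonText=fever"
)

fields = event_fields(url)
print(fields["buttonType"], "/", fields["buttonText"])
```

Nothing here is encrypted or obscured beyond ordinary URL encoding, which is why the clicked symptom is readable to whoever receives the request.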
Throat irritation.
Yes.
There it is.
There's our throat irritation.
So they think it's most likely we've got ulcerative colitis or the flu or we've taken a drug overdose.
That's Facebook.
So Facebook has received all of the buttons that we've clicked, including our symptoms, even our diagnosis. And we don't know what they're doing with it. But I think it's really interesting because ultimately, they're an advertising company. So if they're using this to build profiles about us associated with one of the IDs, then it just means that people can target us or, rather, other advertising companies can target us based on what they think we have.
Yes, this is all personal health information, information that's meant to be some of the most protected by law.
And certainly something you don't want Facebook to know about you.
Possibly not. So let's have a look at some of the third parties. Now, I was expecting this to all be the big tech companies. But there are so many other companies which really aren't household names - lijit.com, adnxs.
They're quite a big ad tech company, as well, aren't they?
And you would never realise from going to this quite simple page with two adverts just how many companies would be receiving data about you.
MADHUMITA MURGIA: And what's interesting, right, is that we don't know what happens after this moment. So they could be selling on this information or sharing it with a whole bunch of other companies who then sell it on and on and on. So you know, we have no idea really whose hands it ends up in. And we have no choice but to trust them.
So what I think is interesting is, a lot of that happened before we clicked Accept at all. It wasn't linked to the privacy policy there, though.
Every time you place any of these tracking cookies onto a website you have to ask the user whether they consent to that. So when I went through the policy, it kind of says that they share data with third parties, they list third parties. So what they said is: we share some data with Google and Facebook.
If you want to know what they do with it, look at their privacy policies. And you know, that's very much a grey area because according to the law, consent is supposed to be unambiguous. And I wouldn't say that was unambiguous at all because I'm still left with a million questions about where my data is going and what's happening with it.
With protected category data, they're meant to require explicit consent. But this didn't seem to be particularly special, did it?
You know, we would never dream that this data was being shared in the way that it was. And the privacy policy really doesn't give us an idea of the extent of this, does it?