CAPTCHA stands for ‘Completely Automated Public Turing test to tell Computers and Humans Apart’ and was one of the first methods to detect bots on the internet. Since then the technology has evolved and is typically coupled with a detection layer that looks for anomalies in the request and invokes the CAPTCHA challenge only as needed. As detection accuracy has improved, legitimate users are rarely asked to solve a CAPTCHA challenge.
The high-level architecture
The private access token architecture consists of four key roles:
- The client, which runs a web browser and requests content from an origin
- The origin, which hosts content and can ask a client for a privacy token before providing access to a resource
- The mediator, a service that authenticates the client before requesting tokens from the issuer
- The issuer, which applies the origin policy and issues tokens upon request from the mediator
To put this into context, web browser vendors (Mozilla, Google Chrome, Apple Safari) will play the role of client. As a web security vendor protecting against fraud and abuse, Arkose Labs intends to expand its product offering to become a mediator. CDN vendors (Cloudflare, Fastly) may play the role of issuer. Website owners will take the role of origin.
The PAT protocol defines the interaction between the different entities; however, not all types of requests should be handled with PAT. Some workflows or resources of a website are simply too critical to be gated solely by a blinded token that is not linkable to a specific device. For example, redeeming a privacy token to let users access a product page on an e-commerce site, read an article on a media site, download an ad, or even search for a flight on an airline’s website makes complete sense: it ensures the smoothest user experience and preserves privacy. However, redeeming a token when the user attempts to log in, create a new account, or change any account settings is riskier. The PAT protocol is flexible and lets the website owner define when to request a token. This hybrid approach offers the best compromise between privacy and security, because fraud detection and a possible challenge only happen on a very small portion of the site.
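As a rough illustration of that hybrid approach, the origin's decision can be thought of as a small routing policy. The sketch below is only an assumption about how a site owner might express it: the path prefixes, the `Action` names, and the `choose_action` function are hypothetical and are not part of the PAT protocol or of any Arkose Labs product.

```python
from enum import Enum


class Action(Enum):
    REQUEST_TOKEN = "request_token"      # low-risk content: a privacy token is enough
    FULL_EVALUATION = "full_evaluation"  # critical workflow: run in-depth detection


# Hypothetical path prefixes a site owner might treat as critical.
CRITICAL_PREFIXES = ("/login", "/signup", "/account")


def choose_action(path: str) -> Action:
    """Decide how the origin handles a request for the given path."""
    # str.startswith accepts a tuple and matches any of the prefixes.
    if path.startswith(CRITICAL_PREFIXES):
        return Action.FULL_EVALUATION
    return Action.REQUEST_TOKEN
```

With this policy, a request for `/products/42` would be answered with a token challenge, while `/login` would go through the fuller evaluation.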
Arkose Labs’ customers primarily use our product for critical endpoints, such as login and signup. Once privacy tokens are introduced, Arkose Labs will continue to help detect fraudulent activity when the user starts their interaction with the site, and will additionally interact with token issuers so that the client can obtain tokens and redeem them to access content on the origin web server. We foresee privacy tokens primarily being used to handle content scraping and ad fraud use cases.
The workflow with Arkose Labs
The diagram below represents the interaction between the client, Arkose Labs, and the origin web server:
- The client makes a request to the origin for a protected resource
- If the resource distribution must be protected (product page, search query, ad, etc.), the origin will challenge the client for an access token. If the request is for a critical resource (e.g. login, account creation), the origin may run a more in-depth evaluation using Arkose Labs’ products, Arkose Detect and Arkose Protect
- Assuming the origin challenged the user for an access token, if one is available in the local browser cache, the client will send it to the origin to redeem it in exchange for access to the resource. If no token is available, the client will contact the mediator, Arkose Labs, and the Arkose Detect or Arkose Protect workflow is invoked. During this process, we would take into account many factors beyond simply solving a CAPTCHA to determine if the request should be passed to the issuer
- If no anomalies are detected with the request, the Arkose Labs server will pass the request to the issuer
- The issuer will apply the origin policy and issue a token that will be returned to the mediator
- The mediator will forward the token received from the issuer to the client
- The client will store the token in its local cache and redeem it in exchange for access to the resource. Tokens will be tied to a specific client and may only be valid for the duration of the session
- The origin will verify the token is valid and respond to the client with the requested content. The origin must keep track of the tokens redeemed to avoid replays.
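The steps above can be sketched as a toy, in-process simulation. This is not the real protocol: PAT relies on blinded tokens so that the issuer cannot link a token to a client, whereas the sketch below substitutes a plain HMAC tag purely to keep the example short and runnable. All names (`ISSUER_KEY`, `issue_token`, `mediate`, `Origin`, `Client`) are hypothetical, and the mediator's detection logic is reduced to a boolean flag.

```python
import hashlib
import hmac
import secrets
from typing import Optional

# Assumption: issuer and origin share a verification key (HMAC stand-in,
# NOT the blinded-token scheme the real protocol uses).
ISSUER_KEY = secrets.token_bytes(32)


def _tag(nonce: str) -> str:
    return hmac.new(ISSUER_KEY, nonce.encode(), hashlib.sha256).hexdigest()


def issue_token() -> str:
    """Issuer: mint a token made of a random nonce and its tag."""
    nonce = secrets.token_hex(16)
    return f"{nonce}.{_tag(nonce)}"


def mediate(client_looks_legitimate: bool) -> Optional[str]:
    """Mediator: pass the request to the issuer only if no anomaly is found."""
    if not client_looks_legitimate:
        return None
    return issue_token()


class Origin:
    def __init__(self) -> None:
        self.redeemed = set()  # nonces already spent, to block replays

    def redeem(self, token: str) -> bool:
        nonce, _, tag = token.partition(".")
        # Compare as bytes so malformed input cannot raise inside compare_digest.
        if not hmac.compare_digest(tag.encode(), _tag(nonce).encode()):
            return False  # forged or corrupted token
        if nonce in self.redeemed:
            return False  # replayed token
        self.redeemed.add(nonce)
        return True


class Client:
    def __init__(self, looks_legitimate: bool) -> None:
        self.looks_legitimate = looks_legitimate
        self.cache = []  # local token cache

    def fetch(self, origin: Origin) -> str:
        if not self.cache:  # no cached token: go through the mediator
            token = mediate(self.looks_legitimate)
            if token is None:
                return "denied"
            self.cache.append(token)
        return "content" if origin.redeem(self.cache.pop()) else "denied"
```

In this sketch a client flagged as legitimate receives `"content"`, a flagged client is denied before any token is issued, and presenting the same token to the origin twice fails on the second attempt.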
Food for thought
Brand new protocol: The PAT protocol is brand new and various vendors implement specific roles. It’s great to see early adopters but it may take some time for the implementation to mature. Attackers are experts in finding and exploiting flaws in new products, which may increase the number of security incidents until the technology has reached a certain level of maturity.
Website owners will need to maintain state as the origin: Some logic is required at the origin to keep track of the tokens being redeemed. In the past, website owners have relied on their web security vendors to handle this logic, which can be complex to synchronize across a distributed infrastructure.
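To give a sense of the state involved, here is a minimal single-process sketch of a redeemed-token store. The class name, the TTL parameter, and the in-memory dictionary are all assumptions made for illustration; a distributed origin would need a shared store and a synchronization strategy, which is exactly the complexity described above.

```python
import time
from typing import Optional


class RedeemedTokenStore:
    """Remembers redeemed token IDs until their validity period has passed."""

    def __init__(self, ttl_seconds: float) -> None:
        # Assumption: ttl_seconds is at least the token validity period, so a
        # forgotten token would be rejected as expired anyway.
        self.ttl = ttl_seconds
        self._seen = {}  # token id -> time after which the entry can be dropped

    def mark_redeemed(self, token_id: str, now: Optional[float] = None) -> bool:
        """Return True on first redemption, False if the token is a replay."""
        now = time.monotonic() if now is None else now
        # Drop entries whose retention period has elapsed.
        self._seen = {t: exp for t, exp in self._seen.items() if exp > now}
        if token_id in self._seen:
            return False
        self._seen[token_id] = now + self.ttl
        return True
```

The `now` argument exists only to make the behavior easy to exercise without waiting on a real clock.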
False positives at the mediator level would have a multiplier effect: Let’s face it, despite the industry’s efforts, it’s very hard to achieve 100% accuracy in detection, simply because threat vectors tend to evolve faster than the defense. If a bot is mistakenly judged legitimate and each session is awarded 10 tokens (reducing the cost of the attack by a factor of 10), the multiplying effect could cause significant damage for the website owner, which is one reason we recommend against using PAT for critical endpoints.
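The arithmetic behind the multiplier is simple enough to write down. The function names and the sample figures used with them are illustrative assumptions, not measurements.

```python
def cost_per_request(cost_per_session: float, tokens_per_session: int) -> float:
    """Attacker's cost for each admitted request when one session yields several tokens."""
    return cost_per_session / tokens_per_session


def requests_admitted(sessions_wrongly_approved: int, tokens_per_session: int) -> int:
    """Requests that reach the origin from sessions the mediator wrongly approved."""
    return sessions_wrongly_approved * tokens_per_session
```

For example, if 100 malicious sessions slip past the mediator and each one receives 10 tokens, the origin admits 1,000 unwanted requests, and the attacker's cost per request is one tenth of the cost of getting a single session approved.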
The PAT protocol only includes basic safety: The protocol suggests using rate limiting, tying the token to the IP subnet, and limiting the validity of the token to detect abuses. These unfortunately are known security measures that attackers learned to circumvent years ago and may not be enough to prevent abuse with the tokens, which is a reason why advanced web security products need to collect a variety of data to accurately evaluate the traffic.
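Two of these basic checks, subnet binding and limited validity, might look like the sketch below. The function names, the /24 prefix length (which only makes sense for IPv4), and the 300-second maximum age are assumptions chosen for illustration; rate limiting is left out to keep the example short.

```python
import ipaddress


def same_subnet(issued_ip: str, redeeming_ip: str, prefix_len: int = 24) -> bool:
    """True if the redeeming address falls in the subnet the token was issued to."""
    # strict=False lets us build the network from a host address.
    network = ipaddress.ip_network(f"{issued_ip}/{prefix_len}", strict=False)
    return ipaddress.ip_address(redeeming_ip) in network


def token_acceptable(
    issued_ip: str,
    redeeming_ip: str,
    issued_at: float,
    now: float,
    max_age_seconds: float = 300.0,
) -> bool:
    """Apply the validity-window check and the subnet check together."""
    fresh = 0 <= now - issued_at <= max_age_seconds
    return fresh and same_subnet(issued_ip, redeeming_ip, prefix_len=24)
```

An attacker who controls addresses within the same subnet and redeems quickly passes both checks, which illustrates why these measures alone may not be enough.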
Arkose Labs is fully committed to supporting internet user privacy and the PAT protocol by the IETF provides a framework for achieving this goal. Our support of such features will be designed to ensure adversaries are not given an uneven cost benefit, and the feature would be opt-in for our customers. Expect more announcements from Arkose Labs coming soon on this topic.