Putting an End to Malicious Automated Web Scraping on Social Media Platforms

Scraping refers to the use of automated bots to quickly extract data at scale from websites and applications. Automated scraping allows attackers to make money by selling the stolen data to third parties or by exploiting it themselves for criminal activities such as fake account registration, account takeover, fake listings and reviews, and inventory hoarding.

Social media platforms are among the biggest sources of verified data for scrapers, as they have evolved into the preferred destination for interaction, job postings, advertising, and influencing user decisions. The global social media penetration rate today stands at 49%, and there are nearly 3.6 billion active social media users worldwide. This figure is expected to rise to 4.41 billion by 2025.

An average user has accounts on at least three or four social media platforms and voluntarily shares a lot of personal information, including email address, phone number, organization details, and photos. Platforms with billions of active users are, therefore, a magnet for fraudsters. Automated scraping is possibly the easiest way to harvest all this data at scale and speed with minimal investment.


Businesses lose revenues and customer trust

Attackers exploit this stolen data in every possible manner to make money. They can use the data to create synthetic identities, orchestrate phishing scams, and improve their own attack tactics. Or they can sell a business's stolen databases to its competitors or to other third parties. Data is at the core of the commercial viability of any business today. Businesses create products and services based on customer preferences, which they learn by analyzing their customers' data. In effect, data helps businesses generate revenue.

When this data is stolen, exploited, and manipulated, businesses face fraud losses as well as an erosion of customer trust. Therefore, while scraping may look like an innocuous activity, it is the bedrock of larger downstream fraud and numerous criminal activities that hurt businesses and their customers.

Distinguishing malicious traffic from authentic users can be challenging

A case in point is one of our customers—a popular social networking giant with more than 600 million global users. The social networking platform was facing hot pursuit from attackers, who were looking to scrape user information so they could abuse it for financial gain.

The scale of operations and popularity of the social networking platform meant that automated scraping would result in large-scale financial losses and downstream fraud, originating from the stolen data, for authentic users. The social networking platform was facing an uphill task trying to filter out malicious traffic from authentic users, as it sought to ensure continued revenue-generation and protection for its genuine users from downstream fraud.

Automated scraping has evolved into a grave problem today because bots use advanced machine vision technology. They are scripted in such a way that they can circumvent traditional fraud solutions, such as legacy CAPTCHAs, and launch a large-scale attack almost unchallenged. Purely data-driven solutions suffer the drawback of relying on trusted signals from the incoming traffic. With digital identities being mass-manipulated, many signals fall into a gray zone, which leaves fraud teams unsure of the true intent of the incoming traffic. Behavioral biometrics do allow fraud teams to cross-reference behavioral patterns with the user data in their possession. However, since that data may itself be corrupted, the analysis is not completely reliable.


Adopt a future-ready approach to fight automated scraping

To protect both their own interests and those of their customers, businesses need a fresh approach to tackling online abuse. That approach must be future-ready: it should not only fight online abuse today but also prepare businesses for evolving attack tactics, while ensuring a frictionless experience for authentic users.

Our customer, the social networking giant, deployed our solution to detect and filter out risky users with certainty. Our solution goes beyond traditional fraud defense mechanisms, as it uses continuous intelligence and analyzes hundreds of parameters to surface telltale signs of fraud. Instead of blocking risky users outright, it uses targeted friction to pin down malicious users without disrupting the digital journeys of authentic users.

Preserve user experience and revenues

For the social networking platform, the Arkose Labs solution adopted a more nuanced approach to differentiate between automated scraping and authentic users, who are the future revenue-generating customers. The two components of the Arkose Labs Fraud and Abuse Prevention Platform, Arkose Detect and Arkose Enforce, worked in tandem to identify risky users and then present them with targeted friction in the form of adaptive, step-up enforcement challenges.

In most cases, authentic users did not even see a challenge, and those who did cleared it with no difficulty at all. This meant there was no disruption to the user experience for authentic users. Potentially suspicious users, such as those who viewed multiple user profiles in a session without logging in as a recognized user, were presented with an enforcement challenge. Automated scripts and bots trying to clear these context-based challenges at scale failed instantly, as our proprietary challenges are tested and hardened against the most advanced machine vision technology.
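
The example signal mentioned above (many profile views in one session without a login) can be sketched as a simple rule. This is only an illustration of the idea, not the Arkose Labs detection logic, which analyzes hundreds of parameters and is not described in this post. The `Session` class, the `record_profile_view` function, and the threshold of five views are all hypothetical names and values chosen for the sketch.

```python
from dataclasses import dataclass, field

# Assumed threshold, for illustration only; a real system would tune this
# and combine it with many other signals.
MAX_ANONYMOUS_PROFILE_VIEWS = 5


@dataclass
class Session:
    """Hypothetical per-session state; not part of any real product API."""
    logged_in: bool = False
    profiles_viewed: set = field(default_factory=set)


def record_profile_view(session: Session, profile_id: str) -> str:
    """Record a profile view and return "allow" or "challenge".

    Only anonymous sessions that have viewed more distinct profiles than
    the threshold are stepped up to a challenge; everyone else passes
    through without friction.
    """
    session.profiles_viewed.add(profile_id)
    too_many_views = len(session.profiles_viewed) > MAX_ANONYMOUS_PROFILE_VIEWS
    if not session.logged_in and too_many_views:
        return "challenge"
    return "allow"
```

In this sketch, a logged-in user is never challenged by this rule, and an anonymous visitor only sees a challenge from the sixth distinct profile onward, which mirrors the goal of keeping friction away from most authentic users.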

In addition, there was a marked uplift in good user throughput and a remarkable reduction in automated scraping attempts. The Arkose Labs solution not only helped the social media platform save millions of dollars but also protected the integrity of the platform and improved the user experience for authentic users.

At Arkose Labs, we offer a 100% SLA guarantee to all our customers against automated attacks. To learn how we helped the social networking platform root out automated scraping attempts with certainty, read the case study here.
