Manta Beats Bots with Automated Protection
As a destination site where about 30 million small and midsize companies market themselves to each other and consumers, Manta is attractive to another form of traffic: web scrapers that use automated bots to steal content and siphon valuable IT resources away from business-facing tasks.
The directory site and its technology team have evolved over its 10-year history. Manta fully embraced DevOps in 2013 after undergoing leadership changes and a business reorganization that focused on enhancing consumers' ability to locate local service providers, said Russell Garrison, datacenter coordinator in the DevOps group at Manta, in an interview.
Previously, Manta was primarily a B2B site where SMBs acted as a referral engine, a service that continues. Since half the site's traffic already consisted of consumers, Manta decided to leverage that group and enhance its offerings with both consumers and SMBs in mind, Garrison said. But these changes demanded new IT processes, he said.
"As far as agile, DevOps, and continuously releasing our site, that whole delivery process happened after we made some changes a few years ago to leadership and when we decided to invest strategically in processes and technology, in 2013," said Garrison.
Manta reworked its internal product development processes, eliminating long-term planning and rewriting technology employees' roles. It also created a four-person DevOps team that includes Garrison, who previously coordinated Manta's datacenter, he said.
The company stopped running its legacy operations in a datacenter, transitioning to an Amazon cloud facility, said Garrison. But Manta increasingly faced an expensive security problem that jeopardized its business and tied up invaluable IT resources.
Bots, which repeatedly perform automated tasks over the web, represented 56 percent of all web traffic last year, according to Incapsula's 2014 Bot Traffic Report; bad bots accounted for 29 percent of all traffic, the study found. That 29 percent breaks down into impersonators (22 percent), hackers (3.5 percent), scrapers (3 percent), and spammers (0.5 percent), Incapsula found. Smaller sites attract a higher share of bots: bots accounted for 80.5 percent of traffic at sites with 1,000 visits per day, versus about 56.2 percent at large sites with 10,000 daily visits, the report said.
"We have been conducting this study since 2012, and one constant in our findings is that malicious bots are becoming increasingly sophisticated and harder to distinguish from humans. These bots pose a huge threat to websites and are capable of large-scale hack attacks, DDoS floods, spam schemes and click fraud campaigns," said Marc Gaffan, CEO of Incapsula. "With the vulnerabilities exposed in the past year, notably Shellshock, it is more important than ever that companies operating websites are diligent in securing their sites from malicious traffic."
Despite the prevalence of bots, more than 30 percent of the top 50 media websites could not detect and block the most basic bot and more than one-tenth of the top 500 Internet retailers could not detect the most basic bot, according to the Online Trust Audit and Honor Roll Report. While 44 percent of those companies audited passed, 46 percent failed in one or more categories, said Distil Networks, which partnered with OTAH.
Time to Act
Previously, Manta attempted to address bots with a drop-in application and a homegrown system, neither of which resolved the problem, Garrison said. For one thing, Manta cannot fully exclude bots: It needs to work with the legitimate crawlers that search engines use to seek out and share its content.
"We get crawled very heavily by Google, Bing, and Yahoo, so we need to feed legitimate crawlers to maximize return and keep the site up too, so that can happen," said Garrison.
Manta next tried using an appliance to solve its bot problems, but the system did not integrate well with the site's infrastructure, he said. Going back to the drawing board, Manta experimented with other software approaches, each time running into problems, until it heard about Distil Networks, said Garrison.
The investment has paid off multiple times in several ways, he said. "We enjoy much easier and more cost-effective protection of the site and there's just recently been the case of a very, very determined adversary using massively distributed denial of service to attack the site. It presented some challenges, but Distil took almost the majority of the traffic and effectively detected it and blocked it," he said. "Without Distil, even without the microservices, it would have been a nightmare."
In addition to keeping the site operational during attacks, automatically preventing bot attacks saves Manta money, said Garrison.
"We do not have to scale up to handle [bots'] additional traffic," he said. "There are a couple of services that are choke points, especially if someone is doing something to pretend they're a Google bot or Bing bot. It's freeing up development time, lowering infrastructure costs, and giving us peace of mind."
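One common way to tell a real search-engine crawler from an impersonator is a reverse-then-forward DNS check, which Google and Bing both document for their own crawlers. The sketch below illustrates that general technique only; the article does not describe how Distil or Manta actually perform this check. The function names and the list of trusted hostname suffixes are assumptions for illustration and should be confirmed against each search engine's current documentation.

```python
import socket
from typing import Callable, List, Tuple

# Assumed hostname suffixes for genuine crawlers (illustrative list only;
# verify against each search engine's published guidance before relying on it).
TRUSTED_SUFFIXES: Tuple[str, ...] = (
    ".googlebot.com",
    ".google.com",
    ".search.msn.com",
)


def reverse_lookup(ip: str) -> str:
    """Return the PTR hostname for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> List[str]:
    """Return the IPv4 addresses a hostname resolves to."""
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_crawler(
    ip: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
    suffixes: Tuple[str, ...] = TRUSTED_SUFFIXES,
) -> bool:
    """Check a client IP that claims to be a search-engine crawler.

    1. Reverse-resolve the IP to a hostname.
    2. Require the hostname to end in a trusted crawler domain.
    3. Forward-resolve that hostname and require it to map back to the IP.
    """
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:
        return False  # no PTR record: cannot be verified
    if not host.endswith(suffixes):
        return False  # hostname is outside the trusted crawler domains
    try:
        return ip in forward(host)
    except OSError:
        return False  # hostname does not resolve: treat as unverified
```

A request whose User-Agent claims to be Googlebot but whose IP fails this check can be handled like any other unverified bot. The resolver functions are passed in as parameters so the logic can be exercised without live DNS lookups.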
Because Distil includes static content delivery network (CDN) services, Manta no longer pays about $2,000 per month for Amazon CloudFront, added Garrison. "Anybody with a significant amount of volume needs a CDN," he said. "There is a bit of an arms race, and essentially the way I look at it is we're doing a risk management/cost-benefit scenario."
Manta monitors site visitors through Distil's web console, which is available to many of the company's departments, including business development, sales, and marketing, for insight into areas such as traffic patterns, said Garrison. At the first sign of any inconsistencies or suspicious activity, Garrison's team notifies Distil so its data scientists can further investigate the source, he said.
"It's a very popular destination because of the amount of directory information we have," said Garrison. "At one point we saw about five-times the traffic we normally see because they were getting so aggressive, trying to grab the site. We were able to take immediate action and within the next few hours we pushed them back. We won the war."
Automatically preventing bot attacks frees up IT resources to continue focusing on business tasks.