Handling Java Web Application Input
15.09.2007
Inadequate data validation is the most common cause of security exploits suffered by web applications today. Many applications are exploited through weak validation because such attacks are simple to carry out. Attackers no longer have to spend vast amounts of time researching ways to circumvent the security infrastructure of an application; freely available tools let them scan for vulnerable websites. Using these findings, an attacker can use a web browser to pass straight through firewall rule sets on port 80 and alter the intended behavior of an application. The problem is more pressing today than ever. A multitude of technologies and frameworks are available, and engineers under increasing pressure to complete work on time rely heavily on them. However, such technology may not handle user input adequately in all cases and, as a result, may introduce unintentional security vulnerabilities. It is therefore of paramount importance that secure coding practices are in place to close any doorway that permits such attacks.
The purpose of this series of articles is to explain common security vulnerabilities associated with application input and to emphasize the importance of handling that input correctly. Although the topics covered are nothing new, they are critical to ensuring the security of an application. This series is aimed at practitioners who plan, design, implement, and maintain software systems and who may be unaware of such issues. In this article, part one in the series, we will look at some validation best practices, along with SQL injection attacks. In later articles, we will look at other common attacks; in particular, part two will deal with cross-site scripting attacks and error-handling techniques.
The Importance Of Server-Side Validation
The most common web application exploits are the result of not validating input effectively. Web applications provide two means for validating user input: client-side validation and server-side validation. Client-side validation is useful in that it enhances the user experience by improving the responsiveness and usability of an application. It is normally implemented in HTML-based user interfaces through a combination of JavaScript and HTML element attributes. However, client-side validation is easily circumvented, is less sophisticated than server-side validation and, if not used in conjunction with server-side validation, can introduce a number of serious vulnerabilities into an application.
For example, consider an online e-commerce application that allows a customer to purchase a number of books. The customer goes through the standard checkout process, entering payment and delivery details. At each step of this process, client-side validation is performed, and the state at each step is stored in HTML hidden fields. Finally, the user is presented with a means to confirm the transaction. On confirmation, the order is wrapped in an HTTP request and sent across the wire to an unsuspecting server. On receipt, the server performs no validation; it simply accepts the data with no questions asked and goes about its business.
Now, if you haven't already spotted the vulnerability, the threat is quite serious and easy to exploit. Before submitting the order, the attacker can view the HTML source, change the price stored unencrypted in HTML hidden field(s), and remove any JavaScript client-side checks by disabling scripting through the browser or deleting the script and saving the modified version of the checkout page locally. The attacker can then load the newly crafted checkout page in a web browser of their choice and complete the checkout process, by submitting the order to the server for processing.
The above example is very simplistic, and you are probably thinking that never in a million years would you allow something similar to manifest itself. The point here is that data input into an application crosses a trust boundary; you have no control over what a user inputs, whether well-meaning or otherwise. Accepting that input unchecked means making an assumption about the integrity and skill of an end user, which is a sure recipe for disaster. For example, consider a simple message board application. Even a well-meaning user can alter page formatting and possibly deface a web page by passing in content containing HTML markup.
As a result, never gamble: always identify where data flows in from an external source and, wherever it does, analyze it carefully. If your findings reveal input being used to generate content, carefully consider how its usage can damage your application, and be as stringent and prudent as possible by employing server-side validation.
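As a minimal sketch of the checkout example above, the server can ignore any price the client sends and recompute the total from its own data. The class name OrderPricing, the item IDs, the prices, and the quantity limit are all hypothetical and chosen only for illustration; a real application would read prices from its database.

```java
import java.math.BigDecimal;
import java.util.Map;

/**
 * Hypothetical sketch: the server never trusts a client-supplied price
 * and instead recomputes the order total from its own catalog.
 */
final class OrderPricing {

    // Assumed server-side catalog; a real application would query a database.
    private static final Map<String, BigDecimal> CATALOG = Map.of(
            "book-101", new BigDecimal("29.99"),
            "book-202", new BigDecimal("45.50"));

    private OrderPricing() { }

    /**
     * Returns the total for the given item and quantity using server-side prices.
     * Throws IllegalArgumentException for unknown items or implausible quantities.
     */
    static BigDecimal totalFor(String itemId, int quantity) {
        // Map.of rejects null keys, so check for null before the lookup
        BigDecimal unitPrice = (itemId == null) ? null : CATALOG.get(itemId);
        if (unitPrice == null) {
            throw new IllegalArgumentException("Unknown item");
        }
        // Assumed business rule: between 1 and 100 copies per order line
        if (quantity < 1 || quantity > 100) {
            throw new IllegalArgumentException("Quantity out of range");
        }
        return unitPrice.multiply(BigDecimal.valueOf(quantity));
    }
}
```

With this approach, a price tampered with in a hidden field has no effect, because only the item ID and quantity are read from the request, and both are checked before use.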
Validation Best Practices
A critical validation practice is to always test for valid data rather than invalid data. The reasoning here is that there will most likely be an invalid case you simply cannot foresee. For example, consider a simple file upload application, which for security reasons rejects all files with an .exe extension. If validation is coded only to reject files with this extension, the application remains open to exploits through other types of malicious files. Alternatively, by explicitly checking for files with extensions considered safe and rejecting everything else, you are putting a more secure policy in place. Moreover, by checking for validity, you reduce the potential for exploits through data masquerading. Data masquerading is the process by which insecure data is represented in a way that makes it look secure. This can be achieved by using features of the host platform, such as passing a file regarded as insecure to a Windows system under a different file name representation (its MS-DOS short name), or by encoding data using a different character set.
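The allow-list approach described above might be sketched as follows. The class name UploadPolicy and the set of permitted extensions are assumptions made for illustration, and an extension check on its own is not a complete defense against malicious uploads.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch of testing for valid data: accept known-safe extensions only. */
final class UploadPolicy {

    // Assumed set of extensions this illustrative application considers safe.
    private static final Set<String> ALLOWED_EXTENSIONS = Set.of("txt", "pdf", "png", "jpg");

    private UploadPolicy() { }

    /** Returns true only if the file name ends in one of the allowed extensions. */
    static boolean isAllowed(String fileName) {
        if (fileName == null) {
            return false;
        }
        // Reject path separators so directory components cannot be smuggled in
        if (fileName.indexOf('/') >= 0 || fileName.indexOf('\\') >= 0) {
            return false;
        }
        int dot = fileName.lastIndexOf('.');
        // Require a non-empty base name and a non-empty extension
        if (dot <= 0 || dot == fileName.length() - 1) {
            return false;
        }
        // Compare case-insensitively, using a fixed locale for predictable results
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return ALLOWED_EXTENSIONS.contains(extension);
    }
}
```

Because anything not explicitly listed is rejected, a file type nobody thought to block (a .bat file, for instance) is refused without any change to the code.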
Now, with the above in mind, let's look at a common exploit realized through database interaction.
SQL Injection Attacks
A common and dangerous exploit suffered by applications is an SQL injection attack. An SQL injection attack exploits an application that accepts input and uses it to dynamically generate an SQL query that is subsequently executed against a database. Let's consider a very simple and contrived example that has a simplistic database schema, as shown in Figure 1.


Figure 2 shows a simple web page that accepts a customer ID and displays the relevant information about a customer. This simple implementation uses a query built by pulling the customer ID from the HttpServletRequest and using it to generate the SQL.
final String custID = httpRequest.getParameter("custID");
final String sql = "Select * From Customer Where CustomerID = '" + custID + "'";
The above code snippet illustrates an important point, in that we are trusting the user to input valid data, and are providing them with the capacity to tailor a query to return data not intended. Consider what would happen if the user entered this:
cust1' or 1=1 --
This input would generate the following SQL:
Select * From Customer Where CustomerID = 'cust1' or 1=1 -- '
which will result in all data held within the Customer table being returned, as shown in Figure 3 below. The -- is a comment operator understood by SQL Server; other vendors may provide a different operator, such as the # provided by MySQL. This operator causes anything appearing in the query after the comment to be ignored.

Figure 3. SQL injection results
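The concatenation described above can be reproduced without a database. The following sketch is deliberately vulnerable and exists only to show how the input becomes part of the SQL text; the class name InjectionDemo and the method buildQuery are hypothetical.

```java
/** Hypothetical, deliberately vulnerable sketch: do not build queries this way. */
final class InjectionDemo {

    private InjectionDemo() { }

    /** Mirrors the string concatenation used by the customer lookup page. */
    static String buildQuery(String custID) {
        // The input is pasted into the SQL text, so quotes in it change the query's structure
        return "Select * From Customer Where CustomerID = '" + custID + "'";
    }
}
```

Calling InjectionDemo.buildQuery("cust1' or 1=1 --") returns Select * From Customer Where CustomerID = 'cust1' or 1=1 --' and the trailing quote is swallowed by the comment, leaving a condition that is true for every row.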
A user could also achieve the same result by using the SQL like operator.
cust1' or CustomerID like '%
Some database systems, such as SQL Server, support executing multiple SQL statements in a single call. For example, entering something along the lines of:
Cust1'; select * from creditcard
Cust1'; drop table creditcard
would cause havoc in an application. An attacker could use this technique to obtain a privileged login to a system, delete pertinent data, or even place an order for goods. Fortunately, this exploit relies on support from both the database vendor and the data access middleware used. Java uses JDBC as its data access technology, and a java.sql.Statement exposes only one java.sql.ResultSet object at a time, which limits what an injected second statement can return to the caller. Do not rely on this as a defense, however: whether several statements are accepted in one call depends on the JDBC driver and its configuration, and it is possible in Java to return multiple result sets, so Java developers beware. This can be done through a custom Statement implementation or, more programmatically, through features of the Statement interface, as shown below.
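final String sql = "Select * From Customer; " +
        "Select * From CreditCard";
final Statement s = con.createStatement();
// execute() returns true when the first result is a ResultSet
boolean isResultSet = s.execute(sql);
while (true) {
    if (isResultSet) {
        ResultSet rs = s.getResultSet();
        // do something with result set
        rs.close();
    } else if (s.getUpdateCount() == -1) {
        // neither a result set nor an update count: no more results, so exit loop
        break;
    } else {
        // do something with the update count, could be an update etc
    }
    // advance to the next result; true means it is a ResultSet
    isResultSet = s.getMoreResults();
}
s.close();
con.close();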
Another form of SQL injection attack is facilitated through the use of SQL union statements.
Cust1' union select Customer_ID, Type, Number, ExpiryDate From CreditCard --
This is perhaps the most difficult of attacks to achieve because it requires an understanding of the underlying database: the number of columns returned, and the data type of each, must match those of the original query, as shown above. However, an attacker usually prevails through trial and error, especially if the application reports database errors verbatim to a patient attacker. Figure 4 shows what could happen.

SQL Injection Preventive Measures
The first step to preventing SQL injection attacks is to take advantage of the features provided by your database system. The most important of these is: never establish a database connection using an administrator account. Instead, provide role-based logins with the minimum privileges required to carry out a particular task. This denies an attacker the opportunity to perform additional operations, because attempting them will result in an error. For example, if a task requires read access, never give write access; if a task requires read access to a particular table, only permit access to that table and no other.
Moreover, always consider where you store configuration information, especially in an external shared hosting environment. For instance, if your application stores SQL statements or database connection settings in a properties or XML file for flexibility, it will be more secure to move them into a Java class, applying techniques to protect against decompilation (which is the process of reconstructing Java source code from a compiled class file using a decompiler such as DJ Decompiler). Alternatively, if possible, use stored procedures and encrypt them (if supported by your database). Using these techniques will help to protect your database configuration data from prying eyes, and the possibility of your application being circumvented through the modification of such data.
The most critical step in preventing attacks is to employ server-side validation on any input data used as part of a database call. Validation code should test for validity, ensuring that input data is of the expected type and length, and rejecting any invalid data. To guard further against attack, prefer java.sql.PreparedStatement over java.sql.Statement, and use java.sql.CallableStatement when calling stored procedures.
final String sql = "Select * from Customer where CustomerID =?";
final PreparedStatement ps = con.prepareStatement(sql);
ps.setString(1,customerID);
Using a java.sql.PreparedStatement as shown above provides strong type checking through parameterized input via the setXXX() methods. The SQL above now contains a placeholder for such input, which removes the need to handle different types of input differently during query construction. This ensures that any data passed is of the correct type, which limits what an attacker can do. For instance, an attacker can no longer pass character data where a numeric value is expected, as this will result in an error. Where character data is expected, the attacker can no longer subvert a query by passing additional SQL commands. For example, consider Figure 3 again, where an attacker passes cust1' or 1=1 -- as input into the query below:
final String custID = httpRequest.getParameter("custID");
final String sql = "Select * From Customer Where CustomerID = '" + custID + "'";
to produce the result shown below:
final String sql = "Select * From Customer Where CustomerID = 'cust1' or 1=1 --'";
By using a java.sql.PreparedStatement, an attacker can no longer manipulate the query to achieve the same result as before. This is because the input is bound as a single benign string value and is not interpreted as a behavior-altering unit of instruction. The effect is equivalent to the following query, in which the embedded single quote is escaped:
Select * From Customer Where CustomerID = 'cust1'' or 1=1 --'
The input cust1' or 1=1 -- is now treated as the search criterion used to match the CustomerID field. Since no customer has an ID of this form, no results will be returned. A further advantage of this approach is that it eliminates the need for the messy escaping code that would be required if a java.sql.PreparedStatement were not used. Such code would include a routine to replace any single quote character sent as part of the input string with two single quotes or with a space character. Moreover, enforcing the use of java.sql.PreparedStatement and java.sql.CallableStatement as a coding standard for all database access reduces the burden placed on a developer who may not be aware of such routines when new functionality is added, and who might otherwise introduce unintentional security holes.
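The two measures discussed in this section, checking that input has the expected type and length and binding it through a java.sql.PreparedStatement, might be combined as in the sketch below. The class name CustomerLookup and the assumed ID format (1 to 20 letters or digits) are hypothetical; the table and column names follow the examples above.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.regex.Pattern;

/** Hypothetical sketch combining allow-list validation with a parameterized query. */
final class CustomerLookup {

    // Assumed ID format: 1 to 20 letters or digits. Adjust to the real schema.
    private static final Pattern CUSTOMER_ID = Pattern.compile("[A-Za-z0-9]{1,20}");

    private CustomerLookup() { }

    /** Tests for validity: only input matching the expected format is accepted. */
    static boolean isValidCustomerId(String input) {
        return input != null && CUSTOMER_ID.matcher(input).matches();
    }

    /** Returns true if a customer with the given ID exists. */
    static boolean customerExists(Connection con, String customerId) throws SQLException {
        if (!isValidCustomerId(customerId)) {
            throw new IllegalArgumentException("Invalid customer ID");
        }
        final String sql = "Select 1 From Customer Where CustomerID = ?";
        // try-with-resources closes the statement and result set even on failure
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            // The value is bound as data and never becomes part of the SQL text
            ps.setString(1, customerId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next();
            }
        }
    }
}
```

Here the input cust1' or 1=1 -- is rejected by isValidCustomerId before any database call is made, and even if the format check were loosened, the bound parameter would still be compared as a plain string.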
Conclusion
In this article, I talked about some validation best practices when dealing with input from external sources. I also talked about the threat posed through SQL injection attacks and ways to prevent such an attack. In part two of this series, we will look at the very real and dangerous threat of cross-site scripting and some error-handling best practices in J2EE web applications.
Resources
- "Top Ten Security Vulnerabilities" from the Open Web Application Security Project
- Writing Secure Code: A good resource on secure coding techniques
