Search⌘ K
AI Features

How SQL Injection Works

Explore the mechanics of SQL injection attacks, analyzing how malicious inputs break out of SQL templates to alter queries. Understand the risks of inadequate defenses such as client-side blocking or simple input filtering, and gain insight into why robust server-side protections are essential.

Our code works under ideal inputs, but does it stand up to malicious use? The wildcard parameter to generateWildcardSQLForJournalEntrySearch is controlled by the attacker.

How much influence can the attacker have over the generated SQL by just controlling the wildcard parameter? Just like the knock-knock joke from the beginning of this chapter, this SQL statement was written with a mental model of a template where user input fits into one part and stays in its place to create a full statement. Can the attacker-controlled input break out of that template and alter the structure of the overall statement? What keeps the attacker-controlled wild card in its part of the statement? The answer is the percent signs. What would happen if the attacker-controlled wildcard contained a percent sign?

Calling this:

generateWildcardSQLForJournalEntrySearch(1, "lindy hop%");

will generate this response:

MySQL
SELECT CreatedTimestamp, Body
FROM journal_entries
WHERE PersonId = 1 AND Body LIKE '%lindy hop%%';

This is valid SQL, but it looks kind of odd. That double % at the end looks funny. More importantly for the themes of this course, it shows us how the attacker can start to break out of the template. What if the attacker searched for something weird like this?

can't use a contraction

This would result in the server calling our helper function:

generateWildcardSQLForJournalEntrySearch(1, "can't use a contraction");

This will generate the following SQL:

MySQL
SELECT CreatedTimestamp, Body
FROM journal_entries
WHERE PersonId = 1 AND Body LIKE '%can't use a contraction%';

That’s a different kind of SQL statement than we’ve seen before. The database throws an error when we try to execute this statement.

While the previous statements fit into a pattern that the developer envisioned for user input, this one breaks out of the pattern and the database can’t figure out what to do with it.

Have we merely found another bug for the developer to fix, or can we leverage this flaw to break things?

How about searching for something like this?

lindy hop%' OR 1=1;--

This would result in the server calling our helper function as follows:

Java
class Main {
public static String generateWildcardSQLForJournalEntrySearch(int personId, String wildcard) {
String prefix =
"SELECT CreatedTimestamp, Body from journal_entries WHERE PersonId = ";
String populated =
prefix +
personId +
" AND Body LIKE '%" +
wildcard + "%';";
return populated;
}
public static void main( String args[] ) {
System.out.println( generateWildcardSQLForJournalEntrySearch(1, "lindy hop%' OR 1=1;-- ") );
}
}

Running the generated SQL:

MySQL
SELECT CreatedTimestamp, Body
FROM journal_entries
WHERE PersonId = 1 AND Body LIKE '%lindy hop%' OR 1=1;-- %';

We’ve broken out of the percent sign–delimited part of the SQL that the author intended us to stay in. After that, we can add any SQL we want. In this case, we’ve added an OR clause to the SQL statement. We also added a comment. In SQL -- is the start of a comment that lasts until the end of the line. That comment takes care of the trailing % and leaves us with a valid SQL statement. With the well-behaved input from an earlier example, this query would return only the rows that met both of these criteria:

  1. PersonId matched
  2. Body LIKE '%lindy hop%' (That is, Body contained “lindy hop”)

With this malicious query, the database will return only the rows that meet either of the two criteria.

  1. PersonId matched AND Body contained “lindy hop”
  2. 1=1 This is always true. It doesn’t even depend on the values in the database. The value 1 is always equal to 1.

Since the second criterium is always true no matter what rows are in the table, every row is returned, no matter which person wrote the journal entry in question.

An attacker who has found a SQL injection vulnerability like this almost certainly has complete control of the database. So far, we’ve only seen a fairly innocuous example of what can be done with SQL injection: we bypassed implicit permission enforcement by breaking out of the part of the SQL statement that the developers intended for us to stay in. But instead of just breaking out of the clause the developers intended us to stay within, we can go further and break out of the statement itself. Instead of just adding to the WHERE clause, the attacker could terminate the SELECT statement, append a semicolon, and start a new statement. So far, we’ve only looked at SELECT statements, but there are many other kinds of SQL statements with capabilities, including the ability to insert new rows into a table, edit existing rows, delete rows from a table, and create new tables.

This is a disaster. How do we stop this? Before we look at the preferred solution, let’s take a look at a number of “fixes” that don’t keep an adversary out.

Proposed fixes and why they don’t always work

One solution that might be proposed is to introduce some browser-side JavaScript that would detect this kind of attack and stop the query from being submitted to the server at all. This is not a useful defense. JavaScript can be disabled in a browser. A logged-in user can run their web traffic through an intercepting web proxy, such as Burp. A proxy like Burp lets a user make arbitrary changes to the underlying HTTP requests their browser makes or even construct new HTTP requests altogether.

Additionally, things other than web browsers can make web requests. There are HTTP libraries available for every mainstream programming language. There are command-line tools like curl and HTTPie. These libraries and command-line tools can be used to make arbitrary HTTP requests that would bypass any JavaScript-based defences.

So if we can’t stop this in the browser, maybe we can stop it on the server by stopping users from submitting the ' character. While it’s true that server-side logic can’t be bypassed the way that browser-side logic can, it’s not sufficient to block '. Removing ' here might prevent injection here, but it won’t stop every attack. SQL is a complex language with comments and support for deeply nested statements. SQL and user input can be designed to work together in many ways, so there are many ways malicious input could sneak in. Any attempt to find them all is likely to miss some. Even if you could find them all today, tomorrow’s development efforts may introduce new interactions with new attack surfaces.

Finally, sometimes people legitimately want to use contractions, refer people with apostrophes in their last names or discuss SQL injection attacks. Removing all apostrophes would hinder those conversations.

                                                 Q U I Z  

Technical Quiz
1.

Why doesn’t using JavaScript to detect malicious user input and stop it at the browser mitigate potential SQL injection attacks?

A.

JavaScript can be disabled in the browser.

B.

Proxies like Burp can be used to craft http requests.

C.

Not all traffic comes from web browsers.

D.

All of the above.


1 / 2

In the next lesson, we’ll study some ways to prevent an SQL injection.