Pattern Matching
Explore how to perform pattern matching in SQL using the LIKE operator and wildcards to search and filter data with partial matches. Understand how to combine LIKE with logical operators, use NOT LIKE for exclusions, and apply best practices to handle real-world text data effectively.
Imagine we need to find all products that have a certain word in their name from the online store’s database. For instance, we might want to identify all products whose names contain “book” so that we can run a promotional discount on them. Searching by exact matches might not be enough if we need to handle variations in spelling, partial terms, or different letter cases. This is where pattern matching becomes valuable.
Pattern matching in SQL is a powerful tool that allows us to find specific data within a table that follows a defined pattern. This is particularly useful when searching for records with similar attributes, such as names starting with a certain letter or products containing specific keywords.
Pattern matching allows us to retrieve rows that match partial strings instead of requiring exact matches. This is important whenever we deal with data that might not follow a strict or uniform naming convention.
Let's explore the concept of pattern matching in SQL. We'll go step by step to:
Understand the purpose and importance of pattern matching in SQL.
Learn how to perform pattern matching with the
LIKEoperator.Learn how to use wildcards for pattern matching.
Using the LIKE operator
When working with user-generated content or when product names and descriptions do not follow a strict pattern, we cannot rely solely on equality operators. If we suspect a customer’s search term is contained somewhere in a product name, LIKE helps us find it without knowing the exact string up front.
In SQL, the LIKE operator is used to search for a specified string pattern in a column. The syntax of the LIKE operator is shown below:
In the code above, the Pattern is created using LIKE clause, let's first understand what wildcards are.
Wildcards
LIKE is used in conjunction with wildcards to create search patterns. Common wildcards include:
%(percent): It represents zero, one, or multiple characters. For example, the pattern'Jo%'retrieves all the names that start with "Jo" like "John," "Joanna," and so on._(underscore): It represents a single character. For example, the pattern'J_n'retrieves all the names that are three letters long, start with "J," and end with "n" like "Jon," "Jan," and so on.
Now, let's look at an example that demonstrates the use of LIKE operator while retrieving some data. Remember we have the Address field as part of the customer details in the Customers table:
CustomerID | CustomerName | Phone | Address | |
1 | John Doe | john.doe@smail.com | 413-456-6862 | 123 Elm Street, Springfield, 01103 |
2 | Jane Smith | jane.smith@inlook.com | 708-567-5234 | 456 Maple Avenue, Riverside, 60546 |
3 | Alice Johnson | alice.j@jmail.com | 317-678-5717 | 789 Oak Lane, Greenwood, 46142 |
. . . | . . . | . . . | . . . | . . . |
21 | Charlotte Miller | charlotte.m@hotmail.com | 512-234-1311 | 66 Pine Trail, Sunset Valley, 78745 |
22 | Lucas Anderson | lucas.anderson@jmail.com | 708-345-9809 | 77 Walnut Place, Lakeview, 48850 |
23 | Emma Watson | emma.w@service.org | 864-789-4731 | 303 Birch Road, Greenwood, 29646 |
Let's use LIKE to retrieve all the customers who live in "Greenwood".
The % wildcard has been used in a way to ensure the word "Greenwood" can appear at any position. However, if we are certain about a specific pattern in a column, such as the city name always appearing at the end in the Address column, we can place the wildcard in a more specific position, %Greenwood.
Combining LIKE with logical operators
We can combine LIKE with logical operators, such as AND, OR, and NOT, to create more complex patterns. For example, considering the same Customers table as we used in the previous section, let's retrieve all the customers who live in "Greenwood," and their emails are from the specific domain, "...service.org."
Similar to this, we can use LIKE with other logical operators to achieve more refined results.
Using the NOT LIKE operator
So far, we’ve used the LIKE operator to find patterns in our data. In many real-world scenarios, we may want to do the opposite, that is, exclude certain patterns from our results.
The NOT LIKE operator allows us to filter out rows that match a given pattern. It works as the negation of the LIKE operator and is useful when we want to remove unwanted matches from our dataset.
Syntax
Example 1: Excluding customers based on address
Let’s reuse the Customers table and exclude all customers who live in "Greenwood":
In this query, the % wildcard allows "Greenwood" to appear anywhere in the address.
The NOT LIKE condition excludes all such rows and returns only the remaining customers.
Example 2: Using _ for fixed-length patterns
The _ wildcard represents exactly one character. We can use it when we want to match values with a fixed structure.
Let’s find products whose names have exactly four characters:
Here, each _ represents one character. So '_____' matches only those product names that are exactly five characters long.
Now, to exclude such products, we can use NOT LIKE:
This query returns all product names that are not exactly four characters long.
The same idea applies regardless of the actual data. The
_wildcard always represents a single character at a specific position.
Understanding wildcards with NOT LIKE
%represents zero or more characters_represents exactly one characterNOT LIKEexcludes rows that match the defined pattern
Best practices when using LIKE
Here are some key pointers to follow when using the LIKE operator for efficient and accurate query results:
Avoid overly generic patterns as they can slow down query performance.
Prevent
by sanitizing user inputs used in pattern matching.SQL injection SQL injection is a code injection technique used to attack data-driven applications, in which malicious SQL statements are inserted into an entry field for execution. Be mindful of performance as pattern matching with leading wildcards (e.g.,
%term%) can be slower, especially on large tables.
Common mistakes to avoid when using LIKE
Here are some common pitfalls to watch out for when using the LIKE operator to prevent inefficiencies and errors in your queries:
Understand whether your SQL environment uses case-sensitive pattern matching.
Forgetting to use wildcards and expecting partial matches to work. Without
%or_,LIKEbehaves just like=.Ensure the pattern you define matches the intended data.
Code exercise
We have explored the concept of pattern matching in SQL. Now, let’s apply this knowledge with a practical exercise.
Note: Write your solution code in the code widgets provided at the end of each task. Click "Run" to execute your code and compare the results with the expected outputs to see if your solution is correct.
Task 1
For this task, consider the Customers table, defined in the code widget below. Write a query to find customers with phone numbers starting with "708."
Expected output:
CustomerName | Phone |
Jane Smith | 708-567-5234 |
Emily Davis | 708-345-4980 |
Olivia Thompson | 708-123-1561 |
Lucas Anderson | 708-345-9809 |
Write your solution code in the widget given below.
We recommend that you try solving the exercise on your own first. However, if you still want to see the correct solution, click the "Show Solution" button below.
Task 2
For this task, consider the Products table, defined in the code widget below. Write a query to find all products that contain the substring “smart” anywhere in their name.
Expected output:
ProductID | ProductName | Price | Stock |
1 | Smartphone | 635.79 | 50 |
4 | Smart Outdoor Light | 75.62 | 25 |
11 | Smart Kitchen Scale | 75.65 | 40 |
16 | Smart TV | 2599.99 | 20 |
19 | Smartwatch | 500.00 | 50 |
20 | Smart Fitness Band | 350.99 | 60 |
Write your solution code in the widget given below.
We recommend that you try solving the exercise on your own first. However, if you still want to see the correct solution, click the "Show Solution" button below.
That's about learning how to use pattern matching in SQL to filter data effectively. We thoroughly explored the LIKE operator, and wildcards for flexible and powerful searches. Pattern matching is a vital skill for handling text data and adds a versatile tool to our SQL arsenal.
Keep practicing and experimenting with different patterns in your queries. Each step builds your confidence and strengthens your ability to manage real-world data challenges effectively!