Search⌘ K
AI Features

Pattern Matching

Explore how to perform pattern matching in SQL using the LIKE operator and wildcards to search and filter data with partial matches. Understand how to combine LIKE with logical operators, use NOT LIKE for exclusions, and apply best practices to handle real-world text data effectively.

Imagine we need to find all products that have a certain word in their name from the online store’s database. For instance, we might want to identify all products whose names contain “book” so that we can run a promotional discount on them. Searching by exact matches might not be enough if we need to handle variations in spelling, partial terms, or different letter cases. This is where pattern matching becomes valuable.

Pattern matching in SQL is a powerful tool that allows us to find specific data within a table that follows a defined pattern. This is particularly useful when searching for records with similar attributes, such as names starting with a certain letter or products containing specific keywords.

Pattern matching allows us to retrieve rows that match partial strings instead of requiring exact matches. This is important whenever we deal with data that might not follow a strict or uniform naming convention.

Let's explore the concept of pattern matching in SQL. We'll go step by step to:

  • Understand the purpose and importance of pattern matching in SQL.

  • Learn how to perform pattern matching with the LIKE operator.

  • Learn how to use wildcards for pattern matching.

Using the LIKE operator

When working with user-generated content or when product names and descriptions do not follow a strict pattern, we cannot rely solely on equality operators. If we suspect a customer’s search term is contained somewhere in a product name, LIKE helps us find it without knowing the exact string up front.

In SQL, the LIKE operator is used to search for a specified string pattern in a column. The syntax of the LIKE operator is shown below:

C++
SELECT Column1, Column2, ...
FROM TableName
WHERE ColumnName LIKE Pattern;

In the code above, the Pattern is created using wildcardsA placeholder, usually represented by a character.. Before we explore more about the LIKE clause, let's first understand what wildcards are.

Wildcards

LIKE is used in conjunction with wildcards to create search patterns. Common wildcards include:

  • % (percent): It represents zero, one, or multiple characters. For example, the pattern 'Jo%' retrieves all the names that start with "Jo" like "John," "Joanna," and so on.

  • _ (underscore): It represents a single character. For example, the pattern 'J_n' retrieves all the names that are three letters long, start with "J," and end with "n" like "Jon," "Jan," and so on.

Now, let's look at an example that demonstrates the use of LIKE operator while retrieving some data. Remember we have the Address field as part of the customer details in the Customers table:

CustomerID

CustomerName

Email

Phone

Address

1

John Doe

john.doe@smail.com

413-456-6862

123 Elm Street, Springfield, 01103

2

Jane Smith

jane.smith@inlook.com

708-567-5234

456 Maple Avenue, Riverside, 60546

3

Alice Johnson

alice.j@jmail.com

317-678-5717

789 Oak Lane, Greenwood, 46142

. . .

. . .

. . .

. . .

. . .

21

Charlotte Miller

charlotte.m@hotmail.com

512-234-1311

66 Pine Trail, Sunset Valley, 78745

22

Lucas Anderson

lucas.anderson@jmail.com

708-345-9809

77 Walnut Place, Lakeview, 48850

23

Emma Watson

emma.w@service.org

864-789-4731

303 Birch Road, Greenwood, 29646

Let's use LIKE to retrieve all the customers who live in "Greenwood".

MySQL
-- Definition for the Customers table
-- CREATE TABLE Customers (
-- CustomerID INT PRIMARY KEY AUTO_INCREMENT,
-- CustomerName VARCHAR(50) NOT NULL,
-- Email VARCHAR(50),
-- Phone VARCHAR(15),
-- Address VARCHAR(100)
-- );
-- ------------------------------------------------------------------------------
SELECT CustomerName, Address
FROM Customers
WHERE Address LIKE '%Greenwood%';

The % wildcard has been used in a way to ensure the word "Greenwood" can appear at any position. However, if we are certain about a specific pattern in a column, such as the city name always appearing at the end in the Address column, we can place the wildcard in a more specific position, %Greenwood.

Combining LIKE with logical operators

We can combine LIKE with logical operators, such as AND, OR, and NOT, to create more complex patterns. For example, considering the same Customers table as we used in the previous section, let's retrieve all the customers who live in "Greenwood," and their emails are from the specific domain, "...service.org."

MySQL
-- Definition for the Customers table
-- CREATE TABLE Customers (
-- CustomerID INT PRIMARY KEY AUTO_INCREMENT,
-- CustomerName VARCHAR(50) NOT NULL,
-- Email VARCHAR(50),
-- Phone VARCHAR(15),
-- Address VARCHAR(100)
-- );
-- ------------------------------------------------------------------------------
SELECT CustomerName, Address, Email
FROM Customers
WHERE Address LIKE '%Greenwood%' AND Email LIKE "%service.org%";

Similar to this, we can use LIKE with other logical operators to achieve more refined results.

Using the NOT LIKE operator

So far, we’ve used the LIKE operator to find patterns in our data. In many real-world scenarios, we may want to do the opposite, that is, exclude certain patterns from our results.

The NOT LIKE operator allows us to filter out rows that match a given pattern. It works as the negation of the LIKE operator and is useful when we want to remove unwanted matches from our dataset.

Syntax

MySQL
SELECT Column1, Column2, ...
FROM TableName
WHERE ColumnName NOT LIKE Pattern;

Example 1: Excluding customers based on address

Let’s reuse the Customers table and exclude all customers who live in "Greenwood":

C++
-- Definition for the Customers table
-- CREATE TABLE Customers (
-- CustomerID INT PRIMARY KEY AUTO_INCREMENT,
-- CustomerName VARCHAR(50) NOT NULL,
-- Email VARCHAR(50),
-- Phone VARCHAR(15),
-- Address VARCHAR(100)
-- );
-- ------------------------------------------------------------------------------
SELECT CustomerName, Address
FROM Customers
WHERE Address NOT LIKE '%Greenwood%';

In this query, the % wildcard allows "Greenwood" to appear anywhere in the address.

The NOT LIKE condition excludes all such rows and returns only the remaining customers.

Example 2: Using _ for fixed-length patterns

The _ wildcard represents exactly one character. We can use it when we want to match values with a fixed structure.

Let’s find products whose names have exactly four characters:

C++
-- Definition for the Products table
-- CREATE TABLE Products (
-- ProductID INT PRIMARY KEY AUTO_INCREMENT,
-- ProductName VARCHAR(50) NOT NULL UNIQUE,
-- CategoryID INT,
-- Price DECIMAL(10, 2) NOT NULL,
-- Stock INT NOT NULL,
-- FOREIGN KEY (CategoryID) REFERENCES Categories(CategoryID)
-- );
-- ------------------------------------------------------------------------------
SELECT ProductName
FROM Products
WHERE ProductName LIKE '____';

Here, each _ represents one character. So '_____' matches only those product names that are exactly five characters long.

Now, to exclude such products, we can use NOT LIKE:

C++
-- Definition for the Products table
-- CREATE TABLE Products (
-- ProductID INT PRIMARY KEY AUTO_INCREMENT,
-- ProductName VARCHAR(50) NOT NULL UNIQUE,
-- CategoryID INT,
-- Price DECIMAL(10, 2) NOT NULL,
-- Stock INT NOT NULL,
-- FOREIGN KEY (CategoryID) REFERENCES Categories(CategoryID)
-- );
-- ------------------------------------------------------------------------------
SELECT ProductName
FROM Products
WHERE ProductName NOT LIKE '____';

This query returns all product names that are not exactly four characters long.

The same idea applies regardless of the actual data. The _ wildcard always represents a single character at a specific position.

Understanding wildcards with NOT LIKE

  • % represents zero or more characters

  • _ represents exactly one character

  • NOT LIKE excludes rows that match the defined pattern

Best practices when using LIKE

Here are some key pointers to follow when using the LIKE operator for efficient and accurate query results:

  • Avoid overly generic patterns as they can slow down query performance.

  • Prevent SQL injectionSQL injection is a code injection technique used to attack data-driven applications, in which malicious SQL statements are inserted into an entry field for execution. by sanitizing user inputs used in pattern matching.

  • Be mindful of performance as pattern matching with leading wildcards (e.g., %term%) can be slower, especially on large tables.

Common mistakes to avoid when using LIKE

Here are some common pitfalls to watch out for when using the LIKE operator to prevent inefficiencies and errors in your queries:

  • Understand whether your SQL environment uses case-sensitive pattern matching.

  • Forgetting to use wildcards and expecting partial matches to work. Without % or _, LIKE behaves just like =.

  • Ensure the pattern you define matches the intended data.

Code exercise

We have explored the concept of pattern matching in SQL. Now, let’s apply this knowledge with a practical exercise.

Note: Write your solution code in the code widgets provided at the end of each task. Click "Run" to execute your code and compare the results with the expected outputs to see if your solution is correct.

Task 1

For this task, consider the Customers table, defined in the code widget below. Write a query to find customers with phone numbers starting with "708."

Expected output:

CustomerName

Phone

Jane Smith

708-567-5234

Emily Davis

708-345-4980

Olivia Thompson

708-123-1561

Lucas Anderson

708-345-9809

Write your solution code in the widget given below.

MySQL
-- Definition for the Customers table
-- CREATE TABLE Customers (
-- CustomerID INT PRIMARY KEY AUTO_INCREMENT,
-- CustomerName VARCHAR(50) NOT NULL,
-- Email VARCHAR(50),
-- Phone VARCHAR(15),
-- Address VARCHAR(100)
-- );
-- ------------------------------------------------------------------------------
-- Write your code here

We recommend that you try solving the exercise on your own first. However, if you still want to see the correct solution, click the "Show Solution" button below.

Task 2

For this task, consider the Products table, defined in the code widget below. Write a query to find all products that contain the substring “smart” anywhere in their name.

Expected output:

ProductID

ProductName

Price

Stock

1

Smartphone

635.79

50

4

Smart Outdoor Light

75.62

25

11

Smart Kitchen Scale

75.65

40

16

Smart TV

2599.99

20

19

Smartwatch

500.00

50

20

Smart Fitness Band

350.99

60

Write your solution code in the widget given below.

MySQL
-- Definition for the Products table
-- CREATE TABLE Products (
-- ProductID INT PRIMARY KEY AUTO_INCREMENT,
-- ProductName VARCHAR(50) NOT NULL UNIQUE,
-- CategoryID INT,
-- Price DECIMAL(10, 2) NOT NULL,
-- Stock INT NOT NULL,
-- FOREIGN KEY (CategoryID) REFERENCES Categories(CategoryID)
-- );
-- ------------------------------------------------------------------------------
-- Write your code here

We recommend that you try solving the exercise on your own first. However, if you still want to see the correct solution, click the "Show Solution" button below.

That's about learning how to use pattern matching in SQL to filter data effectively. We thoroughly explored the LIKE operator, and wildcards for flexible and powerful searches. Pattern matching is a vital skill for handling text data and adds a versatile tool to our SQL arsenal.

Keep practicing and experimenting with different patterns in your queries. Each step builds your confidence and strengthens your ability to manage real-world data challenges effectively!