Extracting information from string columns is almost a repetitive necessity in Data Engineers, Data Scientists, and Business Analysts day to day tasks, and this task can be done using a programming language such as Python, or by SQL depends on your application and on the task required. In this tutorial, we will discover together how to extract and to manipulate strings using standard SQL, and we will use MySQL database as our database engine, most of the functions here are common in all database vendors, the difference maybe will be in syntax or keywords but the same concepts will exist in all database vendors.
Following is the content of our article with main topics we will discuss and each topic will be explained with examples
data:image/s3,"s3://crabby-images/8a29c/8a29cd1810448091d2ca7bd55829b3f1357954f5" alt=""
Change String Format
To change string format to Lower, Upper, or first character initial we will use the following functions:
UPPER Case
to change string format to Upper case we can use UPPER() or UCASE() functions as following
data:image/s3,"s3://crabby-images/d6e7e/d6e7e45e250192983fc68071bc8256b153b97c62" alt=""
/* Upper Case */
SELECT EMP_NO, FIRST_NAME, UPPER(FIRST_NAME) 'UPPER_FIRST_NAME',
LAST_NAME, UPPER(LAST_NAME) 'UPPER_LAST_NAME' FROM EMPLOYEES;
SELECT EMP_NO, FIRST_NAME, UCASE(FIRST_NAME) 'UPPER_FIRST_NAME',
LAST_NAME, UCASE(LAST_NAME) 'UPPER_LAST_NAME' FROM EMPLOYEES;
data:image/s3,"s3://crabby-images/5a6c7/5a6c70b00c05871f38687fb91cc1fa592626a9f2" alt=""
lower Case
to change string format to Upper case we can use LOWER() or LCASE() functions as following
data:image/s3,"s3://crabby-images/3e009/3e0090b721523c2174ef80cf5e0bf83e3eee28d7" alt=""
/* Lower Case */
SELECT EMP_NO, FIRST_NAME, LOWER(FIRST_NAME) 'LOWER_FIRST_NAME', LAST_NAME, LOWER(LAST_NAME) 'LOWER_LAST_NAME' FROM EMPLOYEES_ALL_UPPER;
SELECT EMP_NO, FIRST_NAME, LCASE(FIRST_NAME)'LOWER_FIRST_NAME', LAST_NAME, LCASE(LAST_NAME) 'LOWER_LAST_NAME' FROM EMPLOYEES_ALL_UPPER;
data:image/s3,"s3://crabby-images/e68ba/e68ba69d5fa8272abe4762fd309a5c22746e152a" alt=""
Note: We had created EMPLOYEES_ALL_UPPER from the main EMPLOYEES table but with all fields in upper case, to demonstrate the lower functionality
Initial Capital Case
In this case, the first letter of the string will be capital, and to apply this we will need to implement this logic using MySQL functions as there is no direct function to achieve this, in this function we take the first letter of the string then apply Upper method and concatenate the converted first letter with the remaining of the word as follows
/* Initcap */
SELECT EMP_NO, FIRST_NAME,
CONCAT(UCASE(LEFT(first_name, 1)), SUBSTRING(first_name, 2)) 'INIT_FIRST_NAME'
FROM EMPLOYEES;
data:image/s3,"s3://crabby-images/7562c/7562cfd15c7e2bcdf7d9c17f235c115fc2e22aa6" alt=""
We can use Upper, Lower not only in the SELECT statement of a query but we can use also in WHERE conditions or HAVING conditions to match certain values, example as following
SELECT EMP_NO, FIRST_NAME, LAST_NAME, HIRE_DATE
FROM EMPLOYEES
WHERE UPPER(FIRST_NAME)='FLORINA';
data:image/s3,"s3://crabby-images/70ef1/70ef10fb2b1546c926892f137bc83efac4a2cb82" alt=""
Merge or Slice String
One of the needs we can have with strings is that we need to merge two strings together, or we need to cut certain pieces of a string to extract information
Concatenate Strings
To concatenate strings we will use CONCAT or by CONCAT_WS functions, we will know the difference between both in the following examples, let’s start by CONCAT which has the following syntax
CONCAT
data:image/s3,"s3://crabby-images/091fe/091fe706805ba202031d8d295407c22bb0d16a50" alt=""
CONCAT function accepts columns names or strings and converts all to one string together
/* Concatenation */
SELECT EMP_NO, FIRST_NAME, LAST_NAME, CONCAT(FIRST_NAME,' ',LAST_NAME) AS 'Full Name'
FROM EMPLOYEES
data:image/s3,"s3://crabby-images/d19a5/d19a510932feb04e08f54d64a7944b2eece962df" alt=""
Example on Concatenating columns with string will be as following, we will add title initials before the full name
SELECT EMP_NO, FIRST_NAME, LAST_NAME, CONCAT('Mr/Ms. ',FIRST_NAME,' ',LAST_NAME) AS 'Full Name'
FROM EMPLOYEES
data:image/s3,"s3://crabby-images/4273f/4273f1b1f1ceac9db314071772959e6542e550aa" alt=""
Take Care: if any of the concatenated strings are Null, the output string will be Null also.
data:image/s3,"s3://crabby-images/9a356/9a3567979a159d0430bba89c9c224dec9786243d" alt=""
The solution to this problem is to use a CASE expression to check null values or update the table’s data with standard values for null columns
/* Handle Null Values */
SELECT EMP_NO, FIRST_NAME, LAST_NAME,
CASE
WHEN LAST_NAME IS NULL THEN CONCAT(FIRST_NAME,' ')
ELSE CONCAT(FIRST_NAME,' ',LAST_NAME) END AS 'Full Name'
FROM EMPLOYEES;
data:image/s3,"s3://crabby-images/3cd64/3cd643ad82cf7aa962821d31db198f71288d11c4" alt=""
CONCAT_WS
in this function we have the same inputs except that for all string will be merged we will use the same separator to merge all of them as following
data:image/s3,"s3://crabby-images/f68cf/f68cfafbb50e27ae8c5691f3020fe286356b59bd" alt=""
SELECT EMP_NO, FIRST_NAME, LAST_NAME, CONCAT_WS('-',FIRST_NAME,LAST_NAME) AS 'Full Name'
FROM EMPLOYEES
data:image/s3,"s3://crabby-images/23578/2357842afa1eb417584429dea885439ce522dc55" alt=""
Take Care: Only if the first string or column in the CONCAT_WS is null, the output string will be null, otherwise, the output will be a string and intermediate null values will be ignored
Substring Strings
In some we cases we need to extract information from a string, and we don’t need the full string in our task, to do that we will use SUBSTR function, with any of the following forms of the function
data:image/s3,"s3://crabby-images/c519d/c519d967752eec303bbc667de384754ea41c65fe" alt=""
SELECT EMP_NO, SUBSTR(FIRST_NAME,1,1) AS 'Initial', FIRST_NAME FROM EMPLOYEES;
SELECT EMP_NO, SUBSTR(FIRST_NAME FROM 1 FOR 2) AS 'Initial', FIRST_NAME FROM EMPLOYEES;
In This example, we are substring starting from index number 1 for the length of 1 character to get the name initial
data:image/s3,"s3://crabby-images/14c4e/14c4eea2122acdcfd4017c440787c41f2ade651f" alt=""
Indexing starts from left to write with the first character as number 1, and with negative numbers from the last character of the string with -1, the example below for string “DATAVALLEY” if we need to substring the indexes will be as shown below
data:image/s3,"s3://crabby-images/4b799/4b799071e1f3ff84f778a0cab0f831f357529ae4" alt=""
For example to get the last character of a string we will use the negative indexing as following
SELECT EMP_NO, SUBSTR(FIRST_NAME,-1,1) AS 'Last-Character', FIRST_NAME FROM EMPLOYEES;
SELECT EMP_NO, SUBSTR(FIRST_NAME FROM -1 FOR 1) AS 'Last-Character', FIRST_NAME FROM EMPLOYEES;
data:image/s3,"s3://crabby-images/fc032/fc032462caa9fddf8f0c1556129d77259de913af" alt=""
If we removed the substring length, it will be considered to the end of the string
SELECT EMP_NO, SUBSTR(FIRST_NAME,3) AS 'Initial', FIRST_NAME FROM EMPLOYEES;
data:image/s3,"s3://crabby-images/ef122/ef122b06c6ec2189ec26ce50568d3e2e13b2e8da" alt=""
SELECT EMP_NO, SUBSTR(FIRST_NAME FROM -2) AS 'Initial', FIRST_NAME FROM EMPLOYEES;
data:image/s3,"s3://crabby-images/71ac5/71ac56717fbaa5827d8423413d158254d4f6dca5" alt=""
Substring also can be used in WHERE and HAVING statements to filter specific records as following example
SELECT DISTINCT TITLE
FROM TITLES
WHERE SUBSTR(TITLE,1,6)='Senior';
data:image/s3,"s3://crabby-images/6ca8a/6ca8aef5c4c8e298fadadd1e43ebd077c8b353f9" alt=""
Search String
It is very common to search for a value in a text to see if it exists or not, or in another case to get the position of this value in the text, to search for a value in the string we will use INSTR or POSITION functions, both serve the same purpose
data:image/s3,"s3://crabby-images/d3353/d3353914b01f2a5753e118a5fe94f7570dbed348" alt=""
SELECT EMP_NO,INSTR(FIRST_NAME,'A') 'A Position' ,FIRST_NAME
FROM employees;
SELECT EMP_NO, POSITION('A' IN FIRST_NAME) 'A Position', FIRST_NAME
FROM EMPLOYEES;
In this example we search for the position of character ‘A’ in the FIRST_NAME field using both functions for demonstration
data:image/s3,"s3://crabby-images/77bf8/77bf8ad051c39d1583ff6ac9dbf8cd8af91f2bd0" alt=""
Value of Zero means the value we are searching for is not available in the String, we can use this functions also in the WHERE and HAVING statements to filter in the existence of a value in our string or in our column
SELECT DISTINCT TITLE
FROM TITLES
WHERE INSTR(TITLE,'Assistant')>0;
data:image/s3,"s3://crabby-images/76f22/76f22f753057a57f5590dc1f7d0b5e3b7ba85287" alt=""
Clean Spaces from String
To remove spaces from a string value we can use TRIM, LTRIM, RTRIM functions based on our needs, as following
data:image/s3,"s3://crabby-images/7e910/7e9100a1cab7464e9c7ef850e5b62cf2f9b36a28" alt=""
/****** Remove Spaces *************/
SELECT EMP_NO, CONCAT(' ',FIRST_NAME,' ') 'Name with Space',
TRIM(CONCAT(' ',FIRST_NAME,' ')) 'Name without Space'
FROM EMPLOYEES;
data:image/s3,"s3://crabby-images/c679b/c679b7968f38e4bf4469b6a40b2ce93011fdcfa8" alt=""
Replace Value in a String
Replacing a value in a string is a straight forward using REPLACE function as following
data:image/s3,"s3://crabby-images/996cf/996cfb6fb96851d96463069a6c992838f6df261e" alt=""
/****** Replace String ***********/
SELECT EMP_NO, title, REPLACE(TITLE,' ','-') FROM TITLES;
data:image/s3,"s3://crabby-images/d58ea/d58ea65945664738b1cb065ce61c0d4195e8918f" alt=""
Get Length of a String
To get the length of a string column, output string from expression it is very straight forward using LENGTH function
data:image/s3,"s3://crabby-images/815e2/815e2d19c3f7efd6c75bc3a7d733e0de4c7c2851" alt=""
SELECT EMP_NO, FIRST_NAME,LENGTH(FIRST_NAME), LAST_NAME, LENGTH(LAST_NAME) FROM EMPLOYEES;
data:image/s3,"s3://crabby-images/49897/4989754ee32486b9a880a25b92097f3a6eb9fd13" alt=""