Security is something every website should pay attention to. Two common ways of attacking websites are SQL injection and cross-site scripting (XSS), and websites written in PHP are no exception. To prevent these types of attacks, we must check both the input data and the output data for malicious code. In this article, we will discuss PHP validation and sanitization.
PHP validation checks that the input data is correct, and PHP sanitization removes or neutralizes malicious code in the input data. However, because fully testing a website is costly in both money and time, security is never 100%, and a website may still be hacked if safety precautions are missed in even one part of the code.
What is an SQL Injection attack?
SQL injection (SQLi) is a web security vulnerability that allows an attacker to interfere with the queries that an application makes to its database. It generally allows an attacker to view data that they are not normally able to retrieve. This might include data belonging to other users, or any other data that the application itself is able to access. In many cases, an attacker can modify or delete this data, causing persistent changes to the application’s content or behavior.
SQL injection usually occurs when you ask a user for input, like their username, and instead of a name/id, the user gives you an SQL statement that you will unknowingly run on your database. In some situations, an attacker can escalate an SQL injection attack to compromise the underlying server or other back-end infrastructure or perform a denial-of-service attack.
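To make the mechanism concrete, here is a minimal sketch of how user input changes the meaning of a query when it is concatenated into the SQL text. The function name, the table, and the column are assumptions made up for this illustration, and no database is contacted; the code only prints the resulting query strings.

```php
<?php
// Hypothetical query builder that concatenates raw input (unsafe on purpose).
function hs_build_login_query(string $username): string
{
    return "SELECT * FROM users WHERE username = '" . $username . "'";
}

// Normal input produces the query the developer expected.
echo hs_build_login_query('alice'), PHP_EOL;
// SELECT * FROM users WHERE username = 'alice'

// Crafted input closes the quote and adds a condition that is always true.
echo hs_build_login_query("' OR '1'='1"), PHP_EOL;
// SELECT * FROM users WHERE username = '' OR '1'='1'
```

The second query would match every row in the table. Database libraries that support parameterized (prepared) statements avoid this by sending the data separately from the SQL text.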
What is an XSS attack?
Cross-site scripting (XSS) is a type of security vulnerability that can be found in some web applications. XSS attacks enable attackers to inject client-side scripts into web pages viewed by other users. It allows an attacker to circumvent the same origin policy, which is designed to segregate different websites from each other. Cross-site scripting vulnerabilities normally allow an attacker to masquerade as a victim user, to carry out any actions that the user is able to perform and to access any of the user’s data. If the victim user has privileged access within the application, then the attacker might be able to gain full control over all of the application’s functionality and data.
Cross-site scripting works by manipulating a vulnerable website so that it returns malicious JavaScript to users. When the malicious code executes inside a victim’s browser, the attacker can fully compromise their interaction with the application.
Rules of Sanitization and Validation in PHP
The details of sanitization and validation differ between PHP, other languages, and CMSes, but the following rules apply generally. For example, if you are interested in WordPress, see our WordPress data validation and sanitization article.
1. Trust Nobody
The idea is that you should not assume that any data entered by the user is safe. Nor should you assume that the data you’ve retrieved from the database is safe.
2. Validate Input, Escape Output
Validate your data as soon as you receive it from the user. Sanitize (or escape) the data when you want to display it to the user.
PHP validation
In this article, we will first deal with PHP validation, which checks whether the input data has been entered correctly. For example, a phone number field may contain only digits and the + sign, not letters such as “a” or “b”. If the entered number contains any letters, the data is not correct. Checking things like this is the task of PHP validation, and we will go through it in detail below.
Validation can be done on both the server and client sides, but in this article, we will only examine PHP validation (server side). PHP language has functions that we can use to easily perform validation. Some are as simple as calling a function, and some can be done using Regex.
Check empty string in PHP
An empty string is a string that has no value. To check whether the string is empty or not, we use the empty() function. Checking for an empty string is usually the first validation step; for example, you do not want the user to submit an empty string for the name field. Note that empty() also treats the string '0' as empty.
function hs_is_empty($str){
return empty($str);
}
Sometimes a string that contains only spaces should also be considered empty; you do not want the user to submit a name that consists entirely of spaces. To check this, we combine the empty() and trim() functions.
function hs_is_empty($str){
return empty(trim($str));
}
if(hs_is_empty(' ')){
echo 'String is empty';
}else{
echo 'String is not empty';
}
Output
String is empty
The trim() function removes whitespace from the beginning and end of the string. Because in this example all characters are spaces, the result is an empty string, which the empty() function recognizes as empty.
Note: By default, the trim() function does not remove the non-breaking space (the character entered with ALT + 255).
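The two caveats of this approach (the non-breaking space, and empty() treating '0' as empty) can be handled with a slightly stricter helper. This is only a sketch: the name hs_is_blank() is hypothetical, and it assumes the input is UTF-8 text.

```php
<?php
// Hypothetical helper: true when the value has no visible characters.
function hs_is_blank(string $str): bool
{
    // Strip ASCII whitespace and U+00A0 (non-breaking space) from both ends.
    // The /u modifier makes the pattern treat the subject as UTF-8.
    $stripped = preg_replace('/^[\s\x{00A0}]+|[\s\x{00A0}]+$/u', '', $str);

    // preg_replace() returns null on failure (for example, invalid UTF-8);
    // in that case fall back to plain trim().
    if ($stripped === null) {
        $stripped = trim($str);
    }

    // Compare with '' instead of using empty(), because empty('0') is true.
    return $stripped === '';
}

var_dump(hs_is_blank("\u{00A0} \u{00A0}")); // bool(true)
var_dump(hs_is_blank('0'));                 // bool(false)
```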
isset() function
Sometimes we need to make sure that a variable has a value, in which case we use the isset() function. It returns true if the variable exists and is not null, and false otherwise. When echoed, true is printed as 1 and false is printed as an empty string.
$var='123';
echo isset($var);
Output
1
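With form data, isset() is most often applied to array keys such as $_POST['name']. The sketch below shows how it behaves for a present, a null, and a missing key, and wraps the common "use the value or a default" pattern in a hypothetical helper named hs_input(). The $post array stands in for $_POST.

```php
<?php
// Hypothetical helper: read a string field from an input array such as $_POST.
function hs_input(array $data, string $key, string $default = ''): string
{
    // ?? returns the right-hand side when the key is missing or null,
    // which is the same condition under which isset() returns false.
    $value = $data[$key] ?? $default;

    // Fields named like "tags[]" arrive as arrays; treat those as absent here.
    return is_string($value) ? $value : $default;
}

$post = ['name' => 'Ann', 'age' => null];

var_dump(isset($post['name']));  // bool(true)
var_dump(isset($post['age']));   // bool(false): the key exists but holds null
var_dump(isset($post['email'])); // bool(false): the key is missing
```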
function preg_match
The preg_match() function uses a regular expression (Regex) to check the input value against the given pattern. In the following, we will explain more examples of this function.
is_string function
PHP's built-in is_string() function only checks the type of a variable. The function below instead checks whether the entered value consists of letters only.
function hs_is_string($str){
return preg_match("/^[a-zA-Z]*$/", $str);
}
This function uses Regex to check whether the value contains only the characters “a” to “z” and “A” to “Z”. If the value contains a space character, preg_match() returns 0 (no match). We can use the following function to allow the space character too.
function hs_is_string($str){
return preg_match("/^[a-zA-Z ]*$/", $str);
}
These functions only accept the letters of the English alphabet, so names with accented or non-Latin letters are rejected.
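If names in other languages must be accepted, one option is a Unicode-aware pattern. The following is a sketch under the assumption that the input is UTF-8; the function name hs_is_name_unicode() is hypothetical.

```php
<?php
// Hypothetical Unicode-aware variant of the letters-and-spaces check.
function hs_is_name_unicode(string $str): bool
{
    // \p{L} matches a letter in any script, \p{M} matches combining marks.
    // The u modifier treats pattern and subject as UTF-8.
    // The D modifier stops "$" from matching before a trailing newline.
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match('/^[\p{L}\p{M} ]+$/Du', $str) === 1;
}

var_dump(hs_is_name_unicode('José García')); // bool(true)
var_dump(hs_is_name_unicode('Ann_1'));       // bool(false)
```

Comparing the result with 1 turns the return value of preg_match() into a real boolean, and using + instead of * rejects the empty string.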
Number validation in PHP
PHP Regex can also be used for the validation of numbers.
function hs_is_number($num){
return preg_match("/^[0-9]*$/", $num);
}
By changing the Regex of this function, you can easily check landline or mobile phone numbers, which we will cover in the form validation section of this article.
Validate URL
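A URL has a specific structure that we can use to determine whether the entered data is a URL or not.
The following code validates a URL in PHP. Note that the pattern is not anchored, so it also accepts a longer text that merely contains a URL.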
function hs_is_url_valid($url)
{
if (preg_match("/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i",$url)) {
return true;
}
return false;
}
if(hs_is_url_valid('http://honarsystems.com/php-validation-sanitization')){
echo 'URL is valid';
}else{
echo 'URL is not valid';
}
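As an alternative to a hand-written pattern, PHP's filter extension can validate the whole string. The sketch below combines FILTER_VALIDATE_URL with a scheme allowlist; the function name hs_is_http_url() and the choice to accept only http and https are assumptions made for this example.

```php
<?php
// Hypothetical stricter URL check built on PHP's filter extension.
function hs_is_http_url(string $url): bool
{
    // FILTER_VALIDATE_URL checks the whole string, not just a part of it.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // Only accept the schemes we expect for a website field.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));

    return in_array($scheme, ['http', 'https'], true);
}

var_dump(hs_is_http_url('http://honarsystems.com/php-validation-sanitization')); // bool(true)
var_dump(hs_is_http_url('javascript:alert(1)'));                                 // bool(false)
```

Unlike the Regex above, this check rejects addresses without a scheme, such as www.example.com, so it is stricter about what users may type.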
Validate Email
A valid email address must contain the @ symbol and, in practice, a dot in the domain part. PHP provides various methods for validating email addresses. Here, we will use a PHP filter and regular expressions (Regex) to validate the email address.
PHP has built-in filters that we can use to validate some kinds of data. The filter_var() function returns the value if it is valid and false otherwise.
function hs_is_email($email){
return filter_var($email, FILTER_VALIDATE_EMAIL);
}
But sometimes we need to apply our own naming policies to emails. For example, we may decide that the email cannot contain the underscore character (_). In this case, we use Regex.
function hs_is_email($email){
//"/" is the delimiter; ^ and $ anchor the pattern to the whole string
return preg_match('/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,})$/', $email);
}
As written, this pattern still allows the underscore. To forbid it, remove _ from the character classes; restrictions of this kind are applied by editing the Regex.
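A naming policy can also be layered on top of filter_var() instead of replacing it, so the general syntax check and the policy stay separate. The following sketch assumes the "no underscore" policy from the text; the function name hs_is_policy_email() is hypothetical.

```php
<?php
// Hypothetical policy check: a valid address whose local part has no "_".
function hs_is_policy_email(string $email): bool
{
    // Step 1: general syntax check with the built-in filter.
    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        return false;
    }

    // Step 2: our own policy. Everything before the last "@" is the local part.
    $local = substr($email, 0, strrpos($email, '@'));

    return strpos($local, '_') === false;
}

var_dump(hs_is_policy_email('john.doe@example.com')); // bool(true)
var_dump(hs_is_policy_email('john_doe@example.com')); // bool(false)
```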
Input Length Validation in PHP
Sometimes the length of a value must be fixed or must not exceed a fixed limit. For this, we use the strlen() function to check the length of the string variable. For example, a mobile number cannot be more than 14 characters.
function is_mobile_number($number){
if(strlen($number) > 14){
return false;
}
return preg_match("/^[0-9]*$/", $number);
}
This is a simple example that accepts mobile numbers consisting of digits only. Some numbers start with +, so later in this article we will expand this function to cover those mobile numbers as well.
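As a first step in that direction, the check can be extended to accept one leading + sign. This is a sketch that keeps the 14-character limit from the example above; the function name hs_is_mobile_number() is hypothetical.

```php
<?php
// Hypothetical variant that also accepts one leading "+".
function hs_is_mobile_number(string $number): bool
{
    // Same 14-character limit as above, and reject the empty string.
    if ($number === '' || strlen($number) > 14) {
        return false;
    }

    // Optional "+", then digits only. The D modifier stops "$" from
    // matching before a trailing newline.
    return preg_match('/^\+?[0-9]+$/D', $number) === 1;
}

var_dump(hs_is_mobile_number('+14155550123')); // bool(true)
var_dump(hs_is_mobile_number('415-555'));      // bool(false)
```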
is_int(), is_float(), etc functions
The is_int() and is_float() functions check whether the type of a variable is integer or float. Values submitted through a form always arrive as strings, so usually it is sufficient to simply cast the data to a number with intval() or floatval(). There are other functions of this kind: is_bool(), is_numeric(), is_null(), is_string(), and is_object().
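The difference between checking the type and checking the content is easy to miss, so here is a short illustration. The helper hs_is_integer_string() is a hypothetical name for a content check on string input.

```php
<?php
// Values from $_GET and $_POST arrive as strings, so is_int() and
// is_float() report false for them even when they look numeric.
var_dump(is_int('42'));      // bool(false)
var_dump(is_numeric('42'));  // bool(true)
var_dump(is_numeric('4.2')); // bool(true)
var_dump(is_numeric('42a')); // bool(false)

// Hypothetical helper: true only for an optional "-" followed by digits.
function hs_is_integer_string(string $value): bool
{
    return preg_match('/^-?[0-9]+$/D', $value) === 1;
}

var_dump(hs_is_integer_string('-7'));  // bool(true)
var_dump(hs_is_integer_string('4.2')); // bool(false)
```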
PHP sanitization
The task of PHP sanitization is to clean the input data of any malicious code. Most attacks arrive through input fields, so the data from those fields must be sanitized. In most cases sanitization does not delete the data but changes its form in such a way that it cannot lead to SQL injection or XSS attacks. Only in some cases is data actually deleted to avoid security bugs. For example, in the email field the user has no reason to enter the text <script>, but in the text of a post they may, because the post may be about programming and contain HTML or PHP code. If the <script> text were deleted from such a post, the content would no longer be useful to readers, so we have to sanitize input carefully. In this case the form of the data changes, but its appearance is preserved for users.
Sanitizing text with the htmlspecialchars() function in PHP
The htmlspecialchars() function converts special characters to HTML entities. This means that it will replace HTML characters like < and > with &lt; and &gt;. This prevents attackers from exploiting the code by injecting HTML or JavaScript code (cross-site scripting attacks) in forms.
For example, consider the following command.
echo "<script>alert('ok')</script>";
If you execute the command, JavaScript shows an alert with the message ok. Attackers use this method to execute JavaScript commands in the user’s browser. We use the htmlspecialchars() function to prevent these types of commands from being executed.
echo htmlspecialchars("<script>alert('ok')</script>");
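This command prints the code as text and does not execute it. In the page source, the angle brackets are sent as &lt; and &gt;, which the browser displays as < and >.
Output (as displayed in the browser)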
<script>alert('ok')</script>
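The default flags of htmlspecialchars() changed in PHP 8.1 (single quotes are only escaped by default from that version on), so it can help to pass the flags and the encoding explicitly. The wrapper below is a sketch; the name hs_escape() is hypothetical, and UTF-8 output is assumed.

```php
<?php
// Hypothetical wrapper that makes the escaping options explicit.
function hs_escape(string $text): string
{
    // ENT_QUOTES escapes both " and ', which matters inside attributes.
    // ENT_SUBSTITUTE replaces invalid byte sequences instead of returning ''.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

echo hs_escape("<script>alert('ok')</script>");
// Page source: &lt;script&gt;alert(&#039;ok&#039;)&lt;/script&gt;
```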
intval(), floatval(), doubleval() and…
Suppose one of our pages receives the id of a post and reads that post from the database based on the id. The id is an integer, so it cannot be a decimal number, letters, or other characters.
We use the intval() function to convert the input variable to an integer. If this function cannot convert the entered value into an integer, it returns 0.
echo intval('59');
echo intval('59.5');
echo intval('a59');
Output
59
59
0
The floatval() function works the same way for decimal numbers, and doubleval() is an alias of floatval().
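Because intval() converts whatever it can and falls back to 0, it cannot tell a valid id from a malformed one. When that distinction matters, FILTER_VALIDATE_INT is an option. The sketch below assumes that ids start at 1; the function name hs_parse_id() is hypothetical.

```php
<?php
// Hypothetical strict parser: a positive integer id, or null if invalid.
function hs_parse_id(string $raw): ?int
{
    $id = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 1],
    ]);

    // filter_var() returns false when the value is not a valid integer
    // or lies outside the allowed range.
    return $id === false ? null : $id;
}

var_dump(intval('59abc'));      // int(59): trailing text is dropped silently
var_dump(hs_parse_id('59abc')); // NULL: the whole string must be an integer
var_dump(hs_parse_id('59'));    // int(59)
```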
Form validation and sanitization in PHP
One of the cases where validation and sanitization are used is forms. Forms have input fields that the user must fill in and send to the server. As we said, hackers can enter their malicious code in this section, which leads to SQL Injection or XSS attacks.
In this section, we want to discuss how to validate and sanitize the registration form in PHP. In this example, we will validate short text such as name, username, password, e-mail, mobile number, website address, long text such as biography, and optional options such as gender.
PHP pages consist of two parts: HTML code and PHP code. Some of the inputs can be checked on the client side, but hackers can easily bypass these filters. Server-side validation, on the other hand, can only be bypassed when there are security bugs in the code. Consider the example below.
<?php
$errors = [];
$name = $username = $password = $email = $mobile = $website = $bio = '';
$gender = 0;
$file_uploaded = false;
function is_name($name)
{
return preg_match('/^[a-zA-Z ]*$/', trim($name));
}
function is_gender($gender)
{
return $gender === 'male' || $gender === 'female';
}
function is_username($username)
{
if(strlen($username)>50){
//the username must be at most 50 characters
return false;
}
return preg_match('/^[a-zA-Z0-9_.]*$/', $username);
}
function is_password($password)
{
//password must contain a-z A-Z 0-9 !@#$%^&*(){}[]
$password = trim($password);
if(strlen($password)<8){
return false;
}
return preg_match("~[\!\@\#\$\%\^\&\*\(\)\{\}\[\]]+~", $password) &&
preg_match('~[a-z]+~', $password) &&
preg_match('~[A-Z]+~', $password) &&
preg_match('~[0-9]+~', $password);
}
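function is_email($email)
{
//"/" is the delimiter; ^ and $ anchor the pattern to the whole string
//{2,} also accepts top-level domains longer than three letters
return preg_match(
'/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,})$/',
$email
);
}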
function is_mobile($mobile)
{
//USA mobile format (01, 001, +1, 1, local)
$mobile = str_replace(' ', '', trim($mobile));
//optional prefix (+1, 001, 01 or 1) followed by exactly 10 digits
$pattern = '/^(?:\+1|001|01|1)?[0-9]{10}$/';
return preg_match($pattern, $mobile);
}
function is_website($website)
{
return preg_match(
'/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i',
$website
);
}
if ($_SERVER['REQUEST_METHOD'] == 'POST') {
//validate name
if (isset($_POST['name'])) {
if (is_name($_POST['name'])) {
$name = $_POST['name'];
} else {
$errors[] = 'Name is not correct.';
}
} else {
$errors[] = 'Name is not set.';
}
//validate gender
if (isset($_POST['gender'])) {
if (is_gender($_POST['gender'])) {
if ($_POST['gender'] === 'male') {
$gender = 1;
} else {
$gender = 2;
}
} else {
$errors[] = 'Choose your gender.';
}
} else {
$errors[] = 'Gender is not set.';
}
//validate username
if (isset($_POST['username'])) {
if (is_username($_POST['username'])) {
$username = $_POST['username'];
} else {
$errors[] = 'Username is not correct.';
}
} else {
$errors[] = 'Username is not set.';
}
//validate password
if (isset($_POST['password'])) {
if (is_password($_POST['password'])) {
$password = $_POST['password'];
} else {
$errors[] = 'Password is not correct.';
}
} else {
$errors[] = 'Password is not set.';
}
//validate email
if (isset($_POST['email'])) {
if (is_email($_POST['email'])) {
$email = $_POST['email'];
} else {
$errors[] = 'Email is not correct.';
}
} else {
$errors[] = 'Email is not set.';
}
//validate mobile
if (isset($_POST['mobile'])) {
if (is_mobile($_POST['mobile'])) {
$mobile = $_POST['mobile'];
} else {
$errors[] = 'Mobile is not correct.';
}
} else {
$errors[] = 'Mobile is not set.';
}
//validate file
if (isset($_FILES['image'])) {
if (!empty($_FILES['image']['name'])) {
if ($_FILES['image']['size'] < 512000) {
//take the text after the last dot as the extension, in lowercase
$parts = explode('.', $_FILES['image']['name']);
$extension = strtolower(end($parts));
//the extension must be one of the allowed image types
if (in_array($extension, ['jpg', 'png', 'jpeg', 'gif'], true)) {
$file_uploaded = true;
} else {
$errors[] =
'Uploaded file is not image format (JPG, PNG, JPEG, GIF).';
}
} else {
$errors[] = 'Uploaded file is more than 500KB.';
}
}
}
//validate website
if (isset($_POST['website'])) {
if (!empty($_POST['website'])) {
if (is_website($_POST['website'])) {
$website = htmlspecialchars($_POST['website']);
} else {
$errors[] = 'Website is not correct.';
}
}
}
//validate bio
if (isset($_POST['bio'])) {
if (!empty($_POST['bio'])) {
$bio = htmlspecialchars($_POST['bio']);
}
}
if (count($errors) === 0) {
if (
!empty($name) &&
$gender !== 0 &&
!empty($username) &&
!empty($password) &&
!empty($email) &&
!empty($mobile)
) {
//name, gender, username, password, email and mobile are required
/*
Your code goes here
*/
if ($file_uploaded && !empty($website) && !empty($bio)) {
//profile picture, website and bio are optional
/*
Your code goes here
*/
}
}
}
}
?>
<html>
<body>
<div>
<?php foreach ($errors as $error): ?>
<div><?php echo $error; ?></div>
<?php endforeach; ?>
</div>
<form method="post" enctype="multipart/form-data" action="<?php echo htmlspecialchars(
$_SERVER['PHP_SELF']
); ?>">
<div>
<label>Name</label>
<input type="text" name="name" value="" placeholder="Name" />
</div>
<div>
<label>Gender</label>
<input type="radio" name="gender" value="male" /><label>Male</label>
<input type="radio" name="gender" value="female" /><label>Female</label>
</div>
<div>
<label>Username</label>
<input type="text" name="username" value="" placeholder="Username" />
</div>
<div>
<label>Password</label>
<input type="password" name="password" value="" placeholder="Password" />
</div>
<div>
<label>Email</label>
<input type="email" name="email" value="" placeholder="Email" />
</div>
<div>
<label>Mobile</label>
<input type="text" name="mobile" value="" placeholder="Mobile" />
</div>
<div>
<label>Profile Picture</label>
<input type="file" name="image" value="" placeholder="Profile Picture" />
</div>
<div>
<label>Website</label>
<input type="text" name="website" value="" placeholder="Website" />
</div>
<div>
<label>Bio</label>
<textarea name="bio" placeholder="Bio"></textarea>
</div>
<div>
<input type="submit" value="Submit Form" />
</div>
</form>
</body>
</html>
Now we will explain each function and code.
$errors = [];
$name = $username = $password = $email = $mobile = $website = $bio = '';
$gender = 0;
$file_uploaded = false;
First, we define the variables of the input values and set them with initial values.
function is_name()
This function checks whether the input value is a name or not. The name can only contain English lowercase letters “a” to “z” and uppercase letters “A” to “Z”, as well as spaces.
function is_gender()
In this example, we only accept male and female genders. Therefore, we check whether the input value is male or female.
function is_username()
In our example, the username cannot be more than 50 characters, so we first check the length of the input value. Then we use Regex to check if the username is correct or not. The username can contain lowercase and uppercase letters a to z, numbers, (_), and (.).
function is_password()
The password must contain at least 8 characters, including at least one lowercase letter, one uppercase letter, one digit, and one of the characters !@#$%^&*(){}[].
function is_email()
This function is used to validate emails.
function is_mobile()
This function checks whether the entered mobile number is an American mobile number or not. The number can be entered without a prefix or with one of the prefixes 1, +1, 01, or 001. Spaces inside the number are removed before the check.
By adapting the pattern, the same approach can be used to check landline numbers as well as numbers from other countries.
function is_website()
With this function, we determine whether the entered value is a website or not.
PHP validation codes
First of all, we need to check whether the called method is POST or not. For this, we use the following command.
if ($_SERVER['REQUEST_METHOD'] == 'POST') {}
This page consists of two parts, the form and the PHP code. The validation code should only run after the form has been filled in and submitted. On the first load of the page no values have been sent yet, so the validation part is wrapped in the condition above.
Form data validation
In this part, the user fills the fields of the form and sends it to the server. Some of these fields must have values, such as name and mobile number, and some are optional. For example, consider the name validation.
if (isset($_POST['name'])) {
if (is_name($_POST['name'])) {
$name = $_POST['name'];
} else {
$errors[] = 'Name is not correct.';
}
} else {
$errors[] = 'Name is not set.';
}
First, we check whether the name field is set or not. If it is not set, we add an error to the error list, because the name is a required field. Then, using the is_name() function, we check whether the entered value is a name or not. If it is a name, we store it in the $name variable; otherwise, we add an error to the list of errors.
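The same "check isset(), then validate, then store or record an error" pattern repeats for every required field, so it can be expressed once as a loop over a table of rules. The following is only a sketch of that idea: hs_validate_required() is a hypothetical helper, and the two closures stand in for validators such as is_name() and is_username().

```php
<?php
// Hypothetical helper: apply the "isset, then validate" pattern to many fields.
// $rules maps a field name to [label, validator callable].
function hs_validate_required(array $input, array $rules): array
{
    $values = [];
    $errors = [];
    foreach ($rules as $field => [$label, $validator]) {
        if (!isset($input[$field]) || !is_string($input[$field])) {
            $errors[] = $label . ' is not set.';
        } elseif (!$validator($input[$field])) {
            $errors[] = $label . ' is not correct.';
        } else {
            $values[$field] = $input[$field];
        }
    }
    return [$values, $errors];
}

// Example rules (assumptions for this sketch).
$rules = [
    'name'     => ['Name', fn($v) => preg_match('/^[a-zA-Z ]+$/D', trim($v)) === 1],
    'username' => ['Username', fn($v) => preg_match('/^[a-zA-Z0-9_.]{1,50}$/D', $v) === 1],
];

// The array literal stands in for $_POST.
[$values, $errors] = hs_validate_required(
    ['name' => 'Ann Lee', 'username' => 'bad name!'],
    $rules
);
print_r($errors); // one entry: "Username is not correct."
```

The error messages follow the wording used in the form example, so the helper could replace the repeated if/else blocks without changing what the user sees.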
In the gender section, the value is sent to the server by a radio button. If the user selects neither option, the browser does not send the gender field at all, so we must also check whether it is set.
We also have restrictions on uploaded files. First, we check whether a file has been uploaded or not (uploading the file is optional). Then we check if it has a name. Then we check that its size is less than 500KB and that its extension is one of the allowed image formats.
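The file checks described above can be collected in one function that receives a single entry of $_FILES. This is a sketch: hs_validate_image() is a hypothetical name, and the array passed in the usage example imitates the structure PHP builds for an uploaded file.

```php
<?php
// Hypothetical helper: validate one entry of $_FILES; returns error messages.
function hs_validate_image(array $file, int $maxBytes = 512000): array
{
    // PHP reports upload problems in the "error" key.
    if (($file['error'] ?? UPLOAD_ERR_NO_FILE) !== UPLOAD_ERR_OK) {
        return ['File upload failed.'];
    }

    $errors = [];
    if ($file['size'] >= $maxBytes) {
        $errors[] = 'Uploaded file is more than 500KB.';
    }

    // pathinfo() extracts the extension; compare it in lowercase.
    $extension = strtolower(pathinfo($file['name'], PATHINFO_EXTENSION));
    if (!in_array($extension, ['jpg', 'jpeg', 'png', 'gif'], true)) {
        $errors[] = 'Uploaded file is not image format (JPG, PNG, JPEG, GIF).';
    }

    return $errors;
}

// An empty array means the file passed all checks.
print_r(hs_validate_image(['name' => 'photo.PNG', 'size' => 1000, 'error' => UPLOAD_ERR_OK]));
```

The file name and extension are supplied by the client, so the extension alone does not prove that the file is an image. Inspecting the uploaded content, for example with getimagesize(), is a stronger check.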
For values that are optional, such as biography, if they are not set, there is no need to log an error.
if (count($errors) === 0) {
if (
!empty($name) &&
$gender !== 0 &&
!empty($username) &&
!empty($password) &&
!empty($email) &&
!empty($mobile)
) {
//name, gender, username, password, email and mobile are required
/*
Your code goes here
*/
if ($file_uploaded && !empty($website) && !empty($bio)) {
//profile picture, website and bio are optional
/*
Your code goes here
*/
}
}
}
In this part, we first check that there are no errors. Then we check the mandatory items (name, gender, username, password, email, and mobile). If there is no error and the mandatory values are entered correctly, we can perform the registration process in this section. This is where the entered information would be stored in the database, which we have left out because the purpose of this tutorial is to validate data in PHP and not to work with the database.
HTML side
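The main point to consider in the HTML section is the action attribute of the form. Because the form contains a file field, it also needs enctype="multipart/form-data" so that the browser sends the file.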
<form method="post" enctype="multipart/form-data" action="<?php echo htmlspecialchars(
$_SERVER['PHP_SELF']
); ?>">
The value placed in the action attribute must be sanitized because, even though it is generated by PHP code, the user can change it by manipulating the path of the URL they request.
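To illustrate why this matters, the sketch below builds the form tag from a manipulated path, once without and once with htmlspecialchars(). The value of $phpSelf is a made-up example of what $_SERVER['PHP_SELF'] could contain; nothing here reads the real request.

```php
<?php
// A request path an attacker might craft (hypothetical example value),
// as it could appear in $_SERVER['PHP_SELF'].
$phpSelf = '/form.php/"><script>alert(1)</script>';

// Unescaped, the quote closes the action attribute early and the
// script tag becomes part of the page.
$unsafe = '<form method="post" action="' . $phpSelf . '">';

// Escaped, the payload stays inside the attribute value as plain text.
$safe = '<form method="post" action="'
    . htmlspecialchars($phpSelf, ENT_QUOTES, 'UTF-8') . '">';

echo $safe, PHP_EOL;
// <form method="post" action="/form.php/&quot;&gt;&lt;script&gt;alert(1)&lt;/script&gt;">
```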
If there are errors, they are displayed at the top of the form. On the first load there are no errors because the validation code does not run. After the form is submitted, any errors appear in the errors section above the form. The list of errors is printed by a PHP loop.
<?php foreach ($errors as $error): ?>
<div><?php echo $error; ?></div>
<?php endforeach; ?>
Conclusion
Validation and sanitization in PHP help prevent SQL injection and XSS attacks, but only if you, as the programmer, apply them consistently; they are not a complete defense on their own. In this article, we tried to show how to do this as far as possible, but to increase the security of your code, you need experience and should study other projects in this field.