Security matters for every website on the Internet. Nearly 30% of websites are built with the WordPress CMS. WordPress core itself is reasonably secure, but once plugins are involved, validation and sanitization become essential.
WordPress provides functions to validate and sanitize input and output data. The key rule is to never trust user data: validate and sanitize it everywhere, both when storing it and when displaying it. A single bug in one plugin can compromise the whole site, and even the server.
Differences Between Validation and Sanitization
Validation and sanitization are related, but they are not the same thing.
Validation checks whether input data is in the expected format. For example, if we expect an email address from the user, we check that what was submitted really is an email address. A user can type malicious code into the email field, and it is the developer’s job to reject such input through validation.
Sanitization (or escaping, on output) applies filters to the data to make it safe. For example, the data may contain an SQL injection attempt; sanitization is responsible for detecting and filtering it so that it is safe for the database.
Here, “user” means whoever sends data to our website. Sometimes it really is a person filling out a form, and sometimes it is an API or a web service. Whoever the sender is, validate and sanitize the data.
Even admins are users, and users will enter incorrect data, on purpose or by accident. It’s your job to protect them from themselves.
Rules of Sanitizing and Validating in WordPress
Sanitization and validation in WordPress may differ in detail from other CMSes, but the following rules apply generally.
1. Trust Nobody
The idea is that you should not assume that any data entered by the user is safe. Nor should you assume that the data you’ve retrieved from the database is safe.
2. Validate Input, Escape Output
Validate your data as soon as you receive it from the user. Sanitize (or escape) the data when you want to show it.
3. Use WordPress Validator and Sanitizer
WordPress has its own functions for this job. PHP’s filter functions work too, but the WordPress functions cover many more cases and are easier to use.
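As a rough illustration of the "validate input, escape output" rule, here is a minimal sketch in plain PHP so that it runs outside WordPress. The helper names demo_validate_email and demo_escape_html are hypothetical; inside WordPress, is_email and esc_html would play roughly these roles.

```php
<?php
// Hypothetical helpers for illustration only; they are not part of WordPress.

// Validate on input: reject anything that is not a well-formed email address.
function demo_validate_email(string $raw): ?string
{
    $email = trim($raw);
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false ? $email : null;
}

// Escape on output: encode HTML special characters just before printing.
function demo_escape_html(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

$ok  = demo_validate_email(' admin@example.com ');       // accepted after trimming
$bad = demo_validate_email('<script>alert(1)</script>'); // rejected (null)
$out = demo_escape_html('<b>Hi</b> & "bye"');            // safe to echo into HTML
```

The point of the split is that rejection happens once, when the data arrives, while escaping happens every time the data is printed, in the form that the output context requires.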
Data Validation in WordPress
First, let’s talk about validation. As described above, validation checks whether data is in the expected format.
1. PHP Built-in Functions
- isset: Determine if a variable is declared and is different than NULL.
if (isset($var)) {
echo "This var is set so I will print.";
}
- empty: Determine whether a variable is empty
if (empty($var)) {
echo '$var is either 0, empty, or not set at all';
}
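- strlen and mb_strlen: Get the length of a string, for example to check that it has the expected number of characters.
echo mb_strlen($string, '8bit'); // length in bytes
echo strlen($str); // length in bytes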
- preg_match: Perform a regular expression match
preg_match('/(foo)(bar)(baz)/', 'foobarbaz', $matches, PREG_OFFSET_CAPTURE);
print_r($matches);
- strpos: Find the position of the first occurrence of a substring in a string
$mystring = 'abc';
$findme = 'a';
$pos = strpos($mystring, $findme);
- count: Count all elements in an array, or something in an object
- in_array: Checks if a value exists in an array
$os = array("Mac", "NT", "Irix", "Linux");
if (in_array("Irix", $os)) {
echo "Got Irix";
}
- is_int and is_float: Check whether a value is an integer or a float. Usually, it’s sufficient to simply cast the data to a number with intval or floatval. Related functions include is_bool, is_numeric, is_null, is_string, and is_object.
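The built-in checks above have a few well-known pitfalls. The following sketch (plain PHP, with made-up sample values) shows three of them: strpos returning position 0, in_array without strict comparison, and casting that hides invalid input.

```php
<?php
// strpos() returns 0 when the match is at the start, so compare with false strictly.
$found_at_start = strpos('abc', 'a') !== false; // true
$loose_check    = (bool) strpos('abc', 'a');    // false: position 0 is falsy

// in_array() in strict mode compares types as well as values.
$allowed_sizes = array('small', 'medium', 'large');
$size_ok  = in_array('medium', $allowed_sizes, true);
$size_bad = in_array('huge', $allowed_sizes, true);

// FILTER_VALIDATE_INT rejects non-integer strings instead of silently casting them.
$options     = array('options' => array('min_range' => 1, 'max_range' => 100));
$qty_valid   = filter_var('42', FILTER_VALIDATE_INT, $options);    // int 42
$qty_invalid = filter_var('42abc', FILTER_VALIDATE_INT, $options); // false
$qty_cast    = intval('42abc');                                    // 42: the cast hides bad input
```

Casting with intval is fine when any number is acceptable; when bad input should be rejected rather than repaired, a validating check such as filter_var is the safer choice.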
2. WordPress Functions
- is_email: Verifies that an email is valid.
if ( is_email( 'email@domain.com' ) ) {
echo 'email address is valid.';
}
- term_exists: Determines whether a taxonomy term exists.
$term = term_exists( 'Uncategorized', 'category' );
if ( $term !== 0 && $term !== null ) {
echo __( "'Uncategorized' category exists!", "textdomain" );
}
- username_exists: Determines whether the given username exists.
$username = sanitize_user( $_POST['username'] );
if ( username_exists( $username ) ) {
echo "Username In Use!";
} else {
echo "Username Not In Use!";
}
- validate_file: Validates a file name and path against an allowed set of rules.
$path = 'uploads/2012/12/my_image.jpg';
echo validate_file( $path ); // Prints 0 (valid path)
$path = '../../wp-content/uploads/2012/12/my_image.jpg';
echo validate_file( $path ); // Prints 1 (invalid path)
- absint: Convert a value to a non-negative integer.
echo absint(20.33); // 20
echo absint(-20.33); // 20
echo absint(false); // 0
echo absint(true); // 1
echo absint(array(10,20,30)); // 1
echo absint(NULL); // 0
echo absint( 19.99 * 100 ); // Result is 1998, when the expected result is 1999
- zeroise: Add leading zeros when necessary.
echo zeroise(70,4); // Prints 0070
- wp_kses: Allow only some HTML tags in your data. It accepts three arguments:
- content: (string) Content to filter through kses
- allowed_html: An array where each key is an allowed HTML element and the value is an array of allowed attributes for that element
- allowed_protocols: Optional. Allowed protocols in links (for example http, mailto, feed, etc.)
$content = "<em>Click</em> <a title='click' href='http://honarsystems.com'>here</a> to visit <strong> Honar Systems </strong>";
echo wp_kses( $content, array(
'strong' => array(),
'a' => array( 'href' => array() ), // attribute names must be array keys
) );
// Prints the HTML "Click <a href='http://honarsystems.com'>here</a> to visit <strong> Honar Systems </strong>"
- wp_kses_post: Sanitizes content for allowed HTML tags for post content.
- wp_kses_data: Sanitize content with allowed HTML KSES rules.
$s = '<div id="1st"><strong><i>Foo</i></strong><script>alert("Bar");</script></div>';
$x = wp_kses_data($s);
// Now, $x is <strong><i>Foo</i></strong>alert("Bar");
- wp_kses_allowed_html: Returns an array of allowed HTML tags and attributes for a given context.
// strips all html (empty array)
$allowed_html = wp_kses_allowed_html( 'strip' );
// allows most inline elements and strips all block-level elements except blockquote
$allowed_html = wp_kses_allowed_html( 'data' );
// very permissive: allows pretty much all HTML to pass - same as what's normally applied to the_content by default
$allowed_html = wp_kses_allowed_html( 'post' );
// returns the list of allowed HTML entity names
$allowed_html = wp_kses_allowed_html( 'entities' );
- balanceTags: Balances tags if forced to, or if the ‘use_balanceTags’ option is set to true.
<?php
$html = '<ul>
<li>this
<li>is
<li>a
<li>list
</ul>';
echo balanceTags($html, true);
?>
Will output this HTML:
<ul>
<li>this
</li><li>is
</li><li>a
</li><li>list
</li></ul>
- force_balance_tags: Balances tags of string using a modified stack.
<?php $balanced_text = force_balance_tags( $text ); ?>
This is useful for excerpts, which should not break when the HTML after the more tag is cut off. For example, given this content:
<div><b>This is an excerpt. <!--more--> and this is more text... </b></div>
the truncated excerpt
<div><b>This is an excerpt.
should be changed to:
<div><b>This is an excerpt. </b></div>
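To show how several of these validation functions can work together, here is a minimal sketch of a form validator. It assumes WordPress is loaded; the helper name demo_validate_signup, the field names email and age, and the age limits are all hypothetical.

```php
<?php
// Hypothetical helper; the field names and the age range are assumptions.
// Requires WordPress (wp_unslash, sanitize_email, is_email, absint).
function demo_validate_signup( array $input ) {
    $email = isset( $input['email'] ) ? sanitize_email( wp_unslash( $input['email'] ) ) : '';
    $age   = isset( $input['age'] ) ? absint( $input['age'] ) : 0;

    // Reject the submission instead of trying to repair clearly invalid data.
    if ( ! is_email( $email ) ) {
        return null;
    }
    if ( $age < 13 || $age > 120 ) {
        return null;
    }

    return array( 'email' => $email, 'age' => $age );
}
```

A caller would pass the submitted fields, for example demo_validate_signup( $_POST ), and treat a null result as a failed validation.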
Data Sanitization in WordPress
Sanitization is the process of cleaning or filtering your input data. Whether the data comes from a user, an API, or a web service, sanitize it whenever you don’t know what to expect.
- esc_html: Escaping for HTML blocks.
echo esc_html($title);
- esc_attr: Escaping for HTML attributes.
<input type="text" name="myInput" value="<?php echo esc_attr($value);?>"/>
- esc_js: Escapes single quotes, converts the characters " < > & to HTML entities, and fixes line endings.
<a href="#" onclick="<?php echo esc_js( $custom_js ); ?>">Click me</a>
- esc_url: Checks and cleans a URL.
<a href="<?php echo esc_url( home_url( '/' ) ); ?>">Home</a>
- esc_url_raw: Performs esc_url() for database usage.
- urlencode_deep: Navigates through an array, object, or scalar, and encodes the values to be used in a URL.
- esc_textarea: Escaping for textarea values.
<textarea><?php echo esc_textarea( $text ); ?></textarea>
- sanitize_email: Strips out all characters that are not allowable in an email.
$sanitized_email = sanitize_email(' admin@example.com! ');
echo $sanitized_email; // will output: 'admin@example.com'
- sanitize_file_name: Sanitizes a filename, replacing whitespace with dashes.
- sanitize_html_class: Sanitizes an HTML classname to ensure it only contains valid characters.
// If you want to explicitly style a post, you can use the sanitized version of the post title as a class
$post_class = sanitize_html_class( $post->post_title );
echo '<div class="' . $post_class . '">';
- sanitize_key: Sanitizes a string key.
- sanitize_meta: Sanitizes meta value.
- sanitize_term: Sanitize Term all fields.
- sanitize_term_field: Cleanse the field value in the term based on the context.
- sanitize_mime_type: Sanitize a mime type
sanitize_mime_type( $mime_type );
- sanitize_option: Sanitises various option values based on the nature of the option.
- sanitize_sql_orderby: Ensures a string is a valid SQL ‘order by’ clause.
- sanitize_text_field: Sanitizes a string from user input or from the database.
$str = "<h2>Title</h2>";
echo sanitize_text_field( $str ); // prints "Title" with the HTML tags stripped
- sanitize_textarea_field: Sanitizes a multiline string from user input or from the database.
- sanitize_title: Sanitizes a string into a slug, which can be used in URLs or HTML attributes.
$new_url = sanitize_title('This Long Title is what My Post or Page might be');
echo $new_url;
It returns a formatted slug; the output would be:
this-long-title-is-what-my-post-or-page-might-be
- sanitize_title_for_query: Sanitizes a title with the ‘query’ context.
- sanitize_title_with_dashes: Sanitizes a title, replacing whitespace and a few other characters with dashes.
echo sanitize_title_with_dashes("I'm in LOVE with WordPress!!!1");
// this will print: im-in-love-with-wordpress1
- sanitize_user: Sanitizes a username, stripping out unsafe characters.
$username= sanitize_user(' marmelada ');
echo $username; // will output: 'marmelada'
- sanitize_hex_color: Sanitizes a hex color.
$wp_customize->add_setting( 'accent_color', array(
'default' => '#f72525',
'sanitize_callback' => 'sanitize_hex_color',
) );
- sanitize_hex_color_no_hash: Sanitizes a hex color without a hash. Use sanitize_hex_color() when possible.
- wp_filter_post_kses: Sanitizes content for allowed HTML tags for post content.
- wp_filter_nohtml_kses: Strips all HTML from a text string.
- wp_rel_nofollow: Adds rel="nofollow" string to all HTML A elements in the content.
- tag_escape: Escape an HTML tag name.
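Each escaping function in the list above belongs to one output context. The sketch below builds a single link and uses a different function for the URL, the attribute, and the text. It assumes WordPress is loaded, and demo_render_profile_link is a hypothetical helper name.

```php
<?php
// Hypothetical template helper; requires WordPress (esc_url, esc_attr, esc_html).
function demo_render_profile_link( $url, $title, $label ) {
    return sprintf(
        '<a href="%s" title="%s">%s</a>',
        esc_url( $url ),    // URL context
        esc_attr( $title ), // HTML attribute context
        esc_html( $label )  // HTML text context
    );
}
```

Escaping this late, at the point where the markup is assembled, makes it easy to see that every dynamic value has been handled.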
Escaping with Localization
- _e: Display translated text.
_e( 'Some text to translate and display.', 'textdomain' );
- __: Retrieve the translation of $text.
$translated = __( 'Hello World!', 'mytextdomain' );
// these two lines are equivalent
echo __( 'translate text', 'textdomain' );
_e( 'translate text', 'textdomain' );
- _x: Retrieve translated string with gettext context.
$translated = _x( 'Read', 'past participle: books I have read', 'textdomain' );
- esc_html__: Retrieve the translation of $text and escapes it for safe use in HTML output.
<h1><?php echo esc_html__( 'Title', 'text-domain' ); ?></h1>
- esc_html_e: Display translated text that has been escaped for safe use in HTML output.
<h1><?php esc_html_e( 'Title', 'text-domain' )?></h1>
- esc_html_x: Translate string with gettext context, and escapes it for safe use in HTML output.
- esc_attr__: Retrieve the translation of $text and escapes it for safe use in an attribute.
- esc_attr_e: Display translated text that has been escaped for safe use in an attribute.
<input title="<?php esc_attr_e( 'Read More', 'your_text_domain' ) ?>" type="submit" value="submit" />
// Returns <input title="Read More" type="submit" value="submit" />
- esc_attr_x: Translate string with gettext context, and escapes it for safe use in an attribute.
The WordPress documentation states that you should not use
echo __( 'translate text', 'textdomain' );
Instead, you should use the following function.
_e( 'translate text', 'textdomain' );
- antispambot: Converts email addresses characters to HTML entities to block spam bots.
$email = "johndoe@example.com";
$email = sanitize_email($email);
echo '<a href="mailto:'.antispambot($email,1).'" title="Click to e-mail me" >'.antispambot($email).' </a>';
- add_query_arg: Retrieves a modified URL query string.
// Assuming the current URL is http://blog.example.com/client/?s=word
// This would output '/client/?s=word&foo=bar&baz=tiny'
$arr_params = array( 'foo' => 'bar', 'baz' => 'tiny' );
echo esc_url( add_query_arg( $arr_params ) );
- remove_query_arg: Removes an item or items from a query string.
// Assuming the current URL is http://www.example.com/client/?details=value1&type=value2&date=value3
// This would output '/client/?type=value2&date=value3'
echo esc_url( remove_query_arg( 'details' ) );
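When a translated string contains a placeholder, both the translation and the inserted value need escaping. Here is a minimal sketch, assuming WordPress is loaded; demo_greeting, the greeting text, and the 'textdomain' text domain are placeholders chosen for illustration.

```php
<?php
// Hypothetical helper; 'textdomain' and the greeting text are placeholders.
// Requires WordPress (esc_html__, esc_html).
function demo_greeting( $display_name ) {
    return sprintf(
        /* translators: %s: the user's display name */
        esc_html__( 'Hello, %s!', 'textdomain' ),
        esc_html( $display_name )
    );
}
```

The translated string is escaped by esc_html__ and the dynamic value by esc_html, so neither a translation file nor user data can inject markup.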
Database Escaping
WordPress has functions for interacting with the database. Many of them apply their own sanitization, but some do not, and in those cases it is our duty to sanitize the input.
- esc_sql: Escapes data for use in a MySQL query.
$name = esc_sql( $name );
$status = esc_sql( $status );
$wpdb->get_var( "SELECT something FROM table WHERE foo = '$name' and status = '$status'" );
- wpdb::esc_like: First half of escaping for LIKE special characters % and _ before preparing for MySQL.
$wild = '%';
$find = 'only 43% of planets';
$like = $wild . $wpdb->esc_like( $find ) . $wild;
$sql = $wpdb->prepare( "SELECT * FROM $wpdb->posts WHERE post_content LIKE %s", $like );
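The two techniques above can be combined in one query. The following is a sketch rather than a definitive implementation: it assumes WordPress is loaded with its global $wpdb object, and demo_find_post_ids is a hypothetical helper name.

```php
<?php
// Hypothetical helper; requires WordPress and its global $wpdb object.
function demo_find_post_ids( $status, $search ) {
    global $wpdb;

    // esc_like() escapes % and _ in the search term; the wildcards are added afterwards.
    $like = '%' . $wpdb->esc_like( $search ) . '%';

    // prepare() quotes and escapes each value bound to a %s placeholder.
    $sql = $wpdb->prepare(
        "SELECT ID FROM {$wpdb->posts} WHERE post_status = %s AND post_title LIKE %s",
        $status,
        $like
    );

    return $wpdb->get_col( $sql );
}
```

Binding values through placeholders keeps the quoting in one place, which is usually less error-prone than calling esc_sql on each variable and adding the quotes by hand.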
Redirect
- wp_redirect: Redirects to another page.
<?php wp_redirect( home_url() ); exit; ?>
Redirects can also be external, and/or use a “Moved Permanently” code:
<?php wp_redirect( 'http://www.example.com', 301 ); exit; ?>
- wp_safe_redirect: Performs a safe (local) redirect, using wp_redirect().
if ( wp_safe_redirect( $url ) ) {
exit;
}
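When the redirect target comes from the request, it should be checked before use. The sketch below assumes WordPress is loaded; the function name demo_redirect_after_submit and the redirect_to parameter name are hypothetical.

```php
<?php
// Hypothetical handler; the 'redirect_to' parameter name is an assumption.
// Requires WordPress (wp_unslash, esc_url_raw, wp_validate_redirect, wp_safe_redirect, home_url).
function demo_redirect_after_submit() {
    $requested = isset( $_GET['redirect_to'] ) ? esc_url_raw( wp_unslash( $_GET['redirect_to'] ) ) : '';

    // Fall back to the home page when the target host is not allowed.
    $target = wp_validate_redirect( $requested, home_url( '/' ) );

    if ( wp_safe_redirect( $target ) ) {
        exit;
    }
}
```

Using wp_safe_redirect here, rather than wp_redirect, keeps a crafted link from sending visitors to an arbitrary external site.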
Final Words About Sanitization and Validation in WordPress
Sanitization and validation in WordPress is a security matter that every developer should be familiar with. If you develop a custom plugin or theme, it is important to make sure your code is safe.
One bug will destroy everything!