Fix Code Error

How to validate an email address using a regular expression?

March 13, 2021 by Code Error
Posted By: acrosman

Over the years I have slowly developed a regular expression that validates MOST email addresses correctly, assuming they don’t use an IP address as the server part.

I use it in several PHP programs, and it works most of the time. However, from time to time I get contacted by someone that is having trouble with a site that uses it, and I end up having to make some adjustment (most recently I realized that I wasn’t allowing 4-character TLDs).

What is the best regular expression you have or have seen for validating emails?

I’ve seen several solutions that combine several shorter expressions, but I’d rather have one long complex expression in a simple function than several short expressions in a more complex function.

Solution

The fully RFC 822 compliant regex is inefficient and obscure because of its length. Fortunately, RFC 822 was superseded twice and the current specification for email addresses is RFC 5322. RFC 5322 leads to a regex that can be understood if studied for a few minutes and is efficient enough for actual use.

One RFC 5322 compliant regex can be found at the top of the page at http://emailregex.com/, but it uses the IP address pattern that is floating around the internet, which has a bug: it allows 00 for any of the unsigned byte decimal values in a dot-delimited address, which is illegal. The rest of it appears to be consistent with the RFC 5322 grammar and passes several tests using grep -Po, including cases with domain names, IP addresses, bad addresses, and account names with and without quotes.
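The 00 bug is easiest to see by testing a single octet in isolation. The sketch below assumes the circulating octet pattern is `25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?` (an assumption for illustration, not quoted from this answer) and compares it with the octet sub-pattern used in the corrected regex. The names `BUGGY_OCTET`, `FIXED_OCTET` and `accepts` are hypothetical.

```python
import re

# Octet pattern as commonly circulated (assumed here for illustration).
BUGGY_OCTET = re.compile(r"25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?")

# Octet sub-pattern from the corrected regex: no leading-zero forms.
FIXED_OCTET = re.compile(r"2(?:5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]")


def accepts(pattern, text):
    """Return True if the whole string is matched by the octet pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for octet in ("0", "10", "255", "00", "007", "256"):
        print(octet, accepts(BUGGY_OCTET, octet), accepts(FIXED_OCTET, octet))
```

Both patterns accept 0 through 255 and reject 256; only the first one also accepts forms such as 00 and 007.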

Correcting the 00 bug in the IP pattern, we obtain a working and fairly fast regex. (Scrape the rendered version, not the markdown, for actual code.)

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]))\.){3}(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
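As a rough sketch of how this pattern might be used from Python, the code below rebuilds it from named pieces with the backslash escapes written out (`\.`, `\[`, `\]`, `\xNN`). The piece names, the `is_valid_syntax` helper and the use of `re.IGNORECASE` are assumptions made for this illustration. The pattern itself contains only lowercase letters, so how to treat uppercase input is a choice the caller has to make.

```python
import re

# Unquoted local part (dot-separated atoms) or a quoted string.
LOCAL_PART = (
    r"[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
    r'|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]'
    r'|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*"'
)

# Dot-separated host labels; at least one dot is required.
HOSTNAME = r"(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"

# One decimal byte (0-255) without leading zeros, so "00" is rejected.
OCTET = r"(?:2(?:5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])"

# Bracketed address literal, e.g. [192.168.0.1].
ADDRESS_LITERAL = (
    r"\[(?:" + OCTET + r"\.){3}(?:" + OCTET
    + r"|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]"
    r"|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\]"
)

EMAIL_RE = re.compile(
    r"(?:" + LOCAL_PART + r")@(?:" + HOSTNAME + r"|" + ADDRESS_LITERAL + r")",
    re.IGNORECASE,  # assumption: treat uppercase letters like lowercase ones
)


def is_valid_syntax(address):
    """Return True if the entire string matches the pattern."""
    return EMAIL_RE.fullmatch(address) is not None


if __name__ == "__main__":
    for candidate in (
        "user@example.com",
        '"john.doe"@example.com',
        "user@[192.168.0.1]",
        "user@[192.168.00.1]",
        "a..b@example.com",
    ):
        print(candidate, is_valid_syntax(candidate))
```

Using `fullmatch` rather than `search` matters here: the pattern has no anchors of its own, so an unanchored search would accept any string that merely contains a valid address.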

Here is a diagram of the finite state machine for the above regexp, which is clearer than the regexp itself:
[Image: finite state machine diagram of the regexp]

The more sophisticated patterns in Perl and PCRE (regex library used e.g. in PHP) can correctly parse RFC 5322 without a hitch. Python and C# can do that too, but they use a different syntax from those first two. However, if you are forced to use one of the many less powerful pattern-matching languages, then it’s best to use a real parser.

It’s also important to understand that validating it per the RFC tells you absolutely nothing about whether that address actually exists at the supplied domain, or whether the person entering the address is its true owner. People sign others up to mailing lists this way all the time. Fixing that requires a fancier kind of validation that involves sending that address a message that includes a confirmation token meant to be entered on the same web page as was the address.

Confirmation tokens are the only way to know you got the address of the person entering it. This is why most mailing lists now use that mechanism to confirm sign-ups. After all, anybody can put down [email protected], and that will even parse as legal, but it isn’t likely to be the person at the other end.

For PHP, you should not use the pattern given in Validate an E-Mail Address with PHP, the Right Way from which I quote:

There is some danger that common usage and widespread sloppy coding will establish a de facto standard for e-mail addresses that is more restrictive than the recorded formal standard.

That is no better than all the other non-RFC patterns. It isn’t smart enough to handle even RFC 822, let alone RFC 5322. This one, however, is.

If you want to get fancy and pedantic, implement a complete state engine. A regular expression can only act as a rudimentary filter. The problem with regular expressions is that telling someone that their perfectly valid e-mail address is invalid (a false positive) because your regular expression can’t handle it is just rude and impolite from the user’s perspective. A state engine for the purpose can both validate and even correct e-mail addresses that would otherwise be considered invalid as it disassembles the e-mail address according to each RFC. This allows for a potentially more pleasing experience, like

The specified e-mail address ‘[email protected],com’ is invalid. Did you mean ‘[email protected]’?
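The sketch below is far simpler than a full state engine: it is a stand-in heuristic that handles only the comma-for-dot typo from the message above. It assumes a basic host-label check, and the names `suggest_correction` and `_domain_ok` and the sample addresses are hypothetical.

```python
import re

# A single host label: letters, digits and inner hyphens.
_LABEL = re.compile(r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?", re.IGNORECASE)


def _domain_ok(domain):
    """True if the domain is two or more valid labels separated by dots."""
    labels = domain.split(".")
    return len(labels) >= 2 and all(_LABEL.fullmatch(label) for label in labels)


def suggest_correction(address):
    """Return a corrected address for a comma-for-dot typo, else None."""
    local, at, domain = address.rpartition("@")
    if not at or not local:
        return None  # no "@" or empty local part: nothing sensible to suggest
    if _domain_ok(domain):
        return None  # the domain already looks fine
    candidate = domain.replace(",", ".")
    if candidate != domain and _domain_ok(candidate):
        return f"{local}@{candidate}"
    return None


if __name__ == "__main__":
    bad = "someone@example,com"
    fixed = suggest_correction(bad)
    if fixed is not None:
        print(f"The specified e-mail address '{bad}' is invalid. "
              f"Did you mean '{fixed}'?")
```

A complete state engine would walk the address character by character according to the RFC grammar and could offer many more kinds of correction; this only illustrates the shape of the user-facing result.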

See also Validating Email Addresses, including the comments. Or Comparing E-mail Address Validating Regular Expressions.

Answered By: bortzmeyer

Disclaimer: This content is shared under creative common license cc-by-sa 3.0. It is generated from StackExchange Website Network.