What does "Domain contains control or whitespace" mean

Updated: 06.07.2024

Email address validator in JavaScript

This validator will tell with high accuracy whether a given string could possibly be a standard-compliant email address. This is intended to provide validation as strict as possible without failing any valid email address. It was written as a fun challenge.

By default, the validation conforms to RFC 5322. Options can be set to change the validation behavior.

You likely do not need this amount of validation. Input validation is typically very simple, such as /.+@.+/.test(addr). If that basic validation passes, the address is then fully verified by sending a message to it and waiting for the user to respond with some action, which is the only way to truly know whether an address is valid.
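As a minimal sketch of that simple pre-check (the function name `looksLikeEmail` is illustrative and not part of this validator):

```javascript
// Cheap syntactic pre-check: at least one character, an "@", then at
// least one more character. This is deliberately permissive; the real
// verification happens by sending a confirmation message to the address.
function looksLikeEmail(addr) {
  return typeof addr === "string" && /.+@.+/.test(addr);
}

console.log(looksLikeEmail("user@example.com")); // true
console.log(looksLikeEmail("no-at-sign"));       // false
```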

address The string to check for validity.

options Optional. An object which contains options that affect the validation performed. If an object is supplied and any options are missing, the missing options will be given default values. If nothing is supplied, then all options will be given default values.

Option Default Effect
returnRegex false If false, evaluates the address parameter for validity as an email address and returns true or false. If true, ignores the address parameter and returns a regular expression that can be used to check strings for validity as email addresses. Because the final regular expression is built from multiple small parts each time the function is called, saving the returned regular expression may be more efficient when testing multiple addresses. If true, implies useRegexOnly is also true.
useRegexOnly false If true, don't do any validation that can't be accomplished using only regular expression matching. In particular, nested comments cannot be properly validated with regular expressions.
allowBareEscapes false If and only if true, a backslash character can be used to escape normally illegal characters in an unquoted local address label. Backslash escapes can always be used in comments, quoted strings, and bracketed domain literals, regardless of this option.
allowComments true Allow comments in an address if true, disallow if false.
allowLocalAddresses 0 If 0, every address must have a local part, an "@", and a domain part. If positive, then addresses with only a local part (no "@" and no domain part) are allowed in addition to full addresses. If negative, then only addresses with only a local part are allowed, and full addresses are not allowed. The comparisons are not strict, so anything that compares like 0 or false will be considered 0, and true will be considered positive.
separateLocalLabels true If true, each dot-separated label in the local part of an address is treated as an individual subunit. This allows for each label to be quoted or unquoted, as well as for each label to be preceded or followed by CFWS. If false, the entire local part is treated as a single unit. The entire local part must be either quoted or unquoted, and CFWS cannot be next to a dot in the local part.
separateDomainLabels true If true, each dot-separated label in a hostname domain can be preceded or followed by CFWS. If false, CFWS can only occur before or after the entire domain.
allowObsoleteFoldingWhitespace true If true, Folding Whitespace can contain multiple newlines, each separated by whitespace. If false, Folding Whitespace can contain at most one newline.
allowDomainLiteralEscapes true If true, the text between the brackets in a domain literal can contain any low-ASCII character (including control characters) aside from square brackets and backslash, and it can contain backslash-escaped characters (which provides a way to include square brackets and backslash). If false, the text between the brackets can only be FWS and printing characters excluding the square brackets and backslash, and it cannot contain escaped characters.
allowQuotedControlCharacters true If and only if true, a quoted string within the local part of an address can contain non-printing, non-whitespace ASCII control characters.
allowControlCharactersInComments true If and only if true, a comment can contain non-printing, non-whitespace ASCII control characters.
allowEscapedControlCharacters true If true, an escaped character (the character following the backslash in a quoted pair) can be ANY low-ASCII character, including control characters, null, bare carriage return, and bare linefeed. If false, the escaped character must be either printable or whitespace.
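
The default handling described in the table can be sketched as follows. The defaults are taken from the table; the names `DEFAULT_OPTIONS` and `resolveOptions` are hypothetical and are not taken from the validator's source, so treat this as an illustration of the documented behavior rather than the actual implementation.

```javascript
// Default values as listed in the option table.
const DEFAULT_OPTIONS = {
  returnRegex: false,
  useRegexOnly: false,
  allowBareEscapes: false,
  allowComments: true,
  allowLocalAddresses: 0,
  separateLocalLabels: true,
  separateDomainLabels: true,
  allowObsoleteFoldingWhitespace: true,
  allowDomainLiteralEscapes: true,
  allowQuotedControlCharacters: true,
  allowControlCharactersInComments: true,
  allowEscapedControlCharacters: true,
};

// Hypothetical helper: fill in missing options, then apply the
// implications that the table documents.
function resolveOptions(options) {
  const resolved = Object.assign({}, DEFAULT_OPTIONS, options);

  // returnRegex implies useRegexOnly.
  if (resolved.returnRegex) resolved.useRegexOnly = true;

  // allowLocalAddresses is compared loosely: reduce it to -1, 0, or 1.
  // Values that are not numeric (NaN after conversion) fall back to 0.
  resolved.allowLocalAddresses =
    Math.sign(Number(resolved.allowLocalAddresses)) || 0;

  return resolved;
}

console.log(resolveOptions({ returnRegex: true }).useRegexOnly);           // true
console.log(resolveOptions({ allowLocalAddresses: true }).allowLocalAddresses); // 1
```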

By default, returns a boolean. If the address parameter is a well-formed email address, returns true. Otherwise, returns false.

If the returnRegex option is true, returns a regular expression that can be used to test any string for validity as an email address.

The rules for a valid email address are surprisingly complex and are scattered through multiple RFCs. I consulted several sources for the rules and their interpretations.

  • An email address (in the language of RFC 2822 and RFC 5322, an addr-spec) contains a local part, followed by "@", followed by a domain part. (This behavior can be modified to either allow or require addresses with only a local part using the allowLocalAddresses option)
  • The local part and the domain part are each composed of one or more sections or labels separated by a period.
  • No label can be entirely empty, which means that two periods cannot normally appear consecutively (but see below for escaped characters and quoted strings). This also means that neither the local part nor the domain part will start or end with an unescaped period.
  • Folding Whitespace: Folding whitespace occurs in specific contexts. It consists of any number of space, tab, and/or newline characters, and it always ends with a space or tab. (not fully implemented - some contexts accept any whitespace when they should take folding whitespace)
  • Backslash Escape: A backslash followed by another character forms a "quoted-pair". This has the effect of escaping the second character in the pair, making it legal where it otherwise would be illegal and removing any special meaning it may have. This is not allowed in all contexts.
  • Comment:
    • According to the validator at isemail.info, multiple comments can appear in succession. (I need to review the RFCs to see whether this is correct, but I've allowed this)
    • A comment is surrounded by parentheses.
    • Valid unescaped characters in a comment are folding whitespace, control characters and any printing character, except for backslash and parentheses (but see the next point). This behavior can be changed with the allowControlCharactersInComments option.
    • A comment can contain nested comments, each surrounded by parentheses. The parentheses must properly nest and match. Note: If the option useRegexOnly is true, this cannot be properly validated. In this case, valid comments will be accepted correctly, but some invalid comments will also be accepted.
    • Backslash escapes are allowed inside a comment, and this can be used to insert a parenthesis or backslash.
    • CFWS (Comment and/or Folding Whitespace): a series of one or more comments and/or Folding Whitespace regions.
    • A label in either the local part or the domain part can start and/or end with CFWS.
    • By default, this validator accepts addresses with CFWS. This behavior can be changed with the allowComments option.
    • Since naked backslash and naked double-quote cannot exist within a quoted string, they can be preceded by a backslash to become a quoted-pair. This seems to be the only fully agreed upon non-redundant use of the backslash escape in the non-comment portion of a local part.
    • ANY character can be part of a quoted-pair, regardless of whether it must be quoted. (However, quoting/escaping a character that does not need to be quoted is redundant, unnecessary, needlessly verbose, and wasteful.)
    • Most sources agree that a backslash escape can occur within a quoted string but not in an unquoted label. RFC 3696 disagrees, stating that a quoted-pair (backslash escape) can occur in either type of label. By default, this validator does not allow backslash escapes in unquoted labels. This behavior can be changed using the allowBareEscapes option.
    • The domain part can be either a host name or a domain literal.
    • Domain Literal:
      • A domain literal starts and ends with square brackets.
      • The content between the brackets should be the host's literal location on the network (typically an IP address), but nearly any character is allowed. The only disallowed unescaped characters are square brackets and backslash. Control characters are considered obsolete, and these can be disallowed with the allowDomainLiteralEscapes option.
      • Backslash escapes are allowed and can be used to insert a backslash or square bracket. This behavior can be changed with the allowDomainLiteralEscapes option.
      • A domain literal can be preceded or followed by comments.
      • Each label within a host name can be up to 63 characters long.
      • The entire host name can be up to 255 characters long, but this would not result in a usable email address.
      • Valid characters within a label are alphanumeric characters and the dash (-).
      • A label cannot start or end with a dash.
      • A label CAN contain multiple consecutive dashes, despite what some sources say.
      • Backslash escapes are not allowed (except in comments).
      • Does not handle addresses that contain non-ASCII or high-ASCII characters.
      • Comments are included in the overall length of the address, the length of the local part, and the length of the domain part. They are not included in the length of individual labels in the domain. This is probably not the correct handling.
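
The nested-comment rule above is the part that a regular expression alone cannot check, because matching parentheses requires counting depth. A minimal sketch of such a check, assuming a hypothetical helper named `isBalancedComment` (not the validator's own code) that looks only at parentheses and backslash escapes and does not check which other characters are allowed:

```javascript
// Returns true if `text` is exactly one well-formed comment: it starts
// with "(", ends with the matching ")", nested parentheses balance, and
// a backslash escapes the character that follows it.
function isBalancedComment(text) {
  if (text[0] !== "(") return false;
  let depth = 0;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (ch === "\\") {
      // Quoted-pair: skip the escaped character.
      // A trailing lone backslash has nothing to escape.
      if (i + 1 >= text.length) return false;
      i++;
    } else if (ch === "(") {
      depth++;
    } else if (ch === ")") {
      depth--;
      // Once the outermost comment closes, nothing may follow it.
      if (depth === 0 && i !== text.length - 1) return false;
    }
  }
  return depth === 0;
}

console.log(isBalancedComment("(a(b)c)"));  // true
console.log(isBalancedComment("(a(b)"));    // false
console.log(isBalancedComment("(a\\)b)"));  // true, the inner ")" is escaped
```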

      RFC 5321 places some restrictions on domain literals. Implement those restrictions.

      RFC 5321 also is more strict on the local part of the address than the rules I followed. Backslash escapes are not allowed in unquoted strings (this now matches the default behavior), a quoted string must be the entire local part (available as an option already), and I haven't fully reviewed to see if there are other restrictions. Review the RFC and implement these (possibly as options).

      An email address cannot be used if it is more than 254 characters. Longer addresses can exist, but the SMTP field in which the address belongs requires a string of 256 characters or less, and that string includes a surrounding pair of angle brackets that takes up two of the 256 characters.
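
The host name rules listed earlier (labels of up to 63 alphanumeric or dash characters, no leading or trailing dash, at most 255 characters overall) and the 254-character limit can be sketched as below. The function and constant names are illustrative assumptions, and comments or CFWS are not handled here.

```javascript
const MAX_ADDRESS_LENGTH = 254;  // 256 minus the two angle brackets
const MAX_HOSTNAME_LENGTH = 255;

// One label: 1 to 63 characters, alphanumerics and dashes only,
// never starting or ending with a dash. Consecutive dashes are fine.
const LABEL = /^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$/;

function isValidHostname(host) {
  if (host.length === 0 || host.length > MAX_HOSTNAME_LENGTH) return false;
  // An empty label (two consecutive dots, or a leading or trailing dot)
  // fails the LABEL test because at least one character is required.
  return host.split(".").every((label) => LABEL.test(label));
}

function fitsSmtpPath(address) {
  return address.length <= MAX_ADDRESS_LENGTH;
}

console.log(isValidHostname("a--b.example.com")); // true
console.log(isValidHostname("-a.example.com"));   // false
console.log(isValidHostname("a..example.com"));   // false
```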

      • (On the other hand, this paragraph in RFC 5321 seems to state that backslash escapes can be used outside of quoted strings):

      Note that the backslash, "\", is a quote character, which is used to indicate that the next character is to be used literally (instead of its normal interpretation). For example, "Joe\,Smith" indicates a single nine-character user name string with the comma being the fourth character of that string.
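
The example in that paragraph can be reproduced with a small sketch. The helper name `unescapeQuotedPairs` is illustrative; it simply replaces every backslash-plus-character pair with the character itself.

```javascript
// Replace each quoted-pair (backslash followed by any character)
// with the escaped character alone.
function unescapeQuotedPairs(text) {
  return text.replace(/\\([\s\S])/g, "$1");
}

// In JavaScript source the backslash itself must be written as "\\".
const userName = unescapeQuotedPairs("Joe\\,Smith");

console.log(userName);        // Joe,Smith
console.log(userName.length); // 9
console.log(userName[3]);     // , (the fourth character)
```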

      Review RFC 6531 and related documents to handle high-ASCII and non-ASCII characters.

      InvalidParameterValue: Domain contains control or whitespace at Request.extractError in aws lambda

      I am facing a strange problem with the ses.SendEmail method. When I use test data, everything works as expected and the email is sent to me. However, when I add a new entry in DynamoDB that triggers a Lambda function that sends an email, I get an exception:

      In the logs I can see that the email is in the correct format, with exactly the same payload as in the tests that work. Part of the NewImage object (I obfuscated the email for privacy reasons; it did not contain any whitespace and consisted only of alphanumeric characters).
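
When an address looks correct in the logs, one way to rule out invisible characters is to list every whitespace or control character in the domain part together with its code point. The sketch below is a diagnostic aid only; the function name is hypothetical and it does not claim to reproduce the exact check that SES performs.

```javascript
// Matches ASCII and Unicode whitespace (including non-breaking space),
// plus C0 and C1 control characters and DEL.
const SUSPICIOUS = /[\s\u0000-\u001F\u007F-\u009F]/;

// Lists each suspicious character found after the last "@".
function findSuspiciousDomainChars(address) {
  const at = address.lastIndexOf("@");
  const domain = at === -1 ? "" : address.slice(at + 1);
  const found = [];
  for (let i = 0; i < domain.length; i++) {
    if (SUSPICIOUS.test(domain[i])) {
      const hex = domain.charCodeAt(i).toString(16).toUpperCase();
      found.push({ index: i, codePoint: "U+" + hex.padStart(4, "0") });
    }
  }
  return found;
}

// A trailing newline is invisible in most log viewers:
// one entry is reported, at index 11 with code point U+000A.
console.log(findSuspiciousDomainChars("user@example.com\n"));

// JSON.stringify also makes such characters visible as escapes.
console.log(JSON.stringify("user@example.com\n"));
```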

      Can't use InternetAddress with umlauts anymore after switching to com.sun.mail:javax.mail

      We recently switched from javax.mail:mail to com.sun.mail:javax.mail. Since then the following code fails:

      Every char >= 177 is treated as control or whitespace - which is wrong, e.g. for umlauts (ö = 246). So the exception message is misleading.

      Did the change of validate() introduce a bug?

      By now, Internet email addresses may contain umlauts encoded in Punycode. That's why I expected it to be safe to pass a string with umlauts.

      Is InternetAddress intended to be used with an encoded String in this case?
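
For context, Punycode (IDNA) encoding applies to the domain part of an address, not to the local part. A minimal sketch of encoding the domain before handing the address to a mail library, shown in JavaScript and relying on the WHATWG `URL` parser to perform the conversion; the helper name `toAsciiDomain` and the sample address are assumptions for illustration:

```javascript
// Convert the domain part of an address to its ASCII (Punycode) form.
// The local part is left untouched.
function toAsciiDomain(address) {
  const at = address.lastIndexOf("@");
  if (at === -1) return address;
  const local = address.slice(0, at);
  const domain = address.slice(at + 1);
  // The URL parser applies IDNA ToASCII to host names.
  const asciiDomain = new URL("http://" + domain).hostname;
  return local + "@" + asciiDomain;
}

console.log(toAsciiDomain("info@müller.de")); // info@xn--mller-kva.de
```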

      Thanks in advance

      Update to Bill Shannon's answer

      The nicely formatted Groovy script mentioned in my comment:

      Update: test with latest snapshot

      The test runs successfully (not throwing an AddressException) with the latest snapshot 1.6.0-SNAPSHOT that is currently from Tue Feb 21.

      JavaMail problem with characters in email addresses

      I am having trouble with the parse method when using the ñ character:

      I pass false for the strict parameter, but I always get the error:

      If you are still facing this problem, I advise you to add double quotes to your unusual email address, as I did. It worked for me because the checkAddress method of InternetAddress does not validate a quoted address. It is an easy and quick solution.

      Example:

      The complete and correct solution would be for the Java team to fix the bug in the InternetAddress class, which checks the email syntax even when it receives "strict = false" in the constructor, when it should not.

      Example of the error:

      Example of the workaround:

      The approach of creating a new class that extends Address will not work, because Transport.send() for some reason seems to validate the address again, and the exception is thrown as well.

      Please let us know if this helps!

      Make sure your ñ is actually an n-tilde. The following works perfectly for me

      It also works when replacing the ñ with a unicode escape (in the source .java file)

      You can try replacing it the same way as in the test.

      Edit: BTW, I also used parse( ". " ) with the address and the second parameter, and passed true for strict; both of these also worked.

      I tried both, but with the same result. Which version of JavaMail are you using?

      It does not work for me either, using Java Mail 1.4.7.

      @user871611 It works for me with Java Mail 1.4.4 and still works. Before downvoting, make sure that you really have U+00F1 ñ, Unicode "LATIN SMALL LETTER N WITH TILDE", and not n U+006E "LATIN SMALL LETTER N" followed by U+0303 "COMBINING TILDE".

      [. ] character sequences SHOULD be normalized according to Unicode normalization. But that is a SHOULD, so somewhat unexpected things will work here (and may break, but on the receiving side).
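
The difference between the two forms of ñ mentioned above is easy to check programmatically. A short sketch in JavaScript (the helper name `describeCodePoints` is illustrative):

```javascript
const precomposed = "\u00F1";  // ñ as a single code point
const decomposed = "n\u0303";  // n followed by COMBINING TILDE

// The two strings render identically but are not equal.
console.log(precomposed === decomposed);                  // false

// NFC normalization composes the pair into the single code point.
console.log(decomposed.normalize("NFC") === precomposed); // true

// List the code points of a string to see which form it uses.
function describeCodePoints(text) {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

console.log(describeCodePoints(decomposed).join(" "));  // U+006E U+0303
console.log(describeCodePoints(precomposed).join(" ")); // U+00F1
```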

      The JavaMail address parser is telling you something very important here: you may only use normal printable ASCII characters on the left side of the @ sign in an email address. Check out RFC 5322 for the ABNF for valid email addresses:

      Make sure the address is correct before proceeding.

      This is no longer the case with the more recent RFC 6532, which allows UTF-8 characters.
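
The ASCII-only rule for an unquoted local part in RFC 5322 can be sketched as a regular expression built from the `atext` character set (letters, digits, and a fixed list of punctuation). This sketch covers only the plain dot-atom form, without quoted strings, comments, or the UTF-8 extension from RFC 6532; the names are illustrative.

```javascript
// atext from RFC 5322: letters, digits, and !#$%&'*+-/=?^_`{|}~
const ATEXT = "[A-Za-z0-9!#$%&'*+\\-/=?^_`{|}~]";

// dot-atom-text: one or more atext runs separated by single dots.
const DOT_ATOM = new RegExp("^" + ATEXT + "+(?:\\." + ATEXT + "+)*$");

function isAsciiDotAtom(localPart) {
  return DOT_ATOM.test(localPart);
}

console.log(isAsciiDotAtom("john.doe")); // true
console.log(isAsciiDotAtom("señor"));    // false, ñ is not in atext
console.log(isAsciiDotAtom("a..b"));     // false, empty label
```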
