INTERNET-DRAFT                                   John Klensin, Editor
Expires in six months                                             MCI
                                                    November 22, 1995

                    Simple Mail Transfer Protocol

                   draft-ietf-drums-smtpupd-01.txt

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months. Internet-Drafts may be updated, replaced, or obsoleted by other documents at any time. It is not appropriate to use Internet-Drafts as reference material or to cite them other than as a "working draft" or "work in progress".

To learn the current status of any Internet-Draft, please check the 1id-abstracts.txt listing contained in the Internet-Drafts Shadow Directories on ds.internic.net (US East Coast), nic.nordu.net (Europe), ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific Rim).

If consensus is reached on this document, it will be forwarded to the IESG with the recommendation that it be processed as a Proposed Standard for mail transport.

[[ Note in Draft: this version of the I-D is very much a work in progress. It should be read by the WG as a proposal about what material should go into the draft and how it should be structured. The editor is painfully aware that there are almost certainly still inconsistencies in section numbering and the like. More important, neither is the balance and allocation of material between the "model", "procedures", and "specifications" sections yet right, nor is the attempt to consolidate the syntax and semantics sections to make it easier to find things fully satisfactory.
I have also not yet reworked the examples or state diagrams: For example, I believe that all of the ".ARPA" domains should be pulled out and replaced with contemporary examples, that all "HELO" examples should be replaced by "EHLO" ones, and that all free-text mailed error replies should be replaced by NOTARY-format messages.

The rewrite to use 822-ish ABNF is not complete and this draft contains an odd mix as a result -- I'd appreciate help with that from someone who has the time and patience. This effort has, incidentally, further convinced me that we should create a completely separate RFC that defines the ABNF, rather than having everything refer to the not-completely-satisfactory definition in 822. WG input on those issues is critical.

The WG should also decide how much of the explanatory and justification material (e.g., from 1123) should be included: this draft is very inconsistent along that dimension. And, of course, the WG should discuss what is in here that shouldn't be and what should be in here that the incompetent editor has forgotten.

Sections marked with doubled brackets (e.g., "<<") are explicit placeholders or known major loose ends.]]

TABLE OF CONTENTS

0. ABSTRACT
1. INTRODUCTION
2. THE SMTP MODEL
   2.1 Basic structure
   2.2 The extension model
   2.3 Other terminology
3. THE SMTP PROCEDURES: AN OVERVIEW
   3.1 Session Initiation
   3.2 Client initiation
   3.3. Mail
   3.4. Forwarding for Address Correction or Updating
   3.5. Verifying and Expanding
   3.6. Sending and Mailing
   3.7. Domains
   3.8. Relaying
   3.9. Changing Roles
   3.10. Terminating sessions and connections
4. THE SMTP SPECIFICATIONS
   4.1. SMTP Commands
      4.1.1. Command Semantics and Syntax
      4.1.2. Lower-level Syntax
      4.1.3 Order of commands
      4.1.4 Private-use commands
   4.2. SMTP Replies
      4.2.1. Reply Codes by Function Group
      4.2.2. Reply Codes in Numeric Order
      4.2.3. Reply code 502
      4.2.4 Reply codes after DATA and the subsequent CRLF.CRLF.
   4.3. Sequencing of Commands and Replies
   4.4 Trace information
   4.5.
State Diagrams
   4.6. Details
      4.6.1. Minimum Implementation
      4.6.2. Transparency
      4.6.3. Sizes and Timeouts
      4.6.4 Queuing Strategies
5. Problem detection and handling
   5.1 Replies by email
   5.2 Loop detection
6. Security Considerations
7. References
8. Editor's addresses
9. Acknowledgements
APPENDIX A: TCP
APPENDIX B: Generating SMTP commands from RFC 822 headers
APPENDIX E: Theory of Reply Codes
APPENDIX F: Scenarios
APPENDIX G: Other gateway issues.
APPENDIX H: Glossary
APPENDIX X: Change summary and Loose ends (temporary)

0. Abstract

This document is a self-contained specification of the basic protocol for the Internet electronic mail transport, consolidating and updating

   * the original SMTP specification of RFC 821 [RFC-821],
   * Domain name system requirements and implications for mail transport from RFC 1035 [RFC-DNS] and RFC 974 [RFC974],
   * the clarifications and applicability statements in RFC 1123 [RFC-1123], and
   * material drawn from the SMTP Extension mechanisms [SMTPEXT].

It is intended to replace RFC 821, RFC 974, and the mail transport materials of RFC 1123. However, RFC 821 specifies some features that are not in significant use in the Internet of the mid-1990s and, in appendices, some additional transport models. Those sections are omitted in this document in the interest of clarity and brevity; readers needing them should refer to RFC 821.

It also includes some additional material from RFC 1123 that appeared to need amplification. These have been identified in multiple ways, mostly by tracking flaming on the header-people list [HEADER-PEOPLE] and problems of unusual readings or interpretations that have turned up as the SMTP extensions have been deployed. It is important to note that everything here is in response to some identified confusion or bad behavior, not just paranoia. Where this specification moves beyond consolidation and actually differs from earlier documents, it supersedes them technically as well as textually.
Although SMTP was designed as a mail transport and delivery protocol, this specification also contains information that is important to its use as a "mail posting" protocol, as recommended for POP [RFC-POP2, RFC-POP3] and IMAP [RFC-IMAP4].

Except when the historical terminology is necessary for clarity, this document uses the current "client" and "server" terminology to identify the sending and receiving SMTP processes, respectively.

A companion document discusses mail bodies and formats: RFC 822, MIME, and their relationship.

1. INTRODUCTION

The objective of the Simple Mail Transfer Protocol (SMTP) is to transfer mail reliably and efficiently.

SMTP is independent of the particular transmission subsystem and requires only a reliable ordered data stream channel. While this document specifically discusses transport over TCP, other transports are possible. Appendices to RFC 821 describe some of them. A Glossary provides the definitions of terms as used in this document.

An important feature of SMTP is its capability to transport mail across transport service environments, usually referred to as "mail gatewaying". A transport service environment might consist of the mutually-TCP-accessible hosts on the public internet, a firewall-isolated private TCP/IP LAN, or a LAN or WAN environment utilizing an entirely different transport-level protocol. It is important to realize that transport systems are not one-to-one with usual definitions of "networks". A process can communicate directly with another process, and mail can be communicated, through any mutually known transport layer. Conversely, mail can be relayed (actually gatewayed) between hosts on different transport systems by a host on both transport systems. The Mail eXchanger mechanisms of the domain name system [RFC-DNS, RFC974] usually permit relaying and gatewaying to occur invisibly to the user.

2.
THE SMTP MODEL

2.1 Basic structure

The SMTP design is based on the following model of communication: as the result of a user mail request (or transfer from a mail user agent (see section 2.3)), the SMTP client establishes a two-way transmission channel to an SMTP server. Fully-capable client SMTPs determine the host address supporting the server SMTP function by resolving the domain name in the user request into either an intermediate mail exchanger host or a final target host. In other cases, common with clients associated with implementations of the POP [RFC-POP2, RFC-POP3] or IMAP [RFC-IMAP4] protocols, or when the client is inside an isolated transport service environment, the SMTP client may send all of its traffic to a single SMTP server which, in turn, relays the mail to final (or other intermediate) destinations and which supports all of the queuing, retrying, and alternate address functions discussed in this specification.

The SMTP server may be either the ultimate destination or an intermediate (i.e., may assume the role of an SMTP client after receiving the message). SMTP commands are generated by the SMTP client and sent to the SMTP server. SMTP replies are sent from the SMTP server to the SMTP client in response to the commands.

Once the transmission channel is established and initial handshaking completed, the SMTP-sender sends a MAIL command indicating the sender of the mail. If the server SMTP can accept mail it responds with an OK reply. The client SMTP then sends a RCPT command identifying a recipient of the mail. If the server SMTP can accept mail for that recipient (or believes that it can but cannot immediately verify that fact--see below) it responds with an OK reply; if not, it responds with a reply rejecting that recipient (but not the whole mail transaction). The client and server SMTPs may negotiate several recipients. When the recipients have been negotiated the client sends the mail data, terminating with a special sequence.
If the server successfully processes the mail data it responds with an OK reply. Either the sender or recipient commands may include server-permitted SMTP service extension requests as discussed in section 2.2. The dialog is purposely lock-step, one-at-a-time although this can be modified by mutually-agreed extension requests.

-------------------------------------------------------------

            +----------+                +----------+
+------+    |          |                |          |
| User |<-->|          |      SMTP      |          |
+------+    |  Sender- |Commands/Replies| Receiver-|
+------+    |   SMTP   |<-------------->|    SMTP  |    +------+
| File |<-->|          |    and Mail    |          |<-->| File |
|System|    |          |                |          |    |System|
+------+    +----------+                +----------+    +------+

             SMTP client                 SMTP server

                     Model for SMTP Use

                          Figure 1

-------------------------------------------------------------

An SMTP server may "accept mail" for a recipient under one of two circumstances: when it can actually verify the address or when it can make a determination that it is willing to accept responsibility for the mail. Replies to the RCPT command MUST NOT be delayed beyond a reasonable time in order to verify addresses. Hence, a "250 OK" reply to a RCPT command does not necessarily imply that the delivery address(es) are valid. Errors found after message acceptance will be reported by mailing a notification message to an appropriate address.

DISCUSSION: The set of conditions under which a RCPT parameter can be validated immediately is an engineering design choice. Reporting destination mailbox errors to the Sender-SMTP before mail is transferred is generally desirable to save time and network bandwidth, but this advantage is lost if RCPT verification is lengthy. For example, most SMTP servers can immediately verify any simple local reference, such as a single locally-registered mailbox.
On the other hand, the "reasonable time" limitation generally implies deferring verification of a mailing list until after the message has been transferred and accepted, since verifying a large mailing list can take a very long time. An implementation might or might not choose to defer validation of addresses that are non-local and therefore require a DNS lookup. If a DNS lookup is performed but a soft domain system error (e.g., timeout) occurs, validity must be assumed. An SMTP relay would usually defer verification of addresses when service extensions are specified that require verification with the destination host.

The SMTP provides mechanisms for the transmission of mail; directly from the sending user's host to the receiving user's host when the two hosts are connected to the same transport service, via one or more relay SMTP-servers when the source and destination hosts are not connected to the same transport service, or when an intermediate host is selected via a Mail eXchanger mechanism.

To be able to provide the relay capability the server SMTP is supplied with the name of the ultimate destination host as well as the destination mailbox name. Usually, intermediate hosts are determined via the DNS MX record, not by explicit "source" routing.

The argument to the MAIL command is normally an address in mailbox@domain format, which specifies who the mail is from. The argument to the RCPT command is normally also an address, which specifies who the mail is to. More generally, the MAIL address is a reverse-path and the RCPT address a forward-path. The forward-path is a source route, while the reverse-path is a return route (which may be used to return a message to the sender when an error occurs with a relayed message).

When the same message is sent to multiple recipients the SMTP encourages the transmission of only one copy of the data for all the recipients at the same destination host.

The mail commands and replies have a rigid syntax.
Replies also have a numeric code. In the following, examples appear which use actual commands and replies. The complete lists of commands and replies appear in Section 4 on specifications.

Commands and replies are not case sensitive. That is, a command or reply word MAY be upper case, lower case, or any mixture of upper and lower case. Note that this is not true of mailbox user names. For some hosts the user name is case sensitive (this practice impedes interoperability and is discouraged), and SMTP implementations MUST take care to preserve the case of user names as they appear in mailbox arguments. Domain names are not case sensitive.

Commands and replies are composed of characters from the ASCII character set [1]. When the transport service provides an 8-bit byte (octet) transmission channel, each 7-bit character is transmitted right justified in an octet with the high order bit cleared to zero. More specifically, the unextended SMTP service provides seven bit transport only. SMTP clients MUST NOT transmit messages with information in the high-order bit of octets. If such messages are transmitted in violation of this rule, receiving SMTP servers MAY clear the high-order bit or reject the message as invalid. Eight-bit transmission MAY be requested of the server by the client using extended SMTP facilities.

When specifying the general form of a command or reply, an argument (or special symbol) will be denoted by a meta-linguistic variable (or constant), for example, "" or "". Here the angle brackets indicate these are meta-linguistic variables. However, some arguments use the angle brackets literally. For example, an actual reverse-path is enclosed in angle brackets, i.e., "" is an instance of <reverse-path> (the angle brackets are actually transmitted in the command or reply).
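
The seven-bit rule above can be illustrated with a short check. This is a minimal sketch rather than anything this specification defines: the function names are hypothetical, and the sketch only shows the two behaviors a server is permitted when a client violates the rule, rejecting the message or clearing the high-order bit.

```python
def has_high_bit(data: bytes) -> bool:
    """Return True if any octet has its high-order bit set."""
    return any(octet & 0x80 for octet in data)


def clear_high_bit(data: bytes) -> bytes:
    """Return a copy of data with the high-order bit of every octet cleared."""
    return bytes(octet & 0x7F for octet in data)


def accept_unextended(data: bytes, strict: bool = True) -> bytes:
    """Apply one of the two permitted server behaviors (hypothetical helper).

    strict=True rejects the message as invalid; strict=False clears the
    high-order bit and accepts the result.
    """
    if has_high_bit(data):
        if strict:
            raise ValueError("octet with high-order bit set in 7-bit transport")
        return clear_high_bit(data)
    return data
```

A client would run a check like `has_high_bit` before sending, since it MUST NOT transmit such octets unless eight-bit transport has been negotiated through an extension.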
2.2 The Extension Model

2.2.1 Background

In an effort that started in 199??, approximately a decade after RFC 821 was completed, the protocol was modified with a "service extensions" model that permits the client and server to agree to utilize shared functionality that goes beyond the original basic SMTP requirements.

SMTP implementations SHOULD support the basic extension mechanisms (see below for details), i.e., servers should support the EHLO command even if they do not implement any specific extensions and clients SHOULD preferentially utilize EHLO rather than HELO. However, for compatibility with older implementations, SMTP clients and servers MUST support the original HELO mechanisms.

Although SMTP is widely and robustly deployed, some parts of the Internet community might wish to extend the SMTP service. The SMTP extension mechanism defines a means whereby an extended SMTP client and server may recognize each other as such and the server can inform the client as to the service extensions that it supports.

It must be emphasized that any extension to the SMTP service should not be considered lightly. SMTP's strength comes primarily from its simplicity. Experience with many protocols has shown that: protocols with few options tend towards ubiquity, whilst protocols with many options tend towards obscurity.

This means that each and every extension, regardless of its benefits, must be carefully scrutinized with respect to its implementation, deployment, and interoperability costs. In many cases, the cost of extending the SMTP service will likely outweigh the benefit.

Given this environment, the extension framework consists of:

   (1) The SMTP command EHLO, superseding the earlier HELO,
   (2) a registry of SMTP service extensions, and
   (3) additional parameters to the SMTP MAIL FROM and RCPT TO commands.

2.2.2 Definition and Registration of Extensions

The IANA maintains a registry of SMTP service extensions.
Associated with each such extension is a corresponding EHLO keyword value. Each service extension registered with the IANA must be defined in an RFC. Such RFCs must either be on the standards-track or must define an IESG-approved experimental protocol. The definition must include:

   (1) the textual name of the SMTP service extension;
   (2) the EHLO keyword value associated with the extension;
   (3) the syntax and possible values of parameters associated with the EHLO keyword value;
   (4) any additional SMTP verbs associated with the extension (additional verbs will usually be, but are not required to be, the same as the EHLO keyword value);
   (5) any new parameters the extension associates with the MAIL FROM or RCPT TO verbs;
   (6) how support for the extension affects the behavior of a server and client SMTP; and,
   (7) the increment by which the extension is increasing the maximum length of the commands MAIL FROM, RCPT TO, or both, over that specified in RFC 821.

In addition, any EHLO keyword value that starts with an upper or lower case "X" refers to a local SMTP service extension, which is used through bilateral, rather than standardized, agreement. Keywords beginning with "X" may not be used in a registered service extension.

Any keyword values presented in the EHLO response that do not begin with "X" must correspond to a standard, standards-track, or IESG-approved experimental SMTP service extension registered with IANA. A conforming server must not offer non "X" prefixed keyword values that are not described in a registered extension.

Additional verbs are bound by the same rules as EHLO keywords; specifically, verbs beginning with "X" are local extensions that may not be registered or standardized and verbs not beginning with "X" must always be registered.

2.3 Terminology

A glossary of terms appears at the end of this document.
However, the following terms and concepts are used in special ways here, or represent differences in terminology between RFC 821 and this document and should be understood before reading further.

SMTP relays a mail object containing an envelope and a content.

(1) The SMTP envelope is straightforward, and is sent as a series of SMTP protocol units (described in section 3): it consists of an originator address (to which error reports should be directed); a delivery mode (e.g., deliver to recipient mailboxes); and, one or more recipient addresses.

(2) The SMTP content is sent in the SMTP DATA protocol unit and has two parts: the headers and the body. The headers form a collection of field/value pairs structured according to RFC 822 [RFC822], whilst the body, if structured, is defined according to MIME [3]. The content is textual in nature, expressed using the US ASCII repertoire (ANSI X3.4-1986). Although extensions (such as MIME) may relax this restriction for the content body, the content headers are always encoded using the US ASCII repertoire. The algorithm defined in [4] is used to represent header values outside the US ASCII repertoire, whilst still encoding them using the US ASCII repertoire.

<> SMTP-sender, SMTP-receiver -> client and server UA MTA host domain buffer state table

2.4 Syntax Principles

The commands consist of a command code followed by an argument field. Command codes are four alphabetic characters. Upper and lower case alphabetic characters are to be treated identically. Thus, any of the following may represent the mail command:

   MAIL
   Mail
   mail
   MaIl
   mAIl

This also applies to any symbols representing parameter values, such as "TO" or "to" for the forward-path. Command codes and the argument fields are separated by one or more spaces. However, within the reverse-path and forward-path arguments case is important. In particular, in some hosts the user "smith" is different from the user "Smith".
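
The case rules above can be sketched as a small parser. This is an illustration under stated assumptions, not a normative grammar: the function name `parse_command` is hypothetical, the sketch assumes any trailing CRLF should be removed before splitting, and it folds only the command code to upper case, leaving keywords such as "TO" inside the argument field for later processing.

```python
from typing import Tuple


def parse_command(line: str) -> Tuple[str, str]:
    """Split an SMTP command line into (command code, argument field).

    The command code is folded to upper case because codes are treated
    identically in either case. The argument field is returned unchanged
    because case matters inside reverse-path and forward-path arguments.
    """
    # Assumption of this sketch: a trailing CRLF, if present, is dropped first.
    text = line.rstrip("\r\n")
    code, _, argument = text.partition(" ")
    if len(code) != 4 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"not a four-letter command code: {code!r}")
    # One or more spaces may separate the code from the argument field.
    return code.upper(), argument.lstrip(" ")
```

For instance, `parse_command("mAIl FROM:<Smith@Alpha.ARPA>")` yields the code `MAIL` while leaving the capital "S" in the mailbox untouched; the address is illustrative only.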
The argument field consists of a variable length character string ending with the character sequence <CRLF>. The receiver is to take no action until this sequence is received.

The syntax for each command is shown with the discussion of that command, with common elements and parameters shown in section <<>>??.??. Square brackets denote an optional argument field. If the option is not taken, the appropriate default is implied.

<<>> Reference 822 ABNF.

3. THE SMTP PROCEDURES: AN OVERVIEW

This section presents the procedures used in SMTP in several parts. First comes the basic mail procedure defined as a mail transaction. Following this are descriptions of forwarding mail, verifying mailbox names and expanding mailing lists, sending to terminals instead of or in combination with mailboxes, and the opening and closing exchanges. At the end of this section are comments on relaying, a note on mail domains, and a discussion of changing roles. Throughout this section are examples of partial command and reply sequences; several complete scenarios are presented in Appendix F.

3.1 Session initiation: EHLO

An SMTP session is initiated by the client opening a connection to the server and the server responding with an opening message.

SMTP server implementations SHOULD include identification of their software and version information in the connection greeting reply after the 220 code. This practice permits much more efficient isolation and repair of any problems. While some systems also identify their contact point for mail problems, this is not a substitute for maintaining the required Postmaster address (see [RFC822]). Implementations MAY make provision for SMTP servers to be configured to disable the software and version announcement where it causes security concerns.

3.2 Client initiation: EHLO

The client then sends the EHLO command to the server, indicating its identity.
In addition to opening the session, use of EHLO indicates that the client is able to process service extensions and requests that the server provide a list of the extensions it supports. Older SMTP systems, unable to support service extensions, MAY use HELO instead of EHLO but EHLO SHOULD be used by all current clients and accepted by all current systems.

In the EHLO, or the older HELO, command the host sending the command identifies itself; the command may be interpreted as saying "Hello, I am " (and, in the case of EHLO, "and I support service extension requests").

-------------------------------------------------------------

               Example of Connection Opening

   R: 220 BBN-UNIX.ARPA Simple Mail Transfer Service Ready
   S: HELO USC-ISIF.ARPA
   R: 250 BBN-UNIX.ARPA

                         Example 5

-------------------------------------------------------------

-------------------------------------------------------------

               Example of Connection Closing

   S: QUIT
   R: 221 BBN-UNIX.ARPA Service closing transmission channel

                         Example 6

-------------------------------------------------------------

3.3. MAIL

There are three steps to SMTP mail transactions. The transaction is started with a MAIL command which gives the sender identification. A series of one or more RCPT commands follows giving the receiver information. Then a DATA command gives the mail data. And finally, the end of mail data indicator confirms the transaction.

The first step in the procedure is the MAIL command. The <reverse-path> contains the source mailbox.

   MAIL FROM:<reverse-path> [ <parameters> ]

This command tells the SMTP-receiver that a new mail transaction is starting and to reset all its state tables and buffers, including any recipients or mail data. It gives the reverse-path which can be used to report errors (see section 4.2 for a discussion of error reporting). If accepted, the SMTP server returns a 250 OK reply.

The <reverse-path> can contain more than just a mailbox. The <reverse-path> is a reverse source routing list of hosts and source mailbox.
The first host in the <reverse-path> should be the host sending this command.

The optional <parameters> are associated with negotiated SMTP service extensions (see section 2.2).

The second step in the procedure is the RCPT command.

   RCPT TO:<forward-path> [ <parameters> ]

This command gives a forward-path identifying one recipient. If accepted, the SMTP server returns a 250 OK reply, and stores the forward-path. If the recipient is unknown the SMTP server returns a 550 Failure reply (other circumstances and reply codes are possible). This second step of the procedure can be repeated any number of times.

The <forward-path> can contain more than just a mailbox. The <forward-path> may be a source routing list of hosts and the destination mailbox. However, in general, the <forward-path> should contain only a mailbox and domain name, relying on the domain name system to supply routing information if required. Servers MUST be prepared to encounter a list of source routes in the forward path, but MAY ignore the routes or decline to support the relaying they imply. Similarly, servers MAY decline to accept mail that is destined for other hosts or systems. Of course, such restrictions would make a server useless as a relay for clients that do not support full SMTP functionality, but such clients MUST NOT assume that any SMTP server on the Internet can be used as their mail processing site. Clients SHOULD NOT utilize explicit source routing except under unusual circumstances, such as debugging or potentially relaying around firewalls or mail system configuration errors. If source routes are used, the first host in the <forward-path> should be the host receiving this command.

The optional <parameters> are associated with negotiated SMTP service extensions (see section 2.2).

The third step in the procedure is the DATA command.

   DATA

If accepted, the SMTP server returns a 354 Intermediate reply and considers all succeeding lines to be the message text. When the end of text is received and stored the SMTP-receiver sends a 250 OK reply.
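
The three steps above can be modelled as a small state machine on the server side. This is a toy sketch, not an implementation of this specification: the class and method names are hypothetical, mailbox matching is a plain set lookup, the reply texts are illustrative, and the use of 503 for commands issued out of order is an assumption of the sketch rather than something stated in this section.

```python
class ToyTransaction:
    """Toy model of the server side of one mail transaction (hypothetical)."""

    def __init__(self, known_mailboxes):
        # Mailboxes are compared exactly, so user-name case is preserved.
        self.known = set(known_mailboxes)
        self.delivered = []
        self._reset()

    def _reset(self):
        self.reverse_path = None
        self.forward_paths = []
        self.in_data = False

    def mail(self, reverse_path):
        # MAIL starts a new transaction and resets all state and buffers.
        self._reset()
        self.reverse_path = reverse_path
        return "250 OK"

    def rcpt(self, forward_path):
        if self.reverse_path is None or self.in_data:
            return "503 Bad sequence of commands"  # assumed code
        if forward_path not in self.known:
            # Rejects this recipient only, not the whole transaction.
            return "550 No such user here"
        self.forward_paths.append(forward_path)
        return "250 OK"

    def data(self):
        if self.reverse_path is None or not self.forward_paths:
            return "503 Bad sequence of commands"  # assumed code
        self.in_data = True
        return "354 Start mail input"

    def end_of_data(self):
        if not self.in_data:
            return "503 Bad sequence of commands"  # assumed code
        # Process the stored recipients, then clear the transaction state.
        self.delivered = list(self.forward_paths)
        self._reset()
        return "250 OK"
```

The design choice worth noting is that a rejected RCPT leaves the transaction open, so a later DATA still succeeds for the recipients that were accepted.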
Since the mail data is sent on the transmission channel the end of the mail data must be indicated so that the command and reply dialog can be resumed. SMTP indicates the end of the mail data by sending a line containing only "." (period or full stop). A transparency procedure is used to prevent this from interfering with the user's text (see Section 4.6.2).

The end of mail data indicator also confirms the mail transaction and tells the SMTP server to now process the stored recipients and mail data. If accepted, the SMTP server returns a 250 OK reply. The DATA command should fail only if the mail transaction was incomplete (for example, no recipients), or if resources are not available. However, some servers in practice do not perform recipient verification until after the message text is received. These servers SHOULD treat a failure for one or more recipients as a "subsequent failure" and return a mail message as discussed in section <<>>. Using a "recipient not found" or equivalent reply code after the data are accepted makes it difficult or impossible for the client to determine which recipients failed.

The above procedure is an example of a mail transaction. These commands must be used only in the order discussed above. Example 1 (below) illustrates the use of these commands in a mail transaction.

-------------------------------------------------------------

               Example of the SMTP Procedure

This SMTP example shows mail sent by Smith at host Alpha.ARPA, to Jones, Green, and Brown at host Beta.ARPA. Here we assume that host Alpha contacts host Beta directly.

   S: MAIL FROM:<Smith@Alpha.ARPA>
   R: 250 OK

   S: RCPT TO:<Jones@Beta.ARPA>
   R: 250 OK

   S: RCPT TO:<Green@Beta.ARPA>
   R: 550 No such user here

   S: RCPT TO:<Brown@Beta.ARPA>
   R: 250 OK

   S: DATA
   R: 354 Start mail input; end with <CRLF>.<CRLF>
   S: Blah blah blah...
   S: ...etc. etc. etc.
   S: .
   R: 250 OK

The mail has now been accepted for Jones and Brown. Green did not have a mailbox at domain Beta.ARPA.

                         Example 1

-------------------------------------------------------------

3.4.
FORWARDING FOR ADDRESS CORRECTION OR UPDATING

The "forwarding" mechanisms described in section 3.2 of RFC 821, and especially the 251 reply code from RCPT that indicates a corrected destination, are no longer in active use. Forwarding support is most often required to consolidate and simplify addresses within, or relative to, some enterprise. In most of those cases, information hiding (and sometimes security) considerations argue against exposure of the "final" address through the SMTP protocol as a consequence of the forwarding activity and, in some cases, that final address may not even be reachable by the sender. Silent forwarding of messages (without server notification to the sender) is common in the contemporary Internet.

If the forwarding and address correction mechanisms described in RFC 821 are used, the addresses given should be stable enough that it would be reasonable for the client to update local records with them.

3.5. VERIFYING AND EXPANDING

3.5.1 Overview

SMTP provides, as additional features, commands to verify a user name or expand a mailing list. This is done with the VRFY and EXPN commands, which have character string arguments. For the VRFY command, the string is a user name (see below) and the response may include the full name of the user and must include the mailbox of the user, e.g., it MUST BE in either User Name <mailbox@domain> or mailbox@domain form. Paths (explicit source routes) MUST NOT be returned by VRFY or EXPN.

When a name that is the argument to VRFY could identify more than one mailbox, the server MAY either note the ambiguity or identify the alternatives. In other words, either of the following is a legitimate response to VRFY:

   553 User ambiguous

or

   553- Ambiguous; Possibilities are
   553-Joe Smith
   553-Harry Smith
   553 Melvin Smith

Under normal circumstances a client receiving a 553 reply would be expected to expose the result to the user.
Use of exactly the forms given, and the "user ambiguous" or "ambiguous" keywords, will facilitate automated translation into other languages as needed.

For the EXPN command, the string identifies a mailing list, and the multiline response may include the full name of the users and must give the mailboxes on the mailing list.

"User name" is a fuzzy term and used purposely. An implementation of the VRFY or EXPN commands MUST include at least recognition of local mailboxes as "user names". If a host chooses to recognize other strings as "user names" that is allowed.

In some hosts the distinction between a mailing list and an alias for a single mailbox is a bit fuzzy, since a common data structure may hold both types of entries, and it is possible to have mailing lists of one mailbox. If a request is made to verify a mailing list a positive response can be given if on receipt of a message so addressed it will be delivered to everyone on the list, otherwise an error should be reported (e.g., "550 That is a mailing list, not a user"). If a request is made to expand a user name a positive response can be formed by returning a list containing one name, or an error can be reported (e.g., "550 That is a user name, not a mailing list").

In the case of a multiline reply (normal for EXPN) exactly one mailbox is to be specified on each line of the reply. The case of an ambiguous request is discussed above.

The case of verifying a user name is straightforward as shown in example 3.

-------------------------------------------------------------

               Example of Verifying a User Name

   Either

      S: VRFY Smith
      R: 250 Fred Smith

   Or

      S: VRFY Smith
      R: 251 User not local; will forward to

   Or

      S: VRFY Jones
      R: 550 String does not match anything.

   Or

      S: VRFY Jones
      R: 551 User not local; please try

   Or

      S: VRFY Gourzenkyinplatz
      R: 553 User ambiguous.

                         Example 3

-------------------------------------------------------------

The case of expanding a mailbox list requires a multiline reply as shown in example 4.
------------------------------------------------------------- Example of Expanding a Mailing List Either S: EXPN Example-People R: 250-Jon Postel R: 250-Fred Fonebone R: 250-Sam Q. Smith R: 250-Quincy Smith <@USC-ISIF.ARPA:Q-Smith@ISI-VAXA.ARPA> R: 250- R: 250 Or S: EXPN Executive-Washroom-List R: 550 Access Denied to You. Example 4 ------------------------------------------------------------- The character string arguments of the VRFY and EXPN commands cannot be further restricted due to the variety of implementations of the user name and mailbox list concepts. On some systems it may be appropriate for the argument of the EXPN command to be a file name for a file containing a mailing list, but again there is a variety of file naming conventions in the Internet. 3.5.2 VRFY normal response. When normal (2yz or 551) responses are returned from a VRFY or EXPN request, the reply MUST include the mailbox name, e.g., "" (where "bar" is a fully qualified domain name) must appear in the syntax. EXPN and VRFY MUST return only valid domain addresses that are usable in SMTP RCPT commands. Consequently, if an address implies delivery to a program or other system, the mailbox name used to reach that target should be given. Server implementations MUST support VRFY and SHOULD support EXPN. For security reasons, implementations MAY provide local installations a way to disable either or both of these commands through configuration options or the equivalent. When these commands are supported, they are not required to work across relays when relaying is supported. Since they were both optional in RFC 821, they MUST, if supported, be listed in the response to EHLO if service extensions are supported. 3.5.3 Meaning of VRFY or EXPN success response. A server MUST NOT return a 250 code in response to a VRFY or EXPN command unless it has actually verified the address. In particular, a server MUST NOT return 250 if all it has done is to verify that the syntax given is valid.
In that case 502 (Command not implemented) or 500 (Syntax error, command unrecognized) SHOULD be returned (note that implementation of VRFY is required by RFC 1123 and EXPN is strongly recommended; this specification does not change that requirement and, hence, except as provided in section 3.5.5, implementations that return 500 or 502 for VRFY are not in compliance with the specification). Especially when a server is acting as a mail exchanger for another, there may be circumstances where an address appears to be correct but cannot reasonably be verified in real time. In that situation, reply code 252 SHOULD BE returned. These cases parallel the discussion of RCPT verification discussed in section 2.1 although implementations generally SHOULD be more aggressive about address verification in the case of VRFY than in the case of RCPT even if a little more time is required to do so. 3.5.4. Semantics and applications of EXPN. While EXPN is often very useful in debugging and understanding problems with mailing lists and multiple-target-address aliases, some systems have attempted to use source expansion of mailing lists as a means of eliminating duplicates. The propagation of aliasing systems with mail on the Internet--both for hosts (typically with MX and CNAME DNS records) and for mailboxes (various types of local host aliases) has made it nearly impossible for these strategies to work, and mail systems SHOULD NOT attempt them. 3.5.5 VRFY, EXPN, and security. As discussed above, individual sites may want to disable one or both of VRFY or EXPN for security reasons. As a corollary to the above, implementations that permit this MUST NOT appear to have verified addresses that are not, in fact, verified. If a site disables these commands for security reasons, the SMTP server SHOULD return a 252 response, rather than a code that could be confused with successful or unsuccessful verification. 
Returning a 250 reply code with the address listed in the VRFY command after having checked it for syntax only violates this rule. Of course, an implementation that "supports" VRFY by always returning 550 whether or not the address is valid is equally not in conformance. 3.6. SENDING AND MAILING The main purpose of SMTP is to deliver messages to users' mailboxes. A very similar service provided by some hosts is to deliver messages to users' terminals (provided the user is active on the host). The delivery to the user's mailbox is called "mailing"; the delivery to the user's terminal is called "sending". Because in many hosts the implementation of sending is nearly identical to the implementation of mailing, these two functions were combined in SMTP as specified in RFC 821. However, the sending commands were not included in the required minimum implementation (Section 4.5.1) and, indeed, have not been widely deployed. Implementations of them, if provided, should refer to the details in RFC 821. If one or more of the commands (SEND, SAML, SOML) are implemented, and service extensions are supported, the EHLO command response MUST list their names. 3.7. DOMAINS Domains have become a key concept in the Internet mail system. The use of domains changes the address space from a flat global space of simple character string host names to a hierarchically structured rooted tree of global addresses. The host name is replaced by a domain and host designator which is a sequence of domain element strings separated by periods with the understanding that the domain elements are ordered from the most specific to the most general. For example, "ISIF.ISI.EDU", "Fred.Cambridge.UK", and "PC7.LCS.MIT.EDU" might be domain identifiers. Whenever domain names are used in SMTP, only resolvable, fully-qualified, domain names (FQDNs) are permitted. In other words, names that can be resolved to MX RRs or A RRs (as discussed in section ??.??.??)
are permitted, as are CNAME RRs whose targets can be resolved, in turn, to MX or A RRs. Local nicknames or unqualified names MUST NOT be used. [[Note in draft: this represents a liberalization from the provisions of RFC 1123, section 5.2.2 -- WG please discuss.]] There is one exception to this rule: the domain name given in the EHLO (or HELO) command MUST BE either a primary host name (a domain name that resolves to an A RR) or, if the host has no name, a domain literal in dotted-decimal notation. 3.8. RELAYING The forward-path may be a source route of the form "@ONE,@TWO:JOE@THREE", where ONE, TWO, and THREE MUST BE fully-qualified domain names. This form is used to emphasize the distinction between an address and a route. The mailbox is an absolute address, and the route is information about how to get there. The two concepts should not be confused. In general, the availability of Mail eXchanger records in the domain name system [RFC-DNS] makes the use of explicit source routes in the Internet mail system unnecessary. Many historical problems with their interpretation have made their use undesirable. SMTP clients SHOULD NOT generate explicit source routes except under unusual circumstances. SMTP servers MAY decline to act as mail relays or to accept addresses that specify source routes. They are also permitted to ignore the route information and simply send to the final destination as specified in the route and the DNS. However, there has been a practice, albeit invalid, of using names that do not appear in the DNS as destination names, with the senders counting on the intermediate hosts specified in source routing to resolve any problems. If source routes are stripped, this practice will cause failures -- one of several reasons why SMTP clients MUST NOT generate invalid source routes or depend on serial resolution of names. 
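The option described above, in which a server ignores the route information and sends directly to the final destination, can be illustrated with a short Python sketch. This is only an illustration under simplifying assumptions: the function name `split_source_route` is hypothetical, and the parsing does not handle quoted local-parts that contain ":" or ",".

```python
def split_source_route(path: str) -> tuple[list[str], str]:
    """Split a forward-path into (route hosts, final mailbox).

    Hypothetical helper for illustration; not defined by this specification.
    """
    # Drop the surrounding angle brackets, if present.
    if path.startswith("<") and path.endswith(">"):
        path = path[1:-1]
    # A source route begins with "@" and is separated from the mailbox by ":".
    if path.startswith("@") and ":" in path:
        route, mailbox = path.split(":", 1)
        hosts = [h[1:] for h in route.split(",") if h.startswith("@")]
        return hosts, mailbox
    # No source route: the whole path is the mailbox.
    return [], path


hosts, mailbox = split_source_route("<@ONE,@TWO:JOE@THREE>")
print(hosts, mailbox)
```

A server that strips routes would use only the returned mailbox (and the DNS) to choose the next hop. If the domain in that mailbox does not appear in the DNS, delivery fails, which is the failure mode the preceding paragraph warns about.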
If source routes are not used, the process described in RFC 821 for constructing a reverse-path from the forward-path is not applicable and the reverse-path at the time of delivery will simply be the address that appeared in the MAIL command. If source routes are used, RFC 821 should be consulted for the mechanisms for constructing and updating the forward- and reverse-paths. Using source routing the SMTP server receives mail to be relayed to another SMTP server. The SMTP server may accept or reject the task of relaying the mail in the same way it accepts or rejects mail for a local user. The SMTP server transforms the command arguments by moving its own identifier (its domain name or that of any domain for which it is acting as a mail exchanger), if it appears, from the forward-path to the beginning of the reverse-path. The SMTP server then becomes an SMTP client, establishes a transmission channel to the next SMTP server in the forward-path, and sends it the mail. Notice that the forward-path and reverse-path appear in the SMTP commands and replies, but not necessarily in the message. That is, there is no need for these paths and especially this syntax to appear in the "To:" , "From:", "CC:", etc. fields of the message header. Conversely, SMTP servers MUST NOT derive message delivery information from message header fields. If an SMTP server has accepted the task of relaying the mail and later finds that the forward-path is incorrect or that the mail cannot be delivered for some other reason, then it MUST construct an "undeliverable mail" notification message and send it to the originator of the undeliverable mail (as indicated by the reverse-path). Formats specified for non-delivery reports by other standards SHOULD be used if possible. This notification message must be from the SMTP server at the relay host or the host that first determines that delivery cannot be accomplished. 
Of course, SMTP servers should not send notification messages about problems with notification messages. One way to prevent loops in error reporting is to specify a null reverse-path in the MAIL command of a notification message. When such a message is transmitted the reverse-path SHOULD BE set to null. A MAIL command with a null reverse-path appears as follows: MAIL FROM:<> An undeliverable mail notification message is shown in example 7. This notification is in response to a message originated by JOE at HOSTW and sent via HOSTX to HOSTY with instructions to relay it on to HOSTZ. What we see in the example is the transaction between HOSTY and HOSTX, which is the first step in the return of the notification message. ------------------------------------------------------------- Example Undeliverable Mail Notification Message S: MAIL FROM:<> R: 250 ok S: RCPT TO:<@HOSTX.ARPA:JOE@HOSTW.ARPA> R: 250 ok S: DATA R: 354 send the mail data, end with . S: Date: 23 Oct 81 11:22:33 S: From: SMTP@HOSTY.ARPA S: To: JOE@HOSTW.ARPA S: Subject: Mail System Problem S: <<>>replace with NOTARY format <<>> S: . R: 250 ok Example 7 ------------------------------------------------------------- 3.9. CHANGING ROLES The TURN command was specified in RFC 821 as a mechanism for reversing the roles of the client and server programs communicating over the transmission channel. It has proven in practice to cause a security problem in environments in which the identity of the client cannot be accurately verified by the server. TURN SHOULD NOT be used in such environments, which are the norm with SMTP. For details of TURN, see RFC 821. Since TURN was optional in the original specification, implementations that support it and also support service extensions MUST identify TURN in the EHLO reply. 3.10. TERMINATING SESSIONS AND CONNECTIONS An SMTP connection is terminated by the client's sending a QUIT command. The server then responds with a positive reply code, after which it closes the connection. 
An SMTP server MUST NOT intentionally close the connection except:
o After receiving a QUIT command and responding with a 221 reply.
o After detecting the need to shut down the SMTP service and returning a 451 reply to any command.
In particular, a server that closes connections in response to commands that are not understood is in violation of this specification. Instead, servers are expected to be tolerant of unknown commands, issuing a 500 reply and awaiting further instructions from the client. An SMTP server which is forcibly shut down via external means SHOULD attempt to send a line containing a 451 response code to the SMTP client before exiting. The SMTP client will normally read the 451 response code after sending its next command. [[Note in draft: Keith and Ned suggest that we should invent a new error code to be sent by the server when it shuts down the connection because it has timed out waiting for a client command and that it should be a 5yz code (since nothing temporary is happening). Such a shutdown is, of course, permitted by RFC 1123 and by good sense. I have not done this yet because I (and Mark) fear that introducing a new code could create an excuse for more of the "send code and shutdown" behavior patterns that we have been trying to eliminate. Would a 4yz code be a way out? Comments?]] 4. THE SMTP SPECIFICATIONS 4.1. SMTP COMMANDS 4.1.1. COMMAND SEMANTICS AND SYNTAX The SMTP commands define the mail transfer or the mail system function requested by the user. SMTP commands are character strings terminated by <CRLF>. The command codes themselves are alphabetic characters terminated by <SP> if parameters follow and <CRLF> otherwise. The syntax of mailboxes must conform to receiver site conventions. The SMTP commands are discussed below. The SMTP replies are discussed in Section 4.2. A mail transaction involves several data objects which are communicated as arguments to different commands.
The reverse-path is the argument of the MAIL command, the forward-path is the argument of the RCPT command, and the mail data is the argument of the DATA command. These arguments or data objects must be transmitted and held pending the confirmation communicated by the end of mail data indication which finalizes the transaction. The model for this is that distinct buffers are provided to hold the types of data objects, that is, there is a reverse-path buffer, a forward-path buffer, and a mail data buffer. Specific commands cause information to be appended to a specific buffer, or cause one or more buffers to be cleared. 4.1.1.1 HELLO (HELO) or Extended HELLO (EHLO) These commands are used to identify the SMTP client to the SMTP server. The argument field contains the host name of the SMTP client. The SMTP server identifies itself to the SMTP client in the connection greeting reply, and in the response to this command. A client SMTP SHOULD start an SMTP session by issuing the EHLO command. If the SMTP server supports the SMTP service extensions it will give a successful response, a failure response, or an error response. If the SMTP server does not support any SMTP service extensions it will generate an error response. Older client SMTP systems MAY, as discussed above, use HELO (as specified in RFC 821) instead of EHLO. These commands and an OK reply to one of them confirm that both the SMTP client and the SMTP server are in the initial state, that is, there is no transaction in progress and all state tables and buffers are cleared. If the server SMTP implements and is able to perform the EHLO command, it will return code 250. This indicates that both the server and client SMTP are in the initial state, that is, there is no transaction in progress and all state tables and buffers are cleared. Normally, this response will be a multiline reply. Each line of the response contains a keyword and, optionally, one or more parameters. 
The syntax for a positive response, using the ABNF notation of [RFC822], is: ehlo-ok-rsp ::= "250" domain [ SP greeting ] CR LF / ( "250-" domain [ SP greeting ] CR LF *( "250-" ehlo-line CR LF ) "250" SP ehlo-line CR LF ) ; the usual HELO chit-chat greeting ::= 1* ehlo-line ::= ehlo-keyword *( SP ehlo-param ) ehlo-keyword ::= (ALPHA / DIGIT) *(ALPHA / DIGIT / "-") ; syntax and values depend on ehlo-keyword ehlo-param ::= 1* ALPHA ::= DIGIT ::= CR ::= LF ::= SP ::= Although EHLO keywords may be specified in upper, lower, or mixed case, they must always be recognized and processed in a case-insensitive manner. This is simply an extension of practices begun in RFC 821. 4.1.1.2 MAIL (MAIL) This command is used to initiate a mail transaction in which the mail data is delivered to one or more mailboxes. The argument field contains a reverse-path. The reverse-path consists of an optional list of hosts and the sender mailbox. When the list of hosts is present, it is a "reverse" source route and indicates that the mail was relayed through each host on the list (the first host in the list was the most recent relay). This list is used as a source route to return non-delivery notices to the sender. As each relay host adds itself to the beginning of the list, it must use its name as known in the transport environment to which it is relaying the mail rather than that of the transport environment from which the mail came (if they are different). In some types of error reporting messages (for example, undeliverable mail notifications) the reverse-path may be null (see Example 7). This command clears the reverse-path buffer, the forward-path buffer, and the mail data buffer; and inserts the reverse-path information from this command into the reverse-path buffer. If service extensions were negotiated, the MAIL command may also carry parameters associated with a particular service extension. 
Syntax: MAIL FROM: [ ] or MAIL FROM:<> 4.1.1.3 RECIPIENT (RCPT) This command is used to identify an individual recipient of the mail data; multiple recipients are specified by multiple uses of this command. The forward-path consists of an optional list of hosts and a required destination mailbox. When the list of hosts is present, it is a source route and indicates that the mail must be relayed to the next host on the list. If the SMTP server does not implement the relay function it may use the same reply it would for an unknown local user (550). When mail is relayed, the relay host must remove itself from the beginning of the forward-path and put itself at the beginning of the reverse-path. When mail reaches its ultimate destination (the forward-path contains only a destination mailbox), the SMTP server inserts it into the destination mailbox in accordance with its host mail conventions. For example, mail received at relay host A with arguments FROM:<USERX@HOSTY.ARPA> TO:<@HOSTA.ARPA,@HOSTB.ARPA:USERC@HOSTD.ARPA> will be relayed on to host B with arguments FROM:<@HOSTA.ARPA:USERX@HOSTY.ARPA> TO:<@HOSTB.ARPA:USERC@HOSTD.ARPA>. This command causes its forward-path argument to be appended to the forward-path buffer. If service extensions were negotiated, the RCPT command may also carry parameters associated with a particular service extension. Syntax: RCPT TO: [ ] 4.1.1.4 DATA (DATA) The receiver treats the lines (strings ending in CRLF sequences) following the command as mail data from the sender. This command causes the mail data from this command to be appended to the mail data buffer. The mail data may contain any of the 128 ASCII character codes. SMTP is defined in terms of sending messages consisting of lines of text. Lines are strictly defined as ending in ASCII CR LF sequences.
Systems that use other line delimiting mechanisms internally MUST convert to CR LF sequences before transmitting mail with unextended SMTP or with any SMTP service extension on the standards track as of the time of this writing. The mail data is terminated by a line containing only a period, that is the character sequence "<CRLF>.<CRLF>" (see Section 4.6.2 on Transparency). This is the end of mail data indication. The custom of accepting lines ending only in LF, as a concession to non-conforming behavior on the part of some UNIX systems, has proven to cause more interoperability problems than it solves and SMTP server systems MUST NOT do this, even in the name of improved robustness. In particular, the sequence "LF.LF" (bare line feeds, without carriage returns) MUST NOT be treated as equivalent to CRLF.CRLF as the end of mail data indication. Receipt of the end of mail data indication requires that the server process the stored mail transaction information. This processing consumes the information in the reverse-path buffer, the forward-path buffer, and the mail data buffer, and on the completion of this command these buffers are cleared. If the processing is successful the receiver must send an OK reply. If the processing fails completely the receiver must send a failure reply. When the SMTP server accepts a message either for relaying or for final delivery it inserts a trace record (also referred to interchangeably as a "time stamp line" or "Received" line) at the top of the mail data. This trace record indicates the identity of the host that sent the message, and the identity of the host that received the message (and that is inserting this time stamp), and the date and time the message was received. Relayed messages will have multiple time stamp lines. Details for formation of these lines, including their syntax, are specified in section 4.4. 4.1.1.5 RESET (RSET) This command specifies that the current mail transaction is to be aborted.
Any stored sender, recipients, and mail data must be discarded, and all buffers and state tables cleared. The receiver must send an OK reply. A reset command may be issued by the client at any time. It is effectively equivalent to a NOOP if issued immediately after EHLO or HELO, or before either of those commands have been issued. In other situations, it restores the state to that immediately after the most recent EHLO or HELO. An SMTP server MUST NOT close the connection as the result of receiving a RSET; that action is reserved for QUIT (see section 4.1.1.10, below). 4.1.1.6 VERIFY (VRFY) This command asks the receiver to confirm that the argument identifies a user. If it is a user name, the full name of the user (if known) and the fully specified mailbox are returned. This command has no effect on any of the reverse-path buffer, the forward-path buffer, or the mail data buffer. 4.1.1.7 EXPAND (EXPN) This command asks the receiver to confirm that the argument identifies a mailing list, and if so, to return the membership of that list. The full name of the users (if known) and the fully specified mailboxes are returned in a multiline reply. This command has no effect on any of the reverse-path buffer, the forward-path buffer, or the mail data buffer. 4.1.1.8 HELP (HELP) This command causes the receiver to send helpful information to the sender of the HELP command. The command MAY take an argument (e.g., any command name) and return more specific information as a response. This command has no effect on any of the reverse-path buffer, the forward-path buffer, or the mail data buffer. SMTP servers SHOULD support HELP even if the form with an argument is not supported. 4.1.1.9 NOOP (NOOP) This command does not affect any parameters or previously entered commands. It specifies no action other than that the receiver send an OK reply. This command has no effect on any of the reverse-path buffer, the forward-path buffer, or the mail data buffer. 
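The buffer model that runs through the command descriptions above (a reverse-path buffer, a forward-path buffer, and a mail data buffer) can be sketched as a small state object. The following Python is a minimal sketch, not part of the specification: the class name `TransactionBuffers` and its method names are hypothetical, and reply codes, syntax checking, and EHLO/HELO handling are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TransactionBuffers:
    """Illustrative model of the three per-transaction buffers."""
    reverse_path: Optional[str] = None            # None: no transaction open
    forward_paths: List[str] = field(default_factory=list)
    mail_data: List[str] = field(default_factory=list)

    def rset(self) -> None:
        # RSET: discard any stored sender, recipients, and mail data.
        self.reverse_path = None
        self.forward_paths.clear()
        self.mail_data.clear()

    def mail(self, reverse_path: str) -> None:
        # MAIL: clear all three buffers, then store the reverse-path.
        self.rset()
        self.reverse_path = reverse_path

    def rcpt(self, forward_path: str) -> None:
        # RCPT: append the forward-path argument to the forward-path buffer.
        self.forward_paths.append(forward_path)

    def data_line(self, line: str) -> None:
        # DATA: append one line of mail data to the mail data buffer.
        self.mail_data.append(line)

    def end_of_data(self) -> Tuple[Optional[str], List[str], List[str]]:
        # End of mail data: hand the stored transaction over for processing,
        # after which all buffers are cleared.
        transaction = (self.reverse_path,
                       list(self.forward_paths),
                       list(self.mail_data))
        self.rset()
        return transaction

    def noop(self) -> None:
        # VRFY, EXPN, HELP, and NOOP leave every buffer untouched.
        pass
```

In this sketch a null reverse-path (MAIL FROM:<>) would be stored as an empty string, which keeps it distinct from the "no transaction" state represented by None.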
4.1.1.10 QUIT (QUIT) This command specifies that the receiver must send an OK reply, and then close the transmission channel. The receiver MUST NOT intentionally close the transmission channel until it receives and replies to a QUIT command (even if there was an error). The sender MUST NOT intentionally close the transmission channel until it sends a QUIT command and receives the reply (even if there was an error response to a previous command). If the connection is closed prematurely due to violations of the above or system or network failure the server MUST act as if a RSET command had been received (cancelling any pending transaction, but not undoing any previously completed transaction) and the client MUST act as if the command or transaction in progress had received a temporary error (4xx). 4.1.1.11 TURN (TURN) This command, described in RFC 821, raises important security issues (described in RFC 1123). Its use is deprecated; SMTP systems SHOULD NOT use it unless the server can authenticate the client. 4.1.2. LOWER-LEVEL SYNTAX The syntax of the argument fields of the above commands (using BNF notation where applicable) is given below. The "..." notation indicates that a field may be repeated one or more times. ::= | "<>" ::= ::= "<" [ ":" ] ">" ::= | "," ::= "@" ::= <<<>> ::= <<<>> domain = sub-domain 1*("." sub-domain) | domain-literal sub-domain = let-dig *(ldh-str) domain-literal = "[" IP-address-literal "]" IP-address-literal = snum 3*("." snum) snum = one, two, or three digits representing a decimal integer value in the range 0 through 255 let-dig = Alpha / Digit ldh-str = *( Alpha / Digit / "-" ) 1*(let-dig) Alpha = ASCII character in the range A-Z or a-z. As specified in the domain name system definition [RFC-DNS], case is not significant in domain strings.
Digit = 0 - 9 ::= "@" ::= | While the definition for above is relatively permissive, for maximum interoperability, a host that expects to receive mail SHOULD avoid defining mailboxes where the requires (or uses) the form or where the is case-sensitive. Systems MUST NOT define mailboxes in such a way as to require the use of non-ASCII characters (octets with the high order bit set to one) or ASCII "control characters" (decimal value 0-31 and 127). These characters MUST NOT be used in MAIL FROM or RCPT TO commands or other commands that require mailbox names. <> ::= | <> ::= """ """ <> ::= "\" | "\" | | ::= | "\" ::= | ::= ::= the carriage return character (ASCII code 13) ::= the line feed character (ASCII code 10) ::= the space character (ASCII code 32) ::= one, two, or three digits representing a decimal integer value in the range 0 through 255 ::= any one of the 52 alphabetic characters A through Z in upper case and a through z in lower case ::= any one of the 128 ASCII characters, but not any or ::= any one of the ten digits 0 through 9 ::= any one of the 128 ASCII characters except , , quote ("), or backslash (\) ::= any one of the 128 ASCII characters (no exceptions) ::= "<" | ">" | "(" | ")" | "[" | "]" | "\" | "." | "," | ";" | ":" | "@" """ | the control characters (ASCII codes 0 through 31 inclusive and 127) Note that the backslash, "\", is a quote character, which is used to indicate that the next character is to be used literally (instead of its normal interpretation). For example, "Joe\,Smith" could be used to indicate a single nine character user field with comma being the fourth character of the field. Hosts are generally known by names which are translated to addresses in each host. Note that the name elements of domains must be resolvable in the Internet domain system. Local aliases or nicknames MUST NOT be used. 
Characters outside the set of specials, alphas, digits, and hyphen are prohibited by the domain name system definition and MUST NOT appear in domain names. In particular, the underscore character is not permitted. Sometimes a host is not known to the translation function and communication is blocked. To bypass this barrier a numeric form is also allowed for host "names". This form uses four or more small decimal integers separated by dots and enclosed by brackets, e.g., "[123.255.37.2]", which indicates an Internet Address in sequence-of-octets form. The earlier escape form that uses a decimal integer prefixed by a pound sign, "#", indicating the number is the address of the host, is deprecated and MUST NOT be used. The time stamp line and the return path line are formally defined as follows: ::= "Return-Path:" ::= "Received:" ::= ";" ::= "FROM" ::= "BY" ::= [] [] [] [] ::= "VIA" ::= "WITH" ::= "ID" ::= "FOR" <<>>FOR and need to be nailed down. ::= The standard names for links are registered with the Internet Assigned Numbers Authority (IANA). ::= The standard names for protocols are registered with the Internet Assigned Numbers Authority (IANA). ::=