Diligent proof-reading was performed by Bruce Lilly.

SMTP                                                          D. Crocker
Internet-Draft                               Brandenburg InternetWorking
Intended status: Standards Track                          August 7, 2008
Expires: February 8, 2009

Internet Mail Architecture
draft-crocker-email-arch-11-14dc

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on February 8, 2009.

Copyright Notice

Copyright (C) The IETF Trust (2008).

Abstract

Over its thirty-five year history Internet Mail has undergone significant changes in scale and complexity, as it has become a global infrastructure service. The first standardized architecture for networked email specified little more than a simple split between the user world and the transfer world. Core aspects of the service, such as the styles of mailbox address and basic message format, have remained remarkably constant.
However today's Internet Mail is distinguished by many independent operators, many different components for providing service to users and many others for performing message transfer. Public discussion of the service often lacks common terminology and a common frame of reference for these components and their activities. Having a common reference model and terminology facilitates discussion about problems with the service, changes in policy, or enhancement to the service's functionality. This document offers an enhanced Internet Mail architecture that targets description of the existing service, in order to facilitate clearer and more efficient technical, operations and policy discussions about email.

Crocker                 Expires February 8, 2009                [Page 1]

Internet-Draft             EMail Architecture                August 2008

Table of Contents

1. Introduction
  1.1. Framework
  1.2. Service Overview
  1.3. Document Conventions
  1.4. Changes for This Version
2. Responsible Actor Roles
  2.1. User Actors
  2.2. Mail Handling Service (MHS) Actors
  2.3. Administrative Actors
3. Identities
  3.1. Mailbox
  3.2. Domain Names
  3.3. Message Identifier
4. Services and Standards
  4.1. Message Data
  4.2. User-Level Services
  4.3. MHS-Level Services
  4.4. Transition Modes
  4.5. Implementation and Operation
5. Mediators
  5.1. Aliasing
  5.2. ReSending
  5.3. Mailing Lists
  5.4. Gateways
  5.5. Boundary Filter
6. References
  6.1. Normative
  6.2. Informative
Appendix A. Acknowledgements
Index
Author's Address
Intellectual Property and Copyright Statements

1. Introduction

Over its thirty-five year history Internet Mail has undergone significant changes in scale and complexity, as it has become a global infrastructure service. The changes have been evolutionary, rather than revolutionary, reflecting a strong desire to preserve its installed base of users and utility. Today, Internet Mail is distinguished by many independent operators, many different components for providing service to users and many other components for performing message transfer.

Public collaboration on email technical, operations and policy activities, including those responding to the challenges of email abuse, has brought in a much wider range of participants than email's technical community originally had. In order to do work on a large, complex system, they need to share the same view of how it is put together, as well as what terms to use to refer to the pieces and their activities.
Otherwise, it is difficult to know exactly what another participant means. It is these differences in each person's perspective that motivate this document, to describe the realities of the current system. Internet mail is the subject of ongoing technical, operations and policy work, and the discussions often are hindered by different models of email service design and different meanings for the same terms. This architecture document seeks to facilitate clearer and more efficient technical, operations and policy exchanges about email.

This document offers an enhanced Internet Mail architecture to reflect the current service. In particular it:

* Documents refinements to the email model
* Clarifies functional roles for the architectural components
* Clarifies identity-related issues, across the email service
* Defines terminology for architectural components and their interactions

1.1. Framework

The first standardized architecture for networked email specified a simple split between the user world, in the form of Mail User Agents (MUA), and the transfer world, in the form of the Mail Handling Service (MHS) composed of Mail Transfer Agents (MTA). The MHS is responsible for accepting a message from one User and delivering it to one or more others, creating a virtual MUA-to-MUA exchange environment.

As shown in Figure 1 this defines two logical "layers" of interoperability. One is directly between Users. The other is between the neighboring components, along the transfer path. In addition, there is interoperability between the layers, first when a message is posted from the User to the MHS and later when it is delivered from the MHS to the User.

The operational service has evolved sub-divisions for each of these layers into more specialized modules. Core aspects of the service, such as mailbox addressing and message format style, have remained remarkably constant.
So the original distinction between user-level concerns and transfer-level concerns is retained, but with an elaboration to each level of the architecture. The term "Internet Mail" is used to refer to the entire collection of user and transfer components and services.

For Internet Mail the term "end-to-end" usually refers to a single posting and the set of deliveries directly resulting from its single transiting of the MHS. A common exception is with group dialogue that is mediated via a mailing list, so that two postings occur before intended recipients receive an Author's message, as discussed in Section 2.1.4. In fact some uses of email consider the entire email service -- including Author and Recipient -- as a subordinate component. For these services "end-to-end" refers to points outside of the email service. Examples are voicemail over email [RFC3801], EDI over email [RFC1767] and facsimile over email [RFC4142].

                               +--------+
            ++================>|  User  |
            ||                 +--------+
            ||                      ^
+--------+  ||          +--------+  .
|  User  +==++=========>|  User  |  .
+---+----+  ||          +--------+  .
    .       ||               ^      .
    .       ||   +--------+  .      .
    .       ++==>|  User  |  .      .
    .            +--------+  .      .
    .                 ^      .      .
    .                 .      .      .
    V                 .      .      .
+---+-----------------+------+------+---+
|   .                 .      .      .   |
|   .................>.      .      .   |
|   .                        .      .   |
|   ........................>.      .   |
|   .                               .   |
|   ...............................>.   |
|                                       |
|     Mail Handling Service (MHS)       |
+---------------------------------------+

Figure 1: Basic Internet Mail Service Model

1.2. Service Overview

End-to-end Internet Mail exchange is accomplished by using a standardized infrastructure comprising:

* An email object
* Global addressing
* An asynchronous sequence of point-to-point transfer mechanisms
* No prior arrangement between MTAs or between an Author and their Recipients.
* No prior arrangement between point-to-point transfer services, over the open Internet
* No requirement for Author, Originator or Recipients to be online at the same time.

The end-to-end portion of the service is the email object, called a message. Broadly the message, itself, distinguishes between control information for handling, versus the author's message content.

A precept to the design of mail over the open Internet is permitting user-to-user and MTA-to-MTA interoperability to take place with no prior, direct arrangement between the independent administrative authorities responsible for handling a message. That is, all participants rely on the core services being universally supported and accessible, either directly or through gateways that translate between Internet Mail and email environments that conform to other standards. Given the importance of spontaneity and serendipity in the world of human communications, this lack of prearrangement between participants is a core benefit of Internet Mail and remains a core requirement for it.

Within localized networks at the edge of the public Internet, prior administrative arrangement often is required and can include access control, routing constraints and information query service configuration. Although recipient authentication has usually been required for message access, since the beginning of Internet mail, in recent years it also has been required for message submission. In these cases a server performs explicit validation of the client's identity, whether by explicit security protocols or by implicit infrastructure query to identify "local" participants.

1.3. Document Conventions

References to structured fields of a message use a two-part dotted notation. The first part cites the document that contains the specification for the field and the second is the name of the field.
Hence <RFC2822.From> is the From: header field in an email content header and <RFC2821.MailFrom> is the address in the SMTP "Mail From" command.

When occurring without the RFC2822 qualifier, header field names are shown with a colon suffix. For example, From:.

References to labels for actors, functions or components have the first letter capitalized.

Also, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

Discussion venue: Please direct discussion about this document to the IETF-SMTP mailing list.

1.4. Changes for This Version

INSTRUCTIONS TO THE RFC EDITOR: Remove this sub-section prior to publication.

Many small editing changes, for wordsmithing improvements and to make details more consistent. This section documents changes with significant impact.

1.4.1. email-arch-11-12

Added pictorial artwork for PDF version.

1.4.2. email-arch-11-11

Security Considerations: Extensive additions.

Border MTA: Added section on this.

Internationalization: Added tiny section on this.

MUA: Deferred interesting issues over to RFC 2822, rather than trying to repeat them.

Mailing List: Added discussion of Reply-To challenges.

Services diagram: Collapsed rMS into single box, deferring distributed agent implementation issues.

Transition modes: Added section to discuss push/pull.

Implementation and Operation: Added section to hold discussion of architecture vs. implementation, and various access "modes", as examples.

Indexing: Started adding entries for index.

Delivery: Added clarifications of construct.

1.4.3. email-arch-11-04dc

Decoupled receiving MS: The linkage between a local MS and a remote one has been removed, to avoid dealing with the messy, and apparently very rare, interaction issues when they are linked.

Reference to common folders: Tuned the language to emphasize that the text offers a common exemplar, not a requirement or restriction and not tied to IMAP.

Alias vs. Mailing List: Attempted to clarify details about Alias and Mailing List mediators, to make their differences objectively clear. However it's not clear that the distinction can be held that cleanly.

Diagrams: Added emphasis to beginning and end nodes, and distinguished primary paths from secondary (return) paths. Before publication, non-ASCII art versions will be added for the xml document source.

Authentication for posting: Clarified recent, increased use of submission-time authentication.

Submission and Delivery transition: Clarified figure and text to explain transfer of responsibility into and out of MHS, with (S) and (D) transitions in Figure 5.

Assigning message-id: Slight tweak of language about assigning and consuming message-id.

BCC: Removed paragraph that elaborated on BCC usage and styles, instead just referring to RFC2822.

Dest ADMD pre-delivery msg mod: Added note that MTA in destination ADMD that changes the message is a Gateway, not an MTA...

Multiple Msg-ids: Removed reference to messages sometimes having more than one message-id.

Mailing list setting Reply-To: Added note in Mailing List discussion about controversy of setting Reply-To.

Gateway examples: Added citations to fax, vpim and mms gateway specs. Did not include X.400, since it is essentially obsolete.

Auto-Responder ** pending RFC3834 ***

Citations for normative assertions: *** ***

Origin vs. Submission: Clarified Actor diagram and text to map Originator actor to MSA functionality.
Reserved mailbox names: Added mention of RFC2142.

MTA sub-roles: border/inbound/outbound/final mta references.

Mediator common info: Mediator section initial text had standalone table of information. It has been changed to refer only to information that is common for all Mediators. Within specific Mediator discussions, any listing of this common information was redundant and has been removed.

Security Considerations: Meager revisions to Security Considerations, stating that the underlying service does not attend to them, really, and that particular specifications cover their particular Considerations. This is intended to acknowledge the reality of Security Considerations but defer meaningful handling of them to concrete specifications. This architecture document defers to specifications where possible, in order to avoid divergent discussions.

Internationalization: The document contains no discussion of this important topic. Frankly, I have not thought of what the document should usefully say about the topic and am particularly worried that the problematic and fluid nature of the topic would cause the architecture document to conflict with the reality of work that is underway. Offerings of candidate text for the document, to deal with I18N are encouraged.

2. Responsible Actor Roles

Internet Mail is a highly distributed service, with a variety of actors serving different roles. These divide into 3 basic types:

* User
* Mail Handling Service (MHS)
* ADministrative Management Domain (ADMD)

Although related to a technical architecture, the focus on Actors concerns participant responsibilities, rather than on functionality of modules. Hence the labels used are different than for classic email architecture diagrams.

2.1. User Actors

Users are the sources and sinks of messages. They can be humans, organizations or processes.
They can have an exchange that iterates and they can expand or contract the set of users participating in a set of exchanges. In Internet Mail there are three types of user-level Actors:

* Authors
* Recipients
* Mediators

Figure 2 depicts the flow of messages among Actors:

++==========++
||  Author  ||<..................................<..
++=++=++=++=++                                     .
   || || ||     ++===========++                    .
   || || ++====>|| Recipient ||                    .
   || ||        ++=====+=====++                    .
   || ||               .                           .
   || ||               ..........................>.+
   || ||                                           .
   || ||               ...................         .
   || ||               .                 .         .
   || ||               V                 .         .
   || ||         +-----------+    ++=====+=====++  .
   || ++========>| Mediator  +===>|| Recipient ||  .
   ||            +-----+-----+    ++=====+=====++  .
   ||                  .                 .         .
   ||                  ..................+.......>.+
   ||                                              .
   ||    ..............+..................         .
   ||    .             .                 .         .
   \/    V             V                 '         .
+-----------+    +-----------+    ++=====+=====++  .
| Mediator  +===>| Mediator  +===>|| Recipient ||  .
+-----+-----+    +-----+-----+    ++=====+=====++  .
      .                .                 .         .
      .................+.................+.......>..

Figure 2: Relationships Among User Actors

From the User-level perspective all mail transfer activities are performed by a monolithic Mail Handling Service (MHS), even though the actual service can be provided by many independent organizations. Users are customers of this unified service.

2.1.1. Author

This is the user-level participant responsible for creating the message, its contents and its list of recipient addresses. The MHS operates to transfer and deliver mail from an Author to Recipients. As described below, the MHS has an Originator role that correlates with the separate, user-level Author role and a Destination role that correlates with the separate, user-level Recipient role.

2.1.2. Recipient

The Recipient is a consumer of delivered message content.
As described below, the MHS has a "Dest[ination]" role that correlates with the user-level Recipient role.

A Recipient can close the user-level communication loop by creating and submitting a new message that replies to an Author. An example of an automated form of reply is the Message Disposition Notification (MDN), which informs the Author about the Recipient's handling of the message. (See Section 4.1.)

2.1.3. Return Handler

The Return Handler -- also called "Bounce Handler" -- receives and services notifications generated by the MHS, as a result of efforts to transfer or deliver the message. Notices can be about failures or completions and are sent to an address that is specified by the Originator. This Return handling address (also known as a Return address) might have no visible characteristics in common with the address of the Author or Originator.

2.1.4. Mediator

A Mediator receives, aggregates, reformulates and redistributes messages as part of a potentially-protracted, higher-level exchange among Users. It is easy to confuse this user-level activity with the underlying MHS transfer exchanges. However they serve very different purposes and operate in very different ways. Mediators are considered extensively in Section 5.

When mail is delivered to a receiving Mediator specified in the RFC2821.RcptTo command, the MHS handles it the same way as for any other Recipient. That is, the MHS only sees posting and delivery sources and sinks and does not see (later) re-posting as a continuation of a process. Hence when submitting messages, the Mediator is an Author.

The distinctive aspects of a Mediator are, therefore, above the MHS. A Mediator preserves the Author information of the message it reformulates, but is permitted to make meaningful changes to the message's content or envelope.
Hence the MHS sees a new message, but Users receive a message that is interpreted as primarily being from -- or, at least, initiated by -- the author of the original message. The role of a Mediator permits distinct, active creativity, rather than being limited to the more constrained job of merely connecting together other participants. Hence it is really the Mediator that is responsible for the new message.

A Mediator's task can be complex and contingent, such as modifying and adding content or regulating which users are allowed to participate and when. The popular example of this role is a group mailing list. A sequence of Mediators might even perform a series of formal steps, such as reviewing, modifying and approving a purchase request.

Because a Mediator originates messages, it can also receive replies. So a Mediator really is a full-fledged User.

Gateway: A Gateway is a particularly interesting form of Mediator. It is a hybrid of User and Relay that interconnects heterogeneous mail services. Its goal is to emulate a Relay, and a detailed discussion is in Section 2.2.3.

2.2. Mail Handling Service (MHS) Actors

The Mail Handling Service (MHS) has the task of performing a single, end-to-end transfer on behalf of the Author and reaching the Recipient address(es) specified in the original RFC2821.RcptTo commands. Mediated or protracted, iterative exchanges, such as those used for collaboration over time, are part of the User-level service, and are not part of this transfer-level Handling Service.

Figure 3 depicts the relationships among transfer participants in Internet Mail. It shows the Origin[ator] as distinct from the Author, and Dest[ination] as distinct from Recipient, although it is common for each pair to be the same actor.
Transfers typically entail one or more Relays. However direct delivery from the Originator to Destination is possible. For intra-organization mail services, it is common to have only one Relay.

  ++==========++                        ++===========++
  ||  Author  ||                        || Recipient ||
  ++====++====++   +--------+           ++===========++
        ||         | Return |                  /\
        ||         +-+------+                  ||
        \/           .    ^                    ||
    +---------+      .    .                +---++---+
    |         |      .    .                |        |
//==+=========+============================+========+===\\
||  |         |      .    .      MHS       |        |   ||
||  | Origin  +<......    .................+  Dest  |   ||
    |         |           ^                |        |
    +---++----+           .                +--------+
        ||                .                    /\
        ||  ..............+..................  ||
        \/  .             .                 .  ||
    +-------+-+        +--+------+        +-+--++---+
    |  Relay  +=======>|  Relay  +=======>|  Relay  |
    +---------+        +----++---+        +---------+
                            ||
                            ||
                            \/
                       +---------+
                       | Gateway +-->...
                       +---------+

Figure 3: Relationships Among MHS Actors

2.2.1. Originator

The Originator role ensures that a message is valid for posting and then submits it to a Relay. In effect, this actor is responsible for the Mail Submission Agent's functions. Message validity includes conformance with Internet Mail standards, as well as with local operational policies. The Originator can simply review the message for conformance and reject it if there are errors, or it can create some or all of the necessary information.

The Originator operates with dual "allegiance". It serves the Author and can be the same entity. However its role in assuring validity means that it MUST also represent the local operator of the MHS, that is, the local ADministrative Management Domain (ADMD).

The Originator also performs any post-submission, Author-related administrative tasks associated with message transfer and delivery. Notably this pertains to error and delivery notices, as well as enforcement of local policies or otherwise dealing with messages from the Author that prove to be problematic for the Internet.
Hence the Originator is best held "accountable" for the message content, even when it is not "responsible" for it. That is, the Author creates the content, but the Originator is the administrative point of contact for handling issues with the message.

2.2.2. Relay

A mail Relay performs email transfer-service routing and store-and-forward by (re-)transmitting the message on towards its Recipient(s). A Relay adds trace information [RFC2505]. However it does not modify existing envelope information or the message content semantics. It can modify message content representation, such as a change between binary and text transfer-encoding form, only as required to meet the capabilities of the next hop in the MHS.

A set of Relays composes a Mail Handling Service (MHS) network. This is above any underlying packet-switching network that they might be using and below any gateways or other user-level Mediators.

In other words, interesting email scenarios can involve three distinct architectural layers of store-and-forward service:

* User Mediators
* MHS Relays
* Packet Switches

with the bottom-most usually being the Internet's IP service. The most basic email scenarios involve Relays and Switches.

Aborting a message transfer results in having the Relay become an Author and sending an error message to the Return address. The potential for looping is avoided by having this message, itself, contain no Return address.

2.2.3. Gateway

A Gateway is a hybrid form of User and Relay that interconnects heterogeneous mail services. Its purpose is simply to emulate a Relay and the closer it comes to this, the better. However it operates at the User level, because it needs the ability to modify message content.

Differences between mail services can be as small as minor syntax variations, but usually encompass significant, semantic distinctions.
One difference could have the concept of an email address being a hierarchical, machine-specific address, versus having it be a flat, global name space. Another difference could be between text-only content, versus multi-media. Hence the Relay function in a Gateway offers significant design challenges, if the resulting performance is to be close to seamless. The challenge is to ensure that user-to-user functionality is assured between the services, in spite of differences in their syntax and semantics.

The basic test of a Gateway's adequacy is whether an Author on one side of a Gateway can send a useful message to a Recipient on the other side, without requiring changes to any of the components in the Author's or Recipient's mail services, other than adding the Gateway. To each of these otherwise independent services, the Gateway will appear to be a "native" participant. However the ultimate test of a Gateway's adequacy is whether the Author and Recipient can sustain a dialogue. In particular can a Recipient's MUA automatically formulate a valid Reply that will reach the initial Author?

2.3. Administrative Actors

Actors can be associated with different organizations, each with its own administrative authority. This operational independence, coupled with the need for interaction between groups, provides the motivation for distinguishing among ADministrative Management Domains (ADMD). Each ADMD can have vastly different operating policies and trust-based decision-making. An obvious example is the distinction between mail that is exchanged within a single organization, versus mail that is exchanged between independent organizations. The rules for handling these two types of traffic tend to be quite different. That difference requires defining the boundaries of each, and this requires the ADMD construct.

Operation of Internet Mail services is apportioned to different providers (or operators). Each can be an independent ADMD.
This independence of administrative decision-making defines boundaries that distinguish different portions of the Internet Mail service. A department operating a local Relay, an IT department operating an enterprise Relay and an ISP operating a public shared email service can be configured into many combinations of administrative and operational relationships. Each is a distinct ADMD, potentially having a complex arrangement of functional components. Figure 4 depicts relationships among ADMDs. The benefit of having the ADMD construct is to facilitate discussion about designs and operations that need to distinguish between "internal" issues and "external" ones.

The architectural impact of needing to have boundaries between ADMDs is discussed in [Tussle]. Most significant is that the entities communicating across ADMD boundaries will typically have an added burden to enforce organizational policies concerning "external" communications. At a more mundane level, routing mail between ADMDs can be an issue, such as needing to route mail for partners over specially-trusted paths.

Basic types of ADMDs include --

Edge: Independent transfer services, in networks at the edge of the open Internet Mail service.

User: End-user services. This might be subsumed under the Edge service, such as is common for web-based email access.

Transit: These are Mail Service Providers (MSP) offering value-added capabilities for Edge ADMDs, such as aggregation and filtering.

The mail-level transit service is different from packet-level switching. End-to-end packet transfers usually go through intermediate routers, while email exchange across the open Internet can be directly between the Boundary MTAs of Edge ADMDs.
This further highlights the differences discussed in Section 2.2.2.

+--------+     +---------+     +-------+     +-----------+
|  ADMD1 |<===>|  ADMD2  |<===>| ADMD3 |<===>|   ADMD4   |
|  ----- |     |  -----  |     | ----- |     |   -----   |
|        |     |         |     |       |     |           |
| Author |     |         |     |       |     |           |
|   .    |     |         |     |       |     |           |
|   V    |     |         |     |       |     |           |
|  Edge..+....>|.Transit.+....>|-Edge..+....>|.Recipient |
|        |     |         |     |       |     |           |
+--------+     +---------+     +-------+     +-----------+

Figure 4: Administrative Domain (ADMD) Example

Edge networks can use proprietary email standards internally. However the distinction between Transit network and Edge network transfer services is primarily significant because it highlights the need for concern over interaction and protection between independent administrations. In particular this distinction calls for additional care in assessing transitions of responsibility, as well as the accountability and authorization relationships among participants in email transfer.

ADMD component interactions are subject to the policies of that domain, covering such things as:

* Reliability
* Access control
* Accountability
* Content evaluation and modification

They can be implemented in different functional components, according to the needs of the ADMD. For example see [RFC5068].

User, Edge and Transit services can be offered by providers that operate component services or sets of services. Further it is possible for one ADMD to host services for other ADMDs.

Common ADMD examples are --

Enterprise Service Providers: Operating an organization's internal data and/or mail services.

Internet Service Providers: Operating underlying data communication services that, in turn, are used by one or more Relays and Users.
It is not necessarily their job to perform email functions, but they can, instead, provide an environment in which those functions can be performed.

Mail Service Providers:  Operating email services, such as for end-users, or mailing lists.

Operational pragmatics dictate that providers be involved in administration and enforcement issues. This can include operators of lower-level packet services.

3.  Identities

Internet Mail uses three forms of identity: mailbox, domain name and message-id. Each is required to be globally unique.

3.1.  Mailbox

"A mailbox sends and receives mail. It is a conceptual entity which does not necessarily pertain to file storage." [RFC2822]

A mailbox is specified as an Internet Mail address <addr-spec>. It has two distinct parts, divided by an at-sign ("@"). The right-hand side is a globally interpreted domain name that is associated with an ADMD. Domain Names are discussed in Section 3.2. Formal Internet Mail addressing syntax can support source routes, to indicate the path through which a message ought to be sent. The use of source routes is not common and has been deprecated in [RFC2821].

The portion to the left of the at-sign contains a string that is globally opaque and is called the <local-part>. It is to be interpreted only by the entity specified by the address's right-hand side domain name. All other entities MUST treat the local-part as an uninterpreted literal string and MUST preserve all of its original details. As such its public distribution is equivalent to sending a Web browser "cookie" that is only interpreted upon being returned to its Author.

3.1.1.  Global Standards for Local-Part

Some local-part values have been standardized, for contacting personnel at an organization. These names cover common operations and business functions [RFC2142].

It is common for sites to have local structuring conventions for the left-hand side of an <addr-spec>.
This permits sub-addressing, such as for distinguishing different discussion groups used by the same participant. However it is worth stressing that these conventions are strictly private to the user's organization and SHOULD NOT be interpreted by any domain except the one listed in the right-hand side of the addr-spec. The exceptions are those specialized services conforming to standardized conventions, as noted below.

There are a few types of addresses that have an elaboration on basic email addressing, with a standardized, global schema for the local-part. These are conventions between authoring systems and Recipient Gateways, and they are invisible to the public email transfer infrastructure. When an Author is explicitly sending via a Gateway out of the Internet, there are coding conventions for the local-part, so that the Author can formulate instructions for the Gateway. Standardized examples of this are the telephone numbering formats for VPIM [RFC3801], such as "+16137637582@vpim.example.com", and iFax [RFC3192], such as "FAX=+12027653000/T33S=1387@ifax.example.com".

3.1.2.  Scope of Email Address Use

Email addresses are being used far beyond their original email transfer and delivery role. In practical terms, an email address string has become the common identifier for representing online identity. What is essential, then, is to be clear about the nature and role of an identity string in a particular context and to be clear about the entity responsible for setting that string. For example, see Section 4.1.4, Section 4.3.3, and Section 5.

3.2.  Domain Names

A domain name is a global reference to an Internet resource, such as a host, a service or a network. A domain name usually maps to one or more IP Addresses. Conceptually the name might encompass an entire organization, a collection of machines integrated into a homogeneous service, or only a single machine. A domain name can be administered to refer to individual users, but this is not common practice.
The name is structured as a hierarchical sequence of sub-names, separated by dots ("."), with the top of the hierarchy being on the right end of the sequence. Domain names are defined and operated through the Domain Name System (DNS) [RFC1034], [RFC1035], [RFC2181].

When not part of a mailbox address, a domain name is used in Internet Mail to refer to the ADMD or the host that took action upon the message, such as providing the administrative scope for a message identifier, or performing transfer processing.

3.3.  Message Identifier

There are two standardized tags for identifying messages: Message-ID: and ENVID. Essentially, a Message-ID pertains to content, while an ENVID pertains to transfer.

3.3.1.  Message-ID

Internet Mail standards provide for, at most, a single Message-ID:. The Message-ID: is a user-level tag that has a variety of uses, including threading, aiding identification of duplicates, and DSN tracking [RFC2822]. The Originator assigns the Message-ID:. The recipient's ADMD is the intended consumer of the Message-ID:, although any actor along the transfer path can use it.

Message-ID: is required to be globally unique. It has a format that is similar to that of a mailbox, with two distinct parts, divided by an at-sign ("@"). Typically the right-hand side specifies the ADMD or host assigning the identifier, and the left-hand side contains a string that is globally opaque and serves to uniquely identify the message within the domain referenced on the right-hand side. The duration of uniqueness for the message identifier is undefined.

When a message is revised in any way, the question of whether to assign a new Message-ID: requires a subjective assessment, deciding whether the editorial content has been changed enough to constitute a new message.
[RFC2822] says "a message identifier pertains to exactly one instantiation of a particular message; subsequent revisions to the message each receive new message identifiers." However real-world experience dictates some flexibility. An impossible test is whether the recipient will consider the new message to be equivalent to the old. For most components of Internet Mail, there is no way to predict a specific recipient's preferences on this matter. Both creating and failing to create a new Message-ID: have their downsides.

Here are some guidelines and examples:

*  If a message is changed only in terms of form, such as character-encoding, it clearly is still the same message.

*  If a message has minor additions to the content, such as a mailing list tag at the beginning of the RFC2822.Subject header field, or some mailing list administrative information added to the end of the primary body-part's text, then it probably is still the same message.

*  If a message has viruses deleted from it, it probably is still the same message.

*  If a message has offensive words deleted from it, then some recipients will consider it the same message, but some will not.

*  If a message is translated into a different language, then some recipients will consider it the same message, but some will not.

*  If a message is included in a digest of messages, the digest constitutes a new message.

*  If a message is forwarded by a recipient, what is forwarded is considered to be a new message.

*  If a message is "redirected", such as using RFC2822 "Resent-*" header fields, some recipients will consider it the same message, but some will not.
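The guidelines above can be restated as a small lookup table. The following Python sketch is only an illustration of that list and is not part of any Internet Mail standard. The change labels, the Verdict names and the needs_new_message_id helper are hypothetical, and the cases that the text qualifies with "probably" are simplified to a plain "same message" verdict.

```python
from enum import Enum


class Verdict(Enum):
    SAME = "still the same message"
    NEW = "a new message"
    RECIPIENT_DEPENDENT = "some recipients will consider it the same, some will not"


# Keys are hypothetical labels for the kinds of change listed in the guidelines.
GUIDELINES = {
    "form_only": Verdict.SAME,                  # e.g. character-encoding change
    "minor_additions": Verdict.SAME,            # list tag or list footer ("probably")
    "viruses_deleted": Verdict.SAME,            # "probably"
    "offensive_words_deleted": Verdict.RECIPIENT_DEPENDENT,
    "translated": Verdict.RECIPIENT_DEPENDENT,
    "included_in_digest": Verdict.NEW,          # the digest is the new message
    "forwarded_by_recipient": Verdict.NEW,
    "redirected": Verdict.RECIPIENT_DEPENDENT,  # e.g. Resent-* header fields
}


def needs_new_message_id(change):
    """Return True or False where the guidelines lean one way, else None."""
    verdict = GUIDELINES[change]
    if verdict is Verdict.NEW:
        return True
    if verdict is Verdict.SAME:
        return False
    return None  # subjective case: no objective answer is available


if __name__ == "__main__":
    for change, verdict in GUIDELINES.items():
        print(f"{change:<26} {verdict.value}")
```

The None result for the recipient-dependent cases is deliberate: it mirrors the point that no objective test exists for them.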
The absence of objective, precise criteria for Message-ID: re-generation, along with the absence of strong protection associated with the string, means that the presence of an ID can permit an assessment that is marginally better than a heuristic, but the ID certainly has no value on its own for strict formal reference or comparison. Hence Message-ID: SHOULD NOT be used for any function that has security implications.

3.3.2.  ENVID

The ENVID can be used for message tracking purposes [RFC3885] concerning a single posting/delivery transfer. The ENVID (envelope identifier) labels a single transit of the MHS by a specific message. So, the ENVID is used from one message posting, until the directly-resulting message deliveries. A re-posting of the message, such as by a Mediator, does not re-use that ENVID, but can use a new one, even though the message might legitimately retain its original Message-ID:.

The format of an ENVID is free-form. Although its creator might choose to impose structure on the string, none is imposed by Internet standards. By implication, the scope of the string is defined by the domain name of the Return Address.

4.  Services and Standards

Internet Mail's architecture distinguishes among six basic types of functionality, arranged to support a store-and-forward service architecture. As shown in Figure 5 these types can have multiple instances, some of which represent specialized sub-roles. This section considers the activities and relationships among these components, and the Internet Mail standards that apply to them.

1.  Message

2.  Mail User Agent (MUA)

       Author MUA (aMUA)

       Recipient MUA (rMUA)

3.  Message Submission Agent (MSA)

       Author-focused MSA functions (aMSA)

       MHS-focused MSA functions (hMSA)

4.  Message Transfer Agent (MTA)

5.  Message Delivery Agent (MDA)

       Recipient-focused MDA functions (rMDA)

       MHS-focused MDA functions (hMDA)

6.  Message Store (MS)

       1.  Author MS (aMS)

       2.  Recipient MS (rMS)

This section describes each functional component for Internet Mail, and the standards-based protocols associated with their operation. Figure 5 shows function modules and the standardized protocols used between them.

                  ++========++
                  ||        ||                             +-------+
       ...........++  aMUA  ||<............................+ Disp  |
       .          ||        ||                             +-------+
       .          ++=+==+===++                                 ^
       .  local,imap}|  |{smtp,submission                      .
       .  +-----+    |  |                          +--------+  .
       .  | aMS |<---+  | ........................>| Return |  .
       .  +-----+       | .                        +--------+  .
       .                | .    *****************      ^        .
       .          +-----V-.----*------------+  *      .        .
       .     MSA  | +-------+  *   +------+ |  *      .        .
       .          | | aMSA  +-(S)->| hMSA | |  *      .        .
       .          | +-------+  *   +--+---+ |  *      .        .
       V          +------------*------+-----+  *      .        .
//==========\\                 *      V {smtp  *      .        .
||  MESSAGE ||                 *   +------+    * //===+===\\   .
||----------||       MHS       *   | MTA  |    * ||  dsn  ||   .
|| Envelope ||                 *   +--+---+    * \\=======//   .
||  SMTP    ||                 *      V {smtp  *     ^   ^     .
|| Content  ||                 *   +------+    *     .   . //==+==\\
||  RFC2822 ||                 *   | MTA  +....*......   . || mdn ||
||  MIME    ||                 *   +--+---+    *         . \\=====//
\\==========//                 * smtp}| {local *         .     ^
       . MDA                   *      | {lmtp  *         .     .
       .      +----------------+------V-----+  *         .     .
       .      | +----------+   *   +------+ |  *         .     .
       .      | |          |   *   |      | +..*..........     .
       .      | |   rMDA   |<-(D)--+ hMDA | |  *               .
       .      | |          |   *   |      | |<.*........       .
       .      | +-+------+-+   *   +------+ |  *       .       .
       .      +------+---------*------------+  *       .       .
       .             |         *****************       .       .
       .             V{smtp,imap,pop,local             .       .
       .          +-----+                         //===+===\\  .
       .          | rMS |                         || sieve ||  .
       .          +--+--+                         \\=======//  .
       .             |{imap,pop,local                  ^       .
       .             V                                 .       .
       .       ++==========++                          .       .
       .       ||          ||                          .       .
       .......>||   rMUA   ++...........................       .
               ||          ++...................................
               ++==========++

Figure 5: Protocols and Services

4.1.  Message Data

The purpose of the Mail Handling Service (MHS) is to exchange a message object among participants [RFC2822], [RFC0822]. Hence all of its underlying mechanisms are merely in the service of getting that message from its Author to its Recipients. A message can be explicitly labeled as to its nature [RFC3458].

A message comprises a transit handling envelope and the message content. The envelope contains information used by the MHS. The content is divided into a structured header and the body. The header comprises transit trace information and end-user structured fields. The body can be unstructured simple lines of text, or it can be a MIME tree of multi-media subordinate objects, called body-parts, or attachments [RFC2045], [RFC2046], [RFC2047], [RFC4288], [RFC4289], [RFC2049].

In addition, Internet Mail has a few conventions for special control data --

Delivery Status Notification (DSN):  A Delivery Status Notification (DSN) is a message that can be generated by the MHS (MSA, MTA or MDA) and sent to the RFC2821.MailFrom address. An MDA and MTA are shown as sources of DSNs in Figure 5 and the destination is shown as Return. DSNs provide information about message transit, such as transfer errors or successful delivery [RFC3461].

Message Disposition Notification (MDN):  A Message Disposition Notification (MDN) is a message that provides information about user-level, Recipient-side message processing, such as indicating that the message has been displayed [RFC3798] or the form of content that can be supported [RFC3297]. It can be generated by an rMUA and is sent to the Disposition-Notification-To address(es). The mailbox for this is shown as Disp in Figure 5.

Message Filtering (SIEVE):  SIEVE is a scripting language that permits specifying conditions for differential handling of mail, typically at the time of delivery [RFC5228].
It can be conveyed in a variety of ways, such as a MIME part. Figure 5 shows a Sieve specification going from the rMUA to the MDA. However filtering can be done at many different points along the transit path and any one or more of them might be subject to Sieve directives, especially within a single ADMD. Hence the Figure shows only one relationship, for (relative) simplicity.

4.1.1.  Envelope

Internet Mail has a fragmented framework for transit-related "handling" information. Information that is directly used by the MHS is called the "envelope". It directs handling activities by the transfer service and is carried in transfer service commands. That is, the envelope exists in the transfer protocol SMTP [RFC2821].

Trace information, such as RFC2822.Received, is recorded in the message header and is not subsequently altered [RFC2822].

4.1.2.  Header Fields

Header fields are attribute name/value pairs covering an extensible range of email service, user content and user transaction meta-information. The core set of header fields is defined in [RFC2822], [RFC0822]. It is common to extend this set, for different applications. Procedures for registering header fields are defined in [RFC3864]. An extensive set of existing header field registrations is provided in [RFC4021].

One danger with placing additional information in header fields is that Gateways often alter or delete them.

4.1.3.  Body

The body of a message might simply be lines of ASCII text or it might be hierarchically structured into a composition of multi-media body-part attachments, using MIME [RFC2045], [RFC2046], [RFC2047], [RFC4288], [RFC2049].

4.1.4.  Identity References in a Message

These are the core identifiers present in a message during transit:

+-----------------------+----------------+-------------------------+
| Layer                 | Field          | Set By                  |
+-----------------------+----------------+-------------------------+
| Message Body          | MIME Header    | Author                  |
| Message header fields | From:          | Author                  |
|                       | Sender:        | Originator              |
|                       | Reply-To:      | Author                  |
|                       | To:, CC:, BCC: | Author                  |
|                       | Message-ID:    | Originator              |
|                       | Received:      | Originator, Relay, Dest |
|                       | Return-Path:   | MDA, from MailFrom      |
|                       | Resent-*:      | Mediator                |
|                       | List-Id:       | Mediator Author         |
|                       | List-*:        | Mediator Author         |
| SMTP                  | HELO/EHLO      | Latest Relay Client     |
|                       | ENVID          | Originator              |
|                       | MailFrom       | Originator              |
|                       | RcptTo         | Author                  |
|                       | ORCPT          | Author                  |
| IP                    | Source Address | Latest Relay Client     |
+-----------------------+----------------+-------------------------+

Layered Identities

The most common address-related fields are:

RFC2822.From:  Set by - Author

   Names and addresses for author(s) of the message content are listed in the From: field.

RFC2822.Reply-To:  Set by - Author

   If a message Recipient sends a reply message that would otherwise use the RFC2822.From field address(es) contained in the original message, then they are instead to use the address(es) in the RFC2822.Reply-To field. In other words this field is a direct override of the From: field, for responses from Recipients.

RFC2822.Sender:  Set by - Originator

   This specifies the address responsible for submitting the message into the transfer service. For efficiency this field can be omitted if it contains the same address as RFC2822.From. However this does not mean there is no Sender specified. Rather it means that the header field is virtual and that the address in the From: field MUST be used.
   Specification of the notification Return addresses -- contained in RFC2821.MailFrom -- is made by the RFC2822.Sender. Typically the Return address is the same as the Sender address. However some usage scenarios require it to be different.

RFC2822.To/.CC:  Set by - Author

   These specify MUA Recipient addresses. However some or all of the addresses in these fields might not be present in the RFC2821.RcptTo commands.

   The distinction between To and CC is subjective. Generally a To addressee is considered primary and is expected to take action on the message. A CC addressee typically receives a copy only for their information.

RFC2822.BCC:  Set by - Author

   A message might be copied to an addressee whose participation is not to be disclosed to the RFC2822.To or RFC2822.CC Recipients and, usually, not to the other BCC Recipients. The BCC: header field indicates a message copy to such a Recipient. Use of this field is discussed in [RFC2822].

RFC2821.HELO/.EHLO:  Set by - Originator, MSA, MTA

   Any SMTP client -- including Originator, MSA or MTA -- can specify its hosting domain identity for the SMTP HELO or EHLO command operation.

RFC3461.ENVID:  Set by - Originator

   The MSA can specify an opaque string, to be included in a DSN, as a means of assisting the Return address recipient in identifying the message that produced a DSN, or message tracking.

RFC2821.MailFrom:  Set by - Originator

   This is an end-to-end string that specifies an email address for receiving return control information, such as returned messages. The name of this field is misleading, because it is not required to specify either the author or the Actor responsible for submitting the message. Rather, the Actor responsible for submission specifies the RFC2821.MailFrom address.
   Ultimately the simple basis for deciding what address needs to be in the RFC2821.MailFrom is to determine what address needs to be informed about transfer-level problems (and, possibly, successes).

RFC2821.RcptTo:  Set by - Author, Final MTA, MDA

   This specifies the MUA mailbox address of a recipient. The string might not be visible in the message content header. For example, the message destination address header fields, such as RFC2822.To, might specify a mailing list mailbox, while the RFC2821.RcptTo address specifies a member of that list.

RFC3461.ORCPT:  Set by - Author

   An optional parameter to the RCPT command, indicating the original address to which the current RCPT TO address corresponds, after a mapping was performed during transit. An ORCPT is the only reliable way to correlate a DSN from a multi-recipient message transfer with the intended recipient.

RFC2821.Received:  Set by - Originator, Relay, Mediator, Dest

   This indicates trace information, including originating host, relays, Mediators, and MSA host domain names and/or IP Addresses.

RFC2821.Return-Path:  Set by - Originator

   The MDA records the RFC2821.MailFrom address into the RFC2822.Return-Path field.

RFC2919.List-Id:  Set by - Mediator Author

   This provides a globally unique mailing list naming framework that is independent of particular hosts [RFC2919].

   The identifier is in the form of a domain name; however the string usually is constructed by combining the two parts of an email address and the result rarely is a true domain name, listed in the domain name service -- although it can be.

RFC2369.List-*:  Set by - Mediator Author

   [RFC2369] defines a collection of message header fields for use by mailing lists. In effect they supply list-specific parameters for common mailing list user operations. The identifiers for these operations are for the list, itself, and the user-as-subscriber [RFC2369].
RFC0791.SourceAddr:  Set by - The Client SMTP sending host immediately preceding the current receiving SMTP server.

   [RFC0791] defines the basic unit of data transfer for the Internet, the IP Datagram. It contains a "Source Address" field that specifies the IP Address for the host (interface) from which the datagram was sent. This information is set and provided by the IP layer, and is therefore independent of mail-level mechanisms. As such, it is often taken to be authoritative, although it is possible to provide false addresses.

4.2.  User-Level Services

Interactions at the user level entail protocol exchanges, distinct from those that occur at lower layers of the Internet Mail architecture, which is above the Internet Transport layer. Because the motivation for email, and much of its use, is for interaction among humans, the nature and details of these protocol exchanges often are determined by the needs of human and group communication. In terms of efforts to specify behaviors, one effect of this is to require subjective guidelines, rather than strict rules, for some aspects of system behavior. Mailing Lists provide particularly salient examples of this.

4.2.1.  Mail User Agent (MUA)

A Mail User Agent (MUA) works on behalf of end-users and end-user applications. It is their "representative" within the email service.

The Author MUA (aMUA) creates a message and performs initial "submission" into the transfer infrastructure, via a Mail Submission Agent (MSA). It can also perform any creation- and posting-time archival in its Message Store (aMS). An MUA's aMS can organize messages in many different ways. A common model is to have aggregations, called "folders". It is common to have a folder for messages under development (Drafts), one for messages waiting to be sent (Queued or Unsent) and one for messages that have been successfully posted for transfer (Sent).
However these are not required. For example, IMAP allows drafts to appear in any folder, so a dedicated Drafts folder need not be present.

The Recipient-side MUA (rMUA) works on behalf of the end-user Recipient to process received mail. This includes generating user-level return control messages, displaying and disposing of the received message, and closing or expanding the user communication loop, by initiating replies and forwarding new messages.

NOTE:  Although not shown in Figure 5, an MUA can, itself, have a distributed implementation, such as a "thin" user interface module on a limited end-user device, with the bulk of the MUA functionality operated remotely on a more capable server. An example of such an architecture might use IMAP [RFC3501] for most of the interactions between an MUA client and an MUA server. A standardized approach for such scenarios is defined by [RFC4550].

A Mediator is a special class of MUA. It performs message re-posting, as discussed in Section 2.1.

An MUA can be automated, on behalf of an end-user who is not present at the time the MUA is active. One example is a bulk sending service that has a timed-initiation feature. These are not to be confused with a mailing list Mediator, in that there is no incoming message that triggers the activity of the automated service.

A popular and problematic MUA is an automatic responder, such as one that sends vacation notices. This, too, might be confused with a Mediator actor, but in fact is generating an entirely new message. Automatic responders have a tendency to annoy users of mailing lists unless they follow [RFC3834].

****** The recommendations in RFC 3834 are an important consequence of the addressing architecture of Internet mail so they do help illustrate the architecture.
*****

Identity fields relevant to a typical end-user MUA include:

   RFC2822.From

   RFC2822.Reply-To

   RFC2822.Sender

   RFC2822.To, RFC2822.CC

   RFC2822.BCC

4.2.2.  Message Store (MS)

An MUA can employ a long-term Message Store (MS). Figure 5 depicts an Origination-side MS (aMS) and a Recipient-side MS (rMS). An MS can be located on a remote server or on the same machine as the MUA.

An MS acquires messages from an MDA either by a local mechanism or by using POP or IMAP. The MUA accesses the MS either by a local mechanism or by using POP or IMAP. Using POP for message access, rather than bulk transfer, is rare, awkward, and largely non-standard.

4.3.  MHS-Level Services

4.3.1.  Mail Submission Agent (MSA)

A Mail Submission Agent (MSA) accepts the message submission from the aMUA and enforces the policies of the hosting ADMD and the requirements of Internet standards. An MSA represents an unusual functional dichotomy. A portion of its task is to represent the interests of the Author (aMUA) during message posting, to facilitate posting success, and another portion is to represent the interests of the MHS. In the architecture, this is modeled, as shown in Figure 5, by dividing the MSA into two sub-components, aMSA and hMSA, respectively. Transfer of responsibility, for a single message, from an Author's environment to the MHS, is called "posting". In Figure 5 it is marked as the "(S)" transition, within the MSA.

The hMSA's function is to take transit responsibility for a message that conforms to the relevant Internet standards and to local site policies. It rejects messages that are not in conformance. The aMSA's role is to perform final message preparation for submission and to effect the transfer of responsibility to the MHS, via the hMSA. The amount of preparation will depend upon the local implementations.
Examples of aMSA tasks include adding header fields, such as Date: and Message-ID:, and modifying portions of the message from local notations to Internet standards, such as expanding an address to its formal RFC2822 representation.

Historically, standards-based MUA/MSA interactions have used SMTP [RFC2821]. The current standards preference is SUBMISSION [RFC4409]. Although SUBMISSION derives from SMTP, it uses a separate TCP port and imposes distinct requirements, such as access authorization.

Identities relevant to the MSA include:

   RFC2821.HELO/.EHLO

   RFC3461.ENVID

   RFC2821.MailFrom

   RFC2821.RcptTo

   RFC2821.Received

   RFC0791.SourceAddr

4.3.2.  Mail Transfer Agent (MTA)

A Mail Transfer Agent (MTA) relays mail for one application-level "hop". It is like a packet-switch or IP router in that its job is to make routing assessments and to move the message closer to the Recipient(s). Relaying is performed by a sequence of MTAs, until the message reaches a destination MDA. Hence an MTA implements both client and server MTA functionality. It does not make changes to addresses in the envelope or reformulate the editorial content. Hence a change in data form, such as to the MIME Content-Transfer-Encoding, is within the purview of an MTA, whereas removal or replacement of body content is not. An MTA also adds trace information [RFC2505]. Of course email objects are typically much larger than the payload of a packet or datagram, and the end-to-end latencies are typically much higher.

NOTE:  Within a destination ADMD, email relaying modules can make a variety of changes to the message, prior to delivery. In such cases, these modules are architecturally acting as Gateways, rather than MTAs.

Internet Mail primarily uses SMTP [RFC2821], [RFC0821] to effect point-to-point transfers between peer MTAs. The basic set of protocol reply codes for SMTP has been enhanced with an extensible registry of values.
[RFC5248] Other transfer mechanisms include Batch SMTP [RFC2442] and ODMR [RFC2645]. As with most network layer mechanisms, Internet Mail's SMTP supports a basic level of reliability, by virtue of providing for retransmission after a temporary transfer failure. Contrary to typical packet switches (and Instant Messaging services) Internet Mail MTAs are expected to store messages in a manner that allows recovery across service interruptions, such as host system shutdown. However the degree of such robustness and persistence by an MTA can be variable.

The primary "routing" mechanism for Internet Mail is the DNS MX record [RFC1035], which specifies an MTA through which the queried domain can be reached. This presumes a public -- or at least a common -- backbone that permits any attached MTA to connect to any other.

MTAs can perform according to well-established sub-roles. Specifically:

Boundary MTA:  An MTA which is part of an ADMD and interacts with MTAs in other ADMDs. This is also called a Border MTA. Its role sub-divides according to the direction of mail-flow:

   Outbound MTA:  An MTA which relays messages to other ADMDs.

   Inbound MTA:  An MTA which receives inbound SMTP messages from other MTA relays (typically in other ADMDs), for example an MTA running on the host listed as the target of an MX record.

Final MTA:  The MTA that transfers a message to the MDA.

Identities relevant to the MTA include:

   RFC2821.HELO/.EHLO

   RFC3461.ENVID

   RFC2821.MailFrom

   RFC2821.RcptTo

   RFC2822.Received:  Set by - Relay Server

   RFC0791.SourceAddr

4.3.3.  Mail Delivery Agent (MDA)

A transfer of responsibility from the MHS to a Recipient's environment (mailbox) is called "delivery". In the architecture, as depicted in Figure 5, this takes place within a Mail Delivery Agent (MDA) and is shown as the "(D)" transition from the MHS-oriented MDA component (hMDA) to the Recipient-oriented MDA component (rMDA).
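As an illustration of the two transfers of responsibility described in Sections 4.3.1 and 4.3.3, the following Python sketch walks one possible path through the components of Figure 5 and labels the hand-offs that the text calls "posting" (S) and "delivery" (D). It is a simplified model under stated assumptions, not a description of any implementation: the path with exactly two MTA hops and the names PATH, TRANSITIONS, handoffs and mhs_segment are hypothetical.

```python
# One illustrative path through the Figure 5 components (two MTA hops assumed).
PATH = ["aMUA", "aMSA", "hMSA", "MTA", "MTA", "hMDA", "rMDA", "rMS", "rMUA"]

# Hand-offs that transfer responsibility, as named in Sections 4.3.1 and 4.3.3.
TRANSITIONS = {
    ("aMSA", "hMSA"): "posting (S)",
    ("hMDA", "rMDA"): "delivery (D)",
}


def handoffs(path):
    """Return (sender, receiver, label) for each adjacent pair on the path."""
    result = []
    for sender, receiver in zip(path, path[1:]):
        label = TRANSITIONS.get((sender, receiver), "")
        result.append((sender, receiver, label))
    return result


def mhs_segment(path):
    """Return the components from hMSA through hMDA, i.e. those inside the MHS."""
    start = path.index("hMSA")
    end = path.index("hMDA")
    return path[start:end + 1]


if __name__ == "__main__":
    for sender, receiver, label in handoffs(PATH):
        print(f"{sender:>5} -> {receiver:<5} {label}")
    print("Inside the MHS:", ", ".join(mhs_segment(PATH)))
```

Modeling the MHS as the segment between the two labeled transitions reflects the figure, where "(S)" and "(D)" sit on the boundary of the MHS box.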
NOTE:  In common practice, the term "delivery" can refer to the formal MHS function specified here, or to the first time a message is displayed to an end-user. A simple, practical test for whether the MHS-based definition applies is whether a DSN can be generated.

An MDA can provide distinctive, address-based functionality, made possible by its detailed knowledge of the properties of the destination address. This knowledge might also be present elsewhere in the Recipient's ADMD, such as at an organizational border (Boundary) Relay. However it is required for the MDA, if only because the MDA is required to know where to deliver the message.

As with an MSA, an MDA serves two roles, as depicted in Figure 5. Formal transfer of responsibility, called "delivery", is effected between the two components that embody these roles, as shown by "(D)" in Figure 5. The MHS portion (hMDA) primarily functions as a server SMTP engine. A common additional role is to re-direct the message to an alternative address, as specified by the recipient addressee's preferences. The job of the recipient portion of the MDA (rMDA) is to perform any delivery-actions that are desired by the recipient.

Transfer into the MDA is accomplished by a normal MTA transfer mechanism. Transfer from an MDA to an MS uses an access protocol, such as POP or IMAP.

Identities relevant to the MDA include:

RFC2821.Return-Path:  Set by - Author Originator or Mediator Originator

   The MDA records the RFC2821.MailFrom address into the RFC2822.Return-Path field.

RFC2822.Received:  Set by - MDA server

   An MDA can record a Received: header field to indicate trace information, including source host and receiving host domain names and/or IP Addresses.

4.4.  Transition Modes

From the origination site to the point of delivery, Internet mail usually follows a "push" model.
That is, the actor holding the message actively initiates transfer to the next venue, typically with SMTP [RFC2821] or LMTP [RFC2033]. With a "pull" model, the actor holding the message is passive and waits for the actor in the next venue to initiate a request for transfer. Standardized mechanisms for pull-based MHS transfer are ETRN [RFC1985] and ODMR [RFC2645].

After delivery, the recipient's MUA (or MS) can gain access by having the message pushed to it, or by "pulling" the message itself, such as by using POP [RFC1939] or IMAP [RFC3501].

4.5. Implementation and Operation

A discussion about any interesting system architecture is often complicated by confusion between architecture and implementation or operational configuration. An architecture defines the conceptual functions of a service, divided into discrete conceptual modules. An implementation of that architecture can combine or separate architectural components, as needed for a particular operational environment. For example, a software system that primarily performs message relaying -- and therefore is an MTA -- might also include MDA functionality. That same MTA system might be able to interface with non-Internet email services and therefore qualify as a Gateway. It is important not to confuse the engineering decisions made to implement a product with the architectural abstractions used to define conceptual functions.

Similarly, implemented modules might be configured to form elaborations of the architecture. An interesting example is a distributed MS. One portion might be a remote server while another is local to the MUA. As discussed in [RFC1733], the operational relationship among such MSs can be --

Online: Only a remote MS is used, with messages being accessible only when the MUA is attached to the MS, and the MUA repeatedly fetches all or part of a message, from one session to the next.
Offline: The MS is local to the user, and messages are completely moved from any remote store, rather than (also) being retained there.

Disconnected: An rMS and a uMS are kept synchronized, for all or part of their contents, while there is a connection between them. While they are disconnected, mail can continue to arrive at the rMS and the user can continue to make changes to the uMS. Upon reconnection, the two stores are re-synchronized.

5. Mediators

Basic email transfer from an Author to the specified Recipients is accomplished by using an asynchronous, store-and-forward communication infrastructure, in a sequence of independent transmissions through some number of MTAs. A very different task is a User-level sequence of postings and deliveries, through Mediators. A Mediator forwards a message through a re-posting process. The Mediator does share some functionality with basic MTA relaying, but it enjoys a degree of freedom with both addressing and content that is not available to MTAs.

The core set of message information that is commonly set by all types of Mediators is:

RFC2821.HELO/.EHLO: Set by - Mediator Originator

RFC3461.ENVID: Set by - Mediator Originator

RFC2821.RcptTo: Set by - Mediator Author

RFC2822.Received: Set by - Mediator Dest

   The Actor can record Received information, to indicate the delivery to the original address and submission to the alias address. The trace of Received: header fields can therefore include everything from original posting through final delivery.

The salient aspect of a Mediator, which distinguishes it from any other MUA creating an entirely new message, is that a Mediator preserves the integrity and tone of the original message, including the essential aspects of its origination information. The Mediator might also add commentary.
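The difference between re-posting and simple relaying can be sketched in Python. The `Envelope` type and the `repost` function are hypothetical names used only for illustration. The sketch assumes a Mediator that chooses new envelope recipients and prepends its own Received trace field, while leaving the original origination fields untouched.

```python
from dataclasses import dataclass, replace
from typing import List, Sequence, Tuple

Header = Tuple[str, str]  # (field name, field value)


@dataclass(frozen=True)
class Envelope:
    """Transfer-level addressing (hypothetical type)."""
    mail_from: str             # RFC2821.MailFrom
    rcpt_to: Tuple[str, ...]   # RFC2821.RcptTo


def repost(envelope: Envelope, headers: Sequence[Header],
           new_rcpts: Sequence[str], trace: str
           ) -> Tuple[Envelope, List[Header]]:
    """Re-post a message on behalf of a Mediator.

    The envelope recipients are replaced, a Received field is added at
    the top of the header block, and every original field is retained.
    MailFrom is kept here; some Mediators, such as mailing lists,
    choose a new value instead.
    """
    new_env = replace(envelope, rcpt_to=tuple(new_rcpts))
    new_headers = [("Received", trace)] + list(headers)
    return new_env, new_headers


# Example; addresses and the trace string are illustrative.
env = Envelope("author@example.org", ("original@example.net",))
headers = [("From", "Author <author@example.org>"), ("Subject", "Status")]
new_env, new_headers = repost(
    env, headers, ["alternate@example.com"],
    "from mediator.example.net; Thu, 7 Aug 2008 12:00:00 +0000")
```

The input envelope and header list are left unmodified, so the original message remains available for comparison with the re-posted one.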
Examples of MUA message creation NOT performed by Mediators include --

New message that forwards an existing message: This action rather curiously provides a basic template for a class of Mediators. However, in its typical occurrence it is not itself an example of a Mediator. The new message is viewed as being from the Actor doing the forwarding, rather than being from the original Author.

   A new message encapsulates the original message and is seen as strictly "from" the Mediator. The Mediator might add commentary and certainly has the opportunity to modify the original message content. The forwarded message is therefore independent of the original message exchange and creates a new message dialogue. However, the final Recipient sees the contained message as from the original Author.

Reply: When a Recipient formulates a response back to the original message's author, the new message is not typically viewed as being a "forwarding" of the original. Its focus is the new content, although it might contain all or part of the material in the original message. Therefore the earlier material is merely contextual and secondary. This includes automated replies, such as for vacation notices, as discussed in Section 4.2.1.

Annotation: The integrity of the original message is usually preserved, but one or more comments about the message are added in a manner that distinguishes commentary from original text. The primary purpose of the new message is to provide commentary from a new Author, similar to a Reply.

The remainder of this section describes common examples of Mediators.

5.1. Aliasing

One function of an MDA is to determine the "internal" location of a mailbox, in order to perform delivery. Aliasing is a simple re-addressing facility that provides one or more new Internet Mail addresses, rather than a single, internal one.
Instead the message continues through the transfer service, for delivery to one or more alternate addresses. Although typically implemented as part of an MDA, this facility is strictly a Recipient user function. It resubmits the message, although all handling information other than the envelope recipient (RFC2821.RcptTo) address is retained. In particular, the Return address (RFC2821.MailFrom) is unchanged.

What is most distinctive about this forwarding mechanism is how closely it compares to normal MTA store-and-forward Relaying. Its only interesting difference is that it changes the RFC2821.RcptTo value. Having the change be this small makes it easy to view aliasing as a part of the lower-level mail relaying activity. However, the small change has a large semantic impact: The designated recipient has chosen a new recipient.

NOTE: When the replacement list of addresses has more than one address, the alias is increasingly likely to have delivery problems. However, problem reports will go to the original author, rather than the administrator of the alias entry.

An MDA that is re-posting a message to an alias typically changes only envelope information:

RFC2822.To/.CC/.BCC: Set by - Author

   These retain their original addresses.

RFC2821.MailFrom: Set by - Author

   The benefit of retaining the original MailFrom value is to ensure that an origination-side Actor knows that there has been a delivery problem. On the other hand, the responsibility for handling problems, when transiting from the original recipient mailbox to the alias mailbox, usually lies with that original Recipient, since the Alias mechanism is strictly under the Recipient's control.

5.2. ReSending

Also called Re-Directing, ReSending differs from Forwarding by virtue of having the Mediator "splice" a message's addressing information, to connect the Author of the original message and the Recipient of the new message.
This permits them to have direct exchange, using their normal MUA Reply functions, while also recording full reference information about the recipient who served as a Mediator. Hence the new Recipient sees the message as being From: the original Author, even if the Mediator adds commentary.

Identities specified in a resent message include:

RFC2822.From: Set by - original Author

   Names and email addresses for the original author(s) of the message content are retained. The free-form (display-name) portion of the address might be modified to provide informal reference to the Actor responsible for the redirection.

RFC2822.Reply-To: Set by - original Author

   If this field is present in the original message, it is retained in the Resent message.

RFC2822.Sender: Set by - Author Originator or Mediator Originator

RFC2822.To/.CC/.BCC: Set by - original Author

   These specify the original message Recipients.

RFC2822.Resent-From: Set by - Mediator Author

   The address of the original Recipient who is redirecting the message. Otherwise the same rules apply for the Resent-From: field as for an original RFC2822.From field.

RFC2822.Resent-Sender: Set by - Mediator Originator

   The address of the Actor responsible for re-submitting the message. As with RFC2822.Sender, this field can be omitted when it would merely contain the same address as RFC2822.Resent-From.

RFC2822.Resent-To/-CC/-BCC: Set by - Mediator Author

   The addresses of the new Recipients who will now be able to reply to the original author.

RFC2821.MailFrom: Set by - Mediator Originator

   The Actor responsible for re-submission (RFC2822.Resent-Sender) is also responsible for specifying the new MailFrom address.

5.3. Mailing Lists

Mailing lists have explicit email addresses and they re-post messages to a list of subscribed members.
The Mailing List Actor performs a task that can be viewed as an elaboration of the Re-Director role. In addition to sending the new message to a potentially large number of new Recipients, the Mediator can modify content, such as deleting attachments, converting the format, and adding list-specific comments. In addition, archiving list messages is common. Still, the message retains characteristics of being "from" the original Author.

Identities relevant to a mailing list processor, when submitting a message, include:

RFC2919.List-Id: Set by - Mediator Author

RFC2369.List-*: Set by - Mediator Author

RFC2822.From: Set by - original Author

   Names and email addresses for the original author(s) of the message content are specified -- or, rather, retained.

RFC2822.Reply-To: Set by - original Author or Mediator Author

   Although problematic, it is common for a mailing list to assign its own addresses to the Reply-To: header field of messages it (re-)posts. This is intended to ensure that recipient replies go to all list members, rather than only to the original Author. As a User actor, a mailing list is, effectively, the author of the new message and can legitimately set the Reply-To: value as it sees fit. As a Mediator attempting to represent the message on behalf of its original Author, creating or modifying a Reply-To: field can be viewed as violating that author's intent. Modifying the field to contain the list address can lead to replies that are meant only for the original Author instead going to the full list. When the list Actor does not set the field, a reply meant for the entire list can instead go only to the original Author. At best, either choice is a matter of group communication "culture".

RFC2822.Sender: Set by - Author Originator or Mediator Originator

   This will usually specify the address of the Actor responsible for mailing list operations.
   However, some mailing lists operate in a manner very similar to a simple MTA Relay, so that they preserve as much of the original handling information as possible, including the original RFC2822.Sender field. (Note that this makes the mailing list essentially the same as an Alias, with the possible difference in number of new addressees.)

RFC2822.To/.CC: Set by - original Author

   These usually contain the original list of Recipient addresses.

RFC2821.MailFrom: Set by - Mediator Originator

   Because a Mailing List has complete freedom with the content of a message, it is responsible for that content; that is, it is a true Author. As such, the Return Address is specified by the Mailing List. While it is plausible for the Mailing List to decide to re-use the Return Address employed by the original Originator, notifications sent to that address after a message has been processed by a Mailing List Mediator could be problematic.

5.4. Gateways

A Gateway performs the basic routing and transfer work of message relaying, but it also is permitted to make any content, structure, address, or attribute modifications needed to send the message into a messaging environment that operates according to different standards or potentially incompatible policies. When a Gateway connects two differing messaging services, its role is easy to identify and understand. When it connects environments that have technical similarity, but can have significant administrative differences, it is easy to think that a Gateway is merely an MTA.

The critical distinction between an MTA and a Gateway is that the latter can make substantive changes to a message, in order to map between the standards of two different messaging services. In virtually all cases, this mapping process results in some degree of semantic loss. The challenge of Gateway design is to minimize this loss.
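The notion of semantic loss can be made concrete with a small Python sketch. It assumes a hypothetical target messaging service that understands only a fixed set of attributes; the `TARGET_FIELDS` mapping, the target attribute names and the `gateway_map` function are illustrative assumptions and are not taken from any standard. The function translates what it can and reports what could not be carried across, which is the loss that a Gateway design tries to minimize.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from Internet Mail header fields to the
# attributes of an imagined target messaging service.
TARGET_FIELDS: Dict[str, str] = {
    "From": "sender",
    "To": "recipients",
    "Subject": "title",
}


def gateway_map(headers: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Map header fields into the target service's attributes.

    Returns the mapped attributes together with the names of the
    fields that have no equivalent in the target service.
    """
    mapped: Dict[str, str] = {}
    lost: List[str] = []
    for name, value in headers.items():
        target = TARGET_FIELDS.get(name)
        if target is None:
            lost.append(name)      # no equivalent: semantic loss
        else:
            mapped[target] = value
    return mapped, lost


# Example; the message fields are illustrative.
mapped, lost = gateway_map({
    "From": "author@example.org",
    "To": "recipient@example.net",
    "Subject": "Status",
    "Reply-To": "team@example.org",
})
```

In this example the Reply-To field has no counterpart in the imagined target service, so a reply from the gatewayed Recipient could not honor it.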
Standardized gateways to Internet Mail are: Facsimile [RFC4143], Voicemail [RFC3801] and MMS [RFC4356].

A Gateway can set any identity field available to a regular MUA. Identities typically relevant to Gateways include:

RFC2822.From: Set by - original Author

   Names and email addresses for the original author(s) of the message content are retained. As for all original addressing information in the message, the Gateway can translate addresses in whatever way will allow them to continue to be useful in the target environment.

RFC2822.Reply-To: Set by - original Author

   The Gateway SHOULD retain this information, if it is originally present. The ability to perform a successful reply by a Gatewayed Recipient is a typical test of Gateway functionality.

RFC2822.Sender: Set by - Author Originator or Mediator Originator

   This can retain the original value or can be set to a new address.

RFC2822.To/.CC/.BCC: Set by - original Recipient

   These usually retain their original addresses.

RFC2821.MailFrom: Set by - Author Originator or Mediator Originator

   The Actor responsible for gatewaying the message can choose to specify a new address to receive handling notices.

5.5. Boundary Filter

Organizations can enforce security boundaries by subjecting messages to analysis, for conformance with the organization's safety policies. An example is detection of content classed as spam or a virus. A Filter might alter the content, to render it safe, such as by removing content deemed unacceptable. Typically these actions will result in the addition of content that records the actions.

6. References

6.1. Normative

[RFC0791] Postel, J., "Internet Protocol", RFC 791, September 1981.

[RFC1034] Mockapetris, P., "Domain names - concepts and facilities", STD 13, RFC 1034, November 1987.
[RFC1035] Mockapetris, P., "Domain names - implementation and specification", STD 13, RFC 1035, November 1987.

[RFC1939] Myers, J. and M. Rose, "Post Office Protocol - Version 3", STD 53, RFC 1939, May 1996.

[RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, November 1996.

[RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types", RFC 2046, November 1996.

[RFC2047] Moore, K., "MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text", RFC 2047, November 1996.

[RFC2049] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part Five: Conformance Criteria and Examples", RFC 2049, November 1996.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2181] Elz, R. and R. Bush, "Clarifications to the DNS Specification", RFC 2181, July 1997.

[RFC2369] Neufeld, G. and J. Baer, "The Use of URLs as Meta-Syntax for Core Mail List Commands and their Transport through Message Header Fields", RFC 2369, July 1998.

[RFC2645] "On-Demand Mail Relay (ODMR) SMTP with Dynamic IP Addresses", RFC 2645, August 1999.

[RFC2821] Klensin, J., "Simple Mail Transfer Protocol", RFC 2821, April 2001.

[RFC2822] Resnick, P., "Internet Message Format", RFC 2822, April 2001.

[RFC2919] Chandhok, R. and G. Wenger, "List-Id: A Structured Field and Namespace for the Identification of Mailing Lists", RFC 2919, March 2001.

[RFC3192] Allocchio, C., "Minimal FAX address format in Internet Mail", RFC 3192, October 2001.

[RFC3297] Klyne, G., Iwazaki, R., and D. Crocker, "Content Negotiation for Messaging Services based on Email", RFC 3297, July 2002.

[RFC3458] Burger, E., Candell, E., Eliot, C., and G.
Klyne, "Message Context for Internet Mail", RFC 3458, January 2003.

[RFC3461] Moore, K., "Simple Mail Transfer Protocol (SMTP) Service Extension for Delivery Status Notifications (DSNs)", RFC 3461, January 2003.

[RFC3501] Crispin, M., "Internet Message Access Protocol - Version 4rev1", RFC 3501, March 2003.

[RFC3798] Hansen, T. and G. Vaudreuil, "Message Disposition Notification", RFC 3798, May 2004.

[RFC3834] Moore, K., "Recommendations for Automatic Responses to Electronic Mail", RFC 3834, August 2004.

[RFC3864] Klyne, G., Nottingham, M., and J. Mogul, "Registration Procedures for Message Header Fields", RFC 3864, September 2004.

[RFC4021] Klyne, G. and J. Palme, "Registration of Mail and MIME Header Fields", RFC 4021, March 2005.

[RFC4288] Freed, N., Klensin, J., and J. Postel, "Media Type Specifications and Registration Procedures", BCP 13, RFC 4288, December 2005.

[RFC4289] Freed, N., Klensin, J., and J. Postel, "Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures", BCP 13, RFC 4289, December 2005.

[RFC4409] Gellens, R. and J. Klensin, "Message Submission for Mail", RFC 4409, April 2006.

[RFC4550] Maes, S., , S., and Isode Ltd., "Internet Email to Support Diverse Service Environments (Lemonade) Profile", RFC 4550, June 2006.

[RFC5228] Showalter, T., "Sieve: A Mail Filtering Language", RFC 5228.

[RFC5248] Hansen, T. and J. Klensin, "A Registry for SMTP Enhanced Mail System Status Codes", RFC 5248, June 2008.

6.2. Informative

[MAIL-I18N] Internet Mail Consortium, "Using International Characters in Internet Mail", IMC IMCR-010, August 1998.

[RFC0821] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC 821, August 1982.

[RFC0822] Crocker, D., "Standard for the format of ARPA Internet text messages", STD 11, RFC 822, August 1982.

[RFC1733] Crispin, M., "Distributed Electronic Models in IMAP4", RFC 1733, December 1994.
[RFC1767] Crocker, D., "MIME Encapsulation of EDI Objects", RFC 1767, March 1995.

[RFC1985] De Winter, J., "SMTP Service Extension for Remote Message Queue Starting", RFC 1985, August 1996.

[RFC2033] Myers, J., "Local Mail Transfer Protocol", RFC 2033, October 1996.

[RFC2142] Crocker, D., "Mailbox Names for Common services, Roles and Functions", RFC 2142, May 1997.

[RFC2442] "The Batch SMTP Media Type", RFC 2442, November 1998.

[RFC2480] Freed, N., "Gateways and MIME Security Multiparts", RFC 2480, January 1999.

[RFC2505] Lindberg, G., "Anti-Spam Recommendations for SMTP MTAs", RFC 2505, February 1999.

[RFC2554] Myers, J., "SMTP Service Extension for Authentication", RFC 2554, March 1999.

[RFC3207] Hoffman, P., "SMTP Service Extension for Secure SMTP over Transport Layer Security", RFC 3207, February 2002.

[RFC3685] Daboo, C., "SIEVE Email Filtering: Spamtest and VirusTest Extensions", RFC 3685, February 2004.

[RFC3801] Vaudreuil, G. and G. Parsons, "Voice Profile for Internet Mail - version 2 (VPIMv2)", RFC 3801, June 2004.

[RFC3851] Ramsdell, B., Ed., "Secure/Multipurpose Internet Mail Extensions (S/MIME) Version 3.1 Message Specification", RFC 3851, July 2004.

[RFC3885] Allman, E. and T. Hansen, "SMTP Service Extension for Message Tracking", RFC 3885, September 2004.

[RFC4142] Crocker, D. and G. Klyne, "Full-mode Fax Profile for Internet Mail: FFPIM", RFC 4142, December 2005.

[RFC4143] Toyoda, K. and D. Crocker, "Facsimile Using Internet Mail (IFAX) Service of ENUM", RFC 4143, November 2005.

[RFC4356] Gellens, R., "Mapping Between the Multimedia Messaging Service (MMS) and Internet Mail", RFC 4356, January 2006.

[RFC4880] Callas, J., Donnerhacke, L., Finney, H., Shaw, D., and R. Thayer, "OpenPGP Message Format", RFC 4880, November 2007.

[RFC5068] Hutzler, C., Crocker, D., Resnick, P., Sanderson, R., and E.
Allman, "Email Submission Operations: Access and Accountability Requirements", BCP 134, RFC 5068, November 2007.

[Tussle] Clark, D., Wroclawski, J., Sollins, K., and R. Braden, "Tussle in Cyberspace: Defining Tomorrow's Internet", ACM SIGCOMM, 2002.

Appendix A. Acknowledgements

This work derives from a section in an early version of [RFC5068]. Discussion of the Originator actor role was greatly clarified during discussions in the IETF's Marid working group.

Graham Klyne, Pete Resnick and Steve Atkins provided thoughtful insight on the framework and details of the original drafts, as did Chris Newman for the final versions, while also serving as cognizant Area Director for the document. Tony Hansen served as document shepherd, through the IETF process.

Later reviews and suggestions were provided by Eric Allman, Nathaniel Borenstein, Ed Bradford, Cyrus Daboo, Frank Ellermann, Tony Finch, Ned Freed, Eric Hall, Willemien Hoogendoorn, Brad Knowles, John Leslie, Bruce Valdis Kletnieks, Mark E. Mallett, David MacQuigg, Alexey Melnikov, der Mouse, S. Moonesamy, Daryl Odnert, Rahmat M. Samik-Ibrahim, Marshall Rose, Hector Santos, Jochen Topf, Greg Vaudreuil.