W3C NOTE 13-Dec-96
This document is a NOTE made available by the W3 Consortium for discussion only. This indicates no endorsement of its content, nor that the Consortium has, is, or will be allocating any resources to the issues addressed by the NOTE.
This document was originally published as an Internet Draft and is available online as draft-lassila-http-edit-dist-00.txt.
This document contains examples of distributed editing conducted through HTTP. These scenarios have been developed by the Distributed Authoring and Versioning Group in the course of specifying requirements for distributed editing, and aim to demonstrate the concepts of distributed editing. The document presents a logical hierarchy of scenarios, separating actual editing actions from document management.
The purpose of this document is to catalog scenarios of distributed editing and authoring as well as versioning, as related to the work of the Distributed Authoring and Versioning group, and particularly to the Internet Draft document "Requirements on HTTP for Distributed Content Editing" [1]. These scenarios can serve as examples of how HTTP extensions for distributed authoring and versioning might be used, and can be used as a basis for discussion of various requirements and protocol features.
The scenarios in this document have been divided into sections addressing different aspects of the distributed authoring area: the first section focuses on the manipulation of the contents of resources (documents), the second section focuses on the management of the documents themselves and their relationships to other documents and the URL space. A third section has also been included for scenarios not clearly belonging to either of the first two sections.
The individuals "Jane" and "Joe", used in the scenarios, can in most scenarios (if not all of them) be understood as either real people or as some types of software agents.
This section contains scenarios where contents of resources are changed through the use of HTTP (as opposed to through local file system operations).
Scenarios in this section illustrate opening and closing of documents and retrieving their contents for editing. They are related to (and partially overlap) scenarios in section 2.2.
Scenario A: Jane requests that document D be opened. D is available in English, Finnish and Swedish, each using different word processors and each with different revision histories.
Scenario B: Jane requests that document D be opened. D contains links to headers, footers and graphical material. Variation 1: Jane's client side environment is a browser. Variation 2: Jane's client side environment is an editor.
Notes:
Scenario A: Jane "checks out" a document D with an "intent to edit" (i.e., a high probability that Jane will delete, add-to or change document content or document meta-data). Variation 1: Jane wants exclusive editing rights ("write lock") to D. Jane has no objection to letting others view as it is edited. Variation 2: Jane wants exclusive editing rights to D and objects to others viewing D as it is edited ("read/write lock"). Variation 3: Jane wants to edit a local copy of D with the intent to merge and resolve conflicts later at "check in". Note that this case accommodates editing "off-line" (disconnected mode). Variation 4: Jane is willing to engage in "free-for-all" editing, but wishes to make it known to other potential editors that she is entering/leaving the melee.
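The first two variations can be pictured as lock modes recorded by the server. The following is a minimal in-memory sketch of that idea, not a protocol proposal: the class `LockTable`, its method names, and the mode strings "write" and "read/write" are assumptions introduced here for illustration.

```python
class LockTable:
    """Toy model of the lock variations in Scenario A (hypothetical names)."""

    def __init__(self):
        # Maps document name -> (owner, mode); mode is "write" or "read/write".
        self._locks = {}

    def check_out(self, doc, owner, mode):
        """Grant a lock unless another user already holds one on doc."""
        holder = self._locks.get(doc)
        if holder is not None and holder[0] != owner:
            return False
        self._locks[doc] = (owner, mode)
        return True

    def check_in(self, doc, owner):
        """Release the lock if owner holds it."""
        if self._locks.get(doc, (None, None))[0] == owner:
            del self._locks[doc]

    def may_read(self, doc, user):
        """A "read/write" lock hides the document from everyone but its owner."""
        holder = self._locks.get(doc)
        return holder is None or holder[0] == user or holder[1] == "write"

    def may_write(self, doc, user):
        """Either lock mode reserves writing for the owner."""
        holder = self._locks.get(doc)
        return holder is None or holder[0] == user


locks = LockTable()
locks.check_out("D", "jane", "write")   # Variation 1: exclusive editing rights
print(locks.may_read("D", "joe"))       # may Joe view D while Jane edits?
print(locks.may_write("D", "joe"))      # may Joe change D while Jane edits?
```

Variations 3 and 4 need no lock at all in this model: Variation 3 defers conflict handling to check-in, and Variation 4 only announces that an editor is present.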
Scenarios in this section are related to retrieving the contents of a document in various formats for editing. They are related to (and partially overlap) scenarios in the previous section.
Scenario A: Jane, the maintainer of a web page, needs to update its HTML source. There are no other variants to this page, such as translations into other languages. She is working with a distributed authoring tool, DistEdit. She loads the HTML source into DistEdit via HTTP. She then performs some edits to the HTML source. The HTML source is then written back to its original URL using HTTP. The distributed editing session is ended.
Relevant requirements (see [1]) and/or protocol features: Source Retrieval, HTTP PUT, Partial Write.
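The load and save steps of this scenario can be sketched as two HTTP requests. The example below only constructs the requests using Python's standard `urllib.request` module; nothing is sent over the network. The URL is hypothetical, and the sketch assumes that a plain GET returns the editable source and that the whole entity is written back (Partial Write is not illustrated).

```python
import urllib.request


def build_get(url):
    """Request that retrieves the current HTML source of the page."""
    return urllib.request.Request(url, method="GET")


def build_put(url, html_source):
    """Request that writes the edited source back to the same URL."""
    body = html_source.encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "text/html; charset=utf-8"},
    )


# Hypothetical URL; a request is only sent if passed to urllib.request.urlopen().
page_url = "http://www.example.org/reports/index.html"
load = build_get(page_url)
save = build_put(page_url, "<html><body>Edited text</body></html>")
print(load.get_method(), load.full_url)
print(save.get_method(), save.full_url, len(save.data), "bytes")
```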
Scenario A: Jane, who is fluent in Finnish, needs to update the HTML source of the Finnish language variant of a web page which has English, Finnish, and Swedish language variants. She is working with a distributed authoring tool, DistEdit. She loads the Finnish language HTML source into DistEdit using HTTP, and makes some corrections and modifications. She then writes the HTML source back to the original URL using HTTP. The distributed editing session is ended.
Relevant requirements and/or protocol features: Source Retrieval, HTTP PUT, Partial Write.
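One way to picture the selection of the Finnish variant is through the standard HTTP headers Accept-Language (on retrieval) and Content-Language (on the write). The sketch below only builds the requests; the URL and function names are hypothetical, and whether a given server uses Content-Language to decide which variant a PUT replaces is an assumption, not something this document specifies.

```python
import urllib.request


def build_variant_get(url, language):
    """Request the source of one language variant of the page."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept-Language": language}
    )


def build_variant_put(url, html_source, language):
    """Write the edited source back, labeled with its language."""
    return urllib.request.Request(
        url,
        data=html_source.encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "text/html; charset=utf-8",
            "Content-Language": language,
        },
    )


# Hypothetical URL; "fi" is the language tag for Finnish.
url = "http://www.example.org/welcome.html"
load = build_variant_get(url, "fi")
save = build_variant_put(url, "<html><body>Tervetuloa</body></html>", "fi")
print(load.header_items())
print(save.header_items())
```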
Scenario A: Jane needs to update the HTML source of a web page. The HTML source includes a server side include (SSI) directive which instructs the HTTP server to insert the current date into the document, and is written in English. There are no other variants to this page, such as translations into other languages. Jane is working with a distributed authoring tool, DistEdit. She loads the HTML source (including the source of the server side include directive) into DistEdit via HTTP. She then performs some edits to the HTML source. The HTML source is then written back to its original URL using HTTP. The distributed editing session is ended.
Relevant requirements and/or protocol features: Source Retrieval, HTTP PUT, Partial Write.
Scenario A: Jane needs to update the source of a web page, stored in the native format of the HTTP-aware word processor DistProc. The HTTP server containing this resource has extensions provided by the vendor of DistProc which automatically convert the DistProc native files into HTML, which is served whenever the web page is accessed from its URL, U. The web page does not include any graphic content, and is written in English. She loads the web page source into DistProc from URL U using HTTP, and begins to edit this DistProc native format source file. After making some modifications, she saves the source file back to the original URL, U, using HTTP. She then checks the HTML source by retrieving URL U using her favorite web browser. Since it looks fine, she ends the distributed editing session.
Relevant requirements and/or protocol features: Source Retrieval, HTTP PUT.
Scenario A: A certain Web site is maintained by two people, both of whom make changes on an ad hoc basis. As is frequently the case, there are a few documents that are hot points of congestion, even between these two people. Both people (we'll call them "Jane" and "Joe") have a fancy, version-aware Web authoring tool that interacts with their Web server.
Joe downloads a document from the Web site, and decides that it needs work. He clicks on the "edit" button from his browser/authoring tool, and the tool reports two things: first, that the Web server has acknowledged his edit operation (giving him assurance that a subsequent PUT will not be a complete surprise to the server); second, that the document he will edit is identical to that which he viewed. This may not always be the case: sometimes the document viewed by users is not the true, editable source of the document. But in this case it is. Joe proceeds to revamp the document.
Jane meanwhile is viewing the same document and realizes that in the document the word "fuchsia" has a typo. Jane also clicks the "edit" button, but the authoring tool has a lengthier report for her: in addition to what Joe was told, Jane is told that Joe is also working on the same document. Jane calls Joe and they reach an agreement: Jane will make her fix now (because the error is embarrassing) and Joe will make sure this alteration makes it into his revision.
Jane makes her changes and clicks the "save" button. Her authoring tool prompts her for a brief description of her changes, and then the server informs Jane that her change has resulted in a new, named revision of the document, and that name is displayed.
Joe forgets what he was doing, and weeks later (while working on something else) clicks the "what am I working on" button. In the long list of documents that Joe has started to change is the document we've been discussing, and Joe decides it is time to finish it off. He makes his final edits, and clicks the "save" button. Joe, however, gets a message indicating that what he edited is no longer the latest version of the document, and Joe clicks the "merge" button. The authoring tool has the latest and greatest merge mechanisms, and in the process of resolving Jane's work with his he realizes that Jane did more than just fix the misspelling she said she would. That doesn't matter, because the merge mechanism uses actual differences, not verbally stated intentions.
Joe again clicks the "save" button, and this time he is prompted for a description and his new version of the document is saved.
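The server behavior that Joe runs into (a save is refused when it is not based on the latest revision) can be sketched as follows. This is a toy model under stated assumptions: `VersionedDocument`, `StaleRevision`, and the use of integer revision numbers are illustrative names, and the "merge" shown is a hand-written stand-in for whatever merge mechanism an authoring tool would actually provide.

```python
class StaleRevision(Exception):
    """Raised when a save is based on a revision that is no longer the latest."""


class VersionedDocument:
    def __init__(self, content):
        # Each entry is (content, description); the list index is the revision number.
        self.revisions = [(content, "initial revision")]

    def latest(self):
        """Return (revision number, content) of the newest revision."""
        number = len(self.revisions) - 1
        return number, self.revisions[number][0]

    def save(self, base, content, description):
        """Store a new revision only if base is still the latest one."""
        if base != len(self.revisions) - 1:
            raise StaleRevision(f"revision {base} is no longer the latest")
        self.revisions.append((content, description))
        return len(self.revisions) - 1


doc = VersionedDocument("The sky is fuschia.")
joe_base, joe_text = doc.latest()      # Joe clicks "edit"
jane_base, jane_text = doc.latest()    # Jane clicks "edit"

# Jane saves first; her change becomes a new revision.
doc.save(jane_base, jane_text.replace("fuschia", "fuchsia"), "fix typo")

# Joe's save is refused because his base revision is out of date.
joe_text = joe_text.replace("sky", "evening sky")
try:
    doc.save(joe_base, joe_text, "revamp")
except StaleRevision:
    # Stand-in for "merge": re-apply Joe's change on top of the latest revision.
    base, current = doc.latest()
    doc.save(base, current.replace("sky", "evening sky"), "revamp, merged")

print(doc.latest())
```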
[Continued in 3.6.1.]
Scenarios in this section describe remote management of the properties of resources, remote management of URL hierarchies (these could be called "directories"), as well as visualization of the relationships among graphs.
Scenarios in this section illustrate management of document containers (e.g., "folders").
Scenario A: Jane requests that repository R (a web, a DMS store, etc.) be opened. The server response to Jane establishes context for R; for example, a list of R attributes and corresponding attribute values, followed by a list cataloging the objects immediately subordinate to R (folders, files, pages, whatever). Variation 1: Jane requests by location (URL). Variation 2: Jane requests by identity (URI).
Scenario B: Jane requests that file F (containing document D) be opened. The server response to Jane establishes context for F; for example a "reference handle" for accessing the attributes and content of F.
Scenario C: Jane opens folder S. In response, Jane receives context for S. Just after Jane's OPEN request Joe initiates a MOVE of S.
Notes:
Scenario A: Jane, having examined the list of container attributes for folder S, uses an editing tool in order to change the value V1 of the attribute A to V2. The new attribute value has local instantiation at the remote host(s) which are providing an environment for Jane's editing tool. The (server side) object S itself does not yet reflect V2 at A. Through some action, either explicitly (such as requesting a "close" transaction with S) or implicitly (such as ending the edit session) Jane asks for closure with S. Variation 1: A second party Joe has meanwhile requested to PUT a value to A. Variation 2: At the time of Jane's CLOSE request, Jane is disconnected from the network.
Scenario B: Jane has opened folder S and requested that all objects in S not accessed within the last six months be deleted. Before the deletion is complete, Jane requests closure with the repository R containing S.
Notes:
Scenarios in this section illustrate various ways of creating new documents.
Scenario A: Jane is working with distributed authoring tool DistEdit on a new HTML page which does not contain any embedded graphical content. She has finished her edits, and saves the HTML resource to a web server using the HTTP protocol. She is prompted for a URL for the new document; the page is then written to this URL using the HTTP "PUT" method.
Relevant requirements and/or protocol features: HTTP PUT.
Scenario A: Jane is working with distributed authoring tool DistEdit on a new HTML page which does not contain any embedded graphical content. She has finished her edits, and wishes to save the HTML resource to a web server using the HTTP protocol, but does not know the exact name of the level of the URL hierarchy where she wants the document to be stored. She invokes the "Save As..." feature of DistEdit, which includes a hierarchy level viewer, a list of all the entities and their MIME types at a specific level of the hierarchy, along with the ability to go up or down a level of the hierarchy by clicking on either ".." to go up, or the name of a hierarchy level to go down. She moves up and down within the URL hierarchy using the facilities of the hierarchy level viewer, finally finding a good hierarchy level for the resource. She then enters a name for the HTML resource, and hits the "Save" button. The DistEdit tool now writes the HTML page to the URL created by combining the hierarchy level selected using the hierarchy level viewer, and the name just entered by her. The web page is written to the URL using the HTTP "PUT" method.
Notes:
Relevant requirements and/or protocol features: List URL Hierarchy Level, HTTP PUT.
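The navigation performed in the hierarchy level viewer, and the final combination of hierarchy level and resource name, correspond to relative URL resolution. A minimal sketch using `urljoin` from Python's standard library follows; the helper names and the example URL are hypothetical, and names are assumed to be plain path segments without special characters.

```python
from urllib.parse import urljoin


def go_up(level_url):
    """Return the URL of the parent hierarchy level (the ".." entry)."""
    return urljoin(level_url, "..")


def go_down(level_url, name):
    """Return the URL of the child hierarchy level called name."""
    return urljoin(level_url, name + "/")


def resource_url(level_url, name):
    """Combine the selected hierarchy level with the name entered by the user."""
    return urljoin(level_url, name)


level = "http://www.example.org/docs/drafts/"
level = go_up(level)                     # click ".."
level = go_down(level, "published")      # click "published"
target = resource_url(level, "scenarios.html")
print(target)                            # URL that the PUT would be sent to
```

Each hierarchy level URL ends in a slash here; without the trailing slash, `urljoin` would replace the last path segment instead of descending below it.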
Scenario A: Jane is working with distributed authoring tool DistEdit on a new HTML page which contains some associated embedded graphical content. She finishes her edits, and wishes to save the HTML resource to a web server using the HTTP protocol, as well as save the graphical images (collectively we will call this publishing). She invokes the publishing feature of DistEdit, which includes the hierarchy level viewer (as described in the previous scenario). She finds a level of the hierarchy using the hierarchy viewer, but since this is a new web, she decides to create a new level of the hierarchy just to contain this web. Pressing the "Create New Hierarchy" button causes the author to be queried for the name of the new hierarchy level. Once the name is entered, DistEdit informs the HTTP server that a new hierarchy level should be added below the level currently displayed in the hierarchy level viewer. If the author has the correct access permissions to create a new hierarchy, the new hierarchy level is created. The web author then presses the "Publish" button, and her web of HTML and graphic entities is written to the HTTP server.
Notes:
Relevant requirements and/or protocol features: List URL Hierarchy Level, Make URL Hierarchy Level, HTTP PUT.
Scenario A: In order to understand the link structure and resource inclusion relationships at a hierarchy level, a web maintainer chooses the "Graph View" option of their distributed editing tool DistEdit. DistEdit queries the web maintainer for which level of the hierarchy to display using a graph visualization, and then uses the HTTP protocol to read information about that level of the hierarchy. DistEdit uses this information to display a graphical visualization of the hierarchy level, including an icon for each resource, solid lines between the icons representing links, and dashed lines representing inclusion (for example, images loaded using the IMG tag). Entities the web maintainer has read and write access to are displayed in green, those which they have read access to are in white, and those which they have no access to are in red. To create the graph visualization, DistEdit must, using HTTP, get a listing of all the entities at a level of the hierarchy, and their access control permissions.
Notes:
Relevant requirements and/or protocol features: List URL Hierarchy Level.
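The coloring rule of the graph view can be stated compactly. The sketch below assumes the hierarchy listing has already been obtained and is available as (name, can_read, can_write) tuples; that representation, the function name, and the sample entries are all hypothetical.

```python
def access_color(can_read, can_write):
    """Map access rights to the display colors used in the scenario."""
    if can_read and can_write:
        return "green"
    if can_read:
        return "white"
    # No access. Write-only access is not covered by the scenario and is
    # treated as "no access" here (an assumption).
    return "red"


# Hypothetical listing of one hierarchy level: (name, can_read, can_write).
listing = [
    ("index.html", True, True),
    ("logo.gif", True, False),
    ("private.html", False, False),
]
colors = {name: access_color(r, w) for name, r, w in listing}
print(colors)
```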
Scenarios in this section illustrate management of document attribute data, and the administration of access rights.
Scenario A: Jane submits a list of document URIs with the request that the "subject/summary" attribute value be returned for each document.
Scenario A: Jane requests that the value of the document attribute "subject/summary" for document D be modified to correct an error.
Scenario B: Jane is creating a new document on the Web. She sends it to the server, but also wants to set a bunch of attributes that can be used later in searches (author, title, type of document, subject, organization, etc.). Sometimes she may also want to create catalog entries for documents that are not available in electronic form. There will be no content for these documents, just attributes.
Relevant requirements and/or protocol features: Attributes.
Scenario A (Realistic): A sales manager at a company with an organization-wide intranet is working with an intranet-enabled spreadsheet program, DistCalc. After the sales figures for the previous month (which are below projections) have been entered, a graph of the sales figures is generated as a JPEG image, and then saved to the departmental HTTP server using the HTTP "PUT" method. Realizing that it might be best to limit access to this information, the sales manager brings up the graph image in a web browser. After the menu option "Modify Access Permissions" is selected, the browser displays the access control page for the graph image resource. The sales manager uses the (server-specific) facilities on this page to modify the sales chart's access control rights so it is password protected.
Scenario B (Ideal): In the ideal case, the DistCalc program would display a dialog box asking the user what access rights the graph resource should have before the graph is saved to the departmental HTTP server.
Notes:
No matching requirements or protocol features.
Scenarios in this section illustrate typical "housekeeping" involved with managing documents, such as renaming, moving them around, and deleting them.
Scenario A: Jane is looking at the list of monthly reports available on the server. She selects one from the list that she wants to use as the basis for a new monthly report. She asks for a copy of this monthly report to be made in the same directory but with a different name. Since she is not intending to work on it now, there is no reason to pull the content to the client.
Relevant requirements and/or protocol features: 14 (Copy).
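A copy that never moves the content to the client amounts to a single request naming a source and a destination. The sketch below is one possible shape for such a request, modeled on the COPY method and Destination header that were later standardized for WebDAV; neither is defined by this document, so both should be read as assumptions. The URLs are hypothetical and the request is only constructed, not sent.

```python
import urllib.request


def build_copy(source_url, destination_url):
    """Ask the server to copy a resource without transferring its content."""
    return urllib.request.Request(
        source_url,
        method="COPY",                         # assumption: server supports COPY
        headers={"Destination": destination_url},
    )


copy = build_copy(
    "http://www.example.org/reports/199607.html",
    "http://www.example.org/reports/199608.html",
)
print(copy.get_method(), copy.full_url)
print(copy.get_header("Destination"))
print(copy.data)   # no request body: the content stays on the server
```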
Scenario A: Jane directs that folder S1 be copied to folder S2. While the copy is in progress, Joe directs that S1 be moved to folder S3.
Scenario B: Jane directs that a container F be copied to a location outside the repository R. Although F contains only a simple text document, the structure of F both as a logical and a physical entity is highly idiosyncratic, being intimately bound to R. Consequently, F cannot be expressed in the external domain. Variation 1: The DMS has export capability (to the external file system) with a granularity that can resolve F. Variation 2: The document contained by F has a native format corresponding to the tool used to generate the document (HTML, WPD, etc.). In this case one could interpret the copy as a transform from F in the DMS domain to the native document format D in the external domain. Variation 3: The container to be copied is folder-like, i.e. a proper container, and the container hierarchy in R is compatible with the external domain container hierarchy. In this case, some kind of copy/transform could be implemented, with the understanding that container attributes might be largely distorted or lost.
Notes:
Scenario A: Jane directs that page P and all subordinate objects be deleted from web W. Pn is subordinate to Pk, which in turn is subordinate to P, and both Pk and Pn are in scope (i.e., in W). It so happens that Pn forward links to Pk. The delete process DEL recursively chains down from P, eventually encountering Pk, and asserts a "read lock" on Pk preparatory to deleting Pk. Since Pk has subordinate links, DEL continues down the chain until it encounters Pn, where it asserts a "read lock" and recursively chains forward to Pk. DEL now requests a "read lock" on Pk for a second time.
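The difficulty in this scenario is that the links form a cycle (Pk to Pn and back to Pk), so a naive recursive delete revisits Pk. One common way to avoid this is to remember which pages have already been visited. The following sketch shows that technique on the page names from the scenario; the function name and the dictionary representation of the link structure are assumptions made for illustration.

```python
def collect_for_delete(start, links):
    """Return the pages reachable from start, each listed once, children first."""
    visited = set()
    order = []

    def visit(page):
        if page in visited:
            return              # already handled by this delete; skip it
        visited.add(page)
        for target in links.get(page, []):
            visit(target)
        order.append(page)      # a page is listed after everything below it

    visit(start)
    return order


# Link structure from the scenario: P -> Pk -> Pn, and Pn links back to Pk.
links = {"P": ["Pk"], "Pk": ["Pn"], "Pn": ["Pk"]}
print(collect_for_delete("P", links))
```

Because each page is visited at most once, the traversal terminates and each page needs to be locked only once.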
Notes:
Scenario A: Jane opens folder S and examines its content. Jane decides to delete all non-folder objects in S, but is unsure if existing folders subordinate to S have valuable content. Jane directs that all non-folder objects in S be deleted.
Scenario B: Jane directs that folder S and all subordinate folders and content be deleted. A document subordinate to S is currently open to Joe.
Scenario C: Jane directs that file F containing document D be deleted. A copy process of D to some other repository is in progress.
Scenario D: Jane directs that all containers and related content subordinate to folder S with content that has not been modified since a given date (supplied by Jane or otherwise provided) be deleted. One or more active documents in the repository reference a common header that is in a file F subordinate to S. F meets the delete criterion.
Scenario E (Undeleting): Container S1 (subordinate to S) and all subordinate containers have been deleted. Jane requests that container S3 and all subordinate objects be undeleted, where S3 was subordinate to S2 which in turn was subordinate to S1. Variation 1: The container structure is "plex", so that S3 was also subordinate to Sk which was not subordinate to S1.
Notes:
Scenario A: Jane notices that after recently editing the content of a document its assigned name no longer makes logical sense. She decides to rename the document. She selects the document from a list of existing documents and is prompted for a new name. Since she does not intend to work on it now, there is no reason to pull the content to the client.
Scenario A: Jane directs that folder S1 be moved to folder S2. In the container hierarchy, S2 is subordinate to S1.
Scenario B: Jane directs that container S1 be moved to container S2. There exists in web W a page P that is external to S which makes (forward) reference (via URI) to one or more objects in S.
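Scenario A above describes a move whose destination lies inside the container being moved. A server or client can detect this case before acting by comparing the two paths. A minimal sketch using `PurePosixPath` from Python's standard library follows; the function name and the example paths under a repository root "/R" are hypothetical.

```python
from pathlib import PurePosixPath


def move_is_valid(source, destination):
    """Reject a move whose destination is the source itself or lies beneath it."""
    src = PurePosixPath(source)
    dst = PurePosixPath(destination)
    return src != dst and src not in dst.parents


print(move_is_valid("/R/S1", "/R/S1/S2"))   # S2 is subordinate to S1
print(move_is_valid("/R/S1", "/R/S2"))      # S2 is a sibling of S1
```

The check works on path names only; a "plex" container structure, where a container has more than one parent, would need a check against the actual container graph.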
Notes:
Scenarios in this section illustrate situations where manipulation of a (logical) collection of related documents is necessary.
Scenario A (Browsing): A net surfer browsing the web loads the introductory page for a book which has been written in HTML and subdivided so that there is a separate resource for each chapter, and many side links to clarifying text and standalone figures. Since the book is of interest, the net surfer would like to print the entire document. Clicking on the "Print" button of their web browser brings up the Print dialog box, which contains an option, "Print multi-resource document," which they select, before pressing the "Start Printing" button.
The browser now begins, in the background, to load all of the chapters of the book along with their explanatory sidebars, sending them one by one, in order, to the printer. When complete, the browser pops up a dialog box stating that the document has been completely printed.
Scenario B (Distributed Authoring): This scenario applies equally well to a distributed authoring situation. If the author of a multi-resource document is using a distributed authoring tool to write the document, it is desirable for them to be able to print the document as a whole, rather than by loading and printing each resource in turn.
Notes:
Scenario A: A professor is working on a new textbook using their favorite intranet-enabled word processor, DistProc. Once the initial draft of this book is complete, they use the "Publish" feature of DistProc to save their book as multiple resources, one per chapter, on a web server. Since the author intends for their students to read the text using web browsers employing a DistProc reader plug-in, the professor has the book on the HTTP server in DistProc native format, preserving layout information.
In order to provide additional browsing structure to the students, the professor uses the feature of DistProc to automatically create links to the table of contents, index, and glossary for the book. To make generating feedback easier, all book chapters automatically have a link to a corrections and feedback page. As the students are reading the text, these automatic links are displayed as special toolbar icons in their browser.
Notes:
These scenarios are intended to illustrate the need for a versioning scheme for distributed editing.
[Continued from 2.2.5.]
Scenario A: Jane and Joe's version-aware web server is fairly simple: normally, it serves up the latest revision of each document, but if instructed it will instead serve up the revisions of documents as listed in a named configuration. In this way, they can make their trivial changes and have them show up immediately, but if they plan to make a heavy-duty overhaul they can save the current set as a working configuration and tell the server to use those until the work is complete (this can all be carried out without the explicit knowledge of Jane and Joe's authoring tool, because the Web server makes itself configurable via Web pages with forms on them).
Joe is about to make a set of minor changes, and to be on the safe side tells the server to save the current configuration as "stable", a name he uses for these occasions. He goes through the various documents, clicking "edit" on any that he thinks are in need of updating.
Once again Joe forgets what he is doing, but a few days later the "what am I working on" button again comes in handy. He realizes that his work is about complete, and makes his final edits.
Joe's changes really are a coherent set that should appear simultaneously, and he doesn't want to find out halfway through saving that Jane has made changes that need merging, so he clicks the "Save All" button. Fortunately, Jane has been busy viewing other parts of the web and hasn't made any changes to their local Web pages, and so Joe is prompted for a description of the changes he has made. Since Joe is saving all the documents at once, a single description applies to all the changes. One by one the new documents are saved, and in the end Joe gets confirmation that all documents are in place. Joe browses the result and is satisfied that their customers are seeing what he has just finished.
Joe goes on vacation.
[Continued in 3.6.2.]
[Continued from 3.6.1.]
Scenario A: Jane gets back to real work and realizes that every document that Joe edited has the same old spelling problem. In a panic she calls Joe but realizes that he is on vacation. Knowing that the errors would harm their image, she decides to undo what Joe has done until he returns and can correct his mistakes.
Jane begins by browsing the revision history of each document, and notes that all the erroneous documents came about at the same time when Joe saved his changes just before vacation.
Jane browses the configuration lists in the version-aware web server and sees that Joe had made a "stable" configuration before his latest work. Jane instructs the server to serve up only documents from the "stable" configuration. As this doesn't involve changing any of Joe's work, it is a quick fix to the pages on their public web server. Jane now browses the documents on their server and is satisfied that they are the precursors to Joe's latest change.
When Joe returns, he fixes his spelling mistakes and then tells the server to resume using the latest documents.
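The server behavior described in these publishing and reverting scenarios (serve the latest revision by default, or the revisions recorded in a named configuration) can be modeled in a few lines. The class and method names below are hypothetical, revisions are represented by list positions, and the page text is invented sample data.

```python
class VersionedSite:
    """Toy model of a server with revision histories and named configurations."""

    def __init__(self):
        self.history = {}          # document name -> list of revisions
        self.configurations = {}   # configuration name -> {document: revision index}
        self.active = None         # None means "serve the latest revision"

    def save(self, name, content):
        """Store content as a new revision of the named document."""
        self.history.setdefault(name, []).append(content)

    def save_configuration(self, label):
        """Record the current latest revision of every document under label."""
        self.configurations[label] = {
            name: len(revisions) - 1 for name, revisions in self.history.items()
        }

    def use_configuration(self, label):
        """Serve from the named configuration, or from the latest if label is None."""
        self.active = label

    def serve(self, name):
        revisions = self.history[name]
        if self.active is None:
            return revisions[-1]
        return revisions[self.configurations[self.active][name]]


site = VersionedSite()
site.save("index.html", "Our fuchsia catalog")
site.save_configuration("stable")                 # Joe, before his changes
site.save("index.html", "Our fuschia catalog")    # Joe's "Save All"
site.use_configuration("stable")                  # Jane reverts the public view
print(site.serve("index.html"))
site.use_configuration(None)                      # Joe resumes the latest documents
print(site.serve("index.html"))
```

Switching configurations changes only which revision is served; no revision is altered or removed, which is why Jane's fix leaves Joe's work intact. A document created after the configuration was saved is not handled by this sketch.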
Versioning (the ability to retain revision histories for documents) is discussed in several scenarios in this document. Section 3.6 (Publishing and Reverting) presents scenarios where versioning is necessary, as well as section 2.1.2 (Checking Documents In and Out) variation 3 and section 2.2.5 (Multiple Simultaneous Editors).
Scenario A: Jane's department keeps its documents organized in hierarchical collections. There is a collection called "Monthly Reports" with subcollections for each month. There is also a collection called "Monthly Business Letters" with subcollections for each month. The monthly reports are used to derive the monthly business letters, so the monthly reports appear in the appropriate "Monthly Business Letters" subcollections as well. When Jane writes her monthly report, she puts it into Monthly Reports/199608 and into Monthly Business Letters/199608. Only one copy of the report should exist on the server, but it appears in both places when users browse or search the collections.
Scenario B: The first time Jane's monthly report gets printed, it gets converted to PostScript, which she wants to store on the server. Now there will be two renditions (versions) of the same document from which she can choose when she retrieves the document in the future. She also saves the printing instructions (duplex, landscape, stapled, etc.) for the document, which she may want to retrieve with the PostScript later.
[1] Jim Whitehead, 1996. "Requirements on HTTP for Distributed Content Editing", Internet Draft (available on-line from the Internet Draft collections as draft-whitehead-http-distreq-00.txt)
[2] Tim Berners-Lee, 1994. "Universal Resource Identifiers in WWW - A Unifying Syntax for the Expression of Names and Addresses of Objects on the Network as used in the World-Wide Web", Request for Comments #1630
[3] --, "Uniform Resource Identifiers", available on-line as http://www.acl.lanl.gov/URI/uri.html
The following people have contributed to this document by submitting sample scenarios and/or by commenting: