Understanding HTTP PUT
I was writing an article explaining how to edit your Web pages with HTTP PUT and Amaya (the W3C's testbed authoring tool). Information on how to configure your server is scarce. This is my attempt at explaining HTTP PUT. Your comments are welcome. I'm pretty sure the lazyweb will fix any of my own misunderstandings.
Not many servers handle HTTP PUT in a simple way, and there is a good reason for that. The HTTP specification defines an abstract model for modifying and managing an information space. The "physical actions" are unrelated to the use of HTTP verbs. HTTP PUT means create or update a resource. As Roy Fielding put it in 2006:
FWIW, PUT does not mean store. I must have repeated that a million times in webdav and related lists. HTTP defines the intended semantics of the communication – the expectations of each party.
The protocol does not define how either side fulfills those expectations, and it makes damn sure it doesn't prevent a server from having absolute authority over its own resources.
What does it mean for an application that is really Web aware? Let's imagine that you have an information space (Web site), called MyIsland, available at http://MyIsland.example.org/. You are working with a photo manipulation software program. You have set the levels, contrast, and saturation of an image. You resized the photo, which is an object.
You are ready to create a resource in the information space. Let's say that your resource will be identified in the information space by http://MyIsland.example.org/somewhere/coconut. You can now send the server a request to create the resource in the information space, with the image as the entity enclosed with the request.
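As a rough illustration of that request, here is a minimal sketch in Python using the standard library. The placeholder bytes and the image/jpeg media type are assumptions made for the example; only the URI comes from the scenario above. The request is built but not sent, since the example host is not expected to accept it.

```python
from urllib.request import Request

# Placeholder bytes standing in for the edited photo (an assumption for this sketch).
image_bytes = b"\xff\xd8\xff\xe0 placeholder, not a real JPEG"

# PUT names the resource to create or update; the image is the enclosed entity.
request = Request(
    "http://MyIsland.example.org/somewhere/coconut",
    data=image_bytes,
    method="PUT",
    headers={"Content-Type": "image/jpeg"},  # assumed media type
)

print(request.get_method(), request.full_url)
# Sending it would be urllib.request.urlopen(request); that step is left out
# because no real server is assumed to be listening at the example host.
```

Note that nothing in the request says where or how the server should keep the image. It only names the resource and carries a representation of it.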
What the server, which is a piece of software, does with the object is not driven by HTTP at all. The server manages the information space (a list of URIs) depending on the requests from clients. The application could store the object on the filesystem or in a database, decompose the image into a series of mathematical equations, send an email telling someone that a new resource has been created, print it on a real printer, or do a zillion other things in addition to creating the resource.
It means that later on, if someone does an HTTP GET on this URI in the information space, they will receive the object back, coming from somewhere. Similarly, HTTP GET is not a way to say "read this file on the filesystem", but just "give me the information designated by this URI". What the server does in the backend is entirely up to the software program. It could read a file on the system, it could regenerate the image on the fly because it had stored all the equations, or it could do any of a zillion other things.
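To make the point concrete, here is a small sketch of a server whose PUT handler never touches the filesystem. It uses Python's standard http.server module, and everything in it (the IslandHandler class, the RESOURCES dictionary, the demo function, the choice of 201 and 204 status codes) is a hypothetical illustration, not the configuration of any particular Web server. It has no authentication and is not meant for production use.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical backend: an in-memory mapping from URI path to (media type, bytes).
# It could just as well be a database, a versioning system, or anything else.
RESOURCES = {}


class IslandHandler(BaseHTTPRequestHandler):
    def do_PUT(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        media_type = self.headers.get("Content-Type", "application/octet-stream")
        created = self.path not in RESOURCES
        RESOURCES[self.path] = (media_type, body)
        # 201 when the resource is new, 204 when an existing one was updated.
        self.send_response(201 if created else 204)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        if self.path not in RESOURCES:
            self.send_error(404)
            return
        media_type, body = RESOURCES[self.path]
        self.send_response(200)
        self.send_header("Content-Type", media_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def send(port, method, path, body=None, headers=None):
    """Send one request over a fresh connection; return (status, body)."""
    conn = HTTPConnection("127.0.0.1", port)
    try:
        conn.request(method, path, body=body, headers=headers or {})
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


def demo():
    """Start the server on a free local port, PUT a resource, then GET it."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), IslandHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        put_status, _ = send(port, "PUT", "/somewhere/coconut",
                             body=b"fake image bytes",
                             headers={"Content-Type": "image/jpeg"})
        get_status, body = send(port, "GET", "/somewhere/coconut")
        return put_status, get_status, body
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() performs a PUT followed by a GET on the same URI. The client gets back what it sent, yet no file named coconut ever exists anywhere. Swapping the dictionary for another backend would not change anything the client can observe.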
In the following weeks, I hope I'll get time to give you a bit of code to handle HTTP PUT on your Web server and use Amaya with it. If you have links to online documentation on how to do it in practice, please leave a comment.
PS: So far I have these:
When I think about using the HTTP methods like this, I always come back to the HTML form element and the fact it only accepts/implements GET and POST.
Any advice on using more methods from a browser (as opposed to implementing a client that sits on a web server - I know there are libraries supporting this)? Sounds like Amaya is a bit more advanced in this regard... how does it make a PUT request?
Hey Ben,
Some people have tried with JavaScript. See this experiment with HTTP PUT and DELETE in JavaScript in the context of the AtomPub protocol.
One thing I've never quite understood with PUT is whether servers have to put the entire contents of the PUT request at the specified URI (e.g. with the image example above), or whether they can wrap the PUT body within an HTML page. That is, can you make a PUT request with just the body and title of a blog post, and then have it returned surrounded by all the usual stuff (header, nav, footer, etc.)?
Yes, PUT is not about storing files. For example, in PhotoRDF, PUT can be used to modify the metadata chunk of a picture, so the PUT affects not only one resource but potentially several others.
Another example of a server handling its own resources is when you do a PUT to a server that has a versioning system in the back-end. It is perfectly possible for a server, once the PUT is done, to do some processing and potentially change the resource (like CVS keyword substitution). What matters is the resource (and its identification), not the server machinery or any assumption about the way the server works.
My university lecturer told me not to bother with PUT or any of the other methods that are available, and to just stick with POST (when sending data) and GET (when getting data). To be honest, I can't see a reason why I would use PUT. Can someone give me an instance where POST and GET would not be sufficient and PUT would be needed?
Rashed, it is often used in WebDAV clients and server script implementations. Perhaps encoding is also better when transferring via an HTTP PUT, but I believe it mainly fulfils the task of making larger user-sent information arrive reliably at the server side.
@Rashed, well, POST can be used to do lots of things, but it would be bad to do POST /foo to delete /foo; the DELETE method is for that. The POST method identifies the resource that will handle the data you send along with the request, and the scope of this 'handling' is wide and undefined, while PUT will affect only the targeted resource.
Another property is that POST is not idempotent while PUT is, meaning that a PUT can safely be repeated if things go wrong (for example, if the connection is suddenly cut).
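The idempotency point can be illustrated with a toy model in Python. The TinySpace class and its put and post methods are hypothetical names invented for this sketch, and the POST behaviour shown (creating a new numbered child each time) is just one common convention, not something HTTP mandates.

```python
class TinySpace:
    """Toy information space: a mapping from URI to representation."""

    def __init__(self):
        self.resources = {}
        self.next_id = 1

    def put(self, uri, representation):
        # PUT targets one URI: repeating it leaves the space in the same state.
        self.resources[uri] = representation
        return uri

    def post(self, collection_uri, representation):
        # POST lets the server decide; here each call mints a new child URI.
        uri = f"{collection_uri}/{self.next_id}"
        self.next_id += 1
        self.resources[uri] = representation
        return uri


space = TinySpace()

# A client that lost the connection can simply retry the PUT.
space.put("/somewhere/coconut", b"image")
space.put("/somewhere/coconut", b"image")
print(len(space.resources))  # 1

# Retrying the POST creates a second resource instead.
space.post("/photos", b"image")
space.post("/photos", b"image")
print(len(space.resources))  # 3
```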
Excellent! This is little understood.
One nitpick: in "You are working with a photo manipulation software..." and "What the server, which is a software,..."
No, those are pieces of software, software packages, or software programs. There is no such thing as "a software" or "a hardware" or "a clothing" -- those are collective nouns. Grammar is important, please.
Thanks, William, for highlighting grammar mistakes. I've fixed them.