Difference between revisions of "Webservices API"

From GeeklogWiki

Revision as of 08:21, 8 August 2008

Webservices Documentation for Geeklog 1.5

The purpose of the Geeklog webservices module is to provide an application layer interface for Geeklog that can be used by standardized clients such as aggregators, desktop publishing software etc. to interact with the web-server, and create and update content programmatically.

The Atom Publishing Protocol is the protocol that has been implemented for this purpose. The protocol is now an official internet standard (RFC 5023).

For end-user documentation, please see Using the Webservices.


Relation of the Webservices component with the rest of Geeklog

In order to incorporate the webservice, the Geeklog code has been re-organized. The purely functional code has been moved into 'service' methods, separating it from the code that manages the display and rendering of the plugin. For instance, there are now functions called 'service_post_story' and 'service_delete_story' that take an array of parameters as an argument and perform the 'post' and 'delete' functions respectively. These methods can be called by the webservice and the HTML scripts alike. When called from external code, these functions must be invoked through the PLG_invokeService method, like this:

PLG_invokeService($plugin, $verb, $input, $output, $svc_msg);
$plugin The plugin whose function needs to be invoked
$verb The action to be performed ('post,' 'delete,' etc.)
$input An array of input parameters

Some of these input parameters may be optional, depending on the plugin. The plugin MUST ignore unknown parameters.

$output An array of output parameters passed by reference. The calling function may choose to display these parameters as it chooses. In the story and staticpages implementation, this variable is a string in a few cases, but it is STRONGLY RECOMMENDED that this variable be treated as a simple array of parameter-value pairs. On successful operation, the webservice script MAY display the content of this variable to the client in the form of XML.
$svc_msg This array, short for "service messages", is used exclusively by the webservice to get certain types of control information from the plugin.

The above invocation calls the function (provided it exists)

service_<verb>_<plugin>(<input>, <output>, <svc_msg>)

where <output> and <svc_msg> are passed by reference to the plugin.

The <output> and <svc_msg> arrays MUST be filled in by the plugin function, so that they can be acted upon by the calling function.
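
The naming convention above can be illustrated with a minimal sketch. The 'demo' plugin, the demo_invokeService function and the numeric values of the return codes are assumptions made for this example only; demo_invokeService merely imitates the dispatch described above and is not Geeklog's PLG_invokeService.

```php
<?php
// Placeholder values, assumed for this sketch; Geeklog defines the real constants.
if (!defined('PLG_RET_OK')) {
    define('PLG_RET_OK', 0);
}
if (!defined('PLG_RET_ERROR')) {
    define('PLG_RET_ERROR', -1);
}

// Hypothetical service function for the verb 'get' of a plugin named 'demo'.
function service_get_demo($input, &$output, &$svc_msg)
{
    // Fill in both reference arrays so the caller can act on them.
    $output  = array('id' => $input['id'], 'title' => 'Hello');
    $svc_msg = array('id' => $input['id']);

    return PLG_RET_OK;
}

// Stand-in dispatcher (NOT PLG_invokeService): builds the function name
// service_<verb>_<plugin> and calls it if it exists.
function demo_invokeService($plugin, $verb, $input, &$output, &$svc_msg)
{
    $function = 'service_' . $verb . '_' . $plugin;
    if (!function_exists($function)) {
        return PLG_RET_ERROR;
    }

    return $function($input, $output, $svc_msg);
}

$output  = array();
$svc_msg = array();
$rc = demo_invokeService('demo', 'get', array('id' => 'welcome'), $output, $svc_msg);
```

After the call, $output and $svc_msg hold whatever the plugin function stored in them, and $rc holds its return code.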

Implications for writing a plugin

If a plugin is developed according to certain rules, it can automatically provide an Atom-enabled interface for the client. For this, the following verbs must be implemented:

submit This verb handles any kind of data posted to the server. In the context of the Atom protocol, this verb is used to create new items or update existing items on the server. (See: GL Directives) Successful completion results in the return of either a 'HTTP/1.1 201 Created' response or a 'HTTP/1.1 200 Ok' response to the Atom client.
delete This verb deletes a specified resource on the server.
get This verb handles the retrieval of information existing on the server. When accessed via the webservice, each item is serialized into XML, in the Atom Syndication format (See: RFC 4287). The plugin may return a single entry or multiple entries. The plugin MUST return a single item if the $id variable is set. (See: Service Messages)

The Atom specifications have been extended to allow verbs other than the ones above. (See: Atom Extensions)

Enabling webservices for a plugin

In order to enable webservices for a plugin, it must implement the plugin_wsEnabled_<plugin> function AND this function must return true. For example:

function plugin_wsEnabled_staticpages()
{
    return true;
}

For a plugin that supports webservices, webservices can be disabled by returning false in the function above.

The following function can be used in the rest of the GL code to check if a specific plugin supports webservices:

PLG_wsEnabled($type);
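
As a rough illustration of how such a check could work, the sketch below looks for the plugin_wsEnabled_<plugin> function and requires it to return true. The 'demo' plugin and the demo_wsEnabled helper are hypothetical; this is not the actual implementation of PLG_wsEnabled.

```php
<?php
// Hypothetical plugin 'demo' that opts in to webservices.
function plugin_wsEnabled_demo()
{
    return true;
}

// Hypothetical helper (NOT PLG_wsEnabled): a plugin counts as enabled only
// if the function exists AND returns true.
function demo_wsEnabled($type)
{
    $function = 'plugin_wsEnabled_' . $type;

    return function_exists($function) && ($function() === true);
}

$enabled = demo_wsEnabled('demo');
$missing = demo_wsEnabled('noplugin');
```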


GL Directives

The $input array should contain all the information required for the successful processing of the requested action. Some keys in this array are, however, reserved for providing useful processing information to the plugin. These array keys MUST NOT be used to store user-provided information.

'gl_svc' If true, this indicates that the function has been invoked by the webservices component. Ideally, this should not matter, but for existing plugins, it eases the transition to an Atom-enabled server by allowing the plugin to differentiate between a webservice call and an invocation by the HTML component.
'gl_edit' If true, this indicates that the 'submit' verb has been invoked in 'Edit' mode, which means an existing item is to be modified. On successful completion, the Atom client will receive a 'HTTP/1.1 200 Ok' response, rather than a 'HTTP/1.1 201 Created' response that is normally transmitted for new items.
'gl_etag' If set, this variable contains the If-Match HTTP header (with the double-quotes stripped) sent by the client along with an update request. Unless it is empty, this variable MUST be compared to the 'updated' property of the existing item before the item is modified. This ensures that the item has not been modified in the interval between its retrieval by the client and the subsequent update.
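
A plugin's 'submit' function might honour these directives roughly as follows. This is only a sketch: the 'demo' plugin, the in-memory DemoStore class, the plain string comparison of 'gl_etag' against 'updated' and the numeric values of the return codes are all assumptions made for the example.

```php
<?php
// Placeholder values, assumed for this sketch; Geeklog defines the real constants.
foreach (array('PLG_RET_OK' => 0, 'PLG_RET_ERROR' => -1,
               'PLG_RET_PRECONDITION_FAILED' => -4) as $name => $value) {
    if (!defined($name)) {
        define($name, $value);
    }
}

// Hypothetical in-memory storage standing in for the plugin's database table.
class DemoStore
{
    public static $items = array(
        'welcome' => array('title' => 'Welcome',
                           'updated' => 'Tue, 01 Jan 2008 00:00:00 +0000'),
    );
}

function service_submit_demo($input, &$output, &$svc_msg)
{
    $id = $input['id'];

    if (!empty($input['gl_edit'])) {
        // Edit mode: the item must already exist.
        if (!isset(DemoStore::$items[$id])) {
            return PLG_RET_ERROR;
        }
        // A non-empty gl_etag must match the stored 'updated' value.
        $etag = isset($input['gl_etag']) ? $input['gl_etag'] : '';
        if (($etag !== '') && ($etag !== DemoStore::$items[$id]['updated'])) {
            return PLG_RET_PRECONDITION_FAILED;
        }
    }

    DemoStore::$items[$id] = array('title'   => $input['title'],
                                   'updated' => $input['updated']);
    $output = DemoStore::$items[$id];
    $svc_msg['id'] = $id;

    return PLG_RET_OK;
}
```

An edit request carrying a stale 'gl_etag' is rejected before the stored item is touched, which the webservice would then report as '412 Precondition Failed'.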


Standard Input Keys

Apart from the GL directives, there are some more array keys for the $input variable that have standard meanings. These include -

'id' The ID of the item that the client wants to refer to.
'title' The title of the item under consideration.
'author_name' The name of the author, as provided by the client.
'category' An array of all the categories for the item, supplied by the user. (See: Categories)
'updated' The date and time, as accurately as possible, at which the item was last updated. Since the 'updated' value is used to determine if the item has been modified, it is STRONGLY RECOMMENDED that the value be updated on each modification of the item. This value is in the RFC 2822 format. The following keys are also updated with the local time, based on the value of $input['updated'] -
       'publish_month'
       'publish_year'
       'publish_day'
       'publish_hour'
       'publish_minute'
       'publish_second'
'summary' A summary of the content of the item.
'content' The main content of the item.


Output Array

The $output variable contains all the output generated by the plugin function. In error conditions, the $output variable MAY be a string rather than an array, since the webservice does not handle the $output variable under error conditions. However, this is NOT RECOMMENDED.

The items listed in Section Standard Input Keys MUST be filled in appropriately by the plugin function before returning.


Service Messages

The $svc_msg array is used to return specific messages to the webservice component. The following array keys are understood -

'id' The ID of the item under consideration. For POST requests, this ID forms a part of the URI returned in the Location header, as specified by the Atom protocol. For GET requests, this ID forms part of the URI that is inserted into each entry.
'error_desc' When the plugin function returns an error code, the webservice looks at this value and returns it to the user if it is non-empty. This is particularly useful for making the 400 Bad Request errors more descriptive and plugin-specific.
'gl_feed' The plugin should set this variable to true if the plugin is returning multiple items, rather than a single item. This means that $output is expected to be an array of arrays.
'offset' This variable indicates the number of items (from the start) that the server would have to skip in order to present the next partial list of items of the collection. This value forms a part of the URI inserted into the Atom feed document.
'output_fields' This array provides the list of keys of the $output variable that should be converted into XML and displayed to the user. This is primarily used because the plugin function may want to hide some of the output values in case of the webservice. This list MUST NOT include any of the standard Atom elements (See: Standard Input Keys). Those elements will be displayed to the user, even without being listed here.
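
The following sketch shows how a 'get' function might fill in these keys for a single item and for a partial list. The 'demo' plugin, its three hard-coded items, the page size of 2, the 'hits' field and the assumption that the requested offset arrives as $input['offset'] are all made up for this example.

```php
<?php
// Placeholder value, assumed for this sketch; Geeklog defines the real constant.
if (!defined('PLG_RET_OK')) {
    define('PLG_RET_OK', 0);
}

function service_get_demo($input, &$output, &$svc_msg)
{
    // Hypothetical data; a real plugin would read from the database.
    $items = array(
        array('id' => 'a', 'title' => 'First',  'hits' => 3),
        array('id' => 'b', 'title' => 'Second', 'hits' => 7),
        array('id' => 'c', 'title' => 'Third',  'hits' => 1),
    );
    $pageSize = 2; // assumed size of one partial list

    if (!empty($input['id'])) {
        // An ID was given: return exactly one item.
        foreach ($items as $item) {
            if ($item['id'] === $input['id']) {
                $output = $item;
                $svc_msg['id'] = $item['id'];
            }
        }
    } else {
        // No ID: return a partial list and tell the webservice how many
        // items to skip to reach the next partial list.
        $offset = isset($input['offset']) ? (int) $input['offset'] : 0;
        $output = array_slice($items, $offset, $pageSize);
        $svc_msg['gl_feed'] = true;
        $svc_msg['offset']  = $offset + count($output);
    }
    // Only non-standard keys are listed here; 'id' and 'title' are standard.
    $svc_msg['output_fields'] = array('hits');

    return PLG_RET_OK;
}
```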


HTTP Responses and Return Codes

The service_<verb>_<plugin> functions MUST return one of the following codes

PLG_RET_OK Everything is okay
PLG_RET_AUTH_FAILED Credentials were supplied by the client, but authentication failed
PLG_RET_PERMISSION_DENIED The client does not have access to the specified resource
PLG_RET_PRECONDITION_FAILED The If-Match HTTP header condition provided by the client failed
PLG_RET_ERROR An error apart from the ones above was encountered


The Atom server returns one of the following responses on successful (PLG_RET_OK) or unsuccessful (all other return codes) completion of an operation:

Return code                   HTTP response             Description
PLG_RET_OK                    200 Ok                    This is the usual response.
PLG_RET_OK                    201 Created               This is the response returned when the HTTP method used is POST
PLG_RET_AUTH_FAILED           401 Unauthorized          Authentication failed
PLG_RET_PERMISSION_DENIED     403 Forbidden             The supplied credentials are insufficient
PLG_RET_PRECONDITION_FAILED   412 Precondition Failed   A necessary condition failed
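
The mapping above could be expressed as a small lookup function. This is a sketch, not the webservice's actual code: demo_httpStatus is a hypothetical helper, the numeric values of the constants are placeholders, and the '400 Bad Request' fallback for PLG_RET_ERROR is an assumption based on the description of 'error_desc'.

```php
<?php
// Placeholder values, assumed for this sketch; Geeklog defines the real constants.
foreach (array('PLG_RET_OK' => 0, 'PLG_RET_ERROR' => -1,
               'PLG_RET_PERMISSION_DENIED' => -2, 'PLG_RET_AUTH_FAILED' => -3,
               'PLG_RET_PRECONDITION_FAILED' => -4) as $name => $value) {
    if (!defined($name)) {
        define($name, $value);
    }
}

// Hypothetical helper: translate a service return code into an HTTP status.
function demo_httpStatus($ret, $httpMethod)
{
    switch ($ret) {
        case PLG_RET_OK:
            // POST creates a new item, everything else gets the usual response.
            return ($httpMethod === 'POST') ? '201 Created' : '200 Ok';
        case PLG_RET_AUTH_FAILED:
            return '401 Unauthorized';
        case PLG_RET_PERMISSION_DENIED:
            return '403 Forbidden';
        case PLG_RET_PRECONDITION_FAILED:
            return '412 Precondition Failed';
        default:
            // Assumption: other errors are reported as a bad request.
            return '400 Bad Request';
    }
}
```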


Atom Client Requirements

A standard Atom client can be used to post, edit, delete and get items on the server through any plugin. A Geeklog-specific client would provide fine-grained control over the input data. (See: Atom Extensions)


XML Namespaces

The namespaces that are expected by the webservice are


Atom Extensions

The webservice ignores all XML elements that do not belong to one of the above namespaces. Some elements belonging to the http://www.w3.org/2005/Atom namespace are interpreted as explained in Section: Standard Input Keys. All other elements belonging to either the http://www.w3.org/2005/Atom or http://www.geeklog.net/xmlns/app/gl namespaces are transformed thus:

$input[<name>] = <value>

where

  • <name> is the local name of the node
  • <value> is the value of the node's content (if the content is text-only) OR the text-values contained in all the child nodes, stored as an array

For example

<somename>John</somename>                   becomes $input['somename'] = 'John';
<somename><param>abcd</param></somename>    becomes $input['somename'] = array ( 'abcd' );
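
The transformation can be sketched with PHP's DOM extension as shown below. The demo_entryToInput function and the sample entry are hypothetical, and the special handling of the standard Atom elements is deliberately left out; the sketch only demonstrates the generic rule and the namespace filter.

```php
<?php
// Hypothetical helper: convert the children of an entry into $input.
function demo_entryToInput($xml)
{
    $namespaces = array('http://www.w3.org/2005/Atom',
                        'http://www.geeklog.net/xmlns/app/gl');

    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $input = array();
    foreach ($doc->documentElement->childNodes as $node) {
        // Skip text nodes and elements from unknown namespaces.
        if (($node->nodeType !== XML_ELEMENT_NODE) ||
                !in_array($node->namespaceURI, $namespaces, true)) {
            continue;
        }
        // Collect the text values of all child elements, if any.
        $values = array();
        foreach ($node->childNodes as $child) {
            if ($child->nodeType === XML_ELEMENT_NODE) {
                $values[] = $child->textContent;
            }
        }
        // Text-only content becomes a string, child nodes become an array.
        $input[$node->localName] = empty($values) ? $node->textContent : $values;
    }

    return $input;
}

$xml = '<entry xmlns="http://www.w3.org/2005/Atom" xmlns:x="http://example.com/other">'
     . '<somename>John</somename>'
     . '<othername><param>abcd</param></othername>'
     . '<x:ignored>not used</x:ignored>'
     . '</entry>';
$input = demo_entryToInput($xml);
```

Here the x:ignored element is dropped because its namespace (made up for the example) is not one of the two accepted ones.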

To invoke a verb other than 'submit,' 'delete' or 'get,' the client should insert the following XML snippet as a child of the atom:entry node

<action xmlns="http://www.geeklog.net/xmlns/app/gl">$verb</action>

where $verb is the requested verb. The content should be submitted as a POST request. Successful operation returns a '201 Created' HTTP response.


Categories

Atom categories correspond to Topics in Geeklog. The Atom server has support for multiple categories. atom:category elements of the form -

<category xmlns="http://www.geeklog.net/xmlns/app/gl" term="sometopic"/>
<category xmlns="http://www.geeklog.net/xmlns/app/gl" term="someothertopic"/>

are converted into -

$input['category'] = array ( 'sometopic', 'someothertopic' );

If the plugin can support only one topic, then it MAY reject all of the categories provided by the user, or all except one.
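
As an illustration of the conversion above, the sketch below collects the term attributes of all category elements in the namespace used in the example. The demo_categories function and the wrapping entry element are assumptions made for this example.

```php
<?php
// Hypothetical helper: gather the 'term' of every category element.
function demo_categories($xml)
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $terms = array();
    $nodes = $doc->getElementsByTagNameNS('http://www.geeklog.net/xmlns/app/gl',
                                          'category');
    foreach ($nodes as $node) {
        $terms[] = $node->getAttribute('term');
    }

    return $terms;
}

$xml = '<entry xmlns="http://www.w3.org/2005/Atom">'
     . '<category xmlns="http://www.geeklog.net/xmlns/app/gl" term="sometopic"/>'
     . '<category xmlns="http://www.geeklog.net/xmlns/app/gl" term="someothertopic"/>'
     . '</entry>';
$input = array('category' => demo_categories($xml));
```

A plugin that supports only one topic could then, for instance, keep just the first element of $input['category'].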

The server provides the client a list of possible categories using the 'getTopicList' verb.


URI Details

The webservice follows the standard Atom discovery mechanism to let clients know the URIs of the available services. A webservice URI is of the form -

http://<domain>/webservices/atom/?plugin=<plugin_name>
http://<domain>/webservices/atom/?plugin=<plugin_name>&id=<object_id>
http://<domain>/webservices/atom/?plugin=<plugin_name>&offset=<offset_value>

In the absence of the <object_id> parameter value, the URI is assumed to point to the entire collection. In this case, the first <offset_value> items MAY be skipped on a GET request. If the plugin provides support for skipping elements, then the $svc_msg['offset'] value, on return, MUST contain the offset value for obtaining the next set of items in the collection.
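
A client could assemble these URIs as in the following sketch. The demo_serviceUri function and the example.com domain are hypothetical; the path and parameter names are the ones listed above.

```php
<?php
// Hypothetical helper: build a webservice URI for a plugin.
function demo_serviceUri($domain, $plugin, array $params = array())
{
    // 'plugin' always comes first, followed by 'id' or 'offset' if given.
    $query = array_merge(array('plugin' => $plugin), $params);

    // Pass '&' explicitly so the result does not depend on php.ini settings.
    return 'http://' . $domain . '/webservices/atom/?'
         . http_build_query($query, '', '&');
}

$collection = demo_serviceUri('example.com', 'staticpages');
$singleItem = demo_serviceUri('example.com', 'staticpages', array('id' => 'about'));
$nextPage   = demo_serviceUri('example.com', 'staticpages', array('offset' => 10));
```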

If the <object_id> value is invalid but not empty, the plugin function must return an error response.


Authentication

The webservice uses the Basic authentication scheme(1). If the client receives a '401 Unauthorized' HTTP response, then it MAY send another request with suitable credentials. The authentication string passed by the client is expected to be -

base64 ( <username>:<password> )

Basic authentication is handled implicitly by the webserver in most cases.
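
For clients that have to build the header themselves, the authentication string can be produced as in this short sketch. The demo_basicAuthHeader function is hypothetical, and the credentials are sample values only.

```php
<?php
// Hypothetical helper: build the Authorization header for Basic authentication.
function demo_basicAuthHeader($username, $password)
{
    return 'Authorization: Basic ' . base64_encode($username . ':' . $password);
}

// Sample credentials, for illustration only.
$header = demo_basicAuthHeader('Aladdin', 'open sesame');
```

Note that base64 is an encoding, not encryption, so the password is only as safe as the connection it travels over.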

If PHP is installed as a CGI binary on your server, then authentication might fail because Apache may not pass on the authorization headers to PHP. In that case, update your .htaccess file to include the following lines:

RewriteEngine on
RewriteRule .* - [E=REMOTE_USER:%{HTTP:Authorization},L]
RewriteCond %{HTTP:Authorization} username=\"([^\"]+)\"

(1) A functional implementation of WSSE authentication is included in system/lib-webservices.php but cannot be used, since Geeklog does not have access to the user's unencrypted password and therefore can't perform the authentication ...


Implementation Details

Length of Entry IDs

Atompub clients will usually include an ID when creating a new entry. To ensure that this ID is unique and can be used, Geeklog will need to know the max. length of an ID as used by a plugin. For stories and static pages, that max. length is 40 characters and has now been hard-coded as constants.

Earlier versions of Geeklog did not use a hard-coded length, so it was easy to use longer IDs (e.g. for SEO purposes) by simply changing the database and the input fields in the story or static pages editor. If you did that, you will have to adjust the constants accordingly or you will not be able to modify your entries through the webservices API.

Slug

Some Atompub clients will send a Slug: header with the POST request when creating a new entry. This header contains a text string that the client suggests to be used in the ID for the new entry.

Geeklog will try to make use of the Slug: header if it decides that it needs to create a new ID for the entry. However, the content will be ignored if it contains %-encoded characters, since those are usually non-printable or Unicode characters that cannot be used in an entry ID in Geeklog.

The content of the Slug: header is also available as a 'slug' entry in the input array. Plugins can either use it directly or pass it to the WS_makeId function to create a new ID.
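
A plugin that uses the 'slug' entry directly might treat it along the following lines. This is a sketch under stated assumptions, not the behaviour of WS_makeId: demo_idFromSlug is a hypothetical helper, the set of characters it keeps is an assumption, and DEMO_MAX_ID_LENGTH simply mirrors the 40-character limit mentioned above for stories and static pages.

```php
<?php
// Assumed constant name; mirrors the 40-character limit mentioned above.
define('DEMO_MAX_ID_LENGTH', 40);

// Hypothetical helper: derive an ID from a slug, or return null if unusable.
function demo_idFromSlug($slug)
{
    // Ignore slugs containing %-encoded characters.
    if (($slug === '') || preg_match('/%[0-9A-Fa-f]{2}/', $slug)) {
        return null;
    }
    // Assumption: IDs may only contain letters, digits, '_' and '-'.
    $id = preg_replace('/[^A-Za-z0-9_-]+/', '-', $slug);
    $id = trim($id, '-');
    // Enforce the maximum ID length.
    $id = substr($id, 0, DEMO_MAX_ID_LENGTH);

    return ($id === '') ? null : $id;
}

$id      = demo_idFromSlug('My First Post');
$ignored = demo_idFromSlug('caf%C3%A9');
```

When the helper returns null, the plugin would fall back to generating an ID by other means.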


Security Implications

Plugin developers should be aware that writing a function of the form -

service_<x>_<y> (...)

makes the function open to the public, in the sense that it can be called using the webservice, with appropriate parameters. Functions should not be named in this way, unless they are intended to be called independently.

An important corollary is that function calls of the type -

if (<security_check>) {
    PLG_invokeService(...);
}

are BAD, because the same function can be called from the webservice WITHOUT the <security_check>.
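
One way to avoid this problem is to perform the check inside the service function itself, so that it runs no matter who the caller is. The sketch below assumes a hypothetical 'demo' plugin, a made-up DemoSession class standing in for Geeklog's permission checks, and placeholder values for the return codes.

```php
<?php
// Placeholder values, assumed for this sketch; Geeklog defines the real constants.
if (!defined('PLG_RET_OK')) {
    define('PLG_RET_OK', 0);
}
if (!defined('PLG_RET_PERMISSION_DENIED')) {
    define('PLG_RET_PERMISSION_DENIED', -2);
}

// Hypothetical stand-in for the current user's permissions.
class DemoSession
{
    public static $canDelete = false;
}

function service_delete_demo($input, &$output, &$svc_msg)
{
    // The security check lives inside the service function, so it applies
    // to HTML scripts and webservice requests alike.
    if (!DemoSession::$canDelete) {
        $svc_msg['error_desc'] = 'You are not allowed to delete this item.';
        return PLG_RET_PERMISSION_DENIED;
    }

    // ... the actual deletion would happen here ...
    $output = array('id' => $input['id']);

    return PLG_RET_OK;
}
```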


Compliance

To the best of our knowledge, the webservices / Atompub implementation in Geeklog complies with RFC 5023 (and RFC 4287, where applicable).

Atom Protocol Exerciser

The Atom Protocol Exerciser (aka The APE) by Tim Bray performs several operations against an Atompub service (such as the one implemented in Geeklog) and evaluates the responses, i.e. it is looking for expected results according to the RFCs.

At the time of this writing (January 2008) Geeklog's Atompub implementation for the Static Pages plugin passes all of the APE's tests.

Non-conformance of Stories Implementation

The Atompub implementation for stories fails one test, but that was a deliberate decision. In this test, the APE creates 3 stories, modifies the second one, and then expects the stories to show up in the order 2-1-3. However, since Geeklog does not have the concept of a "last-modified" date for stories, we would have to modify the creation date of the story to pass this test. This would mean that any change to a story through an Atompub client would cause the story to show up as "new" on the site, even if someone only fixed a typo in an old story.

Other than that detail, the Atompub implementation for stories is also fully compliant.