MMMeeja

blog

A Beginner's Guide To REST Web Services

Posted on 19 Nov 2008 by Andy - Filed under   

This article series will take you through RESTful web services, introducing the technologies behind them, what they are, how to use them and how to implement them.

The series is presented in five parts, each of which will be posted over the next few days so sign up to the MMMeeja RSS feed to make sure you don't miss a single post!

REST (or Representational State Transfer) is fast becoming the de facto standard for exposing APIs on the web, beating more complex SOAP/RPC services. It's easy to understand why - it's a logical extension to how we all use the web already - as you'll see as the series continues.

I start off with a couple of introductory articles to the underlying technologies to get you up to speed:

1. What Is A URL?
2. An Introduction To HTTP

Then, let's get to the meat with:

3. Examples Of REST Interfaces
4. Using RESTful Web Services

And for the more advanced:

5. Designing And Implementing A RESTful Interface

So, the series will be a trip from n00b to l33t h4xx0r in five installments. Don't forget to subscribe.


Creative Commons licensed photo by Bobby ~ Lawcrow911.


Examples Of REST Interfaces

Posted on 19 Nov 2008 by Andy - Filed under   

Continuing our look at Representational State Transfer interfaces, part three lets us get our hands dirty with some real-life RESTful interfaces from BrightKite and del.icio.us.

But first...

What Is A RESTful Interface?

Three articles into the series and I finally get to explain this, thanks for sticking around!

Knitted body to laptop interface

REST stands for Representational State Transfer, a term first coined in Roy Fielding’s PhD thesis. It involves using HTTP to manipulate resources identified by URLs via a completely stateless protocol.

The URLs used are generally quite pretty. For example, an e-commerce store might sell a product identified by this URL:

http://mystore.com/products/1234

The store might have an API that supplies users with a list of products when they perform an HTTP GET request on this URL:

http://mystore.com/products

The response might be supplied in an XML document, like this:

<?xml version="1.0"?>
<products>
    <product id="1">
         <url>http://mystore.com/products/1</url>
         <name>Red widget</name>
         <description>A large, left-handed widget</description>
         <stock-level>406</stock-level>
    </product>
    <product id="2">
         <url>http://mystore.com/products/2</url>
         <name>Blue widget</name>
         <description>A small, left-handed widget</description>
         <stock-level>12</stock-level>
    </product>
</products>
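
If you'd rather read that response from a script than by eye, here is a minimal sketch in Python, assuming the response body looks exactly like the product list above. The store is fictional, so the XML is pasted in as a string rather than fetched, and the function name parse_products is just something I made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sample response body for the fictional store (nothing is fetched here).
RESPONSE = """<?xml version="1.0"?>
<products>
    <product id="1">
        <url>http://mystore.com/products/1</url>
        <name>Red widget</name>
        <description>A large, left-handed widget</description>
        <stock-level>406</stock-level>
    </product>
    <product id="2">
        <url>http://mystore.com/products/2</url>
        <name>Blue widget</name>
        <description>A small, left-handed widget</description>
        <stock-level>12</stock-level>
    </product>
</products>
"""


def parse_products(xml_text):
    """Return a list of dicts, one per <product> element."""
    root = ET.fromstring(xml_text)
    products = []
    for node in root.findall("product"):
        products.append({
            "id": int(node.get("id")),
            "url": node.findtext("url"),
            "name": node.findtext("name"),
            "stock": int(node.findtext("stock-level")),
        })
    return products


products = parse_products(RESPONSE)
for product in products:
    print(product["id"], product["name"], product["stock"])
```

Each product carries its own URL, so a client can follow it to the individual resource without having to know how the store builds its URLs.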

Stateless Protocol

I mentioned that REST is a stateless protocol, so how can that work in an e-commerce context? People need to browse around the site and fill up their shopping basket!

Simple: a shopping basket is a resource too. Create a basket with a POST request to this URL:

http://mystore.com/baskets

And add items with POSTs like this:

http://mystore.com/baskets/1/products/2

You can use HTTP authentication to ensure that only the basket owner can view or manipulate the contents of the basket.
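
To make the stateless idea a little more concrete, here is a rough Python sketch that builds (but does not send) the two basket requests. The store does not exist, and the helper names are my own invention. The point is that each request names the resource it acts on in its URL, so the server needs no memory of earlier requests.

```python
from urllib.request import Request

BASE = "http://mystore.com"  # the fictional store from the text


def create_basket_request():
    """A POST to the collection URL asks the server to create a new basket."""
    return Request(BASE + "/baskets", data=b"", method="POST")


def add_item_request(basket_id, product_id):
    """A POST that adds a product to one specific basket."""
    url = f"{BASE}/baskets/{basket_id}/products/{product_id}"
    return Request(url, data=b"", method="POST")


# The requests are only constructed here, never sent over the network.
requests_to_send = [create_basket_request(), add_item_request(1, 2)]
for req in requests_to_send:
    print(req.get_method(), req.full_url)
```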

Enough of fabricated example shops, let’s look at some RESTful APIs that are being used on the internet today...

Delicious

One of the first mainstream web applications to offer a RESTful interface was del.icio.us with its version 2 API. We’ll be using the del.icio.us API for this example, so if you’re following along be sure to get a del.icio.us account and download curl, the tool that we’ll be using.

Delicious Logo

We’ll begin by using the API to get a list of tags that we’ve applied to our bookmarks. Open a command line terminal and type the following (changing username & password to your values):

curl --user username:password https://api.del.icio.us/v1/tags/get

This executes an authenticated GET request for your set of tags and the output will look something like this:


<?xml version="1.0" encoding="UTF-8"?>
<tags>
  <tag count="1" tag="Action"/>
  <tag count="1" tag="Bookmarklet"/>
  <tag count="1" tag="Day"/>
  <tag count="8" tag="EC2"/>
  <tag count="1" tag="GTD"/>
  <tag count="14" tag="IE"/>
  <tag count="14" tag="Maps"/>
  <tag count="1" tag="On"/>
  <tag count="3" tag="yahoo"/>
  <tag count="1" tag="youtube"/>
</tags>

Pretty cool, no?

Now, let’s examine the REST API to add a new bookmark:

curl --user username:password -d "url=http%3A%2F%2Fwww.mmmeeja.com%2Fblog%2F&tags=webdev%20web%20design" https://api.del.icio.us/v1/posts/add

The command above will perform an HTTP POST to add this blog :) to your del.icio.us account. Note that the url parameter is URL encoded and the tags parameter has spaces replaced by %20.
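
You don't have to do the URL encoding by hand. As a small sketch, Python's standard library will build the POST body for you. The parameter names come from the command above and nothing is sent anywhere.

```python
from urllib.parse import quote, urlencode

params = {
    "url": "http://www.mmmeeja.com/blog/",
    "tags": "webdev web design",
}

# quote_via=quote encodes spaces as %20, matching the curl command.
body = urlencode(params, quote_via=quote)
print(body)

# The default encoder (quote_plus) writes spaces as "+" instead, which is
# the other common convention for form data.
plus_body = urlencode(params)
print(plus_body)
```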

The observant reader will note that there’s something fishy about the two del.icio.us URLs we just used. The first ended in /get and the second in /add - this goes against the whole idea of using HTTP methods for REST! Yes, delicious have compromised by allowing people to use HTTP GET to create resources and the only way to do that is to break the HTTP model.

This is very common and I understand why they did it, but it’s a shame, especially when I’m using their API to teach about RESTful interfaces. Roy Fielding successfully argues that a resource should only ever have one URL no matter whether the API changes.

So, how about another API that’s a bit more compliant?

BrightKite

I had to search for a long time to find a well-known service that implemented a truly RESTful API, so hats off to BrightKite.

BrightKite Logo

If you’ve not got a BrightKite account already, you might be out of luck because it’s invite-only at the moment, but there are plenty of invites around if you know where to look.

The BrightKite API documentation explains that people, places, notes, photos etc are all resources available via their API. Placenames are rarely unique and unambiguous so places are assigned UUIDs, so the URL for City Square, in Leeds is:

http://brightkite.com/places/b1adc0a0b65e11dd8a90003048c0801e

That URL takes you to the web page, but you can get the information in XML format by performing an HTTP GET of the URL with “.xml” on the end:

curl http://brightkite.com/places/b1adc0a0b65e11dd8a90003048c0801e.xml

You should be able to access that part of the BrightKite API without an account, so try it and check out the results. You can also get the results in JSON format by substituting “.json” for “.xml”.

Posting Notes

Places can have notes associated with them. Get a list of the notes attached to City Square like this:

curl http://brightkite.com/places/b1adc0a0b65e11dd8a90003048c0801e/notes.xml

In correct RESTful fashion, we can create a note with an HTTP POST. We must be an authenticated user to be allowed to do this, so replace the username and password fields in the command below:

curl -u username:password -X POST -d "note[body]=My%20lovely%20note" http://brightkite.com/places/b1adc0a0b65e11dd8a90003048c0801e/notes
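
For the curious, here is roughly what that curl command puts on the wire, sketched in Python. The helper names are my own, the request is built but never sent, and I'm assuming plain HTTP Basic authentication, which is what curl's -u option uses by default.

```python
import base64
from urllib.parse import quote, urlencode
from urllib.request import Request


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_note_request(place_id, text, username, password):
    """Build (but do not send) a POST that creates a note on a place."""
    body = urlencode({"note[body]": text}, quote_via=quote).encode("ascii")
    url = f"http://brightkite.com/places/{place_id}/notes"
    headers = {"Authorization": basic_auth_header(username, password)}
    return Request(url, data=body, headers=headers, method="POST")


note_request = build_note_request(
    "b1adc0a0b65e11dd8a90003048c0801e", "My lovely note", "username", "password"
)
print(note_request.get_method(), note_request.full_url)
print(note_request.data.decode("ascii"))
```

Note that the square brackets in the parameter name get percent-encoded here too. The server decodes them again before reading the form.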

Finally, We Get Some REST!

That was part three of the series, only two more to go. I hope that it’s been worth getting up to speed on URLs and HTTP before we got into the REST examples.

In the next installment, I’ll cover using RESTful interfaces from software. Both client-side javascript and server-side scripts will be featured, so don’t forget to subscribe.


Creative Commons licensed photo by Bekathwia.


An Introduction To HTTP

Posted on 16 Nov 2008 by Andy - Filed under   

HTTP (HyperText Transfer Protocol) defines how resources get transferred across the web. In the previous article on URLs, I explained that there are other protocols but HTTP is the most important when it comes to RESTful interfaces.

HTTP Methods

When a client (like your web browser) asks for a resource from a web server, it uses one of:

  • GET
  • PUT
  • POST
  • DELETE
  • HEAD

HTTP GET

This method means “I’d like the resource, please”. The client or the server can override the request and grab a copy from a cache.

I’m going to say that again: the client can override the request and grab a copy from its cache. That means no network activity; the server doesn’t even know that the resource was requested.

HTTP PUT & POST

Both of these can mean “here is a resource for you”. Either could update an existing resource or create a new one: a PUT stores the resource at the URL you name, while a POST hands it to the resource at that URL to deal with as it sees fit. We will see later on that in RESTful interfaces a PUT usually means update, whilst POST means create.

POST is commonly used in web forms, so when you buy an item from Amazon or bid on an eBay auction, you are sending a POST request. POST requests should never be answered from a cache.

HTTP DELETE

I bet you can guess what this does!

HTTP HEAD

This is a more interesting method. It means “tell me about the resource, but don’t bother sending the resource”. It can be used to check whether a resource exists (by checking the response code), or to find the type of the resource or how old it is (by checking the headers).

HTTP Response Codes

The server will answer an HTTP request with a code that indicates whether the request could be processed. I’m sure you know some of these codes already, like the dreaded 404!

There’s quite a bunch of them, so here are a select few that are applicable to RESTful APIs:

  • 200 OK
  • 201 Created
  • 202 Accepted
  • 304 Not Modified
  • 400 Bad Request
  • 401 Unauthorized
  • 403 Forbidden
  • 404 Not Found
  • 405 Method Not Allowed
  • 406 Not Acceptable
  • 409 Conflict
  • 410 Gone
  • 411 Length Required
  • 412 Precondition Failed
  • 413 Request Entity Too Large
  • 414 Request-URI Too Long
  • 415 Unsupported Media Type
  • 500 Internal Server Error
  • 501 Not Implemented
  • 503 Service Unavailable

I’ll explain how these might apply to a web service in a later article, when we cover REST in more detail.
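
In the meantime, here is a rough preview of how a handful of those codes could pair up with the methods above. It is a toy, in-memory sketch in Python with no networking at all. The WidgetStore class and its /widgets URLs are entirely made up, and a real service would make its own choices (for example, it might let PUT create a resource that doesn't exist yet).

```python
from http import HTTPStatus


class WidgetStore:
    """A toy resource store that answers requests with (status, payload)."""

    def __init__(self):
        self.items = {}
        self.next_id = 1

    def handle(self, method, path, body=None):
        parts = path.strip("/").split("/")
        if parts[0] != "widgets" or len(parts) > 2:
            return HTTPStatus.NOT_FOUND, None

        if len(parts) == 1:  # the collection: /widgets
            if method == "GET":
                return HTTPStatus.OK, dict(self.items)
            if method == "POST":  # create a new resource
                new_id = str(self.next_id)
                self.next_id += 1
                self.items[new_id] = body
                return HTTPStatus.CREATED, f"/widgets/{new_id}"
            return HTTPStatus.METHOD_NOT_ALLOWED, None

        item_id = parts[1]  # a single resource: /widgets/<id>
        if item_id not in self.items:
            return HTTPStatus.NOT_FOUND, None
        if method == "GET":
            return HTTPStatus.OK, self.items[item_id]
        if method == "HEAD":  # same status as GET, but no body
            return HTTPStatus.OK, None
        if method == "PUT":  # update the existing resource
            self.items[item_id] = body
            return HTTPStatus.OK, body
        if method == "DELETE":
            del self.items[item_id]
            return HTTPStatus.OK, None
        return HTTPStatus.METHOD_NOT_ALLOWED, None


store = WidgetStore()
results = [
    store.handle("POST", "/widgets", "Red widget"),
    store.handle("GET", "/widgets/1"),
    store.handle("PUT", "/widgets/1", "Blue widget"),
    store.handle("HEAD", "/widgets/1"),
    store.handle("DELETE", "/widgets/1"),
    store.handle("GET", "/widgets/1"),
    store.handle("DELETE", "/widgets"),
]
for status, payload in results:
    print(status.value, status.phrase, payload)
```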

HTTP Headers

Both requests and responses can have extra headers attached. If you’ve done any web development you’re bound to know a few common headers (like User-Agent and Referer) already, but there are many more that are useful when we’re dealing with APIs.

Request Headers

A classic use of a request header comes about when a resource can be represented in many forms, such as a product description available as an HTML page, XML document, JSON or a photo. In this case, a client might send its GET request with an Accept header indicating that it wants JSON, like this:

Accept: application/json

Another useful request header can be used to get resources that have been updated recently:

If-Modified-Since: Sat, 25 Oct 2008 19:43:31 GMT

Response Headers

Response headers contain meta-data about the requested resource. They can explain what format the data is (Content-Type), how long the data is valid for (the Expires header) or when it was last written (Last-Modified).

Don’t forget that you can use the HEAD method to get just the response code and headers back from the server.
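
Here is a small sketch of how a server might use those dates to choose between 200 OK and 304 Not Modified when a client sends If-Modified-Since. It's written in Python using only the standard library. The function name conditional_status is my own, and real servers take more into account (ETags, for one).

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_status(last_modified, if_modified_since):
    """Return 304 if the resource is no newer than the client's copy, else 200."""
    resource_time = parsedate_to_datetime(last_modified)
    client_time = parsedate_to_datetime(if_modified_since)
    return 304 if resource_time <= client_time else 200


# The date the client sends in its If-Modified-Since header.
client_copy = "Sat, 25 Oct 2008 19:43:31 GMT"

unchanged = conditional_status("Fri, 24 Oct 2008 08:00:00 GMT", client_copy)
changed = conditional_status("Sun, 26 Oct 2008 10:15:00 GMT", client_copy)
print(unchanged, changed)

# Going the other way: format a datetime as an HTTP date header value.
stamp = format_datetime(
    datetime(2008, 10, 25, 19, 43, 31, tzinfo=timezone.utc), usegmt=True
)
print(stamp)
```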

End Of Part Two

Hopefully you now have a decent grasp on the magic protocol behind the web. I find it really helps if you can think in terms of objects or resources instead of pages, then you’ll get a better feel for exactly what your browser is doing when you ask for a URL.

As always, feel free to ask questions via the comments, I’ll do my best to answer.

This post is part of a series on REST so if you’ve found it useful, subscribe to catch the other articles in the series.


Anatomy Of A URL

Posted on 12 Nov 2008 by Andy - Filed under   

I am writing a series of articles on RESTful web services but some research showed that a surprising number of people don’t understand the basics of what a URL is. So, in the interest of learning to walk before we can run, I’ll explain what a URL is and how the various components are used in modern web design.

URL stands for Uniform Resource Locator, as defined by this W3C standard. A URL identifies a unique item on the web. The item might be a web page, an image, a database item (like an eBay auction), an MP3 file or anything that can be represented on the web.

An item may have several different URLs, but each URL only points to one thing - although that thing may change over time.

A typical URL might look like this:

http://www.example.com/directory/page.html?param1=value1&param2=another%20param

That’s a pretty complex example, so let’s break it into its constituent components...

Protocol

The bit before the colon is the URL protocol, http in our case.

This tells the web browser how to talk to the server - how to ask for a resource, what will happen if the resource does not exist and so on.

There are several URL protocols available, but four are most common on the world wide web today:

  • http - HyperText Transfer Protocol is the protocol used by web servers. The page you are reading now was delivered via HTTP.
  • https - Secure HTTP is just the same as normal HTTP except that the transmissions are encrypted. If you enter passwords or your credit card details on a web site, you want to ensure that this protocol is used by checking for a padlock icon in your web browser.
  • mailto - This protocol allows for clickable email addresses.
  • ftp - File Transfer Protocol is used to manipulate files over the internet.

Double Slash

Our example has a double slash (//) after the protocol. This indicates that a domain name follows and, together with the protocol, makes the URL an absolute URL - one that does not need any context to resolve to a unique resource.

The opposite of an absolute URL is a relative URL. It makes no sense to type a relative URL into your browser’s address bar, but embedding one in a web page would indicate that the resource can be found relative to the URL of the page. If you want to know more about relative URLs, this article on UNIX relative paths should help, as relative URLs follow the same standards.
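
If you want to see how relative URLs resolve without reading up on UNIX paths, Python's urljoin does the same job a browser does. This is just a sketch: the page URL is our example URL minus its parameters, and the relative links are ones I made up.

```python
from urllib.parse import urljoin

# The page that contains the links.
PAGE_URL = "http://www.example.com/directory/page.html"

# Hypothetical links found in that page.
links = [
    "other.html",               # same directory as the page
    "../images/logo.png",       # up one directory, then down
    "/index.html",              # relative to the root of the site
    "http://www.example.org/",  # already absolute, so left alone
]

resolved = {link: urljoin(PAGE_URL, link) for link in links}
for link, absolute in resolved.items():
    print(link, "->", absolute)
```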

Domain Name

The next bit of an absolute URL is the domain name, www.example.com in the example. This identifies a computer (or cluster of computers) on the internet that stores the resource you want. Domain names are not case sensitive so www.example.com and WwW.eXAMplE.COM are equivalent.

Path

The URL path tells the server how to find the resource that you require. The path in our example URL is /directory/page.html. Unlike domain names, paths are generally case sensitive.

You might think that the .html indicates that the resource is an HTML file but that is not necessarily true! Your web browser will check the Content-Type header of the response (and sometimes the contents of the file) to determine what kind of resource it is and how it should handle it. If you are writing any code that downloads from the internet, you should do the same.

CGI Parameters

The end of our URL has two parameters, param1 and param2.

Parameters are optional for all URLs and their presence is indicated by the question mark (?). An ampersand (&) is used to separate multiple parameters.

Parameters can also be assigned a value using an equals sign (=), as is the case in our example.

param1’s value is “value1” but param2’s value is a bit more complex - it contains a space!

URL components can only contain certain characters: the letters A-Z and a-z, the digits 0-9 and a handful of symbols such as dashes, underscores, dots and tildes. All other characters must be “URL encoded”, that is, translated into a percent sign (%) followed by their ASCII hex value.

ASCII translation can also be used in other parts of the URL, but it’s not recommended - can you imagine reading out the URL over the phone and saying “H-T-T-P-colon-slash-slash-A-B-percent-twenty-C”? Ridiculous!
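
To see all of those components pulled apart by a machine, here is a short Python sketch that splits our example URL using only the standard library. It also decodes the parameters and encodes a value with a space in it.

```python
from urllib.parse import parse_qs, quote, urlsplit

URL = "http://www.example.com/directory/page.html?param1=value1&param2=another%20param"

parts = urlsplit(URL)
print("protocol:", parts.scheme)
print("domain:  ", parts.netloc)
print("path:    ", parts.path)
print("query:   ", parts.query)

# parse_qs splits on "&" and "=", and undoes the URL encoding.
params = parse_qs(parts.query)
print(params)

# quote() goes the other way, encoding characters that are not allowed.
encoded = quote("another param")
print(encoded)
```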

Other URL Components

There are many other URL components that you might encounter that weren’t present in our example.

Port Numbers

Some URLs add a port number to the domain name, like this:

http://www.example.com:8080/page.html

Most web servers operate on default ports, but sometimes another port might be specified. The default ports are:

  • Port 80 - HTTP
  • Port 443 - HTTPS
  • Port 21 - FTP

Specifying a different port does not mean you can supply an incorrect protocol in the URL - trying to talk HTTP with an FTP server will fail.

Usernames & Passwords In URLs

It is possible, although rare, to specify a username and password inside a URL. In this case, the username and/or password are supplied before the domain name, like this:

ftp://username:password@hostname/

I hope I don’t have to tell you just how insecure it would be to embed a URL like this in a web page.
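
Port numbers and login details can be picked out of a URL in code too. This is a minimal Python sketch. The DEFAULT_PORTS table just repeats the list of default ports above, and effective_port is a helper name I invented.

```python
from urllib.parse import urlsplit

# The default ports listed earlier in this article.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def effective_port(url):
    """Return the port in the URL, or the protocol's default if none is given."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)


explicit = effective_port("http://www.example.com:8080/page.html")
implied = effective_port("https://www.example.com/page.html")
print(explicit, implied)

# Usernames and passwords sit in front of the host name.
login = urlsplit("ftp://username:password@hostname/")
print(login.username, login.password, login.hostname)
```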

You Should Now Know All About URLs

That pretty much covers the basics of URLs. Feel free to experiment with your own web spaces - there’s nothing that can go wrong with asking a webserver for a resource via a URL. Ask questions in the comments too, I’ll do my best to answer.

This post is part of a series on REST so if you’ve found it useful, subscribe to catch the other articles in the series.


Creative Commons licensed photos by Laughing Squid and dailyinvention.


Programming Collective Intelligence

Posted on 28 Oct 2008 by Andy - Filed under  

Since the web first took off, I have found myself buying fewer and fewer computing textbooks as reference documentation moved online and blogs provided a wealth of how-to articles. I still sometimes scan the computing shelves of my local bookstores in idle moments and that is how I chanced across Programming Collective Intelligence by Toby Segaran.

Programming Collective Intelligence cover

The book is subtitled “Building Smart Web 2.0 Applications” and that is very appropriate. It is aimed squarely at web developers who, like me, are fascinated by the interactive nature of modern web applications and the use of machine-learning algorithms that make use of all the juicy data collected by the likes of eBay, Amazon and del.icio.us.

This is not a book that will teach you how to program or how to design a website - it is aimed squarely at competent, experienced back-end web developers who want to see the algorithms behind some of the world’s most successful websites.

Wonderful Examples

The examples contained within the book are its greatest strength.

Toby Segaran chose to use Python throughout the book, a wise choice. Although I have little Python experience, I found it a very readable language, and Toby deliberately avoids language-specific tricks and obscure libraries.

Each chapter introduces algorithms that solve a specific problem, including recommendation engines, categorisation, search engines, optimisation and more. Open APIs (such as the del.icio.us API) are used where possible and the example code is structured in a very modular, pluggable manner. Readers are encouraged to experiment via the Python shell.

Practical Introductions

The book introduces each algorithm with an overview that does not resort to intense mathematics, which was great for me since I promptly forgot most of my maths after graduating. Compare this from the book with the Wikipedia article on the same subject.

The author correctly surmises that most readers will not need to implement common algorithms from scratch but will use well-constructed third-party libraries, and so do not need to know a great deal of academic detail about each technique.

The book really lends itself to being used as a reference when searching for an appropriate algorithm in repositories like CPAN: chapter 12 provides a summary of the rest of the book with each algorithm’s strengths and weaknesses clearly presented.

Conclusion

A book for the hardcore geek? Yes, as my girlfriend pointed out when I showed her my purchase. But also a book for programmers with a healthy curiosity, as she later said “That sounds really interesting!”


Creative Commons licensed photo by vj_pdx.


Microsoft Embrace And Extend Microformats With hSlice

Posted on 28 Oct 2008 by Andy - Filed under  

Internet Explorer 8 beta has an interesting new feature: hSlice support.

An hSlice is a small chunk of a web page that you can subscribe to with your browser, like RSS but for a section of a page instead of the whole thing. The most obvious application of this technology is for e-commerce sites so you can wait for prices to drop or stock levels to replenish before placing an order but I have no doubt that once the technology gains wider adoption, a wide range of interesting and unpredicted applications will emerge outside of the e-commerce arena.

Will hSlices Be Adopted?

Yes. The idea is a good one and it should appeal to people who already use RSS. There are already a number of Firefox extensions that are bringing hSlice support to the geeks’ favourite browser.

Hopefully, RSS aggregators like Google Reader and NewsGator will add hSlice support soon and then widget systems like iGoogle will quickly follow. The other major browsers should be able to add hSlice support without too much development effort since they all have RSS support and Safari already does something quite similar with web snippets.

A hugely important difference between RSS/Atom feeds and hSlices is that the former are (usually) used to indicate new pages being added to web sites, whilst hSlices show changes to an existing page. Stop and think about that for a moment, that’s a massively important change and it will have an impact on every aspect of the web.

But It’s A Microsoft Standard And They Are Evil!

Calm down dear, your tinfoil hat is slipping.

Seriously, I think there is a need for this technology and, as I said, Apple are doing something similar with Safari’s web snippets. The microformats.org mail discussion list has given a tentative welcome to the new arrival, but pointed out that it would have been nicer to be involved in the naming/design.

It’s good to see Microsoft actually innovating on the web, instead of playing catch-up with the open sourcerers and it’s particularly gratifying to see that innovation taking the form of an open standard.

hSlice Markup

Adding hSlices to your HTML is pretty straightforward, though you’ll need to update the server code to serve updates. Just as with serving hAtom (and most other microformats), you add classes to standard HTML tags, like this:


<div class="hslice" id="1234">
<p class="entry-title">Buy 1 doz eggs</p>
<p class="entry-content"><img src="eggs.jpg" alt="Eggs"/> £2.68 per dozen</p>
<a rel="feedurl" href="http://www.mmmeeja.com/slice-1234.xml">Subscribe to Feed</a>
</div> 

There can be more than one slice per page but each must have a unique ID.

The “feedurl” anchor is optional and, if it is not present, clients are expected to download the entire page and extract the hSlice using its ID.
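
As a rough idea of what a client has to do, here is a sketch in Python that pulls the slices out of a page using the standard library's HTML parser. The HSliceFinder class is my own invention and is far simpler than a real implementation would be. For instance, it assumes slices are not nested and ignores entry-content altogether.

```python
from html.parser import HTMLParser

PAGE = """
<div class="hslice" id="1234">
<p class="entry-title">Buy 1 doz eggs</p>
<p class="entry-content"><img src="eggs.jpg" alt="Eggs"/> £2.68 per dozen</p>
<a rel="feedurl" href="http://www.mmmeeja.com/slice-1234.xml">Subscribe to Feed</a>
</div>
"""


class HSliceFinder(HTMLParser):
    """Collect the title and optional feed URL of each hSlice, keyed by ID."""

    def __init__(self):
        super().__init__()
        self.titles = {}
        self.feeds = {}
        self._current_id = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "hslice" in classes:
            # Start of a new slice: remember its ID.
            self._current_id = attrs.get("id")
            self.titles[self._current_id] = ""
        elif self._current_id is not None:
            if "entry-title" in classes:
                self._in_title = True
            if attrs.get("rel") == "feedurl":
                self.feeds[self._current_id] = attrs.get("href")

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the title text.
        self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.titles[self._current_id] += data


finder = HSliceFinder()
finder.feed(PAGE)
print(finder.titles)
print(finder.feeds)
```

If a slice has no entry in feeds, the client would fall back to re-downloading the whole page and looking the slice up by its ID.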

So, my feeling on hSlice should be pretty clear by now. I don’t care that it’s from Microsoft, it’s going to make for some exciting new web tools and technologies.

DO WANT!


Creative Commons licensed photo by petoo.


Twitter Serendipity

Posted on 09 Oct 2008 by Andy - Filed under  

Just a short, throw-away post today as I have a good amount of work on at the moment, but this was too good not to share!

Marshall Kirkpatrick, RSS guru and blogger for ReadWriteWeb was pimping his latest post on Twitter which was immediately followed by a tweet from Barry Carlyon from Leeds Student Radio.

The results could not have been scripted...

Marshall: How much do top tier bloggers and social media consultants get paid? Barry: Cookies!

Made me chuckle anyway.


Social Media Best Practices

Posted on 25 Sep 2008 by Andy - Filed under  

I’m honoured to have been chosen to take part in a round of blog tag discussing best practices for online social media. Thanks to Kim Woodbridge for the tag and to Mitch Joel for kicking off the project.

The posts so far have been focussed on remaining calm and not starting a troll-fest, with Kim recommending that we reflect, Ari Herzog saying pause before hitting the submit button and David Bradley saying we should be nice.

This is all good advice but as this is a technical blog, I will give you a technical response...

Automate!

I certainly don’t mean that you should use scripts to spam your blog into every social media service, but we responsible netizens can extend our presence by employing some clever automation.

There are, literally, thousands of on-line communities, blogs, digg-clones and forums, so we cannot keep an eye on them all without spreading our time too thinly. So I want a notification when I am (or my brand is) mentioned, so I can follow up quickly.

You see, I found out that Kim had tagged me for this blog meme when an automated Google search for my name twittered that I’d been mentioned on a web page. I have similar alerts set up for a variety of keywords and can respond quickly thanks to the automated searches.

There are a surprising number of ways that automation can help you manage the firehose of information and social interaction that being an active participant in social media brings. Have you tried any of these?

There are many, many, many more great tweaks and tips to succeeding with social media, but mine is to automate.

Next...

I pass the baton on to the most beautiful people:

Oh, one more - Robert Scoble, c’mon baby tell us your best practices for social media!


Automatically Twitter Your Sphinns

Posted on 23 Sep 2008 by Andy - Filed under  

I last posted about twiggit, an automated system to inform the world of your diggs via twitter. In the comments, ecreeds asked whether a similar service for Sphinn existed. I don’t think one does, but here are instructions on using TwitterFeed to do something similar.

It’s Easy

  1. Sign up for twitterfeed, if you don’t already have an account. If you need an OpenID, you can set one up on any Yahoo account - everybody has a Yahoo account don’t they?
  2. Head over to your Sphinn profile page and choose whether to tweet just your Sphinn submissions, or every vote. Choose either the submits or the sphinns tab.
  3. Notice the RSS icon next to the tab bar? That gives you a useful feed of each of your submissions or votes. Right-click and copy the link location.
  4. Head over to your twitterfeed account and add a new feed. You’ll need to change the twitter username and paste in the URL of your Sphinn feed.

That’s all there is to it. When your feed is processed, you should see a tweet from your account like this:

Automated Sphinn Tweet

Most social networking sites provide feeds of your interactions so you could use TwitterFeed to do something similar for del.icio.us, stumbleupon, youtube, flickr and more but beware that if you automate too much, people will unsubscribe from you because your signal to noise ratio is too low. FriendFeed is a much better place to stream all your web 2.0 feeds.

Hopefully that answers ecreeds’ question. If you have any similar queries or want help with anything to do with the web or social media, leave a comment.


A Review Of Twiggit - Automatically Tweet Your Diggs

Posted on 08 Sep 2008 by Andy - Filed under  

Jason from Twiggit dropped me an email a few days ago, asking if I’d like to review his site. He was polite and not spammy so, of course, I said yes.

The Twiggit.org Logo

What Is Twiggit?

Twiggit is a service that will automatically add your Digg submissions and votes to your Twitter stream. It is highly configurable, allowing you to choose to tweet just items you submit, or positive votes too. There are other options to change how often it checks Digg, pause operation or delete your account.

Like many Twitter applications, it requires your Twitter password to work - a limitation of the Twitter API. A lot of the Twitter community really want to see the long-promised adoption of OAuth to replace this, but there’s no sign just yet.

It doesn’t need your Digg password, thankfully, since the Digg data can be read publicly.

In Action

Twiggit produces nicely formatted tweets that use TinyURL to perform URL shortening. You can see an example of one of my twiggit tweets displayed in TwitterFox, below:

A tweet by Twiggit

When I first signed up for the service, Twiggit picked up one of my Digg votes from the evening before, which was a bit alarming although there have been no other glitches. It seems to be a well-rounded and professional piece of software.

Who Will Benefit?

I think that Digg power-users that organise their networks via Twitter will be best served by this software. As we have seen from the latest MrBabyMan controversy, top users submit and vote on hundreds of stories every day and anything that can lighten their load will pay dividends in time savings.

For the record, I don’t think that MrBabyMan should be banned, and I don’t know if he organises his network via Twitter - or even if he has a network. I do think that Digg’s algorithm is too heavily biased in favour of existing power users but I really do not care about the rest of this spat.

You Could Make This With Yahoo Pipes And TwitterFeed

Yes you could, but why bother? Twiggit works very well and you can get it up and running in seconds. I have made good use of both Yahoo Pipes and Twitterfeed before, but Twiggit just works.

Will you be using Twiggit? Want one for Reddit? Leave a comment.


 


Copyright © 2006-2008 MMMeeja Ltd. All rights reserved.