
The Real Reasons Behind Google Buying MetaWeb

Posted on 29 Jul 2010 by Andy

So Google bought MetaWeb a couple of weeks ago. That’s old news in the fast-moving world of tech company acquisitions, but very few commentators have really understood what that means for the future of the web so I wanted to write this post to further the discussion.

What Is Freebase?

Freebase is MetaWeb’s flagship product and is the central reason for Google’s purchase.

It is often touted as a database of things - AKA an entity database - and grew out of a project to add semantic data to Wikipedia articles. The result is a beautifully curated database of companies, people and events.

Freebase does provide web pages for its topics, but the real strength of the database is that it provides an RDF representation for each of its topics. This is hugely important for people building linked data, where subjects and objects are links to RDF documents.

Freebase Has Authority

There’s much more to Freebase than just things - a big part of its database is concepts. Basic concepts like North, Aluminium, House, Kitten etc are also present. These RDF documents are the very foundations of the semantic web - an enormous number of third parties use them to describe their own entities.

So if I want to create some linked data stating that my shoes are white, I would link to Freebase’s representation of white, rather than creating my own.

Similarly, if I wanted to find a set of people who have white shoes, I would start at Freebase’s white node and traverse the link graph searching for white shoes and their owners.
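
Here’s what that looks like in practice - a minimal sketch using Python’s rdflib library. The shoe and predicate URIs are invented for illustration, and the Freebase URI simply follows the en.topic pattern rather than being a verified identifier.

# A minimal linked-data sketch using rdflib (pip install rdflib).
# The shoe and predicate URIs are hypothetical; the Freebase URI follows
# the en.<topic> pattern but is not a verified identifier.
from rdflib import Graph, Namespace

FB = Namespace("http://rdf.freebase.com/rdf/en.")
EX = Namespace("http://example.com/andy/")  # my own, made-up namespace

g = Graph()
# "My shoes are white" - the object is Freebase's node for white,
# not a colour definition of my own invention.
g.add((EX["my_shoes"], EX["has_colour"], FB["white"]))

print(g.serialize(format="turtle"))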

All this means that Freebase is the Wikipedia of the semantic web:

  • It has lots of inbound links
  • It does not link out
  • It has age and human-curated data
  • It has authority!

What Does Freebase Mean For Google?

Google just bought a big chunk of the semantic web (relatively cheaply) with only one real competitor - DBpedia. OWL’s sameAs property for mapping entity equality pretty much takes care of any competition from DBpedia (from an indexing and linking point-of-view rather than a commercial perspective).
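
For the curious, an owl:sameAs assertion is a single triple. Here’s a quick rdflib sketch with illustrative URIs:

# Sketch: asserting that a Freebase topic and a DBpedia resource denote
# the same entity via owl:sameAs (both URIs chosen for illustration).
from rdflib import Graph, URIRef
from rdflib.namespace import OWL

g = Graph()
g.add((
    URIRef("http://rdf.freebase.com/rdf/en.london"),
    OWL.sameAs,
    URIRef("http://dbpedia.org/resource/London"),
))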

Freebase gives Google an instant foothold into the web of linked data and you better believe Google knows a lot about links!

As more and more web documents get enhanced with semantic markup, Google will be indexing and ranking that data. It is a good bet that the search engine is going to enhance its results using that data. I would put money on a new onebox appearing for select queries and displaying factual data much like Wolfram|Alpha.

I’m also hoping that Google will provide some new APIs that provide very fast graph traversal for all this data.

What Does It Mean For Web Publishers?

The semantic web is heating up and with all this investment from some big players I think we’ll see consumer applications emerge soon. When that happens, the linked data graph will become another SEO battleground.

Web site owners should prepare for that future by publishing linked data about their company, products and services right now (I’ve been advocating semantic search optimisation for a while now).

Build authority for your data by:

  • Capitalising on your current domain authority
  • Publishing accurate and timely data
  • Building links to your entities

Some SEOs complain about Google’s love for Wikipedia but, unless they start paying attention to the linked-data web, the same thing will happen again.

Shout out to Chris Lewis - at least one SEO gets it.




Creative Commons licensed photo by Eric M Martin.


Twitter Annotations Are A Big Deal

Posted on 22 Apr 2010 by Andy

Contents

  1. Annotations
  2. Namespaces
  3. Semantic Web
  4. Size matters
  5. Consuming
  6. Displaying
  7. Creating
  8. Further Reading
  9. Conclusion

Twitter’s Chirp developer conference had two big announcements:

  1. The purchase of the Tweetie iPhone application
  2. The addition of meta-data payloads to tweets - called annotations

The first announcement was greeted with dismay by many twitter application developers - Oh noes! They is eatin our lunches! - but the more far-sighted commentators focussed on the possibilities that annotations will bring.

Robert Scoble has a great post covering the basics of annotations and twitter themselves are gradually releasing more information as the details get hammered out.

What Is An Annotation?

Simply, an annotation is some extra data that can be associated with each tweet - data that followers might never see.

Each annotation has three fields: a namespace, key and value - and each tweet can have several annotations.

The namespace explains what the annotation describes. It could be a book, a meal, a place or pretty much anything.

The key and value provide data within the context of the namespace - the author of the book, price of the meal, etc.
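
Since the wire format is still being hammered out, here’s one plausible shape for a set of annotations, sketched as a Python literal rather than anything Twitter has published:

# One plausible shape for tweet annotations - an illustration only, since
# Twitter had not finalised the format at the time of writing.
annotations = [
    {"namespace": "book", "key": "title", "value": "On Liberty"},
    {"namespace": "book", "key": "author", "value": "John Stuart Mill"},
    {"namespace": "meal", "key": "price", "value": "12.50"},
]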

It will be up to twitter clients to create and display annotations as they see fit. So all those Chirp attendees that moaned about Tweetie becoming the official iPhone app for Twitter should stop worrying about dealing with just 140 characters because Twitter just gave them a huge new sandbox to play in.

Namespaces Are The Key

Namespaces are a means of describing the context of an annotation.

Early indications are that Twitter will allow any text as the namespace value, leading some people to call for a centralised authority for namespace registration. That would be counter-productive for developers, sacrificing flexibility for the sake of consensus on what each namespace means.

Far better to take a leaf out of the semantic web’s playbook and have the namespace describe itself. Make your namespace a URL that points to an XML document describing the data (keys and values) that can exist within the namespace.

A good example is the FOAF RDF schema at http://xmlns.com/foaf/spec/index.rdf.

A big advantage of this approach is that schemas can be extended and combined with ease - and without having to ask permission from a central authority.
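
A consuming client can then resolve the namespace URL and parse whatever schema document lives there - a tiny sketch using Python’s rdflib, with FOAF as the example:

# Sketch: fetch and parse a self-describing namespace document.
from rdflib import Graph

schema = Graph()
schema.parse("http://xmlns.com/foaf/spec/index.rdf")
print(len(schema), "triples describing the FOAF vocabulary")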

Semantic Web Annotations

Much of the semantic web is presented as RDF triples, which can be combined to describe almost anything.

A triple has a subject, predicate and an object. For example:

andymurd checked in at Rundle Mall, Adelaide

gives us:

subject: andymurd
predicate: checked in at
object: Rundle Mall, Adelaide

Each of the subject, predicate and object can be represented as a URI - in the above example, the subject might be http://foursquare.com/user/andymurd.

But twitter will only provide key-value pairs, not triples, so we must fit our RDF ontologies into this model:

namespace: http://www.w3.org/1999/02/22-rdf-syntax-ns
key: subject
value: http://foursquare.com/user/andymurd
key: predicate
value: checked_in_at
key: object
value: http://rdf.freebase.com/rdf/en.rundle_mall_adelaide
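
Here’s a quick sketch of that flattening in Python - the function and field names are mine, not part of any announced API:

# Sketch: flattening an RDF triple into annotation key/value pairs under
# the RDF namespace, following the mapping above (illustrative only).
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns"

def triple_to_annotations(subject, predicate, obj):
    """Turn one (subject, predicate, object) triple into three annotations."""
    return [
        {"namespace": RDF_NS, "key": "subject", "value": subject},
        {"namespace": RDF_NS, "key": "predicate", "value": predicate},
        {"namespace": RDF_NS, "key": "object", "value": obj},
    ]

print(triple_to_annotations(
    "http://foursquare.com/user/andymurd",
    "checked_in_at",
    "http://rdf.freebase.com/rdf/en.rundle_mall_adelaide",
))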

Many linked data tweeps are justifiably excited about the potential of embedding an RDF payload in tweets, and I think they are right!

Keep It Short

Twitter will be limiting the size of annotations (initially just 512 bytes) so we need to keep our meta-data succinct.

A lot of URLs for RDF ontologies are quite long, as they include versioning information, so I expect that many developers will make use of URL shorteners for annotations too.
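
A quick sanity check for payload size - assuming the limit applies to the serialised JSON, which is my guess rather than a published detail:

# Sketch: check a set of annotations against the (initially) 512-byte cap.
# Whether the limit counts serialised JSON bytes is my assumption.
import json

MAX_ANNOTATION_BYTES = 512

def fits_limit(annotations):
    payload = json.dumps(annotations, separators=(",", ":"))
    return len(payload.encode("utf-8")) <= MAX_ANNOTATION_BYTES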

It is also likely that a standard will emerge to move meta-data into an external document in order to overcome the size limitations - some kind of "See Also..." for annotations. This would also allow editing of annotations (something which Twitter doesn’t plan to provide) but would introduce security implications for application developers.

Consuming Annotations

Semantic data is produced for machines - typically search engine indexers or graph query tools, and now we can add twitter bots and clients to that list.

Open, discoverable standards are important for communication between these consumers and RDF has a broad base of support. Google already does a good job of indexing RDF and microformats and using the data to enhance its ten blue links with relevant information about product reviews, document authors and more. I really want to see that integrated with their realtime search results.

Yahoo technologies like YQL and BOSS can facilitate search mashups that make use of RDF too. Hopefully we’ll see some twitter SearchMonkey plugins shortly after annotations are released.

Twitter adds more data into the mix - tweets have authors, timestamps, replies, locations - as this excellent tweet infographic shows. One issue for data consumers to tackle is to decide whether these are relevant to the annotation.

Displaying Annotations

We’re about to enter an era of much richer twitter clients. They will be capable of displaying video, photos and maps, playing mp3s, and much more.

Developers will need to consider which annotation namespaces deserve to be displayed to their users. Certainly some equivalent of the media RSS standard would be a prime candidate.

Other namespaces will gain authority as de facto standards with developer support and we should be looking to existing web meta-data formats to predict which will be implemented in twitter clients first. Microformats like hCard and hReview are an obvious first choice but new ontologies will be created to exploit the real-time nature of twitter.

How about a standard for location based services (Gowalla, FourSquare, BrightKite, et al)?

Pluggable twitter clients (like Seesmic) will become more common and a supplemental developer eco-system will emerge for third party plug-ins that manipulate annotations. Maybe we will eventually see a standard for twitter client plug-ins.

I would like to see web-based twitter clients (maybe even twitter.com) publishing RDFa (HTML & RDF mixed together) where the annotations are appropriate. It would also be great to see semantic data mixed into Google’s realtime search results.

Of course, spammers will try to exploit any security loopholes in a twitter client’s annotation handling, so annotations published on the web will need to be sanitised like any other user generated content.
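
At minimum that means escaping annotation values before they hit the page, as in this tiny Python sketch:

# Sketch: treat annotation values like any other untrusted user input
# before rendering them into HTML.
import html

def render_annotation_value(value: str) -> str:
    return html.escape(value, quote=True)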

Creating Annotations

A very big job for twitter application developers will be building user interfaces to create annotation data. The semantic web is lacking a simple UI that makes it easy for everyone to create linkable data.

I don’t believe it is practical to automatically derive accurate semantic data from the 140 characters of free text that make up a typical (manually created) tweet. However, many websites integrate with twitter already (@andymurd favorited a video on YouTube etc.) and they will be well-placed to automatically add annotations to tweets. RDF/SPARQL equivalents of twitterfeed and tweetmeme will also emerge.

My hope is that application developers will rise to the challenge of providing simple user interfaces that allow everyone to easily create linked data and share it through twitter. All the semantic web authoring tools I’ve tried have been complex, unwieldy things that need in-depth technical knowledge to use effectively.

I think that the twitter developer community can change all that by focussing on the user experience.

Further Reading

Lots of people have been discussing the potential of annotations:

There’s a Google Group with some good ideas.

There are also quite a few blog posts on the subject.

Several initiatives have attempted to utilise twitter messages for transmitting semantic data. Every twitter user is aware of hashtags but interested readers should also check out what RoboCrunch, SemanticTwitter and TwitterFormats have been up to.

Conclusion

These are exciting times for twitter developers and semantic web proponents but there will be some big challenges ahead:

  • How do we promote open, extensible namespaces?
  • How are spammers likely to exploit annotations?
  • How can we get users to love, create and use annotations?
  • We’ll need an icon to indicate that a tweet has annotations!

All these challenges must be solved whilst remembering that twitter is a tool for humans. We must add value through annotations, value that makes people want to use the new breed of rich twitter clients that leverage this technology.

These problems are not beyond us and I believe that twitter could provide the impetus to make the semantic web a part of our daily lives.


Creative Commons licensed photos by Shira Golding and MiriamBJDolls.


Ten Useful Wolfram|Alpha Searches

Posted on 18 May 2009 by Andy

So the Wolfram|Alpha engine launched last weekend to a great deal of fanfare, but the reactions from blogs and the twitterverse show that a lot of people just don’t get it.

This is not surprising: we’ve been conditioned by Google et al to type in a phrase and expect ten blue links. So when you first accessed the Wolfram|Alpha input box, what did you type? Your name? A natural language question?

I bet the answer to your first query was the disappointing:

Wolfram|Alpha isn't sure what to do with your input.

That’s because Wolfram|Alpha is not a search engine - it’s a knowledge inference engine, which is why so many people struggle to get the best out of it. It deals with facts, maths and statistics, and it deals with them very well.

In this post, I’ll show you ten useful queries that should build the right kind of mindset and encourage you to experiment with the tool a bit further.

1. Stock Comparisons

How about a nice chart of Microsoft versus Google? Try the query "Microsoft Google stock chart".

2. Complex Mathematics

Stephen Wolfram wrote the excellent Mathematica program so you can bet that Wolfram|Alpha will kick ass at maths.

It can solve quadratic equations and differentiate them, and even plot fractals such as the Julia set.

3. Date Manipulation

What is the date of the first Tuesday in May, next year?

Most of us would flick through a calendar to answer a question like this, but Wolfram|Alpha can save us time:

It’s the 4th!

4. Analyse Sports Statistics

Despite knowing nothing about baseball, I can check the histories of the Boston Red Sox and the New York Yankees. That’s probably useful to someone less geeky than me.

5. Find Out Flight Times

Enter two city names to see the distance between them and the average flight time. Here is London to New York.

6. Show Movie Casts

Not as detailed as IMDB but pretty handy all the same.

7. View Currency Fluctuations

Here is a query of definite use to me right now - five years of data for the Australian dollar vs the UK pound.

8. Show Website Traffic Estimates

Using Alexa traffic data, Wolfram estimates that apple.com receives 11,111 visits per minute.

9. Compare Chemical Compounds

Very useful for chemistry homework, here’s Methanol compared with Ethanol, complete with a thermodynamic comparison.

10. Calculate Your Mortgage Payments In Plain English

Who hasn’t wanted to cut through the financial gibberish and get to the bottom line quickly and easily?


Whilst researching this post, I hit a few limitations of the Wolfram|Alpha engine (and user interface) - in particular a bias towards place names, which is not great when there are places like Dollar in Scotland and Pound in Wisconsin - but a bit of experimentation usually brings you to the correct syntax.

In all, I think the Wolfram|Alpha launch has been a success and I look forward to seeing just what future improvements are in store. In the medium term, I hope that the API is extended to allow third party developers to use the inference engine to process their own data - that would be very cool.

Have you found any interesting queries that showcase Wolfram|Alpha’s engine? Leave a comment.


Evolving User Interfaces For Semantic Search Engines

Posted on 23 Feb 2009 by Andy

Semantic search engines are starting to appear on the fringes of mainstream web, and thanks to Yahoo’s BOSS/SearchMonkey integrations they are likely to get a lot more prevalent. However, a vital component needs to be overhauled before my mum is going to use them - the user interface.

Let’s start by having a look at current search engine UIs in common use today.

A Single Text Box

At present, search engines just employ a single text box for users to enter a summary of their goal. This works well for most text-based searches, not least because users have learned to modify their behaviour to get the most out of the search technology.

Most users type one to three noun phrases, examine the results and then either drill into the ten blue links or refine their query, often returning to previous queries before they reach their goal.

One of the biggest challenges for semantic search developers will be to modify the users’ learned behaviour. Longer queries give more accurate results, both for text search and semantic search.

Advanced Search

Many search engines offer an “advanced search” option, which takes the user to a lengthy web form comprising optional search fields. Such complexity makes for a horrendous user experience, as Google discovered when large numbers of users viewed the form but left before entering any data.

Faceted Search

“Faceted Search” is a technical term for the filters that you often see on e-commerce sites. For example, a user might search for “adidas shoes” and then supply extra criteria by clicking filters for “Men’s shoes”, “Under £100”, “White” etc.
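
Under the hood, each filter is just another constraint intersected with the current result set. Here’s a toy Python sketch of the idea, with made-up data and field names:

# Toy sketch of faceted filtering: every selected facet adds one more
# constraint to the result set. Data and field names are invented.
products = [
    {"name": "Adidas Runner", "dept": "men", "price": 89, "colour": "white"},
    {"name": "Adidas Classic", "dept": "women", "price": 120, "colour": "black"},
]

def apply_facets(items, **facets):
    """Keep only the items that match every selected facet value."""
    return [
        item for item in items
        if all(item.get(field) == value for field, value in facets.items())
    ]

print(apply_facets(products, dept="men", colour="white"))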

Faceted search is a great improvement in usability where the number of facets is low, although multiple page reloads can be problematic for people with slow connections or those using small devices like mobile phones.

Repeatability can be an issue for faceted search too. Remembering the search phrase and then the sequence of four clicks that got you to your favourite shoes is onerous for the user, but the “you recently viewed” feature of sites like Amazon really helps.

It’s not just e-commerce sites that have faceted search: Google’s Image, Blog and Finance searches are all facets too.

Natural Language Search

From Ask.com to Powerset, there’s always been rather more hype than substance surrounding natural language search. The technology performs well for simple queries like “How old is Barack Obama?”

Ask.com states that Barack Obama is 47 years old, while Powerset returns his date of birth - both producing better results than the equivalent Google search.

Ask a question that has two or more facets, and these engines fall back to text searches. For example, “Which English philosophers were also classical liberals?” would require an intelligent engine to find the list of English philosophers, then find which are mentioned in the page on classical liberals, intersect the two sets and provide the results.

SPARQL & MQL

Neither SPARQL nor Freebase’s MQL are user friendly, but they’re not designed to be. They are designed to answer complex queries like “Which English philosophers were also classical liberals?”

If you’re interested, here is the SPARQL to perform that query against DBpedia:


PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?name WHERE {
     # People appearing in both Wikipedia categories
     ?person skos:subject <http://dbpedia.org/resource/Category:English_philosophers> .
     ?person skos:subject <http://dbpedia.org/resource/Category:Classical_liberals> .
     ?person foaf:name ?name .
     # Restrict to people with an English-language description
     ?person rdfs:comment ?description .
     FILTER (LANG(?description) = 'en') .
}
ORDER BY ?name

And here are the results:

{ "head": { "link": [], "vars": ["name"] },
  "results": { "distinct": false, "ordered": true, "bindings": [
    { "name": { "type":"literal", "value": "Herbert Spencer" }},
    { "name": { "type":"literal", "value": "Jeremy Bentham" }},
    { "name": { "type":"literal", "value": "John Locke" }},
    { "name": { "type":"literal", "value": "John Stuart Mill" }} ] } }

Horrible, but perfectly accurate and a real answer, not ten blue links.
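
If you want to run that query yourself, the SPARQLWrapper Python library can submit it to DBpedia’s public endpoint (I’ve trimmed the description filter for brevity):

# Sketch: run the query above against DBpedia's public SPARQL endpoint
# using SPARQLWrapper (pip install SPARQLWrapper).
from SPARQLWrapper import SPARQLWrapper, JSON

QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?name WHERE {
    ?person skos:subject <http://dbpedia.org/resource/Category:English_philosophers> .
    ?person skos:subject <http://dbpedia.org/resource/Category:Classical_liberals> .
    ?person foaf:name ?name .
}
ORDER BY ?name
"""

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)

for binding in sparql.query().convert()["results"]["bindings"]:
    print(binding["name"]["value"])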

The query syntax comes about from having a large number of possible facets to cover - far more than could reliably follow the e-commerce model of applying filters.

The Coming Challenge

User interface designers are going to have to work hard to bridge the gap between users who are used to a single text box and the complex query syntax of SPARQL and MQL.

I think the first thing we all need to recognise is that that damn text box is not going away! Users know that’s how to interact with a search engine and you’re not going to change their minds anytime soon.

The ten blue links, however, can be scrapped if we can determine that the user is searching for a definitive answer (not just looking for “funny jokes” or “cat pictures”). Instead, a summary of the data available should be presented together with a number of appropriate facets to drill down into the results.

The interface components representing facets should be appropriate to the query - use date pickers for dates, drop-downs for lists of countries etc. Lead the user forward using visual cues that they are already comfortable with, like the timeline in Google’s experimental timeline search.

Here’s a very usable faceted search of Nobel Prize winners, although it could benefit from allowing use of the browser back button. You might also like to play around with MedStory, a clinical information search engine that has eye-catching filters, but is let down by the confusing pop-up interface.

Perhaps we can learn from off-line data analysis tools - I’m thinking that techniques like OLAP cubes or Excel’s pivot tables might be made more web-friendly. Maybe the Hollywood cliché 3D interface can help, but I doubt it.

There’s been some good progress made in semantic search technologies but now we need to start thinking about users. How will they interact with the tools? How will semantic search change the way we work online? What limitations will people hate?




Creative Commons licensed photo by dullhunk.


My Prediction for 2009

Posted on 29 Dec 2008 by Andy

Every good blogger must stick their neck out and guess at what next year will hold. Never one to buck a trend or miss a bandwagon, this is my prediction...

Are you ready? Here it is:

We’ll get closer to the Star Trek ideal... semantic search

You know what I mean: that perfect example of human-computer interaction: “Computer, where is Commander Riker?”, “Commander Riker is on the holodeck on level four, sobbing uncontrollably”.

Ask a (simple) question, get an answer. We’re close now but we’ll be a lot closer by the end of 2009 because next year will be the year that semantic web applications grow beyond academic exercises and finally become usable by real human beings. The least disruptive application, and therefore the first that will be in common use, is search.

Semantic Search

Search is not disruptive because the switch from the text/graph analysis algorithms that we use now to semantic systems is invisible - users ask the same questions but they’ll start getting better answers.

They’ll continue to ask short one-to-five-word questions but get used to shorter answers. Instead of ten Google results per page, we’ll see 140-character twitter-style answers to queries - and trust them!

Who Are The Players?

Twitter have been building a huge database of short but content-rich tweets. Twine is building its user base quite successfully and Facebook is hoarding data like Scrooge McDuck hoards gold. All this data is worthless until it is analysed and made useful to us all.

And it’s not just startups looking towards this ideal - Google & Yahoo have been busy investigating latent semantic indexing (LSI) and hidden Markov models (HMMs), and watching how people use public APIs. Yahoo pledged to index the semantic web and Google are looking towards speech recognition. Things are getting exciting - web 3.0 is so close I can almost taste it!

It’s not all established firms though. Even as the collapse of western finance markets continues as I write, small, agile startups with a solution will continue to secure investment and operating capital. During hard times, shoe-string startups can find it easier to beat the mega-corps - but they have to have the best solution!

So, what are the next steps?

We’re going to see a convergence between personalised search, the document classification systems of semantichacker and technologies like Yahoo’s term extractor - and there are big bucks to be made by getting this right.

The winner will be able to tell what you’re searching for from the three or four words you type into the search box and your web history, and then integrate heterogeneous sources of data to get the results. Contrast the searches “how much do elephants weigh?” and “how much could I earn in Santa Monica?”. The former is factual and will not change greatly over time, whereas the latter requires up-to-date information and knowledge of the searcher. Neither can be answered accurately using traditional search-engine algorithms that look for common textual patterns.

As always with disruptive technologies like semantic search, change and user acceptance will be gradual. To get a user to trust a 140 character answer will take time and peer reviews but I think that we’ll be starting down that road towards the end of 2009.

Am I right?

I’m putting my head above the parapet by making a bold prediction - one that can be measured - but I think the pace of change on the internet is accelerating again and a load of clever people are looking at the possibilities of semantic search.

A load of greedy people (especially advertisers) are looking at these technologies, so one thing is for certain: interesting times ahead.

Am I too optimistic? Way off the mark? Leave a comment.



Microsoft Embrace And Extend Microformats With hSlice

Posted on 28 Oct 2008 by Andy

Internet Explorer 8 beta has an interesting new feature: hSlice support.

An hSlice is a small chunk of a web page that you can subscribe to with your browser - like RSS, but for a section of a page instead of the whole thing. The most obvious application of this technology is e-commerce sites, where you can wait for prices to drop or stock levels to replenish before placing an order, but I have no doubt that once the technology gains wider adoption, a wide range of interesting and unpredicted applications will emerge outside of the e-commerce arena.

Will hSlices Be Adopted?

Yes. The idea is a good one and it should appeal to people who already use RSS. There are already a number of Firefox extensions bringing hSlice support to the geeks’ favourite browser.

Hopefully, RSS aggregators like Google Reader and NewsGator will add hSlice support soon and then widget systems like iGoogle will quickly follow. The other major browsers should be able to add hSlice support without too much development effort since they all have RSS support and Safari already does something quite similar with web snippets.

A hugely important difference between RSS/Atom feeds and hSlices is that the former are (usually) used to indicate new pages being added to web sites, whilst hSlices show changes to an existing page. Stop and think about that for a moment: that’s a massively important change and it will have an impact on every aspect of the web.

But It’s A Microsoft Standard And They Are Evil!

Calm down dear, your tinfoil hat is slipping.

Seriously, I think there is a need for this technology and, as I said, Apple are doing something similar with Safari’s web snippets. The microformats.org mail discussion list has given a tentative welcome to the new arrival, but pointed out that it would have been nicer to be involved in the naming/design.

It’s good to see Microsoft actually innovating on the web, instead of playing catch-up with the open sourcerers and it’s particularly gratifying to see that innovation taking the form of an open standard.

hSlice Markup

Adding hSlices to your HTML is pretty straightforward, though you’ll need to update the server code to serve updates. Just as with serving hAtom (and most other microformats), you add classes to standard HTML tags, like this:


<!-- The hslice class marks the subscribable region; each slice needs a unique id -->
<div class="hslice" id="slice-1234">
  <p class="entry-title">Buy 1 doz eggs</p>
  <p class="entry-content"><img src="eggs.jpg" alt="Eggs"/> £2.68 per dozen</p>
  <!-- Optional feed carrying updates to this slice -->
  <a rel="feedurl" href="http://www.mmmeeja.com/slice-1234.xml">Subscribe to Feed</a>
</div>

There can be more than one slice per page but each must have a unique ID.

The “feedurl” anchor is optional and, if it is not present, clients are expected to download the entire page and extract the hSlice using its ID.
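
That fallback might look something like this - a Python sketch using requests and BeautifulSoup, which are my library choices rather than anything mandated by the spec:

# Sketch: the no-feedurl fallback - fetch the whole page and extract the
# slice by its unique ID.
import requests
from bs4 import BeautifulSoup

def fetch_hslice(page_url: str, slice_id: str) -> str:
    html_doc = requests.get(page_url).text
    soup = BeautifulSoup(html_doc, "html.parser")
    slice_el = soup.find(class_="hslice", id=slice_id)
    return str(slice_el) if slice_el else ""

print(fetch_hslice("http://www.mmmeeja.com/", "slice-1234"))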

So, my feeling on hSlice should be pretty clear by now. I don’t care that it’s from Microsoft, it’s going to make for some exciting new web tools and technologies.

DO WANT!


Creative Commons licensed photo by petoo.


7 Tools To Make The Most Of The Semantic Web

Posted on 17 Jul 2008 by Andy

The semantic web is starting to make progress, with new, useful applications appearing every day. Add to that Microsoft’s purchase of Powerset hitting the headlines and you can bet that you will need to know about semantic technologies before long.

We’re here to help get you started with this run-down of really useful tools.


The Operator Firefox Extension

Operator is a Firefox extension that adds a toolbar allowing you to explore the microformats and semantic data embedded in any web page as you surf.

Surfing with Operator is really eye-opening as it shows just how much semantic data is already out there on the web: Technorati tags, blog author names, friend lists and feed subscriptions are common and often marked up with the right tags.

Freebase Semantic Database

Freebase provides a database of over 4 million topics, all semantically labelled and accessible through a comprehensive API. The API is REST-based and uses MQL to structure query requests.

One of the greatest features of Freebase is that users can add their own ontologies (or topics), so if you have a load of data that you want to share and make discoverable, you can do so via a bulk upload. Combining your data with that already present in Freebase is then straightforward.
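
To give you a taste of MQL, here’s a sketch of a read request - the endpoint and query envelope are from my memory of the documentation, so double-check them before building anything:

# Sketch of an MQL read: ask Freebase for every album by The Beatles.
# Endpoint and envelope shape should be verified against the current docs.
import json
import urllib.parse
import urllib.request

mql = {"query": {"type": "/music/artist", "name": "The Beatles", "album": []}}
url = ("http://api.freebase.com/api/service/mqlread?query="
       + urllib.parse.quote(json.dumps(mql)))

with urllib.request.urlopen(url) as response:
    print(json.load(response)["result"]["album"])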

Semantic Hacker

Semantic Hacker caused controversy when they announced a competition to find the best use of their API with a prize of a measly million dollars!

Leaving the competition aside, their API will take unstructured text (or web pages) and attempt to classify it into categories that roughly match the DMoz directory. Their Bayesian algorithm is pretty well-trained and gives good results even when fed web pages that contain irrelevant information like adverts, menus and copyright notices.

Semantic Search Engines

There are a number of search engines providing search facilities for semantic data. SWSE searches RDF data on the web and has a SPARQL search API.

Yahoo also have plans to provide search for the semantic web and have a research project already available online. When this matures and gets customised using their new BOSS technology, it should be a rich platform for developers.

Intellidimension also offer a semantic web search engine with a SPARQL interface.

In fact, there are lots of companies jumping into this area. There is no clear winner yet but SPARQL seems to be the technology of choice. Everybody is eyeing Google warily - if they make an announcement about their interest in this field, all these companies will see some very tough competition.

Open Calais

This absolute jewel of a service can create semantically structured documents from unstructured text. Owned by Reuters, Open Calais is heavily slanted towards business and news gathering but the results are still very impressive.

When you feed in a chunk of text, it will identify places, dates, people, companies and so forth and even relations or events involving those entities.

It also offers Tagaroo, a tagging plug-in for Wordpress blogs. If you have any experience with Tagaroo, I’d love to hear about it so please leave a comment.

Twine

Twine is a CMS with a semantic focus and could well become the Wordpress of the next-generation web. It is currently still in beta, so I haven’t been able to try it out.

Mash It All Up With Pipes

DERI offer an RDF equivalent to Yahoo Pipes, called simply Pipes. Their offering is not as full-featured as Yahoo’s but you can get the source and extend it, and even host it on your own server.

Given the wealth of amazing data provided by all the other services, no doubt you have some great ideas for mash-ups and new services, so Pipes can be a quick way of prototyping.

The future of the web is not so far away and we all need to learn about these technologies so we can be part of it.


Creative Commons licensed photo by jm3.


 
