While checking my RSS feeds over the Christmas break, I spotted a post from ProgrammableWeb about a sentiment analysis API for tweets. Digging a little deeper, I found a blog post from the authors showing that they have applied libSVM to make a great tool.
I had to try this out.
The API is incredibly easy to use - especially if you have used the Twitter API before. TweetSentiments.com essentially augments the JSON returned from a subset of the public Twitter API with sentiment data. So a search for tweets returns 20 tweets, each of which has a sentiment grade, and the whole block has an aggregated count and overall score.
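To make that response shape concrete, here is a minimal Python sketch that rebuilds the aggregated count and overall score from per-tweet grades. The field names (`results`, `sentiment`), the -1/0/1 grading scale and the sample tweets are all assumptions made for illustration; they are not taken from the tweetsentiments.com documentation.

```python
import json
from collections import Counter

# Hypothetical payload: the field names and grade values are assumptions
# for illustration, not the documented tweetsentiments.com schema.
SAMPLE = """
{
  "results": [
    {"text": "Loving this Merlot", "sentiment": 1},
    {"text": "Opened a Merlot tonight", "sentiment": 0},
    {"text": "This Merlot is corked", "sentiment": -1},
    {"text": "Great Merlot with dinner", "sentiment": 1}
  ]
}
"""

# Assumed mapping from numeric grade to a readable label.
LABELS = {-1: "negative", 0: "neutral", 1: "positive"}


def summarise(payload):
    """Count tweets per sentiment label and compute the mean grade."""
    grades = [tweet["sentiment"] for tweet in payload["results"]]
    counts = Counter(LABELS[grade] for grade in grades)
    score = sum(grades) / len(grades) if grades else 0.0
    return {"counts": dict(counts), "score": score}


summary = summarise(json.loads(SAMPLE))
print(summary)
```

The real API returns its own aggregate, so a calculation like this would mostly be useful as a sanity check on the numbers it reports.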
I often test out SEO techniques using my girlfriend’s Wine Education site, so I thought I’d see what tweetsentiments.com thought of the most common wine varietals.
I whipped up a short Perl script to call the API for the following phrases:
- Sauvignon blanc
- Pinot grigio
- Pinot blanc
- Syrah OR Shiraz
- Cabernet sauvignon
- Pinot noir
The results got dumped into a CSV file, which I then imported into a Google Spreadsheet for analysis (spreadsheet here).
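The original script was written in Perl and is not reproduced here. As a rough Python sketch of the same loop, the example below writes one CSV row per search phrase. The `fake_fetch` function is a hypothetical stand-in for the HTTP call and returns made-up numbers, and the column names are assumptions rather than the API's actual field names.

```python
import csv
import io

# The search phrases listed above.
PHRASES = [
    "Sauvignon blanc",
    "Pinot grigio",
    "Pinot blanc",
    "Syrah OR Shiraz",
    "Cabernet sauvignon",
    "Pinot noir",
]


def fake_fetch(phrase):
    """Hypothetical stand-in for the real API call; returns made-up numbers."""
    return {"positive": 1, "neutral": 1, "negative": 0, "score": 0.5}


def write_report(phrases, fetch, out):
    """Call fetch(phrase) for each phrase and write one CSV row per result."""
    writer = csv.writer(out)
    writer.writerow(["phrase", "positive", "neutral", "negative", "score"])
    for phrase in phrases:
        result = fetch(phrase)
        writer.writerow([
            phrase,
            result["positive"],
            result["neutral"],
            result["negative"],
            result["score"],
        ])


# Write to an in-memory buffer here; a real run would open a file instead.
buffer = io.StringIO()
write_report(PHRASES, fake_fetch, buffer)
print(buffer.getvalue())
```

Passing the fetch function in as an argument keeps the CSV logic separate from the network call, so the stub can be swapped for a real request without touching the rest.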
The results could hardly be called scientifically rigorous, but that’s not the primary purpose of the exercise; I just wanted to play around with the tweetsentiments.com API.
The API returned only twenty tweets for each of my search terms, so the small sample size has skewed the results towards a very narrow window. This graph shows that the wine with the highest sentiment score was Merlot, whilst the lowest was Pinot Noir.
Interesting choices for a period traditionally associated with eating roast turkey, when I’d have plumped for a fruity white such as a Chardonnay.
The next graph might add a little towards an explanation - opinion of the Merlot is not as divided as for the other wines:
So, what this tells us is that Americans really like Merlot (see the full results for Merlot at tweetsentiments.com).
Other wines, like Cabernet Sauvignon, got a larger number of tweets that were classified as neutral - sometimes incorrectly, sometimes because they used more technical terms to describe the wine. Here are the results for cab sav from tweetsentiments.com.
That is to be expected - they do measure different things after all.
The tweetsentiments.com API is nicely constructed and easy to use, but the results aren’t perfect (especially for this test).
It would certainly be interesting to track sentiment changes over time, and a sharp swing towards negative sentiment would make a useful early warning indicator for brands in trouble.
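As a rough idea of what such an early warning check might look like, this Python sketch flags any reading where the score drops sharply compared with the previous one. The daily scores are made-up data and the 0.5 threshold is an arbitrary assumption; neither comes from the experiment above.

```python
def negative_swings(scores, threshold=0.5):
    """Return the indexes where the score fell by more than `threshold`
    compared with the previous reading."""
    return [
        i
        for i in range(1, len(scores))
        if scores[i - 1] - scores[i] > threshold
    ]


# Made-up daily sentiment scores for a hypothetical brand.
daily_scores = [0.4, 0.5, 0.45, -0.3, -0.2]
alerts = negative_swings(daily_scores)
print(alerts)
```

With only twenty tweets per query, a check like this would be very noisy, so it would probably need a much larger sample or some smoothing before anyone relied on it.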
One of the biggest take-aways from this exercise is that it is hard to extract meaningful data from the Twitter stream. I have no doubt that there are people doing it, but they won’t be giving their data away via a free API.