Chris Risner . Com

Some Thoughts on Siri and Google

Posted on: 11/5/2011 3:58:00 AM by

Unless you’ve had your head buried in the sand, you’ve heard of Siri, Apple’s iOS personal assistant.  Siri was originally an application available in the App Store, made by a company named, naturally, Siri.  In April of 2010, Apple purchased Siri.  At that time they killed off Siri’s plans to port the application to other devices (Siri had officially been working on BlackBerry and Android ports).  Interestingly, Siri worked on any iOS device before it became a feature of iOS 5 on the iPhone 4S.  Once the 4S was announced, the original Siri application pretty much stopped working, which means that (as of right now) only people with a 4S can use Siri.  Even the hacked versions people have made to work on non-4S devices won’t work.  The reason for this gets into how Siri works.

Everyone using Siri is using it on their iPhone 4S (one more time, currently).  However, the bulk of the Siri work is actually done on Apple servers located far away from the 4S held in your hand.  If you think about it, this makes sense.  When you say something, a program needs to translate that speech to text, then process that text and understand it, then find the answer and deliver it.  Finding the answer could involve searching Yelp, Google, Wolfram Alpha, OpenTable, making a calendar appointment on the device, or even more.  For a device to do all of this in a timely manner is unrealistic (for now).  Furthermore, since this processing is done on the server side, it’s easy for Apple to make improvements without having to push new versions of iOS out. 

So now to the “Thoughts” on Siri.  First, I think it’s awesome.  It certainly isn’t something most people could use all of the time (e.g. in a meeting), but with Siri you can often get something done faster via speech than by typing.  I think this is a very natural progression for Apple.  Apple has leaned pretty heavily on natural user interfaces.  Think of when you swipe up and down or pinch to zoom.  These are the sorts of things you would expect to be able to do when interacting with something.  Opening a keyboard or holding down several keys to make something happen isn’t as intuitive.  Speaking to people and things is something you’ve been doing since not long after you were born, so it is an incredibly natural way to interact with something.  Furthermore, since I’ve been interested in Natural Language Processing for a while now, this sort of software is right up my alley.  I wouldn’t be surprised to see voice control come to more things (so the rumors that Siri is coming to Lion aren’t surprising either).  Another thing I really like about Siri is that it has a personality.  It’s not always just returning cut-and-dried answers.  It uses humor.  It tells a story.

One thing that I’ve found very interesting about Siri is that Apple released it in Beta.  To the public.  If you look at the recent past, you don’t see a lot of betas coming out of Apple.  A lot of people have been complaining about poor quality and misunderstandings when speaking with Siri, and it’s been huge news.  I think Apple customers (I think it’s safe to say the majority of people who bought a 4S were already Apple fans and customers) aren’t used to having something released that isn’t amazing to start with.  Google is a company that has, in the past, regularly released programs under the Beta tag.  This enables them to get their new applications out to people for use without actually saying “we think this is a finished product.”  Google obviously continues to improve these beta programs and I’m certain Apple intends to do the same thing.  I do wonder if we’ll start seeing more Betas out of Apple.

Now I’d like to talk a little bit about some of the press Siri has gotten.  To start, Andy Rubin, Google’s Senior VP of Mobile, was quoted at the Asia D conference as saying, “I don’t believe that your phone should be an assistant.  Your phone is a tool for communicating.  You shouldn’t be communicating with the phone; you should be communicating with somebody on the other side of the phone.”  I can only hope that this is just a front and that Google is actually working on something similar.  They already have Voice Actions on Android, which you can use to search, text, make calls, play music, pull up maps, etc.  Is it that far off from something that could communicate back to you with some sense of personality?

If Andy Rubin’s comments were discouraging, Gary Morgenthaler’s comments were just foolish.  Gary Morgenthaler, a one-time director of Siri (when it was its own company), says that Siri gives Apple a 2 year advantage over Google.  Comparing individual features, such as “Siri can make a phone call” and “Voice Actions can play a song,” wouldn’t make much sense.  Any one of these individual features wouldn’t be difficult for the other to duplicate.  What Siri does that Voice Actions does not is translate human speech to intent.  When you say “Remind me to pick up my dress when I leave work,” Siri translates that into “Create a reminder to pick up my dress when the phone’s location leaves Work.”  When you say “I’m in the mood for Italian food in North Beach,” it translates that to “look up Italian restaurants in North Beach on Yelp and return the results.”  While these are impressive feats (it’s definitely not an easy thing to go from normal human speech to specific instructions), it’s not something that Google couldn’t reproduce and top in well under 2 years.  At the end of the day, Siri is still just taking more or less simple statements and translating them to simple computer instructions.  It’s true that Siri has nowhere to go but up (hopefully), but I wouldn’t discount Google and I certainly wouldn’t try estimating how long it would take Google to catch up.


Installing extjwnl to a MySQL Database on OS X or Linux

Posted on: 10/20/2011 1:50:00 AM by

Recently I’ve been doing a lot of experimenting with Natural Language Processing (NLP).  One of the goals, or maybe necessities, of the program I’m working on is the ability to look up the definition of words it doesn’t understand.  Most of the time, if I’m not sure how to spell a word or am looking for a definition, I just rely on Google.  However, for a running computer program that will have an undetermined (and probably excessive) amount of input, it wouldn’t be very efficient to go to Google multiple times a second (plus Google would frown on that).  Enter WordNet.  WordNet is a lexical database of words.  Nouns, verbs, adjectives...OH MY!  (Yes, WordNet is much more than JUST words and definitions, but for now we’ll stick with the words.)  On its own, WordNet is just a group of files that has all of this information in it; however, there are many different implementations of WordNet available.

As I’ve been experimenting with GATE in Java, I wanted a Java library for WordNet.  The first one I looked at was JWNL (or Java WordNet Library).  Unfortunately, I ran into a lot of problems getting the JWNL library working.  After getting frustrated enough to look for something else, I found extJWNL, which, according to the WordNet site, is an updated version of JWNL.  As I mentioned before, WordNet comes as just flat files with the data in them.  extJWNL comes with the code for accessing these files as well as for accessing the data if it’s in a database.  In addition (and perhaps critically, given the code for accessing via the database), it includes scripts and code to insert all of the WordNet data into a database.  Unfortunately, extJWNL seems to be scripted ONLY for Windows.  Up until the recent past, this wouldn’t have fazed me.  However, at work I’m primarily using an OS X computer, which means I needed to find a way to make this work.  If you look at the dict2db.bat file, you can find what it’s really doing:

%JAVACMD% %JAVA_OPTS% %EXTRA_JVM_ARGUMENTS% -classpath %CLASSPATH_PREFIX%;%CLASSPATH% -Dapp.name="dict2db" -Dapp.repo="%REPO%" -Dbasedir="%BASEDIR%" net.sf.extjwnl.utilities.DictionaryToDatabase %CMD_LINE_ARGS%

If you care to look and figure out what gets sent into the script file, you can then figure out what’s being put in all the parameters.  Or you can look at this, which is what I used to actually run the code:

java -classpath lib/extjwnl-1.6.2.jar:lib/commons-logging-1.1.1.jar:lib/mysql-connector-java-5.1.17.jar:lib/extjwnl-utilities-1.6.2.jar -Dapp.name="dict2db" -Dapp.repo="lib" -Dbasedir="/" net.sf.extjwnl.utilities.DictionaryToDatabase src/extjwnl/main/resources/net/sf/extjwnl/file_properties.xml src/utilities/main/sql/create.sql com.mysql.jdbc.Driver jdbc:mysql://localhost/newwordnet

The important things here are:

1.  The list of jar files (colon separated)

2.  The repo and base directory location (since this line was run from the root extjwnl directory, we used “lib” and “/” respectively)

3.  The path to the file_properties.xml file.  This file is part of the extjwnl zip and you MUST alter it to point to wherever you have installed WordNet before running this.

4.  The location of the create.sql file.  Again, this file is part of the extjwnl zip.

5.  The mysql jdbc driver name and database path.  In this example, I created a database named newwordnet and gave the anonymous user complete access to the database for the purposes of running this.
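
If you would rather not retype that line every time, the pieces above can be wrapped in a small shell script.  What follows is only a rough sketch of what a Unix counterpart to dict2db.bat might look like; it is not something that ships with extJWNL.  The function names (build_classpath, dict2db) and the DICT2DB_RUN switch are my own made-up names, and the jar versions are simply the ones from the command above, so adjust them to whatever is in your lib directory.  By default it only prints the command so you can check it before running anything.

```shell
#!/bin/sh
# Hypothetical dict2db.sh: a sketch of a Unix counterpart to dict2db.bat.
# Assumes it is run from the root extjwnl directory, like the command above.

# Join all arguments with ':' to form a Unix-style classpath
# (Windows uses ';', which is why the .bat classpath does not carry over).
build_classpath() {
    cp=""
    for jar in "$@"; do
        if [ -z "$cp" ]; then
            cp="$jar"
        else
            cp="$cp:$jar"
        fi
    done
    printf '%s\n' "$cp"
}

# Jar names and versions are taken from the command above; adjust as needed.
CLASSPATH_JARS=$(build_classpath \
    lib/extjwnl-1.6.2.jar \
    lib/commons-logging-1.1.1.jar \
    lib/mysql-connector-java-5.1.17.jar \
    lib/extjwnl-utilities-1.6.2.jar)

# Build the java command line from the arguments passed in.
# Prints the command by default; set DICT2DB_RUN=1 (a made-up switch)
# to actually execute it.
dict2db() {
    set -- java -classpath "$CLASSPATH_JARS" \
        -Dapp.name="dict2db" -Dapp.repo="lib" -Dbasedir="/" \
        net.sf.extjwnl.utilities.DictionaryToDatabase "$@"
    if [ "${DICT2DB_RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}
```

To use it, call dict2db with the same four arguments as before: the path to file_properties.xml, the path to create.sql, the JDBC driver name, and the database URL.  Once the printed command looks right, set DICT2DB_RUN=1 and call it again to do the real import.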

 

Once you’ve done that, sit back and watch as it enlightens you with multiple table creations and data insertions.  A few minutes later, you’ll be database driven and ready to go.

Categories: Java, NLP, SQL