Sunday, December 19, 2010

Uploading a Drupal module using TortoiseCVS

I just added the yolink enhanced search module to drupal.org.

The last hurdle in getting this module out into the world was adding it to CVS, which I prefer to do using the TortoiseCVS GUI. Following these directions, everything was going great. And it would have gone great right through to the end, but in the Create New Module step, I missed the part where you change the path to the module. My module wasn't showing up in the contributions directory, and I kept getting errors saying that adding a branch wasn't allowed in /module_name.

Lots of Googling and some fail-dread-based procrastination later, I found this video, paused after every sentence, and managed not to skim over the critical module path setting. CVS let me add a branch with no errors in sight (cue angels singing the name of kyl191).

If you're looking to add a module to Drupal.org with TortoiseCVS, both of the linked sets of directions should work -- if you follow them!

Saturday, December 11, 2010

Uploading to Freebase, part II: authenticating with OAuth

I'd hoped to have written the bulk of human knowledge to Freebase by now, but I came to a screeching halt when I found that I'd need cookies and sessions and such.

That is, you have to authenticate to write data in bulk to Freebase. Here's one way to do so using OAuth and PHP.

1. Sign in to Freebase and register an app. Take note of your Consumer Key and Consumer Secret.

2. Get oauth-php and add it to a directory where your code can see it.

3. On the page from which you'd like users to authenticate, include the following code (adapted pretty directly from the Twitter example):

<?php
require "oauth-php/library/OAuthStore.php";
require "oauth-php/library/OAuthRequester.php";

/**
* oauth-php: Example OAuth client
*
* Performs simple 2-legged authentication
*
* The MIT License
*
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/

// register at http://www.freebase.com/apps/create and fill these two
define("FREEBASE_CONSUMER_KEY", "FILL IN");
define("FREEBASE_CONSUMER_SECRET", "FILL IN");

define("FREEBASE_OAUTH_HOST","https://api.freebase.com");
define("FREEBASE_REQUEST_TOKEN_URL", FREEBASE_OAUTH_HOST . "/api/oauth/request_token");
define("FREEBASE_AUTHORIZE_URL", "https://www.freebase.com/signin/authorize_token");
define("FREEBASE_ACCESS_TOKEN_URL", FREEBASE_OAUTH_HOST . "/api/oauth/access_token");

define('OAUTH_TMP_DIR', function_exists('sys_get_temp_dir') ? sys_get_temp_dir() : realpath($_ENV["TMP"]));

// Set up a 2-legged OAuth store with the consumer credentials
$options = array('consumer_key' => FREEBASE_CONSUMER_KEY, 'consumer_secret' => FREEBASE_CONSUMER_SECRET);
OAuthStore::instance("2Leg", $options);

try
{
    // Obtain a request object for the request we want to make
    $request = new OAuthRequester(FREEBASE_REQUEST_TOKEN_URL, "GET");
    $result = $request->doRequest(0);

    // Split the URL-encoded response body into $params
    parse_str($result['body'], $params);

    // Show the raw response, which should contain the token
    echo $result['body'];
}
catch(OAuthException2 $e)
{
    echo "Exception: " . $e->getMessage();
}

?>


When you load that code, you should see a token, good for at least one POST to Freebase. (I hope -- I'm writing this up as I go.)
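The response body should be a URL-encoded string, so the token can be pulled out with parse_str, the same call the code above already makes. Here's a rough sketch of that step. The field names oauth_token and oauth_token_secret are the standard OAuth 1.0 names, which I'm assuming Freebase uses too; the helper name and the sample body are made up for illustration.

```php
<?php
// Hypothetical helper: split a URL-encoded OAuth response body into its parts.
// Assumes the standard OAuth 1.0 field names; the real Freebase response may differ.
function parse_request_token($body)
{
    $params = array();
    parse_str($body, $params);

    if (!isset($params['oauth_token'], $params['oauth_token_secret'])) {
        throw new InvalidArgumentException('Response body has no OAuth token fields');
    }

    return array(
        'token'  => $params['oauth_token'],
        'secret' => $params['oauth_token_secret'],
    );
}

// Made-up sample body, for illustration only.
$sample = 'oauth_token=abc123&oauth_token_secret=s3cr3t';
$token = parse_request_token($sample);
echo $token['token'], "\n";
```

In the real page you'd pass $result['body'] instead of the sample string.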

Please stay tuned for the next exciting installment of Uploading to Freebase!

Tuesday, December 7, 2010

How to upload to Freebase part I: domains and types and properties, oh my!

I've been admiring the Wikipedia of databases, Freebase, from afar for a long time. I've kept my distance because if you think learning wiki syntax is a challenge, try figuring out where to even put factoids about your favorite books or musicians on Freebase. If you want your data to be useful to more people than just yourself, though, it is critical that you get the organization right. In honor of Open Data Day last weekend, I've decided to finally figure out how to upload data to Freebase (instead of adding three things by hand and giving up when the knowledge that I could add 1000 with a little PHP gets too unbearable).

The first step when you'd like to add data to Freebase is pretty easy: determine if your data would be useful to other people, and if you have a right to upload it. If the answer to both is yes, proceed!

But then we're on to step 2 -- where does your data belong? You need to determine your data's structure. My first through tenth passes at adding to Freebase probably went through the Basic Concepts wiki, which suffers from the downfall of many a wiki page -- really bad organization. (It looks like someone played a hand of Yahtzee with all the abstract concepts you need to understand to put your data in the right place on Freebase, then wrote it up in wiki form. I'd re-organize, but every attempt to cut down redundant information and leave things in reasonable order on Wikipedia has left me reverted, scolded, and frustrated.)

So let me try to clarify here (feel free, Freebase wiki editors, to grab any content that's useful, but please don't just sprinkle it hither and thither within the page).

My first hope that I might make sense of Freebase yet came from the Freebase Schema Explorer app.

Just the name gave me hope. The schema is the structure of the data, so a schema viewer is just what I was in the market for.

Right away, I saw that Freebase data is organized in domains, like Books (accessed at http://www.freebase.com/view/book).



Domains (like Books) have types, like Poem, Short Story, and Book (accessed at http://www.freebase.com/view/book/book). This could be confusing, because other users, probably baffled by the wiki like me, have added things like ISBN and Book Character as types of books. I know we're in data hippie land and everyone is a special flower, but that's frankly wrong. "Book Character" is not a type of book.



Types (like Poem, Short Story, or Book) can have both instances and properties, accessed at http://www.freebase.com/view/book/book. For the type Book in the domain Books, an instance would be something like The Catcher in the Rye. (It seems that Freebase calls instances "topics", but the Schema Explorer nails it better with "instance," I think.) Examples of Book properties include Characters and Genre.



For the data I'll probably add to Freebase first -- podcasts -- the domain is Broadcast (/broadcast), with the type Podcast Feed (http://www.freebase.com/view/broadcast/podcast_feed).
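The view URLs quoted so far all follow the same pattern: the domain key, then optionally the type key. As a small sketch of that pattern in PHP, assuming it holds for other domains and types as well (the function name is made up; only the URLs themselves come from this post):

```php
<?php
// Hypothetical helper: build the view URL for a domain, or for a type within it.
// The base URL is the one quoted in this post; the pattern is inferred from
// the three example URLs, not from any Freebase documentation.
function freebase_view_url($domain, $type = null)
{
    $path = '/' . $domain;
    if ($type !== null) {
        $path .= '/' . $type;
    }
    return 'http://www.freebase.com/view' . $path;
}

echo freebase_view_url('book'), "\n";                      // the Books domain
echo freebase_view_url('book', 'book'), "\n";              // the Book type
echo freebase_view_url('broadcast', 'podcast_feed'), "\n"; // the Podcast Feed type
```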

As of this writing, there are 2,584 Podcast Feed instances/topics. Their properties include Name (example: Wired's Alt Text), Image, and Average Media Length. Already I see a flaw, which I'll need to correct if I'm going to use this data for my Podcast Finder -- there's no podcast creator (or Podcaster) property listed. And the Freebase wiki noted that one can't edit a schema created by another Freebase user -- I'd have to duplicate all 2,584 instances/topics and set up my own schema.
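To make the gap concrete, here's a toy picture of the Podcast Feed type as a PHP array, plus a check for the properties I'd want. It only lists the three properties named above (the real type may well have more), it uses display names rather than real Freebase property IDs, and the helper name is made up.

```php
<?php
// Partial picture of the Podcast Feed type as described in this post.
// Only the properties named above are listed; the real type may have more.
$podcastFeed = array(
    'domain'     => 'Broadcast',
    'type'       => 'Podcast Feed',
    'properties' => array('Name', 'Image', 'Average Media Length'),
);

// Hypothetical helper: which of the wanted properties are not on the type?
function missing_properties(array $type, array $wanted)
{
    return array_values(array_diff($wanted, $type['properties']));
}

$missing = missing_properties($podcastFeed, array('Name', 'Podcaster'));
echo implode(', ', $missing), "\n"; // prints "Podcaster"
```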

Will I figure out how to add the Podcaster property? Will I duplicate the Podcast Feed type? Will I lose my Internet connection because I stayed up too late blogging, slept through a WebEx, and lost my job?

Find out in the next installment of this exciting series, wherein I shall explore yet another query language, MQL, and hopefully start adding all information ever to an easily-queried free online database.

(Or I'll take another two years off blogging and come back in 2012 blogging about how excited I am about our new lady President. Or arsenic-based space aliens. Stay tuned!)