RDF: Show Me the Money
Metadata July 6th, 2005, 9:03pm
Bruce D’Arcus and I have been sending some emails back and forth about my last post on metadata interoperability. In a comment on the original post, he suggests RDF instead of XSLT crosswalks. In my reply, I referenced Bill Moen’s tongue-in-cheek comment about RDF. For the record, I don’t really believe it is too complicated for Moen to figure out; perhaps I didn’t have my tongue planted firmly enough in my cheek while reciting what Moen said. I think what Moen was getting at (and this could just be my interpretation) is that RDF’s real-world usefulness, when compared to its complexity, remains questionable.
Over the course of these emails, Bruce was kind enough to refer me to an entry on Leigh Dodds’ weblog discussing RDF and the library world (and, in part, my original post). Dodds, Bruce tells me, has been developing an RDF solution for Ingenta (I’m not clear if it is in production yet or not). I’d like to respond on his weblog, but it seems comments are disabled so I’ll just point to the article (linked above). [Update: looks like comments have been turned on there now, but since I’ve already written here I’ll just go ahead and post here (I wrote more than I was planning to anyway)]
To start, I should go ahead and confess my biases. First, I don’t think RDF will succeed, other than in providing a way for people to put XML into a relational database with greater ease. For some people that might be sufficient (keep in mind, though, that there is a lot of baggage involved with that “ease”). Second, I don’t think relational databases (while the best thing in the world for business data) are the best choice for library bibliographic and authority data. Yes, I know many library databases are built on top of an RDBMS; I have extracted data from many a MARC record stored in a series of VARCHARs (in essence having to parse segmented data from a BLOB).
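For anyone who has not had the pleasure, here is a rough sketch of what “parsing segmented data from a BLOB” can look like. It assumes the rows hold consecutive chunks of an ISO 2709 (MARC) record and that the sample is ASCII-only, so character offsets equal byte offsets. The function names, the 20-character chunk size, and the sample record are all made up for illustration; no particular library system stores things exactly this way.

```python
FIELD_END = "\x1e"    # ISO 2709 field terminator
RECORD_END = "\x1d"   # ISO 2709 record terminator
SUBFIELD = "\x1f"     # subfield delimiter


def build_sample():
    """Build a tiny, hypothetical MARC-style record to have something to parse."""
    bodies = [("001", "12345"), ("245", "10" + SUBFIELD + "aShow me the money")]
    directory, data = "", ""
    for tag, body in bodies:
        field = body + FIELD_END
        # Directory entry: 3-char tag, 4-digit length, 5-digit start offset.
        directory += f"{tag}{len(field):04d}{len(data):05d}"
        data += field
    directory += FIELD_END
    base = 24 + len(directory)            # data begins after leader and directory
    total = base + len(data) + 1          # +1 for the record terminator
    leader = f"{total:05d}" + "nam".ljust(5) + "22" + f"{base:05d}" + " " * 3 + "4500"
    return leader + directory + data + RECORD_END


def parse_marc(segments):
    """Reassemble chunked rows into one record, then walk its directory."""
    record = "".join(segments)
    base = int(record[12:17])             # leader positions 12-16: base address of data
    directory = record[24:base - 1]       # drop the directory's own terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[:3]
        length = int(entry[3:7])
        start = int(entry[7:12])
        data = record[base + start:base + start + length].rstrip(FIELD_END)
        fields.append((tag, data))
    return fields


record = build_sample()
# Pretend each 20-character slice came back as one VARCHAR row.
segments = [record[i:i + 20] for i in range(0, len(record), 20)]
fields = parse_marc(segments)
```

The point is only that the structure lives inside the data: nothing in the database schema knows where one field ends and the next begins.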
It is not that I dislike RDF just because it wasn’t invented in the library community (Moen warned of the NIH (not invented here) syndrome; that warning has been resonating in my head). As an example of a non-library standard that is worth librarians investigating, I’d point to Topic Maps. If I had to I’d choose to use them instead of RDF (yes, I know they can also work together). Anyway, enough background…
The first comment that Dodds picks up on in his weblog entry is my: “I don’t think [RDF] can/will really accomplish anything that agreement on any XML schema couldn’t/wouldn’t.”
He responds: “This surprised me greatly. One thing that RDF doesn’t mandate is a single all-embracing format, it positively embraces plurality of schemas, and independent adoption and repurposing of schemas.”
Perhaps I should go into a little more detail. By this I meant that RDF says, “We can provide a way to reuse and make your data more interoperable if you just accept our structure.” While allowing arbitrary connections to be made is different than having a schema (or XSLT) that defines how those connections should be made, it is also the same in that it is accomplishing the same goal by limiting what one does with the data (for the record, I don’t really think we will have agreement on one schema either). Dorothea Salo touches (with a sledgehammer) on this ‘plays-well-with-others’ aspect in her post Look, We Get It Already.
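To make the “accept our structure” point a little more concrete, here is a minimal sketch that assumes nothing beyond plain Python tuples standing in for RDF statements (subject, predicate, object). The Dublin Core namespace is real; the example.org identifiers, the second vocabulary, and the helper name are hypothetical.

```python
DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core element namespace
EX = "http://example.org/terms/"          # hypothetical local vocabulary
book = "http://example.org/book/1"        # hypothetical resource identifier

# Two sources describe the same resource with different vocabularies.
catalog = {
    (book, DC + "title", "Show Me the Money"),
    (book, DC + "creator", "A. Author"),
}
circulation = {
    (book, EX + "shelf", "QA76"),
    (book, DC + "title", "Show Me the Money"),   # same statement as in catalog
}

# Because every statement has the same three-part shape, merging is set union.
merged = catalog | circulation


def objects(graph, subject, predicate):
    """Return the sorted objects of all statements matching subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


titles = objects(merged, book, DC + "title")
```

The merge is trivial precisely because both sources already agreed to say everything as three-part statements. That agreement is the “structure” being accepted, even though no single schema of property names was agreed on.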
On the other hand, Dodds says RDF is not complex because “… [he’s] had no trouble introducing engineers to RDF….” That gave me a bit of a chuckle. I assume by engineers he means software engineers (since he is a [software] engineer at Ingenta, judging from his home page). Are they the target audience?
In this respect, RDF does seem a bit like XSD to me. Being able to model objects and subclass things in a programming-like way provides a very rich way of defining things. But because people other than software engineers will need to use metadata and its related schemas, I tend to favor RELAX NG over XSD as a schema language (RELAX NG has a sound foundation, but doesn’t push it onto its users).
It is not that RDF is so complex that no one in the library community can figure it out (there are people here experimenting with it). It is rather that its complexity (or should I say its “required knowledge from a particular domain”) makes it prohibitively complex for those not in that domain. Really, though, who wants to edit RDF? Do I really have to think like a relational database? I don’t want to. Perhaps I’m missing the user-friendly RDF editor that already exists out there, but I think we don’t see this because of the “designed in” flexibility of RDF. We end up editing the structure embedded as content and editing the real content too… that is too much.
Dodds says, “RDF has a definite image problem.” This is something I think we all can agree on. I’ve heard the “XML people” say it and I’ve heard the “RDF people” say it (and I’ve heard Salo take it apart as a rhetorical argument (see above)). It is actually not unlike XOBIS’ image problem (if it has one). When you want to replace something that has grown organically over the years with something that has been constructed (and which requires a radical rethinking of the way things are done) there is a lot of drag. How do we know THAT system is better if we can’t see it in the real world?
For reasons why we need RDF, Dodds suggests, “its much harder to map any given model to XML, because XML is limited in how well can express relationships.” I think XOBIS and Topic Maps do a fine job of representing relationships. Relationships are the “reason for being” for both. There are things, though, that we expect a XOBIS-aware system to do (what Dodds calls “implicit semantics”). There are a lot of promises in the RDF world too, though, about the types of things machines will be able to do with these explicit linkages (what I refer to as “taking a leap of faith”).
At the end of his post (yes, I’m really trying to wrap this rant up), Dodds says, “There’s some tasty morsels at the bottom of that semantic web layer cake. The only way to demonstrate that is to come up with more convincing demonstrations, e.g. a recast of MODS as RDF, backed by some useful code.” I’d agree with the last part (though I think it will take a lot of useful code (more than just “some”… librarians are a cautious bunch (rightfully so)) and there are also still social and institutional issues that will have to be addressed too). So, RDF (or XOBIS for that matter)… SHOW ME THE MONEY!
So, I’ve been wrong before and I’ll be wrong again (I’m pretty sure of this). If someone demonstrates the usefulness of RDF and creates the application that revolutionizes the Web, that’s great… I’ll start using RDF. I’m not begrudging anyone their RDF experimentation any more than I’d want someone to take away my chance to experiment with XOBIS (yeah, let them try!). Until that time, though, that time when we start using machine-readable data instead of human-readable data, I’ll choose to spend my time where I feel it will be best spent (or at least where I’m most interested).
Just a chuckle before I stop… I just “heard” in the code4lib IRC channel: “in [the] Semantic Web World my agent will coordinate rearranging your schedule with your agent….”
Hope this wasn’t too much of a rant.
I’m going to be hungry today b/c I’ve now missed my lunch!