Client: Marc Andreessen

Topic: Resource Description Framework (RDF)

Content Types: Ghostwriting, Interview

CLIENT NEEDS: Marc Andreessen, Internet browser pioneer and partner of the VC firm Andreessen Horowitz, needed a ghostwriter to produce articles and interviews with other pioneering technologists.

DELIVERED: I interviewed a number of technology pioneers, conducted general research, and produced a series of articles and interviews to be published under Andreessen’s byline. A sample interview, on RDF, is below.

Ramanathan V. Guha and RDF (condensed)

By Marc Andreessen

With his pioneering work on the Resource Description Framework (RDF), Ramanathan V. Guha, known as Guha, just may be the Aldus of the web. Guha’s work on RDF really began in 1995, when he wrote the Meta Content Framework (MCF) while at Apple. MCF is like a table of contents for a web site. Also at Apple, Guha wrote Project X, a browser plug-in that Apple turned into a web navigation system called HotSauce. Project X used MCF to let web users fly through 3D maps of web sites.

Guha came to Netscape in 1997, and after meeting consultant Tim Bray, who was working on XML, he decided to turn MCF into an XML application. The result, the Resource Description Framework (RDF), can be used for sitemaps, content ratings systems, search engine data collection systems, digital library collections, and distributed authoring systems. RDF changes the way people interact with the web.

Guha’s a programming maniac and is often found in a hacking frenzy, trying to produce a prototype of his most recent idea. He spends a lot of time spreading his ideas around and working with various industry groups. Underneath his obvious intelligence and the energy field that surrounds him, Guha is a great guy who’s fun to know.

He received a bachelor of science degree in mechanical engineering from the Indian Institute of Technology in Madras, India, in 1986; a master of science degree in mechanical engineering from the University of California at Berkeley in 1987; and a Ph.D. in computer science from Stanford in 1991. With Douglas B. Lenat, he coauthored a book, Building Large Knowledge-Based Systems, which was published by Addison-Wesley in 1989.

I recently met with Guha to talk about RDF.

Marc Andreessen: You began your career by doing research in AI. Tell me how you got started with that.

Ramanathan V. Guha: While I was at UC Berkeley working on my master’s degree in mechanical engineering, I took one course in AI. I liked it and didn’t have a summer job, so I sent a résumé to a research consortium in Austin, Texas, called Microelectronics and Computer Technology Corporation (MCC). They were working on the Cyc project, which was an attempt to build a commonsense knowledge base for AI. They called me to say no, they didn’t need me, but they were so sweet. They said, “Well, we really don’t have room. We already have all of our students.” I said, “OK, no problem. I’ll work at night.” They said, “But we don’t have an office for you,” and I said, “That’s okay, I’ll work wherever.” So they hired me. After three weeks, they said, “You get your own office, and we want you to stay.” I ended up staying there for seven-and-a-half years.

AI is one of the grand challenges today, and I was in my twenties then and easily influenced, so I wholeheartedly flung myself into that effort. Cyc, a ten-year project at MCC, was fascinating. Here’s the basic idea: There’s a lot of stuff that a five-year-old knows that a computer doesn’t know. How would we go about teaching a computer that corpus of knowledge? The body of knowledge is pretty much the same across all people. If you ask someone, “If I drop something, what will happen to it? Will it fall?” they will say, “Yes.” Or if you ask them, “What color is the sky on a clear day?” they’ll say, “Blue.” There is a core consensus. It doesn’t matter if the person is a computer scientist or a doctor or a farmer or whatever; there’s a substrate of consensus knowledge or common sense that we all share. Interestingly, that’s exactly the kind of stuff that computers don’t know. So if you want computers to go from being the stupid kinds of things they are today to being genuinely interesting, it’s not specialized expertise, such as drawing vector graphics, that they need. Rather, they need to know the zillions of little things that enable us to function as intelligent beings.

The Cyc project was about hunkering down and building a machine-understandable corpus of all this knowledge. Building Cyc turned out to be much more difficult than anyone imagined. It turns out that a lot of fundamental research needs to be done before we can go about actually building something like Cyc. My guess would be that in 10 or 15 years the time will be right to try again. Such a beast will be needed if computers are ever going to be able to understand human languages, such as English, and do the 1,700 other things that we absolutely take for granted that people can do.

I think the question of whether a computer is intelligent is not going to turn out to be an interesting one. The question of what it means to be alive was a primary question in biology for many hundreds of years. We still don’t know what it means to be alive. Is a virus alive? No…well, maybe. It sure can cause havoc, but it can’t reproduce by itself. That question turned out to be largely irrelevant, because there was no real answer. It was the wrong question; nobody asks it anymore. I think the same thing is going to happen to the concept of intelligence in computers.

MA: But you went back to school after you started working on the Cyc project?

RVG: I went back and finished my master’s thesis, then decided if I was going to stay in that field, I might as well get a Ph.D. So I went to Stanford and got my Ph.D. in computer science. For a while, I was living this crazy life: teaching a class about the Cyc project at Stanford and the University of Texas at Austin, managing 25 people in Austin and Palo Alto, and writing my thesis – all at the same time.

In 1995, after my work on the Cyc project, I tried to do a startup. I had an idea for a heterogeneous database integration engine that I called Babelfish. The idea was to find a way to describe the semantics of the schemas of relational databases so that a program could transparently query a large, heterogeneous set of databases. I wrote the program but could not sell it. That kind of product is an extremely high-end enterprise kind of thing – difficult for a lone kid in Texas to sell. I also realized that I was more interested in building the product than in building a business. After I built the product prototype, I decided to return to research.

MA: When did your work on RDF begin to take shape?

RVG: Right around the time I was finishing up with Babelfish, Alan Kay, who was then an Apple fellow, convinced me to go to Apple, which was where I developed the Meta Content Framework (MCF). MCF was a way to represent metadata structures – information about information – to bridge the gaps in information flow created by various heterogeneously structured software products. The goal of MCF was not unlike the goal of Babelfish. But research at Apple imploded when Jobs came back in 1997; within a month or so, the company got out of research altogether. There were two obvious places to go if I wanted to be able to reach millions of people. One was Microsoft; the other was Netscape. They were the two big platforms: the operating system and the browser. I chose Netscape.

So in February 1997 I came to Netscape, where I met Tim Bray, coauthor of the W3C’s XML 1.0 specification and, at the time, a consultant for Netscape. Tim and I decided to adapt MCF using XML. This ultimately resulted in the W3C spec for the Resource Description Framework (RDF), which is based significantly on MCF.
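
To give a concrete sense of what Guha is describing (this sketch is my own illustration, not his code, and it uses the Python rdflib library, which came along well after this work), the snippet below states two metadata facts about a hypothetical web page as subject-predicate-object triples and serializes them in RDF’s XML syntax:

# Illustrative sketch only: a tiny RDF graph built with rdflib.
from rdflib import Graph, URIRef, Literal
from rdflib.namespace import DC

g = Graph()
page = URIRef("http://example.org/index.html")  # hypothetical web resource

# "Information about information": each statement is a subject-predicate-object triple.
g.add((page, DC.title, Literal("Example home page")))
g.add((page, DC.creator, Literal("R. V. Guha")))

# Serialize the graph in RDF's XML syntax, the form that grew out of MCF.
print(g.serialize(format="xml"))

The same triples could just as easily describe a sitemap entry, a content rating, or a library record, which is the point of a general metadata framework.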

One of the interesting things about the Internet is that the stakes are so high. Whenever a market is growing exponentially like this, people try crazy things and everything gets accelerated. I realized that there was more innovation going on in companies like Netscape, which were trying to get products out and change people’s lives. They didn’t think of it as research, by any stretch of the imagination. It just had to be done. As a research guy, I would have said, “Yeah, that’s a very interesting problem. Let’s work on it for the next six months and write a couple of papers about it.” At Netscape we say, “Oh, this seems like a hard problem. Let’s see if we can solve it by dinner and ship it next week.” It’s a different attitude. More often than not, when you have to solve it by dinner, you don’t solve it by dinner, but you do have an 80 percent solution in three days. It’s more exciting, and you have more impact on people’s lives. You actually make more progress this way.