Interview: Roger Cutler on W3C and Chevron use of Semantic Web Technology
On the eve of his retirement, I spoke with Roger Cutler, longtime W3C participant from Chevron.
IJ: Roger, since you have been participating in W3C for some time, can you describe how Chevron's interests have changed?
RC: In 2000 our focus was on XML and Web Services. As those areas matured, we turned our interests to Semantic Web technology. As end users of Web technology we are often insulated from standards through our dependency on vendors. For example, although we think HTML5 will be important to us, we don't have a particular axe to grind since vendors will come between us and HTML5 much of the time. But in the case of the Semantic Web we are trying to access the technology more directly and much of the expertise in the field can be found within the W3C.
IJ: How is Chevron using Semantic Web technology?
RC: In one project we sought to exploit the technical strengths of Semantic Web technology such as the expressiveness and reasoning achievable with OWL. While our efforts in that project have been a success as far as the technology goes, we have not yet seen a significant business benefit.
A second effort focused on challenging integration problems that involve information about equipment in major capital projects such as an oil rig or platform. These capital projects involve tens of thousands of objects: flanges, pumps, blowout preventers, sub-assemblies, and so on. All the pieces of equipment come with documents (for safety and regulatory reasons, engineering drawings, etc.) and manufacturer’s specifications (e.g., temperatures at which the components function). The equipment data associated with these projects are both valuable and complex. We tried to exploit the expressiveness of OWL to create an ontology that puts this together and deals with the complexity. This has been technically successful but again we haven’t yet deployed the solution so we have not yet derived any business benefit from it. Obviously we’re at the stage of learning and experimenting with the technology.
IJ: Can you describe where the data comes from and how it is managed?
RC: All this information about equipment lives in different forms in a number of different systems and is handled separately by different organizations with different data models. For example, the people who build facilities and determine what equipment they will use have data about the equipment. The people doing the maintenance of the equipment have much the same data, only structured differently, and with additional information specific to their needs. Then there are the production people running the equipment, turning on valves and seeking to maximize production on a daily basis. They have their own systems and equipment. Still others are modeling the characteristics of reservoirs. The list goes on. Each party optimizes information for their own needs, and all of the systems have evolved independently. And yet, much of the time they are dealing with the same information.
As an example of what we need, suppose that someone replaces a pump or reroutes some pipes. We need to propagate information about these changes into all these different systems, which is a time-consuming, manually intensive, and fragile process. Some sort of communication among systems is necessary on an ongoing basis. It is very important that it be done right. We would like to use the Semantic Web to help us do it right.
The integration can also help us learn more from the data we have. For instance, we might want to combine information about production scheduling and maintenance scheduling to optimize them simultaneously. If you want to do these things, the systems need to talk to each other, but, as I said, it is difficult.
IJ: What approaches can you take to address this?
RC: People use point-to-point solutions or big data warehouses, but neither approach scales gracefully. Point-to-point solutions become very complex and hard to maintain. Data warehouses create replication issues and tend to be fragile. So, the possibility of a smarter, more agile, more cost-effective way of dealing with integration would have a great deal of value to us. The Semantic Web is not guaranteed to be the solution, but it looks plausible and we’d like to see if it lives up to its promise in practice.
IJ: Earlier you said that you were successful with the technology but not sure you would deploy it. Why not?
RC: Quite simply, it's hard. We are slowly learning how to apply it to our world. The big target --- the thing that would make this investment in technology worthwhile --- is integration. But to integrate things you need more than one thing to integrate! So if we start by building an ontology for equipment that attempts to exploit the expressivity and flexibility of OWL, then later we may be able to build another for maintenance and link them. It may be, in fact, that this stepwise approach has caused us to try to be more aggressive in using the advanced features of OWL than might be optimal for integration purposes – but I guess we’ll find out whether that’s the case as we proceed.
IJ: Has reuse of existing vocabularies proved valuable?
RC: Not in the project I've just described. We do think that using an "upper ontology" -- one that defines very general concepts like units of measure or geographical concepts -- to structure class relationships is probably a good way to go. But reuse has not been our primary motivation. Rather, it has been to integrate the information in our internal systems.
IJ: How does this work?
RC: I'm fond of telling people within Chevron who ask about Semantic Web technology that anything you can do with the Semantic Web you can do with relational databases – if you’re willing to write enough code, which can lead to higher cost and complexity. In fact, we have demonstrated a case in which similar objectives were obtained in the context of an ontology with about fifteen lines of readily comprehensible rules and in a relational database context with over 1000 lines of pretty complex code. So in the equipment catalog project, there is a solution in a relational database, but it involves a bunch of obscure pointers in the tables and associated code. In that system we try to maintain some relationships of interest to us, but the system doesn't handle all of them. It is not only incomplete, it would be complicated to make it complete. So we wanted to get the data out and do better. We wrote programs to generate OWL from schemas in our databases. The declarative techniques serve as a framework that lets us express the complex relationships in a way that is more maintainable and scalable. I think we've demonstrated that. The result is that we have reproduced our internal system in OWL, and the OWL version should be more maintainable and scalable, as well as more complete.
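To make the idea of generating OWL from a database schema concrete, here is a minimal sketch in Python. It is not Chevron's code, and the interview does not describe how their programs work. The table names, column names, the `http://example.org/equipment#` namespace, and the mapping itself (table to class, plain column to datatype property, foreign key to object property) are all assumptions chosen for illustration.

```python
# Illustrative sketch only: the schema, names, and mapping are hypothetical.

# Toy relational schema: table -> list of (column, SQL type, referenced table or None).
SCHEMA = {
    "pump": [
        ("max_temperature", "REAL", None),
        ("installed_on", "INTEGER", "platform"),  # foreign key to platform
    ],
    "platform": [
        ("name", "TEXT", None),
    ],
}

# Assumed mapping from SQL column types to XML Schema datatypes.
XSD_TYPES = {"REAL": "xsd:double", "INTEGER": "xsd:integer", "TEXT": "xsd:string"}


def class_name(table):
    """Turn a table name into a prefixed OWL class name, e.g. 'pump' -> ':Pump'."""
    return ":" + table.capitalize()


def schema_to_turtle(schema):
    """Emit Turtle declaring one OWL class per table and one property per column."""
    lines = [
        "@prefix : <http://example.org/equipment#> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .",
        "",
    ]
    for table, columns in schema.items():
        lines.append(f"{class_name(table)} a owl:Class .")
        for column, sql_type, ref in columns:
            prop = f":{table}_{column}"
            if ref is not None:
                # A foreign key links two tables, so it becomes an object property.
                kind, rng = "owl:ObjectProperty", class_name(ref)
            else:
                # A plain column holds a literal, so it becomes a datatype property.
                kind, rng = "owl:DatatypeProperty", XSD_TYPES[sql_type]
            lines.append(
                f"{prop} a {kind} ; rdfs:domain {class_name(table)} ; rdfs:range {rng} ."
            )
    return "\n".join(lines)


turtle = schema_to_turtle(SCHEMA)
print(turtle)
```

A mapping like this only reproduces the structure of the tables. The relationships the interview says were buried in "obscure pointers" and code would still have to be added as explicit axioms or rules on top of the generated classes.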
IJ: What have you learned from this project?
RC: One thing that intimidates us is OWL reasoning. It is very daunting to figure out how to gain the organizational capability to support a technology that is so difficult to understand and use effectively.
IJ: What makes it so challenging?
RC: For one, how reasoning works in the open world model. An innocent looking statement can cause unexpected results, and it can be challenging to understand why. We are also making extensive use of OWL restriction classes, which can be tricky.
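The interview gives no specific example of such a surprise. One commonly cited case, offered here as an assumption about the kind of thing meant, is that a property's declared domain is used to infer new facts rather than to reject bad data. The sketch below implements only that single RDFS rule in plain Python, with hypothetical equipment names. It is not a full OWL reasoner.

```python
# Illustrative sketch only: triples and names are hypothetical.
DOMAIN = "rdfs:domain"
TYPE = "rdf:type"


def apply_domain_rule(triples):
    """Return the input triples plus those entailed by the RDFS domain rule:
    if (p rdfs:domain C) and (s p o), then (s rdf:type C)."""
    domains = {(s, o) for s, p, o in triples if p == DOMAIN}
    inferred = set(triples)
    for s, p, o in triples:
        for prop, cls in domains:
            if p == prop:
                inferred.add((s, TYPE, cls))
    return inferred


facts = {
    (":hasImpeller", DOMAIN, ":Pump"),           # looks like a harmless schema statement
    (":valve17", TYPE, ":Valve"),
    (":valve17", ":hasImpeller", ":impeller3"),  # probably a data-entry mistake
}

closure = apply_domain_rule(facts)

# A database constraint would reject the last fact. Here the rule instead
# concludes that valve17 is also a pump, and nothing flags a problem.
print((":valve17", TYPE, ":Pump") in closure)
```

Unless the ontology also states that valves and pumps are disjoint classes, a reasoner has no grounds to report an inconsistency, so the mistaken fact silently changes what the data appears to say.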
IJ: Does participation in the Working Groups give you the opportunity to address some of these issues?
RC: To some extent it is relevant to us to have the opportunity to influence what gets standardized. Sharing use cases with working groups can also be valuable. I have contributed use cases and sample data directly to Working Groups in which I was participating. Doing so makes it more likely they will create something that works for us. Being aware of that activity may give Chevron a leg up, or may benefit our entire industry. For example, I contributed use cases and data to the Efficient XML (EXI) work.
However, the strongest motivator by far for W3C Membership is the experience, knowledge and flow of information through personal contacts. Participation leads to relationships with world-class experts in a wide variety of fields. The conversations at membership meetings and online discussions in which we freely express our opinions generate trust. I come back from these meetings with knowledge about industry direction and technology development. There are a lot of people in the W3C community who can help us learn about topics of interest not only related to the Semantic Web but also many other technologies. And this has enabled me to do a much better job of advising Chevron on these technologies -- where we should play a leadership role and why, and possible solutions to specific problems. We also learn from observing the W3C process. This is an extensive consensus-based process that has both formal aspects and informal traditions. Chevron learns from observing how this consensus-driven organization does its business.
IJ: Are there other areas of W3C work you are watching?
RC: There's a good deal of interest in Chevron in HTML5. Other hot tickets for us include mobile, social networking, cloud computing, and big data. We're glad to see the standards work in these areas but don't have a particular outcome in mind. For the cloud, security and policy are important. We want to avoid vendor lock-in of services. We want the protocols of cloud solutions to be standardized so that we can change suppliers if necessary.
IJ: What would you like to see W3C do differently?
RC: It is my perception that historically the W3C has been much more concerned with and knowledgeable about the public Web than about how Web technologies are used in corporate intranets. The enterprise environment is different in fundamental ways from the public Web. And the issues and concerns are not the same. For example, there is no anonymity and access control needs are different. In the Oil and Gas industry we also get into federation issues because there are a lot of joint ventures between highly controlled environments. Certainly many of the W3C Members that market to companies like ours have a good understanding of those issues, but I think the W3C leadership should do more to understand that world.
W3C also needs to increase investment in authoring tools. It is a big issue for us if authoring tools create output that doesn’t conform to specifications, is not accessible, or is inefficient or hard to maintain. I would like to see more attention paid to authoring tools and testing to ensure they conform.
Lastly, most W3C Members are technology vendors, universities, and government agencies. Chevron, on the other hand, is an end user and I think we would all benefit if there were more Members like that in W3C. Just as we gain insight and information and benefit from participation, W3C can benefit from the insights and views of end user companies. I have been involved in the creation of W3C's first Business Group, and I see Business Groups as another mechanism to help bring more end user companies into the W3C community.
IJ: Though you are retiring, will you continue to participate?
RC: Perhaps! From a personal perspective, I have really valued the friendships and working relationships I have formed with people in W3C. It's been a wonderful thing.
IJ: And we have loved having you. Thank you, Roger, and best wishes in your retirement!
Very nice-- would like to see more of this kind of interaction.
Kind regards,
MM-Kyield