A Discussion with Mike Daconta, Chief Scientist, APG, McDonald Bradley
Stephen Ibaraki, ISP, has an exclusive interview with the world-renowned
Michael C. Daconta, Chief Scientist, APG, McDonald Bradley.
Among Michael's array of talents, he is a developer, writer, and
design and architecture guru specializing in such diverse areas as
the Semantic Web, XML, and the eXtensible User interface Language (XUL).
Q: With your busy schedule, it’s a real pleasure to have you do this
interview and share your insights with the audience. Thank you for
agreeing to this interview.
A: You are welcome. This certainly has been a busy year, but I enjoy
discussing information technology and where it is going. I'd also
like to thank you for taking time to do the interview.
Q: Michael, can you describe your latest project work and real-world
tips you can pass on?
A: I am currently working on two related projects for the United States
Department of Defense: the Virtual Knowledge Base and the Net-Centric
Enterprise Services architecture. The DIA's Virtual Knowledge Base
is an interoperability framework that integrates heterogeneous data
stores (databases, HTML, message traffic, etc.) into a single
virtual repository. Net-Centric Enterprise Services (NCES)
is a wide-ranging program to transform the United States Department
of Defense by improving the horizontal fusion of information.
There have been many lessons learned over the last few years. Here
are some tips from those real-world experiences:
• Document versus Remote Procedure Call (RPC)-based web services.
This is a critical issue for interoperability. Everyone who creates
a web-service must specify in the WSDL SOAP binding whether the
style is document or RPC. In other words, whether the web service
transaction involves XML documents or parameters and a return
argument for the method calls. The RPC style is clearly an XML form
of traditional RPC, while the document style makes web services
more message-oriented. In terms of interoperability, the additional
design required to transact XML documents instead of RPC parameters
provides better context, validation, and application-independent
abstraction to any number of clients. This is a critical component
of “net-centricity,” which attempts to eliminate our reliance on
“point-to-point” interfaces. Examples 3 and 4 of the WSDL
1.1 specification (available at http://www.w3.org/TR/wsdl)
demonstrate the difference between these two styles. I plan to
write an article demonstrating the differences between the two
approaches more fully. In short, use document-based web
services to improve interoperability and addressability of your
information systems. RPC-based web services do not exploit the full
benefits of XML and offer little or no improvements over CORBA.
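To make the distinction concrete, here is an abbreviated sketch of the two SOAP binding styles in WSDL 1.1. The service and operation names (QuotePortType, GetQuote) are illustrative, not taken from a real service; only the style and use attributes are the point:

```xml
<!-- Document style: the body carries a schema-validated XML document -->
<binding name="DocumentBinding" type="tns:QuotePortType">
  <soap:binding style="document"
      transport="http://schemas.xmlsoap.org/soap/http"/>
  <operation name="GetQuote">
    <input><soap:body use="literal"/></input>
    <output><soap:body use="literal"/></output>
  </operation>
</binding>

<!-- RPC style: the body encodes a method call and its parameters -->
<binding name="RpcBinding" type="tns:QuotePortType">
  <soap:binding style="rpc"
      transport="http://schemas.xmlsoap.org/soap/http"/>
  <operation name="GetQuote">
    <input>
      <soap:body use="encoded" namespace="http://example.com/quote"
          encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
    </input>
    <output>
      <soap:body use="encoded" namespace="http://example.com/quote"
          encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
    </output>
  </operation>
</binding>
```

In the document case, the contract is the XML document itself, validated against a schema; in the RPC case, the contract is the method signature.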
• Weak XML Design. A while ago I wrote an article for XML Journal
entitled “Are elements and attributes interchangeable?” The article
focused on the design issues and tradeoffs in this decision (by the
way, the answer is “no”). One point the article tried to make was
that many people treat such a distinction in overly-simplistic ways.
Because XML-based markup languages are easy to create, sometimes too
little thought is put into their design. For example, let’s say I am
creating a recursive structure to display a business organization. I
could do something like this:
<Employee type="President" name="Joe">
  <Employee type="Vice President" name="Sam">
    <Employee type="Director" name="Bill"/>
  </Employee>
  <Employee type="Vice President" name="..."/>
</Employee>
There is a significant deficiency in this over-reliance on the type
attribute: the document cannot be validated completely by standard
validation methods, because the nesting rules depend on the value of
the Employee type attribute, which cannot be expressed in a DTD or
XML Schema (though this may be possible in other schema languages).
Thus it would be better to model each role as its own element, for
example:
<President name="Joe">
  <VicePresident name="Sam">
    <Director name="Bill"/>
  </VicePresident>
</President>
so that a schema can state directly which roles may nest inside which.
• Maturity and performance of RDF stores are improving. When we
initially started the VKB, this was a stumbling block to adoption. Now
there are commercial implementations in addition to increased
maturity of the open source offerings. This year will be the year
they are ready for primetime.
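To make the "RDF store" idea concrete, here is a minimal, self-contained sketch of an in-memory triple store with pattern matching. Real stores add persistence, indexing, and inference; the class and data names here are illustrative only:

```python
# Minimal in-memory triple store: RDF data reduces to
# (subject, predicate, object) statements.
class TripleStore:
    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return [
            (s, p, o)
            for (s, p, o) in self.triples
            if (subject is None or s == subject)
            and (predicate is None or p == predicate)
            and (obj is None or o == obj)
        ]

store = TripleStore()
store.add("VKB", "rdf:type", "Repository")
store.add("VKB", "integrates", "databases")
store.add("VKB", "integrates", "messageTraffic")

# Everything the VKB integrates, via a wildcard query:
print(sorted(o for (_, _, o) in store.match("VKB", "integrates")))
```

Wildcard pattern queries like this are the core operation that commercial and open source RDF stores optimize.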
• People too often confuse taxonomies with ontologies. I have to
explain the difference between taxonomies (and topic maps) and
ontologies too often. The confusion lies in the fact that a taxonomy
may be an ontology if the classes defined follow a formal subclass
relation; however, if they do not follow a subclass relation, then the
taxonomy is not an ontology. Thus, the key question is whether a
defined taxonomy or classification scheme is suitable for inference.
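The distinction matters because a formal subclass relation licenses inference: from "Director is a subclass of Manager" and "Manager is a subclass of Employee," a reasoner may conclude "Director is a subclass of Employee." A minimal sketch of that transitive inference (class names are illustrative):

```python
# Transitive closure over subClassOf: the inference a formal taxonomy
# licenses, and an informal "is filed under" grouping does not.
def infer_superclasses(subclass_of):
    """subclass_of: dict mapping class -> set of direct superclasses.
    Returns dict mapping class -> set of all direct + inferred superclasses."""
    def supers(cls, seen):
        result = set()
        for parent in subclass_of.get(cls, set()):
            if parent not in seen:
                seen.add(parent)
                result.add(parent)
                result |= supers(parent, seen)
        return result
    return {cls: supers(cls, set()) for cls in subclass_of}

taxonomy = {
    "Director": {"Manager"},
    "Manager": {"Employee"},
    "Employee": set(),
}
closure = infer_superclasses(taxonomy)
print(closure["Director"])  # contains both Manager and the inferred Employee
```

If the links in a classification scheme are not true subclass relations, this inference step produces nonsense, which is exactly why the taxonomy/ontology distinction matters.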
• Web service interfaces and polymorphism. The technique of defining
a standard web-service interface which can be implemented by any
number of service providers is a powerful use of web services. This
implements the object oriented principle of polymorphism in the
web-services environment. One example of this technique is the Web
Services for Remote Portals (WSRP) specification from OASIS
(www.oasis-open.org).
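The same principle can be sketched outside the web-services machinery with a plain polymorphic interface. The provider and method names below are made up for illustration; the point is that the client is written once, against the contract, and any provider can be swapped in:

```python
from abc import ABC, abstractmethod

# A standard service contract, analogous to a shared WSDL interface.
class QuoteService(ABC):
    @abstractmethod
    def get_quote(self, symbol: str) -> float: ...

# Two independent providers implement the same contract.
class ProviderA(QuoteService):
    def get_quote(self, symbol: str) -> float:
        return 101.0  # stubbed response

class ProviderB(QuoteService):
    def get_quote(self, symbol: str) -> float:
        return 99.5  # stubbed response

def cheapest(providers, symbol):
    # The client codes only against QuoteService, never a concrete provider.
    return min(p.get_quote(symbol) for p in providers)

print(cheapest([ProviderA(), ProviderB()], "XYZ"))  # 99.5
```

A standard WSDL interface implemented by many service providers gives exactly this substitutability across organizational boundaries.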
• Metadata registries are still in flux. Many people parrot the
marketing hype of the web-services “triumvirate” as SOAP, WSDL and
UDDI. While SOAP and WSDL are de facto standards, the same cannot be
said for UDDI. First, the UDDI classification capability is weak.
Second, it has no current or planned support for ontologies. Finally,
its information model is business-centric and often uses overly
abstract names like tModel, publisherAssertion, and instanceParms. The
bottom line is that there is much competition in this space (ISO/IEC
11179, LDAP, ebXML, RDF, UDDI) and the jury is still out.
Q: What do you see on the horizon that businesses and IT
professionals “must” be aware of to be competitive?
A: Here are the top five things IT professionals and executives
should be examining now:
• Portals and standard, reusable portlets (JSR 168) – Portals are
web aggregation points for specific communities. They are also great
vehicles for organizations to implement business process
reengineering (BPR), Enterprise Application Integration (EAI), and
Enterprise Information Integration (EII) all in a single project.
• Ontologies and axioms (specifically OWL and the UML profile for
OWL). As is made clear in our just-released books, ontologies have
really come of age and are ready for primetime use. Both the revised
RDF specifications and OWL will become W3C recommendations within
the next few months.
• Inference and rule engines. While ontologies provide a formal fact
base, rules can be used to infer new information and perform actions
on the knowledge base. Although there are no current standards in
this area, there are some promising efforts like RuleML and some
standards efforts on the drawing board.
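A toy forward-chaining engine makes the idea concrete: facts plus if-then rules yield new facts until a fixed point. This is a sketch of the mechanism only, not RuleML syntax, and the facts are invented for illustration:

```python
# Naive forward chaining: apply rules until no new facts appear.
def forward_chain(facts, rules):
    """facts: set of atoms; rules: list of (frozenset_of_premises, conclusion)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

rules = [
    (frozenset({"is_employee(joe)", "manages(joe, sam)"}), "is_manager(joe)"),
    (frozenset({"is_manager(joe)"}), "can_approve(joe)"),
]
derived = forward_chain({"is_employee(joe)", "manages(joe, sam)"}, rules)
print("can_approve(joe)" in derived)  # True: derived in two chained steps
```

Note that can_approve(joe) was never asserted; it was inferred from the fact base, which is precisely the value a rule engine adds on top of an ontology.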
• Web service orchestration (OWL-S) and Web Services for Remote
Portals (WSRP). Web services will proliferate both on intranets and
the internet. If done correctly (see the comment above on RPC versus
document-based web services), they will be a catalyst for the
next phase of data evolution, the semantic web (more on this
later). Once these services are widespread, they will need to work
in concert both in terms of workflow and in terms of semantic
interoperability. This requires a much higher degree of data
modeling for discovery and orchestration. The classic orchestration
example is the travel service which unites flight, auto and hotel
booking. In addition to orchestration, using web services as
portlets is important to allow reuse among various portals and this
is the goal of the WSRP specification underway in OASIS (www.oasis-open.org).
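The classic travel-orchestration example above can be sketched as a coordinator that invokes independent booking services and assembles one itinerary. The three functions are stand-ins for calls to separate web services; names and structure are illustrative:

```python
# Each function stands in for a call to an independent web service.
def book_flight(city): return {"service": "flight", "city": city, "confirmed": True}
def book_hotel(city):  return {"service": "hotel", "city": city, "confirmed": True}
def book_auto(city):   return {"service": "auto", "city": city, "confirmed": True}

def book_trip(city):
    # Orchestration: a fixed workflow over independent services.
    itinerary = [book_flight(city), book_hotel(city), book_auto(city)]
    if not all(step["confirmed"] for step in itinerary):
        # A real orchestrator would compensate (cancel) earlier bookings here.
        raise RuntimeError("orchestration failed")
    return itinerary

trip = book_trip("Denver")
print([step["service"] for step in trip])  # ['flight', 'hotel', 'auto']
```

The hard parts that orchestration standards address, and that this sketch omits, are discovering the services, agreeing on their semantics, and compensating when a later step fails.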
• Metadata registries and the incorporation of ontologies into them.
In February I attended a metadata registries conference in New
Mexico. It was a very good conference that really highlighted both
the fragmentation and demand in this emerging space. The next step
in these registries is to integrate the emerging semantic web
concepts into the registry (specifically ontologies). So far, having
looked at all the available technologies, I am most impressed with
the ebXML information model. Additionally, I was told by the
technical lead of that effort that ontology support would be added
to the next version of the registry.
Q: Can you give specific predictions for businesses about where
technology is going in two years, and in five years in the following
areas: telephony, security, pervasive computing, networking, the
desktop, the web, data storage, Voice over IP, IPv6, and other areas
you feel require consideration?
A: Though I do not consider myself an expert in all these areas, I
will do my best to give an assessment of where they are going.
• Telephony – the most interesting aspect of this area is the
integration of cell phones with the PDA, network connectivity and
digital cameras. I wrote a J2ME pitfall in the More Java Pitfalls
book! It was great to do some code in this exciting area. If you
have not done any J2ME development, I highly recommend it. You don’t
need to have a device to code in this area, as the emulators are
nicely done. Other exciting developments in the telephone space are
Push-to-Talk (PTT) features, bandwidth expansion, GPS integration,
downloadable code, and mobile code. This space is well-suited to
semantic web technologies and web services; for example, the W3C's
CC/PP (Composite Capability/Preference Profiles) specification uses
RDF to describe device capabilities and user preferences.
• Security. Obviously web services security is very hot right now.
Kevin Smith discusses SAML, XACML and other XML security
technologies in our new Semantic Web book. In fact, Kevin is doing a
talk at JavaOne this year on web services security.
• Pervasive computing – this is defined differently by different
people. In general, the goal in this area is for computing to become
transparent: embedded in everything, networked, and taken for granted.
• Networking – For those who don’t have broadband at home – I highly
recommend it and it is worth the cost. The best feature of
high-speed internet is not actually the speed of the line but
instead is the “always-on” aspect of broadband. It makes the
internet much more useful to not have a separate “connect” phase. Of
course, the speed will continue to improve, and that will open us up
to better multimedia and cheap video conferencing.
• Desktop computing – with the rise of pervasive computing, the
desktop computing arena is waning. It is more important to
seamlessly connect our plethora of devices than to concentrate on a
single device, thus Apple has it right with the digital hub concept.
• The World Wide Web – Tim Berners-Lee’s vision will come true. We
will have a Collaborative web and a Semantic web. More on this below.
• Data storage – we are pushing towards Terabytes on devices the
size of a postage stamp! This is one of the reasons the semantic web
is a must! Our storage capacity has significantly outstripped our
ability to manage the information we can store. We have spent the
last several decades ramping up and refining our ability to create
information (in a plethora of mostly proprietary formats) without
considering the discovery problem. Vannevar Bush must be rolling
over in his grave. This is our most pressing imperative and we
ignore it to the detriment of productivity and progress.
• Voice over IP – I am not very knowledgeable in this area. In
general, this technology does not seem to be a big issue as long
distance rates keep dropping. Competition in the long distance
market has been great and has consistently reduced prices for
consumers. If only we could get that same competition in the local and
broadband space. FCC, are you listening?
• IPv6 – I expect a graceful transition to this, based on a
dual-capability path in equipment and equipment upgrades. We have to
do this, and IT managers should plan it into their next equipment
upgrades for routers and other network equipment.
• 64-bit computing – I am surprised that 64-bit computing does not
get more press. It is an absolutely necessary transition in the next
year or so for power users. The transition in the mainstream will be
within 2 years (but I hope sooner). The bottom line is that we are
bumping up against the 4GB limit of 32-bit addresses for memory
intensive applications like high-fidelity games and multimedia. It is
disappointing that the major vendors and Microsoft have not made
this switch sooner. Let’s not wait until there are massive
complaints to get 64-bit computers and applications out there. We
have known about this for a long time so let’s make the switch and
get it over with.
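The 4GB figure falls straight out of the address width: a 32-bit pointer can distinguish only 2^32 bytes. A quick check of the arithmetic:

```python
# A 32-bit address can name 2**32 distinct bytes.
addressable_bytes = 2 ** 32
print(addressable_bytes)                 # 4294967296 bytes
print(addressable_bytes // (1024 ** 3))  # 4 (GB)

# Moving to 64-bit multiplies the address space by another 2**32.
growth_factor = (2 ** 64) // (2 ** 32)
print(growth_factor)                     # 4294967296
```

So a memory-intensive application simply cannot address more than 4GB with 32-bit pointers, no matter how much physical RAM is installed.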
• XML/RDF data stores – I predict we will see more of the big
players get into this space, especially the database vendors. If
they do not, the open source variants will eat their lunch as these
data stores gain prominence.
• Ontologies – as we discuss in detail in our book, these are truly
the next big thing in data modeling, model-driven architecture,
expert systems and the web. With OWL, they will cross the chasm from
early adopters to mainstream adoption.
Q: As an international expert in Java, can you share insights and
tips from your latest book? What prompted you to write this book?
A: In March of this year, the book More Java Pitfalls was released
by John Wiley and Sons, Inc. I am proud of this book and the team
that worked on it because we worked hard at perfecting how to write
about pitfalls. Some people seem to think that writing about
pitfalls is a slam on Java – that is not true. Every complex system
has pitfalls to inexperienced and average users (in our case
programmers). Sometimes it is the system designer’s fault and
sometimes it is the user’s fault. In the introduction to the book, I
display a taxonomy of pitfalls as shown here in the figure below.
[Figure 1: a taxonomy of pitfalls]
Looking at the pitfall taxonomy, the initial split is between the
programmer’s fault and the platform designer’s fault. So, pitfalls
are not a slam on the Java platform, instead they are the sharing of
our hard-won experience with other programmers. That brings me to
the next key point about the book, its use in peer mentoring
programs. Peer mentoring is a concept I have been talking about at
my company for awhile. In fact, there is an article about peer
mentoring on the McDonald Bradley website.
One of the best uses of the pitfalls book is in a formal mentoring
program. For example, a pitfall is a great discussion item for a
brown-bag lunch training session that gets programmers together in
an informal session.
As for tips, the whole book is a set of 50 pitfalls and the tips
that resolve or work around each pitfall. There are several sample
pitfalls posted on the companion website.
Q: There’s considerable momentum around the “Semantic Web.” Can you
describe your book on this topic and why it’s important for IT
professionals and businesses to be aware of this concept and related
technologies?
A: About two weeks ago, John Wiley & Sons, Inc. released my new book,
titled The Semantic Web: A Guide to the Future of XML, Web Services
and Knowledge Management. I feel our book is truly unique on this
subject in the following ways:
1. It is the first and only Semantic Web book for technical managers
and development leads. The authors are senior technologists who
offer a critical analysis of the technologies and strategic guidance
to senior information technology professionals. The authors are not
cheerleaders for the technology, which allows them to give honest,
accurate assessments. All previous books on the subject have been
graduate-level or implementation-level books for developers.
2. It explains all the pieces of the Semantic Web and how current
technologies fit into the puzzle. Specifically, it discusses XML and
its family of specifications, RDF, RDFS, Web Services, Taxonomies
and Ontologies. No major web specification is left unanalyzed.
3. It reveals the return on investment (ROI) for Semantic Web
initiatives and other aspects of the business case for the semantic
web.
4. It gives your Organization a clear, step-by-step Roadmap to
preparing for and implementing Semantic Web technologies today.
5. It introduces, explains and explores the implications of new
transformational concepts like the “Smart Data Continuum”, “Semantic
Levels”, “Non-contextual Modeling”, and “Combinatorial
Experimentation”. These new concepts are inventions of the authors,
derived from their real-world experience, and their applicability to
today’s business environment is demonstrated.
I’d like to take a moment to highlight the concept of the smart data
continuum as it is particularly relevant to understanding both what
the semantic web is and how it is just another step along a larger
path of the evolution of data fidelity. Below is the diagram of the
smart data continuum.
[Figure 2: the smart data continuum]
While the smart data continuum is covered in more detail in my book,
I will summarize the key points here. The smart data continuum
reveals the path of data evolution along a continuum of increasing
intelligence. So, in a nutshell, the semantic web is a shift in
moving the intelligence (or smarts) from applications to the data.
The first major development along this path was in the 1970s, when
the term GIGO, or garbage-in-garbage-out, characterized the
dependence of programs on correct data. This elevated the importance
of data in programs and led to object-oriented programming
languages. In the 1990s, the advent of HTML and the success of the
World Wide Web caused another shift in the data continuum from
proprietary schemas to open schemas. Today, with web services, we
are again undergoing a shift in the data evolution from
interoperable syntax to interoperable semantics. So, the figure
above highlights the progression of increasing intelligence from
proprietary schemas (office documents, databases), to open schemas
(XML), to multi-vocabulary schemas (taxonomies and namespaces) and
finally to inference and automated reasoning (Ontologies).
6. Lastly, the book is drawn from the authors’ real-world experience
transforming the Department Of Defense (DOD) and Intelligence
Community into a Net-centric environment. The principles and
practices espoused in the book are guiding the next generation
military towards net-centricity.
Q: Can you please share more of your ideas around the Semantic Web?
A: Here are a few other ideas I am either actively experimenting
with or exploring:
• Semantic chains – one problem with metadata is that it is
potentially infinite. Unfortunately, you cannot burden every
potential application with metadata layers that it has no interest
in. Thus, the rule with metadata must be to provide “just enough”.
An XML or XML/RDF document must provide a minimal set of
metadata to enable the most common processing of that data; however,
the document must include a reference or link back to a larger pool
of metadata which provides context for the data or clarification of
the data. In turn, that pool may link back to another, possibly yet
larger pool of metadata to provide yet more context and thus we have
a semantic chain of metadata pools stretching back as far as it
needs to go.
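The semantic-chain idea above can be sketched as documents carrying minimal metadata plus a link to a richer pool, which may link onward; resolving full context means walking the chain. The pool structure and field names below are my own illustration, not a proposed format:

```python
# Each pool holds some metadata plus an optional link to a larger,
# more general pool of contextual metadata.
pools = {
    "doc": {"metadata": {"title": "Report"}, "parent": "unit-pool"},
    "unit-pool": {"metadata": {"organization": "DIA"}, "parent": "enterprise-pool"},
    "enterprise-pool": {"metadata": {"releasability": "public"}, "parent": None},
}

def resolve_context(pool_id, pools):
    """Walk the semantic chain, merging metadata from each pool.
    Nearer pools win on conflicts, since they are more specific."""
    merged = {}
    while pool_id is not None:
        pool = pools[pool_id]
        for key, value in pool["metadata"].items():
            merged.setdefault(key, value)
        pool_id = pool["parent"]
    return merged

print(resolve_context("doc", pools))
```

The document itself stays lightweight; an application that needs more context pays the cost of following the chain only when it actually needs to.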
• Multitagging – yet another problem with metadata, especially in
application areas that require cross-domain processing, is the fact
that content can only be marked up (or governed) by a single DTD or
Schema. In essence, each markup language can be viewed as a
perspective on a piece of content. If there are multiple
perspectives on that content, the only current way to capture those
perspectives is either to transform the document into multiple
separate markup languages (where the content itself is repeated) or
to attempt to nest different tags within a single document.
Unfortunately, nesting breaks down because it is illegal for tags to
be interlaced (and rightly so). Therefore, I
have created a simple XML format that allows markup language tags to
be treated as separate layers on the content. The analogy would be
to think of tags as an overlay (like acetate) on top of the content
and thus multitagging will allow us to have multiple overlays
overlaid on top of a single body of content. This is especially
relevant for applications involving automated tagging, re-tagging
and temporary tagging. After a successful prototype, I plan on
releasing the specification to the W3C as a Note.
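The overlay idea resembles what is sometimes called standoff markup: the content is stored once, and each layer records tag spans by character offset, so layers may freely overlap. The sketch below is my own illustration of the mechanism, not the proposed specification:

```python
# Standoff-style multitagging: one body of content, many tag layers.
content = "Mike Daconta works at McDonald Bradley"

layers = {
    "people": [(0, 12, "Person")],           # "Mike Daconta"
    "orgs": [(22, 38, "Organization")],      # "McDonald Bradley"
    # This span overlaps the "people" span; nested XML would forbid that,
    # but independent layers accommodate it naturally.
    "phrases": [(0, 18, "SubjectPhrase")],   # "Mike Daconta works"
}

def spans(layer_name):
    """Materialize one layer's view of the shared content."""
    return [(tag, content[start:end]) for start, end, tag in layers[layer_name]]

print(spans("people"))
print(spans("phrases"))
```

Because each layer is independent, tags can be added, replaced, or discarded (automated tagging, re-tagging, temporary tagging) without ever rewriting the content or the other layers.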
• Asymmetric search – in order to combat asymmetric threats (threats
that do not follow symmetrical thinking), we need to significantly
improve our ability to formally relate conceptual and physical
entities in our information spaces. In my opinion, this is both our
current biggest problem and biggest opportunity. I am not talking
simple link analysis or simple labeled links. This is a difficult
problem that is against the normal tide of traditional thought, and
traditional information processing. Look at the way most programming
languages have links (references or pointers) as second-class
citizens that get buried inside other structures. More and more we
are seeing that the links between the dots are where we are sorely
lacking. I encourage every person with an entrepreneurial bent to
study this problem and think of innovative solutions. We desperately
need better ways to capture, manage and exploit formal relations
(with their own set of metadata) in our information systems. The
next killer application is “Relation authoring, sharing and
discovery.” It is important to understand that this is not just a
technical problem; it is also a functional knowledge-engineering
problem.
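One way to picture relations as first-class citizens: a link that has identity and metadata of its own, so it can be authored, shared, and queried independently of the entities it connects. The shape and field names below are my own illustration:

```python
from dataclasses import dataclass, field

# A relation promoted to a first-class object: it carries metadata of
# its own rather than being a bare pointer buried inside a record.
@dataclass
class Relation:
    source: str
    kind: str
    target: str
    metadata: dict = field(default_factory=dict)

relations = [
    Relation("PersonA", "traveled_with", "PersonB",
             {"confidence": 0.7, "asserted_by": "analyst1"}),
    Relation("PersonB", "funded_by", "OrgX",
             {"confidence": 0.9, "asserted_by": "analyst2"}),
]

# Query the relations themselves, filtering on the relation's metadata.
strong = [r for r in relations if r.metadata["confidence"] >= 0.8]
print([(r.source, r.kind, r.target) for r in strong])
```

Once relations are objects in their own right, questions like "show me every low-confidence link asserted by this analyst" become ordinary queries rather than custom code.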
• I am working on several other ideas but will save discussion of
them for a future interview.
Q: What future book titles and articles can we expect from you?
A: In the near term, I plan on writing some articles on some of the
ideas knocking around in my head. Especially the ones related to the
semantic web and practical examples of applications in that area.
As for books, there are many possibilities. I will most likely stay
on the dual track of exploring pitfalls in other complex systems
(XML, web services, .Net) and exploring the intricacies and
applications of semantic web technologies.
Q: With your deep knowledge of the entire IT industry, what other
pointers would you like to give the readers?
A: Even though the economy is in a funk, that is (and should be)
tangential to progress in the IT industry. Hopefully it will dampen
some of the hype surrounding IT and let us get back to solving
problems, increasing productivity and improving our effectiveness in
using and sharing knowledge. The really important message for IT is
that real progress is being made and continues to be made in all
aspects of software development. The industry is maturing and both
reliable and user-friendly IT systems are possible.
Lastly, here are some final tips:
- Forget operating systems – we are now on the layer “above” the
operating system. If you are tied to a particular operating system
or protocol, you need a better IT Architect.
- Knowledge management activities must pervade every aspect of the
organization, especially capture: 90% of useful information is lost
through informal or nonexistent capture methods.
- Beware of expensive, proprietary solutions: open standards and
open source are the way to go for most projects.
- Never forget the human aspects of computing. Have your developers
interact with users regularly in a social setting. Also, start a
mentoring program in your organization.
- The next leap in information technology is all about information
fidelity. We have spent the last 30 years mostly on graphics
fidelity (3D, user interfaces, etc.). That era is mostly done. The
next wave is information fidelity: this means fine-grained metadata,
relations between data elements, and links between discovery and
production of information.
Q: Thank you for taking the time to share with our readers and we
look forward to reading your books, and articles.
A: Thanks for allowing me to discuss these important issues with
your readers. I am always interested in the thoughts of other IT
professionals, so feel free to email me.