Explaining Complex Systems
Deborah L. McGuinness
Acting Director, Knowledge Systems, AI Lab, Stanford University
Tetherless World Chair, Rensselaer Polytechnic Institute (RPI)
Increasing Explanation Motivations
Systems are getting more complex
• Multiple heterogeneous distributed information sources
• Highly variable reliability of information sources
• Interest in reuse of information systems (many times for purposes other than those originally planned for)
• Hybrid and distributed processing
• Multiple types of components, including multiple learners (e.g., CALO, GILA), multiple text components (e.g., UIMA, KANI, …)
• Less transparency of system computation and reasoning
Systems are taking more autonomous control
• Guide/assist user actions
• Perform autonomous actions on behalf of user
• “reason, learn from experience, be told what to do, explain what they are doing, reflect on their experience, and respond robustly to surprise” *
* DARPA PAL program: http://www.darpa.mil/ipto/programs/pal/
Deborah L. McGuinness, Semantic e-Science, Vancouver, July 23, 2007
Motivation
Support explanations of provenance, information manipulation trace, and trust using an interoperable, transparent, and user-friendly knowledge provenance infrastructure.
Explanation
• Provenance – if users (humans and agents) are to use and integrate data from unknown, unreliable, or evolving sources, they need provenance metadata for evaluation
• Information manipulation trace – if information has been manipulated (i.e., by sound deduction or by heuristic processes), information manipulation trace information should be available
• Trust – if some sources are more trustworthy than others, representations should be available to encode, propagate, combine, and (appropriately) display trust values
• Interoperability – as systems use varied sources and multiple information manipulation engines, they benefit more from encodings that are shareable and interoperable
• Transparency – explanations can be used to provide transparency and accountability for systems by allowing (authorized) users to see what the system has done
• Usability – varied users need rich representation options and a broad range of tool support to provide context- and user-appropriate presentations
Inference Web Explanation Infrastructure
[Figure: architecture diagram. Sources on the WWW feed traces into the Proof Markup Language (PML), which the Inference Web toolkit consumes.]
Trace producers
• SDS, OWL-S/BPEL: trace of web service discovery
• CWM, N3: trace of rule application
• JTP, KIF: trace of theorem prover
• SPARK, SPARK-L: trace of task execution
• UIMA, Text Analytics: trace of information extraction
Proof Markup Language (PML)
• Justification
• Provenance
• Trust
Toolkit
• IWTrust: trust computation
• IW Explainer/Abstractor: end-user friendly visualization
• IWBrowser: expert friendly visualization
• IWSearch: search engine based publishing
• IWBase: provenance registration
• Semantic Web based infrastructure
• PML is an explanation interlingua
  • Represent knowledge provenance (who, where, when…)
  • Represent justifications and workflow traces across system boundaries
• Inference Web provides a toolkit for data management and visualization
Explaining Information Manipulation in PML (behind the scenes)
[Figure: PML object graph for one question, with provenance metadata held in the IWRegistry.]
Question - foo:question1 “when and where does Ramazi have an office”
Query - foo:query1 “(Holds (|hasOffice| |Ramazi| ?where) ?when)”
• isQueryFor: foo:question1
• hasAnswer: foo:ns1
NodeSet - foo:ns1 {answer to query} “(Holds (|hasOffice| |Ramazi| |SelectGourmetFoods|) April_01_2003)”
• hasLanguage: Language - reg:KIF
• isConsequentOf: InferenceStep
  • fromQuery: foo:query1
  • hasInferenceEngine: InferenceEngine - reg:JTP
  • hasInferenceRule: InferenceRule - reg:GMP
  • hasVariableMapping: Mapping From: “?f” To: “(|hasOffice| |Ramazi| ?where)”; more mappings …
  • hasAntecedent: foo:ns2; more NodeSets…
NodeSet - foo:ns2 {direct assertion} “(<= (or (Ab ?f ?t) (Holds ?f ?t)) (Holds* ?f ?t))”
• isConsequentOf: InferenceStep
  • fromAnswer
  • hasSourceUsage: SourceUsage
    • hasSource: Source – reg:TypicalityOnto
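The object graph on this slide can be mirrored in a few plain data classes. The following is only an illustrative sketch: the class and field names are hypothetical Python stand-ins modeled on the PML terms shown above, not the PML schema or any Inference Web API, and the `ground_sources` helper is an assumption added to show how a justification can be walked back to its sources.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SourceUsage:
    source: str  # registry identifier, e.g. "reg:TypicalityOnto"


@dataclass
class InferenceStep:
    engine: Optional[str] = None   # e.g. "reg:JTP"
    rule: Optional[str] = None     # e.g. "reg:GMP"
    antecedents: List["NodeSet"] = field(default_factory=list)
    source_usage: Optional[SourceUsage] = None


@dataclass
class NodeSet:
    uri: str
    conclusion: str
    language: str = "reg:KIF"
    is_consequent_of: List[InferenceStep] = field(default_factory=list)


def ground_sources(node: NodeSet) -> List[str]:
    """Collect the sources reached by walking antecedents depth-first."""
    found: List[str] = []
    for step in node.is_consequent_of:
        if step.source_usage is not None:
            found.append(step.source_usage.source)
        for antecedent in step.antecedents:
            found.extend(ground_sources(antecedent))
    return found


# The two NodeSets from the slide: a direct assertion and the derived answer.
ns2 = NodeSet(
    uri="foo:ns2",
    conclusion="(<= (or (Ab ?f ?t) (Holds ?f ?t)) (Holds* ?f ?t))",
    is_consequent_of=[InferenceStep(source_usage=SourceUsage("reg:TypicalityOnto"))],
)
ns1 = NodeSet(
    uri="foo:ns1",
    conclusion="(Holds (|hasOffice| |Ramazi| |SelectGourmetFoods|) April_01_2003)",
    is_consequent_of=[InferenceStep(engine="reg:JTP", rule="reg:GMP", antecedents=[ns2])],
)

print(ground_sources(ns1))  # ['reg:TypicalityOnto']
```

Keeping the inference step separate from the node set mirrors the slide: one conclusion may have several alternative justifications, each with its own engine, rule, and antecedents.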
Filtered View
[Figure: Views of Explanation. An explanation (in PML) can be presented through filtered, focused, global, abstraction, discourse, provenance, and trust views.]
Show Highlights
• Query
• Answer
• Supporting assertions
• Sources
Focused View
One step of justification
• Original query
• Conclusion
• Direct antecedents
• Inference rule
Contextually appropriate follow-up questions
• Sources
• Ground Assertions
• Assumptions
• Full trace
• Question answerers used
• …
Global View and More
Explanation as a graph
Customizable browser options
• Proof style
• Sentence format
• Lens magnitude
• Lens width
More information
• Provenance metadata
• Source PML
• Proof statistics
• Variable bindings
• Link to tabulator
• …
Abstraction View
Rewrite rules used to hide part(s) of the sub-graph
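One way to picture such a rewrite rule is as a function that splices selected intermediate steps out of a proof tree. The sketch below is a simplified illustration under stated assumptions: the `ProofNode` class, the `abstract` function, and the rule names are hypothetical, and the slide does not say how the Inference Web abstractor actually expresses its rewrite rules.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ProofNode:
    conclusion: str
    rule: str  # name of the inference rule that produced this conclusion
    children: List["ProofNode"] = field(default_factory=list)


def abstract(node: ProofNode, hidden_rules: Set[str]) -> ProofNode:
    """Return a copy of the tree in which intermediate nodes produced by a
    hidden rule are removed and their children are attached to the parent.
    The root and leaf nodes are always kept."""
    new_children: List[ProofNode] = []
    for child in node.children:
        reduced = abstract(child, hidden_rules)
        if reduced.rule in hidden_rules and reduced.children:
            new_children.extend(reduced.children)  # splice out the hidden step
        else:
            new_children.append(reduced)
    return ProofNode(node.conclusion, node.rule, new_children)


proof = ProofNode("C", "modus-ponens", [
    ProofNode("B", "and-elimination", [ProofNode("A and B", "told")]),
    ProofNode("B implies C", "told"),
])
simplified = abstract(proof, {"and-elimination"})
print([child.conclusion for child in simplified.children])
# ['A and B', 'B implies C']
```

The function builds a new tree rather than mutating the original, so the full trace stays available for the other views.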
Discourse View
• (Limited) natural language interface
• Mixed initiative dialogue
• Exemplified in CALO domain
• Explains task execution component powered by learned and human generated procedures
Provenance View
• Source metadata: name, description, …
• Source-Usage metadata: which fragment of a source has been used when
Trust View
[Figure: Trust Tab screenshot showing a fragment colored by trust value, with a detailed trust explanation.]
• (preliminary) simple trust representation
• Provides colored (mouseable) view based on trust values
• Enables sharing and collaborative computation and propagation of trust values
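A colored view of this kind needs some mapping from trust values to colors. The sketch below assumes trust values normalized to the range 0 to 1 and a simple red-to-green scale; neither the scale nor the function names come from the slides, so treat it as one possible rendering, not the Trust Tab's behavior.

```python
from typing import List, Tuple


def trust_to_hex(trust: float) -> str:
    """Map a trust value in [0, 1] to a hex color from red (low) to green (high)."""
    t = min(max(trust, 0.0), 1.0)  # clamp out-of-range values
    red = round(255 * (1.0 - t))
    green = round(255 * t)
    return f"#{red:02x}{green:02x}00"


def color_fragments(fragments: List[Tuple[str, float]]) -> List[Tuple[str, str]]:
    """Pair each text fragment with the color for its trust value."""
    return [(text, trust_to_hex(trust)) for text, trust in fragments]


# Hypothetical article fragments with hypothetical trust values.
article = [("well-sourced sentence", 1.0), ("disputed sentence", 0.0)]
for text, color in color_fragments(article):
    print(color, text)
```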
Inference Web Data Access Interface
Browse
• Organized by class-hierarchy
• Customized entry
• Summary and audit views (in report listing)
Search (with filters)
• NodeSet
• root NodeSet
• Query
• Conclusion
• …
The Use-Ask-Understand-Update Cycle
[Figure: cycle diagram: Use → Ask → Understand / Accept → Update → Use]
Selected IW and PML Applications
• Portable proofs across reasoners: JTP (with temporal and context reasoners) (Stanford), CWM (W3C), SNARK (SRI), …
• Explaining web service composition and discovery (SNRC)
• Explaining information extraction (more emphasis on provenance – KANI, UIMA)
• Explaining intelligence analysts’ tools (NIMD/KANI)
• Explaining task processing (SPARK / CALO)
• Explaining learned procedures (TAILOR, LAPDOG / CALO)
• Explaining privacy policy law validation (TAMI)
• Explaining decision making and machine learning (GILA)
• Explaining trust in social collaborative networks (TrustTab)
• Registered knowledge provenance: IW Registrar (Explainable Knowledge Aggregation)
Trend: Semantically Enabling Applications leveraging the Semantic Web focus
Semantic Web Methodology and Technology Development Process
Establish and improve a well-defined methodology vision for Semantic Technology-based application development
[Figure: iterative development cycle. Stage labels: Use Case; Small Team, mixed skills; Analysis; Develop model/ontology; Use Tools; Expert Review & Iteration; Adopt Technology Approach; Leverage Technology Infrastructure; Rapid Prototype; Open World: Evolve, Iterate, Redesign, Redeploy.]
Joint with P. Fox
Scientific Environment Goal
Scientists should be able to access a global, distributed knowledge base of scientific data that:
• appears to be integrated
• appears to be locally available
But… data is obtained by multiple instruments, using various protocols, in differing vocabularies, using (sometimes unstated) assumptions, with inconsistent (or non-existent) meta-data. It may be inconsistent, incomplete, evolving, and distributed.
Virtual Observatory
Workshop: A Virtual Observatory (VO) is a suite of software applications on a set of computers that allows users to uniformly find, access, and use resources (data, software, document, and image products and services using these) from a collection of distributed product repositories and service providers. A VO is a service that unites services and/or multiple repositories.
lwsde.gsfc.nasa.gov/VO_Framework_7_Jan_05.doc
• VxOs - x is one discipline
• Trend: VxyO – multi-discipline virtual observatories
Selected VxyO Motivation: Mt. Spurr, AK. 8/18/1992 eruption, USGS
http://www.avo.alaska.edu/image.php?id=319
Eruption cloud movement from Mt. Spurr, AK, 1992 (USGS)
Tropopause
http://aerosols.larc.nasa.gov/volcano2.swf
Atmosphere Use Case
• Determine the statistical signatures of both volcanic and solar forcings on the height of the tropopause
• From paleoclimate researcher – Caspar Ammann – Climate and Global Dynamics Division of NCAR (CGD/NCAR)
• Layperson perspective: look for indicators of acid rain in the part of the atmosphere we experience… (look at measurements of sulfur dioxide in relation to sulfuric acid after volcanic eruptions at the boundary of the troposphere and the stratosphere)
• NASA-funded effort with Fox (NCAR), Sinha (Va. Tech), Raskin (JPL)
Use Case detail: A volcano erupts
Preferentially it’s a tropical mountain (+/- 30 degrees of the equator) with ‘acidic’ magma (more SiO2), and it erupts with great intensity so that material and large amounts of gas are injected into the stratosphere.
The SO2 gas converts to H2SO4 (sulfuric acid) + H2O (75% H2SO4 + 25% H2O). The half-life of SO2 is about 30 - 40 days. The sulfuric acid condenses into little super-cooled liquid droplets. These are the volcanic aerosol that will linger around for a year or two.
The Brewer-Dobson circulation of the stratosphere will transport aerosol to higher latitudes. The particles generate great sunsets, most commonly first seen in fall of the respective hemisphere. The sunlight gets partially reflected, and some part gets scattered in the forward direction. The result is that the direct solar beam is reduced, yet diffuse skylight increases. The scattering is responsible for the colorful sunsets as more and more of the blue wavelengths are scattered away.
In mid-latitudes the volcanic aerosol starts to settle, but the most efficient removal from the stratosphere is through tropopause folds in the vicinity of the storm tracks. If particles get over the pole, which happens in spring of the respective hemisphere, then they will settle down and fall onto polar ice caps.
It’s from these ice caps that we recover annual records of sulfate flux or deposit. We get ice cores that show continuous deposition information. Nowadays we measure sulfate, or SO4(2-). Earlier measurements were indirect, putting an electric current through the ice and measuring the delay. With acids present, the electric flow would be faster.
What we are looking for are pulse-like events with a build-up over a few months (mostly in summer, when the vortex is gone), and then a decay of the peak of about 1/e in 12 months. The distribution of these pulses was found to follow an extreme value distribution (Fréchet) with a heavy tail.
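The two time scales quoted on this slide can be put on a common footing with a little arithmetic. The sketch below assumes simple exponential decay in both cases, which is a simplification the slide does not itself state; the function names are illustrative.

```python
import math


def e_folding_from_half_life(half_life: float) -> float:
    """Convert a half-life to the time needed to fall to 1/e of the start value."""
    return half_life / math.log(2.0)


def remaining_fraction(elapsed: float, e_folding_time: float) -> float:
    """Fraction left after `elapsed`, assuming pure exponential decay."""
    return math.exp(-elapsed / e_folding_time)


# SO2 half-life of 30 to 40 days, expressed as an e-folding time in days.
for half_life_days in (30.0, 40.0):
    tau = e_folding_from_half_life(half_life_days)
    print(f"SO2 half-life {half_life_days:.0f} d -> e-folding time {tau:.1f} d")

# Sulfate peak decaying by 1/e in 12 months.
for months in (12, 24):
    fraction = remaining_fraction(months, 12.0)
    print(f"sulfate peak after {months} months: {fraction:.3f} of maximum")
```

Under these assumptions the SO2 e-folding time comes out at roughly 43 to 58 days, and the sulfate peak is down to about 37% after one year and about 14% after two, consistent with an aerosol that lingers for a year or two.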
Use Case detail: … climate
So reflection reduces the total amount of energy; forward scattering just changes the beam, path length, but that's it. The dry fogs in the sky are still up there (even after a thunderstorm), and are thus in the stratosphere, not the troposphere. The tropical reservoir will keep delivering aerosol for about two years after the eruption.
The particles are excellent scatterers at short wavelengths. They do absorb in NIR and in IR. Because of absorption, there is a local temperature change in the lower stratosphere. This temperature change will cause some convective motion to further spread the aerosol, and second: It's good factual stuff.
Once it warms up, it will generate a temperature gradient. Horizontal temperature gradients increase the baroclinicity and thus storms, and they speed up the local zonal winds. This change in zonal wind in high latitudes is particularly large in winter. This increased zonal wind (westerly) will remove all cold air that tries to build up over winter in the high Arctic. Therefore, the temperature anomaly in winter time is actually quite okay.
The impact of volcanoes is to cool the surface through scattering of radiation. In winter time over the continents there might be some warming. In the stratosphere, the aerosol warms. The amount of GHG emitted is small compared to the reservoir in the air. The hydrologic cycle responds to a volcanic eruption.
Atmosphere (portions from SWEET)
Atmosphere II
Interdisciplinary VOs, Semantics, and Explanation
• Background ontologies can be used to help access and integrate distributed data sources
• Semantically-enabled VOs are starting to go into service (e.g., VSTO – talk later today on services and in IAAI on Tuesday – deployed application track)
• Provenance issues become more critical in such systems – where did the data come from? How was it collected? Who collected it? What are their credentials? etc.
• Annotations and explanations may be the key to increasing trust in answers
• Annotations may simultaneously be a key to increasing contributions as users become confident that they will get appropriate credit
• An explanation interlingua (such as PML) may be a critical component to semantic integration, sharing, and acceptance
• An explanation infrastructure (such as Inference Web) may provide a foundation on which to build such applications
Extra