(SBIR) Navy - Concept Maps from RDF (Resource Description Framework)

Concept Maps from RDF (Resource Description Framework)
Navy SBIR 2013.2 - Topic N132-128
ONR - Ms. Lore Anne Ponirakis - [email protected]
Opens: May 24, 2013 - Closes: June 26, 2013

N132-128 TITLE: Concept Maps from RDF (Resource Description Framework)

TECHNOLOGY AREAS: Information Systems, Human Systems

ACQUISITION PROGRAM: PMW-120, PMMI (DCGS-N, DCGS-MC); FNT 14-03 Exchange of Actionable Intelligence

RESTRICTION ON PERFORMANCE BY FOREIGN CITIZENS (i.e., those holding non-U.S. Passports): This topic is "ITAR Restricted". The information and materials provided pursuant to or resulting from this topic are restricted under the International Traffic in Arms Regulations (ITAR), 22 CFR Parts 120 - 130, which control the export of defense-related material and services, including the export of sensitive technical data. Foreign Citizens may perform work under an award resulting from this topic only if they hold the "Permanent Resident Card", or are designated as "Protected Individuals" as defined by 8 U.S.C. 1324b(a)(3). If a proposal for this topic contains participation by a foreign citizen who is not in one of the above two categories, the proposal will be rejected.

OBJECTIVE: The objective of this topic is to develop a capability to propose concept maps from very large RDF data stores. Meeting this objective requires constructing visual graphs, reorganizing nodes and edges to increase readability, removing irrelevant data, and prioritizing content with respect to user needs.

DESCRIPTION: The military requires affordable means to convert text- and image-based data to knowledge. Commercial tools exist to semantically tag entities and relationships. The focus of this topic is to take the next step and automate the building of concept maps from large RDF data stores to clearly show meaningful relationships such as human networks and behaviors/activities. The goal of this topic is to develop machine-based processes to assist human operators in making sense of large graphs derived from the content of documents and video. A concept map is a graphic tool for exploring knowledge and for gathering and sharing information [1]. Concept maps can include concepts, shown as component entities enclosed in circles, and relationships between concepts, indicated by connecting lines. Products can take the form of a graph, a graph with hyperlinks, or website pages [2]. Concept map structures depend on a user-supplied context frame or focus question. Of particular interest for this topic is assisting military operators in handling large quantities of data through automated visual representations, with reduced clutter and content prioritized to meet the needs of specific users.
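
As an illustration of how a focus question can shape a concept map, the following minimal Python sketch treats RDF statements as plain (subject, predicate, object) tuples and keeps only the relationships within a chosen number of hops of a focus concept. The entity names, the `concept_map` function and the `max_hops` parameter are hypothetical and are not part of this topic; a real system would query an RDF store rather than an in-memory list.

```python
from collections import defaultdict, deque

# Hypothetical (subject, predicate, object) triples standing in for an RDF store.
TRIPLES = [
    ("PersonA", "memberOf", "OrgX"),
    ("PersonB", "memberOf", "OrgX"),
    ("PersonA", "met", "PersonC"),
    ("PersonC", "locatedIn", "CityY"),
    ("PersonD", "locatedIn", "CityZ"),
]


def concept_map(triples, focus, max_hops=1):
    """Return the triples whose endpoints both lie within max_hops of focus."""
    # Treat relationships as undirected when measuring distance from the focus.
    neighbors = defaultdict(set)
    for s, _, o in triples:
        neighbors[s].add(o)
        neighbors[o].add(s)

    # Breadth-first search outward from the focus concept.
    distance = {focus: 0}
    queue = deque([focus])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbors[node]:
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)

    # Keep only relationships that stay inside the neighborhood.
    return [(s, p, o) for s, p, o in triples if s in distance and o in distance]
```

With this sample data, `concept_map(TRIPLES, "OrgX")` keeps the two `memberOf` relationships and drops the rest; raising `max_hops` to 2 also brings in the `met` relationship between PersonA and PersonC.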

Thought must be given to the knowledge the user needs, based on mission and tasking. Desired intelligence knowledge can take the form of know-what, know-how, know-who, and know-why questions [3]. Structured Models, Approaches and Techniques (SMATs) can be used by intelligence analysts to identify elite leaders, locate high value individuals, map organizational structures, filter raw data for semantic content and read messages to track incidents. Automated construction of concept maps would provide a valuable tool to assist intelligence analysts in answering these types of questions. The topic's research objective is to automate construction of concept maps that show the significance of entities and relationships extracted from series of structured and unstructured text reports and video. Entity and association extraction has evolved to the point that the large data problem has become a large graph challenge. Each document and video of an already large corpus, once structured through entity and association extraction, is represented by a graph containing hundreds or thousands of nodes. A capability to move from large graphs to meaningful concept maps is critically needed. Technical challenges include the development of graph processing (RDF) techniques that consider context (time, place and the nature of an association) and the meaning of a filtered graph relative to a concept. The maturation of multi-dimensional clustering and word frame technologies may be relevant. Technical risks include developing an appropriate data store/taxonomy, simplifying graphs through frame clustering, and automatically inferring concept maps through artificial intelligence. Natural language and video processing tools are able to structure the content of documents and video through the recognition and extraction of proper nouns (e.g., people, places) and selected other parts of speech or context. Structure and grammar frames have been used to classify the meaning of a sentence and/or image. Combining the two techniques to enable a machine understanding capability that scales to large data and clearly shows meaningful relationships can now be pursued. A successful prototype would automatically translate large RDF graphs into the stories they tell.
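
One way to picture context-aware graph filtering (time, place and the nature of an association) followed by content prioritization is the Python sketch below. It is only an illustration under stated assumptions: the `Assertion` record, the sample data, and the use of node degree as a stand-in for "significance" are all hypothetical choices, not methods prescribed by this topic.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Assertion:
    """Hypothetical extracted relationship annotated with its context."""
    subject: str
    relation: str
    obj: str
    when: date
    place: str


# Hypothetical sample data standing in for extracted RDF statements.
ASSERTIONS = [
    Assertion("PersonA", "met", "PersonB", date(2013, 1, 5), "CityY"),
    Assertion("PersonA", "met", "PersonC", date(2013, 1, 9), "CityY"),
    Assertion("PersonB", "called", "PersonC", date(2013, 3, 1), "CityZ"),
    Assertion("PersonA", "funded", "OrgX", date(2012, 6, 1), "CityY"),
]


def filter_by_context(assertions, start, end, place=None, relations=None):
    """Keep assertions inside a time window, optionally by place and relation type."""
    kept = []
    for a in assertions:
        if not (start <= a.when <= end):
            continue
        if place is not None and a.place != place:
            continue
        if relations is not None and a.relation not in relations:
            continue
        kept.append(a)
    return kept


def top_concepts(assertions, k):
    """Rank concepts by degree (an assumed proxy for significance); ties break by name."""
    degree = Counter()
    for a in assertions:
        degree[a.subject] += 1
        degree[a.obj] += 1
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]
```

Filtering the sample data to January 2013 in CityY leaves the two `met` assertions, and `top_concepts` then ranks PersonA first because it participates in both.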

PHASE I: Develop processes and techniques to create an automated concept map; document the heuristic, machine learning and/or other methods used and show their basis in scientific literature. A Phase I effort should identify key technical risks associated with the development of a prototype and track risk reduction progress through the measurement of key technical parameters. A Phase I effort should end with a proof of concept demonstration that bounds the size of the graph considered and the types of concept maps generated. The results should be documented in a report and, if time allows, in a conference or journal publication. The final Phase I brief should show plans for the Phase I Option and Phase II, if selected.

PHASE II: Prototype a system that can take a question and input RDF from documents and video, and output a concept map. The prototype system will be able to automatically process the input, display a graph, and provide links to sources (pedigree). The system should place little burden on operators but provide means to refine process decisions. The performer should profile a prototype system that is effective against a bounded set of information questions and data sources (graphs of at least 10 million nodes). The selection of questions and data should be consistent with those of interest to the target transition program. It is possible that operational RDF of interest to the transition program will be classified Secret.
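
The requirement to display a graph while keeping links to sources (pedigree) could be approached as in the Python sketch below, which merges repeated relationships and writes each edge with the identifiers of the sources that support it, using Graphviz DOT text as the display format. The `to_dot` function, the source identifiers and the choice of DOT are assumptions made for illustration only; the sketch also assumes names contain no double quotes.

```python
from collections import defaultdict


def to_dot(edges):
    """Render (subject, relation, obj, source_id) tuples as Graphviz DOT text.

    Repeated relationships are merged into one edge whose label lists every
    supporting source, so the pedigree of each link stays visible.
    """
    sources = defaultdict(list)
    for s, r, o, src in edges:
        if src not in sources[(s, r, o)]:
            sources[(s, r, o)].append(src)

    lines = ["digraph concept_map {"]
    for (s, r, o), srcs in sources.items():
        label = f"{r} [{', '.join(srcs)}]"
        lines.append(f'  "{s}" -> "{o}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical edges; the source identifiers are made up for this example.
EDGES = [
    ("PersonA", "memberOf", "OrgX", "report-001"),
    ("PersonA", "memberOf", "OrgX", "video-007"),
    ("PersonA", "met", "PersonC", "report-002"),
]
```

Calling `print(to_dot(EDGES))` yields a two-edge graph in which the `memberOf` edge is labeled with both `report-001` and `video-007`.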

PHASE III: Produce a system capable of deployment and operational evaluation that is relevant to multiple user domains and can operate against RDF graphs of at least 100 million nodes. The system should address topics or themes that are specific to use cases favored by the transition program and commercial application. Machine-based processing steps, metadata tags and heuristics should be accessible to the operator in human-understandable form. Data inputs/outputs and the software environment should be modified to operate in accordance with guidelines provided by the transition sponsor.

PRIVATE SECTOR COMMERCIAL POTENTIAL/DUAL-USE APPLICATIONS: The private-sector internet market is always interested in new ways to make sense of tagged data. Search engines currently allow the discovery of information based on user-generated tags. This topic would expand future search capabilities to support discovery based on machine-generated tags and machine-generated concept maps. Tagged data has caused the large data problem to become a large graph problem. This topic will support research to translate big graphs into relevant stories.

REFERENCES:
1. Joseph D. Novak and Alberto J. Cañas, "The Theory Underlying Concept Maps and How To Construct and Use Them", Institute for Human and Machine Cognition, 2006.

2. Plotnick, Eric, "Concept Mapping: A Graphical System for Understanding the Relationship between Concepts", ERIC Digest, 1997. http://www.ericdigests.org/1998-1/concept.htm

3. Victor H. Ruiz, "A Knowledge Taxonomy for Army Intelligence Training: An Assessment of the Military Intelligence Basic Officer Leaders Course Using Lundvall's Knowledge Taxonomy", 2010. https://digital.library.txstate.edu/bitstream/handle/10877/3440/fulltext.pdf?sequence=1

4. John Jones, "When Robots Write", Digital Media and Learning (DML) Central, April 14, 2011. http://dmlcentral.net/blog/john-jones/when-robots-write

KEYWORDS: Concept Maps, RDF Stores, Knowledge Bases, Cognitive Science, Machine Understanding, Tagged Data

** TOPIC AUTHOR (TPOC) **
DoD Notice: From April 24 through May 24, 2013, you may talk directly with the Topic Authors (TPOC) to ask technical questions about the topics. Their contact information is listed above. For reasons of competitive fairness, direct communication between proposers and topic authors is not allowed starting May 24, 2013, when DoD begins accepting proposals for this solicitation.

However, proposers may still submit written questions about solicitation topics through the DoD's SBIR/STTR Interactive Topic Information System (SITIS), in which the questioner and respondent remain anonymous and all questions and answers are posted electronically for general viewing until the solicitation closes. All proposers are advised to monitor SITIS (13.2 Q&A) during the solicitation period for questions and answers, and other significant information, relevant to the SBIR 13.2 topic under which they are proposing.
If you have general questions about the DoD SBIR program, please contact the DoD SBIR Help Desk at (866) 724-7457 or by email.
