Improved Information Retrieval

Daniel Dunlavy, John von Neumann Fellow, 2006-2007

Project Goals

  • Develop a novel Information Retrieval (IR) system that provides access to vast amounts of reference material while presenting only the relevant information

Importance to ASCR, DOE-SC, SNL

  • Foundational tool for Cyber Security R&D in open science
  • Platform for advanced informatics research via efficient, modular systems integration

Technical Approach

  • Retrieve relevant documents
    • Latent Semantic Indexing (LSI)
  • Cluster retrieved documents by topic
    • Adaptive, robust k-means clustering
  • Multi-document summarization of topics
    • Hidden Markov Models (HMM)
    • Non-redundant sentences via pivoted QR
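
The last step above, choosing non-redundant sentences via pivoted QR, can be illustrated with a minimal sketch. This is not the QCS implementation: the function names are hypothetical, raw term counts stand in for whatever term weighting the system uses, and the HMM scoring that precedes this step is omitted. The idea shown is only the column-pivoting part: repeatedly pick the sentence vector with the largest remaining norm, then remove its direction from all other sentence vectors so that near-duplicates lose their weight.

```python
import math


def term_sentence_matrix(sentences):
    """Build one raw term-count column per sentence (toy weighting, an assumption)."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    columns = []
    for s in sentences:
        col = [0.0] * len(vocab)
        for w in s.lower().split():
            col[index[w]] += 1.0
        columns.append(col)
    return columns


def pivoted_qr_select(columns, k, tol=1e-12):
    """Return indices of up to k columns chosen by QR-style column pivoting.

    Each step picks the column with the largest remaining norm, then
    subtracts its direction from every column (modified Gram-Schmidt).
    Columns that are already explained by earlier picks shrink toward zero.
    """
    residual = [col[:] for col in columns]
    chosen = []
    for _ in range(min(k, len(residual))):
        norms = [math.sqrt(sum(x * x for x in col)) for col in residual]
        pivot = max(range(len(residual)), key=lambda j: norms[j])
        if norms[pivot] <= tol:
            break  # nothing new is left to select
        chosen.append(pivot)
        q = [x / norms[pivot] for x in residual[pivot]]
        for j, col in enumerate(residual):
            dot = sum(a * b for a, b in zip(q, col))
            residual[j] = [a - dot * b for a, b in zip(col, q)]
    return chosen


# Hypothetical candidate sentences; the second repeats the first.
sentences = [
    "the storm closed the airport",
    "the storm closed the airport",
    "flights resumed on monday",
]
selected = pivoted_qr_select(term_sentence_matrix(sentences), k=2)
print([sentences[i] for i in selected])
```

In this toy input the repeated sentence is skipped, because once the first copy is selected the second has nothing left after orthogonalization, and the unrelated third sentence is chosen instead.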

Highlights

  • Dunlavy et al., QCS: A System for Querying, Clustering and Summarizing Documents, Information Processing & Management, 43(6), pp. 1588-1605, 2007.
  • Completion of online, interactive system
    • Data available: newswire, medical abstracts

