05.12.2012 Views

NASA Scientific and Technical Aerospace Reports


We explore the use of Optimal Mixture Models to represent topics. We analyze two broad classes of mixture models: set-based and weighted. We provide an original proof that estimation of set-based models is NP-hard, and therefore not feasible. We argue that weighted models are superior to set-based models, and the solution can be estimated by a simple gradient descent technique. We demonstrate that Optimal Mixture Models can be successfully applied to the task of document retrieval. Our experiments show that weighted mixtures outperform a simple language modeling baseline. We also observe that weighted mixtures are more robust than other approaches of estimating topical models.

DTIC

Information Retrieval; Mathematical Models; Optimization

20060001853 Massachusetts Univ., Amherst, MA USA<br />

A Conditional Random Field for Discriminatively-Trained Finite-State String Edit Distance

McCallum, Andrew; Bellare, Kedar; Pereira, Fernando; Jan. 1, 2005; 9 pp.; In English

Report No.(s): AD-A440386; No Copyright; Avail.: Defense Technical Information Center (DTIC)

The need to measure sequence similarity arises in information extraction, object identity, data mining, biological sequence analysis, and other domains. This paper presents discriminative string-edit CRFs, a finite-state conditional random field model for edit sequences between strings. Conditional random fields have advantages over generative approaches to this problem, such as pair HMMs or the work of Ristad and Yianilos, because as conditionally-trained methods, they enable the use of complex, arbitrary actions and features of the input strings. As in generative models, the training data does not have to specify the edit sequences between the given string pairs. Unlike generative models, however, our model is trained on both positive and negative instances of string pairs. We present positive experimental results on several data sets.

DTIC

Random Variables; Strings
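
For orientation on the string edit distance entry above, the fixed-cost measure that learned models of this kind generalize can be sketched as follows. This is the classic unit-cost Levenshtein dynamic program, not the CRF model described in the abstract; the function name `edit_distance` and the unit costs are assumptions made purely for illustration.

```python
def edit_distance(source: str, target: str) -> int:
    """Unit-cost Levenshtein distance (illustrative baseline, not the CRF)."""
    # previous[j] holds the distance between the processed prefix of
    # `source` and the first j characters of `target`.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # deleting i characters turns the prefix into ""
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]
```

A discriminatively trained model would replace the fixed costs of 1 with scores learned from positive and negative string pairs; that part is not shown here.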

20060001854 Connecticut Univ., Storrs, CT USA<br />

Mapping Flows onto Networks to Optimize Organizational Processes<br />

Levchuk, Georgiy M.; Levchuk, Yuri N.; Pattipati, Krishna R.; Kleinman, David L.; Jan. 1, 2005; 25 pp.; In English; Original contains color illustrations

Contract(s)/Grant(s): N00014-00-1-0101<br />

Report No.(s): AD-A440387; No Copyright; Avail.: Defense Technical Information Center (DTIC)

Interdependence of tasks in a mission necessitates information flow among the organizational elements (agents) assigned to these tasks. This information flow introduces communication delays. An effective task schedule that minimizes the total execution time, including task processing and coordination delays, is an important issue in designing an organization and its task processing strategy. This paper defines the structure of information-dependent tasks, and describes an approach to map this structure to a network of organizational elements (agents). Since the general problem of scheduling tasks with communication is NP-hard, only fast heuristic (e.g., list scheduling and linear clustering) algorithms are discussed. The authors modify the priority calculation for list scheduling methods, matching the critical path with a network of heterogeneous agents. They then present their algorithm, termed Heterogeneous Dynamic Bottom Level (HDBL), and compare it with various list-scheduling heuristics. The results show that HDBL exhibits superior performance to all list scheduling algorithms, providing an improvement of over 25% in schedule length for communication-intensive task graphs.

DTIC

Command and Control; Information Transfer; Mapping; Networks; Optimization; Organizations; Scheduling
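
The list-scheduling family mentioned in the entry above can be illustrated with a generic sketch. This is not HDBL: it assumes identical agents (the abstract's agents are heterogeneous), a static bottom-level priority, and strictly positive task durations. The function name `list_schedule` and its input layout are hypothetical.

```python
from functools import lru_cache


def list_schedule(durations, edges, num_agents):
    """Schedule a task DAG on identical agents by descending bottom level.

    durations: {task: processing time}, all assumed > 0
    edges: {(pred, succ): communication delay}, paid only across agents
    Returns ({task: (agent, start, finish)}, makespan).
    """
    succs = {t: [] for t in durations}
    preds = {t: [] for t in durations}
    for (u, v), delay in edges.items():
        succs[u].append((v, delay))
        preds[v].append((u, delay))

    @lru_cache(maxsize=None)
    def bottom_level(task):
        # Longest path from this task to an exit task, counting delays.
        return durations[task] + max(
            (delay + bottom_level(s) for s, delay in succs[task]), default=0
        )

    # With positive durations, every predecessor has a strictly larger
    # bottom level than its successors, so this order respects precedence.
    order = sorted(durations, key=bottom_level, reverse=True)
    agent_free = [0] * num_agents
    placed = {}
    for task in order:
        best = None
        for agent in range(num_agents):
            # Data from a predecessor on another agent arrives after a delay.
            ready = max(
                (placed[p][2] + (0 if placed[p][0] == agent else delay)
                 for p, delay in preds[task]),
                default=0,
            )
            start = max(ready, agent_free[agent])
            finish = start + durations[task]
            if best is None or finish < best[2]:
                best = (agent, start, finish)
        placed[task] = best
        agent_free[best[0]] = best[2]
    return placed, max(finish for _, _, finish in placed.values())
```

When communication delays are large relative to processing times, this greedy rule tends to keep dependent tasks on one agent, which is the regime the abstract calls communication-intensive.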

20060001863 Maryland Univ., College Park, MD USA

Searching the Web with SHOE<br />

Heflin, Jeff; Hendler, James; Jan. 1, 2000; 7 pp.; In English; Original contains color illustrations<br />

Contract(s)/Grant(s): DAAL01-97-K-0135<br />

Report No.(s): AD-A440405; No Copyright; Avail.: Defense Technical Information Center (DTIC)

Although search engine technology has improved in recent years, there are still many types of searches that return unsatisfactory results. This situation can be greatly improved if web pages use a semantic markup language to describe their content. We have developed SHOE, a language for this purpose, and in this paper describe a scenario for how the language could be used by search engines of the future. A major challenge to this system is designing a query tool that can exploit the power of a knowledge base while still being simple enough for the casual user. We present the SHOE Search tool, which
