Richard P. Meier, Chair CLA 4.304, Mailcode B5100, Austin, TX 78712 • 512-471-1701

Talk - Michael White (Ohio State) "Minimal Dependency Length in Realization Ranking"

Fri, May 18, 2012 • 4:00 PM - 5:30 PM • PAR 10

(joint work with Rajakrishnan Rajkumar)
                          
In this talk, after reviewing our approach to surface realization with
Combinatory Categorial Grammar, I'll survey our recent efforts to
incorporate linguistically motivated features into our discriminative
realization ranking model, focusing on experiments investigating
dependency length minimization.  Comprehension and corpus studies have
found that the tendency to minimize dependency length has a strong
influence on constituent ordering choices.  We find that adding a
total dependency length feature to a comprehensive realization ranking
model yields statistically significant improvements in BLEU scores and
significantly reduces the number of heavy/light ordering errors, many
of which are egregious.  Through distributional analyses, we also show
that with simpler ranking models, dependency length minimization can
go overboard, too often sacrificing canonical word order to shorten
dependencies, while richer models manage to better counterbalance the
dependency length minimization preference against (sometimes)
competing canonical word order preferences.
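
As a rough illustration of what a total dependency length feature measures, the sketch below sums the distance in word positions between each word and its head, for two orderings of the same verb phrase. The function name, the head-index encoding, and the choice to count distance as a difference in word positions are all assumptions made for this example; the talk's actual feature definition may differ.

```python
from typing import List, Optional


def total_dependency_length(heads: List[Optional[int]]) -> int:
    """Sum of |dependent position - head position| over all dependencies.

    heads[i] is the position of word i's head, or None for the root.
    (Hypothetical encoding chosen for this sketch.)
    """
    return sum(abs(i - h) for i, h in enumerate(heads) if h is not None)


# "threw out the old trash": particle adjacent to the verb
#   threw(0) out(1) the(2) old(3) trash(4)
particle_first = [None, 0, 4, 4, 0]

# "threw the old trash out": particle after the object noun phrase
#   threw(0) the(1) old(2) trash(3) out(4)
particle_last = [None, 3, 3, 0, 0]

print(total_dependency_length(particle_first))  # 8
print(total_dependency_length(particle_last))   # 10
```

Under these assumptions the ordering with the short particle next to the verb has the lower total, and the gap widens as the object noun phrase gets heavier. A ranking model can use such a total as one feature among many, so that it is weighed against canonical word order preferences rather than deciding the ordering alone.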

Bio:

Dr. Michael White is an Associate Professor in the Department of
Linguistics at The Ohio State University.  After obtaining his
Ph.D. in Computer and Information Science from the University of
Pennsylvania in 1994, Dr. White worked for eight years at CoGenTex,
Inc., where he focused on developing practical applications of natural
language generation technologies during multiple SBIR and DARPA
projects and industrial consulting engagements.  In 2002, Dr. White
crossed the pond to Scotland, where he worked for three years as a
Research Fellow at the University of Edinburgh, managing Edinburgh's
effort on the COMIC dialogue system project as part of the EU's Fifth
Framework Programme.  During this time, Dr. White also took over the
development of the open source OpenCCG library, the first practical
system for parsing and realization with Combinatory Categorial
Grammar.  With his colleagues in Edinburgh, Dr. White developed
grammar-based and data-driven methods for producing utterances that
use prosody to help highlight trade-offs among the available options
that are important to a user.  Since joining the faculty at OSU in
2005, Dr. White has continued to develop OpenCCG, extending it to a
broad-coverage setting, supported in part by an NSF grant on
grammar-based paraphrasing.  He has also conducted research on
evaluation, both on using eye tracking to evaluate prosody in
synthetic speech and on using machine translation metrics to evaluate
surface realization systems.

