The American National Corpus Project
The American National Corpus (ANC) project is a major effort, funded by the National Science Foundation, to build a massive corpus of written texts and transcribed speech in contemporary American English. All of the data are annotated with linguistic analyses of various kinds so that computational linguists can build language models to assist in machine understanding of human language.
The project is based in the Department of Computer Science at Vassar, with Princeton University, Columbia University, and the International Computer Science Institute at UC Berkeley as partners. As many as 10 Vassar students are involved in ANC research projects, ranging from computer and web programming to linguistic analysis, during both the academic year and the summer.
A separate project, funded under National Science Foundation grant IIS-0712911, involves the development and simulation of motion planning algorithms for a particular type of mobile robot: hexagonal metamorphic robots. We are currently developing a complete, deterministic planner that reconfigures a system of these lattice-type robots from any initial shape to any goal shape. The grant funds students during the academic year and/or summer months and provides funds for students to attend robotics conferences.
The following two projects are interdisciplinary collaborations with Jodi Schwarz in Biology. Work with students began during URSI 2008 and is ongoing. Students interested in working on new phases of these projects should contact me.
AiptasiaBase is a database of annotated expressed sequence tags (ESTs) from the symbiotic sea anemone Aiptasia pallida.
The goal of AiptasiaWiki is to create a resource for building custom bioinformatics pipelines and storing the output as wiki pages. Users are able to upload scripts, tools, and datasets, and then chain them together to create pipelines.
Luke is currently collaborating with researchers from the University of Verona, Italy, on applications of temporal networks to the health-care domain.