ISLI (KHU)

Research Labs

1. Linguistic Theory Development Lab

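In the history of research on syntax and semantics, ever since Chomsky introduced generative grammar for the description of natural language in the 1950s, linguistic research has aimed at uncovering the nature of the human mind and the universal principles shared by all human languages. Under these goals, identifying cross-linguistic properties, rather than the characteristics of individual languages, has been regarded as the core of syntactic and semantic research. Despite the achievements of generative grammar, however, the most serious problem to emerge is that the abstractness of the grammar leads to neglect of the empirical side of language.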

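Generative grammar to date, and in particular the mainstream GB theory, has sought to establish a Universal Grammar that can explain the human language faculty on the basis of highly powerful principles and rules. As a result, interest in the phenomena of individual languages has become relatively scarce, and grammatical theory has concentrated only on the interaction of Logical Form (LF) and Phonetic Form (PF) rather than on actual surface structure.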

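In response to these problems and to the need for a new direction, this lab aims to conduct empirical research based on large-scale corpus data. Research on the basic use of lexical items and on their syntactic and semantic use will draw on the BNC (British National Corpus), which contains about 10 million words of spoken data and about 90 million words of written data, and on the Cambridge International Corpus, which contains about 100 million words. For the properties of syntactic constructions, we plan to use the recently completed ICE (International Corpus of English), which contains data not only for American and British English but also for varieties such as those of Hong Kong, Canada, Australia, and New Zealand. We also plan to make active use of ICE-GB, the British English component of ICE, which contains about 1 million words and in which every sentence is tagged with its syntactic structure.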

2. Computational Linguistics Lab

The advent of the information era in this century has increased the importance of processing linguistic information precisely and correctly. Recent developments in artificial intelligence, information science, and other high-technology fields have made it feasible to build computational applications for language processing and understanding. Such applications (e.g. message extraction systems, web-based search engines, machine translation, and dialogue understanding systems) demand increasingly accurate and robust grammars (or parsers) combined with sophisticated statistical processing methods.

Given that the basic units in understanding language are sentences, we cannot overlook the fact that building a reliable syntactic and semantic parser is a prerequisite for language processing. Although several successful morphological analyzers have been developed for Korean, no serious attempts have been made to build a syntactic or semantic parser for it, partly because of the language's structural complexity and partly because of the lack of a reliable grammar-building system. As observed by Kang (1998), research on syntactic and semantic processing in Korea is at a beginning stage and at least 10 to 15 years behind comparable research on English.

Research on the development of English syntactic and semantic parsers has reached a level that allows even real-time applications. For example, the ERG (English Resource Grammar), a part of the LinGO project at CSLI (Center for the Study of Language and Information), was used in the Verbmobil machine translation system and in an NSF-funded project on computer-aided speech generation for people who cannot speak because of disability (cf. Copestake and Flickinger 2000, Flickinger 2002). In Korea, however, few reliable applications exist, in particular ones that go from English to Korean or vice versa.

The urgent need to advance the lagging research on Korean syntactic and semantic parsers provides the motivation for this project.

The purpose of this lab is thus to build a general-purpose system for processing the Korean language that will support both research and practical applications. The goal includes building a broad-coverage Korean grammar that can be used both to extract precise meanings from text input and to generate well-formed text output. To achieve this goal, the project will develop a computationally feasible Korean Resource Grammar and implement it in the LKB (Linguistic Knowledge Builder) system developed by the LinGO (Linguistic Grammars Online) Lab researchers at the CSLI (Center for the Study of Language and Information).

3. Phonology and Phonetics Lab

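Research in phonology and morphology applies the various theoretical frameworks of Optimality Theory, develops the most suitable model among them, and applies that model to actual linguistic data, with the aim of an integrated description and explanation of English phonological and morphological phenomena as a whole.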

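Generative grammar, which replaced structuralist linguistic theory, made the contribution of seeking to explain a wider range of linguistic phenomena in a universal way. After several decades, however, this theory ran into the problem that its devices of rules and derivations force linguistic phenomena into uniformity. The new Optimality Theory is the approach that can reflect the actual facts. Instead of positing an underlying form as the input and transforming it step by step to derive the output, this theory admits a large number of possible output candidates and selects as the optimal output the one that is least penalized by the constraints. A great advantage of the theory is that it is not confined to any particular area such as phonology, morphology, or syntax, but is open to all domains.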
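The selection step described above can be illustrated with a minimal sketch. It assumes the standard Optimality Theory view that constraints are ranked and violable, so candidates are compared by their violation counts from the highest-ranked constraint downward. Everything in the code is hypothetical and chosen only for illustration: the function names, the toy input "pat", the hand-listed candidates, and the crude length-based approximations of the faithfulness constraints. It is not the lab's model.

```python
from typing import Callable, Dict, List, Tuple

# A constraint maps (input, candidate) to a number of violations.
Constraint = Callable[[str, str], int]

VOWELS = set("aeiou")


def segments(form: str) -> str:
    """Strip syllable boundaries (written as '.') from a form."""
    return form.replace(".", "")


def no_coda(inp: str, cand: str) -> int:
    """Markedness: one violation per syllable that ends in a consonant."""
    return sum(1 for syl in cand.split(".") if syl and syl[-1] not in VOWELS)


def max_io(inp: str, cand: str) -> int:
    """Faithfulness (toy version): one violation per deleted segment,
    approximated here by comparing lengths only."""
    return max(0, len(inp) - len(segments(cand)))


def dep_io(inp: str, cand: str) -> int:
    """Faithfulness (toy version): one violation per inserted segment,
    approximated here by comparing lengths only."""
    return max(0, len(segments(cand)) - len(inp))


def evaluate(
    inp: str, candidates: List[str], ranking: List[Constraint]
) -> Tuple[str, Dict[str, Tuple[int, ...]]]:
    """Return the optimal candidate and every candidate's violation profile.

    Profiles are tuples ordered by the ranking, so Python's lexicographic
    tuple comparison models strict domination: a higher-ranked constraint
    always outweighs any number of lower-ranked violations.
    """
    profiles = {c: tuple(con(inp, c) for con in ranking) for c in candidates}
    winner = min(candidates, key=lambda c: profiles[c])
    return winner, profiles


if __name__ == "__main__":
    cands = ["pat", "pa", "pa.ta"]  # faithful, deletion, vowel insertion
    for name, ranking in [
        ("NoCoda >> Dep >> Max", [no_coda, dep_io, max_io]),
        ("NoCoda >> Max >> Dep", [no_coda, max_io, dep_io]),
        ("Max >> Dep >> NoCoda", [max_io, dep_io, no_coda]),
    ]:
        winner, _ = evaluate("pat", cands, ranking)
        print(f"{name}: {winner}")
```

The same candidate set yields a different winner under each ranking, which is how the theory models differences between languages without stepwise derivation: the constraints stay fixed and only their ranking changes.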

4. Applied Linguistics Lab

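Pragmatics has proposed rules of a new kind in order to explain a variety of linguistic phenomena that cannot be handled by traditional semantics, which is a truth-conditional type of theory. The scope of pragmatics can be defined in many ways depending on the nature and purpose of the discussion and on the researcher's theoretical perspective, but it can basically be defined as the field that examines the relation between linguistic structure and the principles of language use in light of the speaker and the context of use (Levinson 1983, Mey 1993). Accordingly, taking English as its object, this research will investigate the various aspects of the ‘pragmatic constraints’ that the preceding discourse context, the following discourse context, or the situational context of language use imposes on the structure and function of a speaker's utterances, focusing on ‘system constraints’, ‘ritual constraints’, and ‘information-functional constraints’.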