
Panel Discussion at COLING 2002
Semantic Web: A New Challenge for Language Technology
The vision of the Semantic Web as pictured by Berners-Lee, Hendler and
Lassila in their famous 2001 Scientific American article has become a
driving force for many large and small initiatives aiming at turning the
wealth of nearly unstructured digital information into a semantically
structured global knowledge base. New initiatives are underway, welcomed
and supported by industry, government agencies and academic
associations. The Semantic Web seems set to become the major hype of the
first decade of our new millennium.
What is the relevance of the Semantic Web program for our discipline?
Can it provide realistic tasks and useful resources that traditional AI
could not deliver? Will it render obsolete some of our new language
technologies that were dedicated to the exploitation of unstructured data?
Will the Semantic Web need automatic language processing in order to
succeed?
The panel will concentrate on, but not be restricted to, the following
issues:
- The employment of language technology for the construction of useful
  ontologies:
One of the shortcomings of hand-crafted AI ontologies was their
artificial nature. Useful ontologies rarely meet the high aesthetic
standards of philosophers or domain-specialized theoreticians. Can
data-oriented language technology facilitate the detection of useful
ontologies that reflect the needs and daily tasks of their users?
- The exploitation of Semantic Web ontologies for LT applications such
  as information extraction:
Domain modelling is a serious bottleneck for many language technology
applications. Can the Semantic Web movement help us by providing
well-designed ontologies for a multitude of knowledge domains?
- The challenge of (partially) automating the detection and annotation
  of concepts:
One of the major shortcomings of the original Semantic Web vision is its
reliance on extensive hand annotation of large volumes of digital
resources. As we know from daily experience, content developers
(authors) do not even exploit the modest means for encoding
meta-information that is provided by HTML. They do not have the time and
patience to find and insert the most useful hyperlinks. How can one
expect that the web will become semantified by human annotation?
- The utilization of the Semantic Web as a resource for machine
  learning in NLP:
Supervised learning from hand-annotated texts plays a major role in
language technology research and development. Will the Semantic Web
movement create large volumes of annotated texts? Can these texts be
used for machine learning techniques that improve topic detection,
information extraction, question answering and other language
technologies? Can systems for automatic annotation be trained in a
bootstrapping fashion?
- The relationship between the Semantic Web and multilinguality:
The planned dense semantic markup will facilitate cross-lingual
navigation and information retrieval. Will the Semantic Web really
contribute to overcoming language barriers by making information more
accessible across languages? Will contents in all languages be annotated
and crosslinked at the same time and in comparable proportions? What
is the role of language technology in this process? Will the Semantic
Web help to reduce the knowledge gap among language communities, or will
this gap be widened?
- The Semantic Web and language variation:
Most knowledge technologists have given up on the idea of one
comprehensive ontology for all users and all purposes. Preference is
given today to the vision of a wealth of ontologies with many partial
overlaps and mappings. To establish the association between users,
situations and the appropriate ontologies may be an issue for knowledge
management, but the association of the appropriate ontologies to texts
could also become a topic for language technologists. Will ontologies
be marked for certain variants of language such as historical variants,
sociolects or professional and genre-specific jargon? Can those
variants be automatically detected?
Contributors in alphabetical order:
Paul Buitelaar, DFKI, Saarbrücken
Ed Hovy, ISI, Marina del Rey
Chu-Ren Huang, Academia Sinica, Taipei
Nancy Ide, Vassar College, Poughkeepsie
Coordinator
Hans Uszkoreit, DFKI and Saarland University