MultiModal semantic representation

P McKevitt

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Intelligent MultiMedia or MultiModal systems involve the computer processing, understanding and production of inputs and outputs from at least speech, text, and visual information in terms of semantic representations. One of the central questions for these systems is what form of semantic representation should be used, which goes back to the age-old question of knowledge representation in artificial intelligence. When a system processes multimodal input, it needs to map that input into the representation; conversely, there needs to be a mapping out of the representation for multimodal output presentation. There are also related issues of synchronisation of input/output and of information fusion and coordination. Here, we look at current trends in multimodal semantic representation, which are mainly XML- and frame-based, relate our experiences in the development of multimodal systems (CHAMELEON and CONFUCIUS), and conclude that producer/consumer, intention (speech acts), semantic content, and timestamps are four important components of any multimodal semantic representation. In addition, multimodal semantic representations depend on the task at hand and on the system architecture, will be necessary at different levels (media-independent and media-dependent), and will take numerous forms. Semantic representations and content will need to provide for reference and spatial relations, two key recurring problems in multimodal systems.
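The four components named in the abstract (producer/consumer, intention, semantic content, timestamp) can be pictured as a single frame that also serialises to XML. The following is only a minimal sketch under stated assumptions: the `Frame` class, the `to_xml` and `fuse` helpers, the XML element names, the module names, and the one-second fusion window are all hypothetical illustrations and are not taken from CHAMELEON, CONFUCIUS, or the paper itself.

```python
from dataclasses import dataclass, field
from typing import Optional
import xml.etree.ElementTree as ET


@dataclass
class Frame:
    """Hypothetical frame holding the four components named in the abstract."""
    producer: str            # module that created the frame
    consumer: str            # module the frame is addressed to
    intention: str           # speech act, e.g. "query" (labels are assumed)
    content: dict = field(default_factory=dict)  # semantic content: slot -> value
    timestamp: float = 0.0   # seconds on an assumed shared system clock


def to_xml(frame: Frame) -> str:
    """Serialise a frame to an illustrative XML layout (element names are made up)."""
    root = ET.Element(
        "frame",
        producer=frame.producer,
        consumer=frame.consumer,
        timestamp=str(frame.timestamp),
    )
    ET.SubElement(root, "intention").text = frame.intention
    content = ET.SubElement(root, "content")
    for slot, value in frame.content.items():
        ET.SubElement(content, "slot", name=slot).text = str(value)
    return ET.tostring(root, encoding="unicode")


def fuse(speech: Frame, gesture: Frame, window: float = 1.0) -> Optional[Frame]:
    """Naive fusion: merge two frames only if their timestamps are close.

    The time window is an assumption used to illustrate why timestamps
    matter for synchronisation; it is not a method described in the source.
    """
    if abs(speech.timestamp - gesture.timestamp) > window:
        return None
    return Frame(
        producer="fusion",
        consumer=speech.consumer,
        intention=speech.intention,
        content={**speech.content, **gesture.content},
        timestamp=min(speech.timestamp, gesture.timestamp),
    )


# Usage: a spoken deictic ("this") paired with a pointing gesture.
speech = Frame("speech_recogniser", "dialogue_manager", "query",
               {"deictic": "this"}, 10.2)
gesture = Frame("gesture_recogniser", "dialogue_manager", "point",
                {"location": "office_12"}, 10.5)
fused = fuse(speech, gesture)
if fused is not None:
    print(to_xml(fused))
```

In this sketch the pointing gesture supplies the location that the spoken "this" leaves open, which is one simple way to picture the reference problem the abstract mentions; a real system would need a far richer treatment of reference and spatial relations.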
Language: English
Title of host publication: Unknown Host Publication
Editors: H Bunt, K Lee, L Romary, E Krahmer
Place of Publication: Tilburg, The Netherlands
Pages: 1-16
Number of pages: 16
Publication status: Published - Jan 2003
Event: First Working Meeting of the SIGSEM Working Group on the Representation of MultiModal Semantic Information - Tilburg University, Tilburg, The Netherlands
Duration: 1 Jan 2003 → …

Fingerprint

Semantics
Information fusion
Knowledge representation
XML
Artificial intelligence
Synchronization
Processing

Cite this

McKevitt, P. (2003). MultiModal semantic representation. In H. Bunt, K. Lee, L. Romary, & E. Krahmer (Eds.), Unknown Host Publication (pp. 1-16). Tilburg, The Netherlands.