Retrieval and Annotation

This is an open access article distributed under the Creative Commons Attribution License unported 3.0, which permits unrestricted use, distribution, and reproduction in any medium, provided that original work is properly cited.

Abstract
Our system OSVIRA (Ontology-Based System for Semantic Information Retrieval Visio-conference and Annotation) assists the annotation and semantic search of multimedia conferencing resources. It is based on the use of a dense ontology associated with a thesaurus. It allows several ontologies relating to the same domain to be used seamlessly, with the ability to define equivalence links between concepts and/or relations of the different ontologies considered. Besides specialization relations between concepts, the OSVIRA thesaurus uses linguistic relations to reinforce relations between terms and enlarge the semantics of the vocabulary. OSVIRA allows the content of a pedagogic multimedia resource to be described semantically on the basis of an intuitive annotation model based on the triplet {Object, Relation, Object}. It formally represents this content using conceptual graphs.


Information and Communication Technology (ICT) greatly improves our ways of communicating and of being informed and trained. The emergence of this technology has led to the appearance of a new kind of learning known as e-learning. E-learning is defined as the use of new multimedia technology and the internet to improve the quality of learning by facilitating access to resources and services, the exchange of information, and distance collaboration. At university, pedagogic documents comprise all the information presented in the form of courses, videoconferences, etc. Given the high cost of such resources and the expertise they require, it is of primary importance to facilitate their access, exploitation and reuse. For this reason, the aim of the present study is to propose a system for annotating pedagogic multimedia documents in order to facilitate access to them.
Communications of the IBIMA
Our OSVIRA system (an extension of the CLOVIS model (Charhad, 2005)) is a system of assistance to the annotation and semantic search of pedagogic multimedia resources, more precisely of videoconferences. Based on the use of a dense ontology (Fürst & Trichet, 2006) associated with a thesaurus, OSVIRA allows the content of pedagogic multimedia resources to be described semantically and then represented formally with the help of conceptual graphs. Furthermore, the use of a domain ontology (one that defines the knowledge of a bounded area, e.g. UMLS, the Unified Medical Language System, for medicine) associated with a thesaurus allows our system to take into account several perspectives on the same pedagogic multimedia resources. In contrast with a generic ontology, a domain ontology limits itself to representing the knowledge of a particular domain.
Our choice is motivated by the fact that a domain ontology restricts the interpretation of the concepts it defines to the context specified by the domain. This has the advantage of limiting the ambiguity of the terms that the ontology uses to reference concepts, and thus of making their detection in documents easier.
OSVIRA therefore allows several ontologies relating to the same domain to be used jointly and transparently, thanks to the possibility of defining equivalence links between concepts and/or relations of the different ontologies considered.
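One way to picture such equivalence links is as a union-find structure over (ontology, concept) pairs. The sketch below is only an illustration under that assumption and is not taken from the OSVIRA implementation; the class name `EquivalenceLinks`, the ontology `Onto3` and the concept names are hypothetical.

```python
class EquivalenceLinks:
    """Union-find over (ontology, concept) pairs.

    Hypothetical helper for illustration; not part of OSVIRA.
    """

    def __init__(self):
        self._parent = {}

    def _find(self, item):
        # Register unknown items as their own representative.
        self._parent.setdefault(item, item)
        while self._parent[item] != item:
            # Path halving keeps later lookups short.
            self._parent[item] = self._parent[self._parent[item]]
            item = self._parent[item]
        return item

    def link(self, a, b):
        """Declare that concepts a and b are equivalent."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def equivalent(self, a, b):
        """True if a and b were linked directly or through a chain of links."""
        return self._find(a) == self._find(b)


links = EquivalenceLinks()
links.link(("Onto1", "Doctor"), ("Onto2", "Physician"))
links.link(("Onto2", "Physician"), ("Onto3", "Medical_Practitioner"))
```

With these two links, `links.equivalent(("Onto1", "Doctor"), ("Onto3", "Medical_Practitioner"))` holds by transitivity, so an annotation written with one ontology can be matched by a request written with another.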

Dense Ontology and Thesaurus
Ontology is commonly defined as an explicit and formal specification of a shared conceptualization (Gruber, 1993).
Unlike an annotation tool applied to a written document using an already prepared corpus of keywords, OSVIRA is supported by a group of concepts which it chooses at the time the documents are described. To ensure coherence between the concepts added to our system, the chosen vocabulary must be organized and standardized so as to construct a thesaurus specific to our system. The use of a thesaurus model in OSVIRA allows the knowledge represented in the semantic part to be categorized. For instance, the term doctor can represent the professional category of two well-known pediatricians in Tunisia, Awatef Chariez and Mourad Hamzaoui. Besides the specialization relations between concepts that can be found in a taxonomy (Corcho, Gómez-Pérez, González-Cabero & Suárez-Figueroa, 2004), the thesaurus of OSVIRA uses linguistic relations to reinforce relations between terms and enlarge the semantics of the vocabulary. These linguistic relations can be relations of synonymy, meronymy (relation of composition), hyponymy/hyperonymy (relation of specialization/generalization), and antonymy.
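The role of these linguistic relations in search can be sketched as follows. This is a minimal illustration, assuming a thesaurus stored as labeled term-to-term links; the class `Thesaurus`, its methods and the term "physician" are hypothetical and do not come from OSVIRA.

```python
from collections import defaultdict


class Thesaurus:
    """Terms connected by labeled linguistic relations (illustrative only)."""

    def __init__(self):
        # relation name -> term -> set of related terms
        self._relations = defaultdict(lambda: defaultdict(set))

    def add(self, relation, source, target, symmetric=False):
        self._relations[relation][source].add(target)
        if symmetric:
            self._relations[relation][target].add(source)

    def expand(self, term):
        """Return the term, its direct synonyms, and all transitive hyponyms."""
        expanded = {term} | self._relations["synonym"][term]
        stack = list(expanded)
        while stack:
            current = stack.pop()
            # "hyponym" maps a term to its narrower (more specific) terms.
            for narrower in self._relations["hyponym"][current]:
                if narrower not in expanded:
                    expanded.add(narrower)
                    stack.append(narrower)
        return expanded


thesaurus = Thesaurus()
thesaurus.add("synonym", "doctor", "physician", symmetric=True)
thesaurus.add("hyponym", "doctor", "pediatrician")  # a pediatrician is a kind of doctor
thesaurus.add("antonym", "healthy", "sick", symmetric=True)
```

Here `thesaurus.expand("physician")` yields the terms physician, doctor and pediatrician, so a request phrased with one term can also reach documents indexed with its synonym or with a more specific term.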
The joint use of a dense ontology and a thesaurus proves promising as part of a system for keyword-based document retrieval. It involves describing documents through metadata (author, date, source, etc.). These metadata are represented in the form of triplets: {Subject, Verb, Object} or {Object 1, Relation, Object 2}, according to the type of descriptor required (Vargas-Vera, Matta & Domingue, 2002). To facilitate the annotation of videoconferencing documents and to make its results coherent, we use the principle of dense ontology, in which the concepts and the relations are presented by the triplet {Object, Relation, Object}.

The Model of Conceptual Graphs and Language (OCGL)
Because the semantic content of a videoconference is diverse in terms of the type of information (visual, auditory and sometimes textual), the description and the presentation of this content are two important and complex stages. This diversity opens up a multitude of possibilities for content analysis. At the level of symbolic description, where an audiovisual document is seen as a group of visual objects, persons, etc., the intervention of a human operator proves necessary to describe the semantic content, classify the information and choose for each audiovisual portion a description appropriate to its content. It is at this level that the difficulty of describing the semantic content appears (Charhad, Zrigui & Quénot, 2005). The classical problem at this level results from the specific/generic constraint, which can strongly influence, for instance, the results of an indexing and retrieval system.
To reduce this problem, our system models the annotations related to the different portions of a videoconference in a graphic form that is easily understood by users. This requires a generic model capable of representing the content of the different portions of the videoconference.
Our system provides a model for the presentation of the semantic content of videoconferencing documents. This model allows an integrated and synthetic consideration of the items of information taken from each of the three modalities of a videoconference (image, text, sound). It represents all information in a "standardized" manner to make content-based indexing and retrieval easier. The model is instantiated with the help of Conceptual Graphs (GCs). This choice is justified by the expressiveness of GCs and their adequacy in the context of content-based indexing and information retrieval. The model of Conceptual Graphs is a knowledge representation model which belongs to the family of semantic networks (Sowa, 1984). It is mathematically founded on logic and graph theory (Chein & Mugnier, 1992). To reason with the help of GCs, two approaches can be distinguished: (1) to consider GCs as a graphic interface for logic and therefore to reason with the help of logic, and (2) to consider GCs as a full representation model having its own reasoning mechanisms founded on graph theory. In our study, we adopt the second approach by using projection (a graph operation corresponding to a homomorphism) as the reasoning operator (cf. Figure 1); projection is sound and complete with respect to deduction in first-order logic (Chein & Mugnier, 1992).

Figure 1: GCs Projection Operation
The first graph represents the following knowledge: "A pediatrician is a doctor having examined at least one baby". By using the projection principle, the second graph implies that Awatef Chariez is a doctor specialized in pediatrics. A concept (rectangle) is described by a label and a marker which identifies the considered instance. The marker * denotes an indeterminate instance. A relation (ellipse) is described only by a label.
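The projection operation can be sketched as a brute-force search for a homomorphism from a request graph into an annotation graph. The code below is a simplified illustration, not the algorithm used in OSVIRA: the type hierarchy `SUPERTYPE`, the dictionary encoding of graphs and the function names are assumptions, relation labels are compared by equality only, and the two graphs are not a transcription of Figure 1.

```python
from itertools import product

# Hypothetical type hierarchy (child -> parent); not taken from the paper's figures.
SUPERTYPE = {"Pediatrician": "Doctor", "Doctor": "Person", "Baby": "Person"}


def is_a(concept_type, ancestor):
    """True if concept_type equals ancestor or specializes it."""
    while concept_type is not None:
        if concept_type == ancestor:
            return True
        concept_type = SUPERTYPE.get(concept_type)
    return False


def compatible(query_node, target_node):
    """The target may specialize the query type and must respect a fixed marker."""
    (q_type, q_marker), (t_type, t_marker) = query_node, target_node
    return is_a(t_type, q_type) and q_marker in ("*", t_marker)


def find_projection(query, target):
    """Return one node mapping query -> target preserving all relations, or None."""
    q_ids = list(query["concepts"])
    candidates = [
        [t_id for t_id, t_node in target["concepts"].items()
         if compatible(query["concepts"][q_id], t_node)]
        for q_id in q_ids
    ]
    target_edges = set(target["relations"])
    # Try every combination of compatible target nodes (exponential, fine for a sketch).
    for choice in product(*candidates):
        mapping = dict(zip(q_ids, choice))
        if all((label, mapping[src], mapping[dst]) in target_edges
               for label, src, dst in query["relations"]):
            return mapping
    return None


annotation = {
    "concepts": {"a": ("Pediatrician", "Awatef Chariez"), "b": ("Baby", "*")},
    "relations": [("examine", "a", "b")],
}
request = {
    "concepts": {"x": ("Doctor", "*"), "y": ("Baby", "*")},
    "relations": [("examine", "x", "y")],
}
```

In this example the request "a doctor examines a baby" projects into the annotation because Pediatrician specializes Doctor, whereas the reverse projection fails: the annotation is more specific than the request.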
Ontology and Conceptual Graphs Language (OCGL) (Fürst & Trichet, 2004) is a language based on GCs and devoted to the representation of dense ontologies. Representing an ontology in OCGL principally consists in (1) specifying the conceptual vocabulary of the considered domain and (2) specifying the semantics of this vocabulary with the help of Axioms. The conceptual vocabulary is composed of a group of concepts and a group of relations. These two groups can be structured either by using conceptual properties called Schemata of Axioms (hierarchies of concepts and of relations, independently), or by using domain Axioms, which represent rules and constraints.
The Schemata of Axioms proposed by default in OCGL are: (1) The relation of Specialization/Generalization between two concepts or two relations, used to construct taxonomies (tree or lattice), (2) The property of abstraction of a concept (a concept is said to be abstract if it does not accept direct instances: all its instances are necessarily instances of one of its child concepts),

Extensive Conceptual Description
The annotation model adopted within OSVIRA rests on the triplet {Object, Relation, Object}, which makes it possible to represent the content of videoconferencing portions by simple sentences expressed in natural language, such as "X talks about Y".
The visual objects are made up of entities (logical concepts) that can be linked to one or many videoconferencing portions. A visual object can also be defined as a collection of visual areas which are grouped according to criteria defined by the domain knowledge. These objects should also meet certain conditions, such as semantic coherence and the representation of a real object for the users.
The first object is an item of information related to one or several relations. The interpretation of a videoconferencing portion containing one or several objects is generally linked to a state or an activity. It is possible to generate several more or less different interpretations of the same object concept contained in the same videoconferencing portion.

Because of the variety of videoconferencing content, it often happens that several interpretations can be associated with the same videoconferencing portion, differing especially in their level of abstraction and in the source of information used (audio, picture, text). The source of information is a criterion adopted to distinguish between the different syntaxes of requests. A request based on the visual flow has the following syntax: "look for the videoconferencing portions in which a person X appears". A request based on the audio flow has the form: "look for the videoconferencing portions in which a person X speaks".
The different possible interpretations of the same videoconferencing portion can generate problems when the modeling schema is applied. In other terms, if this schema is used as the index base of an indexing and retrieval system, it is necessary to find means of overcoming problems met at the linguistic level, such as synonymy and homonymy. Suppose, for example, that the user formulates a request using the concept "Doctor": the system already presented in (Yengui & Neji, 2009) will not necessarily recognize that it can also return the documents indexed by the concept "Practitioner", although the two terms are synonymous.
For that reason, an extension of the modeling schema by an associated ontological structure (a kind of knowledge base) is deemed important. This structure will be exploited especially to enrich the description at the level of the modeling schema. It also allows requests to be structured so that the formulated description can be specialized or generalized. Figure 5 describes the architecture of the extended modeling of OSVIRA.
-The second part concerns the task of document processing. We recapitulate our contribution to multi-facet and multimodal modeling based on the conceptual graph formalism. We bring out the concepts and the relations between them for each content. Then, we merge the obtained results into a single conceptual graph.
-The third part of the architecture studies the use of the model's formalism for annotation and search within the content of videoconferencing documents.
-The fourth part details the extension of the model: enriching it by extending the description vocabulary and by exploiting external data in the form of ontology concepts and hierarchies.

Description of the Annotation Model Adopted
Figure 6 illustrates the application of our annotation model to the following request: "a pediatrician examines a patient with measles". In this example, the first user u1 annotates the videoconferencing portion by using the ontology Onto1 and describes "Pediatrician Is_in_front_of Patient" (Pediatrician being the first concept Object, Is_in_front_of being the relation, and Patient being the second concept Object) and "Doctor speaks". The second user u2 annotates by using two ontologies, Onto1 and Onto2, and describes "Patient suffers" without specifying the second concept Object, and "The patient is attacked by an illness called measles" (i.e. a precision of the concept Object Patient, which is here instantiated by measles). Finally, the last user u3 does not annotate the content of the videoconferencing portion but annotates the portion itself as belonging to the domain of Medicine (a similar use of the instantiation of the concept Object to specify the annotation, in this case videoconference: Medicine).
The search for resources starts with the formulation of a request {Object, Relation, Object}, or of a group of requests linked to each other by the operators AND or OR. To formulate requests, the user can either browse the hierarchies or freely express terms, which are then compared with the entries of the thesaurus to find the underlying concepts and relations. For instance, the request "The videoconferences analyzing pediatric diseases manifested by rash" is introduced by the following criteria: (1) "a pediatric illness", represented by the GC (Illness: * Specialize Pediatric: *), where Illness and Pediatric are concepts and Specialize is a relation, and (2) "manifested by rash", represented by a GC composed only of the concept (Rash: *). Any resource for which each of these two graphs projects into one of the graphs representing the annotations of that resource is considered relevant to the request.
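The combination of triplet requests with AND and OR can be sketched as below. This is a deliberately simplified illustration: it matches triplets by exact equality with wildcards, whereas the approach described above relies on thesaurus lookup and on projection of conceptual graphs. The names `Triplet`, `matches` and `satisfies`, the relation `sleeps` and the tuple encoding of requests are assumptions made for the example.

```python
from typing import NamedTuple, Optional


class Triplet(NamedTuple):
    """An {Object, Relation, Object} annotation or request pattern."""
    subject: Optional[str]
    relation: Optional[str]
    obj: Optional[str] = None  # None: second object left unspecified


def matches(pattern, annotation):
    """A pattern field set to None acts as a wildcard."""
    return all(p is None or p == a for p, a in zip(pattern, annotation))


def satisfies(request, annotations):
    """Evaluate a request: a Triplet, or a tuple ("AND" | "OR", sub-request, ...)."""
    if isinstance(request, Triplet):
        return any(matches(request, a) for a in annotations)
    operator, *operands = request
    results = (satisfies(r, annotations) for r in operands)
    return all(results) if operator == "AND" else any(results)


# Annotations of one videoconferencing portion, loosely following users u1 and u2.
portion = [
    Triplet("Pediatrician", "Is_in_front_of", "Patient"),
    Triplet("Doctor", "speaks"),
    Triplet("Patient", "suffers"),
]

request = (
    "AND",
    Triplet("Pediatrician", "Is_in_front_of", "Patient"),
    ("OR", Triplet("Patient", "suffers"), Triplet("Patient", "sleeps")),
)
```

Calling `satisfies(request, portion)` returns `True` here: the first criterion matches u1's annotation and the OR branch is satisfied by "Patient suffers".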

Conclusion
In this article, we introduced a tool for the semantic annotation of videoconferencing documents. This tool makes it possible to describe semantically the content of a pedagogic multimedia resource and to represent this content formally with the aid of conceptual graphs. Our objectives are:
-Exploiting semantic descriptions of the content;
-Making the manipulation of, and access to, large videoconferencing databases easier;
-Regrouping the descriptions taken from different modalities in the same modeling schema.
This tool uses a knowledge base which acts as a means of assistance and later allows the modeling schema to be enriched through a correspondence between the different cases contained in a videoconferencing document.
We can add that this tool could be exploited in a process of indexing and semantic information retrieval within a videoconference.

(1) Define the meaning of components, (2) Define the restrictions on the value of attributes, (3) Define the arguments of a relation.

(3) The property of disjunction between two concepts, (4) The signature of a relation (specifying the concepts linked by the considered relation), (5) The algebraic properties of a relation (symmetry, reflexivity, transitivity, non-reflexivity, anti-symmetry).
are derived from the universal concept). In Figure 3, the relation Inspect (relating the concept dental arches to the concept doctor) specializes the relation Examine (relating the concept organ to the concept doctor).

Figure 2: Extract of a Hierarchy of Concepts of an Ontology OntoMed Devoted to the Domain of Medicine

Figure 5: General Architecture of OSVIRA Model
The general architecture of our model consists of three parts:
-The first consists in collecting a set of videoconferencing documents.

Figure 6: The Application of the Annotation Model {Object, Relation, Object} Multiontology and Multi Users
This annotation model is intuitive, easily comprehensible and corresponds closely to the model of GCs. It allows the content of a resource to be annotated from several angles and, ultimately, by using several ontologies of the same domain. Moreover, including axioms in the ontology allows the annotations of resources to be enriched and makes their indexing easier. For instance, in the context of Figure 6, having specified in ontology Onto1 that the relation Is_in_front_of is symmetrical allows all the annotations to be enriched automatically by adding the proposition (Patient Is_in_front_of Doctor).
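The enrichment by a symmetry axiom can be sketched in a few lines. This is an illustrative assumption about how such a rule might be applied to triplet annotations, not OSVIRA's actual mechanism; the set `SYMMETRIC_RELATIONS` and the function `enrich` are hypothetical names.

```python
# Relations assumed to be declared symmetrical by an axiom of Onto1.
SYMMETRIC_RELATIONS = {"Is_in_front_of"}


def enrich(annotations):
    """Add the mirrored triplet for every annotation whose relation is symmetrical."""
    enriched = set(annotations)
    for subject, relation, obj in annotations:
        if relation in SYMMETRIC_RELATIONS and obj is not None:
            enriched.add((obj, relation, subject))
    return enriched


annotations = {
    ("Pediatrician", "Is_in_front_of", "Patient"),
    ("Doctor", "speaks", None),
}
enriched = enrich(annotations)
```

Applied to u1's annotation, the sketch adds (Patient, Is_in_front_of, Pediatrician); reaching the more general (Patient, Is_in_front_of, Doctor) would additionally require the specialization hierarchy, which this sketch leaves out.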