SAMBO [4] is used as the base system for aligning large ontologies with the notions of computation and validation sessions. For this purpose, SAMBO was analyzed in detail and its alignment framework was extended to support session-based alignment. The framework architecture describes the alignment algorithm integrated with the computation and validation sessions, along with the recommendation process. Each module is then described separately in this chapter.
Framework Architecture
Figure 4.1 shows the architecture of our session-based framework.
Figure 4.1 Session-based framework for aligning large ontologies
We have divided our framework into three processes, two of which are sessions that are integrated with each other. The computation session (see Section 4.3) computes the suggestions by applying alignment algorithms that include several matchers, combinations and filters. The output of a computation session is used in a validation session (see Section 4.4), where the user either accepts or rejects each suggestion. The conflict checker is applied to the accepted suggestions in order to remove conflicts, and a Partial Reference Alignment (PRA), i.e. a list of the current mappings, is then generated for use in future iterations and in the recommendation process.
The PRA can be used at different steps of ontology alignment [6]. During preprocessing, the PRA can be used to divide the ontologies into mappable parts containing mappings. It can also be used to compute similarities between terms and to filter mapping suggestions. In addition, PRA-based matchers can be created by using the underlying properties of the mappings in the PRA.
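As an illustration of the filtering use, the following minimal sketch shows one simple way a PRA could be used to filter mapping suggestions. It assumes that a mapping is a (source term, target term) pair and that previously rejected suggestions are kept in a second set. The names `Suggestion` and `filter_with_pra` are hypothetical and do not come from SAMBO; the PRA-based filters in [6] may work differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    source: str        # term from the first ontology
    target: str        # term from the second ontology
    similarity: float  # combined similarity value


def filter_with_pra(suggestions, pra, rejected):
    """Drop suggestions the user has already decided on (assumed policy)."""
    decided = set(pra) | set(rejected)
    return [s for s in suggestions if (s.source, s.target) not in decided]


# Illustrative data only; the term names are invented for this example.
pra = {("nose", "Nose")}
rejected = {("nose", "Nasal_Cavity")}
suggestions = [
    Suggestion("nose", "Nose", 0.95),
    Suggestion("nose", "Nasal_Cavity", 0.70),
    Suggestion("ear", "Ear", 0.90),
]

remaining = filter_with_pra(suggestions, pra, rejected)
for s in remaining:
    print(s.source, s.target, s.similarity)
```

With this policy, only suggestions that have not yet been validated are presented again in a later iteration.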
The recommendation process (see Section 4.5) works independently, using the results from both sessions to recommend better matchers and filters for further computation of suggestions.
The framework architecture presented in Figure 4.1 extends the work shown in Figure 2.3. It uses the alignment strategies of SAMBO, adds the notions of computation and validation sessions, and separates the suggestions according to the user's actions. The alignment strategy defined for SAMBO in Figure 2.3 makes it possible to reuse only accepted suggestions, whereas the new framework introduces the PRA and makes it possible to use the rejected suggestions as well. The recommendation process is entirely new in this system and works independently, in parallel with the sessions.
4.3 Computation Session
Figure 4.2 shows the flow of a computation session in the system. The computation session takes two source ontologies as input and applies matchers and filters as its alignment algorithms. The preprocessor determines whether this is the first computation of suggestions or whether a PRA is already available. A PRA becomes available from a validation session, so there is none when suggestions are computed for the first time. The preprocessor also checks whether any session data has been saved from previous work. If saved data is found, the user can either load that session or start a new one. The matchers implement strategies based on linguistic matching, structure-based matching, constraint-based approaches, instance-based strategies, strategies that use auxiliary information, or a combination of these. The matchers calculate similarity values between terms from the different source ontologies. The suggestions are then generated by combining and filtering the results from one or more matchers. The user can select different matchers and set a threshold value to obtain the results. Different matchers, combinations and filters produce different results and therefore different alignment suggestions. The suggestion list generated by the computation session is used as input to the validation session.
Figure 4.2 Computation Session
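The core of the computation session (matchers, combination, filter) can be sketched as follows. This is a minimal sketch under stated assumptions: the two matchers are toy stand-ins for the linguistic matchers, the combination is assumed to be a weighted sum, and the filter is assumed to be a single threshold. All function names are hypothetical and the ontologies are reduced to plain lists of term names.

```python
def ngrams(text, n=3):
    """Set of character n-grams of a term (the whole term if it is too short)."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)} or {text}


def ngram_matcher(a, b):
    """Toy linguistic matcher: Jaccard overlap of character trigrams."""
    ga, gb = ngrams(a), ngrams(b)
    return len(ga & gb) / len(ga | gb)


def exact_matcher(a, b):
    """Toy matcher: 1.0 for case-insensitive equality, otherwise 0.0."""
    return 1.0 if a.lower() == b.lower() else 0.0


def compute_suggestions(terms1, terms2, matchers, weights, threshold):
    """Combine matcher results by weighted sum, then keep pairs >= threshold."""
    total = sum(weights)
    suggestions = []
    for a in terms1:
        for b in terms2:
            sim = sum(w * m(a, b) for m, w in zip(matchers, weights)) / total
            if sim >= threshold:
                suggestions.append((a, b, round(sim, 3)))
    # Highest similarity first, as a suggestion list would be presented.
    return sorted(suggestions, key=lambda s: -s[2])


# Illustrative term lists; a real session would read them from the ontologies.
terms1 = ["nose", "inner ear"]
terms2 = ["Nose", "Ear", "Nasal cavity"]

suggestions = compute_suggestions(
    terms1, terms2,
    matchers=[ngram_matcher, exact_matcher],
    weights=[1.0, 1.0],
    threshold=0.5,
)
print(suggestions)
```

Lowering the threshold or changing the weights yields a different suggestion list, which is the effect of choosing different combinations and filters described above.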
4.4 Validation Session
Figure 4.3 shows the flow of a validation session. The validation session takes the suggestion list generated by the computation session as input. These suggestions are presented to the user, who acts on each suggestion by accepting or rejecting it. Accepting or rejecting a suggestion may influence other suggestions. The conflict checker detects unclassifiable concepts and, on user request, can also remove redundancy. It is applied only to the accepted suggestions, and the resulting set of accepted suggestions is called the Partial Reference Alignment (PRA). The rejected suggestions and the PRA can be used in the recommendation process and in the computation session for re-computing the suggestions. The user can save the session at any time; in that case the system stores the user information together with the matchers and filters applied in the process and the list of generated suggestions. The next time the user loads the session, all previously stored information is restored and available for further use.
Figure 4.3 Validation Session
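The validation step described above can be sketched as follows. The sketch assumes that suggestions are (source term, target term) pairs and that the user's decisions are available as a callback. The conflict check shown here is only a simplified stand-in: it ASSUMES that a term may take part in at most one mapping, whereas the conflict checker of the framework works on the ontologies themselves. All names are hypothetical.

```python
def validate(suggestions, decide):
    """Split suggestions into accepted and rejected using the user's decision."""
    accepted, rejected = [], []
    for pair in suggestions:
        (accepted if decide(pair) else rejected).append(pair)
    return accepted, rejected


def find_conflicts(accepted):
    """Stand-in conflict check (assumption): each term is mapped at most once."""
    seen_sources, seen_targets = set(), set()
    conflicts = []
    for source, target in accepted:
        if source in seen_sources or target in seen_targets:
            conflicts.append((source, target))
        else:
            seen_sources.add(source)
            seen_targets.add(target)
    return conflicts


def build_pra(accepted):
    """The PRA is the list of accepted suggestions without conflicting ones."""
    conflicts = set(find_conflicts(accepted))
    return [pair for pair in accepted if pair not in conflicts]


# Scripted user decisions for illustration; a real session would ask the user.
answers = {
    ("nose", "Nose"): True,
    ("nose", "Nasal_Cavity"): True,
    ("ear", "Ear"): True,
    ("jaw", "Ear"): False,
}

accepted, rejected = validate(list(answers), lambda pair: answers[pair])
pra = build_pra(accepted)
print("PRA:", pra)
print("Rejected:", rejected)
```

Both outputs are kept: the PRA and the rejected list are the inputs that later computations and the recommendation process can draw on.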
4.5 Recommendation Process
Figure 4.4 shows the flow of the recommendation process. The recommendation process is part of the future work for this system; it is only introduced here and has not been implemented in this thesis.
This process will work independently, alongside the computation and validation sessions. The user will only be able to see and use its output. The process will run multiple computation sessions with different combinations of matchers and filters, resulting in multiple suggestion lists, and will compare these lists with the rejected suggestions and the PRA that were generated in validation sessions. Based on this comparison, the process will recommend to the user the matcher and filter combinations that are expected to give better results. The process will generate an XML file containing the recommended settings. The computation session will read that file and show the recommendations to the user before starting a computation; the user may use these settings or define their own.
Figure 4.4 Recommendation Process
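Since the recommendation process is not implemented in this thesis, the following is only a sketch of how such a comparison could look. It assumes that each setting is a pair (matcher names, threshold), that a suggestion list is scored with an F-measure computed over the already validated suggestions, and that the recommended setting is the one with the highest score. The scoring measure, the function names and the XML element names are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def score(suggestions, pra, rejected):
    """F-measure of a suggestion list against validated data (assumed measure)."""
    suggested = set(suggestions)
    correct = len(suggested & pra)     # suggestions the user accepted
    wrong = len(suggested & rejected)  # suggestions the user rejected
    if correct == 0:
        return 0.0
    precision = correct / (correct + wrong)
    recall = correct / len(pra)
    return 2 * precision * recall / (precision + recall)


def recommend(runs, pra, rejected):
    """Return the setting whose suggestion list scores best."""
    return max(runs, key=lambda setting: score(runs[setting], pra, rejected))


def to_xml(setting):
    """Serialize a recommended setting; the element names are hypothetical."""
    matchers, threshold = setting
    root = ET.Element("recommendation")
    for name in matchers:
        ET.SubElement(root, "matcher").text = name
    ET.SubElement(root, "threshold").text = str(threshold)
    return ET.tostring(root, encoding="unicode")


# Illustrative validated data and two computation runs with different settings.
pra = {("nose", "Nose"), ("ear", "Ear")}
rejected = {("jaw", "Ear")}
runs = {
    (("ngram",), 0.4): [("nose", "Nose"), ("jaw", "Ear")],
    (("ngram", "exact"), 0.6): [("nose", "Nose"), ("ear", "Ear")],
}

best = recommend(runs, pra, rejected)
print(to_xml(best))
```

A computation session could parse such a file and present its content as default settings, leaving the user free to override them.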
4.6 Flow of the System
Figure 4.5 shows the sequence diagram for the system. The user selects the ontology files from the ontology source, and the system uploads them. The user can then select the matchers, combinations and filters from their respective sources. When the computation process is started, the system applies the selected matchers, combinations and filters to the uploaded ontologies in order to compute suggestions. All computed suggestions are presented to the user for further action. After the computation has finished, the user validates the suggestions by accepting or rejecting each of them. The rejected suggestions and the Partial Reference Alignment (PRA) are then available to the recommendation process and, together with the recommended settings, to subsequent computations on the same ontologies. Finally, the alignment results are shown to the user.
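The iterative character of this flow can be illustrated with a small driver loop. This is a sketch under simplifying assumptions: the candidate pairs are given up front, a single toy similarity function stands in for the selected matchers, a threshold stands in for the filter, and a scripted callback stands in for the user. The names `run_iteration`, `toy_similarity` and `scripted_user` are hypothetical.

```python
def toy_similarity(a, b):
    """Stand-in matcher: 1.0 if equal ignoring case, 0.5 if one contains the other."""
    a, b = a.lower(), b.lower()
    if a == b:
        return 1.0
    return 0.5 if a in b or b in a else 0.0


def run_iteration(candidates, threshold, decide, pra, rejected):
    """One computation session followed by one validation session."""
    decided = pra | rejected
    # Computation: skip pairs that were validated in earlier iterations.
    suggestions = [
        pair for pair in candidates
        if pair not in decided and toy_similarity(*pair) >= threshold
    ]
    # Validation: every suggestion ends up in the PRA or in the rejected set.
    for pair in suggestions:
        (pra if decide(pair) else rejected).add(pair)
    return suggestions


def scripted_user(pair):
    """Scripted decision for illustration: accept only identical names."""
    return pair[0].lower() == pair[1].lower()


candidates = [("nose", "Nose"), ("ear", "Inner_Ear"), ("jaw", "Ear")]
pra, rejected = set(), set()

first = run_iteration(candidates, 0.8, scripted_user, pra, rejected)
second = run_iteration(candidates, 0.4, scripted_user, pra, rejected)
print(first, second, pra, rejected)
```

In the second iteration the lower threshold produces a new suggestion, while the pair validated in the first iteration is not presented again.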