Knowledge Management and Decision Support Systems: Application to Time Series Forecasting

Authors

Bertan Badur.

This is an open access article distributed under the Creative Commons Attribution License unported 3.0, which permits unrestricted use, distribution, and reproduction in any medium, provided that original work is properly cited.

Abstract

The aim of the study is to develop a framework that integrates knowledge management (KM) and decision support systems (DSS) by using knowledge discovery techniques (KDT). KDT are applied to achieve conversions among different types of knowledge and to create new models from previously defined ones. Rules extracted from neural networks are stored in a model base to achieve knowledge externalization. The CLIQUE algorithm, which is suitable for clustering high dimensional data, is used to generate explicit knowledge by combining decision rules in the model base. The case-based reasoning (CBR) paradigm is utilized for the other types of knowledge conversion, internalization and socialization. CBR enables a newly defined problem to be solved with the help of previous rules. The applicability of the proposed framework is demonstrated by an experimental study that forecasts the change in the US Dollar/Turkish Lira exchange rate.


Introduction
Decision Support Systems (DSS), which support managers in solving semi-structured or ill-structured problems, are becoming more important to organizations in their short and long term business decisions (Turban et al, 2007 and Courtney, 2001). The typical DSS environment has two main limitations. First, the classical decision models are generally constructed from the small portion of organizational information that can be stored in computer based environments, such as purchasing interactions and customer details (Henrichs and Lim, 2003). Second, the structures of decision models are highly generalized; for instance, in a typical system, no more than two generalized models are used to predict customer demand (Shim et al, 2002). More precisely, the organizational information stored in databases is no more than a collection of data captured from different sources. Hence, the information is not in a form that is comprehensible, purposeful and sensible to human intuition, and it is beyond human information processing capability (Keim, 2002). In order to convert the raw data into more comprehensible chunks of information in the decision process, several static decision models, borrowed from operational research, are utilized. These static models quickly become outdated and stop responding accurately to current business problems in the decision making process. Models should instead be adaptive rather than static, so that they respond rapidly to the changing demands of business decision making (Little, 2004). DSS in classical terms are unable to utilize deep and comprehensible knowledge and to provide adaptive decision models.

Communications of the IBIMA

Therefore, decision makers are expected to make the final decision by utilizing the knowledge in their minds, called tacit knowledge (Shim et al, 2002 and Bolloju et al, 2002). At this point, the power of human intuition, knowledge or, at the extreme, human wisdom is not neglected. However, it is clear that classic DSS are insufficient to reveal the knowledge hidden in information and to utilize self-adjustable models that support a more accurate and advanced decision making process (Ergazakis et al, 2002 and Ergazakis et al, 2008). This situation leads us to a new decision support concept that integrates the processes of both knowledge management (KM) and DSS by using knowledge discovery techniques (KDT) (Nemati et al, 2002).
The purpose of this study is to bring a new perspective to decision support systems that applies KDT to the four steps of knowledge conversion (Dierkes et al, 2003) and generates specific decision models automatically by reusing previously defined and stored models as much as possible. In order to achieve this purpose, various KDT are used in the four steps of the knowledge conversion spiral (Nemati et al, 2002), and the case-based reasoning (CBR) paradigm (Kolodner, 1993, Blanzieri and Portinale, 2000 and Montani, 2010) is used for constructing specific decision models for a given problem. Hence, the effectiveness of DSS increases because fresher and more relevant knowledge is brought into decision making. We conducted an experimental study to show the applicability of the proposed framework, although the framework is not tied to a specific problem domain. Forecasting Turkish macroeconomic and financial time series is chosen as the experimental problem domain in the study.
The rest of the paper is organized as follows. The next section presents the basic framework for incorporating knowledge discovery techniques in knowledge management. The proposed framework is explained in full in section 3, its implementation is described in section 4 and conclusions are drawn in the final section.

Basic Framework
Scholars usually define a DSS as a system used to support solving semi-structured or ill-structured problems. However, this definition is very broad for a classical DSS, in which the system is supposed to solve the structured, quantifiable and formulated parts of the problem while the ill-structured parts are worked out through the opinions and judgments of the decision maker. In knowledge management terminology, the DSS uses explicit knowledge, stored in digital environments, to solve the structured part of the problem, while decision makers use tacit knowledge to solve the unstructured part of the decision problem (Nemati et al, 2002). Nemati's framework has been applied in several areas such as customer relationship management (Ranjan and Bhatnagar, 2011), communication management (Wu et al, 2009) and group decision making (Irfan and Uddin-Shaikh, 2010). Independently of computer systems and specifically DSS, the knowledge management literature has modelled a taxonomy of knowledge conversion in organizations (Nonaka, 1994 and Nonaka, 2008). This taxonomy, which is well known and widely accepted, streamlines the transfer of knowledge within the organization and especially highlights the tacit to explicit conversion of individual decision makers.
In recently developed DSS environments, researchers try to find a way to bring the tacit knowledge of the decision maker into the decision support process. Nemati et al (2002) describe the possibility of the architectural integration of DSS, knowledge management and artificial intelligence. Bolloju et al (2002), on the other hand, propose a framework for deploying both tacit and explicit knowledge in a DSS environment; that framework inspired the system proposed in this paper.
In Bolloju's framework, the tacit models of different decision makers are simply stored in operational databases as decision instances. The data warehouse contains historical and current tacit models. Data marts are subsets of data warehouses and serve different functional domains in an organization such as sales, marketing and production. For storing explicit knowledge that is derived by externalization and socialization processes, model marts, model warehouses and model bases are proposed. These external and internal knowledge sources are parallel to the data sources where tacit knowledge is stored. In the study of Bolloju et al (2002), the knowledge conversion from tacit to explicit or vice versa is done as follows:

1. Externalization: Tacit knowledge externalization (Herrgrad, 2000) is achieved by using knowledge discovery techniques in databases such as decision trees (Rokach and Maimon, 2008), neural networks (Bishop, 2006), rough sets (Lin and Cercone, 1996), fuzzy rule discovery (Freitas, 2002) or even hybrid approaches (Zhang and Zhang, 2004). In externalization, the data set is relatively small compared to tacit models; however, it may contain a large number of attributes to reflect the complexity of tacit models.
2. Combination: New explicit models are created by generalization and integration of existing models in data warehouses (Nonaka et al, 2000). This process requires integration of models after resolving the differences between them. In order to perform the combination process, different artificial intelligence and knowledge discovery techniques can be used (Carvalho and Ferreira, 2001). In our study, differences between models are found by using a clustering algorithm called k-means (Mirkin, 2005). Models are clustered based on their statistical properties. A grid-based density clustering algorithm called CLIQUE, on the other hand, is used for integrating the results of existing models (Agrawal et al, 1998 and Aggarwal et al, 1999).
3. Internalization: This process covers dissemination, exploration, analysis/evaluation and dynamic application of explicit models. In order to achieve these tasks, explicit models are visualized for decision makers. The internalization process enables the decision maker to manipulate the output of the model solver and to analyse the explicit models (Tsai and Lee, 2006).

4. Socialization: The socialization process creates new tacit knowledge by sharing and integrating tacit models (Lee and Choi, 2003). Data retrieval and interpretation tools such as OLAP and case-based reasoning (CBR) are useful for performing the socialization process (Wickramasinghe et al, 2004).

Methodology
The integration of KM and DSS enriches the decision making process by providing enhanced knowledge and by generating more specific decision models for each decision problem (Nazem and Shim, 2002). The proposed system utilizes various types of KDT to perform every type of knowledge integration and to use a specific model for each decision. The system also intends to reduce time and human effort significantly, to increase the effectiveness of DSS and thereby to improve the decision-making process.
In order to fulfil this aim, the proposed system has three main phases. The first phase externalizes tacit knowledge into explicit knowledge in the form of symbolic rules (Zhou et al, 2003) and fills the model base with these rules. The explicit knowledge is composed of artificial neural network (ANN) parameters and the symbolic rules that are extracted from ANN models. Externalization is not restricted to ANN results, however. Any technique that can produce symbolic rules, such as decision trees, genetic algorithms or support vector machines (Diederich, 2010), can be deployed as the decision making algorithm.

Each piece of explicit knowledge is placed under a Generalized Episode (GE) (Mannila and Toivonen, 1996) that contains domain knowledge about decision models, so that insertion into and retrieval from the model base are more efficient and effective. The internalization and socialization processes form the second phase of the proposed model; they are carried out with a simple but powerful graphical user interface (GUI) and the CBR paradigm respectively. CBR also makes it possible to use a specific model in problem solving (Diaz-Agudo and Gonzalez-Calero, 2000). The last phase combines similar explicit knowledge that is under the same GE by using a grid-based density clustering algorithm called CLIQUE. The three phases are explained in detail throughout this section.

Phase 1
Rule Extraction: Although neural networks are widely used to predict future values of financial and economic time series (Hansen and Nelson, 2002), the results of ANN are not comprehensible for a decision maker (Benitez et al, 1997). ANN results should therefore be converted into symbolic rules. There are several studies in the literature on extracting rules from ANNs (Baesens, 2003, Ilonen et al, 2003, Mak and Munataka, 2002, Giles et al, 1997, Craven and Shavlik, 1997 and Lu et al, 1995). All of these are based on assumptions about network architecture and are designed for networks that produce binary outputs. These algorithms are therefore not compatible with financial and economic time series forecasting problems, because each problem can be handled by a different network architecture and binary outputs are too simplistic to represent the behaviour of the network. In this study, we develop a new, simple but effective algorithm for extracting symbolic rules from feed forward neural networks (Ilonen et al, 2003). Since the change in financial time series is forecasted in our experimental study, the algorithm divides both the output and input spaces into five distinct clusters (down more, down, same, up and up more) and maps the input space to the output space after the network is trained with user-defined parameters. Our proposed rule extraction algorithm is as follows:

1. Find the set of quartiles (Q1, Q2 and Q3) and the inter-quartile range (IQR) of each input and output.

Generalized Episode Generation: In order to ease the search, retrieval and insertion procedures in explicit knowledge combination and in specific model generation for a problem, all knowledge is indexed under the most suitable GE, which contains domain knowledge about cases. GEs are obtained by clustering the statistical properties of the outputs of existing models with the k-means algorithm (MacQueen, 1967 and Pal and Mitra, 2004). The centre of each cluster constitutes the domain knowledge of the GE. The statistical properties are the slope of the linear trend line, the percent of variation in the dependent variable and the seasonality index.
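Step 1 of the rule extraction algorithm and the five-cluster labelling can be illustrated with a minimal Python sketch. The function names and the example cut points are assumptions made for illustration only; the cluster boundaries actually used in the study are not reproduced here, so they are passed in by the caller rather than derived from the quartiles.

```python
import statistics
from bisect import bisect_right

# The five clusters named in the text, ordered from lowest to highest change.
LABELS = ["down more", "down", "same", "up", "up more"]

def quartile_summary(values):
    """Return Q1, Q2, Q3 and the inter-quartile range of one input or output."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"Q1": q1, "Q2": q2, "Q3": q3, "IQR": q3 - q1}

def to_cluster(value, cut_points):
    """Map a value to one of the five labels.

    cut_points holds four ascending boundaries. How the study derives them
    from the quartiles is not known here, so they are an input (assumption).
    """
    if len(cut_points) != 4 or list(cut_points) != sorted(cut_points):
        raise ValueError("expected four ascending cut points")
    return LABELS[bisect_right(cut_points, value)]

# Illustrative series of changes (not data from the study).
changes = [-3.0, -1.0, 0.0, 1.0, 3.0]
summary = quartile_summary(changes)
example_cuts = (-2.0, -0.5, 0.5, 2.0)  # hypothetical boundaries
labels = [to_cluster(v, example_cuts) for v in changes]
```

Keeping the boundaries as a parameter means the same mapping can be reused once the real cluster boundaries are known.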

These properties are determined as follows:

1. The slope of the linear trend line is the expected rate of change in the predicted output for a given change in time. The equation of the line is as follows:

$$\hat{y} = a + bt \qquad (5)$$

where $\hat{y}$ is the predicted output, $a$ is the intercept, $b$ is the slope of the trend line and $t$ is the independent variable (time).

2. The percent of variation in the dependent variable ($y$) that is explained by the regression equation is represented by $r^2$ (the coefficient of determination).

3. The seasonality index is the difference between the $r^2$ of the regression for the seasonal output and the $r^2$ of the regression for the deseasonalized output. The deseasonalized output is fitted to time as in equation 5, and the $r^2$ of this fit is taken as the $r^2$ of the regression for the deseasonalized output (Heizer and Render, 2010). The deseasonalized output series is found as follows:
3.1. Find the average historical value of each season.
3.2. Calculate the average value over all years by dividing the sum of the average historical values of all seasons by the number of seasons.
3.3. Compute the seasonal index of each season by dividing the average historical value of the season (step 3.1) by the average value over all years (step 3.2).
3.4. Find the deseasonalized output by dividing each value of the time series by its corresponding seasonal index.
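The three statistical properties can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: time is indexed as t = 1, 2, ..., the series starts at the first season, every seasonal index is non-zero, and the function names are hypothetical.

```python
def linear_fit(y):
    """Least-squares fit of y = a + b*t with t = 1..len(y); returns (a, b, r2)."""
    n = len(y)
    t = range(1, n + 1)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    b = sxy / sxx
    a = y_mean - b * t_mean
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    ss_res = sum((yi - (a + b * ti)) ** 2 for ti, yi in zip(t, y))
    # A constant series has no variation to explain; report r2 = 0 by convention.
    r2 = 1.0 - ss_res / ss_tot if ss_tot else 0.0
    return a, b, r2

def deseasonalize(y, period):
    """Steps 3.1 to 3.4: divide each value by the index of its season."""
    season_avg = [sum(y[s::period]) / len(y[s::period]) for s in range(period)]  # 3.1
    overall_avg = sum(season_avg) / period                                        # 3.2
    index = [avg / overall_avg for avg in season_avg]                             # 3.3
    return [yi / index[i % period] for i, yi in enumerate(y)]                     # 3.4

def statistical_properties(y, period):
    """Slope, r2 and seasonality index of one model output series."""
    _, slope, r2_seasonal = linear_fit(y)
    _, _, r2_deseasonalized = linear_fit(deseasonalize(y, period))
    return {
        "slope": slope,
        "r2": r2_seasonal,
        "seasonality": r2_seasonal - r2_deseasonalized,
    }
```

The resulting dictionary is the kind of feature vector that a clustering step such as k-means could group into episodes.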

Phase 2
When a new problem is defined, it is assigned to the best matching GE. The cases that exist under this episode become the neighbours of the current problem. The question here is: which of these cases are the nearest neighbours of the problem? The second phase of the proposed system addresses this question in order to find which cases gain importance when suggesting a solution for a new problem. Three similarity measures are used to calculate the similarity index between previous cases and a new problem and so determine the nearest neighbours of the new problem. These measures, "concept hierarchy distance", "time interval distance" and "distance based on statistical properties", are explained as follows:

1. Concept Hierarchy Distance: Financial and economic time series are arranged in a concept hierarchy in which lower-numbered levels hold the more general concepts.

For instance, the national income account is the most general concept. The second-level concept under the national account is gross domestic product (GDP), and one of the third-level concepts under GDP is public consumption, as shown in Figure 1. In our experimental study, the concept hierarchies of the time series are obtained from the Central Bank of the Republic of Turkey. The distance measure between two time series that are in the same concept hierarchy is calculated as follows:

1.1. Find the common node from which the desired output and the output of a similar case are derived.
1.2. Find the depth of the hierarchy between the common node and the desired output ($depth_o$), where $n$ is the number of nodes between the common node and the desired output and $k$ is a constant term specified by the decision maker.
1.3. Find the depth of the hierarchy between the common node and the output of a similar case ($depth_s$), where $n$ is the number of nodes between the common node and the output of the similar case and $k$ is the constant term specified by the decision maker.
1.4. Find the distance in the concept hierarchy, $CH_i$, which is the concept hierarchy distance between the new case's output and the $i$th similar case's output.

The constant term ($k$) is specified as 0.2 in our experiment when calculating $depth_o$ and $depth_s$. The value of $k$ is arbitrary; it distinguishes the distance between siblings from the distance between a child and its grandparent, because a child is closer to its siblings than to its grandparents. For instance, the distance between public consumption and private consumption is 2, whereas the distance between public consumption and national accounts is 2.2 (Fig. 1). If two nodes belong to totally different hierarchies, we do not consider them neighbours, so the distance between them is not taken into account.

$TI_i$ is the time interval distance between the new case and the $i$th similar case.

3. Distance Based on Statistical Properties: This similarity measure is the sum of the absolute values of the differences between the statistical properties of the new case's output and each existing case's output. The calculation is formulated as follows:

$$SD_i = Dist_i^T + Dist_i^V + Dist_i^S$$

where $SD_i$ is the distance based on statistical properties between the new case and the $i$th existing case, $Dist_i^T$ is the distance between the slopes of the trend lines of the outputs of the new case and the $i$th existing case, $Dist_i^V$ is the distance between the $r^2$ values of the outputs of the new case and the $i$th previous case, and $Dist_i^S$ is the distance between the seasonality indices of the outputs of the new case and the $i$th previous case.
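The distance based on statistical properties can be written as a short Python sketch. The dictionary keys and the function name are assumptions chosen for illustration; the computation itself is the sum of absolute differences described in the text.

```python
PROPERTY_KEYS = ("slope", "r2", "seasonality")

def statistical_distance(new_props, case_props):
    """Sum of absolute differences between the three statistical properties.

    Each argument maps "slope", "r2" and "seasonality" to a number; these key
    names are illustrative, not taken from the original system.
    """
    return sum(abs(new_props[k] - case_props[k]) for k in PROPERTY_KEYS)

# Illustrative values only.
new_case = {"slope": 0.5, "r2": 0.8, "seasonality": 0.1}
old_case = {"slope": 0.2, "r2": 0.9, "seasonality": 0.3}
sd = statistical_distance(new_case, old_case)  # 0.3 + 0.1 + 0.2
```

Because the three properties are on different scales, a practical implementation might normalize them first; the text does not say whether the study does so.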

After all of the similarity measures between the output of the new case and the output of each previous case are found, the similarity index $SI_i$ of each previous case is calculated from them, where $D_c$ is the weight of $CH_i$, $D_t$ is the weight of $TI_i$, $D_d$ is the weight of $SD_i$ and $SI_i$ is the similarity index of the $i$th previous case. The weights of these similarity measures are scaled relative to one of them whose value is equal to 1. In this study $D_c$, $D_t$ and $D_d$ are 1, 0.5 and 0.25 respectively. The decision maker can change these values to increase the importance of any of the similarity measures in new model generation. The similarity index of each previous case is used to put the similar previous cases in descending order, which reflects the importance of a case in new model generation and new rule set formation.
After similar cases are found, a new decision model for the new problem is constructed. The decision maker chooses either to train a new network from scratch or to display a new set of decision rules obtained by combining the rules of similar cases. The three components of a network model, namely the network parameters, the inputs and the number of lags of the inputs, are determined as follows.
1. The parameters of a new network are found as follows:
1.1. If the parameter is quantitative, it is calculated by taking the weighted average of the corresponding variable in similar previous cases.
1.2. If the parameter is qualitative, it is determined by finding the mode of the distinct values of the parameter.
2. Inputs are recommended to the decision maker in descending order. The order of the inputs represents their level of importance in new model construction. Two measures are considered while ordering the inputs.
2.1. The weighted frequency of each input over all previous cases, $Freq_{input}$, is calculated, where $n$ is the number of cases, $x$ is 1 if the input is in the $i$th previous case and 0 otherwise, and $I_i$ is the importance of the $i$th case in the new model, derived from the similarity index of the $i$th case.
2.2. The information gain of each input is calculated. This measure is used to suggest attributes that make important contributions to the rules of previous similar cases. Information gain is calculated by subtracting the entropy of the input from the expected information needed to classify the outputs (Quinlan, 1986 and Daley & Jones, 2004).
3. The number of lags for each input is calculated as the weighted average of the numbers of lags of the inputs in previous cases:

$$Lag = \frac{\sum_{i=1}^{n} I_i \, Lag_i}{\sum_{i=1}^{n} I_i} \qquad (13)$$

where $n$ is the number of inputs, $I_i$ is the order of the input in new case generation and $Lag_i$ is the lag of the $i$th input.

After the parameters and inputs of the new network model are determined, the inputs and the number of lags for each input are displayed to the decision maker. In order to construct the new model, decision makers are free to select inputs among the displayed ones, and they are also free to accept or change the recommended number of lags for each input. As stated before, the higher an input is placed in the order, the more important its contribution to the new model. It is therefore suggested that the decision maker select inputs from the top of the input list to obtain more accurate and reliable results. If the user does not select any of the recommended inputs, the system automatically constructs the new model from the top five inputs in the order.
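The information gain measure used for ranking inputs can be illustrated with the standard entropy-based definition associated with Quinlan (1986). The sketch assumes that inputs and outputs have already been discretized into cluster labels; the function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(input_labels, output_labels):
    """Information needed to classify the outputs minus the expected
    information that remains after splitting on the input."""
    groups = defaultdict(list)
    for x, y in zip(input_labels, output_labels):
        groups[x].append(y)
    total = len(output_labels)
    remaining = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(output_labels) - remaining

# Illustrative labels only: the first input determines the output, the second does not.
outputs = ["up", "up", "down", "down"]
gain_informative = information_gain(["up", "up", "down", "down"], outputs)
gain_uninformative = information_gain(["a", "b", "a", "b"], outputs)
```

An input with a higher gain separates the output clusters more cleanly and would therefore be placed nearer the top of the recommended list.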
When the parameters and input structure of the new model are set, a new decision model for the problem is constructed, either for training the network from scratch or for recommending a new set of decision rules by combining existing ones, according to user preferences.
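The construction of the new model's components can be sketched in Python as follows. The helper names are hypothetical, the case weights are assumed to come from the similarity ranking, and rounding the recommended lag to a whole number is an assumption that the text does not state.

```python
from collections import Counter

def weighted_average(values, weights):
    """Weighted average, used for quantitative network parameters."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def most_common(values):
    """Mode of the distinct values, used for qualitative parameters."""
    return Counter(values).most_common(1)[0][0]

def recommend_lag(lags, weights):
    """Weighted average of the lags in previous cases, in the spirit of
    equation (13). Rounding to an integer is an assumption."""
    return round(weighted_average(lags, weights))

def select_inputs(ranked_inputs, chosen=None, top=5):
    """Use the decision maker's choice, or fall back to the top five inputs."""
    return list(chosen) if chosen else list(ranked_inputs[:top])

# Illustrative values only.
learning_rate = weighted_average([0.1, 0.3], [1, 3])         # quantitative
activation = most_common(["sigmoid", "tanh", "sigmoid"])     # qualitative
lag = recommend_lag([1, 2, 4], [1, 1, 2])
inputs = select_inputs(["EUR", "GBP", "JPY", "ISE", "IGE", "interest", "CPI"])
```

The fallback in `select_inputs` mirrors the behaviour described in the text when the user selects none of the recommended inputs.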

Phase 3
The explicit knowledge obtained from tacit knowledge externalization by extracting ANN rules is combined with other existing knowledge by using a grid-based density clustering algorithm called CLIQUE. This algorithm is chosen because it treats high dimensional data effectively, produces interpretable results and scales with the number of dimensions and the size of the input space. In the original CLIQUE algorithm, the density of each dimension carries equal weight in the units. In order to adapt the algorithm to our problem, the density of each dimension (attribute) is multiplied by a coefficient that reflects the importance of the attribute in previous similar models. With this modification, the inputs of the most similar cases gain more importance than those of less similar cases in the rule combination process. In this study, the number of units is equal to five because changes in financial and macroeconomic time series have been clustered into five distinct values.
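The weighting idea can be illustrated with a simplified Python sketch of the first pass of a CLIQUE-style search, which finds dense one-dimensional units. This is not the full CLIQUE algorithm (it omits the bottom-up generation of higher-dimensional units), and the function name, the coefficient values and the threshold are assumptions for illustration.

```python
from collections import Counter

def dense_units(rows, coefficients, threshold):
    """Find dense one-dimensional units with attribute-weighted density.

    rows: list of dicts mapping attribute name -> unit index (0..4, the five
          clusters of change).
    coefficients: attribute name -> importance weight (hypothetical values).
    A unit is dense when coefficient * fraction_of_rows >= threshold.
    """
    n = len(rows)
    dense = {}
    for attr, coef in coefficients.items():
        counts = Counter(row[attr] for row in rows if attr in row)
        for unit, count in counts.items():
            density = coef * count / n
            if density >= threshold:
                dense[(attr, unit)] = density
    return dense

# Illustrative rule antecedents encoded as unit indices (not data from the study).
rows = [
    {"EUR": 4, "GBP": 1},
    {"EUR": 4, "GBP": 3},
    {"EUR": 4, "GBP": 1},
    {"EUR": 2, "GBP": 0},
]
result = dense_units(rows, {"EUR": 1.0, "GBP": 0.5}, threshold=0.5)
```

With these inputs only the EUR attribute in unit 4 is dense, because the lower coefficient on GBP scales its densities below the threshold.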

Implementation
We assessed the proposed framework by conducting an experiment to forecast changes in Turkish financial and macroeconomic time series. The time series in this study have two frequencies, daily and monthly. The frequency of the system's output follows that of its input: daily inputs produce daily forecasts and monthly inputs produce monthly forecasts. The time series data, obtained from the Central Bank of the Republic of Turkey, are as follows. The daily time series contain four important currency rates in the foreign exchange market (US Dollar/Turkish Lira, Euro/Turkish Lira, Great Britain Pound/Turkish Lira and Japanese Yen/Turkish Lira), the daily closing bids of the Istanbul Stock Exchange (ISE), the simple interest rate and the daily closing prices of the Istanbul Gold Exchange (IGE). The monthly time series are the consumer price index (CPI), the wholesale price index (WPI), gross national product (GNP), the monthly foreign exchange rate, the monthly simple interest rate and monthly gold prices.
The problem has two parts. In the first part, we demonstrate symbolic rule extraction from neural networks to fill the model base, so that the extracted symbolic rules can be used in forecasting. The second part shows how a forecast is generated from previous cases without running any symbolic rule generation procedure. In order to verify the rules, the example validation set covers the first ten days of 2005 (Table 6). Table 6 illustrates the comparison between the actual and the predicted directional movements of the US Dollar.
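A comparison of actual and predicted directional movements can be summarized by a simple hit rate. The sketch below is an illustration only: the function name is hypothetical and the labels are placeholders, not the values reported in Table 6.

```python
def directional_accuracy(actual, predicted):
    """Fraction of periods in which the predicted direction matches the actual one."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("both sequences must be non-empty and of equal length")
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)

# Placeholder labels for illustration; these are not the study's results.
actual = ["up", "down", "same", "up"]
predicted = ["up", "down", "up", "up"]
accuracy = directional_accuracy(actual, predicted)
```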

Conclusion and Further Research
In the classical DSS perspective, decision models are generated by using fractions of organizational information. Moreover, highly generalized model structures are created to suggest solutions for each problem. The deficiency of the classical approach is that it does not utilize previously learned cases in the decision making process. Decision makers therefore have to base their final decisions largely on their tacit knowledge without consulting the DSS. In addition, the classical approach might lead to over-generalization of the rules for specific problems, so the relevance of the DSS recommendations decreases. In order to cope with these problems, this study proposes a new approach that combines concepts of knowledge management and decision support systems by using knowledge discovery techniques.
The proposed system is capable of combining the rule sets of previously defined cases, known as explicit knowledge combination, to constitute a new set of decision rules for a new problem. First, the tacit knowledge that is hidden in a huge amount of data is converted into explicit knowledge by deriving symbolic decision rules from artificial neural network models. Second, the explicit knowledge in previous cases is combined by using a grid-based density algorithm called CLIQUE. In addition to suggesting a solution for the new problem, the proposed methodology generates specific decision models for the new problem by deploying the CBR paradigm.
Besides these capabilities of the new framework, it is also claimed that there is a significant reduction in time and human effort in the decision making process.
At first glance, our experimental study demonstrates the applicability of the system with promising results. However, further study is needed to improve the proposed framework. First, the rule extraction algorithm, which seems sufficient in this case, could be enhanced so that the rules reflect the relations among hidden units. Second, a feedback mechanism that tests the accuracy of rule combination would be the most interesting topic for further study.
The proposed framework is not able to check the accuracy of the new rule sets or to warn the decision maker if a rule is not accurate enough. A utility that enables the system to learn from its mistakes should be added. Like the proposed framework, a standard CBR system can learn from its good experiences. However, a fully functional CBR system must consider its past mistakes as well, to prevent the decision maker from making error-prone decisions. Furthermore, the system could be applied to problem domains other than financial forecasting, so that it is not tied to a specific problem domain.
To sum up, the proposed framework supports the assertion that two different but related concepts, DSS and knowledge management, can benefit from each other through knowledge discovery techniques to give fast and accurate decisions. These ideas will continue to arouse the interest of researchers in the future.

Fig. 1 Example of Concept Hierarchy Distance

2. Time Interval Distance (TI): TI is calculated as the difference between the last date of the time interval of a similar case's inputs and that of the new problem, scaled by the length of the time interval of the new problem's inputs. The time interval is the sample period of the training data.
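The time interval distance described above can be sketched with Python's `datetime` module. The function name and argument names are hypothetical, and taking the absolute value of the date difference is an assumption, since the text does not state the sign convention.

```python
from datetime import date

def time_interval_distance(new_start, new_end, case_end):
    """Difference between the last dates of the two sample periods, scaled by
    the length of the new problem's sample period (all measured in days)."""
    length = (new_end - new_start).days
    if length <= 0:
        raise ValueError("the new problem's interval must have positive length")
    # Absolute value is an assumption about the sign convention.
    return abs((new_end - case_end).days) / length

# Illustrative dates only.
ti = time_interval_distance(date(2005, 1, 1), date(2005, 1, 11), date(2005, 1, 6))
```

A case whose training period ends on the same date as the new problem's period has a distance of zero under this sketch.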

Table 1: Input and Output Cluster Description (boundaries of the five clusters)

Table 5: Extracted Rules by Rule Combination

Rules    US Dollar Average-Up
1        If Average GBP Down and Euro Up More and ISE Total Sales Down
2        If Average GBP Up and Average Euro Up and ISE Total Sales Down
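The two combined rules in Table 5 can be encoded and matched against a case with a few lines of Python. The data layout and function name are assumptions for illustration, the attribute names are copied from the table as they appear, and reading "US Dollar Average-Up" as the shared consequent of both rules is an interpretation of the table header.

```python
# Antecedents of the two rules in Table 5; attribute names are kept verbatim.
RULES = [
    {"Average GBP": "down", "Euro": "up more", "ISE Total Sales": "down"},
    {"Average GBP": "up", "Average Euro": "up", "ISE Total Sales": "down"},
]
CONSEQUENT = "US Dollar Average-Up"  # assumed consequent, from the table header

def fired_rules(case):
    """Return the 1-based numbers of the rules whose conditions all hold."""
    return [
        number
        for number, rule in enumerate(RULES, start=1)
        if all(case.get(attr) == level for attr, level in rule.items())
    ]

# Illustrative case only.
case = {"Average GBP": "down", "Euro": "up more", "ISE Total Sales": "down"}
matches = fired_rules(case)
forecast = CONSEQUENT if matches else None
```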