Because online auctions and the bidding process are of special interest to economists, researchers have proposed several logics and techniques for predicting end bid prices. These prediction approaches draw on concepts from Data Mining, Fuzzy Logic, Genetic Algorithms, Neuro Fuzzy systems, Grey System Theory, etc.
Data Mining Concepts
Data mining techniques for this problem are typically built on Neural Networks and Bayesian Networks. Forecasting the winning bid price proceeds through four processes: Data Collection, Variable Selection, Data Transformation and Data Mining. Data Collection identifies the available data sources and extracts the data; the collection strategy varies with the objectives. In Variable Selection, the variables chosen for data mining are known as active variables, because they are actively used to differentiate segments, make predictions or perform other data mining operations. Candidate input variables are screened using regression analysis, stepwise regression, discriminant analysis and decision trees, and the final input variables are chosen from these results together with subjective judgment. During Data Transformation, the original data values are transformed into a form suitable for data mining. This step is important for the accuracy and validity of the result, which also depend on how the data analyst decides to structure and present the input variables. Finally, the actual Data Mining phase applies the selected data mining technique to the transformed data.
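The four processes can be sketched end to end in a few lines. This is only an illustration under stated assumptions: the auction records, field names and values are invented, variable selection is approximated by ranking candidates on absolute Pearson correlation (a simple stand-in for stepwise regression), and the mining step uses a nearest-neighbour average in place of a Neural Network or Bayesian Network.

```python
from math import sqrt

# Step 1, Data Collection: hypothetical closed-auction records (made-up values).
RECORDS = [
    {"opening_bid": 10.0, "num_bids": 5, "seller_rating": 4.1, "end_price": 42.0},
    {"opening_bid": 15.0, "num_bids": 9, "seller_rating": 4.8, "end_price": 61.0},
    {"opening_bid": 12.0, "num_bids": 7, "seller_rating": 3.9, "end_price": 50.0},
    {"opening_bid": 30.0, "num_bids": 14, "seller_rating": 4.5, "end_price": 95.0},
    {"opening_bid": 25.0, "num_bids": 11, "seller_rating": 4.0, "end_price": 80.0},
    {"opening_bid": 8.0, "num_bids": 3, "seller_rating": 4.7, "end_price": 30.0},
]
TARGET = "end_price"


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def select_variables(records, target, keep=2):
    """Step 2, Variable Selection: keep the candidates most correlated with the target."""
    ys = [r[target] for r in records]
    candidates = [k for k in records[0] if k != target]
    scores = {k: abs(pearson([r[k] for r in records], ys)) for k in candidates}
    return sorted(scores, key=scores.get, reverse=True)[:keep]


def fit_scaler(records, variables):
    """Step 3, Data Transformation: learn min-max ranges for the active variables."""
    return {v: (min(r[v] for r in records), max(r[v] for r in records)) for v in variables}


def transform(record, scaler):
    """Map each active variable of a record onto the [0, 1] range."""
    return [(record[v] - lo) / (hi - lo) for v, (lo, hi) in scaler.items()]


def predict(query, records, scaler, k=2):
    """Step 4, Data Mining: average the end prices of the k nearest records."""
    q = transform(query, scaler)

    def distance(record):
        return sqrt(sum((a - b) ** 2 for a, b in zip(q, transform(record, scaler))))

    nearest = sorted(records, key=distance)[:k]
    return sum(r[TARGET] for r in nearest) / k


active = select_variables(RECORDS, TARGET)
scaler = fit_scaler(RECORDS, active)
estimate = predict({"opening_bid": 14.0, "num_bids": 8, "seller_rating": 4.2}, RECORDS, scaler)
print(active, estimate)
```

Keeping the scaler separate from the model mirrors the point made above: how the input variables are structured is a decision in its own right, and the same transformation must be applied to new auctions at prediction time.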
Fuzzy Logic
Fuzzy logic uses fuzzy sets, defined by membership functions, in logical expressions to express the extent to which an object belongs to a set. The construction of a fuzzy logic system consists of three major steps: fuzzification, construction of the knowledge base and defuzzification. Fuzzification is the process of converting crisp values to fuzzy values (e.g., low, medium, high). After this mapping, each linguistic variable can have a different membership function value for each linguistic term, which breaks with traditional binary logic, where a case can only belong or not belong to a category. The knowledge base is constructed as a series of "IF-THEN" rules. After fuzzification and fuzzy inference, each input value has a corresponding value for each linguistic term of the output variable. The process of converting these fuzzy values to a corresponding crisp value is called defuzzification. It consists of two main steps. In the first step, a representative value is determined for each term of the linguistic variable. In the second step, the best crisp value for the linguistic result is computed.
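The three steps can be illustrated with a small single-input system. This is a minimal sketch, not a reference design: the triangular membership functions, the linguistic terms, the three rules and the representative price values are all hypothetical, and the defuzzifier is a weighted average of the representative values, which is one common way to carry out the two defuzzification steps.

```python
def triangular(x, left, peak, right):
    """Membership degree of x in a triangular fuzzy set (shoulders allowed)."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical linguistic terms for the input "number of bids" on a 0-20 scale.
INPUT_TERMS = {"low": (0, 0, 10), "medium": (0, 10, 20), "high": (10, 20, 20)}

# Knowledge base: IF bids is <input term> THEN price is <output term>.
RULES = {"low": "cheap", "medium": "moderate", "high": "expensive"}

# Defuzzification step 1: one representative crisp value per output term (assumed).
OUTPUT_VALUES = {"cheap": 20.0, "moderate": 50.0, "expensive": 80.0}


def fuzzify(x):
    """Convert a crisp input into a membership degree for every linguistic term."""
    return {term: triangular(x, *params) for term, params in INPUT_TERMS.items()}


def infer(memberships):
    """Fire every rule; the strength of an output term is the strongest rule behind it."""
    strengths = {}
    for in_term, out_term in RULES.items():
        strengths[out_term] = max(strengths.get(out_term, 0.0), memberships[in_term])
    return strengths


def defuzzify(strengths):
    """Defuzzification step 2: strength-weighted average of the representative values."""
    total = sum(strengths.values())
    return sum(OUTPUT_VALUES[term] * s for term, s in strengths.items()) / total


memberships = fuzzify(12)
price = defuzzify(infer(memberships))
print(memberships, price)
```

An input of 12 bids is partly "medium" and partly "high" at the same time, which is exactly the departure from binary membership described above.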
Genetic Algorithm concepts
Genetic Algorithms (GAs) are search algorithms based on the mechanics of natural selection and genetics; they apply survival of the fittest among string structures to form a search algorithm. GAs are particularly suitable for multi-parameter optimization problems with an objective function subject to numerous hard and soft constraints. The main idea of a GA is to start with a population of solutions to a problem and attempt to produce new generations of solutions that are better than the previous ones. A GA operates through a simple cycle consisting of four stages: initialization, selection, crossover and mutation. In the initialization stage, a population of genetic structures (called chromosomes), randomly distributed in the solution space, is selected as the starting point of the search. These chromosomes can be encoded using a variety of schemes, including binary strings, real numbers or rules. After the initialization stage, each chromosome is evaluated using a user-defined fitness function, whose goal is to numerically encode the performance of the chromosome. For real-world applications of optimization methods such as GAs, the choice of the fitness function is the most critical step. The mating convention for reproduction is such that only the high-scoring members preserve and propagate their worthy characteristics from generation to generation, and thereby help to continue the search for an optimal solution. Chromosomes with high performance may be chosen for replication several times, whereas poorly performing structures may not be chosen at all. This selective process causes the best-performing chromosomes to occupy an increasingly larger proportion of the population over time. Crossover then forms new offspring from two randomly selected "good parents".
Crossover operates by swapping corresponding segments of the string representations of the parents, and it extends the search for new solutions in far-reaching directions. Crossover occurs only with some probability (the crossover rate). Several types of crossover can be performed, including one-point, two-point and uniform crossover. Although reproduction and crossover produce many new strings, they do not introduce any new information into the population at the bit level. Mutation is the GA mechanism that does: a member of the population is chosen at random and one randomly chosen bit in its bit string representation is changed. If the mutant member is feasible, it replaces the member that was mutated in the population. The presence of mutation ensures that the probability of reaching any point in the search space is never zero.
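The cycle of initialization, selection, crossover and mutation can be sketched on a toy problem. Everything specific here is an assumption made for illustration: the fitness function simply counts 1-bits (the "OneMax" toy problem, not an auction model), selection is fitness-proportionate, crossover is one-point, and the population size, rates and seed are arbitrary.

```python
import random


def fitness(chromosome):
    """Toy fitness function: the number of 1-bits in the chromosome."""
    return sum(chromosome)


def one_point_crossover(a, b, rng):
    """Swap the tails of two parents at a random cut point."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(chromosome, rng):
    """Flip one randomly chosen bit and return the mutant copy."""
    i = rng.randrange(len(chromosome))
    mutant = chromosome[:]
    mutant[i] = 1 - mutant[i]
    return mutant


def run_ga(length=20, pop_size=30, generations=40,
           crossover_rate=0.8, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    # Initialization: random binary chromosomes spread over the solution space.
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    best = max(population, key=fitness)
    initial_best = fitness(best)
    for _ in range(generations):
        # Selection: fitter chromosomes are more likely to be picked as parents.
        # The +1 keeps every weight positive even for an all-zero chromosome.
        weights = [fitness(c) + 1 for c in population]
        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = rng.choices(population, weights=weights, k=2)
            # Crossover happens only with probability crossover_rate.
            if rng.random() < crossover_rate:
                c1, c2 = one_point_crossover(p1, p2, rng)
            else:
                c1, c2 = p1[:], p2[:]
            for child in (c1, c2):
                # Mutation: occasionally flip one bit to inject new information.
                if rng.random() < mutation_rate:
                    child = mutate(child, rng)
                offspring.append(child)
        population = offspring[:pop_size]
        # Remember the best chromosome seen so far across all generations.
        best = max(population + [best], key=fitness)
    return best, initial_best


best, initial_best = run_ga()
print(initial_best, fitness(best), best)
```

Every mutant is feasible in this toy problem, so the feasibility check mentioned above is omitted; a constrained problem would test the mutant before letting it replace the original member.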
Neuro Fuzzy technique
Neural networks imitate biological information processing mechanisms and are designed to perform a nonlinear mapping from a set of inputs to a set of outputs. The mapping is carried out by processing elements, called artificial neurons, which are interconnected to form a network divided into layers (usually three): the input layer receives inputs from outside, the output layer sends outputs to the outside, and one or more intermediate layers (hidden layers) connect the input and output layers. A neuro fuzzy system is a fuzzy logic system with a learning algorithm, derived from or inspired by neural network theory, that determines its parameters, including the parameters of the membership functions and the relative importance of each fuzzy rule. The most common approach used to combine the two techniques is the so-called Fuzzy Associative Memory. It uses neural networks to implement the desired mapping for fuzzy systems by applying fuzzy rules to a set of inputs, combining the consequents of each rule, and producing a value for the output variable. Each rule is associated with a weight that represents its importance relative to the other rules in the system. The errors between the results computed by the Fuzzy Associative Memory system and the desired outputs are used to modify the weights. Training continues until the error falls below a certain threshold value. In practice, a fuzzy logic system is first constructed, using the complete knowledge base to describe the relationship between the independent and dependent variables. The knowledge base is then fine-tuned on the training data set using the learning ability of the neural network. Finally, the testing data set is used to validate the obtained model.
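The idea of tuning rule weights from the output error can be sketched as follows. This is a simplified illustration rather than a full Fuzzy Associative Memory: the three rules, their membership parameters and consequent values are hypothetical, the training and testing data are synthetic (generated from hidden weights that the learner does not see), and the weights are adjusted by plain gradient descent on the mean squared error.

```python
def triangular(x, left, peak, right):
    """Membership degree of x in a triangular fuzzy set (shoulders allowed)."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical rule base: (membership parameters of the IF part, crisp THEN value).
RULES = [((0, 0, 10), 0.2), ((0, 10, 20), 0.5), ((10, 20, 20), 0.8)]


def infer(x, weights):
    """Combine rule consequents, each scaled by firing strength and rule weight."""
    mus = [triangular(x, *params) for params, _ in RULES]
    denom = sum(w * m for w, m in zip(weights, mus))
    numer = sum(w * m * c for w, m, (_, c) in zip(weights, mus, RULES))
    return numer / denom, mus, denom


def mean_squared_error(data, weights):
    return sum((infer(x, weights)[0] - t) ** 2 for x, t in data) / len(data)


def train(data, weights, rate=5.0, threshold=1e-8, max_epochs=5000):
    """Adjust rule weights from the output error until it drops below the threshold."""
    weights = list(weights)
    for _ in range(max_epochs):
        if mean_squared_error(data, weights) < threshold:
            break
        grads = [0.0] * len(weights)
        for x, target in data:
            y, mus, denom = infer(x, weights)
            for i, (_, c) in enumerate(RULES):
                # Derivative of the squared error with respect to rule weight i.
                grads[i] += 2 * (y - target) * mus[i] * (c - y) / denom / len(data)
        # Keep weights strictly positive so the normaliser never reaches zero.
        weights = [max(1e-3, w - rate * g) for w, g in zip(weights, grads)]
    return weights


# Synthetic data: the "desired outputs" come from hidden weights (an assumption).
HIDDEN_WEIGHTS = [1.0, 3.0, 1.0]
train_set = [(x, infer(x, HIDDEN_WEIGHTS)[0]) for x in range(0, 21, 2)]
test_set = [(x, infer(x, HIDDEN_WEIGHTS)[0]) for x in range(1, 20, 2)]

initial = [1.0, 1.0, 1.0]
learned = train(train_set, initial)
print(learned, mean_squared_error(test_set, initial), mean_squared_error(test_set, learned))
```

The held-out test set plays the validation role described above: the weights are fitted on one set of inputs and the error is then checked on inputs the training loop never saw.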
Grey System Theory
Grey system theory deals with unascertained systems whose information is partially known and partially unknown, drawing out valuable information by generating and developing the partially known information. It can correctly describe and effectively monitor the operational behaviour of such systems. The name of the theory is based on color: "black" represents unknown information and "white" represents complete information, so a system with partially known and partially unknown information is called a "grey system". Grey system theory has been successfully applied to economic, management, social, industrial and ecological systems, as well as to education, traffic, environmental sciences and geography. It has been used successfully to analyse uncertain systems that have multi-data inputs, discrete data and insufficient data. Grey system theory explores the laws governing a subject's behaviour using functions of sequence operators, according to information coverage. It differs from fuzzy logic in that it emphasizes objects with definite external extensions and vague internal meanings. Table 1 shows the Grey prediction model compared to other traditional forecasting models. It can be seen that this model requires only short-term, current and limited data in order to predict a given value. Grey prediction is a quantitative prediction based on the grey generating function and the GM(1,1) model, which uses the variation within the system to find the relations between sequential data and then establishes the prediction model. The grey forecasting model is derived from the grey system, in which one examines changes within a system to discover a relation between sequence and data; a prediction is then made for the system. The Grey prediction model has the following advantages:
(a) It can be used in situations with relatively limited data, down to as few as four observations, as stated in Table 1.
(b) A few discrete data points are sufficient to characterize an unknown system.
(c) It is suitable for forecasting in competitive environments where decision-makers have access only to limited historical data.
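A GM(1,1) forecast from a handful of observations can be sketched as below. This follows the commonly published textbook formulation (accumulated generating operation, mean-generated background values, least-squares estimates of the development coefficient a and the grey input b); it is an illustration, not necessarily the exact variant used in the works surveyed here, and the price series is invented. The sketch assumes the series is not constant, since a = 0 would make the time-response formula undefined.

```python
from math import exp


def gm11_forecast(series, steps=1):
    """Forecast the next `steps` values of a short positive series with GM(1,1)."""
    n = len(series)
    if n < 4:
        raise ValueError("GM(1,1) is usually applied to at least four observations")

    # Accumulated generating operation (AGO): running sum of the raw series.
    accumulated = []
    total = 0.0
    for value in series:
        total += value
        accumulated.append(total)

    # Background values: means of consecutive accumulated values.
    z = [0.5 * (accumulated[k] + accumulated[k - 1]) for k in range(1, n)]
    y = series[1:]

    # Least-squares fit of x0(k) = -a * z(k) + b (ordinary simple regression).
    m = n - 1
    sum_z, sum_y = sum(z), sum(y)
    sum_zz = sum(v * v for v in z)
    sum_zy = sum(zv * yv for zv, yv in zip(z, y))
    slope = (m * sum_zy - sum_z * sum_y) / (m * sum_zz - sum_z * sum_z)
    a = -slope                        # development coefficient
    b = (sum_y - slope * sum_z) / m   # grey input

    def accumulated_hat(k):
        """Fitted accumulated value at 0-based time index k."""
        return (series[0] - b / a) * exp(-a * k) + b / a

    # Inverse AGO: difference consecutive fitted accumulated values.
    return [accumulated_hat(k) - accumulated_hat(k - 1) for k in range(n, n + steps)]


# Hypothetical end prices from four past auctions, growing by about 10% each time.
prices = [100.0, 110.0, 121.0, 133.1]
next_price = gm11_forecast(prices)[0]
print(round(next_price, 2))
```

Four observations are enough to estimate the two parameters, which reflects advantage (a) above; with fewer points the function refuses to fit.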