B) C4.5
The C4.5 algorithm is an improvement on the ID3 algorithm, developed by Ross Quinlan (1993). Like
ID3, it is based on Hunt's algorithm and is implemented serially. C4.5 prunes the tree by
replacing an internal node with a leaf node, thereby reducing the error rate (Podgorelec
et al., 2002). Unlike ID3, C4.5 accepts both continuous and categorical attributes when building
the decision tree. Its enhanced pruning method reduces misclassification errors caused by
noise or excessive detail in the training data set. As in ID3, the data is sorted at every node of
the tree to determine the best splitting attribute. It uses the gain ratio impurity measure to
evaluate the splitting attribute (Quinlan, 1993).
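
To make the gain ratio criterion concrete, here is a minimal Python sketch for categorical attributes only. It is an illustration, not Quinlan's implementation: the function names (`entropy`, `gain_ratio`) and the four-row toy data set are assumptions made for this example, and the handling of continuous attributes and missing values is left out.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def gain_ratio(rows, labels, attr):
    """Gain ratio of splitting on the categorical attribute `attr` (hypothetical helper)."""
    total = len(labels)
    # Group the class labels by the value each row takes for `attr`.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    # Information gain: parent entropy minus the weighted entropy of the partitions.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy(labels) - remainder
    # Split information: entropy of the partition sizes themselves.
    split_info = -sum(len(p) / total * log2(len(p) / total) for p in partitions.values())
    # An attribute with a single value gives split_info == 0; treat it as useless.
    return gain / split_info if split_info > 0 else 0.0

# Toy data, invented for illustration only.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rain", "windy": "no"},
    {"outlook": "rain", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

# Pick the attribute with the highest gain ratio as the splitting attribute.
best = max(["outlook", "windy"], key=lambda a: gain_ratio(rows, labels, a))
```

In this toy data `outlook` separates the two classes perfectly while `windy` carries no information, so `outlook` is chosen. Dividing the gain by the split information is what penalises attributes that fragment the data into many small partitions.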
C) C5.0
The C5.0 algorithm is an extension of the C4.5 algorithm. It is a classification algorithm suited to large data sets, and it improves on C4.5 in both efficiency and memory use. A C5.0 model works by splitting the sample on the field that provides the maximum information gain. Each subset produced by that split is then split again, usually on a different field, and the process continues until the subsets cannot be split any further. Finally, the lowest-level splits are examined, and those subsets that do not contribute significantly to the model are removed (pruned).
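
The splitting procedure described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the actual C5.0 implementation: the names `information_gain` and `grow` are hypothetical, all fields are assumed to be categorical, and the final pruning pass is approximated by a `min_gain` threshold (also an assumption) that stops a split when it contributes almost nothing.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, field):
    """Reduction in entropy obtained by splitting on `field`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[field], []).append(label)
    weighted = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - weighted

def grow(rows, labels, fields, min_gain=1e-9):
    """Recursively split on the field with the largest information gain."""
    majority = Counter(labels).most_common(1)[0][0]
    # Stop when the subset is pure or no fields are left to split on.
    if len(set(labels)) == 1 or not fields:
        return majority
    best = max(fields, key=lambda f: information_gain(rows, labels, f))
    # Crude stand-in for pruning: refuse splits that contribute too little.
    if information_gain(rows, labels, best) < min_gain:
        return majority
    remaining = [f for f in fields if f != best]
    # Partition the sample by the value of the chosen field.
    subsets = {}
    for row, label in zip(rows, labels):
        sub_rows, sub_labels = subsets.setdefault(row[best], ([], []))
        sub_rows.append(row)
        sub_labels.append(label)
    # Each subset is split again, on one of the remaining fields.
    return {best: {value: grow(r, l, remaining, min_gain)
                   for value, (r, l) in subsets.items()}}

# Toy data, invented for illustration only.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rain", "windy": "no"},
    {"outlook": "rain", "windy": "yes"},
]
labels = ["play", "play", "play", "stay"]
tree = grow(rows, labels, ["outlook", "windy"])
```

Here the first split is on `outlook`; the `sunny` subset is already pure and becomes a leaf, while the `rain` subset is split again on `windy`. A real pruning step would instead grow the full tree first and then remove low-value subtrees from the bottom up.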