2. In machine learning, information gain is used to select attributes when building a decision tree.
Suppose a data set has 10 instances, each of which belongs to exactly one of two classes, class
C1 or class C2. Among the 10 instances, 6 belong to class C1 and 4 belong to class C2. Let A
be an attribute with two attribute values a1 and a2. The number of instances having A = a1 and
belonging to class C1 is 2, the number of instances having A = a2 and belonging to class C1 is 4,
the number of instances having A = a1 and belonging to class C2 is 4, and the number of instances
having A = a2 and belonging to class C2 is 0.
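
The counts above are enough to evaluate the information gain of A numerically. The following is a minimal sketch, assuming the usual definition Gain(A) = H(D) - sum over v of (|D_v| / |D|) * H(D_v) with Shannon entropy in base 2. The helper names entropy and information_gain and the dictionary layout are illustrative choices, not part of the problem statement.

```python
from math import log2


def entropy(counts):
    """Shannon entropy (in bits) of a class distribution given as raw counts."""
    total = sum(counts)
    # Zero counts are skipped, following the convention 0 * log2(0) = 0.
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def information_gain(table):
    """Information gain of an attribute.

    `table` maps each attribute value to [count in C1, count in C2].
    """
    # Column sums give the overall class distribution of the data set.
    class_totals = [sum(col) for col in zip(*table.values())]
    n = sum(class_totals)
    # Entropy after the split: each partition weighted by its share of instances.
    conditional = sum(sum(row) / n * entropy(row) for row in table.values())
    return entropy(class_totals) - conditional


# Counts taken from the problem statement: [C1, C2] per value of A.
counts = {"a1": [2, 4], "a2": [4, 0]}
gain = information_gain(counts)
print(round(gain, 3))
```

With these counts the entropy of the whole data set is about 0.971 bits, the weighted entropy after splitting on A is about 0.551 bits (the a2 partition is pure and contributes 0), and the resulting gain is about 0.420 bits.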