Important Conceptual Questions: Decision Tree

Vikasmishra

Hello Everyone!!

This is a set of questions related to decision trees that can help you sort out any conceptual gaps.

Enjoy!!

Q1: What is the main objective of the decision tree algorithm?

Ans: The goal of a decision tree is to make the optimal choice at each node, so it needs an algorithm that is capable of doing just that. That algorithm is known as Hunt’s algorithm (in Hunt’s algorithm, a decision tree is grown in a recursive fashion by partitioning the training records into successively purer subsets), and it is both greedy and recursive. Greedy means that at each step it makes the best decision available at that point in time; recursive means it splits the larger problem into smaller problems and solves them the same way. The decision to split at each node is made according to a metric called purity. A node is 100% impure when its data is split evenly 50/50 between the classes, and 100% pure when all of its data belongs to a single class.
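To make the greedy and recursive parts concrete, here is a minimal sketch of Hunt’s-style tree growth (assuming numeric features and non-negative integer class labels; the function names and the Gini-based purity score are illustrative choices, not a fixed implementation):

```python
import numpy as np

def gini_impurity(y):
    """Gini impurity: 0 for a pure node, 0.5 for an even 50/50 binary split."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(X, y):
    """Greedy step: pick the feature/threshold with the lowest weighted impurity."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            if len(left) == 0 or len(right) == 0:
                continue
            score = (len(left) * gini_impurity(left) +
                     len(right) * gini_impurity(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t)
    return best

def grow_tree(X, y, depth=0, max_depth=3):
    """Recursive step: stop at a pure node or the depth limit, else split and recurse."""
    split = best_split(X, y)
    if gini_impurity(y) == 0.0 or depth == max_depth or split is None:
        return {"leaf": int(np.bincount(y).argmax())}
    _, j, t = split
    mask = X[:, j] <= t
    return {"feature": j, "threshold": t,
            "left": grow_tree(X[mask], y[mask], depth + 1, max_depth),
            "right": grow_tree(X[~mask], y[~mask], depth + 1, max_depth)}
```

Note how each split is chosen using only the data at that node, with no look-ahead: that is the greedy part, and the recursion handles the rest.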

Q2: What is homogeneity and how do you measure it?

Ans: Homogeneity is the means by which a decision tree decides where (that is, on which feature) it should create its branch. The level of homogeneity is measured using the class probabilities in a node, via measures such as Gini impurity and entropy (see the formulas at the end of this post).

Q3: What are the various split criteria, and in which case would you use which?

Ans: Continuous Target Variable: Reduction in Variance

Categorical Target Variable: Gini Impurity, Information Gain and Chi-Square

Note: Gini impurity and Information Gain can be used for a continuous target variable as well.
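As a quick sketch of how these criteria appear in practice (assuming scikit-learn; note that Chi-square splitting, as used in CHAID, has no built-in scikit-learn option):

```python
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Categorical target: Gini impurity or Information Gain (entropy-based).
clf_gini = DecisionTreeClassifier(criterion="gini")
clf_ig = DecisionTreeClassifier(criterion="entropy")

# Continuous target: reduction in variance, exposed as the squared-error
# criterion (scikit-learn versions before 1.0 called it "mse").
reg = DecisionTreeRegressor(criterion="squared_error")
```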

Q4: What is the advantage of a decision tree over linear/logistic regression?

Ans: Compared to other algorithms, a decision tree requires less effort for data preparation during pre-processing: it does not require normalization of the data, it does not require scaling either, outliers in the dataset have little effect on it, and missing data does not affect it much. It is also easier to explain to management.

Q5: Explain an example where a decision tree will fail to perform:

Ans: A decision tree will generally not perform well when a very large number of features is present in the dataset. As the number of features increases, the depth of the tree also increases (more levels), which creates a high chance of overfitting. It also helps to check the variance of the features during EDA.

Q6: What is overfitting?

Ans: Overfitting is the scenario in which the accuracy on the train data is greater than the accuracy on the test data. In simple terms, your bias (error on the train data) is low while your variance (error on the test data) is high.
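To see this concretely, here is a minimal sketch (assuming scikit-learn; the dataset is synthetic, so exact scores will vary) in which an unconstrained tree scores far better on train than on test:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# An unconstrained tree memorizes the training set.
tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print("train accuracy:", tree.score(X_tr, y_tr))  # typically 1.0
print("test accuracy:", tree.score(X_te, y_te))   # noticeably lower
```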

Q7: Can a regression model overfit? If yes, how?

Ans: Yes, a regression model can overfit. As the number of features increases, variance increases, which results in overfitting. However, this can be minimised through regularization. Another technique is to identify the features that contribute most towards determining the target variable, and consider only those in the scope of the analysis.
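As a sketch of both ideas (assuming scikit-learn; the degree-15 expansion and alpha=1.0 are arbitrary illustrative choices), high-degree polynomial features make a linear model chase the noise, while an L2 penalty (Ridge regularization) shrinks the coefficients back:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=30)

# Many derived features -> plain least squares fits the noise.
overfit = make_pipeline(PolynomialFeatures(degree=15),
                        LinearRegression()).fit(X, y)

# Same features, but the L2 penalty keeps the coefficients small.
regularized = make_pipeline(PolynomialFeatures(degree=15),
                            Ridge(alpha=1.0)).fit(X, y)

# The unpenalized model tracks the training data almost perfectly;
# the ridge model trades a little training accuracy for a smoother fit.
print(overfit.score(X, y), regularized.score(X, y))
```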

Q8: What are the best use cases for a decision tree?

Ans: The best use cases for a decision tree are:

1: If the assumptions of linear regression are not satisfied.

2: If your dataset has a lot of null values.

3: If the data would require too much pre-processing in terms of outliers.

Q9: Why is a decision tree called a greedy technique?

Ans: It is called a greedy algorithm because at each step of creating a sub-node, it checks which split produces sub-nodes with the highest level of homogeneity (that is, the highest purity), without worrying about the overall future outcome.

Q10: What is pruning?

Ans: As we know, a decision tree will keep growing until its splits are completely pure (a 100–0 split) or it reaches a level where no further split is possible. If the tree grows deeper and deeper, it will lead to overfitting. Pruning is a way of limiting the growth of the decision tree, and thereby a way of restricting its overfitting; in practice it can be done with the max_depth parameter of the decision tree.
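For instance (a minimal sketch assuming scikit-learn): max_depth pre-prunes by capping growth, while ccp_alpha post-prunes a fully grown tree via cost-complexity pruning:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Pre-pruning: cap the depth so the tree cannot memorize the data.
shallow = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Post-pruning: grow fully, then collapse weak branches
# (cost-complexity pruning controlled by ccp_alpha).
pruned = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0).fit(X, y)

print("depth capped at:", shallow.get_depth())
print("pruned tree depth:", pruned.get_depth())
```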

Point to Remember: A decision tree is a non-parametric model, as there is no fixed equation behind it. Linear regression, on the other hand, is called a parametric model because it is defined by an equation with a fixed set of parameters (the coefficients).

Q11: Two split criteria, Gini and entropy, are commonly used. What is the difference?

Ans: Both measure the impurity of a split with respect to a feature.

Gini yields smaller values (it peaks at 0.5, whereas entropy peaks at 1) and is computationally more efficient because it avoids the logarithm; this is why Gini is generally the default in tree models. Entropy has a higher computational cost.

Entropy = −p·log₂(p) − q·log₂(q), where p and q are the probabilities of success and failure in a sub-node.

Gini impurity = 1 − (p² + q²), with the same p and q.
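As a small sketch with numpy (matching the binary formulas above, and treating 0·log₂(0) as 0 for a pure node):

```python
import numpy as np

def entropy(p):
    """Entropy of a binary sub-node with success probability p."""
    q = 1.0 - p
    # 0 * log2(0) is taken as 0, so a pure node has zero entropy.
    return -sum(x * np.log2(x) for x in (p, q) if x > 0)

def gini(p):
    """Gini impurity of a binary sub-node with success probability p."""
    q = 1.0 - p
    return 1.0 - (p ** 2 + q ** 2)

print(entropy(0.5), gini(0.5))  # 1.0 and 0.5: a maximally impure 50/50 node
print(entropy(1.0), gini(1.0))  # -0.0 and 0.0: a pure node
```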
