Decision Trees are versatile Machine Learning algorithms that can perform both classification and regression tasks. They are powerful algorithms, capable of fitting complex datasets. Decision trees are also the fundamental building blocks of Random Forests, which are among the most powerful Machine Learning algorithms available today.
Decision Trees are used in a wide range of classification and prediction applications.
A Decision Tree is a Supervised Machine Learning algorithm that looks like an inverted tree, wherein each internal node represents a predictor variable (feature), each link between nodes represents a decision, and each leaf node represents an outcome (response variable).
We will be using the party package to build the decision tree.
Use the command below in the R console to install the package. You also have to install the dependent packages, if any.
> install.packages("party")
To create and analyze a decision tree, the party package provides the ctree()
function.
The basic syntax for creating a decision tree in R is −
> ctree(formula, data)
Description of the parameters used −
formula is a formula describing the relationship between the response (outcome) variable and the predictor variables.
data is the data frame containing the variables named in the formula.
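As a purely illustrative sketch of the formula interface (mydata, response, pred1 and pred2 are hypothetical names, not part of the example that follows):
> # predict response from pred1 and pred2, both columns of the data frame mydata
> model <- ctree(response ~ pred1 + pred2, data = mydata)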
Load the party library by executing library(party), then load the readingSkills dataset and view its first few rows with head(readingSkills).
> library(party)
> head(readingSkills)
  nativeSpeaker age shoeSize    score
1           yes   5 24.83189 32.29385
2           yes   6 25.95238 36.63105
3            no  11 30.42170 49.60593
4           yes   7 28.66450 40.28456
5           yes  11 31.88207 55.46085
6           yes  10 30.07843 52.83124
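Before modelling, it can also help to confirm the column types, since ctree() treats the task as classification when the response is a factor. A quick check (output not shown here):
> str(readingSkills)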
As you can see, there are 4 columns: nativeSpeaker, age, shoeSize, and score. We are going to predict whether a person is a native speaker or not using the other three variables, and then check the accuracy of the decision tree model we build (we sketch one way to do this after the plot below).
Create the decision tree model using ctree() and plot it.
> model <- ctree(nativeSpeaker ~ age + shoeSize + score, data= readingSkills)
> model
Conditional inference tree with 8 terminal nodes
Response: nativeSpeaker
Inputs: age, shoeSize, score
Number of observations: 200
1) score <= 43.34602; criterion = 1, statistic = 44.243
  2) shoeSize <= 26.92283; criterion = 0.999, statistic = 13.746
    3) score <= 31.08626; criterion = 1, statistic = 25.616
      4)* weights = 24
    3) score > 31.08626
      5) age <= 6; criterion = 1, statistic = 17.578
        6)* weights = 22
      5) age > 6
        7) score <= 38.68543; criterion = 1, statistic = 16.809
          8)* weights = 13
        7) score > 38.68543
          9)* weights = 11
  2) shoeSize > 26.92283
    10)* weights = 51
1) score > 43.34602
  11) age <= 9; criterion = 0.997, statistic = 10.843
    12)* weights = 29
  11) age > 9
    13) score <= 50.2831; criterion = 1, statistic = 30.938
      14)* weights = 16
    13) score > 50.2831
      15)* weights = 34
Plotting the model.
> plot(model)
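As promised above, we can also check the accuracy of the model. A minimal sketch, assuming a simple 70/30 train/test split (train_idx, train, test, model2 and pred are names introduced here, not part of the original post):
> set.seed(123)                               # for a reproducible split
> train_idx <- sample(nrow(readingSkills), floor(0.7 * nrow(readingSkills)))
> train <- readingSkills[train_idx, ]
> test  <- readingSkills[-train_idx, ]
> model2 <- ctree(nativeSpeaker ~ age + shoeSize + score, data = train)
> pred <- predict(model2, newdata = test)     # predicted class for each test row
> table(Predicted = pred, Actual = test$nativeSpeaker)   # confusion matrix
> mean(pred == test$nativeSpeaker)            # overall accuracy
The exact numbers will depend on the random split, but the confusion matrix shows where the model misclassifies native and non-native speakers.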
From the decision tree shown above, we can conclude that anyone whose readingSkills score is below about 38.7 and whose age is more than 6 is not a native speaker.
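To sanity-check a statement like this for a single case, we can pass a hand-made observation through the model; new_person and its values here are hypothetical:
> new_person <- data.frame(age = 8, shoeSize = 26, score = 35)
> predict(model, newdata = new_person)   # returns the predicted class for this observation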
This brings us to the end of this blog. We really appreciate your time.
Hope you liked it.
Do visit our page www.zigya.com/blog for more informative blogs on Data Science
Keep Reading! Cheers!
Zigya Academy