I have been using random forests in R for classification, where the column to be predicted takes discrete values (for example 0 or 1). For example, for the iris dataset, we can use a random forest to classify the data by species:
myRF <- randomForest(Species ~ ., data = iris, importance = TRUE, proximity = TRUE)
This makes sense because Species can only take certain values. But what if Species could take values from 1 to 100, and I wanted to classify the data into two categories: rows whose value is greater than 50 and rows whose value is less than 50?
Of course, I could add another column whose value is 1 or 0 depending on Species, and then classify on that column rather than on Species. But is there a way to tell RF directly that we want to split our data into 2 categories: one where Species is less than 50 and one where it is greater than 50 (assuming new, made-up values for Species)?
Thanks
myRF <- randomForest(Species < 50 ~ ., ...)
which
- is actually no different from defining a new variable (it still models whether Species is less than 50), but avoids modifying your dataset;
- is only sensible if Species is a continuous rather than a categorical variable (i.e., the ordering of the species numbers is meaningful).
In a more general case, where you want to predict whether a factor takes one of a subset of its values, you can use
randomForest(Species %in% c("level1", "level2", ...) ~ ., ...)
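For comparison, here is a minimal sketch of the explicit-column route from the question, plus the subset-of-levels case. The `species_id` column, the `below_50` and `in_subset` names, and the `"ge50"`/`"lt50"` labels are all made up for illustration; the real iris data has no numeric species. The target is wrapped in `factor()` because, as far as I know, randomForest only does classification when the response is a factor, so it is worth checking how a bare logical response is handled before relying on the one-line formula. The raw `species_id` is also dropped from the predictors, since `.` would otherwise pull it in.

```r
# Hypothetical data: the four iris measurements plus a made-up numeric
# `species_id` in 1..100 (NOT part of the real iris dataset).
set.seed(42)
dat <- iris[, 1:4]
dat$species_id <- sample(1:100, nrow(dat), replace = TRUE)

# Explicit two-level target. Values of exactly 50 fall into "ge50" here;
# the question leaves that boundary case open.
dat$below_50 <- factor(dat$species_id < 50,
                       levels = c(FALSE, TRUE),
                       labels = c("ge50", "lt50"))

# Subset-of-levels case on the real Species factor.
in_subset <- factor(iris$Species %in% c("setosa", "versicolor"))

# Fit only if the randomForest package is installed.
if (requireNamespace("randomForest", quietly = TRUE)) {
  # Drop the raw id so that `.` does not include it as a predictor.
  train <- dat[, names(dat) != "species_id"]
  rf <- randomForest::randomForest(below_50 ~ ., data = train)
  print(rf$confusion)
}
```

Because `species_id` is random here, the forest should do no better than chance; the point is only the shape of the call, not the accuracy.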