Table 1 Settings and explanations of the TreeNet model run

From: Ecological niche modeling of rabies in the changing Arctic of Alaska

| Metric | Setting | Effect | Justification |
| --- | --- | --- | --- |
| Learnrate | AUTO | A detailed but slow model run | Known to provide best results for the algorithm ‘learning’ data |
| Subsample fraction | 50% | Internal testing while model is grown | Standard approach for balanced tree models |
| Logistic residual trim fraction | 0.10 | Fine-tuning | Allows for better fits |
| Huber-M fraction of error squared | 0.90 | Accuracy level | A statistical standard threshold for certainty |
| Optimal logistic model selection | Cross entropy | How to find the optimal model | Usually the best setting for tree-based models |
| Number of trees to build | 1000 | Number of trees tried out for the best solution | This number should substantially overshoot the known optimum |
| Maximum number of nodes | 6 | Determines the node depth of trees used | This number determines whether a ‘stump’ or a fully fit tree is run |
| Terminal node minimum training cases | 10 | For most data cases it provides a robust tree | Number of cases for each tree branch split |
| Maximum number of most-optimal models to save summary results | 1 | Just 1 most-optimal model is saved | |
| Regression loss criterion | Huber-M (blend of LS and LAD) | A statistical metric to express gain vs cost of a new rule | Standard approach in trees |
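
The Huber-M entries in the table (a blend of LS and LAD with a 0.90 fraction of squared error) can be illustrated with a minimal sketch. It assumes the textbook Huber formulation used in gradient boosting, in which the smallest 90% of absolute residuals are penalized quadratically (LS) and the remaining 10% linearly (LAD). The table does not show TreeNet's internal implementation, so the function names `huber_delta` and `huber_loss`, the nearest-rank quantile rule, and the sample residuals are all illustrative assumptions.

```python
import math


def huber_delta(residuals, fraction=0.90):
    """Return the cutoff between the squared-error and absolute-error regimes.

    Assumption: the cutoff is the `fraction` quantile (nearest-rank rule)
    of the absolute residuals, so roughly `fraction` of the cases are
    treated with squared error.
    """
    abs_res = sorted(abs(r) for r in residuals)
    if not abs_res:
        raise ValueError("residuals must not be empty")
    rank = max(1, math.ceil(fraction * len(abs_res)))
    return abs_res[rank - 1]


def huber_loss(residual, delta):
    """Huber loss: quadratic (LS) up to delta, linear (LAD) beyond it."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


# Hypothetical residuals: nine small errors and one large outlier.
residuals = [0.1, -0.2, 0.3, -0.4, 0.5, -0.6, 0.7, -0.8, 0.9, 10.0]
delta = huber_delta(residuals, fraction=0.90)
losses = [huber_loss(r, delta) for r in residuals]

# Compare the outlier's Huber loss with its plain squared-error loss.
outlier_squared = 0.5 * residuals[-1] ** 2
print(delta, losses[-1], outlier_squared)
```

Under these assumptions the outlier contributes far less to the total loss than it would under pure least squares, which is the usual motivation for blending the two criteria.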