What dataset training size should I use?

Training size is the share of your dataset used to train your model; the remainder is set aside to validate and test how well the model performs. By default, Analyzr splits your dataset into two equal portions, using 50% for training and 50% for testing.
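
To make the idea of a split concrete, here is a minimal Python sketch of how a dataset could be divided into training and testing portions. It is an illustration only, not Analyzr's actual implementation; the function name `split_rows` and its `train_size` and `seed` parameters are hypothetical.

```python
import random


def split_rows(rows, train_size=0.5, seed=0):
    """Shuffle rows and split them into (train, test) lists.

    Hypothetical helper for illustration; not part of Analyzr.
    """
    if not 0.0 < train_size < 1.0:
        raise ValueError("train_size must be strictly between 0 and 1")
    shuffled = list(rows)
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(shuffled)
    # The first `cut` rows go to training, the rest to testing.
    cut = int(len(shuffled) * train_size)
    return shuffled[:cut], shuffled[cut:]


# A 50/50 split of 1,000 rows, mirroring the default described above.
train, test = split_rows(range(1000), train_size=0.5)
print(len(train), len(test))  # 500 500
```

Shuffling before cutting keeps any ordering in the original file (by date or by customer, for example) from leaking into only one of the two portions.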

If your dataset is small, you may want to increase the share devoted to training, as long as you keep at least a few hundred rows for testing. Conversely, if your dataset is large, you may want to reduce the training size to shorten training time.
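
The rule of thumb above can be sketched as a small Python helper. Every threshold here is an assumption chosen for illustration (300 rows as "a few hundred", 2,000 rows as "small", 100,000 training rows as the point where training time becomes a concern, 80% as the largest training share); none of them are values documented by Analyzr, so adjust them to your own data.

```python
def suggest_train_size(
    n_rows,
    default=0.5,             # Analyzr's default 50/50 split
    min_test_rows=300,       # assumed meaning of "a few hundred rows"
    small_rows=2_000,        # assumed cutoff for a "small" dataset
    max_train_rows=100_000,  # assumed cap to limit training time
    max_fraction=0.8,        # assumed upper bound on the training share
):
    """Return a suggested training fraction between 0 and 1.

    Hypothetical heuristic for illustration; not an Analyzr feature.
    """
    if n_rows <= 0:
        raise ValueError("n_rows must be positive")

    # Large dataset: shrink the training share so that at most
    # max_train_rows rows are used for training.
    if n_rows * default > max_train_rows:
        return max_train_rows / n_rows

    # Small dataset: raise the training share, but only as far as
    # still leaves min_test_rows for testing. If that would fall
    # below the default, keep the default instead.
    if n_rows < small_rows:
        largest_safe = (n_rows - min_test_rows) / n_rows
        return min(max_fraction, max(default, largest_safe))

    # Otherwise keep the default split.
    return default


for n in (1_000, 50_000, 1_000_000):
    print(n, suggest_train_size(n))
```

With these assumed thresholds, a 1,000-row dataset gets a larger training share while still holding out 300 rows, a 50,000-row dataset keeps the default, and a 1,000,000-row dataset trains on a much smaller fraction.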
