Test vs Validation
Dev set or validation set:
Drawn from the same distribution as the training data. Used during model development to tune hyperparameters and to compare the performance of different models.
Test set:
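May come from the same distribution as the training data or from a different one. Used only after training and tuning are finished, to get an unbiased estimate of the model's performance. It should not be used for tuning. The test set can be skipped if an unbiased estimate is not needed.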
Example of a train/test mismatch: cropped photos for training vs. photos in the wild (low resolution, odd angles) for testing.
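To make the two roles concrete, here is a minimal sketch in plain Python. The two threshold "models" and the toy data are hypothetical and exist only for illustration. The dev set picks the better candidate, and the test set is touched once at the end to report performance.

```python
def accuracy(model, examples):
    """Fraction of (x, label) pairs that the model classifies correctly."""
    correct = sum(1 for x, label in examples if model(x) == label)
    return correct / len(examples)


# Two hypothetical candidate models: simple threshold classifiers.
def model_a(x):
    return x > 0.5


def model_b(x):
    return x > 0.3


# Toy labelled data, made up for this sketch.
dev_set = [(0.1, False), (0.2, False), (0.4, True), (0.6, True), (0.9, True)]
test_set = [(0.05, False), (0.35, True), (0.45, True), (0.7, True)]

candidates = {"model_a": model_a, "model_b": model_b}

# Step 1: compare the candidates on the dev set only.
dev_scores = {name: accuracy(m, dev_set) for name, m in candidates.items()}
best_name = max(dev_scores, key=dev_scores.get)

# Step 2: evaluate the chosen model once on the test set.
test_score = accuracy(candidates[best_name], test_set)

print(dev_scores, best_name, test_score)
```

Because the test set plays no part in choosing between the candidates, its score is not inflated by the selection step.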
How to split data:

| Dataset size | Train | Dev | Test |
| --- | --- | --- | --- |
| Small (10,000 examples) | 60% (6,000) | 20% (2,000) | 20% (2,000) |
| Big (1,000,000 examples) | 98% (980,000) | 1% (10,000) | 1% (10,000) |
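The split ratios can be sketched in a few lines of Python. This is an illustrative helper, not a standard API: the name `split_dataset`, its parameters, and the fixed seed are assumptions made for this example. It shuffles the examples, then slices them into train, dev, and test portions.

```python
import random


def split_dataset(examples, train_frac, dev_frac, seed=0):
    """Shuffle the examples and slice them into train/dev/test lists.

    Hypothetical helper: the test set receives whatever remains after
    the train and dev fractions have been taken.
    """
    items = list(examples)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    n_train = round(len(items) * train_frac)
    n_dev = round(len(items) * dev_frac)
    train = items[:n_train]
    dev = items[n_train:n_train + n_dev]
    test = items[n_train + n_dev:]
    return train, dev, test


# Small dataset: 60% / 20% / 20% of 10,000 examples.
small_train, small_dev, small_test = split_dataset(range(10_000), 0.60, 0.20)

# Big dataset: 98% / 1% / 1% of 1,000,000 examples.
big_train, big_dev, big_test = split_dataset(range(1_000_000), 0.98, 0.01)

print(len(small_train), len(small_dev), len(small_test))
print(len(big_train), len(big_dev), len(big_test))
```

Shuffling before slicing only makes sense when all three sets are meant to share one distribution. If the test data comes from a different source, it would have to be held out separately.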