Data augmentation has proven widely effective in computer vision. In Natural Language Processing (NLP), data augmentation remains an area of active research, and there is no widely accepted technique that works well across tasks and model architectures. This study explores data augmentation for text classification on a social media dataset. Popular techniques are implemented, starting with oversampling, Easy Data Augmentation (Wei and Zou, 2019), and Back-Translation (Sennrich et al., 2015). Greyscaling, a relatively unexplored augmentation technique that seeks to mitigate the intensity of adjectives in examples, is also considered. Finally, a few-shot learning approach is evaluated: Pattern-Exploiting Training (PET) (Schick et al., 2020). All experiments use a BERT transformer architecture.
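As a rough illustration of the Easy Data Augmentation family mentioned above, the sketch below implements synonym replacement over tokenized text. The synonym table is a toy stand-in (the entries are illustrative, not from the study, which would typically draw synonyms from a resource such as WordNet):

```python
import random

# Toy synonym table standing in for a real lexical resource (entries are
# illustrative assumptions, not part of the original study).
SYNONYMS = {
    "good": ["fine", "great"],
    "bad": ["poor", "awful"],
    "movie": ["film"],
}

def synonym_replacement(tokens, n=1, rng=random):
    """EDA-style synonym replacement: swap up to n tokens that have synonyms."""
    out = list(tokens)
    # Indices of tokens we know synonyms for.
    candidates = [i for i, t in enumerate(out) if t in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out
```

Each call produces a slightly perturbed copy of the input, so applying it several times per training example yields additional (noisy) labeled data.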
Results show that augmentation techniques provide only minimal and inconsistent improvements. Synonym replacement showed some evidence of performance gains, and adjective scales combined with Greyscaling are an area where further exploration would be valuable. Few-shot learning experiments show consistent improvement over standard supervised training, and seem especially promising when classes are easily separable.
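The Greyscaling idea of toning down adjective intensity can be sketched as a simple lexical substitution. The intensity map below is a hypothetical example (the actual adjective scales used in the study are not reproduced here):

```python
# Hypothetical intensity map: strong adjectives mapped to milder forms.
# These pairs are illustrative assumptions, not the study's actual scales.
GREYSCALE = {
    "fantastic": "good",
    "terrible": "bad",
    "furious": "annoyed",
}

def greyscale(tokens):
    """Replace intense adjectives with milder counterparts, leaving other tokens unchanged."""
    return [GREYSCALE.get(t, t) for t in tokens]
```

Applied to a labeled example, this produces a semantically similar but less emphatic variant, which is the augmentation signal Greyscaling aims to exploit.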
- Tools: Python, Keras, TensorFlow.
- Related documents and code: HERE