Masked Language Modeling with lightning-transformers