Summary:
- OSCAR (Open Super-large Crawled Aggregated coRpus) is a massive multilingual text dataset built from Common Crawl web data by a research project that originated at Inria; it covers more than 150 languages.
- The dataset is designed to help address the global language divide and enable the development of AI models that can understand and communicate in a wide range of languages.
- OSCAR is distributed openly for research as part of an effort to democratize AI and make it more accessible to people around the world, regardless of their language.