When asked about the total number of books in the world, Google’s response is cautious but precise. “The answer changes every time the computation is performed, as we accumulate more data and fine-tune the algorithm. The current number is around 210 million,” says the search engine giant. According to Google, the number of books published in the world is 129,864,880.
But even without Google’s insights, we know the importance of literature in our lives as legendary authors have enlightened us through the centuries.
So what makes an author great? Great writers are visionaries who know how to connect the dots in a plot. They are gifted storytellers who can lead their readers into magical and mystical worlds. The best authors have a knack for expressing their ideas precisely and clearly. But there is a problem. We have many fragments of stories and books without authors. Somehow the link between the stories and their authors has been lost. Can you help us find the author of a mysterious book with limited signals and evidence?
We are looking for awesome data scientists and machine learning engineers to find great patterns in authors’ writing styles and texts.
The dataset is based on English-language literature by 10 famous authors. The train and test data consist of short samples of text, where each sample is a set of 10 sentences. The sentences are included regardless of how many words they contain. Each sample constitutes the X data, and its author is the corresponding Y data.
The training data and test data comprise 18,977 and 6,326 samples, respectively. The dataset was collected over some time to gather works of the best authors from many generations.
- Sample containing 10 sentences of English-language text.
- Author of the corresponding text/sample (10 classes).
Based on a given piece of text from the test data, build a model that predicts the name of the author who wrote it.
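As a starting point, the task can be framed as ordinary multi-class text classification. The sketch below is one possible baseline, not a prescribed solution: a multinomial naive Bayes classifier over word counts, written with only the Python standard library. The class name `NaiveBayesAuthorClassifier`, the helper `tokenize`, the labels `author_a` and `author_b`, and the toy sentences are all hypothetical and stand in for the real train and test samples, whose file format is not described here.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesAuthorClassifier:
    """Multinomial naive Bayes over word counts with add-one smoothing."""

    def fit(self, samples, authors):
        self.word_counts = defaultdict(Counter)  # author -> word -> count
        self.doc_counts = Counter(authors)       # author -> number of samples
        self.vocab = set()
        for text, author in zip(samples, authors):
            tokens = tokenize(text)
            self.word_counts[author].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(samples)
        return self

    def predict_one(self, text):
        tokens = tokenize(text)
        vocab_size = len(self.vocab)
        best_author, best_score = None, -math.inf
        for author, n_docs in self.doc_counts.items():
            counts = self.word_counts[author]
            total = sum(counts.values())
            # Log prior: share of training samples written by this author.
            score = math.log(n_docs / self.total_docs)
            # Log likelihood of each token, smoothed so unseen words
            # do not drive the probability to zero.
            for tok in tokens:
                score += math.log((counts[tok] + 1) / (total + vocab_size))
            if score > best_score:
                best_author, best_score = author, score
        return best_author

    def predict(self, samples):
        return [self.predict_one(text) for text in samples]


# Hypothetical toy data standing in for the real samples (X) and authors (Y).
train_texts = [
    "The ship sailed across the stormy sea. The captain watched the waves.",
    "The sea was calm and the sailors sang on the deck.",
    "The detective examined the letter. A clue lay on the desk.",
    "The inspector questioned the witness about the missing letter.",
]
train_authors = ["author_a", "author_a", "author_b", "author_b"]
test_texts = [
    "The captain steered the ship through the waves.",
    "The detective found a clue in the letter.",
]

model = NaiveBayesAuthorClassifier().fit(train_texts, train_authors)
predictions = model.predict(test_texts)
print(predictions)
```

Word counts capture vocabulary preferences but ignore style signals such as punctuation habits or sentence length, so a baseline like this is mainly useful as a reference score before trying richer features or models.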