What do Asian Religions Have in Common? An Unsupervised Text Analytics Exploration
AI-generated Key Points
- The paper explores similarities between various sacred texts using text mining techniques
- Sacred texts can vary based on factors such as geographical location or the time of the birth of a particular religion
- Despite differences, there may be similarities in the lessons taught by these texts
- The study uses Asian texts (Tao Te Ching, Buddhism, Yogasutra, Upanishad) and non-Asian texts (four Bible texts) as the corpus
- Similarity is measured with Euclidean, Manhattan, Jaccard, and Cosine distances applied to the raw Document Term Matrix (DTM) and the normalized DTM
- Supervised learning algorithms such as K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Random Forest are used to predict the correct sacred text for any given chapter in the corpus
- K-means clustering visualizations on Euclidean distances of raw DTM reveal patterns of similarity among the sacred texts
- Upanishads and Tao Te Ching are found to be the most similar texts in the corpus
- The research aims to find similarities between various sacred texts in terms of what they teach and how they teach religious lessons
- Text mining using machine learning and feature extraction is employed to identify patterns in document collections
- Similarity measures such as Euclidean, Manhattan, Jaccard, and Cosine are applied to analyze word frequency matrices and calculate distance matrices on Document Term Matrix formed by LDA (Latent Dirichlet Allocation)
- Supervised learning algorithms including KNN, SVM, and Random Forest are trained on a labeled corpus to predict the origin of fragments of spiritual literature, with prediction accuracy used to measure their effectiveness
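The four distance measures named in the key points can be illustrated on a toy document-term matrix. The following is a minimal sketch, not the authors' implementation: the three short documents are invented placeholders rather than text from the corpus, the helper names (`build_dtm`, `normalize`, and the distance functions) are hypothetical, and computing Jaccard on term presence rather than on counts is an assumption.

```python
import math
from collections import Counter

# Toy stand-ins for chapters; these are NOT passages from the paper's corpus.
docs = {
    "a": "the way is the way",
    "b": "the way is empty",
    "c": "mind is still",
}

def build_dtm(docs):
    """Return (vocab, {name: raw term-count vector}) over a shared vocabulary."""
    counts = {name: Counter(text.lower().split()) for name, text in docs.items()}
    vocab = sorted(set().union(*counts.values()))
    return vocab, {name: [c[w] for w in vocab] for name, c in counts.items()}

def normalize(vec):
    """Scale counts to relative frequencies (one possible normalization)."""
    total = sum(vec)
    return [v / total for v in vec] if total else list(vec)

def euclidean(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def manhattan(u, v):
    return sum(abs(x - y) for x, y in zip(u, v))

def jaccard(u, v):
    """Jaccard distance on term presence (assumption: counts are ignored)."""
    a = {i for i, x in enumerate(u) if x > 0}
    b = {i for i, x in enumerate(v) if x > 0}
    union = a | b
    return 1 - len(a & b) / len(union) if union else 0.0

def cosine(u, v):
    """Cosine distance, i.e. 1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return 1 - dot / (nu * nv) if nu and nv else 1.0

vocab, dtm = build_dtm(docs)
norm_dtm = {name: normalize(vec) for name, vec in dtm.items()}

for measure in (euclidean, manhattan, jaccard, cosine):
    print(measure.__name__, round(measure(dtm["a"], dtm["b"]), 3))
```

Running the same functions on `norm_dtm` instead of `dtm` shows how normalization reduces the influence of document length on the Euclidean and Manhattan distances.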
Authors: Preeti Sah, Ernest Fokoué
Abstract: The main source of religious teachings is sacred texts, which vary from religion to religion based on factors such as the geographical location or the time of birth of a particular religion. Despite these differences, there could be similarities between the sacred texts based on the lessons they teach their followers. This paper attempts to find such similarity using text mining techniques. A corpus consisting of Asian (Tao Te Ching, Buddhism, Yogasutra, Upanishad) and non-Asian (four Bible texts) sources is used to explore similarity measures such as Euclidean, Manhattan, Jaccard, and Cosine on the raw Document Term Frequency [DTM] and the normalized DTM, which reveal similarity based on word usage. The performance of supervised learning algorithms such as K-Nearest Neighbor [KNN], Support Vector Machine [SVM], and Random Forest is measured by their accuracy in predicting the correct sacred text for any given chapter in the corpus. K-means clustering visualizations on Euclidean distances of the raw DTM reveal a pattern of similarity among these sacred texts, with the Upanishads and the Tao Te Ching being the most similar texts in the corpus.
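The chapter-to-text prediction task described in the abstract can be sketched with a small K-Nearest Neighbor classifier. This is an illustrative sketch under stated assumptions, not the paper's pipeline: the labeled fragments and the labels `book_a` and `book_b` are invented, the function names are hypothetical, and Euclidean distance on raw term counts with k = 3 is an arbitrary choice for the example.

```python
import math
from collections import Counter

# Hypothetical labeled fragments; not taken from the paper's corpus.
train = [
    ("the way is the way", "book_a"),
    ("the way is empty", "book_a"),
    ("the way flows like water", "book_a"),
    ("the lord is my shepherd", "book_b"),
    ("praise the lord my soul", "book_b"),
    ("the lord gives wisdom", "book_b"),
]

def vectorize(text, vocab):
    """Raw term-count vector over a fixed vocabulary (unknown words are dropped)."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def euclidean(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def knn_predict(query, train, k=3):
    """Predict a label by majority vote among the k nearest training fragments."""
    vocab = sorted({w for text, _ in train for w in text.lower().split()})
    q = vectorize(query, vocab)
    scored = sorted(
        (euclidean(q, vectorize(text, vocab)), label) for text, label in train
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]

def loo_accuracy(train, k=3):
    """Leave-one-out accuracy: classify each fragment using all the others."""
    hits = 0
    for i, (text, label) in enumerate(train):
        rest = train[:i] + train[i + 1:]
        hits += knn_predict(text, rest, k) == label
    return hits / len(train)

print(knn_predict("the lord is wisdom", train))
print(loo_accuracy(train))
```

Accuracy here is the fraction of held-out fragments assigned to their true source, which mirrors the evaluation criterion the abstract names; on a real corpus a held-out test split or cross-validation would take the place of the toy leave-one-out loop.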