guptalab/hindisanskritstat

Authors

[Rudra Gohel] [Dev Kabra] [Kirtan Shah]

Statistical Analysis of Hindi and Sanskrit Languages

Overview

This repository contains the code and datasets for a research project on the statistical analysis of the Hindi and Sanskrit languages. The study characterizes the linguistic structure of both languages and explores their relationship with culture and society. The findings have practical applications in cryptanalysis, machine translation, natural language processing, and sentiment analysis.

Research Highlights

  • Dataset Selection: Meticulous selection and evaluation of datasets for both Hindi and Sanskrit languages.

  • Linguistic Aspects Explored:

    • Frequency Analysis
    • Character Grouping
    • Digrams and Trigrams
    • Average Word Length
    • Zipf’s Law
    • Word Entropy
    • N-gram Entropy
  • Encouraging Results:

    • Distinct patterns in character occurrences
    • Structural complexities
    • Adherence to Zipf’s Law in both languages
    • A balanced mix of structured and variable word usage, as indicated by the Word Entropy analysis
  • Comparisons with English:

    • N-gram Entropy comparisons with English for insights into symbol relationships.
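The measures listed above can be illustrated with a minimal Python sketch. It is not taken from the notebooks in this repository: the function names, the sample sentence, and the choice of one Unicode code point as one symbol are all assumptions made for illustration. In Devanagari a vowel sign or virama is a separate code point, so the character grouping used in the actual analysis may differ.

```python
import math
from collections import Counter


def ngram_counts(symbols, n):
    """Count contiguous n-grams (n=2 digrams, n=3 trigrams) in a sequence."""
    return Counter(tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1))


def shannon_entropy(counts):
    """Shannon entropy, in bits, of the empirical distribution given by counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def average_word_length(words):
    """Mean number of code points per word (assumption: code points, not aksharas)."""
    return sum(len(w) for w in words) / len(words)


def zipf_slope(word_counts):
    """Least-squares slope of log(frequency) against log(rank).

    A slope near -1 is the classic Zipf's Law signature.
    Needs at least two distinct words.
    """
    freqs = sorted(word_counts.values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Made-up sample text, used only to exercise the functions.
sample = "राम वन गया राम घर गया"
words = sample.split()
chars = [ch for ch in sample if not ch.isspace()]

print("most common digrams:", ngram_counts(chars, 2).most_common(3))
print("word entropy (bits):", shannon_entropy(Counter(words)))
print("average word length:", average_word_length(words))
print("Zipf slope:", zipf_slope(Counter(words)))
```

N-gram entropy follows from the same pieces: pass `ngram_counts(chars, n)` to `shannon_entropy`. Running the same functions on an English text gives a basis for the kind of comparison with English mentioned above.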

Repository Structure

The repository contains two main folders, hindi and sanskrit. Each folder contains an analysis.ipynb notebook that generates the results for that language. The results are CSV files and images.
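To check what a notebook run produced, one option is to list the result files per language folder. The sketch below is only an illustration: the helper name `list_results`, the set of file extensions, and the assumption that results are written somewhere under the hindi and sanskrit folders are not confirmed by this README.

```python
from pathlib import Path

# Assumed extensions for "CSV files and images"; adjust to the actual outputs.
RESULT_SUFFIXES = {".csv", ".png", ".jpg", ".jpeg", ".svg"}


def list_results(repo_root, languages=("hindi", "sanskrit")):
    """Return {language: sorted list of result files found under that folder}."""
    root = Path(repo_root)
    found = {}
    for lang in languages:
        folder = root / lang
        if folder.is_dir():
            found[lang] = sorted(
                p for p in folder.rglob("*")
                if p.is_file() and p.suffix.lower() in RESULT_SUFFIXES
            )
        else:
            # Folder missing (for example, wrong repo_root): report no results.
            found[lang] = []
    return found


# Run from the root of a local clone.
for lang, files in list_results(".").items():
    print(lang, len(files), "result file(s)")
```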

Usage

To reproduce the results of the research, follow these steps:

  1. Clone the repository:

    git clone https://github.com/guptalab/hindisanskritstat.git
    
  2. Install the required packages:

    pip install -r requirements.txt

  3. Run the analysis.ipynb notebook in each of the two folders (hindi and sanskrit) to generate the results.
