Implementations of Unigram language modeling with smoothing, Document ranking using probabilistic retrieval, and Zipf’s law and cross-entropy evaluation

ZahraRahimii/NLP-Unigram-Retrieval-Models

NLP-Unigram-Retrieval-Models

This repo contains basic implementations of unigram language modeling with smoothing, document ranking with probabilistic retrieval, and evaluation using Zipf’s law and cross-entropy.

Project Workflow:

  1. Unigram Language Modeling:

    • Tokenization and preprocessing
    • Unigram probability estimation
    • Evaluation using Zipf’s law and cross-entropy
  2. Probabilistic Information Retrieval:

    • Ranking documents based on unigram probabilities (Jelinek-Mercer smoothing)
    • Evaluation of ranking effectiveness based on λ tuning and query types
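
The two stages above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's actual code: the function names (`tokenize`, `add_k_prob`, `cross_entropy`, `zipf_table`, `jm_score`, `rank`), the whitespace tokenizer, the add-k smoothing used for the cross-entropy step, and the convention that `lam` weights the document model (with `1 - lam` on the collection model) are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; real preprocessing may differ.
    return text.lower().split()


def add_k_prob(word, counts, total, k=1.0):
    # Add-k smoothed unigram probability (assumed smoothing scheme).
    # One extra vocabulary slot is reserved for all unseen words.
    vocab_size = len(counts) + 1
    return (counts.get(word, 0) + k) / (total + k * vocab_size)


def cross_entropy(test_tokens, counts, total, k=1.0):
    # Average negative log2 probability per test token, in bits.
    log_sum = sum(math.log2(add_k_prob(w, counts, total, k)) for w in test_tokens)
    return -log_sum / len(test_tokens)


def zipf_table(counts):
    # (rank, frequency, rank * frequency) rows; under Zipf's law the
    # product stays roughly constant across ranks.
    ordered = sorted(counts.values(), reverse=True)
    return [(r, f, r * f) for r, f in enumerate(ordered, start=1)]


def jm_score(query_tokens, doc_counts, doc_len, coll_counts, coll_len, lam=0.5):
    # Query log-likelihood under Jelinek-Mercer smoothing:
    # p(w | d) = lam * p_ml(w | d) + (1 - lam) * p_ml(w | collection)
    score = 0.0
    for w in query_tokens:
        p_doc = doc_counts.get(w, 0) / doc_len if doc_len else 0.0
        p_coll = coll_counts.get(w, 0) / coll_len
        p = lam * p_doc + (1 - lam) * p_coll
        if p == 0.0:
            # The word appears nowhere in the collection.
            return float("-inf")
        score += math.log(p)
    return score


def rank(query, docs, lam=0.5):
    # Return document indices sorted from best to worst score.
    doc_tokens = [tokenize(d) for d in docs]
    doc_counts = [Counter(t) for t in doc_tokens]
    coll_counts = Counter()
    for c in doc_counts:
        coll_counts.update(c)
    coll_len = sum(coll_counts.values())
    q = tokenize(query)
    scores = [
        jm_score(q, c, len(t), coll_counts, coll_len, lam)
        for c, t in zip(doc_counts, doc_tokens)
    ]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Calling `rank(query, docs, lam)` over a range of `lam` values and comparing the resulting orderings is one simple way to explore the λ tuning mentioned above; values near 1 lean on the document's own counts, while values near 0 fall back to collection statistics.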