
Degeneration-of-the-Nation/multilingual-HTML-dataset


multilingual-HTML-dataset


A compact, analysis-friendly extraction of the Degeneration-of-the-Nation project:
a 12-language corpus at the intersection of philosophy, culture, and literature with technology and artificial intelligence.

🌐 Websites


🌍 Languages

| ISO | Language | Words |
| --- | --- | --- |
| he | Hebrew (source) | ~1 M |
| en, es, fr, de, pt, it, ja, ru, ko, zh-Hans, hi | Translations | ~1 M × 11 |
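
As a rough sanity check on the figures above, the following minimal Python sketch tallies the language count and an approximate total word count. The constant and function names are hypothetical, and the per-language figure is simply the rounded "~1 M" estimate from the table, not a measured count.

```python
# Language codes as listed in the table; names below are illustrative only.
SOURCE_LANG = "he"
TRANSLATION_LANGS = [
    "en", "es", "fr", "de", "pt", "it", "ja", "ru", "ko", "zh-Hans", "hi",
]

# Assumption: the rounded "~1 M" words per language stated in the table.
APPROX_WORDS_PER_LANG = 1_000_000


def approx_total_words() -> int:
    """Rough corpus size: the source language plus every translation."""
    return APPROX_WORDS_PER_LANG * (1 + len(TRANSLATION_LANGS))


all_langs = [SOURCE_LANG] + TRANSLATION_LANGS
print(len(all_langs))        # 12
print(approx_total_words())  # 12000000
```

Under these rounded figures, the corpus comes to roughly 12 million words across the 12 languages.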

LICENSE

CC-BY-4.0

All datasets

https://hitdarderut-haaretz.org/dataset

About

Multilingual Corpus v1.0.0 - Philosophy, AI, Literature, Culture (CC-BY-4.0, 12 languages, CSV/JSON, static)
