{pollspain} R package repo. Authors: Javier Álvarez-Liébana, David Pereiro-Pol, Mafalda González-González, Irene Bosque-Gala and Mikaela De Smedt

pollspain

Lifecycle: experimental Project-Status:Active GitHub release

The main objective of the R package {pollspain} is to provide social scientists, political analysts and citizens with easy and straightforward access to electoral data from Spain. This includes both aggregated election results extracted from polling stations and survey data (including house effects). It also offers tools for seat allocation, vote simulation, and data visualization. The package follows tidyverse-style conventions and is specially tailored for beginners.

Installation

You can install the development version from GitHub with:

# install.packages("devtools") # only if not already installed
devtools::install_github("dadosdelaplace/pollspain")
library(pollspain) # after installing

An internet connection is required to install the package and to download data. Data is stored in the accompanying pollspain-data repository. You can find more information about the data structure at https://github.com/dadosdelaplace/pollspain-data

Usage

Election summaries

Calling summary_election_data(type_elec = ..., year = ..., level = ...) with

  • a vector of election years or dates (e.g., year = 2023 or date = "2023-07-24")
  • a vector of election types (currently, only type_elec = "congress" works properly)
  • a geographic level for aggregation (e.g., level = "ccaa" or level = "prov")

returns a summary table of election results aggregated at the requested administrative level. This includes both general data (blank votes, turnout, etc.) and the ballots received by each party or candidacy. The available aggregation levels (level) are: "all" (for a national summary), "ccaa" (autonomous communities), "prov" (province), "mun" (municipality), "mun_district" (electoral district), "sec" (census section), and "poll_station".

# Summary election data at national level for both elections in 2019
# (general data without candidacies ballots)
summary_data_all <- summary_election_data(type_elec = "congress", year = 2019)
summary_data_all
# Summary election data, aggregating candidacies ballots at prov level
summary_data_prov_parties <-
  summary_election_data("congress", year = c(2000, 2008, 2023),
                        level = "prov", by_parties = TRUE)
head(summary_data_prov_parties, 5)

summary_election_data() is a user-friendly combination of get_election_data() (which merges different data sources at the polling station level) and aggregate_election_data() (which aggregates the data to the requested level). Advanced users can call these two functions directly; see their documentation for usage examples and detailed input arguments.
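To make the aggregation step more concrete, here is a minimal base R sketch on an invented polling-station table. It is not the code behind aggregate_election_data(): the column names (id_station, census, ballots, blank_votes), the figures, and the assumption that the first two components of the station identifier give the province are all hypothetical.

```r
# Toy polling-station table. Column names and figures are invented for
# illustration; they are NOT the actual {pollspain} data structure.
stations <- data.frame(
  id_station  = c("01-04-003-01-004-A", "01-04-003-01-004-B",
                  "01-04-013-01-001-A", "01-11-012-02-003-U"),
  census      = c(500, 450, 600, 700),
  ballots     = c(350, 300, 420, 490),
  blank_votes = c(5, 4, 6, 8)
)

# ASSUMPTION: the first two dash-separated components ("01-04") identify
# the autonomous community and the province.
stations$id_prov <- substr(stations$id_station, 1, 5)

# Sum the raw counts within each province
by_prov <- aggregate(cbind(census, ballots, blank_votes) ~ id_prov,
                     data = stations, FUN = sum)

# Recompute turnout (%) from the aggregated totals
by_prov$turnout <- round(100 * by_prov$ballots / by_prov$census, 2)
by_prov
```

Turnout is recomputed from the aggregated totals rather than averaged across stations, because stations differ in census size and a plain average of station-level rates would weight them equally.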

⚠️ About municipalities

The municipality data (names and codes) were extracted from the version published by the National Statistics Institute (INE) on February 6, 2025. The configuration of municipalities from previous years has been adapted to the most recent setup, recoding cases where municipalities have merged or disappeared.

Data extracted from https://www.ine.es/daco/daco42/codmun/codmun20/20codmun.xlsx

⚠️ About CERA

According to the National Statistics Institute (INE) «the electoral roll contains the registration of those who meet the requirements to be voters and are not definitively or temporarily deprived of the right to vote. The electoral roll is composed of:

  • The electoral roll of Spanish citizens residing in Spain (CER).
  • The electoral roll of Spanish citizens residing abroad (CERA).

The electoral roll of residents in Spain who are nationals of countries with Agreements for municipal elections (CERE Agreements), and the electoral roll of citizens of the European Union residing in Spain for municipal and European Parliament elections (CERE EU)».

CERA data have been aggregated at the national, autonomous community and provincial levels. …

Seat allocation

Survey summaries

Estimating electoral surveys

Simulating electoral results

Data viz

Other functions

{pollspain} also provides more advanced users with useful functions to preprocess and analyze electoral data, including their own data, as long as it is provided in a proper format.

type_to_code_election(type_elec = "congress")
#> [1] "02"
extract_code("01-04-003-01-004-B", level = "mun")
#> [1] "003"
extract_code("01-04-003-01-004-B", level = "mun", full_cod = TRUE)
#> [1] "01-04-003"
  • Import raw data: functions named import_..._data() (their code is in the import_elections_data.R file) import the .DAT election files published by the Spanish Ministry of Interior, available in the pollspain-data GitHub repository, with as little preprocessing as possible. See <…> for more examples of how to use them.
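The extract_code() calls shown earlier in this section suggest that a polling-station identifier is a dash-separated string with one component per geographic level. The sketch below reproduces those two outputs with a hypothetical helper, split_station_id(), which is not part of {pollspain}. It assumes that the six components follow the order of the aggregation levels listed in the Usage section (ccaa, prov, mun, mun_district, sec, poll_station); the real function may be implemented differently.

```r
# Hypothetical helper, NOT part of {pollspain}.
# ASSUMPTION: identifiers have six dash-separated components ordered as
# ccaa, prov, mun, mun_district, sec, poll_station.
split_station_id <- function(id, level = "mun", full_cod = FALSE) {
  levels <- c("ccaa", "prov", "mun", "mun_district", "sec", "poll_station")
  parts <- strsplit(id, "-", fixed = TRUE)[[1]]
  stopifnot(length(parts) == length(levels), level %in% levels)

  pos <- match(level, levels)
  if (full_cod) {
    # Keep every component up to and including the requested level
    paste(parts[seq_len(pos)], collapse = "-")
  } else {
    # Keep only the component of the requested level
    parts[pos]
  }
}

split_station_id("01-04-003-01-004-B", level = "mun")
split_station_id("01-04-003-01-004-B", level = "mun", full_cod = TRUE)
```

Under these assumptions the two calls return "003" and "01-04-003", matching the extract_code() outputs shown earlier.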

Contributing

Any contribution is warmly welcome, whether as a developer or a beta tester of the package. Please feel free to propose any suggestions by opening an issue for discussion.

References

This package has been designed based on the following resources and references

Data sources

Authors

Javier Álvarez-Liébana (maintainer), David Pereiro-Pol, Mafalda González-González, Irene Bosque-Gala and Mikaela De Smedt. The development of the {pollspain} package has been part of several Master’s Theses from the Master in Computational Data Science at UC3M (Madrid).

Usability

The package’s usability and functionality have been tested by the following collaborators:

Contributor database

Contributor R knowledge Political science knowledge Usability score Functionality score
9 9
9 2
5 9
6 7
2 3
6 1
10 7
2 8
5 5
10 3
3 9
9 5
