The main objective of the R package {pollspain} is to provide social
scientists, political analysts and citizens with easy and
straightforward access to electoral data from Spain. This includes both
aggregated election results extracted from polling stations and survey
data (including house effects). It also offers tools for seat
allocation, vote simulation, and data visualization. The package
follows a tidyverse style and is specially tailored for beginners.
You can install the development version from GitHub with:
# install.packages("devtools") # only if not already installed
devtools::install_github("dadosdelaplace/pollspain")
library(pollspain) # after installing
An internet connection is required to install the package and to
download data. The data are stored in the accompanying pollspain-data
repository.
You can find more information about the data structure at
https://github.com/dadosdelaplace/pollspain-data
Calling summary_election_data(type_elec = ..., year = ..., level = ...)
with
- a vector of election years or dates (e.g., year = 2023 or date = "2023-07-23"),
- a vector of election types (currently, it only works properly for type_elec = "congress"), and
- a geographic level for aggregation (e.g., level = "ccaa" or level = "prov")
returns a summary table of election results aggregated at the requested
administrative level. This includes both general data (blank votes,
turnout, etc.) and the ballots received by each party or candidacy. The
available aggregation levels (level) are: "all" (national summary),
"ccaa" (autonomous communities), "prov" (provinces), "mun"
(municipalities), "mun_district" (electoral districts), "sec" (census
sections), and "poll_station" (polling stations).
# Summary election data at national level for both elections in 2019
# (general data without candidacies ballots)
summary_data_all <- summary_election_data(type_elec = "congress", year = 2019)
summary_data_all
# Summary election data, aggregating candidacies ballots at prov level
summary_data_prov_parties <-
  summary_election_data("congress", year = c(2000, 2008, 2023),
                        level = "prov", by_parties = TRUE)
head(summary_data_prov_parties, 5)
summary_election_data() is a user-friendly combination of
get_election_data() (which merges different data sources at the polling
station level) and aggregate_election_data() (which aggregates the data
to the requested level). See some uses and detailed input arguments in
…, and … for advanced users.
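To make the aggregation step concrete, the following base R sketch sums ballots from polling stations up to the province level. It does not call {pollspain}: the toy data frame, its column names (id_station, party, ballots) and the helpers prefix_code() and aggregate_ballots() are assumptions made for illustration only, and the package's real tables may be structured differently. Station identifiers are assumed to be hyphen-separated codes whose first two components identify the autonomous community and the province.

```r
# Toy polling-station results (hypothetical values and column names).
stations <- data.frame(
  id_station = c("01-04-003-01-004-A", "01-04-003-01-004-B",
                 "01-04-013-02-001-A", "13-28-079-01-001-A"),
  party      = c("A", "A", "B", "A"),
  ballots    = c(120, 80, 50, 300),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the first `n_parts` hyphen-separated components
# of each identifier (2 components = community + province in this sketch).
prefix_code <- function(id, n_parts) {
  parts <- strsplit(id, "-", fixed = TRUE)
  vapply(parts, function(p) paste(p[seq_len(n_parts)], collapse = "-"),
         character(1))
}

# Hypothetical helper: sum ballots by province and party.
aggregate_ballots <- function(data) {
  data$id_prov <- prefix_code(data$id_station, n_parts = 2)
  aggregate(ballots ~ id_prov + party, data = data, FUN = sum)
}

prov_summary <- aggregate_ballots(stations)
prov_summary
```

Aggregating to another level in this sketch would only change n_parts, which mirrors the idea of merging once at the polling station level and then summarising to whichever level is requested.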
⚠️ About municipalities
The municipality data (names and codes) were extracted from the version published by the National Statistics Institute (INE) on February 6, 2025. The configuration of municipalities from previous years has been adapted to the most recent setup, recoding cases where municipalities have merged or disappeared.
Data extracted from https://www.ine.es/daco/daco42/codmun/codmun20/20codmun.xlsx
⚠️ About CERA
According to the National Statistics Institute (INE) «the electoral roll contains the registration of those who meet the requirements to be voters and are not definitively or temporarily deprived of the right to vote. The electoral roll is composed of:
- The electoral roll of Spanish citizens residing in Spain (CER).
- The electoral roll of Spanish citizens residing abroad (CERA).
- The electoral roll of residents in Spain who are nationals of countries with Agreements for municipal elections (CERE Agreements), and the electoral roll of citizens of the European Union residing in Spain for municipal and European Parliament elections (CERE EU)».
CERA data have been aggregated at the national, autonomous community, and provincial levels. …
{pollspain} also provides more advanced users with useful functions to
preprocess and analyze electoral data, including their own data, as long
as it is provided in a proper format.
- Utils: functions contained in the utils.R script are intended to serve as helper functions for data preprocessing. See https://javieralvarezliebana.es/pollspain/articles/utils.html for more examples about how to use them.
type_to_code_election(type_elec = "congress")
#> [1] "02"
extract_code("01-04-003-01-004-B", level = "mun")
#> [1] "003"
extract_code("01-04-003-01-004-B", level = "mun", full_cod = TRUE)
#> [1] "01-04-003"
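The outputs above suggest that a polling station identifier is a hyphen-separated chain of codes running from the autonomous community down to the station. The base R sketch below reproduces the two "mun" results under that assumption. The function split_code() and the ordering stored in code_levels are hypothetical; this is not the package's implementation of extract_code().

```r
# Assumed order of the hyphen-separated components (an inference from the
# examples above, not taken from the package source).
code_levels <- c("ccaa", "prov", "mun", "mun_district", "sec", "poll_station")

# Hypothetical helper mimicking the two extract_code() calls shown above.
split_code <- function(id, level, full_cod = FALSE) {
  parts <- strsplit(id, "-", fixed = TRUE)[[1]]
  pos <- match(level, code_levels)
  if (is.na(pos) || pos > length(parts)) {
    stop("Unknown level or identifier too short: ", level)
  }
  # full_cod = TRUE keeps every component up to the requested level.
  if (full_cod) paste(parts[seq_len(pos)], collapse = "-") else parts[pos]
}

split_code("01-04-003-01-004-B", level = "mun")
#> [1] "003"
split_code("01-04-003-01-004-B", level = "mun", full_cod = TRUE)
#> [1] "01-04-003"
```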
- Import raw data: functions named import_..._data() (their code can be found in the import_elections_data.R file) import the .DAT election files from the Spanish Ministry of the Interior, available in the pollspain-data GitHub repository, with as little preprocessing as possible. See <…> for more examples about how to use them.
Any contribution is warmly welcome, whether as a developer or a beta tester of the package. Please feel free to propose any suggestions by opening an issue for discussion.
This package has been designed based on the following resources and references:
- ACE: The Electoral Knowledge Network. https://aceproject.org/
- Albuja J. (2025). R package {electoral}: allocating seats methods and party system scores (v0.1.4). https://cran.r-project.org/web/packages/electoral/index.html
- García Guzmán P. (2025). WikiBarrio: Explore Spanish socio-demographic data at the neighborhood level. https://www.wikibarrio.es/
- García Guzmán P. (2025). ineAtlas: Access to Spanish Household Income Distribution Atlas Data. R package version 0.1.3.9000, https://github.com/pablogguz/ineAtlas
- Meleiro H. (2024). infoelectoral: Download Spanish Election Results. R package version 1.0.2, https://github.com/rOpenSpain/infoelectoral
- Silge J., Nash J.C., and Graves S. (2018). Navigating the R Package Universe. The R Journal 10 (2): 558–63. https://doi.org/10.32614/RJ-2018-058
- V-Dem Dataset. Varieties of Democracy (V-Dem) Project. https://v-dem.net/data/
- Wickham H. and Bryan J. R Packages: Organize, Test, Document, and Share Your Code (2023). https://r-pkgs.org/
- Spanish electoral data downloaded from repository of the Spanish Ministry of the Interior.
- Worldwide electoral data downloaded from Comparative Study of Electoral Systems (CSES) and Election Data Archive (ICPSR).
- Survey data downloaded from Wikipedia links (example).
- Seat allocation methods checked in the Electoral System Design Database (International IDEA).
Javier Álvarez-Liébana (maintainer), David Pereiro-Pol,
Mafalda González-González, Irene Bosque-Gala and Mikaela De
Smedt. The development of the {pollspain} package has been part of
several Master’s theses from the Master in Computational Data Science at
UC3M (Madrid).
The package’s usability and functionality have been tested by the following collaborators:
Contributor database
Contributor | R knowledge | Political science knowledge | Usability score | Functionality score |
---|---|---|---|---|
… | 9 | 9 | … | … |
… | 9 | 2 | … | … |
… | 5 | 9 | … | … |
… | 6 | 7 | … | … |
… | 2 | 3 | … | … |
… | 6 | 1 | … | … |
… | 10 | 7 | … | … |
… | 2 | 8 | … | … |
… | 5 | 5 | … | … |
… | 10 | 3 | … | … |
… | 3 | 9 | … | … |
… | 9 | 5 | … | … |