DOC/ENH: Holiday exclusion argument #61600

Merged
merged 14 commits into from
Jun 13, 2025
Changes from 9 commits
1 change: 1 addition & 0 deletions doc/source/whatsnew/v3.0.0.rst
@@ -64,6 +64,7 @@ Other enhancements
- :meth:`Series.nlargest` uses a 'stable' sort internally and will preserve original ordering.
- :class:`ArrowDtype` now supports ``pyarrow.JsonType`` (:issue:`60958`)
- :class:`DataFrameGroupBy` and :class:`SeriesGroupBy` methods ``sum``, ``mean``, ``median``, ``prod``, ``min``, ``max``, ``std``, ``var`` and ``sem`` now accept ``skipna`` parameter (:issue:`15675`)
- :class:`Holiday` has gained the constructor argument and field ``exclude_dates`` to exclude specific datetimes from a custom holiday calendar (:issue:`54382`)
- :class:`Rolling` and :class:`Expanding` now support ``nunique`` (:issue:`26958`)
- :class:`Rolling` and :class:`Expanding` now support aggregations ``first`` and ``last`` (:issue:`33155`)
- :func:`read_parquet` accepts ``to_pandas_kwargs`` which are forwarded to :meth:`pyarrow.Table.to_pandas` which enables passing additional keywords to customize the conversion to pandas, such as ``maps_as_pydicts`` to read the Parquet map data type as python dictionaries (:issue:`56842`)
54 changes: 54 additions & 0 deletions pandas/tests/tseries/holiday/test_holiday.py
@@ -353,3 +353,57 @@ def test_holidays_with_timezone_specified_but_no_occurences():
    expected_results.index = expected_results.index.as_unit("ns")

    tm.assert_equal(test_case, expected_results)


def test_holiday_with_exclusion():
    # GH 54382
    start = Timestamp("2020-05-01")
    end = Timestamp("2025-05-31")
    # The constructor only accepts None or a DatetimeIndex, so wrap the dates.
    exclude = DatetimeIndex([Timestamp("2022-05-30")])  # Queen's platinum Jubilee
    default_uk_spring_bank_holiday: Holiday = Holiday(
        "UK Spring Bank Holiday",
        month=5,
        day=31,
        offset=DateOffset(weekday=MO(-1)),
    )

    queens_jubilee_uk_spring_bank_holiday: Holiday = Holiday(
        "Queen's Jubilee UK Spring Bank Holiday",
        month=5,
        day=31,
        offset=DateOffset(weekday=MO(-1)),
        exclude_dates=exclude,
    )

    original_dates = list(default_uk_spring_bank_holiday.dates(start, end))
    exclusion_dates = list(queens_jubilee_uk_spring_bank_holiday.dates(start, end))

    assert all(ex in original_dates for ex in exclude)
    assert all(ex not in exclusion_dates for ex in exclude)
    assert set(exclusion_dates).issubset(original_dates)


def test_holiday_with_multiple_exclusions():
    start = Timestamp("2000-01-01")
    end = Timestamp("2100-05-31")
    # The constructor only accepts None or a DatetimeIndex, so wrap the dates.
    exclude = DatetimeIndex(
        [
            Timestamp("2025-01-01"),
            Timestamp("2042-01-01"),
            Timestamp("2061-01-01"),
        ]
    )  # Yakudoshi new year
    default_japan_new_year: Holiday = Holiday(
        "Japan New Year",
        month=1,
        day=1,
    )

    yakudoshi_new_year: Holiday = Holiday(
        "Yakudoshi New Year", month=1, day=1, exclude_dates=exclude
    )

    original_dates = list(default_japan_new_year.dates(start, end))
    exclusion_dates = list(yakudoshi_new_year.dates(start, end))

    assert all(ex in original_dates for ex in exclude)
    assert all(ex not in exclusion_dates for ex in exclude)
    assert set(exclusion_dates).issubset(original_dates)
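
As a reading aid for the tests above, the behaviour they assert can be approximated without the new argument by post-filtering `Holiday.dates` with `DatetimeIndex.difference`. This is only a sketch under that assumption, not the PR's implementation; the variable names are illustrative, and it deliberately avoids `exclude_dates` so that it also runs on releases that predate this change.

```python
from dateutil.relativedelta import MO

from pandas import DateOffset, DatetimeIndex, Timestamp
from pandas.tseries.holiday import Holiday

# Same rule as in the test: last Monday on or before 31 May.
spring_bank_holiday = Holiday(
    "UK Spring Bank Holiday",
    month=5,
    day=31,
    offset=DateOffset(weekday=MO(-1)),
)

start = Timestamp("2020-05-01")
end = Timestamp("2025-05-31")
exclude = DatetimeIndex([Timestamp("2022-05-30")])

# Dates produced by the unmodified rule.
original = spring_bank_holiday.dates(start, end)

# Manual equivalent of the exclusion: drop the excluded dates afterwards.
filtered = original.difference(exclude)

print(Timestamp("2022-05-30") in original)  # the rule generates this date
print(Timestamp("2022-05-30") in filtered)  # the exclusion removes it
```

With the new argument, the same filtering is expected to happen inside `Holiday.dates`, so callers building a calendar do not need the extra step.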
9 changes: 9 additions & 0 deletions pandas/tseries/holiday.py
@@ -169,6 +169,7 @@ def __init__(
        start_date=None,
        end_date=None,
        days_of_week: tuple | None = None,
        exclude_dates: DatetimeIndex | None = None,
    ) -> None:
        """
        Parameters
@@ -193,6 +194,8 @@ class from pandas.tseries.offsets, default None
        days_of_week : tuple of int or dateutil.relativedelta weekday strs, default None
            Provide a tuple of days e.g (0,1,2,3,) for Monday Through Thursday
            Monday=0,..,Sunday=6
        exclude_dates : DatetimeIndex, default None
            Specific dates to exclude, e.g. to skip a specific year's holiday.

        Examples
        --------
@@ -257,6 +260,9 @@ class from pandas.tseries.offsets, default None
        self.observance = observance
        assert days_of_week is None or type(days_of_week) == tuple
Contributor Author

@sharkipelago, Jun 10, 2025

Should I also switch this to throw a ValueError on failing? Or would that be a separate PR?

Member

A separate PR would be better, thanks.

Contributor Author

For this, should I open a new issue? Or just make another PR?

Member

Just another PR is fine

        self.days_of_week = days_of_week
        if exclude_dates is not None and type(exclude_dates) != DatetimeIndex:
Member

Suggested change
        if exclude_dates is not None and type(exclude_dates) != DatetimeIndex:
        if not (exclude_dates is None or isinstance(exclude_dates, DatetimeIndex)):
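
To illustrate what the suggested check accepts and rejects, here is a minimal sketch. The helper `validate_exclude_dates` is hypothetical and exists only for this example; it wraps the suggested condition (with balanced parentheses) so it can be exercised outside the `Holiday` constructor.

```python
import pandas as pd


def validate_exclude_dates(exclude_dates):
    """Hypothetical helper mirroring the suggested constructor check."""
    if not (exclude_dates is None or isinstance(exclude_dates, pd.DatetimeIndex)):
        raise ValueError("exclude_dates must be None or of type DatetimeIndex.")
    return exclude_dates


# None and a DatetimeIndex both pass through unchanged.
accepted_none = validate_exclude_dates(None)
accepted_index = validate_exclude_dates(pd.DatetimeIndex(["2022-05-30"]))

# A plain list of Timestamps is rejected, so callers must wrap it first.
try:
    validate_exclude_dates([pd.Timestamp("2022-05-30")])
    list_rejected = False
except ValueError:
    list_rejected = True
```

Unlike the `type(...) != DatetimeIndex` comparison, `isinstance` would also accept subclasses of `DatetimeIndex`, which is the usual reason to prefer it.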

            raise ValueError("exclude_dates must be None or of type DatetimeIndex.")
        self.exclude_dates = exclude_dates

    def __repr__(self) -> str:
        info = ""
@@ -328,6 +334,9 @@ def dates(
        holiday_dates = holiday_dates[
            (holiday_dates >= filter_start_date) & (holiday_dates <= filter_end_date)
        ]

        # A DatetimeIndex cannot be truth-tested, so compare against None explicitly.
        if self.exclude_dates is not None:
            holiday_dates = holiday_dates.difference(self.exclude_dates)
        if return_name:
            return Series(self.name, index=holiday_dates)
        return holiday_dates
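
One detail of the exclusion guard in this hunk is worth spelling out: pandas index objects do not define a truth value, so the guard has to test against `None` rather than rely on truthiness. The following is a small sketch with made-up dates, not code from the PR.

```python
import pandas as pd

holiday_dates = pd.DatetimeIndex(["2021-05-31", "2022-05-30", "2023-05-29"])
exclude_dates = pd.DatetimeIndex(["2022-05-30"])

# Truth-testing an index is ambiguous and raises ValueError.
try:
    bool(exclude_dates)
    truth_test_raised = False
except ValueError:
    truth_test_raised = True

# An explicit None comparison works for both None and a DatetimeIndex.
if exclude_dates is not None:
    holiday_dates = holiday_dates.difference(exclude_dates)

print(truth_test_raised)
print(list(holiday_dates))
```

An empty `DatetimeIndex` also passes the `is not None` check; `difference` with an empty index simply leaves the dates unchanged.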