-
Notifications
You must be signed in to change notification settings - Fork 32
Description
I'm using pg_party to handle partitioning of messages based on a chat_id, where each chat that reaches over 10,000 messages is given its own partition. My original unpartitioned messages column was previously 95GB in size, to give a bit of context for the scope of what I am working with.
The result of partitioning with that criteria was about 4000 partition tables total. This did speed things up dramatically, which I'm very thankful for, however I ran into a very strange issue that took me a while to debug.
I noticed that after initial app load, when attempting to visit the log page of any of these chats, there would be a long wait time while fetching messages. There was no output on the console hinting at extra queries being done, and when testing the query plan directly on the database everything ran very fast. It also didn't happen again on subsequent queries unless I left it alone for a minute, hinting to me that whatever was happening was being cached with pg_party's ttl in my config.
Diving deeper, I noticed that any time my Message model was being invoked under any circumstance, these sorts of queries were happening on my database over all of my partition tables sequentially. It didn't matter if I was trying to grab messages, or if I was simply calling Message.load_schema
in the console, the slowdown was present for each case.
(I should note, I have also opted to give each of the partitions a custom human-readable name when they are created. Don't know if this is relevant to the issue.)
SELECT pg_inherits.inhrelid::regclass::text
FROM pg_tables
INNER JOIN pg_inherits
ON pg_tables.tablename::regclass = pg_inherits.inhparent::regclass
WHERE pg_tables.schemaname = ANY(current_schemas(false)) AND
pg_tables.tablename = 'messages_partitioned_chat_XXX'
Which corresponds to this piece of code.
adapter_decorator.rb:112
def partitions_for_table_name(table_name, include_subpartitions:, _accumulator: [])
select_values(%[
SELECT pg_inherits.inhrelid::regclass::text
FROM pg_tables
INNER JOIN pg_inherits
ON pg_tables.tablename::regclass = pg_inherits.inhparent::regclass
WHERE pg_tables.schemaname = ANY(current_schemas(false)) AND
pg_tables.tablename = #{quote(table_name)}
], "SCHEMA").each_with_object(_accumulator) do |partition, acc|
acc << partition
next unless include_subpartitions
partitions_for_table_name(partition, include_subpartitions: true, _accumulator: acc)
end
end
Following the trail, I found this one next. That however is where I hit a snag, I can't tell what's invoking this automatically.
model_decorator.rb:65
def partitions(include_subpartitions: PgParty.config.include_subpartitions_in_partition_list)
PgParty.cache.fetch_partitions(cache_key, include_subpartitions) do
connection.partitions_for_table_name(table_name, include_subpartitions: include_subpartitions)
end
rescue
[]
end
In order to avoid the slowdown, I had to add the following to my model.
message.rb
def self.partitions
[]
end
It isn't an ideal solution, but it's the best I could come up with without understanding exactly what's happening here.
Update:
After modifying the code to trigger some exceptions and get a stack trace, I came across this.
shared_methods.rb:20
def table_exists?
target_table = partitions.first || table_name
connection.schema_cache.data_source_exists?(target_table)
end
table_exists?
is invoked by active_record, but table_exists?
calls partitions
, which fetches all partitions and then returns the first.
The following monkey patch converts partitions_for_table_name
into a proper enumerator so that .first
is performed lazily. This solves the issue in my case. Though I suspect it will cause other problems with caching if I don't also rework that later. In my case I don't currently need the caching, so it works fine.
initializers/pg_party.rb
...
ActiveSupport.on_load(:active_record) do
require "pg_party/adapter_decorator"
module PgParty
class AdapterDecorator < SimpleDelegator
def partitions_for_table_name(table_name, include_subpartitions:, _accumulator: [])
Enumerator.new do |yielder|
select_values(%[
SELECT pg_inherits.inhrelid::regclass::text
FROM pg_tables
INNER JOIN pg_inherits
ON pg_tables.tablename::regclass = pg_inherits.inhparent::regclass
WHERE pg_tables.schemaname = ANY(current_schemas(false)) AND
pg_tables.tablename = #{quote(table_name)}
], "SCHEMA")
.each_with_object(_accumulator) do |partition, acc|
acc << partition
yielder << partition
if include_subpartitions
yield partitions_for_table_name(partition, include_subpartitions: true, _accumulator: acc)
end
end.lazy
end
end
end
end
end