Skip to content

table_exists? fetching all partitions causes slowdown with many partition tables #97

@hecksadecimal

Description

@hecksadecimal

I'm using pg_party to handle partitioning of messages based on a chat_id, where each chat that reaches over 10,000 messages is given its own partition. My original unpartitioned messages column was previously 95GB in size, to give a bit of context for the scope of what I am working with.

The result of partitioning with that criteria was about 4000 partition tables total. This did speed things up dramatically, which I'm very thankful for, however I ran into a very strange issue that took me a while to debug.

I noticed that after initial app load, when attempting to visit the log page of any of these chats, there would be a long wait time while fetching messages. There was no output on the console hinting at extra queries being done, and when testing the query plan directly on the database everything ran very fast. It also didn't happen again on subsequent queries unless I left it alone for a minute, hinting to me that whatever was happening was being cached with pg_party's ttl in my config.

Diving deeper, I noticed that any time my Message model was being invoked under any circumstance, these sorts of queries were happening on my database over all of my partition tables sequentially. It didn't matter if I was trying to grab messages, or if I was simply calling Message.load_schema in the console, the slowdown was present for each case.

(I should note, I have also opted to give each of the partitions a custom human-readable name when they are created. Don't know if this is relevant to the issue.)

SELECT pg_inherits.inhrelid::regclass::text
FROM pg_tables
INNER JOIN pg_inherits
    ON pg_tables.tablename::regclass = pg_inherits.inhparent::regclass
WHERE pg_tables.schemaname = ANY(current_schemas(false)) AND
      pg_tables.tablename = 'messages_partitioned_chat_XXX'

Which corresponds to this piece of code.

adapter_decorator.rb:112

def partitions_for_table_name(table_name, include_subpartitions:, _accumulator: [])
  select_values(%[
      SELECT pg_inherits.inhrelid::regclass::text
      FROM pg_tables
      INNER JOIN pg_inherits
        ON pg_tables.tablename::regclass = pg_inherits.inhparent::regclass
      WHERE pg_tables.schemaname = ANY(current_schemas(false)) AND
      pg_tables.tablename = #{quote(table_name)}
                ], "SCHEMA").each_with_object(_accumulator) do |partition, acc|
    acc << partition
    next unless include_subpartitions

    partitions_for_table_name(partition, include_subpartitions: true, _accumulator: acc)
  end
end

Following the trail, I found this one next. That however is where I hit a snag, I can't tell what's invoking this automatically.

model_decorator.rb:65

def partitions(include_subpartitions: PgParty.config.include_subpartitions_in_partition_list)
  PgParty.cache.fetch_partitions(cache_key, include_subpartitions) do
    connection.partitions_for_table_name(table_name, include_subpartitions: include_subpartitions)
  end
rescue
  []
end

In order to avoid the slowdown, I had to add the following to my model.

message.rb

def self.partitions
  []
end

It isn't an ideal solution, but it's the best I could come up with without understanding exactly what's happening here.

Update:

After modifying the code to trigger some exceptions and get a stack trace, I came across this.

shared_methods.rb:20

def table_exists?
  target_table = partitions.first || table_name

  connection.schema_cache.data_source_exists?(target_table)
end

table_exists? is invoked by active_record, but table_exists? calls partitions, which fetches all partitions and then returns the first.

Image

The following monkey patch converts partitions_for_table_name into a proper enumerator so that .first is performed lazily. This solves the issue in my case. Though I suspect it will cause other problems with caching if I don't also rework that later. In my case I don't currently need the caching, so it works fine.

initializers/pg_party.rb

...

ActiveSupport.on_load(:active_record) do
  require "pg_party/adapter_decorator"

  module PgParty
    class AdapterDecorator < SimpleDelegator
      def partitions_for_table_name(table_name, include_subpartitions:, _accumulator: [])
        Enumerator.new do |yielder|
          select_values(%[
            SELECT pg_inherits.inhrelid::regclass::text
            FROM pg_tables
            INNER JOIN pg_inherits
              ON pg_tables.tablename::regclass = pg_inherits.inhparent::regclass
            WHERE pg_tables.schemaname = ANY(current_schemas(false)) AND
            pg_tables.tablename = #{quote(table_name)}
          ], "SCHEMA")
          .each_with_object(_accumulator) do |partition, acc|
            acc << partition
            yielder << partition
            if include_subpartitions
              yield partitions_for_table_name(partition, include_subpartitions: true, _accumulator: acc)
            end
          end.lazy
        end
      end
    end
  end
end

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions