Add a multithreaded performance section #151

@colesbury

There's a good deal of documentation about thread-safety (correctness), but not much about multithreaded performance. This is a bit different from #149, and I think we should do both.

I'd like the page to help the reader develop a basic mental model for how to write programs that scale well in the free-threaded Python builds. Here are some things that may be worth covering.

Reference count contention

Frequent accesses to the same object from many threads can inhibit scaling, because each access may update that object's reference count.

Recommendations:

  • Use data that's private to the thread
  • Aggregate results at the end of a task. A basic counting example may help here, something like

Good:

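import threading

global_counter = 0
global_lock = threading.Lock()

def my_thread():
    global global_counter  # required to rebind the module-level name
    counter = 0  # private to this thread, so the loop has no contention
    for _ in range(...):
        counter += 1
    # Touch the shared counter once per thread, not once per iteration.
    with global_lock:
        global_counter += counter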
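
For contrast, here is a minimal sketch of the pattern the "Good" example presumably avoids, next to a runnable version of the aggregated one. The names `N_THREADS`, `N_ITERATIONS`, `contended_worker`, `aggregated_worker`, and `run` are hypothetical and chosen only for illustration; no timings are claimed here.

```python
import threading

N_THREADS = 4          # hypothetical values, for illustration only
N_ITERATIONS = 10_000

lock = threading.Lock()
contended_total = 0
aggregated_total = 0

def contended_worker():
    # Every iteration takes the lock and rebinds the shared global,
    # so all threads serialize on the same state.
    global contended_total
    for _ in range(N_ITERATIONS):
        with lock:
            contended_total += 1

def aggregated_worker():
    # Count in a thread-private local and publish once at the end.
    global aggregated_total
    counter = 0
    for _ in range(N_ITERATIONS):
        counter += 1
    with lock:
        aggregated_total += counter

def run(worker):
    threads = [threading.Thread(target=worker) for _ in range(N_THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

run(contended_worker)
run(aggregated_worker)
```

Both versions compute the same total; the difference is that the aggregated worker touches shared state once per thread rather than once per iteration.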

Collection (dict, list, set) performance

The builtin dict, list, and set classes are not designed to be concurrent collections. They are designed to be thread-safe, but they do not necessarily scale well under multithreaded access.

Concurrent reads from a shared dictionary:
Sometimes scales well, but reference count contention on the dictionary may be a bottleneck

Frequent concurrent writes or reads & writes to a shared dictionary:
Avoid. Does not scale well due to contention on the dictionary's lock

Recommendation: use ft_utils or some other data structure if you need a concurrent collection
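
When a true concurrent collection is not required, one alternative is to give each thread its own collection and merge at the end, in the same spirit as the counting example. The sketch below uses only the standard library and does not show ft_utils; the data and the names `count_chunk`, `chunks`, and `results` are hypothetical.

```python
import threading
from collections import Counter

N_THREADS = 4  # hypothetical value, for illustration only
words = ["a", "b", "a", "c", "b", "a"] * 1000

def count_chunk(chunk, results, index):
    # Build a Counter that only this thread touches.
    local = Counter(chunk)
    # Publish once: each thread writes to its own slot of the list.
    results[index] = local

# Split the input into one interleaved slice per thread.
chunks = [words[i::N_THREADS] for i in range(N_THREADS)]
results = [None] * N_THREADS

threads = [
    threading.Thread(target=count_chunk, args=(chunks[i], results, i))
    for i in range(N_THREADS)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Merge the per-thread results on the main thread after all joins.
total = Counter()
for partial in results:
    total.update(partial)
```

The shared `results` list is written once per thread, so the hot loop inside `Counter(chunk)` never contends on a shared dictionary.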

Task size?

Your task size has to be big enough that the overhead of dispatching to a thread is much smaller than the time it takes to run the task.

Recommendation: make your tasks bigger (include example with concurrent.futures.ThreadPoolExecutor)
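
A minimal sketch of what such an example might look like: rather than submitting one task per element, the input is split into chunks so that each submitted task does a meaningful amount of work. The helper names `square_sum` and `chunked`, the chunk size, and the worker count are assumptions for illustration, not tuned values.

```python
from concurrent.futures import ThreadPoolExecutor

def square_sum(chunk):
    # One task processes a whole chunk, amortizing dispatch overhead.
    return sum(x * x for x in chunk)

def chunked(seq, size):
    # Yield consecutive slices of at most `size` items.
    for start in range(0, len(seq), size):
        yield seq[start:start + size]

data = list(range(100_000))
CHUNK_SIZE = 10_000  # hypothetical; pick based on measured task cost

with ThreadPoolExecutor(max_workers=4) as pool:
    # 10 tasks of 10,000 items each instead of 100,000 one-item tasks.
    partials = pool.map(square_sum, chunked(data, CHUNK_SIZE))
    total = sum(partials)
```

Submitting `square_sum` once per element would produce the same total, but the per-task dispatch cost would likely dominate the trivial amount of work in each task.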

Gotchas

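random: can be a source of unexpected bottlenecks, since the module-level functions (random.random(), random.randint(), and so on) all share a single hidden generator instance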
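
One possible workaround, assuming the bottleneck comes from every thread going through the generator that backs the module-level `random` functions, is to give each thread its own `random.Random` instance. The function `sample_mean` and the seeds below are hypothetical.

```python
import random
import threading

N_THREADS = 4       # hypothetical values, for illustration only
N_SAMPLES = 10_000

def sample_mean(seed, n, out, index):
    # A private generator: no other thread touches this object.
    rng = random.Random(seed)
    out[index] = sum(rng.random() for _ in range(n)) / n

means = [None] * N_THREADS
threads = [
    threading.Thread(target=sample_mean, args=(seed, N_SAMPLES, means, seed))
    for seed in range(N_THREADS)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Seeding each instance differently keeps the per-thread streams distinct and the run reproducible.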
