Tiled clipping #41

Draft: wants to merge 1 commit into base: main

@gpeairs gpeairs commented Apr 11, 2025

Demonstrates one approach to #30. Just parking this here for now; I'm not yet sure this is the way, or whether it's even worthwhile to solve this. From the issue:

> Currently, Clipper operations on large sets of mostly-disjoint polygons like union2d(entire_chip => :layername) are very expensive. We should be able to break down the problem into local operations that are much faster using spatial indexing.
>
> This would enable features that are currently best done in external tools like KLayout, in particular layerwise XOR and geometry-level DRC of entire layouts.

I think there's actually no speedup to be had as long as we're using Clipper (I would guess it uses equivalently fast spatial sorting internally), but we can limit peak memory usage so we don't run out. Here we try the approach that KLayout takes: cover the geometry with tiles and do the operation on each tile.

It's a pretty naive implementation that doesn't do anything clever: build a spatial index (RTree) for each set of polygons, find the polygons touching each 1 mm tile (~100 polygons per tile), intersect the polygons in each tile, then collect the results.

To benchmark, we create two sets of n circle polygons in a grid, offset from each other, then take their intersection. After compilation:

    julia> benchmark_clip(1000); benchmark_clip(1000; tiled=true);
    Direct (total): 0.107539 seconds (273.46 k allocations: 32.592 MiB)
    Tree1: 0.000544 seconds (81 allocations: 224.078 KiB)
    Tree2: 0.000551 seconds (81 allocations: 224.078 KiB)
    Finding: 0.000578 seconds (10.56 k allocations: 543.078 KiB)
    Intersecting: 0.113519 seconds (274.13 k allocations: 32.633 MiB, 14.05% gc time)
    Tiled (total): 0.116130 seconds (284.98 k allocations: 33.680 MiB, 13.73% gc time)

    julia> benchmark_clip(10000); benchmark_clip(10000; tiled=true);
    Direct (total): 1.268168 seconds (2.67 M allocations: 318.480 MiB, 1.28% gc time)
    Tree1: 0.006923 seconds (544 allocations: 2.038 MiB)
    Tree2: 0.007564 seconds (544 allocations: 2.038 MiB)
    Finding: 0.008397 seconds (117.85 k allocations: 5.335 MiB)
    Intersecting: 1.298350 seconds (2.67 M allocations: 318.403 MiB, 2.84% gc time)
    Tiled (total): 1.323002 seconds (2.79 M allocations: 331.691 MiB, 2.79% gc time)

    julia> benchmark_clip(100000); benchmark_clip(100000; tiled=true);
    Direct (total): 21.256846 seconds (26.66 M allocations: 3.103 GiB, 3.14% gc time)
    Tree1: 0.106601 seconds (5.58 k allocations: 31.773 MiB)
    Tree2: 0.107319 seconds (5.58 k allocations: 31.773 MiB)
    Finding: 0.086505 seconds (1.16 M allocations: 52.680 MiB)
    Intersecting: 20.095295 seconds (27.39 M allocations: 3.186 GiB, 3.42% gc time)
    Tiled (total): 20.688486 seconds (28.56 M allocations: 3.690 GiB, 3.61% gc time)

Clipping 1 million pairs of circles directly would use too much memory for my laptop, but we can do it with tiles.

    julia> benchmark_clip(1000000; tiled=true);
    Tree1: 1.076922 seconds (54.45 k allocations: 301.200 MiB, 23.32% gc time)
    Tree2: 0.947303 seconds (54.45 k allocations: 301.200 MiB)
    Finding: 1.455029 seconds (11.96 M allocations: 537.995 MiB, 23.14% gc time)
    Intersecting: 150.370142 seconds (267.44 M allocations: 31.094 GiB, 6.97% gc time)
    Tiled (total): 180.971759 seconds (279.53 M allocations: 69.466 GiB, 12.19% gc time)

(Oops, big allocation at the end just for concatenating the results into a single array, which could be avoided.)
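
One way to sidestep a large final concatenation, sketched under the assumption that the per-tile results are held as a vector of vectors. Integers stand in for polygons here, and `flatten_once` is a hypothetical helper, not something in this PR. The idea is to either iterate over the results lazily or allocate the output once at its final size:

```julia
# Per-tile results as the tiled map might produce them (integers stand in for polygons).
per_tile = [collect(1:3), Int[], collect(4:5)]

# Option 1: consume the results lazily, without building a combined array.
total = sum(Iterators.flatten(per_tile))

# Option 2: allocate the output once at its final size, then copy each chunk in.
function flatten_once(chunks::Vector{Vector{T}}) where {T}
    out = Vector{T}(undef, sum(length, chunks; init=0))
    pos = 1
    for c in chunks
        copyto!(out, pos, c, 1, length(c))
        pos += length(c)
    end
    return out
end

flat = flatten_once(per_tile)
```

Whether either option helps in practice would depend on how callers consume the result, so this is only a starting point.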

Hard to compare directly, but KLayout seems to do XOR at least 2x faster than our intersection (and Clipper XOR takes 25-50% longer than intersection here).

Tile size doesn't matter much, but 100-1000 polygons per tile seems reasonable. Spatial indexing and querying are always much faster than clipping.
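
The tiling steps described above can be sketched in plain Julia. This is an illustration rather than the code in this PR: a uniform grid of buckets stands in for the RTree, axis-aligned boxes stand in for polygons, and a box-overlap test stands in for the Clipper call. `Box`, `tile_buckets`, `tiled_pairs`, `direct_pairs`, and `grid` are hypothetical names.

```julia
# Axis-aligned box standing in for a polygon (assumption: the real code uses polygons).
struct Box
    xmin::Float64
    ymin::Float64
    xmax::Float64
    ymax::Float64
end

overlaps(a::Box, b::Box) =
    a.xmin <= b.xmax && b.xmin <= a.xmax && a.ymin <= b.ymax && b.ymin <= a.ymax

# Map each tile (i, j) to the indices of the boxes touching it.
# A uniform grid replaces the RTree used in the PR.
function tile_buckets(boxes::Vector{Box}, tile::Float64)
    buckets = Dict{Tuple{Int,Int},Vector{Int}}()
    for (k, b) in enumerate(boxes)
        for i in floor(Int, b.xmin / tile):floor(Int, b.xmax / tile),
            j in floor(Int, b.ymin / tile):floor(Int, b.ymax / tile)

            push!(get!(() -> Int[], buckets, (i, j)), k)
        end
    end
    return buckets
end

# Per-tile operation: an overlap test stands in for the Clipper intersection.
function tiled_pairs(boxes1::Vector{Box}, boxes2::Vector{Box}, tile::Float64)
    b1 = tile_buckets(boxes1, tile)
    b2 = tile_buckets(boxes2, tile)
    found = Set{Tuple{Int,Int}}()  # a Set drops pairs seen in several tiles
    for (key, idx1) in b1
        haskey(b2, key) || continue
        for a in idx1, b in b2[key]
            overlaps(boxes1[a], boxes2[b]) && push!(found, (a, b))
        end
    end
    return found
end

# Same operation on everything at once, for comparison.
function direct_pairs(boxes1::Vector{Box}, boxes2::Vector{Box})
    found = Set{Tuple{Int,Int}}()
    for a in eachindex(boxes1), b in eachindex(boxes2)
        overlaps(boxes1[a], boxes2[b]) && push!(found, (a, b))
    end
    return found
end

# Two grids of boxes, offset from each other, loosely mirroring the benchmark setup.
grid(offset) = [Box(0.1i + offset, 0.1j + offset, 0.1i + offset + 0.08, 0.1j + offset + 0.08)
                for i in 0:19 for j in 0:19]
set1, set2 = grid(0.0), grid(0.05)
@assert tiled_pairs(set1, set2, 0.5) == direct_pairs(set1, set2)
```

Collecting into a `Set` is what removes pairs that show up in more than one tile. With real clipped geometry there is no such cheap identity check, which is why results near tile edges need separate handling.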

It seems Clipper isn't thread-safe, so multithreading doesn't help, e.g.:

    # Threaded version of map: one output slot per tile
    output_poly = similar(tiles, Vector{Polygon{T}})
    Base.Threads.@threads for out_idx in eachindex(output_poly)
        # Indices into each polygon set for this tile
        idx1, idx2 = tile_poly_indices[out_idx]
        obj = @view poly1[idx1]
        tool = @view poly2[idx2]
        # Clip within the tile and flatten the result into a vector of polygons
        output_poly[out_idx] = reduce(vcat, to_polygons(intersect2d(obj, tool)); init=Polygon{T}[])
    end
    # Concatenate the per-tile results
    output_poly = reduce(vcat, output_poly; init=Polygon{T}[])

This fails with:

    Exception: EXCEPTION_ACCESS_VIOLATION at 0x2474ac5 -- _ZN10ClipperLib11ClipperBase5ResetEv at C:\Users\gpeairs\.julia\artifacts\137fda8664ad2b91186eb126bdbbb66af87d65b1\bin\libcclipper.dll (unknown line)
    _ZN10ClipperLib11ClipperBase5ResetEv at C:\Users\gpeairs\.julia\artifacts\137fda8664ad2b91186eb126bdbbb66af87d65b1\bin\libcclipper.dll (unknown line)
    _ZN10ClipperLib7Clipper15ExecuteInternalEv at C:\Users\gpeairs\.julia\artifacts\137fda8664ad2b91186eb126bdbbb66af87d65b1\bin\libcclipper.dll (unknown line)
    _ZN10ClipperLib7Clipper7ExecuteENS_8ClipTypeERNS_8PolyTreeENS_12PolyFillTypeES4_ at C:\Users\gpeairs\.julia\artifacts\137fda8664ad2b91186eb126bdbbb66af87d65b1\bin\libcclipper.dll (unknown line)
    execute_pt at C:\Users\gpeairs\.julia\artifacts\137fda8664ad2b91186eb126bdbbb66af87d65b1\bin\libcclipper.dll (unknown line)
    execute_pt at C:\Users\gpeairs\.julia\packages\Clipper\kcvXW\src\Clipper.jl:215 [inlined]

(If only we had pure Julia clipping... #35)

A simple multiprocessing version with pmap in place of map does give a speedup for a large enough batch size (I can get 2x with 2 cores, anyway), although you'd probably want to write it so that the workers can load only Clipper.
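
A rough sketch of the `pmap` pattern using the `Distributed` standard library. `clip_tile` is a hypothetical stand-in for the per-tile clipping call, and the tile data and batch size are placeholders rather than values from this PR:

```julia
using Distributed

# addprocs(2)  # uncomment to start worker processes; without it, pmap runs on the main process

# Hypothetical stand-in for the per-tile clipping call. In the real setting each
# worker would also need the clipping code loaded via `@everywhere`.
@everywhere clip_tile(tile) = sum(abs2, tile)

# Placeholder per-tile inputs.
tiles = [rand(100) for _ in 1:64]

# batch_size groups several tiles into one message to a worker, which
# amortizes the communication overhead per tile.
results = pmap(clip_tile, tiles; batch_size=16)
```

Because each worker is a separate process with its own copy of the library, this avoids the shared-state problem that the threaded version runs into.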

Note on healing: results of operations between polygons that touch tile edges may be duplicated, so when the user calls with heal=true we also take the union of the results that touch tile edges. Duplicated results that don't themselves touch a tile edge won't be healed (I think the KLayout implementation also has this issue). [Haven't tested healing in this initial demo.]
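
To make the healing caveat concrete, here is a minimal sketch of how results might be selected for healing, assuming each result is represented by its bounding box and tile boundaries lie at multiples of the tile size. `BBox` and `touches_tile_edge` are hypothetical names, not part of this PR:

```julia
# Hypothetical bounding box of one result polygon: (xmin, ymin, xmax, ymax).
const BBox = NTuple{4,Float64}

# True if the box reaches or crosses a tile boundary (grid lines at multiples of `tile`).
function touches_tile_edge(b::BBox, tile::Float64)
    xmin, ymin, xmax, ymax = b
    # The interval [lo, hi] hits a grid line if its ends fall in different tiles
    # or if lo sits exactly on a line.
    hits(lo, hi) = floor(lo / tile) != floor(hi / tile) || lo / tile == floor(lo / tile)
    return hits(xmin, xmax) || hits(ymin, ymax)
end

results = BBox[(0.2, 0.2, 0.4, 0.4), (0.9, 0.2, 1.1, 0.4), (1.0, 1.2, 1.3, 1.4)]
to_heal = filter(b -> touches_tile_edge(b, 1.0), results)   # would go through the union
untouched = filter(b -> !touches_tile_edge(b, 1.0), results) # passed through as-is
```

Anything that lands in `untouched` is never merged, so a result duplicated across tiles but lying wholly inside one of them would survive as a duplicate.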

I think the question is whether we even need this for realistic use cases (e.g. full-layout XOR) and if we do, whether we need to be able to do this in DeviceLayout rather than KLayout, which is faster.
