Skip to content
Chris Nuernberger edited this page Feb 27, 2018 · 10 revisions

thesis

  • the organization of computations and data for a given algorithm is constrained by a fundamental tension between parallelism, locality, and redundant computation of shared values

    • This is addressed by a systematic model of schedules.
  • Stencil computation - repeated application of kernel across a dataset.

  • loop fusion

  • loop fusion + tiling slides

  • I believe the right way to program image processing pipelines is to separate the intrinsic algorithm—what is computed—from the concerns of e�ciently organizing it for machine execution—decisions about storage and the ordering of computation.

  • this is also why libraries of optimized code cannot deliver efficient performance when building real image processing pipelines: individually optimized subroutines do not compose into an optimized whole, since they cannot reorganize computation for locality or parallel execution across function boundaries.

Clone this wiki locally