
3 MapReduce for Python

tkaitchuck edited this page Nov 26, 2014 · 15 revisions

The Python MapReduce library supports only complete map-shuffle-reduce pipelines. It cannot run a map-only job.
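To make the map-shuffle-reduce flow concrete, here is a minimal in-memory sketch in plain Python. It does not use the App Engine MapReduce library; the names `word_count_map`, `word_count_reduce`, `shuffle`, and `run_pipeline` are hypothetical and chosen only for illustration. It shows how the three phases fit together, not how the library implements them.

```python
from collections import defaultdict


def word_count_map(line):
    """Map phase: emit a (word, 1) pair for every word in one input record."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group every emitted value under its key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def word_count_reduce(key, values):
    """Reduce phase: collapse all values for one key into a single result."""
    yield key, sum(values)


def run_pipeline(records, mapper, reducer):
    """Run map, then shuffle, then reduce, entirely in memory (illustrative only)."""
    mapped = (pair for record in records for pair in mapper(record))
    grouped = shuffle(mapped)
    results = {}
    for key in sorted(grouped):
        for out_key, out_value in reducer(key, grouped[key]):
            results[out_key] = out_value
    return results


counts = run_pipeline(
    ["the quick fox", "the lazy dog"], word_count_map, word_count_reduce
)
```

The reducer only sees values after the shuffle has grouped them by key, which is why the phases are described as a single pipeline.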

Features and capabilities

The App Engine adaptation of Google's MapReduce model is optimized for the needs of the App Engine environment, where resource quota management is a key consideration. This release of the MapReduce API provides the following features and capabilities:

  • Automatic sharding, which lets you use as many workers as you need to get your results faster
  • Standard data input readers for iterating over blob and datastore data
  • Standard output writers
  • Status pages to let you see how your jobs are running
  • Processing rate limiting to slow down your mapper functions and space out the work, helping you avoid exceeding your resource quotas
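As a rough illustration of the sharding and rate-limiting ideas in the list above, the following plain-Python sketch splits input across shards and spaces out mapper calls. It is not the library's implementation: `split_into_shards`, `process_shard`, and the `max_per_second` parameter are hypothetical names, and the round-robin split is an assumption made for simplicity.

```python
import time


def split_into_shards(records, shard_count):
    """Distribute records round-robin across shard_count shards."""
    shards = [[] for _ in range(shard_count)]
    for index, record in enumerate(records):
        shards[index % shard_count].append(record)
    return shards


def process_shard(shard, mapper, max_per_second, sleep=time.sleep):
    """Apply mapper to each record, pausing after each call.

    The pause of 1 / max_per_second seconds keeps this shard at or below
    max_per_second records per second. The sleep function is injectable
    so the pacing can be observed without actually waiting.
    """
    min_interval = 1.0 / max_per_second
    results = []
    for record in shard:
        results.append(mapper(record))
        sleep(min_interval)
    return results


shards = split_into_shards(list(range(10)), 3)
doubled = [
    process_shard(shard, lambda x: x * 2, max_per_second=1000) for shard in shards
]
```

In a real job each shard would run on a separate worker. Here the shards run one after another, so the sketch shows only how the work is divided and paced.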

  • Creating and Running a Job
  • Example Code
  • The MapreducePipeline Class
  • Readers and Writers
