3 MapReduce for Python
tkaitchuck edited this page Nov 26, 2014 · 15 revisions
The Python MapReduce library supports only complete map-shuffle-reduce pipelines. It cannot run a map-only job.
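To make the three stages concrete, here is a minimal in-memory sketch of a map-shuffle-reduce word count in plain Python. It does not use the App Engine library at all; the function names (`word_count_map`, `shuffle`, `word_count_reduce`, `run_pipeline`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def word_count_map(line):
    """Map stage: emit a (word, 1) pair for every word in one input record."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle stage: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def word_count_reduce(key, values):
    """Reduce stage: collapse the values for one key into a single result."""
    return key, sum(values)


def run_pipeline(records, map_fn, reduce_fn):
    """Run map, shuffle, and reduce in sequence over an in-memory input."""
    mapped = (pair for record in records for pair in map_fn(record))
    grouped = shuffle(mapped)
    return dict(reduce_fn(key, values) for key, values in sorted(grouped.items()))


result = run_pipeline(["a b a", "b c"], word_count_map, word_count_reduce)
print(result)
```

The reduce stage needs every value for a key gathered in one place, which is why the shuffle sits between the other two stages.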
The App Engine adaptation of Google's MapReduce model is optimized for the needs of the App Engine environment, where resource quota management is a key consideration. This release of the MapReduce API provides the following features and capabilities:
- Automatic sharding, which spreads the work across as many workers as you need to get results faster
- Standard data input readers for iterating over blob and datastore data
- Standard output writers
- Status pages to let you see how your jobs are running
- Processing rate limiting to slow down your mapper functions and space out the work, helping you avoid exceeding your resource quotas
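The sharding and rate-limiting ideas in the list above can be sketched in plain Python as follows. This is only an illustration of the concepts, not how the library implements them: `make_shards` and `process_shard` are hypothetical helpers, the shards are processed one after another rather than on separate workers, and the throttle is a simple fixed pause per item.

```python
import time


def make_shards(items, shard_count):
    """Split items into shard_count roughly equal shards, round-robin."""
    shards = [[] for _ in range(shard_count)]
    for index, item in enumerate(items):
        shards[index % shard_count].append(item)
    return shards


def process_shard(shard, handler, max_per_second, sleep=time.sleep):
    """Apply handler to each item, pausing after each one.

    Pausing 1 / max_per_second seconds per item keeps the processing rate
    at or below max_per_second items per second.
    """
    delay = 1.0 / max_per_second
    results = []
    for item in shard:
        results.append(handler(item))
        sleep(delay)
    return results


shards = make_shards(list(range(7)), 3)
# Each shard is throttled independently; a real job would hand each shard
# to a different worker instead of looping over them here.
squares = [process_shard(shard, lambda x: x * x, max_per_second=100) for shard in shards]
```

More shards shorten the wall-clock time of a job, while a lower per-shard rate spreads the same work over a longer period so it consumes quota more slowly.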
Creating and Running a Job · Example Code · The MapreducePipeline Class · Readers and Writers