Benchmarks of distributed systems
-
A Coordinator and one or more Worker processes execute in parallel. Each Worker interacts with the Coordinator via RPC.
-
The Coordinator is responsible for assigning tasks and for checking that each Worker completes its task within a reasonable amount of time; if a Worker does not, the Coordinator reclaims the task so it can be reassigned.
-
Each Worker process requests a task from the Coordinator, reads the task's input from one or more files, executes the task, and writes the task's output to one or more files.
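The request/execute/report cycle described above can be sketched as follows. This is a minimal illustration, not the project's code: `CoordinatorApi`, `Task`, `TaskType`, `InMemoryCoordinator`, and `runWorker` are hypothetical names, and an in-process interface stands in for the real RPC layer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Sketch of the Worker/Coordinator exchange; an in-process interface stands in for RPC. */
final class RequestLoopSketch {
    enum TaskType { MAP, REDUCE, EXIT }

    /** Hypothetical task message: type, id, and the input files to read. */
    static final class Task {
        final TaskType type;
        final int id;
        final List<String> inputFiles;

        Task(TaskType type, int id, List<String> inputFiles) {
            this.type = type;
            this.id = id;
            this.inputFiles = inputFiles;
        }
    }

    /** Hypothetical RPC surface that the Coordinator exposes to Workers. */
    interface CoordinatorApi {
        Task requestTask();
        void reportDone(int taskId);
    }

    /** Toy Coordinator: hands out queued tasks, then tells Workers to exit. */
    static final class InMemoryCoordinator implements CoordinatorApi {
        private final Queue<Task> pending = new ArrayDeque<>();
        final List<Integer> completed = new ArrayList<>();

        InMemoryCoordinator(List<Task> tasks) {
            pending.addAll(tasks);
        }

        @Override
        public synchronized Task requestTask() {
            Task next = pending.poll();
            return next != null ? next : new Task(TaskType.EXIT, -1, List.of());
        }

        @Override
        public synchronized void reportDone(int taskId) {
            completed.add(taskId);
        }
    }

    /** Worker loop: request, execute, report, until told to exit. Returns the number of tasks executed. */
    static int runWorker(CoordinatorApi coordinator) {
        int executed = 0;
        while (true) {
            Task task = coordinator.requestTask();
            if (task.type == TaskType.EXIT) {
                return executed;
            }
            // A real Worker would read task.inputFiles, run map or reduce, and write output files here.
            coordinator.reportDone(task.id);
            executed++;
        }
    }

    public static void main(String[] args) {
        InMemoryCoordinator coordinator = new InMemoryCoordinator(List.of(
                new Task(TaskType.MAP, 0, List.of("in-0.txt")),
                new Task(TaskType.REDUCE, 1, List.of("tmp-0-0"))));
        int executed = runWorker(coordinator);
        System.out.println(executed + " tasks executed, completed ids: " + coordinator.completed);
    }
}
```

Keeping the Worker dependent only on a small interface means the same loop could run against an RPC stub or against an in-memory stand-in for testing.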
Coordinator:
-
Create a temporary file directory and an output directory.
-
Start an RPC service backed by a thread pool to accept socket connections from Workers.
-
Maintain multiple Task-related queues and collections and monitor their status.
-
Assign map and reduce tasks to Workers, and reclaim tasks whose execution times out so they can be reassigned.
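One way the task queues and timeout-based reclaiming listed above might fit together is sketched below. The class name `TaskTracker`, its methods, and the policy of passing the current time in explicitly are all assumptions made for illustration; the project's own queues and collections may be organized differently.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Sketch of timeout-based task reclaiming; names and the timeout policy are assumptions. */
final class TaskTracker {
    private final Queue<Integer> idle = new ArrayDeque<>();      // tasks waiting for a Worker
    private final Map<Integer, Long> running = new HashMap<>();  // task id -> start time in ms
    private final List<Integer> done = new ArrayList<>();        // finished task ids
    private final long timeoutMillis;

    TaskTracker(List<Integer> taskIds, long timeoutMillis) {
        idle.addAll(taskIds);
        this.timeoutMillis = timeoutMillis;
    }

    /** Hands out the next idle task, or returns -1 if none is idle right now. */
    synchronized int assign(long nowMillis) {
        Integer id = idle.poll();
        if (id == null) {
            return -1;
        }
        running.put(id, nowMillis);
        return id;
    }

    /** Marks a running task as finished; a report for a task that is not running is ignored. */
    synchronized boolean complete(int taskId) {
        if (running.remove(taskId) == null) {
            return false;
        }
        done.add(taskId);
        return true;
    }

    /** Moves every task that has run longer than the timeout back to the idle queue. */
    synchronized int reclaimExpired(long nowMillis) {
        int reclaimed = 0;
        Iterator<Map.Entry<Integer, Long>> it = running.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Integer, Long> entry = it.next();
            if (nowMillis - entry.getValue() > timeoutMillis) {
                idle.add(entry.getKey());
                it.remove();
                reclaimed++;
            }
        }
        return reclaimed;
    }

    synchronized boolean allDone() {
        return idle.isEmpty() && running.isEmpty();
    }

    public static void main(String[] args) {
        TaskTracker tracker = new TaskTracker(List.of(0, 1), 10_000);
        int first = tracker.assign(0);
        tracker.assign(0);                                // second task is never reported back
        tracker.complete(first);
        int reclaimed = tracker.reclaimExpired(10_001);   // second task exceeded the timeout
        int retried = tracker.assign(10_001);
        tracker.complete(retried);
        System.out.println("reclaimed=" + reclaimed + " allDone=" + tracker.allDone());
    }
}
```

In a running Coordinator, a background thread could call `reclaimExpired(System.currentTimeMillis())` periodically; taking the time as a parameter simply keeps the sketch deterministic.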
Worker:
-
In a loop, ask the Coordinator for a Task, execute it, and then confirm with the Coordinator that the Task has completed.
-
The Worker operates in two modes: map and reduce. When the map mode is completed, the intermediate results are written to temp files and the Worker switches to reduce mode. When the reduce mode is completed, the final result file is written.
-
Implementation of data structures: custom KeyValue, circular doubly linked list, blockQueue, mapSet, etc.
-
Implementation of RPC communication between each Worker and the Coordinator.
-
Concurrency implementation: Lock, ReentrantLock, Condition, etc.
-
Unknown bug
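As an illustration of how the blockQueue and the ReentrantLock/Condition items listed above can work together, here is a minimal bounded blocking queue. It is a sketch under assumptions: `BlockQueueSketch` is a hypothetical name, and the project's own blockQueue may differ (for example, it may be built on the custom linked list rather than `ArrayDeque`).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Minimal bounded blocking queue; a sketch, not the project's blockQueue. */
final class BlockQueueSketch<T> {
    private final Deque<T> items = new ArrayDeque<>();
    private final int capacity;
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    private final Condition notFull = lock.newCondition();

    BlockQueueSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Blocks while the queue is full, then appends the item. */
    void put(T item) throws InterruptedException {
        lock.lock();
        try {
            while (items.size() == capacity) {
                notFull.await();   // releases the lock while waiting
            }
            items.addLast(item);
            notEmpty.signal();     // wake one waiting consumer
        } finally {
            lock.unlock();
        }
    }

    /** Blocks while the queue is empty, then removes and returns the oldest item. */
    T take() throws InterruptedException {
        lock.lock();
        try {
            while (items.isEmpty()) {
                notEmpty.await();
            }
            T item = items.removeFirst();
            notFull.signal();      // wake one waiting producer
            return item;
        } finally {
            lock.unlock();
        }
    }

    int size() {
        lock.lock();
        try {
            return items.size();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BlockQueueSketch<Integer> queue = new BlockQueueSketch<>(2);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 5; i++) {
                    queue.put(i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int sum = 0;
        for (int i = 0; i < 5; i++) {
            sum += queue.take();
        }
        producer.join();
        System.out.println("sum=" + sum);
    }
}
```

The `while` loops around `await()` guard against spurious wakeups, and using two separate conditions lets producers and consumers be signalled independently.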