@@ -26,7 +26,7 @@ that consists of two main methods namely initialize and schedule methods to init
2626configuration and schedule the taskgraph based on the workers information available in the resource plan.
2727
2828``` text
29- edu.iu.dsc.tws.tsched.spi.taskschedule.ITaskScheduler
29+ edu.iu.dsc.tws.api.compute.schedule.ITaskScheduler
3030```
3131
3232has the following methods namely
@@ -71,9 +71,13 @@ container id, task instance plan, required and scheduled resource of the contain
7171plan holds the jobid or the taskgraph id and the container plan. The task schedule plan list is mainly
7272responsible for holding the taskschedule of the batch tasks.
7373
74- \`\` bash message Resource { double availableCPU = 1; double availableMemory = 2; double availableDisk = 3; }
75-
7674``` text
75+ message Resource {
76+ double availableCPU = 1;
77+ double availableMemory = 2;
78+ double availableDisk = 3;
79+ }
80+
7781message TaskInstancePlan {
7882 int32 taskid = 1;
7983 string taskname = 2;
@@ -99,8 +103,6 @@ message TaskSchedulePlanList {
99103}
100104```
101105
102- \`\`
103-
104106## YAML file
105107
106108The task scheduler has task.yaml in the config directory. The task scheduler mode represents either
@@ -109,49 +111,84 @@ memory, disk, and cpu values assigned to the task instances. The default contain
109111represents the percentage of values to be added to each container. The default container instance
110112values represents the default size of memory, disk, and cpu of the container. The task parallelism
111113represents the default parallelism value assigned to each task instance. The task type represents
112- the streaming or batch task.The task scheduler dynamically loads the respective streaming and batch
114+ the streaming or batch task. The task scheduler dynamically loads the respective streaming and batch
113115task schedulers based on the configuration values specified in the task.yaml.
114116
115- \`\` yaml
116-
117117``` text
118- #Streaming Task Scheduler Mode "roundrobin" or "firstfit" or "datalocalityaware"
119- twister2.streaming.taskscheduler: "roundrobin"
118+ # Task scheduling mode for the streaming jobs "roundrobin" or "firstfit" or "datalocalityaware" or "userdefined"
119+ twister2.taskscheduler.streaming: "roundrobin"
120120
121- #Streaming Task Scheduler Class
122- twister2.streaming.taskscheduler.class: "edu.iu.dsc.tws.tsched.streaming.roundrobin.RoundRobinTaskScheduler"
121+ # Task Scheduler class for the round robin streaming task scheduler
122+ twister2.taskscheduler.streaming.class: "edu.iu.dsc.tws.tsched.streaming.roundrobin.RoundRobinTaskScheduler"
123123
124- #Batch Task Scheduler Mode "roundrobin" or "datalocalityaware"
125- twister2.batch.taskscheduler: "roundrobin"
124+ # Task scheduling mode for the batch jobs "roundrobin" or "datalocalityaware" or "userdefined" or "batchscheduler"
125+ twister2.taskscheduler.batch: "roundrobin"
126126
127- #Batch Task Scheduler Class
128- twister2.batch.taskscheduler.class: "edu.iu.dsc.tws.tsched.batch.roundrobin.RoundRobinBatchTaskScheduler"
127+ # Task Scheduler class for the round robin batch task scheduler
128+ twister2.taskscheduler.batch.class: "edu.iu.dsc.tws.tsched.batch.roundrobin.RoundRobinBatchTaskScheduler"
129129
130- #Default Task Instance Values
131- twister2.task.instances: 2
132- twister2.task.instance.ram: 512.0
133- twister2.task.instance.disk: 500.0
134- twister2.task.instance.cpu: 2.0
130+ # Number of task instances to be allocated to each worker/container
131+ twister2.taskscheduler.task.instances: 2
135132
136- #Default Container Padding Values
137- twister2.ram.padding.container: 2.0
138- twister2.disk.padding.container: 12.0
139- twister2.cpu.padding.container: 1.0
140- twister2.container.padding.percentage: 2
133+ # Ram value to be allocated to each task instance
134+ twister2.taskscheduler.task.instance.ram: 512.0
141135
142- #Default Container Instance Values
143- twister2.container.instance.ram: 2048.0
144- twister2.container.instance.disk: 2000.0
145- twister2.container.instance.cpu: 4.0
136+ # Disk value to be allocated to each task instance
137+ twister2.taskscheduler.task.instance.disk: 500.0
146138
147- #Default Task Parallelism Value
148- twister2.task.parallelism: 2
139+ # CPU value to be allocated to each task instance
140+ twister2.taskscheduler.instance.cpu: 2.0
149141
150- #Default Task Type "streaming" or "batch"
151- twister2.task.type: "streaming"
152- ```
142+ # Default Container Instance Values
143+ # Ram value to be allocated to each container
144+ twister2.taskscheduler.container.instance.ram: 4096.0
153145
154- \`\`
146+ # Disk value to be allocated to each container
147+ twister2.taskscheduler.container.instance.disk: 8000.0
148+
149+ twister2.taskscheduler.container.instance.cpu: 16.0
150+
151+ # Default Container Padding Values
152+ # Default padding value of the ram to be allocated to each container
153+ twister2.taskscheduler.ram.padding.container: 2.0
154+
155+ # Default padding value of the disk to be allocated to each container
156+ twister2.taskscheduler.disk.padding.container: 12.0
157+
158+ # CPU padding value to be allocated to each container
159+ twister2.taskscheduler.cpu.padding.container: 1.0
160+
161+ # Percentage value to be allocated to each container
162+ twister2.taskscheduler.container.padding.percentage: 2
163+
164+ # Static Default Network parameters
165+ # Bandwidth value to be allocated to each container instance for datalocality scheduling
166+ twister2.taskscheduler.container.instance.bandwidth: 100 #Mbps
167+
168+ # Latency value to be allocated to each container instance for datalocality scheduling
169+ twister2.taskscheduler.container.instance.latency: 0.002 #Milliseconds
170+
171+ # Bandwidth to be allocated to each datanode instance for datalocality scheduling
172+ twister2.taskscheduler.datanode.instance.bandwidth: 200 #Mbps
173+
174+ # Latency value to be allocated to each datanode instance for datalocality scheduling
175+ twister2.taskscheduler.datanode.instance.latency: 0.01 #Milliseconds
176+
177+ # Parallelism value for each task instance
178+ twister2.taskscheduler.task.parallelism: 2
179+
180+ # Task type of each submitted job; the default is "streaming".
181+ twister2.taskscheduler.task.type: "streaming"
182+
183+ # number of threads per worker
184+ twister2.exector.worker.threads: 1
185+
186+ # name of the batch executor
187+ twister2.executor.batch.name: "edu.iu.dsc.tws.executor.threading.BatchSharingExecutor2"
188+
189+ # number of tuples executed at a single pass
190+ twister2.exector.instance.queue.low.watermark: 10000
191+ ```
155192
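The way a scheduler class is picked from the task.yaml values can be sketched in a few lines of Python. This is a minimal illustration, not the Twister2 loading code: the `SCHEDULER_CLASSES` table and the `resolve_scheduler_class` function are hypothetical names, and the rule that an explicit `.class` entry takes precedence over the mode is an assumption. The configuration keys and class names are the ones listed in this document.

```python
# Hypothetical lookup table: (task type, scheduling mode) -> scheduler class name.
# The class names come from this document; the table itself is only illustrative.
SCHEDULER_CLASSES = {
    ("streaming", "roundrobin"):
        "edu.iu.dsc.tws.tsched.streaming.roundrobin.RoundRobinTaskScheduler",
    ("streaming", "firstfit"):
        "edu.iu.dsc.tws.tsched.streaming.firstfit.FirstFitStreamingTaskScheduler",
    ("streaming", "datalocalityaware"):
        "edu.iu.dsc.tws.tsched.streaming.datalocalityaware.DataLocalityStreamingTaskScheduler",
    ("batch", "roundrobin"):
        "edu.iu.dsc.tws.tsched.batch.roundrobin.RoundRobinBatchTaskScheduler",
    ("batch", "datalocalityaware"):
        "edu.iu.dsc.tws.tsched.batch.datalocalityaware.DataLocalityBatchTaskScheduler",
    ("batch", "batchscheduler"):
        "edu.iu.dsc.tws.tsched.batch.batchscheduler.BatchTaskScheduler",
}


def resolve_scheduler_class(config):
    """Return the scheduler class name for a flat dict of task.yaml values."""
    # The task type defaults to "streaming", as stated in the YAML comments.
    task_type = config.get("twister2.taskscheduler.task.type", "streaming")
    mode = config.get("twister2.taskscheduler." + task_type, "roundrobin")

    # Assumption: an explicitly configured class wins over the mode lookup.
    explicit = config.get("twister2.taskscheduler." + task_type + ".class")
    if explicit:
        return explicit

    try:
        return SCHEDULER_CLASSES[(task_type, mode)]
    except KeyError:
        raise ValueError(
            "no scheduler class known for %s mode %r" % (task_type, mode)
        )
```

With an empty configuration the sketch falls back to the streaming round robin scheduler; a user-defined mode has no table entry, so it only resolves when the `.class` key is set.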
156193## User-Defined Task Scheduler
157194
@@ -165,8 +202,31 @@ as "user-defined" with the corresponding "user-defined" task scheduler class nam
165202#User-defined Streaming Task Scheduler
166203twister2.streaming.taskscheduler: "user-defined"
167204
168- #User-defined Streaming Task Scheduler Class
169- twister2.streaming.taskscheduler.class: "edu.iu.dsc.tws.tsched.userdefined.UserDefinedTaskScheduler"
205+ # Task Scheduler for the userDefined Streaming Task Scheduler
206+ #twister2.taskscheduler.streaming.class: "edu.iu.dsc.tws.tsched.userdefined.UserDefinedTaskScheduler"
207+
208+ # Task Scheduler for the userDefined Batch Task Scheduler
209+ #twister2.taskscheduler.batch.class: "edu.iu.dsc.tws.tsched.userdefined.UserDefinedTaskScheduler"
210+ ```
211+
212+ \`\`
213+
214+ ## Batch Task Scheduler
215+
216+ The Batch Task Scheduler can schedule both a single task graph and multiple
217+ dependent task graphs. Its main constraint is that the dependent tasks in the task graphs must
218+ be given the same parallelism value. It schedules the tasks in a round
219+ robin fashion, but when scheduling the child (dependent) tasks it also considers the data locality
220+ of the input data from the parent tasks and then assigns the tasks to the workers in a round robin fashion.
221+
222+ \`\` yaml
223+
224+ ``` text
225+ #Batch Task Scheduler
226+ twister2.taskscheduler.batch: "batchscheduler"
227+
228+ #Task Scheduler class for the batch task scheduler
229+ twister2.taskscheduler.batch.class: "edu.iu.dsc.tws.tsched.batch.batchscheduler.BatchTaskScheduler"
170230```
171231
172232\`\`
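The batch scheduling behaviour described above can be illustrated with a small Python sketch. It is not the `BatchTaskScheduler` implementation: the function name `schedule_round_robin`, the integer worker ids, and the `parents` mapping are hypothetical, and placing child instance `i` on the worker of parent instance `i` is only one possible reading of "considers the data locality of the input data from the parent tasks".

```python
def schedule_round_robin(task_parallelism, worker_ids, parents=None):
    """Assign task instances to workers in round robin order.

    task_parallelism: dict of task name -> parallelism, listed with parents
                      before their children (dicts keep insertion order).
    worker_ids:       list of worker ids taken from the resource plan.
    parents:          optional dict of child task name -> parent task name.
    Returns a dict of task name -> list of worker ids, one per task instance.
    """
    parents = parents or {}
    plan = {}
    next_slot = 0  # running round robin position shared by all tasks

    for task, parallelism in task_parallelism.items():
        parent = parents.get(task)
        if parent is not None:
            # Constraint from the text: dependent tasks share one parallelism value.
            if task_parallelism[parent] != parallelism:
                raise ValueError(
                    "task %r must use the same parallelism as %r" % (task, parent)
                )
            # Assumed locality rule: child instance i runs where parent instance i ran.
            plan[task] = list(plan[parent])
            continue

        assigned = []
        for _ in range(parallelism):
            assigned.append(worker_ids[next_slot % len(worker_ids)])
            next_slot += 1
        plan[task] = assigned

    return plan
```

For example, `schedule_round_robin({"source": 4, "sink": 4}, [0, 1], {"sink": "source"})` places both tasks on workers `[0, 1, 0, 1]`, while giving `sink` a different parallelism raises a `ValueError`.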
@@ -176,25 +236,18 @@ twister2.streaming.taskscheduler.class: "edu.iu.dsc.tws.tsched.userdefined.UserD
176236The other task schedulers and their respective class names are given below. The user has to specify
177237the respective scheduler mode and its corresponding class name.
178238
239+
179240\`\` yaml
180241
181242``` text
182- #Streaming Task Scheduler Mode "roundrobin" or "firstfit" or "datalocalityaware"
183- twister2.streaming.taskscheduler: "roundrobin"
184-
185- #Streaming Task Scheduler Class
186- twister2.streaming.taskscheduler.class: "edu.iu.dsc.tws.tsched.streaming.roundrobin.RoundRobinTaskScheduler"
187-
188- #twister2.streaming.taskscheduler.class: "edu.iu.dsc.tws.tsched.streaming.datalocalityaware.DataLocalityStreamingTaskScheduler"
189- #twister2.streaming.taskscheduler.class: "edu.iu.dsc.tws.tsched.streaming.firstfit.FirstFitStreamingTaskScheduler"
243+ # Task Scheduler for the Data Locality Aware Streaming Task Scheduler
244+ #twister2.taskscheduler.streaming.class: "edu.iu.dsc.tws.tsched.streaming.datalocalityaware.DataLocalityStreamingTaskScheduler"
190245
191- #Batch Task Scheduler Mode "roundrobin" or "datalocalityaware"
192- #twister2.batch.taskscheduler: "roundrobin"
193- twister2.batch.taskscheduler: "datalocalityaware"
246+ # Task Scheduler for the FirstFit Streaming Task Scheduler
247+ #twister2.taskscheduler.streaming.class: "edu.iu.dsc.tws.tsched.streaming.firstfit.FirstFitStreamingTaskScheduler"
194248
195- #Batch Task Scheduler Class
196- twister2.batch.taskscheduler.class: "edu.iu.dsc.tws.tsched.batch.datalocalityaware.DataLocalityBatchTaskScheduler"
197- #twister2.batch.taskscheduler.class: "edu.iu.dsc.tws.tsched.batch.roundrobin.RoundRobinBatchTaskScheduler"
249+ # Task Scheduler for the Data Locality Aware Batch Task Scheduler
250+ #twister2.taskscheduler.batch.class: "edu.iu.dsc.tws.tsched.batch.datalocalityaware.DataLocalityBatchTaskScheduler"
198251```
199252
200253\`\`