Preface
This article takes a look at PowerJob's MapReduceProcessor.
MapReduceProcessor
public interface MapReduceProcessor extends MapProcessor {
    /**
     * reduce is called after all tasks have finished
     * @param context the task context
     * @param taskResults holds the execution result of each sub-task
     * @return the result produced by reduce is used as the final result of the job
     */
    ProcessResult reduce(TaskContext context, List<TaskResult> taskResults);
}
MapReduceProcessor extends MapProcessor and adds a reduce method.
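As a rough illustration of what a reduce implementation might do, the sketch below aggregates sub-task results into one final result. The classes SketchTaskResult, SketchProcessResult and ReduceSketch are simplified stand-ins invented for this example so that it compiles without PowerJob on the classpath; a real implementation would implement MapReduceProcessor and also receive a TaskContext. The "succeed only if every sub-task succeeded" policy is an assumption, not something PowerJob prescribes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for PowerJob's TaskResult (taskId, success, result).
class SketchTaskResult {
    final String taskId;
    final boolean success;
    final String result;

    SketchTaskResult(String taskId, boolean success, String result) {
        this.taskId = taskId;
        this.success = success;
        this.result = result;
    }
}

// Hypothetical stand-in for PowerJob's ProcessResult (success flag plus message).
class SketchProcessResult {
    final boolean success;
    final String msg;

    SketchProcessResult(boolean success, String msg) {
        this.success = success;
        this.msg = msg;
    }
}

class ReduceSketch {
    // Same shape as reduce(context, taskResults), minus the TaskContext argument.
    // Assumed policy: the job succeeds only if every sub-task succeeded.
    static SketchProcessResult reduce(List<SketchTaskResult> taskResults) {
        List<String> failedIds = new ArrayList<>();
        for (SketchTaskResult r : taskResults) {
            if (!r.success) {
                failedIds.add(r.taskId);
            }
        }
        String msg = "total=" + taskResults.size() + ", failed=" + failedIds;
        return new SketchProcessResult(failedIds.isEmpty(), msg);
    }
}
```

Other policies (for example, tolerating a fraction of failed sub-tasks) would only change the condition passed to the result constructor.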
TaskResult
tech/powerjob/worker/core/processor/TaskResult.java
@Data
public class TaskResult {
    private String taskId;
    private boolean success;
    private String result;
}
TaskResult defines three properties: taskId, success, and result.
handleLastTask
tech/powerjob/worker/core/processor/runnable/HeavyProcessorRunnable.java
    private void handleLastTask(String taskId, Long instanceId, TaskContext taskContext, ExecuteType executeType) {
        final BasicProcessor processor = processorBean.getProcessor();
        ProcessResult processResult;
        Stopwatch stopwatch = Stopwatch.createStarted();
        log.debug("[ProcessorRunnable-{}] the last task(taskId={}) start to process.", instanceId, taskId);
        List<TaskResult> taskResults = workerRuntime.getTaskPersistenceService().getAllTaskResult(instanceId, task.getSubInstanceId());
        try {
            switch (executeType) {
                case BROADCAST:
                    if (processor instanceof BroadcastProcessor) {
                        BroadcastProcessor broadcastProcessor = (BroadcastProcessor) processor;
                        processResult = broadcastProcessor.postProcess(taskContext, taskResults);
                    } else {
                        processResult = BroadcastProcessor.defaultResult(taskResults);
                    }
                    break;
                case MAP_REDUCE:
                    if (processor instanceof MapReduceProcessor) {
                        MapReduceProcessor mapReduceProcessor = (MapReduceProcessor) processor;
                        processResult = mapReduceProcessor.reduce(taskContext, taskResults);
                    } else {
                        processResult = new ProcessResult(false, "not implement the MapReduceProcessor");
                    }
                    break;
                default:
                    processResult = new ProcessResult(false, "IMPOSSIBLE OR BUG");
            }
        } catch (Throwable e) {
            processResult = new ProcessResult(false, e.toString());
            log.warn("[ProcessorRunnable-{}] execute last task(taskId={}) failed.", instanceId, taskId, e);
        }
        TaskStatus status = processResult.isSuccess() ? TaskStatus.WORKER_PROCESS_SUCCESS : TaskStatus.WORKER_PROCESS_FAILED;
        reportStatus(status, suit(processResult.getMsg()), null, taskContext.getWorkflowContext().getAppendedContextData());
        log.info("[ProcessorRunnable-{}] the last task execute successfully, using time: {}", instanceId, stopwatch);
    }
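HeavyProcessorRunnable's handleLastTask method first fetches taskResults via workerRuntime.getTaskPersistenceService().getAllTaskResult. For the MAP_REDUCE execute type, if the processor is a MapReduceProcessor it calls back mapReduceProcessor.reduce; otherwise it produces a failed ProcessResult. Any Throwable thrown by the callback is caught and converted into a failed ProcessResult, and the resulting status is then reported via reportStatus.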
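The dispatch pattern in handleLastTask can be reduced to a small standalone model. The sketch below is not PowerJob code: DemoExecuteType, DemoResult, DemoProcessor, DemoReducer and LastTaskDispatchSketch are hypothetical names, the task results are plain strings, and the BROADCAST branch and status reporting are left out. It only shows the instanceof check and the way a thrown exception becomes a failed result.

```java
import java.util.List;

// Hypothetical stand-ins; the real PowerJob interfaces also take a TaskContext.
enum DemoExecuteType { STANDALONE, BROADCAST, MAP_REDUCE }

class DemoResult {
    final boolean success;
    final String msg;

    DemoResult(boolean success, String msg) {
        this.success = success;
        this.msg = msg;
    }
}

// Plays the role of BasicProcessor: a processor with no reduce capability.
interface DemoProcessor { }

// Plays the role of MapReduceProcessor.
interface DemoReducer extends DemoProcessor {
    DemoResult reduce(List<String> taskResults);
}

class LastTaskDispatchSketch {
    static DemoResult handleLastTask(DemoProcessor processor, DemoExecuteType type, List<String> taskResults) {
        DemoResult result;
        try {
            switch (type) {
                case MAP_REDUCE:
                    if (processor instanceof DemoReducer) {
                        // The processor supports reduce: hand it all sub-task results.
                        result = ((DemoReducer) processor).reduce(taskResults);
                    } else {
                        result = new DemoResult(false, "not implement the MapReduceProcessor");
                    }
                    break;
                default:
                    // BROADCAST handling is omitted in this sketch.
                    result = new DemoResult(false, "IMPOSSIBLE OR BUG");
            }
        } catch (Throwable e) {
            // A failing reduce never escapes; it becomes a failed result.
            result = new DemoResult(false, e.toString());
        }
        return result;
    }
}
```

One consequence visible in this model is that an exception inside reduce does not crash the worker thread; the job simply ends in a failed state with the exception text as its message.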
getAllTaskResult
tech/powerjob/worker/persistence/TaskPersistenceService.java
    public List<TaskResult> getAllTaskResult(Long instanceId, Long subInstanceId) {
        try {
            return execute(() -> taskDAO.getAllTaskResult(instanceId, subInstanceId));
        } catch (Exception e) {
            log.error("[TaskPersistenceService] getTaskId2ResultMap for instance(id={}) failed.", instanceId, e);
        }
        return Lists.newLinkedList();
    }
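TaskPersistenceService's getAllTaskResult method queries the task_info table by instanceId and subInstanceId:
select task_id, status, result from task_info where instance_id = ? and sub_instance_id = ?
Of the rows returned, only tasks whose status is WORKER_PROCESS_SUCCESS or WORKER_PROCESS_FAILED are kept. If the query throws, the exception is logged and an empty list is returned.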
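The DAO code that performs this filtering is not shown here, so the following is only a sketch of the described behaviour under assumptions: rows are read with (task_id, status, result), and only the two terminal statuses are converted into results. RowStatus, TaskRow, FinishedTask and ResultFilterSketch are hypothetical names, and IN_PROGRESS is a placeholder for any non-terminal status rather than a real PowerJob constant.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified status set: the two terminal names come from the article,
// IN_PROGRESS is a placeholder for every other status.
enum RowStatus { IN_PROGRESS, WORKER_PROCESS_FAILED, WORKER_PROCESS_SUCCESS }

// Hypothetical view of one task_info row: task_id, status, result.
class TaskRow {
    final String taskId;
    final RowStatus status;
    final String result;

    TaskRow(String taskId, RowStatus status, String result) {
        this.taskId = taskId;
        this.status = status;
        this.result = result;
    }
}

// Hypothetical stand-in for TaskResult.
class FinishedTask {
    final String taskId;
    final boolean success;
    final String result;

    FinishedTask(String taskId, boolean success, String result) {
        this.taskId = taskId;
        this.success = success;
        this.result = result;
    }
}

class ResultFilterSketch {
    // Keeps only finished tasks and maps the status to a success flag.
    static List<FinishedTask> toTaskResults(List<TaskRow> rows) {
        List<FinishedTask> out = new ArrayList<>();
        for (TaskRow row : rows) {
            if (row.status == RowStatus.WORKER_PROCESS_SUCCESS) {
                out.add(new FinishedTask(row.taskId, true, row.result));
            } else if (row.status == RowStatus.WORKER_PROCESS_FAILED) {
                out.add(new FinishedTask(row.taskId, false, row.result));
            }
            // Rows in any other status are skipped.
        }
        return out;
    }
}
```

Under this reading, a reduce implementation only ever sees sub-tasks that have already reached a terminal state.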
Summary
MapReduceProcessor extends MapProcessor and adds a reduce method. HeavyProcessorRunnable's handleLastTask method first fetches taskResults via workerRuntime.getTaskPersistenceService().getAllTaskResult and then, for a MapReduceProcessor, calls back mapReduceProcessor.reduce. The getAllTaskResult method queries the task_info table by instanceId and subInstanceId and returns the tasks whose status is WORKER_PROCESS_SUCCESS or WORKER_PROCESS_FAILED. The task_info table exists only on worker nodes; by default it is stored in H2 (~/powerjob/worker/h2/{uuid}/powerjob_worker_db.mv.db).