Abstract

A Job Utility and Size-based Scheduler for Meeting the Client’s Job Requirements in Hadoop

Aditi Jain*, Sanjay Jain, DA Mehta

Hadoop, an implementation of the MapReduce paradigm, is a powerful open-source parallel processing framework for handling big data on distributed clusters of commodity hardware, such as clouds. Proper scheduling of jobs on such a distributed cluster is an important factor in determining the cluster’s performance. It requires efficient algorithms that focus both on meeting the job requirements provided by clients, such as job deadline and job priority, and on improving the average job response time in the cluster. Whether a client’s job completion requirements are met is an important measure of the service quality the client obtains from the cloud. The utility of a job denotes the quality-of-service requirements between client and service provider. Existing job schedulers in Hadoop (viz., FIFO, Fair Scheduler, Capacity Scheduler) usually ignore the job requirements (such as job deadline and job priority) specified by clients. There is therefore a need for a scheduler that schedules jobs efficiently while considering the clients’ job requirements. The problem addressed in this work is that of scheduling jobs while taking into account the job requirements specified by the client. To satisfy these requirements, the proposed scheduling algorithm calculates the utility value of each job from the client-specified job requirements and the estimated job size. The results show that a higher percentage of jobs meet the client’s job requirements under the proposed scheduler than under the default scheduler in Hadoop.
