PySchedCL

    PySchedCL is a Python-based scheduling framework for OpenCL applications. It relies heavily on the PyOpenCL package, which gives the end user access to the full OpenCL API. The primary goal of PySchedCL is to provide a platform for rapid prototyping of high-performance applications on heterogeneous architectures. It is also a research tool for experimenting with scheduling and mapping policies for running multiple OpenCL applications on different heterogeneous platforms. The base API is written in Python and uses PyOpenCL to execute OpenCL kernels.

Project Hierarchy

  • Base Package: The base file pyschedcl.py contains the primary functions for PySchedCL. It provides a Kernel class implementation that encapsulates all data related to an OpenCL kernel task, together with functions for building programs against OpenCL devices, loading or generating input data, creating buffers for data transfer, and dispatching an OpenCL kernel to specific devices.
  • Constants: The file pyschedcl_constants.py contains a list of constant values used by the base package. The values are generated during the initial configuration of the tool.
    • CPU_PLATFORM: List of platform names for CPU devices
    • GPU_PLATFORM: List of platform names for GPU devices
    • NUM_CPU_DEVICES: Number of CPU devices available
    • NUM_GPU_DEVICES: Number of GPU devices available
    • SOURCE_DIR: Absolute path of the directory containing the pyschedcl package
  • Partition: partition.py is the primary script for partitioning an OpenCL kernel. Given a partition class value and a dataset size, it partitions a single kernel across one CPU device and one GPU device.
  • Scheduling: scheduler.py is the primary script for scheduling. It contains all the modules, written using the base PySchedCL API, needed to schedule an independent set of OpenCL kernels on multiple CPU and GPU devices.
  • utils: Folder containing additional scripts related to pyschedcl.
    • get_optimal_partition.py: Runs a specific kernel for all partition class values to obtain the optimal partition class value.
    • parse_output_dump.py: Parses output data dump after kernel execution. The output dumps are stored in the outputs folder.
    • run_scheduler.py: Runs the scheduling script multiple times for a given set of kernels.
    • log_parser.py: Parses the log file generated after execution of a single kernel or multiple kernels to get various event specific information.
  • info: Folder containing the kernel specification files for the individual kernels (.json files) as well as the task files for scheduling (.task files). Users must place their kernel specification files and task files in this folder before running PySchedCL scripts.
  • kernel_src: Folder containing the kernel source files for the individual kernels (.cl files). Users must place their kernel source files in this folder before running PySchedCL.
  • logs: Folder containing the log files generated during execution of the partition.py and scheduler.py scripts. The naming convention for a log file is kernelName_partitionClass_datasetSize_timestamp_debug.log.
  • gantt_chart: Folder containing Gantt charts generated after execution of partition.py and scheduler.py scripts. The naming convention for the Gantt chart file is kernelName_partitionClass_datasetSize_timestamp.png.
  • outputs: Folder containing data dump outputs generated after execution of partition.py and scheduler.py scripts. The naming convention for the dump file is kernelName_partitionClass_datasetSize_timestamp.pickle.
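
The naming conventions for the logs, gantt_chart, and outputs folders share a common stem, so the file names can be built and parsed with a small helper. The following is a minimal sketch and not part of PySchedCL: the names RunInfo, ARTIFACTS, artifact_path, and parse_artifact_name are hypothetical, and it assumes that the partition class and dataset size are integers and that the timestamp contains no underscores.

```python
from collections import namedtuple

# Hypothetical record for the four fields encoded in an artifact file name.
RunInfo = namedtuple(
    "RunInfo", "kernel_name partition_class dataset_size timestamp"
)

# Folder and file suffix for each artifact kind, taken from the hierarchy above.
ARTIFACTS = {
    "log": ("logs", "_debug.log"),
    "gantt": ("gantt_chart", ".png"),
    "dump": ("outputs", ".pickle"),
}


def artifact_path(kind, kernel_name, partition_class, dataset_size, timestamp):
    """Build the relative path kernelName_partitionClass_datasetSize_timestamp<suffix>."""
    folder, suffix = ARTIFACTS[kind]
    stem = "_".join(
        str(part)
        for part in (kernel_name, partition_class, dataset_size, timestamp)
    )
    return folder + "/" + stem + suffix


def parse_artifact_name(filename):
    """Recover the run fields from a log, Gantt chart, or dump file name."""
    name = filename.rsplit("/", 1)[-1]  # drop any leading folder
    for _, suffix in ARTIFACTS.values():
        if name.endswith(suffix):
            stem = name[: -len(suffix)]
            break
    else:
        raise ValueError("unrecognised artifact name: " + filename)
    # Split from the right so that kernel names containing underscores survive.
    # Assumption: the timestamp itself contains no underscores.
    parts = stem.rsplit("_", 3)
    if len(parts) != 4:
        raise ValueError("unexpected number of fields in: " + filename)
    kernel_name, partition_class, dataset_size, timestamp = parts
    return RunInfo(kernel_name, int(partition_class), int(dataset_size), timestamp)
```

Splitting from the right keeps a kernel name such as mat_mul intact, which a plain left-to-right split on underscores would break apart.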