CLyther

CLyther is a Python tool similar to Cython. CLyther is a python language extension that makes writing OpenCL code as easy as Python itself. CLyther currently only supports a subset of the Python language definition but adds many new features to OpenCL. CLyther was inspired by PyCUDA and its views on Metaprogramming.

CLyther exposes both the OpenCL C library as well as the OpenCL language to python.

Features

  • Fast prototyping of OpenCL code.
  • Create OpenCL code using the Python language definition.
  • Strong OOP programming in OpenCL code.
  • Passing functions as arguments to kernel functions.
  • Python emulation mode of OpenCL code.
  • Fancy indexing of arrays
  • Device memory management
  • Dynamic compilation at runtime

Example

This is an example of a generic OpenCL reduce operator. The arguments include oper which May Be a python function passed to reduce at runtime.

    @clyther.kernel
    @clyther.const( 'group_size' )
    @clyther.bind( 'global_work_size', 1 )
    @clyther.bind( 'local_work_size','group_size' )
    def reduce( output, input, shared, oper, group_size=512 ):

        lid = clrt.get_local_id(0)

        gid = clrt.get_group_id(0)
        gsize = clrt.get_num_groups(0)

        gs2 = group_size * 2

        stride = gs2 * gsize

        shared[lid] = 0.0

        i = gid * gs2 + lid

        shared[lid] = 0
        
        while i < input.size:
            shared[lid] = oper( shared[lid], input[i] )
            shared[lid] = oper( shared[lid], input[i+group_size] )
            
            i += stride
            
            clrt.barrier( clrt.K_LOCAL_MEM_FENCE )
        
        for cgs in [ 512 , 256, 128, 64, 32, 16, 8, 4, 2]:
            if group_size >= cgs:
                if lid < cgs/2:
                    shared[lid] = oper( shared[lid], shared[lid + cgs/2] )
                clrt.barrier( clrt.K_LOCAL_MEM_FENCE )
            
        if lid == 0: 
            output[gid] = shared[0]

This example shows an extensible OpenCL reduction operation written with CLyther.

To use this example you may run somthing like.

    # where `clarray` is a device array.  
    
    @cl.device
    def add( a,b):
        return a+b

    output = clyther.buffer( [1], ctype=clarray.ctype )
    shared = clyther.shared( ctype=ctypes.c_float )
    
    reduce( output, clarray, shared, add,  group_size=128 )
    
    print output.item()

See also

  • GPGPU
  • OpenCL
  • Metaprogramming