I'm having some problems porting CUDAfy to VB.NET.
I'm using VS2013 (VB.NET) and CUDAfy 1.29.
Could you please help me understand what to do?
My questions:
1- How can I serialize the module's function?
2- Why is CUDAfy not finding my function?
Thanks a lot for any help.
The problems related to my questions are pointed out below, next to the code.
Variables
Shared cs_CC As String = "adiciona"
Shared MyGPU As GPGPU = Nothing
Shared Arch As eArchitecture = Nothing
See my code:
Shared Sub Executa()
If Loader() = True Then
Dim Modulo = CudafyModule.TryDeserialize(cs_cc)
If IsNothing(Modulo) OrElse (Not Modulo.TryVerifyChecksums) Then
Modulo = CudafyTranslator.Cudafy(ePlatform.All, Arch, cs_CC.GetType)
Modulo.Serialize()
End If
Problem 1: Serialize() creates a file named STRING.CDFY in my folder, not ADICIONA.CDFY, which is what I expected (for the function adiciona). I would like to avoid recompiling on every run of this code, so how do I make CUDAfy write the file correctly? Or does it only work with variable types?
Dim a As Integer() = New Integer(N - 1) {}
Dim b As Integer() = New Integer(N - 1) {}
Dim c As Integer() = New Integer(N - 1) {}
' allocate the memory on the GPU
Dim dev_a As Integer() = MyGPU.Allocate(Of Integer)(a)
Dim dev_b As Integer() = MyGPU.Allocate(Of Integer)(b)
Dim dev_c As Integer() = MyGPU.Allocate(Of Integer)(c)
' fill the arrays 'a' and 'b' on the CPU
For i As Integer = 0 To N - 1
a(i) = i
b(i) = 2 * i
Next
' copy the arrays 'a' and 'b' to the GPU
MyGPU.CopyToDevice(a, dev_a)
MyGPU.CopyToDevice(b, dev_b)
' 128 threads
For i As Integer = 0 To 128
MyGPU.Launch(128, 1).adiciona(dev_a, dev_b, dev_c)
Next
Problem 2: this is where it fails. When VS runs the code, everything goes OK up to this point, then it stops with the message "COULD NOT FIND FUNCTION 'ADICIONA' IN MODULE".
End If
End Sub
My function ADICIONA
<Cudafy()> _
Shared Sub adiciona(thread As GThread, a As Integer(), b As Integer(), c As Integer())
' each thread responsible for N/128 ie 256 elements (32768)
' each thread responsible for N/128 ie 512 elements (65536)
' tid (threadID) will be a number between 0 and 128 ie starting point
Dim tid As Integer = thread.blockIdx.x
While tid < N
c(tid) = a(tid) + b(tid)
' jump how many elements (128)
tid += thread.gridDim.x
End While
End Sub
My function LOADER
Public Shared Function Loader() As Boolean
DeviceType = eGPUType.Cuda
CudafyModes.Target = DeviceType
CudafyTranslator.Language = If(CudafyModes.Target = eGPUType.Cuda, eLanguage.Cuda, eLanguage.OpenCL)
Dim CompatibleDevice As GPGPUProperties() = CudafyHost.GetDeviceProperties(CudafyModes.Target, True).ToArray
If Not CompatibleDevice.Any Then ' no full CUDA device available
MsgBox("I did not find any OpenCL or CUDA compatible device")
Return False
End If
Dim selectedDevice As GPGPUProperties = CompatibleDevice(0)
If IsNothing(selectedDevice) Then
MsgBox("I cannot allocate a compatible device")
Return False
End If
CudafyModes.DeviceId = selectedDevice.DeviceId
Thread_per_Block = selectedDevice.MaxThreadsPerBlock
Blocks_per_Grid = selectedDevice.MaxThreadsSize.x
Shared_Mem_per_Block = selectedDevice.SharedMemoryPerBlock
MyGPU = CudafyHost.GetDevice(CudafyModes.Target, CudafyModes.DeviceId)
Arch = MyGPU.GetArchitecture
Return True
End Function