Releases: CNugteren/CLTune

Version 2.7.0

26 Jun 19:27

CNugteren

CLTune now automatically ensures global size is a multiple of the local workgroup size
Added GetBestResult() to the tuner's API to retrieve the best parameters programmatically
Changed std::initializer_list in the AddParameters API to std::vector
Fixed a bug in the simulated annealing search method
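
A minimal sketch of what keeping the global size a multiple of the local workgroup size can look like, assuming each dimension is rounded up (the note above does not say whether CLTune rounds up or down). `RoundUpToMultiple` and `PadGlobalSize` are hypothetical helper names used for illustration, not part of the CLTune API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: rounds `value` up to the nearest multiple of `multiple`.
std::size_t RoundUpToMultiple(std::size_t value, std::size_t multiple) {
  assert(multiple > 0);  // a local workgroup size of zero is not meaningful here
  return ((value + multiple - 1) / multiple) * multiple;
}

// Hypothetical helper: pads every dimension of the global size so that it is
// divisible by the matching dimension of the local workgroup size.
std::vector<std::size_t> PadGlobalSize(const std::vector<std::size_t>& global,
                                       const std::vector<std::size_t>& local) {
  assert(global.size() == local.size());
  std::vector<std::size_t> padded(global.size());
  for (std::size_t i = 0; i < global.size(); ++i) {
    padded[i] = RoundUpToMultiple(global[i], local[i]);
  }
  return padded;
}
```

With these helpers, a global size of {1000, 7} and a local size of {64, 8} becomes {1024, 8}. When the global size is padded this way, a kernel usually needs a bounds check so that the extra work-items do nothing.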

Version 2.6.0

23 Oct 13:51

CNugteren

Changed timing measurements to now also include the (varying) kernel launch overhead
OpenCL compiler options can now be set through the environment variable CLTUNE_BUILD_OPTIONS
Added support for compilation under Visual Studio 2013 (MSVC++ 12.0)
Added an option to build a static version of the library
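
As an illustration of the CLTUNE_BUILD_OPTIONS item above, the sketch below shows one way a program could append options from an environment variable to a compiler option string. It is an assumption about the general pattern, not CLTune's actual implementation; `AppendOptionsFromEnv` is a hypothetical name, and how CLTune combines the variable with its own options is not stated in these notes.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns `base_options` with the contents of the given
// environment variable appended, separated by a space. If the variable is
// unset or empty, `base_options` is returned unchanged.
std::string AppendOptionsFromEnv(const std::string& base_options,
                                 const char* variable_name = "CLTUNE_BUILD_OPTIONS") {
  assert(variable_name != nullptr);
  std::string options = base_options;
  const char* extra = std::getenv(variable_name);
  if (extra != nullptr && extra[0] != '\0') {
    if (!options.empty()) { options += " "; }
    options += extra;
  }
  return options;
}
```

The resulting string would then be passed as the options argument of the OpenCL program build call.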

Version 2.5.0

27 Sep 19:05

CNugteren

Updated to version 8.0 of the CLCudaAPI header
Made it possible to configure the number of times each kernel is run (to average results)
Minor bugfixes
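
To make the "run each kernel several times" item concrete, here is a minimal sketch that times a callable a configurable number of times and returns the mean. It is a sketch under stated assumptions: the arithmetic mean and host-side wall-clock timing are choices made for this example, and `AverageRuntimeMs` is a hypothetical name rather than a CLTune function.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical helper: calls `kernel_launch` `num_runs` times and returns the
// mean wall-clock time per call in milliseconds. The callable is expected to
// block until the work is finished, so launch overhead is part of the result.
double AverageRuntimeMs(const std::function<void()>& kernel_launch,
                        std::size_t num_runs) {
  assert(num_runs > 0);
  using Clock = std::chrono::steady_clock;
  double total_ms = 0.0;
  for (std::size_t run = 0; run < num_runs; ++run) {
    const auto start = Clock::now();
    kernel_launch();
    const auto stop = Clock::now();
    total_ms += std::chrono::duration<double, std::milli>(stop - start).count();
  }
  return total_ms / static_cast<double>(num_runs);
}
```

More runs reduce the influence of a single noisy measurement, at the cost of a longer tuning session.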

Version 2.4.0

29 Jun 17:52

CNugteren

Made it possible to run the unit-tests independently of the provided OpenCL kernel samples
Added an option to compile in verbose mode for additional diagnostic messages (-DVERBOSE=ON)
Now using version 6.0 of the CLCudaAPI header
Fixed the RPATH settings on OSX
Added Appveyor continuous integration and increased coverage of the Travis builds

Version 2.3.1

25 May 11:04

CNugteren

Version 2.3.1 (bug-fix release)

Fixed a bug where an output buffer could not be used as input at the same time
Fixed computing the validation error for half-precision fp16 data-types

Version 2.3.0

22 May 15:06

CNugteren

Added support for 'short' and 'cl_half' data-types as kernel buffer and scalar arguments
Fixed a bug where failed results would still show up in the tuning results
Made MSVC link the run-time libraries statically

Version 2.2.0

27 Apr 09:08

CNugteren

Added two new simpler samples of using the tuner (vector-add and convolution)
Updated the general documentation
Added API documentation
Now using version 5.0 of the CLCudaAPI header

Version 2.1.0

31 Mar 04:13

CNugteren

Added exports to be able to create a DLL on Windows (thanks to Marco Hutter)
Added command-line OpenCL platform selection in the examples (thanks to William J Shipman)

Version 2.0.0

22 Nov 11:21

CNugteren

Added support for machine learning models. These models can be trained on a small fraction of the
tuning configurations and can be used to predict the remainder. Two models are supported:
- Linear regression
- A 3-layer neural network
Now using version 4.0 of the CLCudaAPI header (previously known as Claduc)
Added experimental support for CUDA kernels
Added support for MSVC (Visual Studio) 2015
Using Catch instead of GTest for unit-testing
Various minor fixes
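
The machine learning item in this release describes training a model on a fraction of the tuning configurations and predicting the rest. The sketch below illustrates that idea with the simplest possible case: an ordinary least-squares fit on a single feature. It is not CLTune's implementation (which these notes do not describe); `LinearModel`, `FitLinearModel` and `Predict` are hypothetical names, and a real tuner would use one feature per tuning parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: predicted_time = intercept + slope * parameter_value.
struct LinearModel {
  double intercept;
  double slope;
};

// Fits the model to the measured subset with closed-form least squares.
// `x` holds one parameter value per measured configuration, `y` its runtime.
LinearModel FitLinearModel(const std::vector<double>& x,
                           const std::vector<double>& y) {
  assert(x.size() == y.size() && x.size() >= 2);
  const double n = static_cast<double>(x.size());
  double sum_x = 0.0, sum_y = 0.0, sum_xx = 0.0, sum_xy = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    sum_x += x[i];
    sum_y += y[i];
    sum_xx += x[i] * x[i];
    sum_xy += x[i] * y[i];
  }
  const double denominator = n * sum_xx - sum_x * sum_x;
  assert(denominator != 0.0);  // needs at least two distinct parameter values
  const double slope = (n * sum_xy - sum_x * sum_y) / denominator;
  const double intercept = (sum_y - slope * sum_x) / n;
  return LinearModel{intercept, slope};
}

// Predicts the runtime of a configuration that was not measured.
double Predict(const LinearModel& model, double x) {
  return model.intercept + model.slope * x;
}
```

In a tuning flow, the configurations with the best predicted runtimes would then be measured for real to confirm the prediction.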

Version 1.7.0

03 Aug 15:17

CNugteren

Now using the Claduc C++11 interface to OpenCL (see https://github.com/CNugteren/Claduc)
Added a method to print all tuning results in JSON-format to file
