Skip to content

primesearch/mfaktc

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

46 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

#################
# mfaktc README #
#################

Content

0   What is mfaktc?
1   Supported Hardware
2   Compilation
2.1 Compilation (Linux)
2.2 Compilation (Windows)
3   Running mfaktc (Linux)
3.1 Running mfaktc (Windows)
4   How to get work and report results from/to the Primenet server
5   Known issues
5.1 Stuff that looks like an issue but actually isn't an issue
6   Tuning
7   FAQ
8   .plan



####################
# 0 What is mfaktc #
####################

mfaktc is a program for trial factoring of Mersenne numbers. The name mfaktc
is "Mersenne FAKTorisation with Cuda". Faktorisation is a mixture of the
English word "factorisation" and the German word "Faktorisierung".
It uses CPU and GPU resources.
It runs almost entirely on the GPU since v0.20 (previous versions used both
CPU and GPU resources).



########################
# 1 Supported Hardware #
########################

mfaktc should run all all CUDA capable Nvidia GPUs with compute capability
>= 1.1. From my knowledge there is only one CUDA capable GPU with compute
capability 1.0: the G80 chip which is found on Geforce 8800 Ultra / GTX /
GTS 640 / GTS 320 and their Quadro and Tesla variants.
For AMD Radeon GPUs check out the open CL port of mfaktc: mfakto by Bertram
Franz



#################
# 2 Compilation #
#################

It is assumed that you've already set up your compiler and CUDA environment.
There are some compiletime settings in the file src/params.h possible:
- in the upper part of the file there are some settings which "advanced
  users" can chance if they think it is beneficial. Those settings are
  verified for reasonable values.
- in the middle are some debug options which can be turned on. These options
  are only useful for debugging purposes.
- the third part contains some defines which should _NOT_ be changed unless
  you really know what they do. It is easily possible to screw up something.

A 64bit built is prefered except for some old lowend GPUs because the
performance critical CPU code runs ~33% faster compared to 32bit. (measured
on a Intel Core i7)



###########################
# 2.1 Compilation (Linux) #
###########################

Change into the subdirectory "src/"

Adjust the path to your CUDA installation in "Makefile" and run 'make'.
The binary "mfaktc.exe" is placed into the parent directory.

I'm using
- OpenSUSE 12.2 x86_64
- gcc 4.7.1 (OpenSUSE 12.2)
- Nvidia driver 343.36
- Nvidia CUDA Toolkit
  - 6.5

Older CUDA Toolkit versions should work, too.

I didn't spend time testing mfaktc on 32bit Linux because I think 64bit
(x86_64) is adopted by most Linux users now. Anyway mfaktc should work on
32bit Linux, too. If problems are reported I'll try to fix them. So I don't
drop Linux 32bit support totally. ;)

When you compile mfaktc on a 32bit system you must change the library path
in "Makefile" (replace "lib64" with "lib").



#############################
# 2.2 Compilation (Windows) #
#############################

The following instructions have been tested on Windows 7 64bit using Visual
Studio 2012 Professional. A GNU compatible version of make is also required
as the Makefile is not compatible with nmake. GNU Make for Win32 can be
downloaded from http://gnuwin32.sourceforge.net/packages/make.htm.

Run the Visual Studio 2012 x64 Win64 Command Prompt for x64 or
Run the Visual Studio 2012 x86 Native Tools Command Prompt for x86 (32 bit)

 and change into the "\src" subdirectory.

Run 'make -f Makefile.win' for a 64bit built (recommended on 64bit systems)
or 'make -f Makefile.win32' for a 32bit built.

You will have to adjust the paths to your CUDA installation and the
Microsoft Visual Studio binaries in the makefiles if you have something
other than CUDA 6.5 and MSVS 2012. The binaries "mfaktc-win-64.exe" or
"mfaktc-win-32.exe" are placed in the parent directory.



############################
# 3 Running mfaktc (Linux) #
############################

Just run './mfaktc.exe -h'. It will tell you what parameters it accepts.
Maybe you want to tweak the parameters in mfaktc.ini. A small description
of those parameters is included in mfaktc.ini, too.
Typically you want to get work from a worktodo file. You can specify the
name in mfaktc.ini. It was tested with primenet v5 worktodo files but v4
should work, too.

Please run the builtin selftest each time you've
- recompiled the code
- downloaded a new binary from somewhere
- changed the Nvidia driver
- changed your hardware

Example worktodo.txt
-- cut here --
Factor=bla,66362159,64,68
Factor=bla,3321932839,50,71
-- cut here --

Than run e.g. './mfaktc.exe'. If everything is working as expected this
should trial factor M66362159 from 2^64 to 2^68 and after that trial factor
M3321932839 from 2^50 to 2^71.



################################
# 3.1 Running mfaktc (Windows) #
################################

Similar to Linux (read above!).
Open a command window and run 'mfaktc.exe -h'.



####################################################################
# 4 How to get work and report results from/to the Primenet server #
####################################################################

Getting work:
    Step 1) go to http://www.mersenne.org/ and login with your username and
            password
    Step 2) on the menu on the left click "Manual Testing" and then
            "Assignments"
    Step 3) choose the number of assignments by choosing
            "Number of CPUs (cores) you need assignments for (maximum 12)"
            and "Number of assignments you want for each core"
    Step 4) Change "Preferred work type" to "Trial factoring"
    Step 5) click the button "Get Assignments"
    Step 6) copy&paste the "Factor=..." lines directly into the worktodo.txt
            in your mfaktc directory

You can also use www.GPU72.org to get assignments and 
you can also you MISFIT to get and report results
http://www.mersenneforum.org/misfit/

Start mfaktc and stress your GPU ;)

Once mfaktc has finished all the work report the results to the primenet
server:
    Step 1) go to http://www.mersenne.org/ and login with your username and
            password
    Step 2) on the menu on the left click "Manual Testing" and then
            "Results"
    Step 3) upload the results.txt file generated by mfaktc using the
            "search" and "upload" button
    Step 4) once you've verified that the Primenet server has recognized
            your results delete or rename the results.txt from mfaktc

Advanced usage (extend the upper limit):
    Since mfaktc works best on long running jobs you may want to extend the
    upper TF limit of your assignments a little bit. Take a look how much TF
    is usually done here: http://www.mersenne.org/various/math.php
    Lets assume that you've received an assignment like this:
        Factor=<some hex key>,78467119,65,66
    This means that Primenet server assigned you to TF M78467119 from 2^65
    to 2^66. Take a look at the site noted above, those exponent should be
    TFed up to 2^71. Primenet will do this in multiple assignments (step by
    step) but since mfaktc runs very fast on modern GPUs you might want to
    TF up to 2^71 or even 2^72 directly. Just replace the 66 at the end of
    the line with e.g. 72 before you start mfaktc:
        e.g. Factor=<some hex key>,78467119,65,72
    When you increase the upper limit of your assignments it is import to
    report the results once you've finished up to the desired level. (Do not
    report partial results before completing the exponent!)



##################
# 5 Known issues #
##################

- The user interface isn't hardened against malformed input. There are some
  checks but when you really try you should be able to screw it up.
- The GUI of your OS might be very laggy while running mfaktc. (newer GPUs
  with compute capabilty 2.0 or higher can handle this _MUCH_ better)
  Comment from James Heinrich:
    Slower/older GPUs (e.g. compute v1.1) that experience noticeable lag can
    get a significant boost in system usability by reducing the NumStreams
    setting from default "3" to "2", with minimal performance loss.
    Decreasing to "1" provides much greater system responsiveness, but also
    much lower throughput.
    At least it did so for me. With NumStreams=3, I could only run mfaktc
    when I wasn't using the computer. Now I run it all the time (except when
    watching a movie or playing a game...)
  Another thing worth trying is different settings of GridSize in
  mfaktc.ini. Smaller grids should have higher responsibility with the cost
  of a little performance penalty. Performancewise this is not recommended
  on GPUs which can handle >= 100M/s candidates.
- the debug options CHECKS_MODBASECASE (and USE_DEVICE_PRINTF) might report
  too high qi values while using the barrett kernels. They are caused by
  factor candidates out of the specified range.



##################################################################
# 5.1 Stuff that looks like an issue but actually isn't an issue #
##################################################################

- mfaktc runs slower on small ranges. Usually it doesn't make much sense to
  run mfaktc with an upper limit smaller than 2^64. It is designed for trial
  factoring above 2^64 up to 2^95 (factor sizes). ==> mfaktc needs
  "long runs"!
- mfaktc can find factors outside the given range.
  E.g. './mfaktc.exe -tf 66362159 40 41' has a high change to report
  124246422648815633 as a factor. Actually this is a factor of M66362159 but
  it's size is between 2^56 and 2^57! Of course
  './mfaktc.exe -tf 66362159 56 57' will find this factor, too. The reason
  for this behaviour is that mfaktc works on huge factor blocks. This is
  controlled by GridSize in mfaktc.ini. The default value is 3 which means
  that mfaktc runs up to 1048576 factor candidates at once (per class). So
  the last block of each class is filled up with factor candidates above to
  upper limit. While this is a huge overhead for small ranges it's save to
  ignore it on bigger ranges. If a class contains 100 blocks the overhead is
  on average 0.5%. When a class needs 1000 blocks the overhead is 0.05%...



############
# 6 Tuning #
############

Read mfaktc.ini and read before editing. ;)



#########
# 7 FAQ #
#########

Q Does mfaktc support multiple GPUs?
A Yes, with the exception that a single instance of mfaktc can only use one
  GPU. For each GPU you want to run mfaktc on you need (at least) one
  instance of mfaktc. For each instance of mfaktc you can use the
  commandline option "-d <GPU number>" to specify which GPU to use for each
  specific mfaktc instance. Please read the next question, too.

Q Can I run multiple instances of mfaktc on the same computer?
A Yes! You need a separate directory for each instance of mfaktc.

Q Can I continue (load a checkpoint) from a 32bit version of mfaktc with a
  64bit version of mfaktc (and vice versa)?
A Yes!

Q Version numbers
A release versions are usually 0.XX where XX increases by one for each new
  release. Sometimes there are version which include a single (quick) patch.
  If you look into the Changelog.txt you can see the mfaktc 0.13 was
  followed by mfaktc 0.13p1 followed by mfaktc 0.14. These 0.XXpY versions
  are intended for daily work by regular users!
  Additionally there are lots of 0.XX-preY versions which are usually not
  public available. They are usually *NOT* intended for productive usage,
  sometimes they don't even compile or have the computational part disabled.
  If you somehow receive one of those -pre versions please don't use them
  for productive work. They had usually minimal to zero QA.


###########
# 8 .plan #
###########

0.22
- merge "worktodo.add" from mfakto <-- done in 0.21
- check/validate mfaktc for lower exponents <-- done in 0.21
- rework debug code
- fast (GPU-sieve enabled) kernel for factors < 2^64?

0.??
- automatic primenet interaction (Eric Christenson is working on this)         <- specification draft exists; on hold, Eric doesn't want to continue his efforts. :(
  - this will greatly increase usability of mfaktc
  - George Woltman agreed to include the so called "security module" in
    mfaktc for a closed source version of mfaktc. I have to check license
    options, GPL v3 does not allow to have parts of the program to be
    closed source. Solution: I'll re-release under another license. This is
    NOT the end of the GPL v3 version! I'll release future versions of
    mfaktc under GPL v3! I want mfaktc being open source! The only
    differences of the closed version will be the security module and the
    license information.

not planned for a specific release yet, no particular order!
- performance improvements whenever I find them ;)
- change compiletime options to runtime options (if feasible and useful)
- documentation and comments in code
- try to use double precision for the long integer divisions                  <-- unsure
- json output for wagstaff numbers https://www.mersenneforum.org/showpost.php?p=662680&postcount=3769
- factors-meta.<factor>.timestamp https://www.mersenneforum.org/showpost.php?p=662603&postcount=3750
- factors-meta.<factor>.class https://www.mersenneforum.org/showpost.php?p=662720&postcount=3781
- found factors support https://www.mersenneforum.org/showpost.php?p=662682&postcount=3770
- os info https://www.mersenneforum.org/showpost.php?p=662648&postcount=3757
- security checksum https://www.mersenneforum.org/showpost.php?p=662658&postcount=3761
- detailed runtime logging https://www.mersenneforum.org/showpost.php?p=662953&postcount=3845
- begink and endk logging https://www.mersenneforum.org/showpost.php?p=662953&postcount=3845
- only log every n seconds https://www.mersenneforum.org/showpost.php?p=662795&postcount=3826
- catch HUP https://www.mersenneforum.org/showpost.php?p=662777&postcount=3815
- non-prime exponents https://www.mersenneforum.org/showpost.php?p=663442&postcount=3873
- TF10G support https://www.mersenneforum.org/showpost.php?p=663442&postcount=3873
- Remove CPU Sieving support https://www.mersenneforum.org/showpost.php?p=663517&postcount=3894