System requirements
The current version of TeraChem was compiled and tested under the 64-bit Red Hat Enterprise Linux 5.3 operating system on machines with Intel Core2 quad-core and dual quad-core Intel Xeon 5520 CPUs. An Nvidia graphics card of compute capability 1.3 (Tesla C1060 or similar) or higher (e.g. Tesla C2050 or similar) is required to run the program. Please refer to the CUDA Programming Guide for the most current list of Nvidia GPUs that meet this requirement. A CUDA driver (270.41.19 or later) and v4.0 of the CUDA Toolkit must be installed on the system. Details on how to obtain and install the CUDA driver are provided below.
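
The minimums stated above can be written down as a small check. This is only a sketch: the helper names `parse_version` and `meets_requirements` are hypothetical and not part of TeraChem or CUDA, and the function does not query the hardware. It assumes the caller has already obtained the compute capability and driver version as dotted strings.

```python
# Minimums taken from the requirements text above.
MIN_COMPUTE_CAPABILITY = (1, 3)
MIN_DRIVER_VERSION = (270, 41, 19)


def parse_version(text):
    """Turn a dotted version string such as '270.41.19' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_requirements(compute_capability, driver_version):
    """Return True if both values are at or above the documented minimums.

    Both arguments are dotted strings supplied by the caller (hypothetical
    interface); this sketch does not talk to the GPU or the driver.
    """
    return (
        parse_version(compute_capability) >= MIN_COMPUTE_CAPABILITY
        and parse_version(driver_version) >= MIN_DRIVER_VERSION
    )
```

Comparing tuples of integers rather than raw strings avoids mistakes such as treating "1.10" as lower than "1.3".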
Because the binary file is linked against the Intel MKL library, it is recommended to run TeraChem on Intel-based workstations.
The amount of CPU RAM needed depends on the size of the molecules that will be studied. If the molecules of interest are relatively small (fewer than 500 atoms), the usual 8 GB or 16 GB configuration is acceptable. For very large molecules (in excess of 10,000 basis functions), CPU RAM will often be a limiting factor. For example, molecules with 25,000 basis functions require almost 70 GB of CPU memory.
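
To put the 70 GB figure in perspective, the sketch below computes the size of one dense double-precision matrix over the basis functions and extrapolates from the single data point quoted above. The quadratic scaling with the number of basis functions is an assumption made for illustration, not a documented TeraChem memory model, and the names `dense_matrix_gb` and `estimate_ram_gb` are hypothetical.

```python
BYTES_PER_DOUBLE = 8

# Single reference point from the text: 25,000 basis functions, almost 70 GB.
REFERENCE_BASIS = 25_000
REFERENCE_GB = 70.0


def dense_matrix_gb(n_basis):
    """Size in GB (10^9 bytes) of one dense n_basis x n_basis matrix of doubles."""
    return n_basis ** 2 * BYTES_PER_DOUBLE / 1e9


def estimate_ram_gb(n_basis):
    """Rough CPU RAM estimate in GB.

    ASSUMPTION: memory grows with the square of the number of basis
    functions, scaled from the one reference point above.
    """
    return REFERENCE_GB * (n_basis / REFERENCE_BASIS) ** 2


if __name__ == "__main__":
    print(dense_matrix_gb(25_000))   # one dense matrix at the reference size
    print(estimate_ram_gb(10_000))   # extrapolated estimate, assumption-based
```

One dense matrix at 25,000 basis functions occupies 5 GB, so the quoted 70 GB corresponds to roughly fourteen such matrices. Under the same assumption, 10,000 basis functions would need on the order of 11 GB, which is consistent with the remark that RAM starts to become limiting near that size.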
Performance
The benchmarked hardware:
Performance (time for the entire SCF procedure, in seconds) in single-point energy calculations of some representative molecules. Note that the C1060 and C2050 give slightly different results (total energy differences on the order of 0.001 kcal/mol) because the C2050 supports denormals and fully IEEE-compliant floating-point square root and division operations. Results obtained on GPUs of compute capability 2.0 and higher (e.g. Tesla C2050) are therefore more accurate than results obtained on GPUs of compute capability 1.x (e.g. Tesla C1060 or similar).
| Method | Taxol (8 Tesla C2050) | Valinomycin (8 Tesla C2050) | Olestra (8 Tesla C2050) | Taxol (8 Tesla C1060) | Valinomycin (8 Tesla C1060) | Olestra (8 Tesla C1060) |
| --- | --- | --- | --- | --- | --- | --- |
| RHF/6-31G | 16.19 sec | 24.88 sec | 99.12 sec | 18.47 sec | 29.87 sec | 114.1 sec |
| BLYP/6-31G, Grid 1 | 22.86 sec | 27.89 sec | 159.8 sec | 31.80 sec | 35.79 sec | 181.4 sec |
| BLYP/6-31G, Grid 2 | 32.81 sec | 37.32 sec | 193.5 sec | 44.66 sec | 52.13 sec | 233.6 sec |
| B3LYP/6-31G, Grid 1 | 34.05 sec | 44.17 sec | 154.2 sec | 43.15 sec | 54.68 sec | 184.8 sec |
| B3LYP/6-31G, Grid 2 | 41.08 sec | 54.34 sec | 169.3 sec | 53.84 sec | 70.61 sec | 216.9 sec |
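
The relative speed of the two GPU generations can be read off the table by dividing each C1060 time by the matching C2050 time. The sketch below does that arithmetic; the timings are copied from the table, while the dictionary layout and the `speedups` helper are illustrative names only.

```python
# SCF wall times in seconds, copied from the table above.
# Tuple order: Taxol, Valinomycin, Olestra.
MOLECULES = ("Taxol", "Valinomycin", "Olestra")

TIMES_C2050 = {
    "RHF/6-31G": (16.19, 24.88, 99.12),
    "BLYP/6-31G, Grid 1": (22.86, 27.89, 159.8),
    "BLYP/6-31G, Grid 2": (32.81, 37.32, 193.5),
    "B3LYP/6-31G, Grid 1": (34.05, 44.17, 154.2),
    "B3LYP/6-31G, Grid 2": (41.08, 54.34, 169.3),
}

TIMES_C1060 = {
    "RHF/6-31G": (18.47, 29.87, 114.1),
    "BLYP/6-31G, Grid 1": (31.80, 35.79, 181.4),
    "BLYP/6-31G, Grid 2": (44.66, 52.13, 233.6),
    "B3LYP/6-31G, Grid 1": (43.15, 54.68, 184.8),
    "B3LYP/6-31G, Grid 2": (53.84, 70.61, 216.9),
}


def speedups():
    """Return {method: (ratio per molecule)} where ratio = C1060 time / C2050 time."""
    return {
        method: tuple(
            old / new
            for old, new in zip(TIMES_C1060[method], TIMES_C2050[method])
        )
        for method in TIMES_C2050
    }


if __name__ == "__main__":
    for method, ratios in speedups().items():
        cells = ", ".join(
            f"{name}: {ratio:.2f}x" for name, ratio in zip(MOLECULES, ratios)
        )
        print(f"{method}: {cells}")
```

For these fifteen entries the eight-GPU C2050 runs are faster in every case, by factors between roughly 1.1 and 1.4.
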
Older results: comparison of times for a single SCF iteration (BLYP/6-31G**) using TeraChem on a single GeForce GTX 480 GPU versus GAMESS on a single Intel Xeon X5680 3.33 GHz CPU core. Note the logarithmic scale needed to make a meaningful comparison.


