

Refmac is a refinement program that comes bundled with ccp4i and is probably the easiest to use if you have just scaled and phased your data using ccp4. It is written by Garib Murshudov at the University of York, and its web page can be found here.

To open the program click on the yellow tab in the top left corner of the ccp4i GUI, select "Refinement" and then "Run Refmac5". This should give you the following menu:


Starting at the top of the menu you need to give your job a title (something you will be able to remember), and then select the type of refinement you would like to perform. This can be selected by clicking on the "Do restrained refinement" tab which will give you the following drop-down menu:


If you are just beginning, select "rigid body refinement" for your first refinement round; after this you will probably need to run restrained refinement. You can also select TLS (translation, libration and screw-rotation) with restrained refinement, but I would suggest running a couple of rounds of normal restrained refinement before starting to play with TLS.

Next the program gives you some check boxes titled "Input fixed TLS parameters"... etc. If you are going to be using Coot for your model building I would ignore these. However, if you need to create map files for other graphics programs (e.g. TURBO), you need to select the "Generate weighted difference map files in CCP4 format" option.

As with other ccp4i programs you are next prompted to give the program your input files, which in this case will be your experimental data as an MTZ file (MTZ in) and your molecular replacement solution as a pdb file (PDB in). The program should then automatically generate names for the new files it will create once it has run the refinement job (MTZ out and pdb out), but you should check the paths to make sure the files are going to be saved in the correct place. Also worth watching are the tabs saying "FP" and "sigma", which should automatically pick up your structure factor amplitude and error columns from your mtz file once you have told the program which mtz file to use. Ignore the library line if you have yet to build a ligand into your model (see below if you have built a ligand already).

The next section prompts you to give a "dataset name", which again is arbitrary. After this you get to the most important box, titled "Refinement parameters":


Initially 10 rounds of refinement should be OK, however sometimes you might need to increase this if your refinement is not converging (see below). If you want to alter the resolution of the data that is refined you can do this by clicking the box next to "Resolution range from...".

IMPORTANT - as a default the program next selects "Use experimental sigmas to weight X-ray terms". I have found that this generally puts too heavy a weight on the X-ray terms and thus gives rather high Rfree values at the end of a refinement. Thus I always deselect this box, and then try to type in a sensible value next to the word "Use weighting term". If you have good high resolution data 0.3 is reasonable, however if you are having problems with an overly high Rfree or poor geometry, I would recommend values as low as 0.1.

The next line allows you to select B factor refinement, which I would suggest NOT doing initially or if you have data worse than 2.6A resolution. For data down to about 2A, isotropic temperature factors are normally a good idea, and if you have even higher resolution data you can change this to anisotropic temperature factors. Be careful here, as anisotropic temperature factor refinement can make your obs:params ratio rather low, and hence you only want to do this if you have a lot of data.
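To get a feel for why anisotropic refinement needs a lot of data, a rough observations-to-parameters estimate can be sketched as below. This is only a minimal sketch, assuming 4 refined parameters per atom (x, y, z and B) for isotropic refinement and 9 per atom (x, y, z plus six anisotropic terms) for anisotropic refinement. Restraints are ignored, and the function name and the example numbers are hypothetical.

```python
def obs_to_param_ratio(n_reflections, n_atoms, anisotropic=False):
    """Rough ratio of unique reflections to refined parameters.

    Assumes 4 parameters per atom (x, y, z, B) for isotropic refinement
    and 9 per atom (x, y, z + 6 anisotropic terms) otherwise.
    Geometric restraints are not counted.
    """
    params_per_atom = 9 if anisotropic else 4
    return n_reflections / (n_atoms * params_per_atom)


# Hypothetical example: 40,000 unique reflections and 8,000 atoms
iso_ratio = obs_to_param_ratio(40000, 8000)
aniso_ratio = obs_to_param_ratio(40000, 8000, anisotropic=True)
print(f"isotropic: {iso_ratio:.2f}, anisotropic: {aniso_ratio:.2f}")
```

With the same data, switching to anisotropic temperature factors more than halves the ratio, which is why it is only sensible when you have a lot of reflections.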

Finally, in this section you can pick your Rfree set. If you processed your data in scala you can select your Rfree column (which normally contains the words Rfree) and then any value from 0 to 19, as scala picks twenty different Rfree sets. I would, however, recommend deciding which Rfree set you want to use early on and then sticking with it. I almost always use 1. One other word of caution at this stage regards non-crystallographic symmetry. If you have NCS greater than about 4-fold (i.e. 4 identical subunits in your asymmetric unit) I would recommend picking an Rfree set based on thin resolution shells rather than by the normal random method. To do this you need to use the program "dataman" from the usf suite (as this is quite complicated I will try to write a web page with instructions on how to do this when I have time).
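If you prefer to see what the "Refinement parameters" choices look like outside the GUI, the sketch below builds a keyword script of the kind Refmac5 reads on standard input. It is a hedged illustration, not the script ccp4i actually writes: the keywords (LABIN, REFI TYPE, NCYC, WEIGHT MATRIX, BREF, FREE) are given as I understand them from the Refmac5 keyword documentation and should be checked against your CCP4 version, and the column labels FP, SIGFP and FreeR_flag as well as the function name are assumptions that depend on your own MTZ file.

```python
def refmac_keywords(ncyc=10, weight=0.3, bref="ISOT", free_flag=1,
                    fp="FP", sigfp="SIGFP", free_col="FreeR_flag"):
    """Build a Refmac5-style keyword script mirroring the GUI choices.

    Column labels are placeholders; use the labels from your MTZ file.
    """
    if bref not in ("OVER", "ISOT", "ANIS"):
        raise ValueError("bref must be OVER, ISOT or ANIS")
    lines = [
        f"LABIN FP={fp} SIGFP={sigfp} FREE={free_col}",
        "REFI TYPE REST",           # restrained refinement
        f"NCYC {ncyc}",             # number of refinement cycles
        f"WEIGHT MATRIX {weight}",  # X-ray weighting term
        f"BREF {bref}",             # overall, isotropic or anisotropic B
        f"FREE {free_flag}",        # Rfree set excluded from refinement
        "END",
    ]
    return "\n".join(lines)


script = refmac_keywords()
print(script)
```

Keeping the values as function arguments makes it easy to repeat a run with, say, weight=0.1 while leaving the Rfree set untouched.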

Other options

For an initial refinement run you can generally ignore the other options and thus click "Run" followed by "Run now" in the bottom left of the window. However, if you have NCS it is always worth setting this by clicking on the "Setup Non-Crystallographic Symmetry (NCS) restraints" box. Initially it will say "No NCS restraints are currently defined", so you will need to click on the box entitled "Add NCS restraint" to give you the following:


As I normally work on SAP, a single domain pentameric protein, I select "NCS restraint chain A" and then residues 1 to 204, which encompass an entire subunit. I then click "Add chain" four times and select chains B, C, D and E. This tells the program to restrain all five subunits within one NCS group. If you have different domains you can create new NCS groups by clicking on "Add NCS restraint" on the bottom right of the task window. You can also select the strength of your restraints by clicking on the drop-down box that initially should say "medium":


The options are fairly self-explanatory, however at the beginning of a refinement I would select "tight" (which are essentially constraints) and then slowly step down through the different options in subsequent refinement rounds.

I seldom change any options in the other boxes; however, one menu worth bearing in mind is the one entitled "Geometric parameters":


This can be used to alter the weighting of the various geometric restraints including the B value range at the very bottom. I would, however, suggest only changing things in this box if you have a very good reason.

Refining a Ligand

If you have added a ligand to your pdb file, you will not be able to refine without specifically telling refmac the geometric restraints of your ligand. Although ccp4 has an inbuilt way of generating such "dictionaries", by far the easiest way of doing this is to use the free web site called PRODRG2.

Instructions for how to do this are found here.

Refinement Results

Once you have successfully run Refmac the job should show up as "finished" in the ccp4i main window. The program generates a number of graphs that you need to check to make sure your refinement has proceeded appropriately.

To see these graphs make sure your Refmac job is highlighted, and then click on "View files from job" and "View log graphs" in the subsequent drop-down menu. This will give you the following window:


In the window "Tables in File" you need to scroll down and select "Rfactor analysis, stats vs cycle". This will give you a graph that should look like this:


The y axis represents the value of your R (red) and Rfree (blue) factors, and the x axis the refinement cycle. If you hold the mouse over the dots you will see cross-hairs, and the value is given in the bottom right of the window. In the above case the cross-hairs are held over the R value after the 10th refinement step, which has a value of 0.25. For a successful refinement you need to see your R and Rfree values either drop or converge. In general you need to aim for an R value of around 1/10 of your maximum resolution (e.g. if you have 2.3A data you want to end up with an R value of about 0.23) and no more than 0.05 difference between the R and Rfree. You will not see this the first ten (or more!!) times you run a refinement. Thus once you have run a refinement and seen an improvement you need to check your coordinates in a graphics package, adjust the atoms to fit the density, and then re-refine in an iterative process.
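The two rules of thumb above (R of about one tenth of the resolution, and an R to Rfree gap of no more than 0.05) can be written down as a quick check. This is a minimal sketch: the function name is hypothetical, the R value of 0.25 comes from the example graph, and the Rfree value of 0.29 is made up for illustration.

```python
def check_r_factors(resolution, r_work, r_free, max_gap=0.05):
    """Compare final R and Rfree against the rules of thumb in the text."""
    target_r = resolution / 10.0   # e.g. 2.3A data -> R of about 0.23
    gap = r_free - r_work          # difference between Rfree and R
    return {
        "target_r": target_r,
        "r_on_target": r_work <= target_r,
        "gap": gap,
        "gap_ok": gap <= max_gap,
    }


# R = 0.25 as in the example graph; Rfree = 0.29 is a hypothetical value
result = check_r_factors(2.3, 0.25, 0.29)
print(result)
```

Here the gap is acceptable but R is still above the target, so another round of rebuilding and re-refinement would be the next step.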

If your R values both increase, or your R drops a lot more than your Rfree, you should first go back and reduce the X-ray weight (e.g. to 0.1). If this does not work, try loosening or tightening your NCS. If neither of these works you have probably made a model-building error, such as adding waters into noise.
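The troubleshooting order described above can be sketched as a small function that looks at the first and last cycle of a run. The per-cycle values would be read by eye from the "Rfactor analysis, stats vs cycle" graph; the function name, the example numbers, and the 0.02 threshold for "a lot more" are all assumptions of mine, not values from Refmac.

```python
def diagnose(r_work, r_free, tolerance=0.02):
    """Suggest a next step from per-cycle R and Rfree values.

    tolerance is a hypothetical threshold for how much further R may
    drop than Rfree before the run looks over-weighted.
    """
    d_work = r_work[-1] - r_work[0]   # negative means R dropped
    d_free = r_free[-1] - r_free[0]   # negative means Rfree dropped
    if d_work > 0 and d_free > 0:
        return "both R values increased: reduce the X-ray weight"
    if d_free - d_work > tolerance:
        return "R dropped much more than Rfree: reduce the X-ray weight"
    return "no warning signs: rebuild in a graphics program and re-refine"


# Hypothetical run: R falls from 0.30 to 0.22 but Rfree barely moves
advice = diagnose([0.30, 0.26, 0.22], [0.34, 0.335, 0.33])
print(advice)
```

If reducing the weight does not change the outcome, the text's next suggestions (adjusting the NCS restraints, then looking for model-building errors) still have to be checked by hand.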

Another useful graph to have a look at in the Refmac output is "Geometry vs cycle" in the "Graphs in Selected Table" window. This will give you something like this:


From this graph you can check to see if your geometry has improved over the course of the refinement. The y axis shows RMSD and the x axis the refinement cycle. Do not worry about spikes (such as in round 3 in the above example) so long as your final RMSD is better than your initial values. The bond angles (rmsANGLE) should converge around 1 (degree) and the rmsCHIRAL around 0.1. If any of these values do not converge, or even get worse, you probably need to decrease the X-ray weighting on the input menu.
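The "ignore spikes, compare start and finish" rule can be sketched as follows. The values are hypothetical numbers of the kind you would read from the "Geometry vs cycle" graph, with the first entry taken as the starting value, and the function name is my own.

```python
def summarize_geometry(rmsd_by_cycle):
    """Report whether the final RMSD beats the starting value.

    A spike is any intermediate point higher than both of its
    neighbours; spikes are listed but do not affect the verdict.
    """
    first, last = rmsd_by_cycle[0], rmsd_by_cycle[-1]
    spikes = [
        i for i in range(1, len(rmsd_by_cycle) - 1)
        if rmsd_by_cycle[i] > rmsd_by_cycle[i - 1]
        and rmsd_by_cycle[i] > rmsd_by_cycle[i + 1]
    ]
    return {"improved": last < first, "spike_positions": spikes}


# Hypothetical rmsANGLE values (degrees) with a spike at list position 2
rms_angle = [2.1, 1.8, 2.6, 1.5, 1.2, 1.1]
summary = summarize_geometry(rms_angle)
print(summary)
```

The spike is reported but the run still counts as improved, because the final value is lower than the first.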

Other graphs that might be interesting to look at are listed under the "Tables in File" section. These give you graphs for Resolution vs R factor for each refinement cycle (useful to see if you have a resolution bin of particularly bad data that might be worth removing) as well as F distribution vs Resolution. From these graphs you can check to see if you might be trying to stretch the resolution too far.
