FCMA File Generator
Here you can generate an FCMA config file, which is used by the pni_fcma_submit script to run an FCMA analysis. It does this by constructing the exact command-line invocation syntax required to run the pni_fcma executable under MPI via mpirun. A generic version of the pni_fcma_submit script can be found in the source distribution here.
How it works
Set your desired FCMA parameters on the left, and the corresponding FCMA config file will be automatically generated on the right in response to your changes. Interdependent parameters should also update in response to your choices. You can then either copy and paste the contents of the window, or download it as a file. Consult the reference below for detailed information about each configuration parameter, and this section to learn more about the larger context.
Reference
- Analysis stage: The first stage, voxel selection, does most of the heavy lifting: it computes correlation matrices for every block and runs classification on every voxel in mask1, making it the most compute-intensive stage. The result is a sorted list of voxels, in decreasing order of predictive power as assessed by an SVM classifier during training (the rank sequence and scores are also saved as NIfTI volumes). The second stage, testing prediction accuracy, tests the voxels selected in stage 1 using the output of stage 1, in different ways depending on the following two settings. A third stage, visualization, simply saves all the voxel-voxel correlations for a particular block (see the Correlation Visualization section below for details).
  - Test increasing top voxels. The list of top voxels from stage 1 (the file ending in _list.txt) can be used to assess how classification accuracy changes with an increasing number of top voxels from the list. The top 10, 50, 100, 200, 500, 1000 ... 5000 voxels are tested and results are reported for each set.
  - Test via cross validation. If selected, a series of tests is performed, each with a different set of blocks held out (the set size is the same as the number of blocks held out), and the total percent correct is reported. If not set, one test is performed for each block held out (see the Holds section below). Whether set or not, the NIfTI volumes output from stage 1 can be used as mask1, with a wholebrain mask typically used as mask2. Those output files end in _seq.nii.gz and _scores.nii.gz, and should be thresholded first, based on the results of the increasing-top-voxels test, to prevent overfitting.
- Number of folds: This setting is only used in stage 1 (voxel selection). Folds are also used in the test phase, but there the number of folds is calculated from the number of test (i.e. held) blocks, assuming each subject has the same number. Since voxels are rated based on how they fare in a k-fold SVM cross-validation, which divides the data into k subsets, a good choice for k is the number of subjects (minus one) if comparing across subjects, the number of runs (minus test runs) if comparing across runs, and so on. The number of folds should evenly divide the training (i.e. non-held) blocks; this way the training data will be roughly balanced across sources, assuming as many "on" blocks as "off" blocks in the blockfile labels (regressors). A worked example is sketched below.
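For concreteness, here is a minimal sketch, with made-up numbers, of checking that a candidate fold count evenly divides the training blocks when one subject's blocks are held out. The subject count, blocks per subject, and leave-one-subject-out scheme are all assumptions for illustration:

```python
# Hypothetical example: 18 subjects, 12 blocks each, holding out one subject's
# blocks for testing, and rating voxels with (subjects - 1)-fold cross-validation.
n_subjects = 18
blocks_per_subject = 12
holds = blocks_per_subject                                  # blocks held out at the end

training_blocks = n_subjects * blocks_per_subject - holds   # 204
k = n_subjects - 1                                          # 17 folds

# The fold count should evenly divide the training blocks.
assert training_blocks % k == 0
print(k, "folds of", training_blocks // k, "training blocks each")  # 17 folds of 12
```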
- Holds for selection or cross-validation: During selection (stage 1) you can leave out a particular string of trials beginning with the index of the first block; block (i.e. trial) indices fall in the range [0, subjects x blocks per blockfile). Alternately, you can remove those blocks from the blockfile(s) altogether and/or their corresponding data files (be careful not to throw off TR numbering!). When testing prediction accuracy (stage 2), a test is performed on the number of blocks held. Blocks are always held at the end of the block list, and the first block setting is ignored in the test phase. Predictions are based on the first trials - holds blocks and tested on the remaining holds blocks. If cross validating, every permutation of trials - holds vs holds is used for prediction and testing, and results are printed to the console (or log file if running batch jobs). Trial (i.e. block) numbering spans subjects: if there are 12 blocks per subject, block ID "0" is subject1_block1, "1" is subject1_block2 ... "11" is subject1_block12, "12" is subject2_block1, "13" is subject2_block2 ... and so on (see the sketch below).
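As an illustration of this global block numbering, here is a minimal sketch (assuming 12 blocks per subject, purely for example) that maps a block ID to its subject and within-subject block:

```python
# Minimal sketch of the global (cross-subject) block numbering described above.
blocks_per_subject = 12  # assumed example value

def block_id_to_subject_block(block_id):
    """Return 0-based (subject index, within-subject block index) for a global block ID."""
    return divmod(block_id, blocks_per_subject)

print(block_id_to_subject_block(0))   # (0, 0)  -> subject1_block1
print(block_id_to_subject_block(11))  # (0, 11) -> subject1_block12
print(block_id_to_subject_block(12))  # (1, 0)  -> subject2_block1
```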
- Blocks directory/file: Use one blockfile if the same blocks apply to all files in the data directory. This might be the case if subjects were presented trials in identical order, with identical timing, or if the data have been re-ordered to make it so. Otherwise, supply the name of a directory containing blockfiles. If using a blockfile directory, there must be a 1:1 correspondence between blockfiles and data files: the filenames must be the same except for their extensions (.nii.gz for NIfTI compressed data, plaintext .txt for blockfiles). A quick way to check this correspondence is sketched below.
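A minimal sketch of verifying that 1:1 filename correspondence before launching an analysis. The directory names data/ and blockfiles/ are assumptions for illustration:

```python
# Hypothetical pre-flight check: every .nii.gz data file should have a matching
# .txt blockfile with the same base name, and vice versa.
from pathlib import Path

data_dir = Path("data")          # assumed directory names
block_dir = Path("blockfiles")

data_names = {p.name[:-len(".nii.gz")] for p in data_dir.glob("*.nii.gz")}
block_names = {p.stem for p in block_dir.glob("*.txt")}

if data_names != block_names:
    print("data files without blockfiles:", sorted(data_names - block_names))
    print("blockfiles without data files:", sorted(block_names - data_names))
```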
- Blockfile format: A blockfile starts with the number of blocks, N, on the first line, followed by that many rows. Each row's first column is the label (0 or 1), the second column is the starting TR, and the third column is the ending TR of the block. Any hemodynamic shift needs to be added manually here, and shifts are restricted to whole TRs so that block ranges remain integers. The layout is:

    N
    Label StartTR EndTR
    ... N rows total ...
    Label StartTR EndTR

  A small parsing sketch follows this entry.
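As a concrete illustration, here is a minimal sketch of reading a blockfile in this format; the example contents and file name are made up:

```python
# Minimal sketch of parsing a blockfile laid out as described above.
# Example contents (hypothetical TRs, 4 blocks):
#   4
#   1 10 19
#   0 25 34
#   1 40 49
#   0 55 64
def read_blockfile(path):
    """Return a list of (label, start_tr, end_tr) tuples."""
    with open(path) as f:
        n_blocks = int(f.readline().split()[0])
        return [tuple(int(x) for x in f.readline().split()[:3])
                for _ in range(n_blocks)]

# read_blockfile("subject01.txt")  # e.g. returns [(1, 10, 19), (0, 25, 34), ...]
```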
- Data directory: All files in the data directory are loaded, and the number of subjects is set equal to the number of files found. Currently only NIfTI compressed data (.nii.gz extension) is recognized. Note that excessive data movement can be avoided by using symlinks in this directory, pointing to wherever the real data files are kept (see the sketch below).
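A minimal sketch of populating the data directory with symlinks rather than copies; the source path and directory names are assumptions for illustration:

```python
# Hypothetical example: link preprocessed .nii.gz files into the FCMA data
# directory instead of copying them.
from pathlib import Path

source = Path("/scratch/mystudy/preprocessed")   # assumed location of the real data
data_dir = Path("data")                          # assumed FCMA data directory
data_dir.mkdir(exist_ok=True)

for nii in sorted(source.glob("*.nii.gz")):
    link = data_dir / nii.name
    if not link.exists():
        link.symlink_to(nii)
```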
- Voxel masks: Depending on the analysis task, a different number and type of mask is used. During selection, voxels are correlated between the mask1 and mask2 regions. If mask1 and mask2 are the same, voxels within that region are auto-correlated. If no masks are provided, all voxels are auto-correlated. If only mask1 is provided, top voxels in mask1 are selected based on their correlations with all voxels; if only mask2 is provided, top voxels are chosen from the set of all voxels based on their correlations with the voxels in mask2. Proceeding without a mask is not recommended: a typical volume contains a large number of voxels outside the brain (roughly 1.5x the number of brain voxels), and since every voxel is correlated and classified, this is particularly expensive in the case of FCMA. A wholebrain mask should be provided at the very least. In the test phase, the top voxel number test should use a single wholebrain mask1. The single and cross-validation tests should set mask1 to a thresholded version of the _seq.nii.gz file output from stage 1, thresholded so that only the top N voxels have nonzero values, for some N chosen based on the top voxel number test (see the sketch below). Mask2 in this final test phase should be a wholebrain mask.
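Here is a minimal sketch of that thresholding step using nibabel and numpy. It assumes the stage-1 output prefix was "results" and that ranks in the _seq.nii.gz volume start at 1 with 0 outside the selection; both are assumptions, so check your own output before relying on this:

```python
# Hypothetical thresholding of the stage-1 rank volume so that only the top N
# voxels are nonzero, for use as mask1 in the final test phase.
import nibabel as nib
import numpy as np

top_n = 1000                                  # chosen from the top-voxel-number test
seq_img = nib.load("results_seq.nii.gz")      # assumed stage-1 output prefix
seq = seq_img.get_fdata()

# Keep voxels ranked 1..top_n (assumes rank 1 = best voxel, 0 = not selected).
mask = ((seq >= 1) & (seq <= top_n)).astype(np.int16)
nib.save(nib.Nifti1Image(mask, seq_img.affine), "results_top%d_mask1.nii.gz" % top_n)
```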
- Output file: This should be set to a filename prefix of your choice in the selection phase; the top voxel number test will use this setting as input. The prefix is used to construct three output filenames at the end of voxel selection: 1) a [outputfile]_list.txt file with the top-scoring voxels, 2) a [outputfile]_seq.nii.gz file with each voxel's rank at its corresponding x,y,z position, and 3) a [outputfile]_score.nii.gz file with each voxel's score at its corresponding x,y,z position. These files can be used directly (or thresholded) as input to the testing stage. The entire filename, not just its prefix, is required when the file is used as input for the top voxel number test. The prediction accuracy output of the second stage is currently written to the console as standard output.
- Correlation Visualization: Supply a blockID (numbered as described at the end of the Holds entry) for which to save all voxel-voxel correlations, and a NIfTI reference file whose header info can be used to construct the output NIfTI of saved correlations. Each volume in the 4D output holds the correlations between one voxel in mask1 and all the voxels in mask2 for the chosen block, so each correlation is between two vectors whose length equals the number of TRs in the block (see the sketch below). The actual visualization of the saved correlations is left as an exercise for the user, but the goal is to turn this output into a circos chart!
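To make the saved values concrete, here is a minimal sketch, with made-up time courses, of the quantity stored for a single voxel pair: the correlation of the two voxels' time series restricted to one block's TR range:

```python
# Each saved correlation is between two voxel time courses limited to one block.
import numpy as np

def block_correlation(voxel_a_ts, voxel_b_ts, start_tr, end_tr):
    """Pearson correlation of two voxel time series over one block (inclusive TRs)."""
    a = voxel_a_ts[start_tr:end_tr + 1]
    b = voxel_b_ts[start_tr:end_tr + 1]
    return np.corrcoef(a, b)[0, 1]

# Made-up 100-TR time courses and a block spanning TRs 10-19.
rng = np.random.default_rng(0)
ts1, ts2 = rng.standard_normal(100), rng.standard_normal(100)
print(block_correlation(ts1, ts2, 10, 19))
```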
- Classifier: While SVM prediction should normally be used, some experimental alternatives are available, namely smart distance ratio and correlation sum. Note that if Use activity instead of correlations is checked (described next), this setting is ignored and searchlight-based selection and SVM classification are used instead.
Note: SVM is the default classifier, and as a result only binary classification (0/1) is currently supported in the toolbox. Multiple classes are not hard to add, however, using the standard methods for extending SVM to multiple classes (one-vs-all, one-vs-one). Currently this needs to be done manually via a set of appropriately constructed blockfiles (a sketch of such a one-vs-rest split appears below).
- Use activity instead of correlation: The advantage (if any) of using correlation patterns vs activation patterns can be explored by using searchlight-based activity averaging during selection and training, and normalized activation vectors during testing.
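As an illustration of the multi-class workaround mentioned under Classifier above, here is a minimal sketch, with made-up labels and TRs, of writing one binary (one-vs-rest) blockfile per class in the format described in the Blockfile format entry:

```python
# Hypothetical one-vs-rest split: for each class, write a blockfile that labels
# that class 1 and every other class 0.
blocks = [  # (class_label, start_tr, end_tr) -- made-up example blocks
    (0, 10, 19), (1, 25, 34), (2, 40, 49),
    (0, 55, 64), (1, 70, 79), (2, 85, 94),
]

for cls in sorted({label for label, _, _ in blocks}):
    with open("blocks_class%d_vs_rest.txt" % cls, "w") as f:
        f.write("%d\n" % len(blocks))
        for label, start, end in blocks:
            f.write("%d %d %d\n" % (1 if label == cls else 0, start, end))
```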