Configuration¶

The preprocessing pipeline reads all of its parameters from the config.yaml file. This guide describes each section of that file and how to set it for your study.

v0.2.0 Breaking Change

The configuration system has migrated from settings.sh (Bash) to config.yaml (YAML). See the migration guide section below if upgrading from an earlier version.

Configuration File¶

The main configuration file is config.yaml, which is created by copying config.template.yaml:

cp config.template.yaml config.yaml

The configuration is loaded by sourcing load_config.sh, which parses the YAML file and exports environment variables for use in pipeline scripts:

source ./load_config.sh
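
After sourcing, it can be useful to confirm that the variables you rely on were actually exported. The following is a minimal sketch, not part of the pipeline: require_vars is a hypothetical helper, and BASE_DIR is used only as an example name (it appears in the old settings.sh); check load_config.sh for the names it really exports.

```shell
#!/bin/sh
# Hypothetical helper: report which of the named variables are unset or empty.
# The variable names passed in are assumptions; check load_config.sh for the
# names it actually exports.
require_vars() {
    missing=""
    for name in "$@"; do
        # Indirect lookup via eval; names must be plain identifiers.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "Missing configuration variables:$missing" >&2
        return 1
    fi
    return 0
}

# Demo with a stand-in value (in practice this would come from load_config.sh).
BASE_DIR='/path/to/your/study'
if require_vars BASE_DIR; then
    echo "configuration looks loaded"
fi
```

In a real session you would call require_vars right after source ./load_config.sh, with the variable names your scripts depend on.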

Path Configuration¶

Set up your directory structure in the directories section:

directories:
  base_dir: '/path/to/your/study'
  scripts_dir: '/path/to/your/study/code'
  raw_dir: '/path/to/your/study/sourcedata'
  trim_dir: '/path/to/your/study'
  workflow_log_dir: '/path/to/your/study/logs'
  templateflow_host_home: '~/.cache/templateflow'
  fmriprep_host_cache: '~/.cache/fmriprep'
  freesurfer_license: '~/freesurfer.txt'

Path Descriptions:

base_dir: Root directory for the study
scripts_dir: Path to the cloned fmriprep-workbench repository
raw_dir: Raw BIDS-compliant data location (sourcedata)
trim_dir: Destination for processed data
workflow_log_dir: Directory for workflow logs
templateflow_host_home: Host cache directory for TemplateFlow templates
fmriprep_host_cache: fMRIPrep-specific cache directory
freesurfer_license: Path to your FreeSurfer license file

User Configuration¶

Configure user-specific settings:

user:
  email: 'johndoe@stanford.edu'
  username: 'johndoe'
  fw_group_id: 'pi'
  fw_project_id: 'amass'

Task Parameters¶

Configure your task-specific settings in the scan section:

scan:
  fw_cli_api_key_file: '~/flywheel_api_key.txt'
  fw_url: 'cni.flywheel.io'
  config_file: 'scan-config.json'
  experiment_type: 'advanced'
  task_id: 'OriginalTaskName'
  new_task_id: 'cleanname'
  n_dummy: 5
  run_numbers:
    - '01'
    - '02'
    - '03'
    - '04'
    - '05'
    - '06'
    - '07'
    - '08'

Parameter Descriptions:

task_id: Original task name in BIDS format
new_task_id: New task name if renaming is needed; otherwise set it to the same value as task_id
n_dummy: Number of dummy TRs to remove from the beginning of each run
run_numbers: List of all task BOLD run numbers (as strings with zero-padding)
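
For studies with many runs, the zero-padded run_numbers list can be generated rather than typed by hand. This is a small sketch; make_run_numbers is a hypothetical helper, and the indentation it prints assumes the layout shown in the scan section above.

```shell
#!/bin/sh
# Print N zero-padded run numbers as YAML list items, e.g. "    - '01'".
# make_run_numbers is a hypothetical helper, not part of the pipeline.
make_run_numbers() {
    n_runs=$1
    i=1
    while [ "$i" -le "$n_runs" ]; do
        printf "    - '%02d'\n" "$i"
        i=$((i + 1))
    done
}

echo "  run_numbers:"
make_run_numbers 8
```

The output can be pasted under the scan section of config.yaml. Keeping the quotes matters: an unquoted 01 may be read as a number and lose its zero-padding.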

Data Validation¶

Set expected volume counts for validation in the validation section:

validation:
  expected_fmap_vols: 12
  expected_bold_vols: 220
  expected_bold_vols_after_trimming: 215

These values are used by QC steps (04-qc-metadata and 05-qc-volumes) to verify that your scans have the expected number of volumes.
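
In the example values, expected_bold_vols_after_trimming (215) equals expected_bold_vols (220) minus scan.n_dummy (5). Assuming that relationship holds for your protocol (it is inferred from the example, not stated as a rule), a quick arithmetic check might look like this; check_trimmed_vols is a hypothetical helper.

```shell
#!/bin/sh
# Check that the trimmed volume count equals the raw count minus dummy TRs.
# This relationship is inferred from the example values above.
check_trimmed_vols() {
    bold_vols=$1
    n_dummy=$2
    after_trimming=$3
    expected=$((bold_vols - n_dummy))
    if [ "$expected" -ne "$after_trimming" ]; then
        echo "expected_bold_vols_after_trimming is $after_trimming, but $bold_vols - $n_dummy = $expected" >&2
        return 1
    fi
}

check_trimmed_vols 220 5 215 && echo "volume counts are consistent"
```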

Fieldmap Mapping¶

Map fieldmaps to BOLD runs in the fmap_mapping section:

fmap_mapping:
  '01': '01'  # TASK BOLD RUN 01 USES FMAP 01
  '02': '01'  # TASK BOLD RUN 02 USES FMAP 01
  '03': '02'  # TASK BOLD RUN 03 USES FMAP 02
  '04': '02'  # TASK BOLD RUN 04 USES FMAP 02
  '05': '03'  # TASK BOLD RUN 05 USES FMAP 03
  '06': '03'  # TASK BOLD RUN 06 USES FMAP 03
  '07': '04'  # TASK BOLD RUN 07 USES FMAP 04
  '08': '04'  # TASK BOLD RUN 08 USES FMAP 04

Each key represents a BOLD run number, and its value is the fieldmap number that covers that run. This mapping determines which fieldmap is used for susceptibility distortion correction for each BOLD run.
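
A missing key in fmap_mapping is easy to overlook when runs are added later. The sketch below lists runs that have no mapping entry. It is a line-based check rather than a YAML parser, and missing_fmap_runs is a hypothetical helper that expects the quoted-key layout shown above.

```shell
#!/bin/sh
# List BOLD runs that have no entry in a fmap_mapping block.
# Usage: missing_fmap_runs MAPPING_FILE RUN...
# MAPPING_FILE holds lines of the form:  '01': '01'  # optional comment
# This is a line-based sketch, not a YAML parser.
missing_fmap_runs() {
    mapping_file=$1
    shift
    for run in "$@"; do
        # Match a quoted key at the start of a line, e.g.  '01':
        if ! grep -q "^[[:space:]]*'$run'[[:space:]]*:" "$mapping_file"; then
            echo "$run"
        fi
    done
}

# Demo: runs 01-04 are declared, but only 01-03 are mapped.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
fmap_mapping:
  '01': '01'  # TASK BOLD RUN 01 USES FMAP 01
  '02': '01'
  '03': '02'
EOF
missing_fmap_runs "$tmp" 01 02 03 04   # prints 04
rm -f "$tmp"
```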

Subject Lists¶

Basic subject list in all-subjects.txt:

# This is a comment - these lines are automatically filtered
# Blank lines are also ignored

101
102
103

v0.2.0 Enhancement

Comment lines (starting with #) and blank lines are now automatically filtered when counting subjects for SLURM array jobs. This makes it easier to document your subject lists.
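
The filtering described above can be reproduced by hand to see how many array tasks a subject list will produce. This is a sketch of the idea, not the pipeline's own implementation; count_subjects is a hypothetical helper, and the zero-based array range is an assumption based on the array index 0 mentioned for debug mode.

```shell
#!/bin/sh
# Count subject entries, ignoring comment lines and blank lines.
# count_subjects is a hypothetical helper that mirrors the described filtering.
count_subjects() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1" | wc -l | tr -d ' '
}

# Demo with a temporary subject list.
list=$(mktemp)
cat > "$list" <<'EOF'
# This is a comment
# Blank lines are also ignored

101
102
103
EOF
n=$(count_subjects "$list")
echo "subjects: $n"                    # prints "subjects: 3"
echo "array range: 0-$((n - 1))"       # zero-based SLURM array indices (assumed)
rm -f "$list"
```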

Subject ID Modifiers¶

You can use suffix modifiers for per-subject control:

101                # Standard subject, runs all steps
102:step4          # Only run step 4 for this subject
103:step4:step5    # Only run steps 4 and 5
104:force          # Force rerun all steps
105:step5:force    # Only run step 5, force rerun
106:skip           # Skip this subject

Available Modifiers:

step1 to step7 - Run specific steps only
force - Force rerun even if already processed
skip - Skip this subject entirely
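
To make the entry format concrete, the sketch below splits an entry into its subject ID and modifiers. parse_subject_entry is a hypothetical helper written for illustration; the pipeline's own parsing may differ.

```shell
#!/bin/sh
# Split a subject list entry such as "105:step5:force" into the subject ID
# and its modifiers. parse_subject_entry is a hypothetical helper; trailing
# "# comments" are stripped first.
parse_subject_entry() {
    entry=${1%%#*}                       # drop any trailing comment
    entry=$(printf '%s' "$entry" | tr -d '[:space:]')
    subject_id=${entry%%:*}              # text before the first colon
    case $entry in
        *:*) modifiers=${entry#*:} ;;    # text after the first colon
        *)   modifiers="" ;;
    esac
    if [ -n "$modifiers" ]; then
        printf '%s %s\n' "$subject_id" "$(printf '%s' "$modifiers" | tr ':' ' ')"
    else
        printf '%s\n' "$subject_id"
    fi
}

parse_subject_entry '101                # Standard subject, runs all steps'
parse_subject_entry '103:step4:step5    # Only run steps 4 and 5'
parse_subject_entry '105:step5:force'
```

Each call prints the subject ID followed by its space-separated modifiers, so the first field is always the ID regardless of how many modifiers follow.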

Step-Specific Subject Files¶

By default, subjects are pulled from all-subjects.txt. You can optionally specify different subject lists per pipeline step in config.yaml:

subjects_mapping:
  '01-fw2server': '01-subjects.txt'
  '02-dcm2niix': '02-subjects.txt'
  '06-run-fmriprep': '06-subjects.txt'

Permissions¶

Set file and directory permissions:

permissions:
  dir_permissions: '775'
  file_permissions: '775'

SLURM Configuration¶

Configure SLURM job parameters for general tasks:

slurm:
  email: 'your.email@institution.edu'
  time: '2:00:00'
  dcmniix_time: '12:00:00'
  mem: '4G'
  cpus: 8
  array_throttle: 10
  log_dir: '/path/to/your/study/logs'
  partition: 'partition1,partition2'

v0.2.0 Change

SLURM job names now use a unified fmriprep-workbench-{N} naming pattern (e.g., fmriprep-workbench-3 for step 3). The STEP_NAME variable is used for directory organization, while JOB_NAME is used for SLURM display.
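
As a rough illustration of what these settings mean on the scheduler side, the sketch below prints standard sbatch options that correspond to the example values. The mapping from config keys to flags is an assumption made for illustration, and print_sbatch_opts is a hypothetical helper; the pipeline scripts may assemble their options differently.

```shell
#!/bin/sh
# Print the sbatch options that the slurm settings plausibly correspond to.
# The mapping from config keys to flags is an assumption for illustration.
print_sbatch_opts() {
    step=$1          # pipeline step number, used in the job name
    n_subjects=$2    # number of entries in the subject list
    throttle=$3      # slurm.array_throttle
    printf '%s\n' "--job-name=fmriprep-workbench-$step"
    printf '%s\n' "--time=2:00:00"
    printf '%s\n' "--mem=4G"
    printf '%s\n' "--cpus-per-task=8"
    printf '%s\n' "--partition=partition1,partition2"
    # Zero-based array; "%N" caps the number of simultaneously running tasks.
    printf '%s\n' "--array=0-$((n_subjects - 1))%$throttle"
}

print_sbatch_opts 3 25 10
```

With 25 subjects and array_throttle set to 10, the last line is --array=0-24%10, meaning 25 array tasks with at most 10 running at once.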

Pipeline Settings¶

Configure container and derivatives paths:

pipeline:
  fmriprep_version: '24.0.1'
  derivs_dir: '/path/to/your/study/derivatives/fmriprep-24.0.1'
  singularity_image_dir: '/path/to/your/study/containers'
  singularity_image: 'fmriprep-24.0.1.simg'
  heudiconv_image: 'heudiconv_latest.sif'

fMRIPrep SLURM Settings¶

Configure SLURM parameters specifically for fMRIPrep jobs (steps 6 and 9):

fmriprep_slurm:
  job_name: 'fmriprep_yourproject'
  time: '48:00:00'
  cpus_per_task: 16
  mem_per_cpu: '4G'

fMRIPrep Settings¶

Configure fMRIPrep-specific parameters:

fmriprep:
  omp_threads: 8
  nthreads: 12
  mem_mb: 30000
  fd_spike_threshold: 0.9
  dvars_spike_threshold: 3.0
  output_spaces: 'MNI152NLin2009cAsym:res-2 anat fsnative fsaverage5'
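
The thread and memory settings here should fit inside the allocation requested in fmriprep_slurm. The sketch below checks the example values against each other. check_fmriprep_resources is a hypothetical helper, and the three rules it applies (nthreads within cpus_per_task, omp_threads within nthreads, mem_mb within cpus_per_task × mem_per_cpu) are reasonable sanity checks rather than documented requirements of the pipeline.

```shell
#!/bin/sh
# Sanity-check fMRIPrep resource settings against the SLURM allocation.
# Assumes mem_per_cpu is given in whole gigabytes (e.g. 4 for '4G').
check_fmriprep_resources() {
    cpus_per_task=$1
    mem_per_cpu_gb=$2
    nthreads=$3
    omp_threads=$4
    mem_mb=$5
    allocated_mb=$((cpus_per_task * mem_per_cpu_gb * 1024))
    rc=0
    if [ "$nthreads" -gt "$cpus_per_task" ]; then
        echo "nthreads ($nthreads) exceeds cpus_per_task ($cpus_per_task)" >&2
        rc=1
    fi
    if [ "$omp_threads" -gt "$nthreads" ]; then
        echo "omp_threads ($omp_threads) exceeds nthreads ($nthreads)" >&2
        rc=1
    fi
    if [ "$mem_mb" -gt "$allocated_mb" ]; then
        echo "mem_mb ($mem_mb) exceeds allocation ($allocated_mb MB)" >&2
        rc=1
    fi
    return "$rc"
}

# Example values from this page: 16 CPUs x 4G, nthreads 12, omp_threads 8, mem_mb 30000.
check_fmriprep_resources 16 4 12 8 30000 && echo "resource settings fit the allocation"
```

With the example values, the allocation is 16 × 4 × 1024 = 65536 MB, so mem_mb: 30000 leaves ample headroom.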

FreeSurfer Manual Editing Settings¶

Configure default values for the FreeSurfer manual editing workflow (steps 7 and 8):

freesurfer_editing:
  remote_server: ''                              # Remote server hostname
  remote_user: ''                                # Remote username
  remote_base_dir: ''                            # Remote base directory path
  local_freesurfer_dir: '~/freesurfer_edits'     # Local directory for edits
  subjects_list: ''                              # Default subjects list
  download_all: false                            # Download all subjects by default
  upload_all: false                              # Upload all subjects by default
  backup_originals: true                         # Create backups when uploading

Parameter Descriptions:

remote_server: Remote server hostname (e.g., login.sherlock.stanford.edu)
remote_user: Remote username for SSH connection (e.g., SUNet ID)
remote_base_dir: Remote base directory containing FreeSurfer outputs (absolute path to BASE_DIR on server)
local_freesurfer_dir: Local directory for downloading/uploading edited FreeSurfer outputs (default: ~/freesurfer_edits)
subjects_list: Default subjects list file or comma-separated subject IDs
download_all: Download all subjects by default when using download_freesurfer.sh
upload_all: Upload all subjects by default when using upload_freesurfer.sh
backup_originals: Create timestamped backups of original FreeSurfer outputs before uploading edits (highly recommended)

Configuration Convenience

Setting these values in config.yaml allows you to run the FreeSurfer editing scripts without specifying common parameters on the command line each time. Command-line arguments always override config defaults.

Backup Safety

Keep backup_originals: true to prevent accidental data loss. Backups are created on the server as {subject}.backup.{timestamp} before the edited surfaces are uploaded.

Miscellaneous Settings¶

misc:
  debug: 0  # Debug mode (0=off, 1=on)

When debug is set to 1, the pipeline runs with only a single subject (array index 0) for testing purposes.

Migration from settings.sh (v0.1.x to v0.2.0)¶

If you are upgrading from a version that used settings.sh, follow these steps:

1. Create new config file¶

cp config.template.yaml config.yaml

2. Transfer your settings¶

Map your old Bash variables to the new YAML structure:

Old (settings.sh)	New (config.yaml)
`BASE_DIR="/path/to/study"`	`directories.base_dir: '/path/to/study'`
`task_id="TaskName"`	`scan.task_id: 'TaskName'`
`n_dummy=5`	`scan.n_dummy: 5`
`run_numbers=("01" "02")`	`scan.run_numbers: ['01', '02']`
`declare -A fmap_mapping=(["01"]="01")`	`fmap_mapping: {'01': '01'}`
`EXPECTED_FMAP_VOLS=12`	`validation.expected_fmap_vols: 12`
`SLURM_EMAIL="email@edu"`	`slurm.email: 'email@edu'`
`FMRIPREP_VERSION="24.0.1"`	`pipeline.fmriprep_version: '24.0.1'`

3. Update step references¶

Note the expanded 14-step workflow:

Steps 1-5: FlyWheel download, DICOM conversion, prep, and QC (unchanged)
Step 6: fMRIPrep anatomical-only workflows (06-run-fmriprep, optional for manual FreeSurfer editing)
Step 7: Download FreeSurfer outputs (toolbox/download_freesurfer.sh, optional)
Step 8: Upload edited FreeSurfer outputs (toolbox/upload_freesurfer.sh, optional)
Step 9: fMRIPrep full workflows (09-run-fmriprep, previously step 7)
Step 10: FSL GLM model setup (10-fsl-glm/setup_glm.sh, new)
Step 11: FSL Level 1 analysis (08-run.sbatch, new)
Step 12: FSL Level 2 analysis (09-run.sbatch, new)
Step 13: FSL Level 3 analysis (10-run.sbatch, new)
Step 14: Tarball utility (toolbox/tarball_sourcedata.sh, new)

4. Remove old settings.sh¶

Once you have migrated your settings, you can remove the old settings.sh file; it is no longer used.

Validation¶

Before running the pipeline:

Verify all paths exist and are accessible
Confirm volume counts match your acquisition protocol
Test configuration loading:

source ./load_config.sh

Test on a single subject before batch processing
Review logs for configuration warnings
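
The path check in the list above can be scripted. The sketch below expands a leading ~ and reports whether each path exists. expand_tilde and check_paths are hypothetical helpers; pass them the values from the directories section of your config.yaml.

```shell
#!/bin/sh
# Expand a leading "~" and report whether each path exists.
# expand_tilde and check_paths are hypothetical helpers, not pipeline scripts.
expand_tilde() {
    case $1 in
        "~")   printf '%s\n' "$HOME" ;;
        "~/"*) rest=${1#"~/"}
               printf '%s\n' "$HOME/$rest" ;;
        *)     printf '%s\n' "$1" ;;
    esac
}

check_paths() {
    rc=0
    for p in "$@"; do
        expanded=$(expand_tilde "$p")
        if [ -e "$expanded" ]; then
            echo "ok       $expanded"
        else
            echo "missing  $expanded" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Demo: the placeholder path from the template is expected to be reported missing.
check_paths '~' '/path/to/your/study' || echo "fix the paths reported above"
```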

Common Issues¶

YAML Syntax Errors: Ensure proper YAML formatting. Use a YAML validator if needed. Common issues include incorrect indentation and missing quotes around strings.

Path Issues: Double-check all path specifications are absolute and accessible. Paths with tildes (~) are expanded automatically.

Volume Mismatches: Verify validation.expected_fmap_vols and validation.expected_bold_vols match your acquisition protocol.

Fieldmap Mapping: Ensure each BOLD run has a corresponding fieldmap entry in fmap_mapping. Keys and values should be quoted strings (e.g., '01': '01').

Permission Problems: Check that permissions.dir_permissions and permissions.file_permissions are appropriate for your cluster environment.

Next Steps¶

After configuration, see the Usage guide to learn how to run the pipeline.