7  Tutorial Chapter 1: Setup + download ds000102

Time: ~30 minutes total (5 min setup + 25 min download).
Prereqs: a working laptop or HPC node, ~5 GB free disk, and the AWS CLI.

This chapter gets you from a blank directory to a configured Reproducible-fMRI checkout pointing at the Flanker dataset.
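If you want to confirm the prerequisites before starting, a small check like the one below can help. It is only a sketch and not part of the template: the function name `check_prereqs` is hypothetical, and the 5 GB threshold is the rough figure from the prerequisites above.

```python
import shutil
from pathlib import Path

REQUIRED_GB = 5  # rough figure from the prerequisites; adjust as needed


def check_prereqs(workdir=".", required_gb=REQUIRED_GB):
    """Report free disk space and whether the AWS CLI is on PATH (hypothetical helper)."""
    free_gb = shutil.disk_usage(Path(workdir)).free / 1e9
    return {
        "free_gb": round(free_gb, 1),
        "enough_disk": free_gb >= required_gb,
        "aws_cli_found": shutil.which("aws") is not None,
    }


if __name__ == "__main__":
    for name, value in check_prereqs().items():
        print(f"{name}: {value}")
```

If `aws_cli_found` is False, install the AWS CLI before step 3.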

7.1 1. Clone the template

git clone https://github.com/CNClaboratory/Reproducible-fMRI flanker-tutorial
cd flanker-tutorial

7.2 2. Run setup

make setup

You’ll see something like:

Detected site: local (no NeuroCommand modules, no SLURM)
Copied config/presets/local/paths.toml → config/paths.toml
Copied config/presets/local/site.conf  → config/site.conf
Running preflight (--fix mode)…
preflight summary: 13 PASS, 0 FAIL, 2 WARN
  WARN: paths.toml contains placeholder <user>
  WARN: data directories not yet created

That’s expected: the placeholders haven’t been edited yet, and no data has been downloaded.

7.3 3. Download the Flanker dataset

# Install AWS CLI if you don't have it (on macOS: `brew install awscli`)
# OpenNeuro hosts datasets on S3 with anonymous read access.
mkdir -p data/ds000102
aws s3 sync --no-sign-request s3://openneuro.org/ds000102 data/ds000102

While that runs (~25 min):

💡 What you’re getting: 26 subjects performing a Flanker task with congruent and incongruent flanker conditions, a classic cognitive-control fMRI paradigm. Each subject has an anatomical (T1w) image and 1-2 functional runs.

When it finishes, verify:

ls data/ds000102/
# Expected: dataset_description.json, participants.tsv, sub-*/
ls data/ds000102/sub-08/
# Expected: anat/  func/
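The two `ls` checks above only look at the top level and one subject. To check every subject directory, something like the following sketch could be used. The helper name `check_bids_layout` is hypothetical, and it only tests for the files and folders listed above, so it is not a substitute for a full BIDS validator.

```python
from pathlib import Path


def check_bids_layout(root):
    """Return a list of problems; an empty list means the expected layout is present."""
    root = Path(root)
    problems = []
    # Top-level files named in the expected `ls` output.
    for name in ("dataset_description.json", "participants.tsv"):
        if not (root / name).is_file():
            problems.append(f"missing file: {name}")
    subjects = sorted(p for p in root.glob("sub-*") if p.is_dir())
    if not subjects:
        problems.append("no sub-* directories found")
    # Each subject is expected to contain anat/ and func/.
    for sub in subjects:
        for modality in ("anat", "func"):
            if not (sub / modality).is_dir():
                problems.append(f"{sub.name}: missing {modality}/")
    return problems


if __name__ == "__main__":
    issues = check_bids_layout("data/ds000102")
    print("layout OK" if not issues else "\n".join(issues))
```

A subject reported as missing `anat/` or `func/` usually points to an interrupted sync; re-running the same `aws s3 sync` command only fetches what is absent.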

7.4 4. Point the template at the data

Edit config/paths.toml:

[paths.roots]
codebase = "/path/to/flanker-tutorial"     # this directory
dataset  = "/path/to/flanker-tutorial/data/ds000102"  # where you downloaded
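After editing, you may want to check that no placeholder such as `<user>` is left in the file. The sketch below scans the text for angle-bracket tokens. It is an independent illustration, not how the template's preflight works: the regular expression and the name `find_placeholders` are assumptions.

```python
import re
from pathlib import Path

# Angle-bracket tokens such as <user>; the exact pattern is an assumption.
PLACEHOLDER = re.compile(r"<[A-Za-z_][A-Za-z0-9_-]*>")


def find_placeholders(text):
    """Return (line_number, placeholder) pairs for placeholders outside comments."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Naive comment stripping: a '#' inside a quoted value would also cut the line.
        code = line.split("#", 1)[0]
        hits.extend((lineno, m.group(0)) for m in PLACEHOLDER.finditer(code))
    return hits


if __name__ == "__main__":
    config = Path("config/paths.toml")
    if config.is_file():
        for lineno, token in find_placeholders(config.read_text()):
            print(f"{config}:{lineno}: placeholder {token} still present")
```

No output means no placeholders were found.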

Re-verify:

make preflight

Now you should see 13 PASS, 0 FAIL, 0 WARN. If preflight is green, you’re ready for Chapter 2.

7.5 5. Sanity check

.venv/bin/python -c "
from libs.paths import get_paths
p = get_paths()
print('rawdata_root:', p.rawdata_root)
print('subjects:', sorted(d.name for d in p.rawdata_root.glob('sub-*'))[:5], '…')
"

Expected output:

rawdata_root: /path/to/flanker-tutorial/data/ds000102
subjects: ['sub-01', 'sub-02', 'sub-03', 'sub-04', 'sub-05'] …
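The check above prints only the first five subjects. To check the whole download, you could compare the subject directories against participants.tsv, as in the sketch below. It assumes the file follows the BIDS convention of a tab-separated `participant_id` column holding values like `sub-01`. The helper name `compare_participants` is hypothetical, and it reads the dataset path directly rather than going through `libs.paths`.

```python
import csv
from pathlib import Path


def compare_participants(root):
    """Compare participant_id values in participants.tsv with sub-* directories on disk."""
    root = Path(root)
    with open(root / "participants.tsv", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        listed = {row["participant_id"] for row in reader}
    on_disk = {p.name for p in root.glob("sub-*") if p.is_dir()}
    return {
        "missing_on_disk": sorted(listed - on_disk),  # listed but not downloaded
        "not_in_tsv": sorted(on_disk - listed),       # downloaded but not listed
    }


if __name__ == "__main__":
    dataset = Path("data/ds000102")
    if (dataset / "participants.tsv").is_file():
        print(compare_participants(dataset))
```

Two empty lists mean the directories and the participant table agree.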

Chapter 1 complete. Continue to 02_run_fmriprep.md (pending).


7.6 Why this is different from Andy’s Brain Book Tutorial #1

Brain Book gets you to “you have data on disk.” We get you to “you have data, a configured pipeline, and a preflight that says everything is wired correctly.” The 5-minute loop of running make setup, editing paths.toml, and re-running preflight is the surface area of our infrastructure contribution. From here, every make command works the same way across all four CNC Lab child repos, whether you run it on UCI HPC3, NEU Explorer, or your laptop.