{ "cells": [ { "cell_type": "markdown", "id": "00e63e99", "metadata": {}, "source": [ "# Data: datasets, parcellations & custom inputs\n", "\n", "Before running analyses, you need data. NiSpace ships with no data bundled — everything is downloaded on demand and cached locally. This notebook covers what's available, how to fetch it, how to use your own data, and how the parcellation system works under the hood.\n", "\n", "For a full overview of the integrated datasets, see the [Datasets](../datasets.rst), [Parcellations](../parcellations.rst), and [Templates](../templates.rst) pages." ] }, { "cell_type": "code", "execution_count": 1, "id": "5e371db4", "metadata": { "execution": { "iopub.execute_input": "2026-06-01T13:03:14.442229Z", "iopub.status.busy": "2026-06-01T13:03:14.442090Z", "iopub.status.idle": "2026-06-01T13:03:14.494317Z", "shell.execute_reply": "2026-06-01T13:03:14.493993Z" } }, "outputs": [], "source": [ "import tqdm.notebook\n", "tqdm.notebook.tqdm = tqdm.tqdm" ] }, { "cell_type": "markdown", "id": "1f236fcb", "metadata": {}, "source": [ "## Fetching reference datasets\n", "\n", "NiSpace provides a curated collection of reference brain maps; e.g. PET receptor densities, mRNA gene expression, ENIGMA effect sizes, and many more. Use `fetch_reference()` to load any of them.\n", "\n", "The first call downloads the data; subsequent calls use the local cache.\n", "\n", "If you pass a parcellation, you will receive parcellated data. If you do not, you will receive nifti or gifti maps (as defined by the `space` argument). Some reference datasets are only available for specific spaces, or only in tabulated format for specific parcellations.\n", "\n", "Let us first fetch all the included PET maps in MNI152NLin2009cAsym space." 
] }, { "cell_type": "code", "execution_count": 2, "id": "24e655ff", "metadata": { "execution": { "iopub.execute_input": "2026-06-01T13:03:14.496035Z", "iopub.status.busy": "2026-06-01T13:03:14.495929Z", "iopub.status.idle": "2026-06-01T13:03:16.660755Z", "shell.execute_reply": "2026-06-01T13:03:16.660422Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:16 | nispace.datasets: Loading pet maps.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "The NiSpace \"PET\" dataset is based on openly available nuclear imaging maps largely accessed via neuromaps \n", "(https://neuromaps-main.readthedocs.io/). If requested in the varying original spaces and resolutions (termed \"MNI152\", \n", "\"fsaverageOriginal\", or \"fsLROriginal\"), the maps are downloaded directly from the source and cached locally. If, as is highly \n", "recommended, the maps are requested in a defined space (\"MNI152NLin2009cAsym\", \"MNI152NLin6Asym\", \"fsaverage\", or \"fsLR\"), \n", "they are downloaded from the NiSpace-data GitHub repo (find them in `~HOME/nispace-data/reference/pet/map`). \n", "The NiSpace-hosted MNI maps were directly registered to 2mm MNI152NLin6Asym space, and transformed to 2mm MNI152NLin2009cAsym \n", "with a pre-estimated MNI-to-MNI transformation using SynthMorph v4 (https://martinos.org/malte/synthmorph/). The resulting maps \n", "were masked with a liberal grey matter mask generated from the Harvard-Oxford atlas and scaled from 1e-6 to 1. The scaling was \n", "transferred from MNI to surface maps if both were available for the same source (e.g., maps from Beliveau et al.).\n", "The accompanying metadata table contains detailed information about tracers, source samples, original publications and data \n", "sources, as well as the publication licenses. Every map should be cited when used. The responsibility for this lies with the user! 
\n", "We additionally ask to cite:\n", "- Markello et al., 2022 (https://doi.org/10.1038/s41592-022-01625-w)\n", "- Hansen et al., 2022 (https://doi.org/10.1038/s41593-022-01186-3)\n", "- Dukart et al., 2021 (https://doi.org/10.1002/hbm.25244)\n", "- Hoffmann et al., 2024 (https://doi.org/10.1162/imag_a_00197; if NiSpace-processed maps are used)\n", "To ensure reproducibility, note the NiSpace commit/version: 0.0.1b5.dev68+g20be93445.d20260520\n", "\n", "- target-5HT1a_tracer-cumi101_n-8_dx-hc_pub-beliveau2017 Source: Beliveau2017 CC BY-NC-SA 4.0 https://doi.org/10.1523/JNEUROSCI.2830-16.2016\n", " CAVE: Processed in fsaverage space, use volumetric maps only for subcortex!\n", "- target-5HT1a_tracer-way100635_n-35_dx-hc_pub-savli2012 Source: Savli2012 CC BY-NC-SA 4.0 https://doi.org/10.1016/j.neuroimage.2012.07.001\n", "- target-5HT1b_tracer-az10419369_n-36_dx-hc_pub-beliveau2017 Source: Beliveau2017 CC BY-NC-SA 4.0 https://doi.org/10.1523/JNEUROSCI.2830-16.2016\n", " CAVE: Processed in fsaverage space, use volumetric maps only for subcortex!\n", "- target-5HT1b_tracer-p943_n-23_dx-hc_pub-savli2012 Source: Savli2012 CC BY-NC-SA 4.0 https://doi.org/10.1016/j.neuroimage.2012.07.001\n", "- target-5HT1b_tracer-p943_n-65_dx-hc_pub-gallezot2010 Source: Gallezot2010 CC BY-NC-SA 4.0 https://doi.org/10.1038/jcbfm.2009.195, https://doi.org/10.1007/s00213-010-1881-0, https://doi.org/10.1001/archgenpsychiatry.2011.91, https://doi.org/10.1016/j.biopsych.2013.11.022, https://doi.org/10.1016/j.jad.2016.02.021, https://doi.org/10.1007/s00259-014-2958-5, https://doi.org/10.1002/syn.22159\n", "- target-5HT2a_tracer-altanserin_n-19_dx-hc_pub-savli2012 Source: Savli2012 CC BY-NC-SA 4.0 https://doi.org/10.1016/j.neuroimage.2012.07.001\n", "- target-5HT2a_tracer-cimbi36_n-29_dx-hc_pub-beliveau2017 Source: Beliveau2017 CC BY-NC-SA 4.0 https://doi.org/10.1523/JNEUROSCI.2830-16.2016\n", " CAVE: Processed in fsaverage space, use volumetric maps only for subcortex!\n", "- 
target-5HT4_tracer-sb207145_n-59_dx-hc_pub-beliveau2017 Source: Beliveau2017 CC BY-NC-SA 4.0 https://doi.org/10.1523/JNEUROSCI.2830-16.2016\n", " CAVE: Processed in fsaverage space, use volumetric maps only for subcortex!\n", "- target-5HT6_tracer-gsk215083_n-30_dx-hc_pub-radhakrishnan2018 Source: Radhakrishnan2018 CC BY-NC-SA 4.0 https://doi.org/10.2967/jnumed.117.206516, https://doi.org/10.1016/j.pscychresns.2019.111007\n", "- target-5HTT_tracer-dasb_n-100_dx-hc_pub-beliveau2017 Source: Beliveau2017 CC BY-NC-SA 4.0 https://doi.org/10.1523/JNEUROSCI.2830-16.2016\n", " CAVE: Processed in fsaverage space, use volumetric maps only for subcortex!\n", "- target-5HTT_tracer-dasb_n-18_dx-hc_pub-savli2012 Source: Savli2012 CC BY-NC-SA 4.0 https://doi.org/10.1016/j.neuroimage.2012.07.001\n", "- target-5HTT_tracer-madam_n-10_dx-hc_pub-fazio2016 Source: Fazio2016 CC BY-NC-SA 4.0 https://doi.org/10.1016/j.neuroimage.2016.03.019\n", "- target-A4B2_tracer-flubatine_n-30_dx-hc_pub-hillmer2016 Source: Hillmer2016 CC BY-NC-SA 4.0 https://doi.org/10.1016/j.neuroimage.2016.07.026, https://doi.org/10.1093/ntr/ntx091\n", "- target-CB1_tracer-fmpepd2_n-22_dx-hc_pub-laurikainen2019 Source: Laurikainen2019 CC BY-NC-SA 4.0 https://doi.org/10.1016/j.neuroimage.2018.10.013\n", "- target-CB1_tracer-omar_n-77_dx-hc_pub-normandin2015 Source: Normandin2015 CC BY-NC-SA 4.0 https://doi.org/10.1038/jcbfm.2015.46, https://doi.org/10.1016/j.bpsc.2015.09.008, https://doi.org/10.1016/j.biopsych.2015.08.021, https://doi.org/10.1111/j.1530-0277.2012.01815.x\n", "- target-CMRglu_tracer-fdg_n-20_dx-hc_pub-castrillon2023 Source: Castrillon2023 CC BY-NC-SA 4.0 https://doi.org/10.1126/sciadv.adi7632\n", "- target-COX1_tracer-ps13_n-11_dx-hc_pub-kim2020 Source: Kim2020 CC0 1.0 https://doi.org/10.1007/s00259-020-04855-2, https://doi.org/10.18112/openneuro.ds004401.v1.0.1\n", "- target-D1_tracer-sch23390_n-13_dx-hc_pub-kaller2017 Source: Kaller2017 CC BY-NC-SA 4.0 https://doi.org/10.1007/s00259-017-3645-0\n", 
"- target-D23_tracer-fallypride_n-49_dx-hc_pub-jaworska2020 Source: Jaworska2020 CC BY-NC-SA 4.0 https://doi.org/10.1038/s41386-020-0662-7\n", "- target-D23_tracer-flb457_n-37_dx-hc_pub-smith2017 Source: Smith2017 CC BY-NC-SA 4.0 https://doi.org/10.1177/0271678X17737693\n", "- target-D23_tracer-flb457_n-55_dx-hc_pub-sandiego2015 Source: Sandiego2015 CC BY-NC-SA 4.0 https://doi.org/10.1038/jcbfm.2014.237, https://doi.org/10.1177/0271678X17737693, https://doi.org/10.1038/s41386-019-0456-y, https://doi.org/10.1001/jamapsychiatry.2014.2414, https://doi.org/10.1038/npp.2017.223\n", "- target-DAT_tracer-fepe2i_n-6_dx-hc_pub-sasaki2012 Source: Sasaki2012 CC BY-NC-SA 4.0 https://doi.org/10.2967/jnumed.111.101626\n", "- target-DAT_tracer-fpcit_n-174_dx-hc_pub-dukart2018 Source: Dukart2018 CC BY-NC-SA 4.0 https://doi.org/10.1038/s41598-018-22444-0\n", " CAVE: SPECT, not PET!\n", "- target-DAT_tracer-fpcit_n-30_dx-hc_pub-garciagomez2013 Source: Garciagomez2013 free https://doi.org/10.1016/j.remn.2013.02.009\n", " CAVE: SPECT, not PET!\n", "- target-FDOPA_tracer-fluorodopa_n-12_dx-hc_pub-garciagomez2018 Source: Garciagomez2018 free https://doi.org/10.33588/imagendiagnostica.901.2\n", "- target-GABAa_tracer-flumazenil_n-16_dx-hc_pub-norgaard2021 Source: Norgaard2021 CC BY-NC-SA 4.0 https://doi.org/10.1016/j.neuroimage.2021.117878\n", " CAVE: Processed in fsaverage space, use volumetric maps only for subcortex!\n", "- target-GABAa_tracer-flumazenil_n-6_dx-hc_pub-dukart2018 Source: Dukart2018 CC BY-NC-SA 4.0 https://doi.org/10.1038/s41598-018-22444-0\n", "- target-GABAa5_tracer-ro154513_n-10_dx-hc_pub-lukow2022 Source: Lukow2022 CC BY 4.0 https://doi.org/10.1038/s42003-022-03268-1\n", "- target-H3_tracer-gsk189254_n-8_dx-hc_pub-gallezot2017 Source: Gallezot2017 CC BY-NC-SA 4.0 https://doi.org/10.1177/0271678X16650697, https://doi.org/10.1038/jcbfm.2009.195\n", "- target-HDAC_tracer-martinostat_n-8_dx-hc_pub-wey2016 Source: Wey2016 CC0 1.0 
https://doi.org/10.1126/scitranslmed.aaf7551\n", "- target-KOR_tracer-ly2795050_n-28_dx-hc_pub-vijay2018 Source: Vijay2018 CC BY-NC-SA 4.0 https://doi.org/10.1038/s41386-018-0199-1\n", "- target-M1_tracer-lsn3172176_n-24_dx-hc_pub-naganawa2020 Source: Naganawa2020 CC BY-NC-SA 4.0 https://doi.org/10.2967/jnumed.120.246967\n", "- target-mGluR5_tracer-abp688_n-22_dx-hc_pub-rosaneto Source: Rosaneto CC BY-NC-SA 4.0 https://doi.org/10.1101/2021.10.28.466336\n", "- target-mGluR5_tracer-abp688_n-28_dx-hc_pub-dubois2015 Source: Dubois2015 CC BY-NC-SA 4.0 https://doi.org/10.1007/s00259-015-3167-6\n", "- target-mGluR5_tracer-abp688_n-73_dx-hc_pub-smart2019 Source: Smart2019 CC BY-NC-SA 4.0 https://doi.org/10.1007/s00259-018-4252-4\n", "- target-MOR_tracer-carfentanil_n-204_dx-hc_pub-kantonen2020 Source: Kantonen2020 CC BY-NC-SA 4.0 https://doi.org/10.1038/mp.2017.183\n", "- target-MOR_tracer-carfentanil_n-39_dx-hc_pub-turtonen2021 Source: Turtonen2021 CC BY-NC-SA 4.0 https://doi.org/10.1016/j.bpsc.2020.10.013\n", "- target-NET_tracer-mrb_n-10_dx-hc_pub-hesse2017 Source: Hesse2017 CC BY-NC-SA 4.0 https://doi.org/10.1007/s00259-016-3590-3\n", "- target-NET_tracer-mrb_n-77_dx-hc_pub-ding2010 Source: Ding2010 CC BY-NC-SA 4.0 https://doi.org/10.1002/syn.20696, https://doi.org/10.1016/j.neuroimage.2013.10.004, https://doi.org/10.1038/s41366-019-0471-4, https://doi.org/10.1210/jc.2017-02717\n", "- target-NMDA_tracer-ge179_n-29_dx-hc_pub-galovic2021 Source: Galovic2021 CC BY-NC-SA 4.0 https://doi.org/10.1001/jamaneurol.2022.4352, https://doi.org/10.1016/j.neuroimage.2021.118194, https://doi.org/10.2967/jnumed.113.130641\n", " CAVE: Unlike other tracers, [18F]GE-179 binds to open (active) NMDA receptors!\n", "- target-rCPS_tracer-leucine_n-42_dx-hc_pub-smith2023 Source: Smith2023 CC0 1.0 https://doi.org/10.18112/openneuro.ds004654.v1.0.1, https://doi.org/10.18112/openneuro.ds004730.v1.0.0, https://doi.org/10.18112/openneuro.ds004731.v1.0.0, 
https://doi.org/10.18112/openneuro.ds004733.v1.0.0, https://doi.org/10.1016/j.nbd.2020.104978, https://doi.org/10.1177/0271678X221090997, https://doi.org/10.1038/jcbfm.2009.7, https://doi.org/10.1093/sleep/zsy088, https://doi.org/10.1177/0271678X221121873\n", "- target-SV2A_tracer-ucbj_n-76_dx-hc_pub-finnema2016 Source: Finnema2016 CC BY-NC-SA 4.0 https://doi.org/10.1177/0271678X17724947, https://doi.org/10.2967/jnumed.120.246967, https://doi.org/10.1177/0271678X211004312, https://doi.org/10.1186/s13195-020-00742-y, https://doi.org/10.1177/0271678X20946198, https://doi.org/10.1093/cid/ciab484, https://doi.org/10.1038/s41380-021-01184-0, https://doi.org/10.1016/j.bpsc.2015.09.008, https://doi.org/10.1111/epi.16653, https://doi.org/10.1186/s13550-020-00670-w, https://doi.org/10.1002/alz.12097, https://doi.org/10.1111/epi.14701, https://doi.org/10.1038/s41467-019-09562-7, https://doi.org/10.1001/jamaneurol.2018.1836\n", "- target-TSPO_tracer-pbr28_n-6_dx-hc_pub-lois2018 Source: Lois2018 MIT https://doi.org/10.1021/acschemneuro.8b00072, https://doi.org/10.5281/zenodo.1174364\n", "- target-VAChT_tracer-feobv_n-18_dx-hc_pub-aghourian2017 Source: Aghourian2017 CC BY-NC-SA 4.0 https://doi.org/10.1038/mp.2017.183\n", "- target-VAChT_tracer-feobv_n-4_dx-hc_pub-tuominen Source: Tuominen CC BY-NC-SA 4.0 https://doi.org/10.1101/2021.10.28.466336\n", "- target-VAChT_tracer-feobv_n-5_dx-hc_pub-bedard2019 Source: Bedard2019 CC BY-NC-SA 4.0 https://doi.org/10.1016/j.sleep.2018.12.020\n", "- target-VMAT2_tracer-dtbz_n-76_dx-hc_pub-larsen2020 Source: Larsen2020 CC0 1.0 https://doi.org/10.18112/openneuro.ds002385.v1.1.0, https://doi.org/10.1038/s41467-020-14693-3\n", "\n", "First two pet maps:\n", "/Users/llotter/nispace-data/reference/pet/map/target-5HT1a_tracer-cumi101_n-8_dx-hc_pub-beliveau2017/target-5HT1a_tracer-cumi101_n-8_dx-hc_pub-beliveau2017_space-MNI152NLin2009cAsym.nii.gz\n", 
"/Users/llotter/nispace-data/reference/pet/map/target-5HT1a_tracer-way100635_n-35_dx-hc_pub-savli2012/target-5HT1a_tracer-way100635_n-35_dx-hc_pub-savli2012_space-MNI152NLin2009cAsym.nii.gz\n" ] } ], "source": [ "from nispace.datasets import fetch_reference\n", "\n", "pet_maps = fetch_reference(\"pet\")\n", "print(\"First two pet maps:\")\n", "print(pet_maps[0])\n", "print(pet_maps[1])" ] }, { "cell_type": "markdown", "id": "b6c72f1a", "metadata": {}, "source": [ "Let us now pass a parcellation to fetch tabulated data directly:" ] }, { "cell_type": "code", "execution_count": 3, "id": "db51cfbd", "metadata": { "execution": { "iopub.execute_input": "2026-06-01T13:03:16.662421Z", "iopub.status.busy": "2026-06-01T13:03:16.662289Z", "iopub.status.idle": "2026-06-01T13:03:16.673366Z", "shell.execute_reply": "2026-06-01T13:03:16.673096Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:16 | nispace.datasets: Loading pet maps.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:16 | nispace.datasets: Loading data parcellated with 'Schaefer200Parcels7Networks'\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "PET: 49 maps x 200 parcels\n", "Index: ['map']\n", "First two entries: ['target-5HT1a_tracer-cumi101_n-8_dx-hc_pub-beliveau2017', 'target-5HT1a_tracer-way100635_n-35_dx-hc_pub-savli2012']\n" ] } ], "source": [ "# fetch PET maps in Schaefer200 parcellation\n", "pet_tab = fetch_reference(\n", " \"pet\",\n", " parcellation=\"Schaefer200\",\n", " print_references=False\n", ")\n", "print(f\"PET: {pet_tab.shape[0]} maps x {pet_tab.shape[1]} parcels\")\n", "print(f\"Index: {pet_tab.index.names}\") # ['set', 'map'] — two-level MultiIndex\n", "print(\"First two entries:\", list(pet_tab.index[:2]))" ] }, { "cell_type": "markdown", "id": "f9f2c660", "metadata": {}, "source": [ "We often do not want to fetch all maps; NiSpace provides readymade 
\"collections\" of reference datasets, which can be called via the `collection` argument." ] }, { "cell_type": "code", "execution_count": 4, "id": "c657dcc2", "metadata": { "execution": { "iopub.execute_input": "2026-06-01T13:03:16.674855Z", "iopub.status.busy": "2026-06-01T13:03:16.674739Z", "iopub.status.idle": "2026-06-01T13:03:16.689367Z", "shell.execute_reply": "2026-06-01T13:03:16.689088Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:16 | nispace.datasets: Loading pet maps.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:16 | nispace.datasets: Loading integrated collection 'UniqueTracers' for dataset 'pet'.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:16 | nispace.datasets: Filtering maps by collection.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:16 | nispace.datasets: Loading data parcellated with 'Schaefer200Parcels7Networks'\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "PET (UniqueTracers): 29 maps x 200 parcels\n", "Index: ['set', 'map']\n", "First few entries: [('General', 'target-CMRglu_tracer-fdg_n-20_dx-hc_pub-castrillon2023'), ('General', 'target-rCPS_tracer-leucine_n-42_dx-hc_pub-smith2023'), ('General', 'target-SV2A_tracer-ucbj_n-76_dx-hc_pub-finnema2016'), ('General', 'target-HDAC_tracer-martinostat_n-8_dx-hc_pub-wey2016'), ('General', 'target-VMAT2_tracer-dtbz_n-76_dx-hc_pub-larsen2020')]\n" ] } ], "source": [ "# fetch PET maps in Schaefer200 parcellation using the UniqueTracers collection\n", "# -> one map per tracer/target\n", "pet_unique = fetch_reference(\n", " \"pet\",\n", " parcellation=\"Schaefer200\",\n", " collection=\"UniqueTracers\",\n", " print_references=False\n", ")\n", "print(f\"PET (UniqueTracers): {pet_unique.shape[0]} maps x {pet_unique.shape[1]} parcels\")\n", 
"print(f\"Index: {pet_unique.index.names}\") # ['set', 'map'] — two-level MultiIndex\n", "print(\"First few entries:\", list(pet_unique.index[:5]))" ] }, { "cell_type": "markdown", "id": "e3128596", "metadata": {}, "source": [ "The result is a DataFrame with a two-level `['set', 'map']` MultiIndex. The `set` level groups maps by biological system (e.g. `\"Serotonin\"`, `\"Dopamine\"`); the `map` level is the individual tracer identifier. This set structure is what X-Set Enrichment Analysis ([Notebook 10](intro10_xsea.ipynb)) operates on.\n", "\n", "### Collections\n", "\n", "A **collection** is a predefined selection of maps. You pass its name as the `collection` argument. The index structure you get back depends on how the collection is defined internally:\n", "\n", "| Scenario | Example | Index |\n", "|----------|---------|-------|\n", "| No collection | `fetch_reference(\"pet\")` | Flat `['map']` — all maps, no grouping |\n", "| Text-file collection | `collection=\"All\"` | Flat `['map']` — a named subset, still no grouping |\n", "| JSON collection | `collection=\"UniqueTracers\"` | Two-level `['set', 'map']` — maps organized into named groups |\n", "\n", "A flat index is fine for all standard colocalization analyses. The two-level index is required only if you want to use XSEA (where the `set` groupings are the unit of analysis).\n", "\n", "For PET, `\"UniqueTracers\"` is the recommended default: it keeps one representative tracer per receptor to reduce redundancy and groups them by neurotransmitter system. 
Again, here is what you get without a collection, for comparison:" ] }, { "cell_type": "code", "execution_count": 5, "id": "2ba3f3be", "metadata": { "execution": { "iopub.execute_input": "2026-06-01T13:03:16.690815Z", "iopub.status.busy": "2026-06-01T13:03:16.690684Z", "iopub.status.idle": "2026-06-01T13:03:16.699917Z", "shell.execute_reply": "2026-06-01T13:03:16.699640Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:16 | nispace.datasets: Loading pet maps.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:16 | nispace.datasets: Loading data parcellated with 'Schaefer200Parcels7Networks'\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "No collection: 49 maps, index: ['map']\n", "UniqueTracers: 29 maps, index: ['set', 'map']\n" ] } ], "source": [ "# without a collection: all maps, flat ['map'] index\n", "pet_all = fetch_reference(\"pet\", parcellation=\"Schaefer200\", print_references=False)\n", "print(f\"No collection: {pet_all.shape[0]} maps, index: {pet_all.index.names}\")\n", "print(f\"UniqueTracers: {pet_unique.shape[0]} maps, index: {pet_unique.index.names}\")" ] }, { "cell_type": "code", "execution_count": 6, "id": "df4adad5", "metadata": { "execution": { "iopub.execute_input": "2026-06-01T13:03:16.701412Z", "iopub.status.busy": "2026-06-01T13:03:16.701289Z", "iopub.status.idle": "2026-06-01T13:03:17.197874Z", "shell.execute_reply": "2026-06-01T13:03:17.197472Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:16 | nispace.datasets: Loading mrna maps.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:16 | nispace.datasets: Loading integrated collection 'CellTypesSilettiSuperclusters' for dataset 'mrna'.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:16 | 
nispace.datasets: Filtering maps by collection.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.datasets: Loading data parcellated with 'DesikanKilliany'\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "mRNA (CellTypesSilettiSuperclusters): 613 genes in 29 cell-type sets x 68 parcels\n", "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.datasets: Loading enigmathick maps.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.datasets: Loading data parcellated with 'DesikanKilliany'\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "ENIGMA thickness: 22 disorders x 68 cortical parcels\n", "['dx-mdd_age-adult_pub-schmaal2017', 'dx-mdd_age-adolescent_pub-schmaal2017', 'dx-adhd_age-allages_pub-hoogman2019', 'dx-adhd_age-adult_pub-hoogman2019', 'dx-adhd_age-adolescent_pub-hoogman2019', 'dx-adhd_age-pediatric_pub-hoogman2019', 'dx-asd_pub-vanrooij2018', 'dx-bd_age-adult_pub-hibar2018', 'dx-bd_age-adolescent_pub-hibar2018', 'dx-scz_pub-vanerp2018', 'dx-ocd_age-adult_pub-boedhoe2018', 'dx-ocd_age-pediatric_pub-boedhoe2018', 'dx-epilepsy_pub-whelan2018', 'dx-epilepsy_subtype-gge_pub-whelan2018', 'dx-epilepsy_subtype-ltle_pub-whelan2018', 'dx-epilepsy_subtype-rtle_pub-whelan2018', 'dx-22q_pub-sun2020', 'dx-an_pub-walton2022', 'dx-an_subtype-acAN_pub-walton2022', 'dx-an_subtype-pwrAN_pub-walton2022', 'dx-antisocial_pub-gao2024', 'dx-pd_pub-laansma2021']\n" ] } ], "source": [ "# two other available reference datasets\n", "\n", "# mRNA gene expression (Allen Human Brain Atlas)\n", "mrna = fetch_reference(\n", " \"mrna\",\n", " parcellation=\"DesikanKilliany\",\n", " collection=\"CellTypesSilettiSuperclusters\",\n", " print_references=False\n", ")\n", "print(f\"mRNA (CellTypesSilettiSuperclusters): {mrna.shape[0]} genes in \"\n", " f\"{mrna.index.get_level_values('set').nunique()} cell-type sets x 
{mrna.shape[1]} parcels\")\n", "\n", "# ENIGMA cortical thickness effect sizes (Cohen's d, cases vs. controls)\n", "# Note: this is a reference dataset — not example data — but we use it as input in Notebook 11\n", "enigma_thick = fetch_reference(\n", " \"enigmathick\",\n", " parcellation=\"DesikanKilliany\",\n", " print_references=False\n", ")\n", "print(f\"ENIGMA thickness: {enigma_thick.shape[0]} disorders x {enigma_thick.shape[1]} cortical parcels\")\n", "print(list(enigma_thick.index))" ] }, { "cell_type": "markdown", "id": "e1478669", "metadata": {}, "source": [ "## Fetching example data\n", "\n", "For demos and testing, NiSpace includes the `\"anorexianervosa\"` example dataset: parcellated grey matter values for 50 anorexia nervosa patients and 50 healthy controls. We use this throughout the series wherever we need individual-subject data.\n", "\n", "> **Note:** This dataset is **simulated** and not intended for scientific use. The numbers are realistic but fabricated — please do not draw any clinical or scientific conclusions from it.\n", "\n", "Group labels are encoded in the subject IDs: `sub-XXXAN` = patient, `sub-XXXHC` = healthy control." ] }, { "cell_type": "code", "execution_count": 7, "id": "720210f5", "metadata": { "execution": { "iopub.execute_input": "2026-06-01T13:03:17.199967Z", "iopub.status.busy": "2026-06-01T13:03:17.199777Z", "iopub.status.idle": "2026-06-01T13:03:17.211433Z", "shell.execute_reply": "2026-06-01T13:03:17.211126Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.datasets: Loading example dataset: 'anorexianervosa', parcellated with: Schaefer200Parcels7Networks.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Shape: (100, 200) (100 subjects x 200 parcels)\n", "First few IDs: ['sub-001AN', 'sub-002AN', 'sub-003AN', 'sub-004AN'] ...\n", "Last few IDs: ... 
['sub-097HC', 'sub-098HC', 'sub-099HC', 'sub-100HC']\n", "\n", "Group counts: {'AN': 50, 'HC': 50}\n" ] } ], "source": [ "from nispace.datasets import fetch_example\n", "\n", "an_data = fetch_example(\"anorexianervosa\", parcellation=\"Schaefer200\")\n", "\n", "print(f\"Shape: {an_data.shape} ({an_data.shape[0]} subjects x {an_data.shape[1]} parcels)\")\n", "print(\"First few IDs:\", list(an_data.index[:4]), \"...\")\n", "print(\"Last few IDs: ...\", list(an_data.index[-4:]))\n", "\n", "# extract group labels from the index\n", "import pandas as pd\n", "groups = an_data.index.str.extract(r'(AN|HC)$')[0]\n", "print(\"\\nGroup counts:\", groups.value_counts().to_dict())" ] }, { "cell_type": "markdown", "id": "e481eeb1", "metadata": {}, "source": [ "## Fetching parcellations\n", "\n", "`fetch_parcellation()` downloads and returns any of NiSpace's built-in parcellations. The returned object is a multi-space `Parcellation` instance." ] }, { "cell_type": "code", "execution_count": 8, "id": "15cbe763", "metadata": { "execution": { "iopub.execute_input": "2026-06-01T13:03:17.213179Z", "iopub.status.busy": "2026-06-01T13:03:17.212806Z", "iopub.status.idle": "2026-06-01T13:03:17.222685Z", "shell.execute_reply": "2026-06-01T13:03:17.222422Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.core.parcellation: Building multi-space Parcellation for 'Schaefer200Parcels7Networks' from library.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.core.parcellation: Available spaces: MNI152NLin2009cAsym, MNI152NLin6Asym, fsaverage, fsLR\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.core.parcellation: Parcellation 'Schaefer200Parcels7Networks': validation passed.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "from nispace.datasets 
import fetch_parcellation\n", "\n", "parc = fetch_parcellation(\"Schaefer200\")\n", "print(parc)" ] }, { "cell_type": "markdown", "id": "d345af67", "metadata": {}, "source": [ "## The Parcellation concept\n", "\n", "You might expect a parcellation to be a single NIfTI file. In NiSpace, it's a **multi-space object**: a `Parcellation` instance that holds representations in all supported coordinate spaces: MNI152 (volumetric, two versions: NLin2009cAsym and NLin6Asym), fsaverage (FreeSurfer surface), and fsLR (HCP surface).\n", "\n", "When you pass `parcellation=\"Schaefer200\"` to `NiSpace()`, this object is built internally. It automatically detects the space of your input data and uses the right version of the parcellation, resampling as needed. You don't have to think about this.\n", "If you use a `Parcellation` outside of the NiSpace API, you may need to set the current \"active\" space explicitly; otherwise, some methods will raise errors.\n", "\n", "The multi-space design also enables space-aware **distance matrices** and **spin matrices**:\n", "\n", "- **Distance matrix**: geodesic distances between parcel centroids along the cortical surface. Used by Moran and Burt null models.\n", "- **Spin matrix**: a precomputed set of spatial permutations on the sphere. Used by the Alexander-Bloch spin test.\n", "\n", "For standard parcellations, both are precomputed and downloaded automatically. 
You can also load them manually:" ] }, { "cell_type": "code", "execution_count": 9, "id": "2831d6af", "metadata": { "execution": { "iopub.execute_input": "2026-06-01T13:03:17.224142Z", "iopub.status.busy": "2026-06-01T13:03:17.224028Z", "iopub.status.idle": "2026-06-01T13:03:17.646340Z", "shell.execute_reply": "2026-06-01T13:03:17.645957Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.core.parcellation: Lazy-loading parcellation image for space 'MNI152NLin2009cAsym'.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.core.parcellation: Parcellation 'Schaefer200Parcels7Networks': active space set to 'MNI152NLin2009cAsym'.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.core.parcellation: Lazy-loading dist mat for 'Schaefer200Parcels7Networks' in space 'MNI152NLin2009cAsym'.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Distance matrix: (200, 200) (parcels x parcels, in mm)\n" ] } ], "source": [ "# set active space\n", "parc.set_active_space(\"MNI152NLin2009cAsym\")\n", "\n", "# geodesic distance matrix between Schaefer200 parcel centroids\n", "dist_mat = parc.get_dist_mat()\n", "\n", "print(f\"Distance matrix: {dist_mat.shape} (parcels x parcels, in mm)\")" ] }, { "cell_type": "markdown", "id": "22227b72", "metadata": {}, "source": [ "## Combined cortex + subcortex parcellations\n", "\n", "NiSpace's built-in atlases are separated by cortex and subcortex to allow for the multi-space approach. As we have done above with cortical atlases, subcortical atlases and data parcellated using these can be directly fetched by the name of the atlas. 
You can combine a cortical and a subcortical atlas by concatenating their names:\n", "\n", "```python\n", "parcellation = \"Schaefer200TianS1\" # Schaefer200 cortex + Tian scale 1 subcortex\n", "```\n", "\n", "The same combined string works wherever a parcellation name is accepted: `NiSpace(parcellation=...)`, `fetch_reference(parcellation=...)`, and `fetch_parcellation(...)`. Writing it with a space (`\"Schaefer200 TianS1\"`) also works.\n", "\n", "**Available subcortical atlases (refer to the Parcellations page for an up-to-date list):**\n", "\n", "| Name | Regions | Notes |\n", "|------|---------|-------|\n", "| `TianS1` | 16 | Coarsest; good default for whole-brain analyses |\n", "| `TianS2` | 32 | Intermediate |\n", "| `TianS3` | 50 | Finest Tian scale |\n", "| `Aseg` | ~40 | FreeSurfer automatic subcortical segmentation |\n", "| `HarvardOxfordSubcortical` | 21 | Harvard-Oxford subcortical atlas |\n", "| ... | | |\n", "\n", "Where possible, reference datasets (PET, mRNA, etc.) are pre-parcellated with all atlases, including subcortical ones, so `fetch_reference()` with a combined parcellation just works.\n", "\n", "**Null models:** The spin test requires a spherical cortical surface projection and cannot be applied to combined parcellations. NiSpace automatically selects Moran spectral randomization for combined parcellations; see [Notebook 5](intro05_null_models.ipynb)." 
] }, { "cell_type": "code", "execution_count": 10, "id": "9f21772f", "metadata": { "execution": { "iopub.execute_input": "2026-06-01T13:03:17.648030Z", "iopub.status.busy": "2026-06-01T13:03:17.647904Z", "iopub.status.idle": "2026-06-01T13:03:17.918629Z", "shell.execute_reply": "2026-06-01T13:03:17.918352Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.core.parcellation: Building combined Parcellation 'Schaefer200Parcels7Networks+TianS1' from library.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.core.parcellation: Common MNI space(s) for combined: ['MNI152NLin2009cAsym', 'MNI152NLin6Asym'].\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.core.parcellation: Merging 'Schaefer200Parcels7Networks' and 'TianS1' for space 'MNI152NLin2009cAsym'.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.core.parcellation: Merging 'Schaefer200Parcels7Networks' and 'TianS1' for space 'MNI152NLin6Asym'.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.core.parcellation: Fetching cx surface data for 'Schaefer200Parcels7Networks' in 'fsaverage' (for spin tests).\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.core.parcellation: Fetching cx surface data for 'Schaefer200Parcels7Networks' in 'fsLR' (for spin tests).\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.core.parcellation: Combined parcellation 'Schaefer200Parcels7NetworksTianS1' ready. MNI space(s): ['MNI152NLin2009cAsym', 'MNI152NLin6Asym']. 
Cx surface space(s) for spins: ['fsaverage', 'fsLR'].\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.core.parcellation: Parcellation 'Schaefer200Parcels7NetworksTianS1': validation passed.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Combined: \n", "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.datasets: Loading pet maps.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.datasets: Loading integrated collection 'UniqueTracers' for dataset 'pet'.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.datasets: Filtering maps by collection.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:17 | nispace.datasets: Loading and inner-merging data parcellated with 'Schaefer200Parcels7Networks' and 'TianS1'\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "PET (Schaefer200+TianS1): 29 maps x 216 parcels\n", " → 200 cortex + 16 subcortex parcels\n" ] } ], "source": [ "# combined parcellation: Schaefer200 cortex + TianS1 subcortex\n", "parc_combined = fetch_parcellation(\"Schaefer200TianS1\")\n", "print(f\"Combined: {parc_combined}\")\n", "\n", "# reference data for the combined parcellation\n", "pet_combined = fetch_reference(\n", " \"pet\",\n", " parcellation=\"Schaefer200TianS1\",\n", " collection=\"UniqueTracers\",\n", " print_references=False\n", ")\n", "n_cx = 200 # Schaefer200 cortex parcels\n", "n_total = pet_combined.shape[1]\n", "print(f\"\\nPET (Schaefer200+TianS1): {pet_combined.shape[0]} maps x {n_total} parcels\")\n", "print(f\" → {n_cx} cortex + {n_total - n_cx} subcortex parcels\")" ] }, { "cell_type": "markdown", "id": "4b876c01", "metadata": {}, "source": [ "## Storage and reproducibility\n", "\n", "By default, all data lands in `~/nispace-data/`. 
Change this with:\n", "\n", "```python\n", "import os\n", "os.environ[\"NISPACE_DATA_DIR\"] = \"/your/path\"\n", "```\n", "\n", "Every downloaded file is verified against a SHA-256 hash manifest — corrupted or incomplete downloads are detected and re-fetched automatically. See the [Data Management](../data_management.rst) page for details on versioning and reproducibility.\n", "\n", "## Summary\n", "\n", "| Function / pattern | Purpose |\n", "|--------------------|---------|\n", "| `fetch_reference(dataset, parcellation, collection)` | Load curated reference maps (PET, mRNA, ENIGMA, …) |\n", "| `fetch_example(\"anorexianervosa\", parcellation)` | Load the group-comparison example dataset |\n", "| `fetch_parcellation(name)` | Fetch a built-in parcellation as a multi-space object |\n", "| `parcellation=\"Schaefer200TianS1\"` | Combined cortex + subcortex (TianS1/S2/S3, Aseg, …) |\n", "| `NiSpace(parcellation=nifti_img)` | Use any NIfTI as a custom parcellation |\n", "\n", "Next: [Notebook 4](intro04_imaging_phenotypes.ipynb) shows how to compute group-level effect sizes from individual subject data." ] }, { "cell_type": "code", "execution_count": 11, "id": "373256b9", "metadata": { "execution": { "iopub.execute_input": "2026-06-01T13:03:17.920165Z", "iopub.status.busy": "2026-06-01T13:03:17.920067Z", "iopub.status.idle": "2026-06-01T13:03:20.415566Z", "shell.execute_reply": "2026-06-01T13:03:20.415173Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[fetch_atlas_aal] Dataset found in /Users/llotter/nilearn_data/aal_SPM12\n", "AAL atlas: 117 labels\n", "\u001b[32mINFO | 01/06/26 15:03:18 | nispace.api: *** NiSpace.fit() - Data extraction and preparation. 
***\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:18 | nispace.core.parcellation: Building Parcellation from path / image.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:18 | nispace.core.parcellation: Parcellation space: 'MNI152NLin6Asym'.\u001b[0m\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/var/folders/6n/h4150p8d5gz5kbnqv5_406940000gp/T/ipykernel_28706/3862712985.py:6: DeprecationWarning: Starting in version 0.13, the default fetched mask will beAAL 3v2 instead.\n", " aal = nilearn_datasets.fetch_atlas_aal()\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:18 | nispace.core.parcellation: Parcellation 'None': validation passed.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:18 | nispace.api: Checking input data for 'x' (should be, e.g., PET data):\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:18 | nispace.io: Input type: list, assuming imaging data.\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:18 | nispace.io: Background (bg) handling: ignoring bg: True (bg value: ['auto', 0.0]); dropping bg parcels: False\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32mINFO | 01/06/26 15:03:18 | nispace.io: Parcellating imaging data.\u001b[0m\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\r", "Parcellating (1 proc): 0%| | 0/5 [00:00