The nsdimagery experiment was conducted in a single 7T scanning session consisting of 12 runs. All 8 participants from the main NSD study underwent this additional session, following the same high-resolution fMRI acquisition protocols as NSD. Participants performed three types of tasks (vision, attention, and imagery) on three sets of target stimuli (simple, naturalistic, and conceptual), resulting in 9 distinct run types. Each of the three imagery run types was run twice, with a different design matrix each time, bringing the total to 12 runs per subject. Each run included six conditions (a condition is defined as the thing that is being seen, attended, or imagined), each repeated eight times, for a total of 48 stimulus trials per run.
For the behavioral data associated with the nsdimagery experiment, please see Behavioral data.
Task description:
Vision: Subjects viewed an image within a square frame along with a single letter cue at the point of fixation for 3 seconds. On 50% of trials, the cued target matched the seen target image; on the remaining trials, the displayed cue belonged to one of the other 5 targets. Subjects indicated with a button press whether or not the cued target matched the seen target. This was followed by a 1-second rest during which the square frame was empty and the single letter in the fixation dot returned to an “X”.
Each vision run had 12 blank trials: 3 at the beginning of the run, 5 distributed in the middle, and 4 at the end. Each blank trial was 4 seconds long, with an empty border and an “X” in the fixation dot.
Attention: Each trial in an attention run had 2 epochs:
a) cue epoch (3 seconds long) - here, subjects viewed an empty square frame with a single letter cue at the point of fixation indicating the target they were to attend in the following epoch.
b) detection epoch (3 seconds long) - here, subjects were shown a rapid series of images at 8Hz. For 50% of the trials, an image was present somewhere in this series that matched or was consistent with the cued target, and the rest of the images were distractor images. For the other 50% of trials, all 24 images in the series were distractors. Subjects indicated with a button press when they saw the target or, if by the end of the series they didn’t detect the target, they indicated with a different button that it was absent. The detection epoch was followed by 2 seconds of rest during which the square frame was empty and the single letter in the fixation dot returned to an “X”.
Each attention run had 12 blank trials: 2 at the beginning of the run, 7 distributed in the middle and 3 at the end. Each blank trial was 8 seconds long with empty border and "X" in the fixation dot.
Imagery: Subjects viewed an empty square frame with a single letter cue at the point of fixation indicating the target they were to imagine. They were instructed to project their mental image into the space outlined by the frame and to maintain it to the best of their ability for the 3 second duration in which that cue was present. Subjects indicated with a button press whether the image they formed was vivid or not-vivid. This was followed by a 1 second rest during which the square frame remained empty and the single letter in the fixation dot returned to an “X”.
Each imagery run had 12 blank trials: 3 at the beginning of the run, 5 distributed in the middle, and 4 at the end. Each blank trial was 4 seconds long, with an empty border and an “X” in the fixation dot.
Stimuli types:
There were 3 sets of target stimuli, each set containing 6 targets.
set A = simple:
Six simple geometric shapes: four oriented bars (0°, 45°, 90°, 135°) and two crosses (“+” and “x”), all constructed from black bars on a gray background.
set B = complex:
Six complex images: five natural scenes selected from the NSD shared1000 and one artwork (“The Two Sisters” by Kehinde Wiley). The natural scene images were chosen based on a recognizability score derived from participants’ performance in the original NSD sessions, ensuring a range of familiarity and visual content.
set C = conceptual:
A set of six words describing visual features or objects, representing abstract concepts that might be present in an NSD core image, rather than specific images. Target words were chosen so that there were 2 sequences of 3 words that described nested features, going from more low-level features to higher-level features. The 6 target words were:
stripes, zebra, mammal (non-human)
yellow, banana, fruit
The images selected to be consistent with these conceptual targets differ between vision and attention runs, and also differ from trial to trial.
Each stimulus was associated with a unique single-letter cue, and participants memorized all 18 cue-stimulus pairs prior to scanning. Pre-scan practice sessions were conducted to ensure familiarity with the cues and stimuli, involving both visual presentations and verbal recall to reinforce memory.
Run order:
| Run | Trial duration | No. of stimulus trials | Run duration | No. of volumes (for 1mm/1s preparation) |
|---|---|---|---|---|
| visA | 4s | 48 | 4 min | 240 |
| attA | 8s | 48 | 8 min | 480 |
| imgA_1 | 4s | 48 | 4 min | 240 |
| visB | 4s | 48 | 4 min | 240 |
| attB | 8s | 48 | 8 min | 480 |
| imgB_1 | 4s | 48 | 4 min | 240 |
| visC | 4s | 48 | 4 min | 240 |
| attC | 8s | 48 | 8 min | 480 |
| imgC_1 | 4s | 48 | 4 min | 240 |
| imgA_2 | 4s | 48 | 4 min | 240 |
| imgB_2 | 4s | 48 | 4 min | 240 |
| imgC_2 | 4s | 48 | 4 min | 240 |
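The run durations and volume counts in the run order listing follow directly from the trial counts given in the task descriptions. As a rough check, the sketch below recomputes them in Python. The variable and function names are illustrative only (they are not part of any NSD tooling), and it assumes one volume per second, as in the 1 s preparation.

```python
# Values taken from the task descriptions and the run order listing.
STIM_TRIALS = 48    # 6 conditions x 8 repeats per run
BLANK_TRIALS = 12   # blank trials per run
TRIAL_SECONDS = {"vis": 4, "att": 8, "img": 4}  # trial duration per task

RUN_ORDER = [
    "visA", "attA", "imgA_1",
    "visB", "attB", "imgB_1",
    "visC", "attC", "imgC_1",
    "imgA_2", "imgB_2", "imgC_2",
]


def run_seconds(run_name):
    """Duration of a run in seconds (stimulus + blank trials).

    The first three characters of the run name give the task type.
    With 1 s sampling, this also equals the number of volumes.
    """
    task = run_name[:3]
    return (STIM_TRIALS + BLANK_TRIALS) * TRIAL_SECONDS[task]


volumes = [run_seconds(run) for run in RUN_ORDER]
total_minutes = sum(volumes) / 60

for run, n in zip(RUN_ORDER, volumes):
    print(f"{run:8s} {n // 60} min  {n} volumes")
```

Each 4 s run therefore contains 60 trials (240 s), and each 8 s attention run contains 60 trials (480 s).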
Stimuli design:
A black frame, 4 pixels in width, was present and constant at all times during all runs, whether or not an image was present. It encapsulated a space of 714 x 714 pixels (which corresponds to 8.4° x 8.4°).
All images subtended 8.4° x 8.4° visual angle (consistent with NSD stimuli presentation) corresponding to 714 x 714 pixels. (Including the bordering frame, the size comes to ~8.5° and 722 x 722 pixels.) Note that images taken from NSD core were originally 425 x 425 pixels, and so were upsampled to 714 x 714 pixels.
A fixation dot at center was present and constant at all times during all runs: 27 x 27 pixels, 0.3° x 0.3°, consisting of black border (~1px) and white center.
One of 19 capital letters (i.e., a cue or the dummy cue “X”) was present at the center of the white circle at all times during all runs; the letter changed between trials to indicate rest periods and blank trials. During the ‘on’ portion of vision and imagery stimulus trials and during the ‘cue epoch’ of attention stimulus trials, the letter was one of the 18 cue letters (~0.2° tall). During the ‘detection epoch’ of attention stimulus trials, during the ‘off’/rest portion of all stimulus trials, and during all blank trials, the letter was a capital “X”.
The background used for stimuli was uint8(127).
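The pixel and degree figures quoted above are mutually consistent, which can be checked with a short calculation. The Python sketch below derives a pixels-per-degree factor from the stated image size and applies it to the other quoted sizes. The names are illustrative, and it assumes a uniform linear pixel-to-degree mapping across the display, which is only an approximation.

```python
# Sizes quoted in the stimulus description.
IMAGE_PX = 714        # image width/height in pixels
IMAGE_DEG = 8.4       # image width/height in degrees of visual angle
FRAME_WIDTH_PX = 4    # width of the black frame
FIXATION_PX = 27      # fixation dot diameter in pixels
NSD_NATIVE_PX = 425   # original size of NSD core images

# Assumption: a single linear scale factor applies across the display.
PX_PER_DEG = IMAGE_PX / IMAGE_DEG


def px_to_deg(px):
    """Convert a size in pixels to degrees of visual angle."""
    return px / PX_PER_DEG


framed_px = IMAGE_PX + 2 * FRAME_WIDTH_PX    # frame on both sides
framed_deg = px_to_deg(framed_px)            # size including the frame
fixation_deg = px_to_deg(FIXATION_PX)        # fixation dot size
upsample_factor = IMAGE_PX / NSD_NATIVE_PX   # scaling applied to NSD images

print(framed_px, round(framed_deg, 2), round(fixation_deg, 2), round(upsample_factor, 2))
```

This reproduces the 722-pixel framed size, a framed extent of roughly 8.5°, and a fixation dot of roughly 0.3°.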
Each frame to be shown in the nsdimagery experiment was constructed ahead of time (border + every needed combination of cue + image), resulting in 1,149 distinct frame images. All distinct frames used in the experimental (and practice) runs are in: nsddata_stimuli/stimuli/nsdimagery/allstim/*.png
All frame images were named to indicate the image-cue combination with the following convention:
For stim type A: WWW_XXX.Xdeg_YYYL_ZZW_cue*.png, where WWW indicates whether it was a bar (‘bar’) or cross (‘crs’), XXX.X is the orientation in degrees, YYY is the length of the component bars in pixels, and ZZ is the width of the component bars.
For stim types B and C: sharedAAAA_nsdBBBBB_cue*.png - where AAAA is the index into the shared 1,000 images, BBBBB indicates the 73k-ID (1-indexed), and * indicates the letter showing in the fixation dot (blank00000_00000000_*.png for frames where no image was showing, only border + fixation dot + letter). Note that shared0000_nsd00000 is a special case that was used for the artwork image (which is not part of the actual shared1000 NSD images).
nsddata/experiments/nsdimagery/*pair_list* - indicates cue letters assigned to the various targets
nsddata/experiments/nsdimagery/*screencapture.mp4 - movie files showing the actual nsdimagery experiment for runs 4, 5, and 6 out of the 12 total runs.
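The frame naming convention described above lends itself to a small parser. The Python sketch below is only an illustration and is not part of the NSD tooling. It assumes that the placeholders have the digit counts implied by the convention, that the cue is a single capital letter following the literal text "cue" (or following the final underscore for blank frames), and that the width field is an integer. The example filenames are invented to fit the pattern and are not copied from the dataset.

```python
import re

# Patterns written from the stated convention; field widths are assumptions.
STIM_A = re.compile(
    r"^(?P<shape>bar|crs)_(?P<angle>\d+\.\d)deg_"
    r"(?P<length>\d+)L_(?P<width>\d+)W_cue(?P<cue>[A-Z])\.png$"
)
STIM_BC = re.compile(
    r"^shared(?P<shared>\d{4})_nsd(?P<nsd>\d{5})_cue(?P<cue>[A-Z])\.png$"
)
BLANK = re.compile(r"^blank00000_00000000_(?P<cue>[A-Z])\.png$")


def parse_frame_name(name):
    """Return a dict describing a frame filename, or None if unrecognized."""
    m = STIM_A.match(name)
    if m:
        return {
            "kind": "simple",
            "shape": m["shape"],
            "angle_deg": float(m["angle"]),
            "length_px": int(m["length"]),
            "width": int(m["width"]),
            "cue": m["cue"],
        }
    m = STIM_BC.match(name)
    if m:
        shared, nsd = int(m["shared"]), int(m["nsd"])
        # shared0000_nsd00000 is the special case used for the artwork image.
        kind = "artwork" if shared == 0 and nsd == 0 else "nsd_image"
        return {"kind": kind, "shared_index": shared, "nsd_id": nsd, "cue": m["cue"]}
    m = BLANK.match(name)
    if m:
        return {"kind": "blank", "cue": m["cue"]}
    return None


# Hypothetical filenames that follow the convention.
for example in ["bar_045.0deg_300L_020W_cueK.png",
                "shared0000_nsd00000_cueQ.png",
                "blank00000_00000000_X.png"]:
    print(example, "->", parse_frame_name(example))
```

If the real filenames deviate from these assumptions (for example, in zero padding), the regular expressions would need to be adjusted after inspecting the allstim directory.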
Design matrices:
nsddata/experiments/nsdimagery/[run]_dm.mat
Each of the 12 runs has a design matrix file, XXX_dm.mat, where XXX combines the task type (vis, att, or img), the stim type (A, B, or C), and, for img runs only, the run number (_1 or _2).
Contents:
<dm> is time (in 1 sec increments) X conditions, where the conditions are specified in order by <condit_list>.
<condit_list> is an N x 2 matrix, where N is the number of conditions. The first column indicates the imagined, seen, or attended target using its associated single-letter cue, and the second column (if present) indicates whether it was a match/present trial (1) or a non-match/not-present trial (0).
imagery: (6 x 1) conditions determined by cue
vision: (12 x 2) conditions determined by [seen image] x [match/non-match status] (note that the letter corresponds to the seen image, not the cue!)
attention: (18 x 2)
cue epoch: conditions determined by cue (second col is NaN, b/c only cue is showing)
detection epoch: conditions determined by [cued target] x [present/not-present status]
The condition of a trial is not necessarily the cued target. For example, in visC runs, the cued target might be a zebra, but the image that the subject is seeing is actually of a banana. Broadly, therefore, the condition is aligned with the thing that is being seen, attended, or imagined.
Images in the detection epoch of attention runs were shown at 8Hz, and so each entry in <dm> during this epoch of trials actually represents 8 unique images. A 1 entry under a target-present condition indicates that the cued target was one of the 8 images in that second. A 1 entry under a target-not-present condition indicates that the cued target was not one of the 8 images in that second. Therefore, for each attention trial there will be 6 associated 1’s: 3 in a row for the cue epoch, and then either 3 under the target-not-present condition (not-present trials), or 2 under the target-not-present condition and 1 under the target-present condition to indicate where the target image appeared in the 3s detection epoch (present trials).
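To make the attention-run coding concrete, the Python sketch below builds the design-matrix rows for a single attention trial. It is a simplified illustration and not the code used to generate the distributed files: it keeps only three columns for one cued target (cue epoch, detection target-present, detection target-not-present), whereas the actual <dm> has 18 conditions. The function name and the 0-based `present_second` argument are hypothetical.

```python
def attention_trial_rows(present_second=None):
    """Rows of a toy design matrix for one 8 s attention trial.

    Each row is one second; columns are
    [cue epoch, detection target-present, detection target-not-present]
    for a single cued target. `present_second` (0, 1, or 2) is the second
    of the detection epoch containing the target; None means a
    target-absent trial.
    """
    rows = []
    for t in range(8):          # 3 s cue + 3 s detection + 2 s rest
        row = [0, 0, 0]
        if t < 3:               # cue epoch
            row[0] = 1
        elif t < 6:             # detection epoch
            if present_second is not None and t - 3 == present_second:
                row[1] = 1      # target was among the 8 images this second
            else:
                row[2] = 1      # only distractors this second
        rows.append(row)        # rest seconds stay all-zero
    return rows


absent = attention_trial_rows()
present = attention_trial_rows(present_second=1)


def column_sums(rows):
    """Sum each column of a list-of-lists matrix."""
    return [sum(col) for col in zip(*rows)]


print(column_sums(absent))   # cue / present / not-present counts
print(column_sums(present))
```

In both cases the trial contributes six 1 entries in total, matching the description above; only their split between the present and not-present columns differs.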
Note that these design matrices are primarily for informational purposes; the actual design matrix used in the GLM analysis to prepare betas is described below.
This indicates the design matrix used with GLMsingle to derive single-trial betas.
fMRI data
The pre-processing of the fMRI data from the nsdimagery experiment is essentially the same as that used on the NSD core experiment. However, one difference is that the low-resolution pre-processing of the nsdimagery data was at 1.8 mm and 1 s (not 1.333 s). The total number of volumes, after pre-processing, is 240, 480, 240, 240, 480, 240, 240, 480, 240, 240, 240, and 240 for the 12 runs, respectively.
For the GLM analysis, we preserve the order in which the 12 runs were collected. Furthermore, we treat the cue epoch and detection epoch of the attention runs as if they reflect distinct conditions. Hence, vision and imagery trials produce a single beta weight, whereas the attention trials produce two beta weights. Note that all beta weights reflect an event that has a nominal 3-s duration, which is the same as in the NSD core experiment. Each scan session produces a total of 720 beta weights.
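The figure of 720 beta weights per session can be recovered from the run and trial counts. The short Python sketch below does the arithmetic; the dictionary names are illustrative only.

```python
# Counts taken from the experiment description.
RUNS_PER_TASK = {"vis": 3, "att": 3, "img": 6}    # 12 runs in total
STIM_TRIALS_PER_RUN = 48                          # 6 conditions x 8 repeats
# Attention trials yield two betas (cue epoch + detection epoch).
BETAS_PER_TRIAL = {"vis": 1, "att": 2, "img": 1}

betas_per_task = {
    task: RUNS_PER_TASK[task] * STIM_TRIALS_PER_RUN * BETAS_PER_TRIAL[task]
    for task in RUNS_PER_TASK
}
total_betas = sum(betas_per_task.values())

print(betas_per_task, total_betas)
```

Vision contributes 144 betas, and attention and imagery contribute 288 each.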
We provide “fithrf” and “fithrf_GLMdenoise_RR” versions of the single-trial betas. To support the cross-validation procedures that are a necessary part of “fithrf_GLMdenoise_RR”, we make the following design choices: (1) For vision runs, the match/nonmatch distinction is ignored (hence, trials are coded according to the seen image). (2) We act as if visA, visB, and visC reflect the same conditions, and attA, attB, and attC reflect the same conditions. (We leave imgA, imgB, and imgC as distinct runs.) These design choices are approximations intended only to help the estimation of GLMsingle hyperparameters; the user ultimately receives a separate beta for each trial in the nsdimagery experiment.