Troubleshooting code

The general topic here is how to diagnose a problem in an analysis result: the code ran (it didn't crash), but things don't "look right". What do you do?

Note that this page is the low-level cousin of  🔬Troubleshooting experiments .

Common things that can go wrong (unexpected outcomes):

  • Division by zero (producing NaN or Inf)
  • Division by small numbers can make "exploding" values
  • Getting negative values even though you didn't want/expect negative values
  • Indexing mistakes, such as 0- vs 1-based, or pulling out the wrong columns of a matrix, or loading the wrong data file.
  • Ordering mistakes. You thought you were getting files in a certain order, but the sorted order (e.g. alphabetically) is not what you expected!
  • Clobbering variables in your code when you didn't mean to (i.e. not using distinct variable names).
  • Experiment/stimulus specification/timing are just plain inaccurate.
  • The data for a voxel might be all zeros (or constant), which can lead to undefined correlation values.
  • Data precision issues (assuming you are in floats but maybe the data are integer format)
  • Software expectation violations (e.g. data types / data files / NIFTI headers)
  • "Outlier" values — meaning extreme values that are probably not "real"... these can percolate and spread like a virus if you don't stop and catch them.
  • Complex (imaginary) numbers when you don't actually want them. (e.g. sqrt of negative numbers).
  • Corner cases. If the data for a voxel consists of strange/missing/invalid data, what does your analysis do to them? Or, what if the dimensionality of your data is a bit strange — does your code correctly handle all of those cases?
  • Units. Do you know roughly the min/max/mean of the values? Are you sure they are what you expect?
  • Figure bugs/errors. Did you flip the hemispheres? Did you use an inappropriate colormap? Is your axis range reasonable? Are the axes labels correct? Are you sure you are plotting the right variables?
  • Convention errors. Are you sure you understand the order of hemispheres, x vs. y, brain spatial dimensions, Cartesian vs. image coordinates, degrees vs. radians, 0-based vs 1-based, mm vs. voxels, z-score units vs. raw units, etc.
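Many of the pitfalls above (NaN/Inf from division by zero, dead all-zero voxels, extreme outliers, unexpected value ranges) can be screened for mechanically. Here is a minimal sketch in Python/NumPy; the helper name `sanity_report` and the 5-SD outlier cutoff are illustrative assumptions, not a standard:

```python
import numpy as np

def sanity_report(data, name="data"):
    """Print basic pathology checks for an array of analysis values.

    Flags NaN/Inf (e.g. from division by zero), constant rows
    (zero-variance voxels make correlations undefined), and extreme
    outliers that could percolate through later analysis stages.
    """
    data = np.asarray(data, dtype=float)
    print(f"{name}: shape={data.shape}")
    print(f"  min={np.nanmin(data):.4g}, max={np.nanmax(data):.4g}, "
          f"mean={np.nanmean(data):.4g}")
    n_nan = np.isnan(data).sum()
    n_inf = np.isinf(data).sum()
    if n_nan or n_inf:
        print(f"  WARNING: {n_nan} NaN and {n_inf} Inf values")
    # zero-variance rows (e.g. dead voxels) break correlation measures
    if data.ndim == 2:
        n_const = (data.std(axis=1) == 0).sum()
        if n_const:
            print(f"  WARNING: {n_const} constant rows (zero variance)")
    # crude outlier flag: values beyond 5 SD of the mean (arbitrary cutoff)
    finite = data[np.isfinite(data)]
    z = (finite - finite.mean()) / finite.std()
    n_out = (np.abs(z) > 5).sum()
    if n_out:
        print(f"  WARNING: {n_out} values beyond 5 SD of the mean")

# example: a voxels x timepoints matrix with two injected problems
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 50))
X[3, :] = 0      # a dead (all-zero) voxel
X[7, 10] = 1e6   # an extreme outlier
sanity_report(X, "X")
```

Running a check like this on every array that feeds your analysis is cheap, and it catches several of the failure modes above before they spread.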

How to prevent errors:

  • Use "safe" programming (see  🛠️Coding tips ).
  • For example, include checks of certain conditions that need to be true (and if they fail, you can deliberately error out).
  • For anything that seems quirky or that could go wrong (voxel selection or cross-validation schemes), take a second to write an assert that can check the sanity of it.
  • For example, if you know that your experiment involved exactly 2 repetitions in each run, you can check that that is true when you load in your stimulus specification.
  • Avoid assuming properties of the data when you code. If you must rely on certain assumptions, state them clearly (and ideally check them).
  • Use copious data inspections.
  • Check all bits of data that affect your analysis. Garbage in will mean garbage out.
  • Check all major steps of the analysis that you perform. E.g., visually check each registration you perform; visually check the output of each pre-processing step. If you don't know whether something you are staring at is good or bad, ask someone.
  • As you develop a new analysis, step through every single code line and inspect the contents of variables (e.g. histogram, imagesc, plot, etc.).
  • Consider making a consolidated script that shows all of the steps at a glance, so you can review it easily.
  • Use fprintf statements to report on various basic aspects of the data as the code runs. This allows you to easily review the progress of the analysis.
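As a concrete sketch of the assert-your-assumptions idea above, suppose (hypothetically) you know your experiment had exactly 2 repetitions of each condition in each run; the `design` dictionary and condition names here are made up for illustration:

```python
import numpy as np

# Hypothetical stimulus specification: one condition label per trial, per run.
design = {
    "run1": ["faces", "houses", "faces", "houses"],
    "run2": ["houses", "faces", "houses", "faces"],
}

n_expected_reps = 2  # known property of the (hypothetical) experiment

for run, trials in design.items():
    labels, counts = np.unique(trials, return_counts=True)
    # deliberately error out if an assumption about the design is violated
    assert (counts == n_expected_reps).all(), (
        f"{run}: expected {n_expected_reps} reps per condition, "
        f"got {dict(zip(labels, counts))}"
    )
print("design checks passed")
```

The point is not this particular check but the habit: whenever you load a specification or select data, write down the property you are relying on as an assertion so a violation fails loudly instead of silently corrupting downstream results.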

Diagnostics (what do you do when faced with something unexpected?):

  • Be a detective. Make a list of potential hypotheses and start ruling them out if you can. Is it head motion? Is it just noise? Is it an indexing bug? Are the MRI data fundamentally corrupted in terms of image quality? Did the subject fall asleep? Is there a massive instability across runs that you didn't know about? Did you accidentally analyze the wrong voxels? Did the stimulus presentation go EXACTLY as you intended? Is it some coding bug?
  • Step 1: Try to isolate where in the analysis flow this occurred. I.e., check the state of the data at different points and try to identify where in the analysis weird things are happening.
  • Step 2: Assuming you have a good starting point, one approach is to rewind to the point where you believe the data are sane, and then go line by line, i.e. one step at a time. If your IDE has a debugger, that can be a useful way to do this. At each step, look at the variables/data. You have to be clever/flexible:
  • Sometimes, spot checks (e.g. one slice) are good enough and convenient/easy-to-do. Of course, that may not necessarily be enough to find the problem.
  • Summary measures (e.g. histograms, means/medians) are useful, but may not be sensitive enough.
  • Another strategy: Look at ONE voxel and really figure out what your analysis is doing.
  • Ask an expert (or several) to take a look at your code/data/results. This can save a lot of time.
  • If you are using a tool, ask the tool's developers your question!
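The "isolate where in the flow it happens" and "look at ONE voxel" strategies can be sketched like this in Python/NumPy; the `inspect` helper and the percent-signal-change/z-score pipeline are illustrative assumptions, not a prescribed recipe:

```python
import numpy as np

def inspect(name, arr):
    """Quick spot check of an intermediate variable (illustrative helper)."""
    arr = np.asarray(arr)
    print(f"{name}: shape={arr.shape}, "
          f"min={arr.min():.3g}, max={arr.max():.3g}, "
          f"NaN={np.isnan(arr).sum()}")

rng = np.random.default_rng(1)
raw = rng.standard_normal((20, 100)) + 100   # voxels x timepoints, with baseline

inspect("raw", raw)

# step 1: convert to percent signal change relative to each voxel's mean
baseline = raw.mean(axis=1, keepdims=True)
psc = 100 * (raw - baseline) / baseline
inspect("psc", psc)

# step 2: z-score each voxel; a zero-variance voxel would surface here as NaN,
# immediately localizing the problem to this stage
z = (psc - psc.mean(axis=1, keepdims=True)) / psc.std(axis=1, keepdims=True)
inspect("z", z)

# "one voxel" strategy: trace a single voxel through every stage
v = 0
print("voxel 0, first 5 timepoints at each stage:")
print("  raw:", np.round(raw[v, :5], 2))
print("  psc:", np.round(psc[v, :5], 3))
print("  z:  ", np.round(z[v, :5], 3))
```

Checking the state of the data after each stage like this is exactly the "rewind and step" idea: the first stage whose output looks wrong is where to start digging.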

Tests and simulations:

  • Make synthetic data with a known desired outcome, and confirm that your code does the right thing.
  • Take your data and manipulate it (e.g. shuffle it, destroy structure, add noise) and see what happens.
  • Create a "test dataset" and repeatedly use it to benchmark/validate your code.
  • If you write a modular function, stress-test its behavior (try a diverse set of test examples; deliberately make corner cases) and validity before deploying it as part of a larger analysis.
  • Generate pure noise data and analyze it to see how "noise" manifests in your analysis outcomes.
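A minimal Python/NumPy sketch of three of these ideas — recovering a known answer from synthetic data, a pure-noise null, and a shuffle test. The ground-truth slope of 2.0, the noise levels, and the use of correlation are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200

# 1) synthetic data with a known answer: the response is the model times 2.0
# plus small noise, so a regression should recover a slope near 2.0
model = rng.standard_normal(n)
response = 2.0 * model + 0.1 * rng.standard_normal(n)
slope = np.polyfit(model, response, 1)[0]
print(f"recovered slope: {slope:.3f} (expected ~2.0)")

# 2) pure-noise null: correlate the model with fresh noise many times to see
# how large "noise" correlations get in this analysis
null_r = [np.corrcoef(model, rng.standard_normal(n))[0, 1]
          for _ in range(1000)]
print(f"null |r| 95th percentile: {np.percentile(np.abs(null_r), 95):.3f}")

# 3) shuffle test: destroying the pairing between model and response
# should kill the effect
shuffled = rng.permutation(response)
r_shuf = np.corrcoef(model, shuffled)[0, 1]
print(f"correlation after shuffling: {r_shuf:.3f} (should be near 0)")
```

If your real pipeline, given data like (1), does not recover the planted answer, or given (2)/(3) produces "effects" as large as your real results, you have found a problem (or learned your effect is indistinguishable from noise).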

Other ideas:

  • Have someone look at your code (code review).
  • Have someone re-implement the analysis and compare results.