Image/video processing

Scope:

  • The ability to understand, manipulate, and process images and video.
  • This includes many types of interaction, ranging from "black box" tools to programming analyses yourself.
  • Sometimes, image/video processing is "just" for aesthetics and visualization and precision/accuracy is not super important; other times, the processing is part of a formal analysis and precision/accuracy is critical.

Application domains:

  • Visual stimuli - Designing and creating stimuli to probe certain aspects of visual processing
  • Figures/visualizations - Often, the scientist creates visualizations to help understand data and analysis results
  • Paper figures (see  🎨Making figures ) - Sometimes, the scientist is trying to create effective and very carefully designed figures to convey ideas in a paper
  • Paper videos - This is not very common, but in theory, creating videos and submitting them as part of a paper (e.g. Supplementary materials) can be effective. For example, a movie is very handy for showing what a cognitive experiment is like.
  • Data are images/videos - One can think of MRI data as effectively just a set of images or videos. Hence, being adept at processing images and videos transfers to MRI analysis.

Basic concepts:

  • image
  • We can define this roughly as any data arranged in rows x columns
  • For RGB images, there are rows x columns x 3 channels (red, green, blue)
  • Sometimes you also have an alpha channel as a 4th channel.
  • For the most part, you want a pixel aspect ratio of 1:1 (i.e. square pixels). Note that this refers to the shape of each pixel, not the shape of the overall image.
  • video
  • We can define this roughly as images over time. Hence, data arranged as rows x columns x frames.
  • Frame rate: e.g. 24 fps, 30 fps, 60 fps.
  • Also relevant is the refresh rate of your display device: monitors are typically 60 Hz (but many modern monitors run at 120 Hz or higher).
  • Once you have frames over time, you immediately encounter the concept of time series.
  • Video can be very large in size. Hence, there are a variety of codecs (widely used in industry) for compressing and storing video. Be careful: most codecs use lossy compression (and therefore cause data loss or data modification).
  • formats
  • Typically, images are stored with 8 bits per channel (values 0-255).
  • bitmap graphics: pictures are represented at the pixel level (i.e. images)
  • vector graphics: pictures are represented using idealized mathematical objects
  • image formats: PNG (lossless, widely used), TIFF (typically lossless, older format), JPG (small file size, lossy compression), BMP (old school, typically uncompressed), GIF (indexed images with at most 256 colors, old, supports animation), SVG (vector graphics format that can also embed bitmaps), PDF (general format that subsumes many different types of things, including but not limited to pictures)
  • video formats: MPG, MPEG-2, QuickTime (.mov), H.264, MKV, MP4. Note that things are complicated: some of these refer to specific codecs (e.g. H.264), whereas some are container formats (e.g. MKV, MP4, QuickTime .mov) that are compatible with a variety of underlying codecs.
  • BE CAREFUL ABOUT COMPRESSION
  • lossy - this refers to a way of encoding an image/video such that you lose information
  • lossless - this refers to a way of encoding such that you can recover exactly the original information. Note that it is possible to be compressed but lossless.
  • displays
  • The resolution of your monitor is important to consider
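  • Common resolutions: 640 x 480, 800 x 600, 1024 x 768, 720p (1280 x 720), 1080i/1080p (1920 x 1080), 2K (2048 x 1080 in the cinema standard, though the term is sometimes used loosely for 1920 x 1080), 4K (e.g. 3840 x 2160), 8K (e.g. 7680 x 4320)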
  • gamma
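  • The physical luminance you get from a display is, typically, not linear with respect to the encoded image pixel intensity values; it is approximately a power function of the pixel value (the exponent is the "gamma"). This is intentional: the nonlinearity roughly matches our perceptual sensitivity, so that equal steps in pixel value look roughly like equal steps in brightness.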
  • print issues
  • dots per inch (dpi)
  • pixels per inch (ppi)
  • When you save a bitmap image, it consists of a specific number of pixels (e.g. rows x columns) and often also contains the intended ppi, which then determines the intended actual size in inches. (For example, do a "get info" on some .png file that you have.)
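
Two of the ideas above (gamma and ppi) reduce to simple arithmetic. The following is a minimal Python sketch, not a calibration procedure. It assumes a pure power-law display with gamma = 2.2, which is a common approximation; real displays differ in detail and should be measured. The function names display_luminance and print_size_inches are made up for illustration.

```python
def display_luminance(pixel_value, gamma=2.2, max_value=255):
    """Relative luminance (0 to 1) under a simple power-law display model.

    gamma=2.2 is an assumed, commonly quoted value, not a measurement.
    """
    return (pixel_value / max_value) ** gamma


def print_size_inches(n_rows, n_cols, ppi):
    """Intended physical (width, height) in inches for a bitmap at a given ppi."""
    return (n_cols / ppi, n_rows / ppi)


if __name__ == "__main__":
    # A mid-range pixel value produces far less than half of the maximum luminance.
    print(display_luminance(128))
    # A 900-row x 1200-column image tagged as 300 ppi.
    print(print_size_inches(900, 1200, 300))
```

The first call illustrates why pixel value 128 should not be treated as "half luminance"; the second shows that the same pixels tagged with a different ppi simply imply a different intended print size.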

Tools for image and video manipulation:

  • See  ✂️Tools and information .
  • ImageMagick - collection of utilities for image manipulation; useful for automation
  • ffmpeg - standard free video encoder/decoder
  • HandBrake - handy GUI video transcoder (built on FFmpeg's libraries, among others)
  • Photoshop / GIMP - bitmap manipulation
  • Illustrator / Inkscape - vector graphics manipulation
  • Google Image Search - find images for stimuli?
  • Pexels - royalty-free images?
  • Taking a screenshot of what's on your computer screen is a handy method for making images:
  • QuickTime Player can record your screen and/or audio and/or video
  • OBS - a modern standard method for screen recording
  • Zoom - can provide very nicely compressed screen recordings
  • Your cell phone - can make images and video!
  • Storage of videos: YouTube, Vimeo?
  • Captioning: YouTube automatically gives pretty good captions for a video you upload
  • Making a video/movie from images:
  • First, generate all of the individual frames, and name them well (e.g. image0001.png; zero-padded numbers make the files sort in frame order)
  • Use some command-line utility (like ffmpeg) to convert a collection of frames into a movie file. For example:
ffmpeg -framerate 60 -pattern_type glob -i 'image*.png' -crf 18 -c:v libx264 -pix_fmt yuv420p test.mp4
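
The first step (generating well-named frames) can be sketched in Python using only the standard library. This is an illustrative sketch, not the script used by the author: the helper names (write_gray_png, write_frames) and the moving-bar content are made up, and in practice you would more likely save frames with an imaging library. The frame size is kept even because the yuv420p pixel format used in the command above requires even width and height.

```python
import os
import struct
import zlib


def _chunk(tag, data):
    """Build one PNG chunk: length, tag, data, CRC over tag + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def write_gray_png(path, rows):
    """Write rows of 0-255 integers as an 8-bit grayscale PNG (lossless)."""
    height, width = len(rows), len(rows[0])
    # IHDR: width, height, bit depth 8, color type 0 (grayscale),
    # then compression, filter, and interlace methods all set to 0.
    header = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + bytes(row) for row in rows)
    with open(path, "wb") as f:
        f.write(b"\x89PNG\r\n\x1a\n")
        f.write(_chunk(b"IHDR", header))
        f.write(_chunk(b"IDAT", zlib.compress(raw)))
        f.write(_chunk(b"IEND", b""))


def write_frames(out_dir, n_frames=60, size=64):
    """Write a bar sweeping left to right as image0001.png, image0002.png, ..."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for i in range(n_frames):
        bar = (i * size) // n_frames  # column of the bright bar in this frame
        rows = [[255 if c == bar else 0 for c in range(size)] for _ in range(size)]
        path = os.path.join(out_dir, f"image{i + 1:04d}.png")
        write_gray_png(path, rows)
        paths.append(path)
    return paths


if __name__ == "__main__":
    write_frames("frames")
```

Running the ffmpeg command above from inside the frames directory should then assemble the 60 frames into a one-second movie.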

Colormaps:

  • For non-RGB images (i.e. data), the choice of colormap is really important
  • Consider choices for the minimum value, the maximum value, and the colormap. These determine the RGB color corresponding to any arbitrary value (in between the min and max)
  • Lots of different colormaps (see  🏂One-offs )... The choice is important!!!!!
  • Indexed images - each pixel just stores an index (e.g. between 0 and 255) into an associated colormap; the combination of index and colormap determines the final RGB values shown.
  • ColorBrewer is useful for designing colormaps.
  •  Color cycling fun 
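
How the minimum, the maximum, and the colormap jointly determine the displayed color can be sketched as follows. This is a minimal illustration assuming a colormap given as a short list of evenly spaced RGB anchor colors with linear blending in between; the function name apply_colormap and the two example colormaps are made up, and plotting libraries implement the same idea with more options.

```python
def apply_colormap(value, vmin, vmax, colormap):
    """Map a scalar data value to an (R, G, B) tuple of 0-255 integers.

    colormap is a list of at least two (r, g, b) anchor colors, assumed to be
    evenly spaced between vmin and vmax.
    """
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    # Normalize to [0, 1]; values outside [vmin, vmax] are clipped.
    t = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    # Find the two neighboring anchor colors and blend linearly between them.
    position = t * (len(colormap) - 1)
    lower = min(int(position), len(colormap) - 2)
    frac = position - lower
    c0, c1 = colormap[lower], colormap[lower + 1]
    return tuple(round(a + frac * (b - a)) for a, b in zip(c0, c1))


GRAY = [(0, 0, 0), (255, 255, 255)]
BLUE_WHITE_RED = [(0, 0, 255), (255, 255, 255), (255, 0, 0)]

if __name__ == "__main__":
    # The same data value gets a different color when the range changes.
    print(apply_colormap(0.5, 0, 1, GRAY))
    print(apply_colormap(0.5, 0, 2, GRAY))
    # A diverging colormap with a symmetric range puts zero at white.
    print(apply_colormap(0.0, -1, 1, BLUE_WHITE_RED))
```

The design point is that the color alone does not identify the data value: the range (vmin, vmax) has to be reported alongside the colormap.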

Image processing:

  • resize - Refers to changing the number of rows/columns in an image. Presumably, you want to do this in a way that preserves aspect ratio (and therefore preserves square pixels).
  • cropping - Refers to taking a small section from an image
  • luminance - Roughly, the overall mean intensity across all pixels. If you add a constant to all pixels, you are increasing the luminance.
  • contrast - Roughly, the dynamic range of the intensity values across pixels. If you multiply pixel intensities by a number greater than 1, you are increasing the contrast (note that plain multiplication also raises the mean, so contrast is often scaled around the mean instead). Of course, you should consider whether you are causing values to go out of range (e.g. above 255 or below 0)!
  • aliasing - Roughly, when you sample a signal (e.g. an image) at too low a rate, so that frequencies too high to be represented show up as spurious lower frequencies (e.g. moiré patterns). This distorts the frequency content and is, in general, bad and to be avoided.
  • anti-aliasing - Roughly, using a method (such as smoothing a signal before downsampling) to minimize or reduce the effects of aliasing
  • smoothing + subsample - This is the right way to perform image downsampling.
  • interpolation - Roughly, guessing/estimating the signal level at points in space (or time) where you do not have actual data. There are many types of interpolation, e.g., nearest, linear, cubic, spline, etc.
  • filtering - general term referring to systematic modification of, usually, the frequency content (spectrum) of a signal.
  • low-pass - preserving the low frequencies, discarding higher frequencies
  • band-pass - preserving an intermediate range of frequencies, discarding others
  • high-pass - preserving the high frequencies, discarding lower frequencies
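
The difference between naive subsampling and "smoothing + subsample" can be seen on a tiny example. The sketch below is pure Python on nested lists, purely for illustration (real code would use an array library). It assumes the simplest possible low-pass filter, a box average over each factor x factor block; other smoothing kernels such as a Gaussian are common alternatives. The function names are made up.

```python
def subsample(image, factor):
    """Naive downsampling: keep every factor-th pixel (prone to aliasing)."""
    return [row[::factor] for row in image[::factor]]


def smooth_and_subsample(image, factor):
    """Average each factor x factor block (box low-pass), keeping one value per block."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(0, rows - factor + 1, factor):
        out_row = []
        for c in range(0, cols - factor + 1, factor):
            block = [image[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# Fine vertical stripes: columns alternate between 0 and 255.
stripes = [[0 if c % 2 == 0 else 255 for c in range(8)] for _ in range(8)]

if __name__ == "__main__":
    # Naive subsampling only ever lands on the dark columns.
    print(subsample(stripes, 2)[0])
    # Smoothing first preserves the mean intensity of the stripes.
    print(smooth_and_subsample(stripes, 2)[0])
```

With the striped input, naive subsampling returns an all-black image (the stripes alias to a constant), whereas smoothing first returns a uniform mid-gray that matches the original mean intensity.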

ffmpeg and ImageMagick command line utilities

Here are  sample scripts that may be handy  for doing things like quickly combining images into a movie, splitting a movie into its image frames, combining images into a PDF, etc. (See  brief overview by Kendrick here .)
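
For orientation, here is a hedged Python sketch of the kinds of commands such scripts typically wrap. These are not the linked scripts themselves: the function names and file names are hypothetical, and the sketch assumes ffmpeg and ImageMagick 7 (the magick command; older ImageMagick 6 installations use convert instead) are installed and on your PATH.

```python
import shlex


def split_movie_command(movie_path, frame_pattern="frame%04d.png"):
    """ffmpeg command that writes every frame of a movie as a numbered PNG."""
    return ["ffmpeg", "-i", movie_path, frame_pattern]


def images_to_pdf_command(image_paths, pdf_path):
    """ImageMagick command that combines images into one multi-page PDF."""
    return ["magick", *image_paths, pdf_path]


if __name__ == "__main__":
    # Print the commands; pass the lists to subprocess.run(...) to execute them.
    print(shlex.join(split_movie_command("test.mp4")))
    print(shlex.join(images_to_pdf_command(["image0001.png", "image0002.png"], "out.pdf")))
```

Building the command as a list of arguments (rather than one string) avoids quoting problems when file names contain spaces.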