Cognition:
Programming Edition!

LondonR 2025

Hannah Frick

Reading and understanding code

Research indicates that almost 60% of programmers’ time is spent understanding rather than writing code.

Felienne Hermans - The Programmer’s Brain

Let’s read some code

my_fun <- function(a,  
                   b,  
                   c,  
                   d = 2,  
                   e = 3,  
                   g = 22,  
                   j = 324) {  
  i <- a + c  
  w <- (d + e) / ((a + c) + (d + e))  
  four <- e - a  
  f <- four - a  
  ff <- i + w  
  list(f, ff, four)  
}

Our short-term memory (STM) only holds two to six items.

This is not even R code!

SEXP Cdqrls(SEXP x, SEXP y, SEXP tol, SEXP chk)
{
    SEXP ans;
    SEXP qr, coefficients, residuals, effects, pivot, qraux;
    int n, ny = 0, p, rank, nprotect = 4, pivoted = 0;
    double rtol = asReal(tol), *work;
    Rboolean check = asLogical(chk);

    ans = getAttrib(x, R_DimSymbol);
    if(check && length(ans) != 2) error(_("'x' is not a matrix"));
    int *dims = INTEGER(ans);
    n = dims[0]; p = dims[1];
    if(n) ny = (int)(XLENGTH(y)/n); /* y :  n x ny, or an n - vector */
    if(check && n * ny != XLENGTH(y))
        error(_("dimensions of 'x' (%d,%d) and 'y' (%d) do not match"),
              n,p, XLENGTH(y));

    /* These lose attributes, so do after we have extracted dims */
    if (TYPEOF(x) != REALSXP) {
        PROTECT(x = coerceVector(x, REALSXP));
        nprotect++;
    }
    if (TYPEOF(y) != REALSXP) {
        PROTECT(y = coerceVector(y, REALSXP));
        nprotect++;
    }
/* < more code > */
}

This is not even R code!

SEXP Cdqrls(SEXP x, SEXP y, SEXP tol, SEXP chk)
{
    /* define some variables */

    /* check inputs x and y */

    /* protect x and y */

/* < more code > */
}

Our long-term memory (LTM) helps us aggregate items in our STM into chunks.

What does this do? And how?

drop_strata <- function(expr, in_plus = TRUE) {
  if (rlang::is_call(expr, "+", n = 2) && in_plus) {
    lhs <- drop_strata(expr[[2]], in_plus = in_plus)
    rhs <- drop_strata(expr[[3]], in_plus = in_plus)
    if (rlang::is_call(lhs, "strata")) {
      rhs
    } else if (rlang::is_call(rhs, "strata")) {
      lhs
    } else {
      rlang::call2("+", lhs, rhs)
    }
  } else if (rlang::is_call(expr)) {
    expr[-1] <- purrr::map(as.list(expr[-1]), drop_strata, in_plus = FALSE)
    expr
  } else {
    expr
  }
}

Tests show (important) use cases

test_that("`drop_strata()` removes strata term in a series of `+` calls", {
  expect_equal(
    drop_strata(rlang::expr(a + strata(x))),
    rlang::expr(a)
  )
  
  expect_equal(
    drop_strata(rlang::expr(a + strata(x) + b)),
    rlang::expr(a + b)
  )
})

test_that("`drop_strata()` does not remove strata in other cases", {
  expect_equal(
    drop_strata(rlang::expr(a * (b + strata(x)))),
    rlang::expr(a * (b + strata(x)))
  )
})

Abstract syntax tree

library(lobstr)

ast(strata(x))
#> █─strata 
#> └─x

ast(a + b)
#> █─`+` 
#> ├─a 
#> └─b

Abstract syntax tree

library(lobstr)

ast(a + strata(x))
#> █─`+` 
#> ├─a 
#> └─█─strata 
#>   └─x

ast(a + strata(x) + b)
#> █─`+` 
#> ├─█─`+` 
#> │ ├─a 
#> │ └─█─strata 
#> │   └─x 
#> └─b
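Reading the tree in code

The trees above can also be walked by hand, which is roughly what drop_strata() does with expr[[2]] and expr[[3]]. A minimal sketch, assuming rlang is installed; the variable names e and inner are only for illustration:

```r
library(rlang)

# The expression from the second tree above
e <- expr(a + strata(x) + b)

# A call can be indexed like a list:
# element 1 is the function, the remaining elements are its arguments
e[[1]]  # the function being called: `+`
e[[2]]  # left-hand side: a + strata(x)
e[[3]]  # right-hand side: b

# The strata() term sits on the right-hand side of the inner `+` call
inner <- e[[2]]
is_call(inner, "+", n = 2)     # TRUE: a `+` call with two arguments
is_call(inner[[3]], "strata")  # TRUE: the term to drop
is_call(e[[3]], "strata")      # FALSE: b is a symbol, not a call
```

Because a + strata(x) + b parses as (a + strata(x)) + b, the function has to recurse into the left-hand side before it can find the strata() term.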

Our working memory is our STM applied to a problem.

Our working memory only holds two to six items.

Challenges

  • Lack of information
  • Lack of knowledge
  • Lack of processing power

Challenges, for reasons

  • Lack of information: limited capacity of STM
  • Lack of knowledge: activation of LTM
  • Lack of processing power: limited capacity of working memory

Help your brain out

  • Look for beacons: names, comments, paragraphs
  • Summarize code into chunks via comments or refactoring
  • Learn more: programming concepts, domain knowledge
  • Offload information
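
Chunking via comments, sketched

One way to apply the second point to the my_fun() example from earlier. This is a sketch, not the one right answer: the slide does not say what the values mean, so all new names (my_fun_chunked, sum_ac, sum_de, ac_plus_share, e_minus_a, e_minus_2a) are hypothetical and only describe the arithmetic.

```r
# Original: seven arguments and five intermediate results to track at once
my_fun <- function(a, b, c, d = 2, e = 3, g = 22, j = 324) {
  i <- a + c
  w <- (d + e) / ((a + c) + (d + e))
  four <- e - a
  f <- four - a
  ff <- i + w
  list(f, ff, four)
}

# Same computation, grouped into three commented chunks.
# The signature is kept as is; b, g and j are unused in both versions.
my_fun_chunked <- function(a, b, c, d = 2, e = 3, g = 22, j = 324) {
  # Chunk 1: the two sums everything else is built from
  sum_ac <- a + c
  sum_de <- d + e

  # Chunk 2: sum_ac plus the share of sum_de in the total
  ac_plus_share <- sum_ac + sum_de / (sum_ac + sum_de)

  # Chunk 3: e reduced by a, once and twice
  e_minus_a <- e - a
  e_minus_2a <- e_minus_a - a

  list(e_minus_2a, ac_plus_share, e_minus_a)
}

# Both versions return the same result for these inputs
identical(my_fun(1, 0, 2), my_fun_chunked(1, 0, 2))
```

Each chunk now fits into a single STM slot, and the comments act as beacons when re-reading.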

Writing Code

Writing is for re-reading

Well-scoped code supports chunking.

Names

Good names help activate knowledge from your LTM.

predict(survival_model, 
        type = "survival",  
        time = 2)

time: the time points at which the survival probability is estimated

Bad names can hinder you by activating the wrong knowledge.

predict(survival_model, 
        type = "survival",  
        eval_time = 2)

Make (re)thinking names a separate step to avoid overloading your working memory.

Bad names are linguistic anti-patterns; code smells are structural anti-patterns.

Design patterns

Design patterns are reusable solutions to common problems.

Many arguments

my_fun <- function(x, 
                   y,
                   opt1 = 1,
                   opt2 = 2, 
                   opt3 = 3, 
                   opt4 = 4) {
  ...  
}

Many arguments, revisited

my_fun <- function(x, y, options = my_fun_opts()) {
  ...
}

my_fun_opts <- function(opt1 = 1, opt2 = 2, opt3 = 3, opt4 = 4) {
  list(
    opt1 = opt1,
    opt2 = opt2,
    opt3 = opt3, 
    opt4 = opt4
  ) 
}
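
Many arguments, in use

A runnable sketch of the pattern above. The body of my_fun() is elided on the slide, so the one below is a hypothetical stand-in that only shows how the options travel as a single chunk.

```r
# Constructor that gathers all optional settings in one place
my_fun_opts <- function(opt1 = 1, opt2 = 2, opt3 = 3, opt4 = 4) {
  list(
    opt1 = opt1,
    opt2 = opt2,
    opt3 = opt3,
    opt4 = opt4
  )
}

# Hypothetical body: the real one is not shown in the talk
my_fun <- function(x, y, options = my_fun_opts()) {
  x + y + options$opt1
}

# Callers who are happy with the defaults only see two arguments
my_fun(1, 2)
#> [1] 4

# Changing one option does not require restating the others
my_fun(1, 2, options = my_fun_opts(opt1 = 10))
#> [1] 13
```

The reader of my_fun() now has three things to hold in mind (x, y, options) instead of six, and the details live in my_fun_opts() for when they are needed.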

Design patterns can help lower the cognitive load.

Having a mental model of how your brain works helps you to work with it, not against it.

If you want your code to grow in complexity,
you need to keep (re-)chunking.

Keynote at LatinR 2023

Full talk at https://hfrick.github.io/2023-latinr/