The lwc2022 package includes a small example dataset called
cog_data. The dataset simulates cognitive scores following
the methodology used in the Health and Retirement Study
(HRS), focusing on word recall, serial
subtraction, and backwards counting. These cognitive tasks are the core
of the Langa-Weir classification system used to assess cognitive
function.
The simulated dataset contains 10 observations and follows the
structure expected by the functions in the package
(extract(), score(), and
classify()). Below, we detail the steps taken to simulate
the dataset.
The cog_data dataset contains 35 variables. A summary of
its structure is presented below:
# Load the package
library(lwc2022)
# Load the example dataset
data(cog_data)
# Display the structure of cog_data
str(cog_data)
#> 'data.frame':    10 obs. of  35 variables:
#>  $ HHID    : int  288941 234057 224021 785284 326317 465208 748794 293626 669691 689448
#>  $ PN      : int  93 99 72 26 7 42 9 83 36 78
#>  $ SD182M1 : num  17 53 39 63 12 15 32 52 55 7
#>  $ SD182M2 : num  9 51 10 23 27 99 63 7 63 27
#>  $ SD182M3 : num  32 38 25 34 29 5 8 12 13 18
#>  $ SD182M4 : num  33 67 27 25 38 21 15 51 57 26
#>  $ SD182M5 : num  99 31 16 62 30 6 53 8 22 22
#>  $ SD182M6 : num  39 31 58 17 64 60 59 34 4 13
#>  $ SD182M7 : num  5 64 61 25 62 22 25 32 56 25
#>  $ SD182M8 : num  23 35 40 58 30 12 31 67 56 30
#>  $ SD182M9 : num  35 14 29 32 7 3 23 64 96 15
#>  $ SD182M10: num  21 37 8 61 10 60 52 54 34 10
#>  $ SD183M1 : num  22 12 20 56 17 56 64 35 40 56
#>  $ SD183M2 : num  61 30 15 24 59 23 53 7 29 15
#>  $ SD183M3 : num  23 26 38 56 32 7 27 52 5 6
#>  $ SD183M4 : num  16 24 32 21 65 11 36 54 56 99
#>  $ SD183M5 : num  19 25 39 64 26 9 7 34 58 13
#>  $ SD183M6 : num  19 66 62 57 39 4 1 40 30 30
#>  $ SD183M7 : num  62 25 16 24 64 11 58 20 40 3
#>  $ SD183M8 : num  29 36 62 54 22 59 52 98 20 11
#>  $ SD183M9 : num  67 65 8 56 21 55 2 53 13 56
#>  $ SD183M10: num  6 67 8 54 32 96 36 55 14 63
#>  $ SD142   : int  96 90 97 97 99 98 97 91 94 98
#>  $ SD143   : int  86 86 89 90 80 98 89 92 90 90
#>  $ SD144   : int  89 76 89 78 78 74 83 83 75 70
#>  $ SD145   : int  69 76 76 66 68 79 65 77 76 64
#>  $ SD146   : int  69 52 63 50 51 53 59 50 54 57
#>  $ SD124   : int  0 0 0 0 1 1 0 1 0 0
#>  $ SD129   : int  0 1 0 0 0 1 0 0 1 0
#>  $ SD237WA : num  -8 -8 -9 1 0 0 0 1 0 1
#>  $ SD237WC : int  13 17 3 18 2 5 12 13 10 6
#>  $ SD237WT : int  42 42 38 60 48 16 35 36 27 27
#>  $ SD238WA : num  -8 0 -8 -8 -8 -9 1 -8 -8 -8
#>  $ SD238WC : int  9 7 9 4 2 12 9 11 7 13
#>  $ SD238WT : int  37 43 33 19 12 34 21 17 12 30
The dataset contains variables for individual identifiers, cognition-related tasks (immediate/delayed word recall, serial subtraction, and backwards counting), and other variables necessary for scoring and classification.
- HHID: A unique household identifier.
- PN: A person number that, together with HHID, identifies a respondent.
- SD182M1-SD182M10: Responses for the Immediate Word Recall task.
- SD183M1-SD183M10: Responses for the Delayed Word Recall task.
- SD142-SD146: Responses for the Serial Subtraction task, where participants are asked to subtract 7 from 100 iteratively five times.
- SD124 and SD129: Responses for the Backwards Counting task, where participants count backwards from 20. SD124 represents the first attempt, and SD129 represents the second attempt.
- SD237WA, SD237WC, SD237WT and SD238WA, SD238WC, SD238WT: Responses to a mouse clicking test measuring accuracy, click counts, and click time.
The generate_example_data() function generates a dataset
of size \(n\) (default \(n = 10\)), producing a set of
cognitive test variables along with unique identifiers. The output
dataset is structured similarly to the cognitive assessment data
collected in the HRS.
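The variable groups described above add up to 2 + 10 + 10 + 5 + 2 + 6 = 35 columns. As a rough sketch, the expected column names can be rebuilt from their naming pattern and used to check a data frame's layout. The names `expected_names` and `has_expected_layout()` are hypothetical helpers introduced only for this illustration; they are not part of lwc2022.

```r
# Expected column layout, rebuilt from the variable naming pattern
expected_names <- c(
  "HHID", "PN",                          # identifiers
  paste0("SD182M", 1:10),                # immediate word recall
  paste0("SD183M", 1:10),                # delayed word recall
  paste0("SD", 142:146),                 # serial subtraction
  "SD124", "SD129",                      # backwards counting
  paste0("SD237W", c("A", "C", "T")),    # mouse clicking test, first set
  paste0("SD238W", c("A", "C", "T"))     # mouse clicking test, second set
)
length(expected_names)  # 35

# Hypothetical helper (an assumption, not an lwc2022 function):
# TRUE when `df` is a data frame with exactly these columns, in this order
has_expected_layout <- function(df) {
  is.data.frame(df) && identical(names(df), expected_names)
}
```

With the package loaded, `has_expected_layout(cog_data)` would be one way to confirm that a dataset has the column layout described here before passing it on.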
# Simulated dataset
generate_example_data <- function(n = 10) {
  data.frame(
    # Identifiers
    HHID = sample(100000:999999, n, replace = TRUE),   # Random household ID
    PN = sample(1:99, n, replace = TRUE),              # Random person number
    # THESE ARE THE VARIABLES USED IN THE LW CLASSIFICATIONS
    # Immediate word recall (10 items)
    SD182M1 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    SD182M2 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    SD182M3 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    SD182M4 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    SD182M5 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    SD182M6 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    SD182M7 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    SD182M8 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    SD182M9 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    SD182M10 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    # Delayed word recall (10 items)
    SD183M1 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    SD183M2 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    SD183M3 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    SD183M4 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    SD183M5 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    SD183M6 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    SD183M7 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    SD183M8 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    SD183M9 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    SD183M10 = sample(c(1:40, 51:67, 96, 98, 99), n, replace = TRUE),
    # Serial subtraction (Subtracting 7 from 100 five times)
    SD142 = sample(90:100, n, replace = TRUE),  # First subtraction value
    SD143 = sample(80:99, n, replace = TRUE),   # Second subtraction
    SD144 = sample(70:89, n, replace = TRUE),   # Third subtraction
    SD145 = sample(60:79, n, replace = TRUE),   # Fourth subtraction
    SD146 = sample(50:69, n, replace = TRUE),   # Fifth subtraction
    # Backwards counting
    SD124 = sample(0:1, n, replace = TRUE),  # Success on first try (1 = success, 0 = fail)
    SD129 = sample(0:1, n, replace = TRUE),  # Success on second try (1 = success, 0 = fail)
    # RANDOM VARIABLES NOT USED IN LW CLASSIFICATIONS
    # Speed Test (Mouse clicking)
    SD237WA = sample(c(0, 1, -8, -9), n, replace = TRUE),
    SD237WC = sample(c(0, 1, -8, -9), n, replace = TRUE),
    SD237WT = sample(c(0, 1, -8, -9), n, replace = TRUE),
    SD238WA = sample(c(0, 1, -8, -9), n, replace = TRUE),
    SD238WC = sample(c(0, 1, -8, -9), n, replace = TRUE),
    SD238WT = sample(c(0, 1, -8, -9), n, replace = TRUE)
  )
}
The function returns a data frame with \(n\) rows and the following columns:
set.seed(123)
cog_data <- generate_example_data()
knitr::kable(head(cog_data), caption = "Example of generated cognition data")
| HHID | PN | SD182M1 | SD182M2 | SD182M3 | SD182M4 | SD182M5 | SD182M6 | SD182M7 | SD182M8 | SD182M9 | SD182M10 | SD183M1 | SD183M2 | SD183M3 | SD183M4 | SD183M5 | SD183M6 | SD183M7 | SD183M8 | SD183M9 | SD183M10 | SD142 | SD143 | SD144 | SD145 | SD146 | SD124 | SD129 | SD237WA | SD237WC | SD237WT | SD238WA | SD238WC | SD238WT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 288941 | 93 | 17 | 9 | 32 | 33 | 99 | 39 | 5 | 23 | 35 | 21 | 22 | 61 | 23 | 16 | 19 | 19 | 62 | 29 | 67 | 6 | 96 | 86 | 89 | 69 | 69 | 0 | 0 | -8 | 0 | 0 | -9 | -8 | -8 | 
| 234057 | 99 | 53 | 51 | 38 | 67 | 31 | 31 | 64 | 35 | 14 | 37 | 12 | 30 | 26 | 24 | 25 | 66 | 25 | 36 | 65 | 67 | 90 | 86 | 76 | 76 | 52 | 0 | 1 | -8 | -9 | 0 | -9 | -9 | 1 | 
| 224021 | 72 | 39 | 10 | 25 | 27 | 16 | 58 | 61 | 40 | 29 | 8 | 20 | 15 | 38 | 32 | 39 | 62 | 16 | 62 | 8 | 8 | 97 | 89 | 89 | 76 | 63 | 0 | 0 | -9 | 0 | 1 | -8 | 1 | 0 | 
| 785284 | 26 | 63 | 23 | 34 | 25 | 62 | 17 | 25 | 58 | 32 | 61 | 56 | 24 | 56 | 21 | 64 | 57 | 24 | 54 | 56 | 54 | 97 | 90 | 78 | 66 | 50 | 0 | 0 | 1 | -8 | 0 | -9 | -8 | -9 | 
| 326317 | 7 | 12 | 27 | 29 | 38 | 30 | 64 | 62 | 30 | 7 | 10 | 17 | 59 | 32 | 65 | 26 | 39 | 64 | 22 | 21 | 32 | 99 | 80 | 78 | 68 | 51 | 1 | 0 | 0 | -8 | 1 | -8 | -8 | 1 | 
| 465208 | 42 | 15 | 99 | 5 | 21 | 6 | 60 | 22 | 12 | 3 | 60 | 56 | 23 | 7 | 11 | 9 | 4 | 11 | 59 | 55 | 96 | 98 | 98 | 74 | 79 | 53 | 1 | 1 | 0 | 1 | 1 | -8 | -8 | 1 |
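To make the serial subtraction columns (SD142 to SD146) more concrete, the sketch below compares the first three rows of the table above with the exact answers to subtracting 7 from 100 five times. It only illustrates the arithmetic of the task. The scoring rule applied by score() is not shown in this section and may differ (for example, it could judge each step relative to the previous response), so the comparison used here is an assumption.

```r
# Exact answers when subtracting 7 from 100 five times: 93, 86, 79, 72, 65
expected <- 100 - 7 * seq_len(5)

# Serial subtraction responses copied from the first three rows of the table
responses <- data.frame(
  SD142 = c(96, 90, 97),
  SD143 = c(86, 86, 89),
  SD144 = c(89, 76, 89),
  SD145 = c(69, 76, 76),
  SD146 = c(69, 52, 63)
)

# Compare each column with its exact answer (assumed rule, see text above)
is_correct <- sweep(as.matrix(responses), 2, expected, FUN = "==")

# Number of exactly correct answers per simulated respondent
n_correct <- rowSums(is_correct)
n_correct
```

Because the simulated responses are drawn independently from wide ranges, few of them match the exact answers, which is expected for random example data.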