New family of association measure functions for contingency
tables: assoc_measures(), contingency_coef(),
gamma_gk(), goodman_kruskal_tau(),
kendall_tau_b(), kendall_tau_c(),
lambda_gk(), phi(), somers_d(),
uncertainty_coef(), and yule_q(). Each returns
a numeric scalar by default; pass detail = TRUE for a named
vector with estimate, confidence interval, and p-value.
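To make the ordinal measures concrete, here is a minimal base-R sketch of Goodman and Kruskal's gamma, computed from concordant and discordant pairs. It is not the package's implementation: gamma_sketch() is a hypothetical helper written only for illustration, and it returns the point estimate without the confidence interval or p-value that detail = TRUE adds.

```r
# Hypothetical helper: Goodman-Kruskal gamma for a table of counts whose
# rows and columns are both in ascending category order.
gamma_sketch <- function(tab) {
  tab <- as.matrix(tab)
  nr <- nrow(tab)
  nc <- ncol(tab)
  conc <- 0
  disc <- 0
  for (i in seq_len(nr)) {
    for (j in seq_len(nc)) {
      # Cells below and to the right form concordant pairs with cell (i, j).
      if (i < nr && j < nc) {
        conc <- conc + tab[i, j] * sum(tab[(i + 1):nr, (j + 1):nc])
      }
      # Cells below and to the left form discordant pairs with cell (i, j).
      if (i < nr && j > 1) {
        disc <- disc + tab[i, j] * sum(tab[(i + 1):nr, 1:(j - 1)])
      }
    }
  }
  (conc - disc) / (conc + disc)
}

# Toy 2x2 table (illustrative counts only).
tab <- matrix(c(30, 20,
                10, 40), nrow = 2, byrow = TRUE)
g <- gamma_sketch(tab)

# For a 2x2 table, gamma coincides with Yule's Q = (ad - bc) / (ad + bc).
q <- (30 * 40 - 20 * 10) / (30 * 40 + 20 * 10)
```

On a 2x2 table the two quantities agree, which is why Yule's Q is usually described as the 2x2 special case of gamma.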
cross_tab() gains assoc_measure and
assoc_ci arguments. When both variables are ordered
factors, it automatically selects Kendall’s Tau-b instead of Cramer’s V.
The note format changes from Chi-2: 18.0 (df = 4) to
Chi-2(4) = 18.0. Numeric attributes (chi2,
df, p_value, assoc_measure,
assoc_value, assoc_result) are now attached to
the output data frame.
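As a rough illustration of the quantities behind the new note and attributes, the base-R sketch below computes a chi-squared statistic, its degrees of freedom, and Cramer's V, then formats a note in the new style. The data are made up, and the code only approximates what cross_tab() reports; it does not call the package.

```r
# Toy 3x3 table of counts (illustrative data, not from the package).
tab <- matrix(c(20, 10,  5,
                10, 20, 10,
                 5, 10, 20), nrow = 3, byrow = TRUE)

res  <- chisq.test(tab, correct = FALSE)
chi2 <- unname(res$statistic)
df   <- unname(res$parameter)
n    <- sum(tab)

# Cramer's V: sqrt(chi2 / (n * (min(rows, cols) - 1)))
v <- sqrt(chi2 / (n * (min(dim(tab)) - 1)))

# Note in the new "Chi-2(df) = statistic" style.
note <- sprintf("Chi-2(%d) = %.1f", as.integer(df), chi2)
print(note)
```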
table_apa() now dynamically labels the association
measure column based on the measure used, instead of always showing
“Cramer’s V”. New assoc_measure and assoc_ci
arguments are passed through to cross_tab().
table_apa() gains output = "gt" to
produce a gt_tbl object with APA-style formatting, column
spanners, and alignment.
table_apa() now correctly centers spanner labels
over their column pairs in tinytable and
flextable output.
All association measure functions and
assoc_measures() gain a digits argument
(default 3) that controls the number of decimal places when printed.
P-values are always printed with 3 decimal places, or as < 0.001 when
smaller.
detail = TRUE results now print with formatted
output (aligned columns, fixed decimal places) via a new
print.spicy_assoc_detail() method.
assoc_measures() output uses a new
print.spicy_assoc_table() method with the same
formatting.
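The p-value display rule mentioned above for the digits argument can be sketched in base R as follows. fmt_p() is a hypothetical helper, not a function exported by the package, and the actual print methods may differ in detail.

```r
# Hypothetical helper: fixed 3 decimal places, with a floor at "< 0.001".
fmt_p <- function(p) {
  ifelse(p < 0.001, "< 0.001", formatC(p, format = "f", digits = 3))
}

formatted <- fmt_p(c(0.0004, 0.0321, 0.5))
print(formatted)
```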
New bundled dataset sochealth: a simulated
social-health survey (n = 1200, 24 variables) with variable labels,
ordered factors, survey weights, and missing values. Includes four
Likert-scaled life satisfaction items (life_sat_health,
life_sat_work, life_sat_relationships,
life_sat_standard) for demonstrating mean_n(),
sum_n(), and count_n().
count_n() now correctly counts NA
values when count = NA and strict = TRUE are
both used. List columns are now reported in verbose mode instead of
causing silent errors.
cross_tab() rescale logic now operates on complete
cases only, so the weighted total N matches the unweighted N when
missing values are present (consistent with Stata behavior).
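The idea behind rescaling on complete cases can be sketched in base R. This assumes the rescaling multiplies each weight by n / sum(w) over the complete cases; that formula is an assumption for illustration and is not taken from the cross_tab() source.

```r
# Toy data with missing values in both variables (illustrative only).
d <- data.frame(
  x = c("a", "b", NA, "a", "b", "a"),
  y = c("u", "v", "u", NA, "v", "u"),
  w = c(2, 1, 3, 1, 0.5, 1.5)
)

# Keep complete cases first, then rescale the weights on that subset only.
cc <- d[complete.cases(d), ]
cc$w_rescaled <- cc$w * nrow(cc) / sum(cc$w)

# The weighted table total now equals the number of complete cases.
tab <- xtabs(w_rescaled ~ x + y, data = cc)
print(sum(tab))
print(nrow(cc))
```

Rescaling before dropping incomplete rows would instead leave the weighted total below or above the unweighted N, depending on the weights of the dropped rows.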
freq() now uses true NA consistently
(instead of the "<NA>" string) in both weighted and
unweighted paths. cum_valid_prop is now correctly
NA for missing rows. Invalid digits and
sort values are rejected with clear error
messages.
mean_n() and sum_n() now validate
min_valid and digits arguments, rejecting
non-numeric, negative, or multi-element values.
mean_n(), sum_n(), and
count_n() no longer trigger a tidyselect deprecation
warning when select receives a character vector. Character
vectors are now automatically wrapped with
all_of().
table_apa() now preserves the original factor level
order in row variables instead of sorting alphabetically. When
drop_na = FALSE, the (Missing) category is
placed at the bottom of each variable’s levels.
percent_digits, p_digits, and
v_digits are now validated.
table_apa() p-values no longer wrap across lines in
tinytable HTML output.
cramer_v() now accepts a detail argument.
By default it returns a numeric scalar (as before). Pass
detail = TRUE to get a 4-element named vector
(estimate, ci_lower, ci_upper,
p_value), or detail = TRUE, conf_level = NULL
for a 2-element vector (estimate, p_value)
without CI.
table_apa() helper to build APA-ready cross-tab
reports with multiple output formats (wide,
long, tinytable, flextable,
excel, clipboard, word).
table_apa() exposes key cross_tab()
controls for weighting and inference (weights,
rescale, correct, simulate_p,
simulate_B) and now handles missing values explicitly when
drop_na = FALSE.
count_n() no longer crashes when
special = "NaN" is used with non-numeric columns. Passing
count = NA now errors with a message directing to
special = "NA".
cross_tab() fixes a spurious rescale warning for
explicit all-ones weights and aligns the Cramer’s V formula with
cramer_v().
table_apa() no longer leaks global options on error.
The simulate_p default is aligned to
FALSE.
varlist() title generation no longer crashes on
unrecognizable expressions.
copy_clipboard() parameter message renamed
to show_message.
freq() now dispatches printing correctly via S3.
Removed collapse and stringi from
Imports.
cross_tab() hardening: improved vector-mode detection
(including labelled vectors), stricter weight validation, safer
rescaling, and clearer early errors (e.g., explicit
y = NULL).
cross_tab() statistics are now computed on non-empty
margins in grouped tables, avoiding spurious NA results;
internal core path refactored to remove
dplyr/tibble from computation while preserving
user-facing behavior.
freq() now errors clearly when x is
missing for data.frame input and validates rescaling when weight sums
are zero/non-finite.
count_n(), mean_n(), and
sum_n() regex mode is hardened (regex = TRUE
now validates/defaults select safely).
mean_n() and sum_n() now return
NA (with warning) when no numeric columns are
selected.
label_from_names() now validates input type
(data.frame/tibble required).
cramer_v() now returns NA with warning for
degenerate tables.
DT and clipr
moved to Suggests; optional runtime checks added in
code_book() and copy_clipboard().
Print methods have been fully redesigned to produce clean, aligned ASCII tables inspired by Stata’s layout. The new implementation improves formatting, adds optional color support, and provides more consistent handling of totals and column spacing.
Output from freq() and cross_tab() now
benefits from the enhanced print.spicy() formatting,
offering clearer, more readable summary tables.
Documentation and internal tests were updated for clarity and consistency.
cross_tab() gains an explicit correct
argument to control the use of Yates’ continuity correction for
Chi-squared tests in 2x2 tables. The default behavior remains
unchanged.
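For readers unfamiliar with the correction, the base-R sketch below shows its effect on a 2x2 table using chisq.test(), whose correct argument plays the same role. The counts are made up, and the example does not call cross_tab() itself.

```r
# Toy 2x2 table of counts (illustrative data only).
tab <- matrix(c(12, 5,
                 7, 9), nrow = 2, byrow = TRUE)

with_yates    <- chisq.test(tab, correct = TRUE)
without_yates <- chisq.test(tab, correct = FALSE)

# Yates' correction shrinks each |observed - expected| by 0.5 before
# squaring, so the corrected statistic is smaller and the p-value larger.
print(unname(with_yates$statistic))
print(unname(without_yates$statistic))
```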
The documentation of cross_tab() was refined and
harmonized, with a clearer high-level description, improved parameter
wording, and expanded examples.
Minor cosmetic improvements were made to varlist()
output: the title prefix now uses vl: instead of
VARLIST, and the column name Ndist_val was
renamed to N_distinct for improved readability and
consistency.
Minor cosmetic improvement: ASCII table output no longer includes a closing bottom rule by default.
code_book(), which generates a
comprehensive variable codebook that can be viewed interactively and
exported to multiple formats (copy, print, CSV, Excel, PDF).
label_from_names() now correctly handles edge cases
when the separator appears in the label or is missing.
label_from_names() to derive and assign
variable labels from headers of the form
"name<sep>label" (e.g. "name. label").
Especially useful for LimeSurvey CSV exports (Export results
-> CSV -> Headings: Question code & question
text), where the default separator is ". ".
varlist()).
freq()), cross-tabulations
(cross_tab()), and Cramer’s V for categorical associations
(cramer_v()).
mean_n()), sums (sum_n()), and counts
(count_n()) with automatic handling of missing data.
copy_clipboard()) directly to the clipboard
for quick export.