Package 'xactonomial' reference manual

Title:	Exact Inference for Real-Valued Functionals of k-Sample Multinomial Parameters
Description:	We consider the k sample multinomial problem where we observe k vectors (possibly of different lengths), each representing an independent multinomial random vector. For a given function psi which takes in the concatenated vector of multinomial probabilities and outputs a real number, this is a Monte Carlo estimation procedure of an exact p-value and confidence interval. The resulting inference is valid even in small samples, and when the parameter is on the boundary, and when the function is nondifferentiable at the parameter value, all situations where asymptotic methods and the bootstrap would fail.
Authors:	Michael C Sachs [aut, cre], Michael P Fay [aut]
Maintainer:	Michael C Sachs <[email protected]>
License:	MIT + file LICENSE
Version:	0.4.1
Built:	2025-02-16 06:01:58 UTC
Source:	https://github.com/sachsmc/xactonomial

xactonomial: Exact Inference for Real-Valued Functionals of k-Sample Multinomial Parameters

Description

We consider the k sample multinomial problem where we observe k vectors (possibly of different lengths), each representing an independent multinomial random vector. For a given function psi which takes in the concatenated vector of multinomial probabilities and outputs a real number, this is a Monte Carlo estimation procedure of an exact p-value and confidence interval. The resulting inference is valid even in small samples, and when the parameter is on the boundary, and when the function is nondifferentiable at the parameter value, all situations where asymptotic methods and the bootstrap would fail.

Author(s)

Maintainer: Michael C Sachs [email protected]

Authors:

Michael P Fay

Calculate probability for given parameters

Description

Given a set of candidate parameter vectors, check if the null $\psi \leq \psi_0$ is satisfied, and if so, compute the probability for each element of the sample space

Usage

calc_prob_null(theta_cands, psi, psi0, minus1, SSpacearr, logC, II)
calc_prob_null(theta_cands, psi, psi0, minus1, SSpacearr, logC, II)

Arguments

`theta_cands`	A matrix with samples in the rows and the parameters in the columns
`psi`	The function of interest mapping parameters to the real line
`psi0`	The null boundary for testing psi <= psi0
`minus1`	Either plus or minus 1
`SSpacearr`	A matrix with the sample space for the given size of the problem
`logC`	log multinomial coefficient for each element of the sample space
`II`	logical vector of sample space psi being more extreme than the observed psi

Value

A numeric vector of probabilities

Calculate probability for given parameters

Description

Given a set of candidate parameter vectors, check if the null $\psi \leq \psi_0$ is satisfied, and if so, compute the probability for each element of the sample space

Usage

calc_prob_null2(
  theta_cands,
  psi,
  psi0,
  minus1,
  SSpacearr,
  logC,
  II,
  psi_v = FALSE
)
calc_prob_null2(
  theta_cands,
  psi,
  psi0,
  minus1,
  SSpacearr,
  logC,
  II,
  psi_v = FALSE
)

Arguments

`theta_cands`	A matrix with samples in the rows and the parameters in the columns
`psi`	The function of interest mapping parameters to the real line
`psi0`	The null boundary for testing psi <= psi0
`minus1`	Either plus or minus 1
`SSpacearr`	A matrix with the sample space for the given size of the problem
`logC`	log multinomial coefficient for each element of the sample space
`II`	logical vector of sample space psi being more extreme than the observed psi
`psi_v`	Is psi vectorized by row?

Value

A numeric vector of probabilities

calculate multinomial probabilities

Description

calculate multinomial probabilities

Usage

calc_probs_rust(sar, logt, logc, d, n, nt)
calc_probs_rust(sar, logt, logc, d, n, nt)

Calculate combinations of multinomials

Description

Calculate combinations of multinomials

Usage

combinate(X, Y)
combinate(X, Y)

Arguments

`X`	Matrix 1
`Y`	Matrix 2

Value

A list of arrays

Like combinate but add to existing

Description

Like combinate but add to existing

Usage

combinate2(X, Y)
combinate2(X, Y)

Arguments

`X`	A list as returned by combinate
`Y`	Matrix 2

Value

A list of arrays

Get a matrix of indices for all possible combinations of vectors of lengths

Description

This is basically the same as expand.grid, but faster for integers

Usage

expand_index(lengths)
expand_index(lengths)

Arguments

lengths

A vector with the lengths of each index to expand

Value

A matrix with length(lengths) columns and prod(lengths) rows

Examples

expand_index(c(2, 3, 4))
## the same as
expand.grid(1:2, 1:3, 1:4)

expand_index(c(2, 3, 4))
## the same as
expand.grid(1:2, 1:3, 1:4)

Sample from the unit simplex in d dimensions

Description

Sample from the unit simplex in d dimensions

Usage

get_theta_random(d = 4, nsamp = 75)
get_theta_random(d = 4, nsamp = 75)

Arguments

`d`	the dimension
`nsamp`	the number of samples to take uniformly in the d space

Value

The grid over Theta, the parameter space. A matrix with d columns and nsamp rows

Examples

get_theta_random(3, 10)

get_theta_random(3, 10)

Find the root of the function f

Description

This finds the value $x \in [a, b]$ such that $f(x) = 0$ using the one-dimensional root finding ITP method (Interpolate Truncate Project). Also see itp.

Usage

itp_root(
  f,
  a,
  b,
  k1 = 0.1,
  k2 = 2,
  n0 = 1,
  eps = 0.005,
  maxiter = 100,
  fa = NULL,
  fb = NULL,
  verbose = FALSE,
  ...
)
itp_root(
  f,
  a,
  b,
  k1 = 0.1,
  k2 = 2,
  n0 = 1,
  eps = 0.005,
  maxiter = 100,
  fa = NULL,
  fb = NULL,
  verbose = FALSE,
  ...
)

Arguments

`f`	The function to find the root of in terms of its first (one-dimensional) argument
`a`	The lower limit
`b`	The upper limit
`k1`	A tuning parameter
`k2`	Another tuning parameter
`n0`	Another tuning parameter
`eps`	Convergence tolerance
`maxiter`	Maximum number of iterations
`fa`	The value of f(a), if NULL then will be calculated
`fb`	The value of f(b), if NULL then will be calculated
`verbose`	Prints out information during iteration
`...`	Other arguments passed on to f

Value

A numeric vector of length 1, the root at the last iteration

References

I. F. D. Oliveira and R. H. C. Takahashi. 2020. An Enhancement of the Bisection Method Average Performance Preserving Minmax Optimality. ACM Trans. Math. Softw. 47, 1, Article 5 (March 2021), 24 pages. https://doi.org/10.1145/3423597

Examples

fpoly <- function(x) x^3 - x - 2 ## example from the ITP_method wikipedia entry
itp_root(fpoly, 1, 2, eps = .0001, verbose = TRUE)

fpoly <- function(x) x^3 - x - 2 ## example from the ITP_method wikipedia entry
itp_root(fpoly, 1, 2, eps = .0001, verbose = TRUE)

Calculate log of multinomial coefficient

Description

Calculate log of multinomial coefficient

Usage

log_multinom_coef(x, sumx)
log_multinom_coef(x, sumx)

Arguments

`x`	Vector of observed counts in each cell
`size`	Total count

Value

The log multinomial coefficient

Examples

#' @examples
S0 <- sspace_multinom(4, 6)
S1 <- sspace_multinom(4, 7)
logC0<- apply(S0,1,log_multinom_coef,sumx=6)
logC1<- apply(S1,1,log_multinom_coef,sumx=7)
logC<- outer(logC0,logC1,'+')

#' @examples
S0 <- sspace_multinom(4, 6)
S1 <- sspace_multinom(4, 7)
logC0<- apply(S0,1,log_multinom_coef,sumx=6)
logC1<- apply(S1,1,log_multinom_coef,sumx=7)
logC<- outer(logC0,logC1,'+')

Compute a p value for the test of psi <= psi0 (lower = TRUE) or psi >= psi0 (lower = FALSE)

Description

Compute a p value for the test of psi <= psi0 (lower = TRUE) or psi >= psi0 (lower = FALSE)

Usage

pvalue_psi0(
  psi0,
  psi,
  psi_hat,
  psi_obs,
  maxit,
  chunksize,
  lower = TRUE,
  target,
  SSpacearr,
  logC,
  d_k,
  psi_v = FALSE
)
pvalue_psi0(
  psi0,
  psi,
  psi_hat,
  psi_obs,
  maxit,
  chunksize,
  lower = TRUE,
  target,
  SSpacearr,
  logC,
  d_k,
  psi_v = FALSE
)

Arguments

`psi0`	The null value
`psi`	The function of interest
`psi_hat`	The vector of psi values at each element of the sample space
`psi_obs`	The observed estimate
`maxit`	Maximum iterations
`chunksize`	Chunk size
`lower`	Do a one sided test of the null that it is less than psi0, otherwise greater.
`target`	Stop the algorithm if p >= target (for speed)
`SSpacearr`	The sample space array
`logC`	The log multinomial coefficient
`d_k`	The vector of dimensions
`psi_v`	Is psi vectorized by row?

Value

A p-value

Return a random sample from the d unit simplex

Description

Return a random sample from the d unit simplex

Usage

sample_unit_simplex(d)
sample_unit_simplex(d)

Return a random sample from the d unit simplex

Description

Return a random sample from the d unit simplex

Usage

sample_unit_simplex2(d)
sample_unit_simplex2(d)

Return n random samples from the d unit simplex

Description

Return n random samples from the d unit simplex

Usage

sample_unit_simplexn(d, n)
sample_unit_simplexn(d, n)

Enumerate the sample space of a multinomial

Description

We have $d$ mutually exclusive outcomes and $n$ independent trials. This function enumerates all possible vectors of length $d$ of counts of each outcome for $n$ trials, i.e., the sample space. The result is output as a matrix with $d$ columns where each row represents a possible observation.

Usage

sspace_multinom(d, n)
sspace_multinom(d, n)

Arguments

`d`	Dimension
`n`	Size

Value

A matrix with d columns

Examples

d4s <- sspace_multinom(4, 8)
stopifnot(abs(sum(apply(d4s, 1, dmultinom, prob = rep(.25, 4))) - 1) < 1e-12)

d4s <- sspace_multinom(4, 8)
stopifnot(abs(sum(apply(d4s, 1, dmultinom, prob = rep(.25, 4))) - 1) < 1e-12)

Exact inference for a real-valued function of multinomial parameters

Description

We consider the k sample multinomial problem where we observe k vectors (possibly of different lengths), each representing an independent sample from a multinomial. For a given function psi which takes in the concatenated vector of multinomial probabilities and outputs a real number, we are interested in constructing a confidence interval for psi.

Usage

xactonomial(
  psi,
  data,
  psi0 = NULL,
  alternative = c("two.sided", "less", "greater"),
  alpha = 0.05,
  psi_limits,
  maxit = 50,
  chunksize = 500,
  conf.int = TRUE,
  psi_is_vectorized = FALSE
)
xactonomial(
  psi,
  data,
  psi0 = NULL,
  alternative = c("two.sided", "less", "greater"),
  alpha = 0.05,
  psi_limits,
  maxit = 50,
  chunksize = 500,
  conf.int = TRUE,
  psi_is_vectorized = FALSE
)

Arguments

`psi`	Function that takes in a vector of parameters and outputs a real valued number
`data`	A list with k elements representing the vectors of counts of a k-sample multinomial
`psi0`	The null hypothesis value for the parameter being tested. A p value for a test of psi <= psi0 is computed. If NULL only a confidence interval is computed.
`alternative`	a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less"
`alpha`	A 1 - alpha percent confidence interval will be computed
`psi_limits`	A vector of length 2 giving the lower and upper limits of the range of $\psi(\theta)$
`maxit`	Maximum number of iterations of the stochastic procedure
`chunksize`	The number of samples taken from the parameter space at each iteration
`conf.int`	Logical. If FALSE, no confidence interval is calculated, only the p-value.
`psi_is_vectorized`	Logical. If TRUE, expect that psi can take a matrix as input, and return a vector of length the number of rows, computing the statistic for each row of the matrix. If possible, this will substantially speed up the computation. See examples.

Details

Let $T_j$ be distributed $\mbox{Multinomial}_{d_j}(\boldsymbol{\theta}_j, n_j)$ for $j = 1, \ldots, k$ and denote $\boldsymbol{T} = (T_1, \ldots, T_k)$ and $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_k)$ . The subscript $d_j$ denotes the dimension of the multinomial. Suppose one is interested in the parameter $\psi(\boldsymbol{\theta}) \in \Theta \subseteq \mathbb{R}$ . Given a sample of size $n$ from $\boldsymbol{T}$ , one can estimate $\boldsymbol{\theta}$ with the sample proportions as $\hat{\boldsymbol{\theta}}$ and hence $\pi(\hat{\boldsymbol{\theta}})$ . This function constructs a $1 - \alpha$ percent confidence interval for $\psi(\boldsymbol{\theta})$ and provides a function to calculate a p value for a test of the null hypothesis $H_0: \psi(\boldsymbol{\theta}) \neq \psi_0$ for the two sided case, $H_0: \psi(\boldsymbol{\theta}) \leq \psi_0$ for the case alternative = "greater", and $H_0: \psi(\boldsymbol{\theta}) \geq \psi_0$ for the case alternative = "less". We make no assumptions and do not rely on large sample approximations. The computation is somewhat involved so it is best for small sample sizes.

Value

A list with 3 elements: the estimate, the 1 - alpha percent confidence interval, and p-value

Examples

psi_ba <- function(theta) {
  theta1 <- theta[1:4]
  theta2 <- theta[5:8]
  sum(sqrt(theta1 * theta2))
  }
data <- list(T1 = c(2,1,2,1), T2 = c(0,1,3,3))
xactonomial(psi_ba, data, psi_limits = c(0, 1), maxit = 5, chunksize = 20)

psi_ba_v <- function(theta) {
theta1 <- theta[,1:4, drop = FALSE]
theta2 <- theta[,5:8, drop = FALSE]
rowSums(sqrt(theta1 * theta2))
}
data <- list(T1 = c(2,1,2,1), T2 = c(0,1,3,3))
xactonomial(psi_ba_v, data, psi_limits = c(0, 1), maxit = 5, chunksize = 20, psi_is_vectorized = TRUE)

psi_ba <- function(theta) {
  theta1 <- theta[1:4]
  theta2 <- theta[5:8]
  sum(sqrt(theta1 * theta2))
  }
data <- list(T1 = c(2,1,2,1), T2 = c(0,1,3,3))
xactonomial(psi_ba, data, psi_limits = c(0, 1), maxit = 5, chunksize = 20)

psi_ba_v <- function(theta) {
theta1 <- theta[,1:4, drop = FALSE]
theta2 <- theta[,5:8, drop = FALSE]
rowSums(sqrt(theta1 * theta2))
}
data <- list(T1 = c(2,1,2,1), T2 = c(0,1,3,3))
xactonomial(psi_ba_v, data, psi_limits = c(0, 1), maxit = 5, chunksize = 20, psi_is_vectorized = TRUE)

Package 'xactonomial'

Help Index

xactonomial: Exact Inference for Real-Valued Functionals of k-Sample Multinomial Parameters

Description

Author(s)

See Also

Calculate probability for given parameters

Description

Usage

Arguments

Value

Calculate probability for given parameters

Description

Usage

Arguments

Value

calculate multinomial probabilities

Description

Usage

Calculate combinations of multinomials

Description

Usage

Arguments

Value

Like combinate but add to existing

Description

Usage

Arguments

Value

Get a matrix of indices for all possible combinations of vectors of lengths

Description

Usage

Arguments

Value

Examples

Sample from the unit simplex in d dimensions

Description

Usage

Arguments

Value

Examples

Find the root of the function f

Description

Usage

Arguments

Value

References

Examples

Calculate log of multinomial coefficient

Description

Usage

Arguments

Value

Examples

Compute a p value for the test of psi <= psi0 (lower = TRUE) or psi >= psi0 (lower = FALSE)

Description

Usage

Arguments

Value

Return a random sample from the d unit simplex

Description

Usage

Return a random sample from the d unit simplex

Description

Usage

Return n random samples from the d unit simplex

Description

Usage

Enumerate the sample space of a multinomial

Description

Usage

Arguments

Value

Examples

Exact inference for a real-valued function of multinomial parameters

Description

Usage

Arguments

Details

Value