diffxpy.api.test.lrt

diffxpy.api.test.lrt(data: Union[anndata._core.anndata.AnnData, anndata._core.raw.Raw, numpy.ndarray, scipy.sparse.csr.csr_matrix, batchglm.models.base.input.InputDataBase], full_formula_loc: str, reduced_formula_loc: str, full_formula_scale: str = '~1', reduced_formula_scale: str = '~1', as_numeric: Union[List[str], Tuple[str], str] = (), init_a: Union[numpy.ndarray, str] = 'AUTO', init_b: Union[numpy.ndarray, str] = 'AUTO', gene_names: Optional[Union[numpy.ndarray, list]] = None, sample_description: Optional[pandas.core.frame.DataFrame] = None, noise_model='nb', size_factors: Optional[Union[numpy.ndarray, pandas.core.series.Series]] = None, batch_size: Union[None, int, Tuple[int, int]] = None, backend: str = 'numpy', train_args: dict = {}, training_strategy: Union[str, List[Dict[str, object]], Callable] = 'AUTO', quick_scale: bool = False, dtype='float64', **kwargs)

Perform a log-likelihood ratio test for differential expression on each gene.

Note that lrt() does not support constraints in its current form. Please use wald() for constraints.
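
As background on what a likelihood ratio test compares: in the textbook form, the statistic is twice the gap between the maximized log-likelihoods of the full and reduced models, and the p-value comes from a chi-square distribution whose degrees of freedom equal the number of extra parameters in the full model. The following is a minimal standard-library sketch of that calculation, not diffxpy's internal code; the function names and the example log-likelihood values are hypothetical.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Survival function P(X > x) of a chi-square distribution with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Base cases of the regularized upper incomplete gamma function Q(a, half):
    # Q(1/2, half) = erfc(sqrt(half)) for odd df, Q(1, half) = exp(-half) for even df.
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(half))
    else:
        a, q = 1.0, math.exp(-half)
    # Recurrence: Q(a + 1, half) = Q(a, half) + half**a * exp(-half) / Gamma(a + 1).
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return min(q, 1.0)


def lrt_pvalue(ll_full: float, ll_reduced: float, df: int) -> float:
    """p-value of the likelihood ratio statistic 2 * (ll_full - ll_reduced)."""
    # The full model nests the reduced one, so the gap is non-negative
    # up to numerical noise; clip at zero to guard against that noise.
    statistic = max(2.0 * (ll_full - ll_reduced), 0.0)
    return chi2_sf(statistic, df)


# Hypothetical per-gene log-likelihoods; the full model has one extra parameter.
p = lrt_pvalue(ll_full=-100.0, ll_reduced=-102.0, df=1)
print(f"p-value: {p:.4f}")  # p-value: 0.0455
```

The test is only meaningful when the reduced model is nested in the full model, which is why both a full and a reduced formula are required below.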

Parameters

data – Input data matrix (observations x features) or (cells x genes).
full_formula_loc – Full model formula for the location parameter model.
reduced_formula_loc – Reduced model formula for the location parameter model.
full_formula_scale – Full model formula for the scale parameter model.
reduced_formula_scale – Reduced model formula for the scale parameter model.
as_numeric – Which columns of sample_description to treat as numeric rather than categorical. This yields columns in the design matrix which do not correspond to one-hot encoded discrete factors. This makes sense for covariates such as number of genes, time, pseudotime or space.
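
To illustrate the difference between a one-hot encoded factor and a numeric column, the sketch below builds design-matrix rows for one covariate in both ways, using only the standard library. It does not reproduce how diffxpy constructs its design matrices; the helper names, the treatment coding with an intercept, and the toy `time` values are assumptions made for illustration.

```python
from typing import List, Sequence


def one_hot_columns(values: Sequence) -> List[List[int]]:
    """Categorical treatment: intercept plus one indicator per non-reference level."""
    levels = sorted(set(values))
    # The first level is absorbed into the intercept (treatment coding, assumed here).
    return [[1] + [int(v == lvl) for lvl in levels[1:]] for v in values]


def numeric_column(values: Sequence[float]) -> List[List[float]]:
    """Numeric treatment: intercept plus a single slope column."""
    return [[1.0, float(v)] for v in values]


time = [0, 6, 12, 24]  # hypothetical sampling time points
categorical_design = one_hot_columns(time)
numeric_design = numeric_column(time)

print(categorical_design)  # [[1, 0, 0, 0], [1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 1]]
print(numeric_design)      # [[1.0, 0.0], [1.0, 6.0], [1.0, 12.0], [1.0, 24.0]]
```

With four time points, the categorical encoding spends three coefficients on the covariate, while the numeric encoding spends one and assumes a linear trend.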
init_a –
(Optional) Low-level initial values for a. Can be:
- str:
  - “auto”: automatically choose the best initialization
  - “standard”: initialize the intercept with the observed mean
  - “init_model”: initialize with another model (see init_model parameter)
  - “closed_form”: try to initialize with a closed form
- np.ndarray: direct initialization of ‘a’
init_b –
(Optional) Low-level initial values for b. Can be:
- str:
  - “auto”: automatically choose the best initialization
  - “standard”: initialize with zeros
  - “init_model”: initialize with another model (see init_model parameter)
  - “closed_form”: try to initialize with a closed form
- np.ndarray: direct initialization of ‘b’
gene_names – Optional list/array of gene names, used if data does not implicitly store them.
sample_description – Optional pandas.DataFrame containing sample annotations.
noise_model –
str, noise model to use in the model-based test. Possible options:
- ‘nb’: default
size_factors – 1D array of transformed library size factors for each cell, in the same order as in data, or a string naming the column of sample_description that contains the size factors.
batch_size –
Argument controlling the memory load of the fitting procedure. For backends that allow chunking of operations, this parameter controls the size of the batch / chunk.
- If backend is “tf1” or “tf2”: number of observations per batch
- If backend is “numpy”: Tuple of (number of observations per chunk, number of genes per chunk)
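
To make the tuple form concrete, the sketch below enumerates the (observation, gene) blocks that a chunked pass over a matrix would visit for a given `batch_size`. This is an illustration of the idea only; `iter_chunks` is a hypothetical helper, and the order and mechanics of chunking inside the actual backends may differ.

```python
from typing import Iterator, Tuple


def iter_chunks(n_obs: int, n_genes: int,
                batch_size: Tuple[int, int]) -> Iterator[Tuple[slice, slice]]:
    """Yield (observation slice, gene slice) pairs covering an n_obs x n_genes matrix."""
    obs_chunk, gene_chunk = batch_size
    for obs_start in range(0, n_obs, obs_chunk):
        for gene_start in range(0, n_genes, gene_chunk):
            # The last chunk along each axis may be smaller than the requested size.
            yield (slice(obs_start, min(obs_start + obs_chunk, n_obs)),
                   slice(gene_start, min(gene_start + gene_chunk, n_genes)))


chunks = list(iter_chunks(n_obs=5, n_genes=4, batch_size=(2, 3)))
print(len(chunks))  # 3 observation chunks x 2 gene chunks = 6
print(chunks[-1])   # (slice(4, 5, None), slice(3, 4, None))
```

Smaller chunk sizes lower the peak memory needed per step at the cost of more steps.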
backend –
Which linear algebra library to choose. This impacts the available noise models and optimizers / training strategies. Available are:
- “numpy”: numpy
- “tf1”: tensorflow1.* >= 1.13
- “tf2”: tensorflow2.*
training_strategy –
{str, function, list} training strategy to use. Can be:
- str: will use Estimator.TrainingStrategy[training_strategy] to train
- function: can be used to implement a custom training function, which will be called as training_strategy(estimator).
- list of keyword dicts containing method arguments: will call Estimator.train() once with each dict of method arguments.
  
  Example:
```
[
  {"learning_rate": 0.5, },
  {"learning_rate": 0.05, },
]
```
  This will run training first with learning rate = 0.5 and then with learning rate = 0.05.
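
The function and list forms described above can be sketched as follows. `ToyEstimator` and `run_training` are hypothetical stand-ins written for this illustration, not classes or functions from diffxpy or batchglm, and the string form (a lookup in Estimator.TrainingStrategy) is left out.

```python
from typing import Any, Callable, Dict, List, Union


class ToyEstimator:
    """Hypothetical estimator that only records the arguments of each train() call."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, Any]] = []

    def train(self, **kwargs: Any) -> None:
        self.calls.append(kwargs)


def run_training(estimator: ToyEstimator,
                 strategy: Union[List[Dict[str, Any]], Callable[[ToyEstimator], None]]) -> None:
    if callable(strategy):
        # Function form: the callable receives the estimator and drives training itself.
        strategy(estimator)
    else:
        # List form: one train() call per dict of keyword arguments, in order.
        for train_kwargs in strategy:
            estimator.train(**train_kwargs)


estimator = ToyEstimator()
run_training(estimator, [{"learning_rate": 0.5}, {"learning_rate": 0.05}])
print(estimator.calls)  # [{'learning_rate': 0.5}, {'learning_rate': 0.05}]
```

A schedule like this takes coarse steps first and then refines with a smaller learning rate.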
quick_scale –
Depending on the optimizer, scale will be fitted faster and possibly less accurately.

Useful in scenarios where fitting the exact scale is not absolutely necessary.
dtype –
Allows specifying the precision which should be used to fit data.

Should be “float32” for single precision or “float64” for double precision.
kwargs – [Debugging] Additional arguments will be passed to the _fit method.