% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/keep_identify_abundant.R
\docType{methods}
\name{keep_abundant}
\alias{keep_abundant}
\title{Filter to keep only abundant transcripts/genes}
\usage{
keep_abundant(
  .data,
  abundance = assayNames(.data)[1],
  design = NULL,
  formula_design = NULL,
  minimum_counts = 10,
  minimum_proportion = 0.7,
  minimum_count_per_million = NULL,
  factor_of_interest = NULL,
  ...,
  .abundance = NULL
)
}
\arguments{
\item{.data}{A `tbl` or `SummarizedExperiment` object containing transcript/gene abundance data}

\item{abundance}{The name of the transcript/gene abundance column (character, preferred)}

\item{design}{A design matrix for more complex experimental designs. If provided, this is passed to filterByExpr instead of factor_of_interest.}

\item{formula_design}{A formula for creating the design matrix}

\item{minimum_counts}{The minimum count threshold for a feature to be considered abundant}

\item{minimum_proportion}{The minimum proportion of samples in which a feature must be abundant}

\item{minimum_count_per_million}{The minimum count per million threshold}

\item{factor_of_interest}{The name of the column containing groups/conditions for filtering. DEPRECATED: Use 'design' or 'formula_design' instead.}

\item{...}{Further arguments.}

\item{.abundance}{DEPRECATED. The name of the transcript/gene abundance column (symbolic, for backward compatibility)}
}
\value{
Returns a filtered version of the input object containing only the features that passed
the abundance threshold criteria.
}
\description{
Filters the data to keep only transcripts/genes that are consistently expressed above
a threshold across samples. This is the filtering counterpart of identify_abundant():
instead of adding an .abundant column that marks each feature, it removes the
low-abundance features directly.
}
\details{
\lifecycle{questioning}


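This function uses edgeR's filterByExpr() to identify and keep consistently expressed features.
The minimum_counts threshold is converted to a counts-per-million (CPM) cutoff using the
median library size, and a feature is kept if its CPM reaches that cutoff in enough samples.
The required number of samples is derived from the smallest experimental group (defined by
design, formula_design, or the deprecated factor_of_interest); in filterByExpr(),
minimum_proportion only changes that number when the smallest group is large
(more than 10 samples with the default settings).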

}
\examples{
## Load airway dataset for examples
data('airway', package = 'airway')

# Ensure a 'condition' column exists for the examples that expect it
SummarizedExperiment::colData(airway)$condition <- SummarizedExperiment::colData(airway)$dex


# Basic usage
airway |> keep_abundant()

# With custom thresholds
airway |> keep_abundant(
  minimum_counts = 5,
  minimum_proportion = 0.5
)
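
# What the custom thresholds mean, sketched with a direct edgeR call.
# Assumption (not confirmed by this page): minimum_counts and
# minimum_proportion correspond to filterByExpr()'s min.count and min.prop,
# and grouping by the 'dex' column is only an illustrative choice.
counts <- SummarizedExperiment::assay(airway)
keep <- edgeR::filterByExpr(
  counts,
  group = SummarizedExperiment::colData(airway)$dex,
  min.count = 5,
  min.prop = 0.5
)
# Features retained by the direct call
airway[keep, ]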

# Using a factor of interest (deprecated; prefer 'design' or 'formula_design')
airway |> keep_abundant(factor_of_interest = condition)
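
# Using a design matrix instead of the deprecated factor_of_interest.
# A sketch only: it assumes a model matrix built from the 'dex' column is an
# acceptable value for 'design', as the argument description suggests.
design_matrix <- stats::model.matrix(
  ~ dex,
  data = as.data.frame(SummarizedExperiment::colData(airway))
)
filtered <- airway |> keep_abundant(design = design_matrix)
filtered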

}
\references{
McCarthy, D. J., Chen, Y., & Smyth, G. K. (2012). Differential expression analysis of
multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Research,
40(10), 4288-4297. DOI: 10.1093/nar/gks042
}
