| Title: | Faster Generation of Quantile Quantile Plots with Large Samples |
|---|---|
| Description: | New and faster implementations for quantile quantile plots. The package also includes a function to prune data for quantile quantile plots. This can drastically reduce the running time for large samples: for 100 million samples, you can expect a speedup by a factor of 80. |
| Authors: | Gudmundur Einarsson [aut, cre], Hafsteinn Einarsson [aut] |
| Maintainer: | Gudmundur Einarsson <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.1.5 |
| Built: | 2026-05-16 08:31:41 UTC |
| Source: | https://github.com/gumeo/fastqq |
This function is not exposed, since we want to hard-code the parameters for simplicity of usage.
drop_dense(x, y, N_hard = 10000)
| Argument | Description |
|---|---|
| x | A numeric vector of sample/theoretical points. |
| y | A numeric vector of theoretical/sample points. |
| N_hard | Desired upper bound on the number of points to plot. |
A data.frame with the pruned o and e as columns.
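The pruning idea can be illustrated with a minimal base-R sketch. It is not the package's implementation: `prune_to_grid` and its `n_cells` parameter are hypothetical names, and the rule used here (keep the first point that lands in each cell of a regular grid) is an assumption chosen for brevity. Unlike `N_hard`, which targets a number of points, this sketch only bounds the result by `n_cells^2`.

```r
# Hypothetical helper, NOT part of fastqq: keep one point per grid cell.
prune_to_grid <- function(x, y, n_cells = 100L) {
  stopifnot(length(x) == length(y), length(x) > 0)
  n_cells <- as.integer(n_cells)
  # Map each coordinate to an integer cell index in 0..(n_cells - 1).
  to_cell <- function(v) {
    rng <- range(v)
    width <- rng[2] - rng[1]
    if (width == 0) return(rep(0L, length(v)))
    pmin(as.integer((v - rng[1]) / width * n_cells), n_cells - 1L)
  }
  # Combine the two cell indices into a single id per point.
  cell_id <- to_cell(x) * n_cells + to_cell(y)
  # Points sharing a cell would overlap on screen; keep the first of each.
  keep <- !duplicated(cell_id)
  data.frame(x = x[keep], y = y[keep])
}

set.seed(1)
n <- 1e5
theoretical <- -log10(stats::ppoints(n))
observed <- -log10(sort(stats::runif(n)))
pruned <- prune_to_grid(theoretical, observed, n_cells = 100L)
nrow(pruned)  # far fewer rows than n
```

Because the points of a Q-Q plot lie close to a curve, most grid cells stay empty and the number of retained points is typically much smaller than the `n_cells^2` bound.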
Creates a quantile-quantile plot from the p-values of an association study,
e.g. a genome-wide association study (GWAS). The data quantiles are compared
with the theoretical quantiles of a uniform distribution.
This code is mostly adapted from the qqman package, but improved
for speed. A graph with a hundred million points should take only a few
seconds to generate.
qq(pvector, zero_action = NULL, ...)
| Argument | Description |
|---|---|
| pvector | A numeric vector of p-values. |
| zero_action | A numeric value to substitute for p-values of exactly zero before plotting. If |
| ... | Other arguments passed to |
No return value, called for plotting side effects.
qq(stats::runif(1e6))
# Handle p-values of zero by substituting a small finite value
pvec <- c(stats::runif(1e4), 0, 0)
qq(pvec, zero_action = 1e-300)
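The comparison against uniform quantiles described above can be sketched in base R. This is an illustration under assumptions, not the package's code: `qq_coords_sketch` and `zero_substitute` are hypothetical names, and the use of `stats::ppoints()` for the expected quantiles is assumed rather than taken from the source.

```r
# Hypothetical helper, NOT part of fastqq: coordinates for a p-value Q-Q plot.
qq_coords_sketch <- function(pvector, zero_substitute = NULL) {
  p <- pvector[!is.na(pvector)]
  # Optionally replace exact zeros, which would otherwise map to Inf.
  if (!is.null(zero_substitute)) p[p == 0] <- zero_substitute
  data.frame(
    # Expected -log10(p) under a uniform distribution (assumed plotting positions).
    expected = -log10(stats::ppoints(length(p))),
    # Observed -log10(p); the smallest p-value pairs with the largest expected value.
    observed = -log10(sort(p))
  )
}

set.seed(42)
pvec <- c(stats::runif(1e4), 0, 0)
coords <- qq_coords_sketch(pvec, zero_substitute = 1e-300)
all(is.finite(coords$observed))
```

Without the substitution, the two zero p-values would give infinite observed values, which cannot be placed on the plot.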
Accepts chi-squared test statistics as input and converts them to
-log10(p) values assuming 1 degree of freedom. The conversion
uses log-space computation via pchisq(..., log.p = TRUE), which
avoids floating-point underflow for very large test statistics and remains
numerically precise well beyond .Machine$double.xmin. Produces the
same style of plot as qq and qqlog.
qqchisq1(chisq_vector, ...)
| Argument | Description |
|---|---|
| chisq_vector | A numeric vector of |
| ... | Other arguments passed to |
No return value, called for plotting side effects.
chisq_vals <- stats::rchisq(1e5, df = 1)
qqchisq1(chisq_vals)
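The benefit of the log-space conversion can be shown with a short base-R sketch. The helper name `chisq_to_neglog10p` is hypothetical, and the use of `lower.tail = FALSE` is an assumption about how the upper-tail probability is obtained; only `pchisq(..., log.p = TRUE)` is stated in the description.

```r
# Hypothetical helper, NOT part of fastqq: chi-squared statistic -> -log10(p), df = 1.
chisq_to_neglog10p <- function(chisq_vector) {
  # Natural log of the upper-tail probability, computed without forming p itself.
  log_p <- stats::pchisq(chisq_vector, df = 1, lower.tail = FALSE, log.p = TRUE)
  # Change of base: -log10(p) = -ln(p) / ln(10).
  -log_p / log(10)
}

stat <- 2000
# Direct route: the p-value underflows to 0, so -log10(p) is Inf.
direct <- -log10(stats::pchisq(stat, df = 1, lower.tail = FALSE))
# Log-space route: stays finite.
log_space <- chisq_to_neglog10p(stat)
c(direct = direct, log_space = log_space)
```

For moderate statistics both routes agree; they diverge only once the p-value falls below the smallest representable double.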
Accepts -log10(p) values directly as input. This is useful when
the caller has already transformed their p-values, or when higher numerical
precision is required before passing values to the plotting layer. Produces
the same style of plot and uses the same fast pruning algorithm as
qq.
qqlog(log10_pvector, ...)
| Argument | Description |
|---|---|
| log10_pvector | A numeric vector of |
| ... | Other arguments passed to |
No return value, called for plotting side effects.
pvec <- stats::runif(1e5)
qqlog(-log10(pvec))
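As an illustration of the higher-precision use case, the following base-R sketch computes -log10(p) values directly in log space. The scenario is an assumption made for the example: two-sided p-values derived from hypothetical z-scores `z`, which is not something the package prescribes.

```r
# Hypothetical input: z-scores, including one far out in the tail.
z <- c(1.96, 10, 40)

# Two-sided p-value in log space: ln(p) = ln(2) + ln(Phi(-|z|)).
log_p <- log(2) + stats::pnorm(-abs(z), log.p = TRUE)

# Change of base to obtain -log10(p) without ever forming p itself.
neglog10_p <- -log_p / log(10)

# The direct route underflows for the largest z-score.
direct <- -log10(2 * stats::pnorm(-abs(z)))
is.finite(neglog10_p)
is.finite(direct)
```

A vector such as `neglog10_p` is the kind of input that qqlog is described as accepting.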
Faster alternative to stats::qqnorm(). For more than 1e5 points,
excess points that would not be visible in the plot are removed, because
they lie so close together. Otherwise this should work exactly the same, and the
code is mostly adapted from stats::qqnorm(). This code produces
more lightweight plots for very large amounts of data.
qqnorm(
  y,
  ylim,
  main = "Normal Q-Q Plot",
  xlab = "Theoretical Quantiles",
  ylab = "Sample Quantiles",
  plot.it = TRUE,
  datax = FALSE,
  ...
)
| Argument | Description |
|---|---|
| y | sample, to compare to normal quantiles. |
| ylim | graphical limits. |
| main | Plot title. |
| xlab | X label. |
| ylab | Y label. |
| plot.it | Should the plot be created. |
| datax | logical. Should data values be on x-axis? |
| ... | Other arguments passed to |
A data.frame with the sorted sample and normal quantiles. NA
values are excluded.
qqnorm(stats::rnorm(1e6))
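The coordinates behind a normal Q-Q plot can be sketched in base R as follows. This is a simplified illustration, not the package's code: `qqnorm_coords_sketch` and the column names `theoretical` and `sample` are hypothetical, and no pruning is performed.

```r
# Hypothetical helper, NOT part of fastqq: sorted sample vs. normal quantiles.
qqnorm_coords_sketch <- function(y) {
  # Drop missing values before computing quantiles.
  y <- y[!is.na(y)]
  data.frame(
    # Normal quantiles at the plotting positions given by ppoints().
    theoretical = stats::qnorm(stats::ppoints(length(y))),
    # Sorted sample, so row i pairs the i-th smallest value with the i-th quantile.
    sample = sort(y)
  )
}

set.seed(7)
coords <- qqnorm_coords_sketch(c(stats::rnorm(1000), NA))
nrow(coords)  # the NA is excluded
```

Sorting the sample up front is what makes a later pruning step cheap, since neighbouring rows are also neighbouring points in the plot.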
Faster alternative to stats::qqplot(). For more than 1e5 points,
excess points that would not be visible in the plot are removed, because
they lie so close together.
qqplot(
  x,
  y,
  plot.it = TRUE,
  xlab = deparse1(substitute(x)),
  ylab = deparse1(substitute(y)),
  ...
)
| Argument | Description |
|---|---|
| x | First sample for |
| y | Second sample for |
| plot.it | Should the plot be created. |
| xlab | x label for plot. |
| ylab | y label for plot. |
| ... | Other arguments passed to |
A list with the sorted samples, interpolated to be the same size.
qqplot(stats::runif(1e6), stats::runif(1e6))
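Interpolating two sorted samples to a common size can be sketched with `stats::approx()`, in the spirit of stats::qqplot(). The helper name `qqplot_coords_sketch` is hypothetical, and the choice to shrink the longer sample to the length of the shorter one is an assumption; the sketch performs no pruning.

```r
# Hypothetical helper, NOT part of fastqq: sorted samples of equal length.
qqplot_coords_sketch <- function(x, y) {
  sx <- sort(x)  # sort() drops NA values by default
  sy <- sort(y)
  nx <- length(sx)
  ny <- length(sy)
  # Linearly interpolate the longer sorted sample down to the shorter length.
  if (ny < nx) sx <- stats::approx(seq_len(nx), sx, n = ny)$y
  if (ny > nx) sy <- stats::approx(seq_len(ny), sy, n = nx)$y
  list(x = sx, y = sy)
}

set.seed(3)
res <- qqplot_coords_sketch(stats::runif(1000), stats::runif(250))
lengths(res)  # both components have the same length
```

Because linear interpolation of a sorted vector stays sorted, the resulting pairs can be plotted directly against each other.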