#StatisticalGenetics

STATGEN 2024 talk
Statistical Methods for Single-Cell RNA-Seq Analysis and Spatial Transcriptomics
Rafael Irizarry

tSNE and UMAP plots:
"They really aren't informative, but they are really pretty."

Negative control scRNAseq data set: the percent of zeros is very high and contributes strongly to the first principal component. tSNE plot 'discovers' new cells.

Transformed to log2(1 + CPM): looks zero-inflated.

Raw counts: Poisson
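
To make the point about zeros concrete, here is a minimal sketch (not from the talk) of how plain Poisson sampling already yields a high fraction of zeros, and how log2(1 + CPM) then opens a gap between 0 and the smallest nonzero value. The mean count of 0.1 per cell and the library size of 2000 counts are hypothetical values chosen only for illustration.

```python
import math
import random


def poisson_zero_fraction(mean):
    """P(X = 0) for X ~ Poisson(mean), which is exp(-mean)."""
    return math.exp(-mean)


def sample_poisson(mean, rng):
    """Draw one Poisson variate with Knuth's method (adequate for small means)."""
    threshold = math.exp(-mean)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def log2_cpm(count, library_size):
    """log2(1 + counts per million) for one gene in one cell."""
    return math.log2(1 + count * 1e6 / library_size)


def simulate_zero_fraction(mean, n_cells, seed=0):
    """Fraction of simulated cells with a zero count for a gene with the given mean."""
    rng = random.Random(seed)
    zeros = sum(sample_poisson(mean, rng) == 0 for _ in range(n_cells))
    return zeros / n_cells


if __name__ == "__main__":
    mean, library_size = 0.1, 2000  # hypothetical gene mean and per-cell depth
    print("expected zero fraction:", poisson_zero_fraction(mean))
    print("simulated zero fraction:", simulate_zero_fraction(mean, 10_000))
    # After the transform, counts 0, 1, 2 are no longer evenly spaced.
    print("log2(1 + CPM):", [round(log2_cpm(c, library_size), 2) for c in (0, 1, 2)])
```

With these assumed values about 90% of cells have a zero count with no zero inflation at all, while a single read maps to roughly 9 on the log2(1 + CPM) scale, so the transformed values show a spike at 0 and a cluster far from it.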

1/

I have worked on a lot of papers, but this is my first preprint.

Analyzing large biobanks of genomic data requires a fresh look at our quality control metrics, especially as we move to NGS and more diverse populations.

You may be throwing out biologically important variants.

I would be happy to hear feedback.

#UKBiobank #HardyWeinbergEquilibrum
#statistics #statisticalgenetics #GWAS #genetics #preprint

medrxiv.org/content/10.1101/20

medRxiv · A reassessment of Hardy-Weinberg equilibrium filtering in large sample Genomic studies

Hardy Weinberg Equilibrium (HWE) is a fundamental principle of population genetics. Adherence to HWE, using a p-value filter, is used as a quality control measure to remove potential genotyping errors prior to certain analyses. Larger sample sizes increase power to differentiate smaller effect sizes, but will also affect methods of quality control. Here, we test the effects of current methods of HWE QC filtering on varying sample sizes up to 486,178 subjects for imputed and Whole Exome Sequencing (WES) genotypes using data from the UK Biobank and propose potential alternative filtering methods.

METHODS Simulations were performed on imputed genotype data using chromosome 1. WES GWAS (Genome Wide Association Study) was performed using PLINK2.

RESULTS Our simulations on the imputed data from Chromosome 1 show a progressive increase in the number of SNPs eliminated from analysis as sample sizes increase. As the HWE p-value filter remains constant at p<1e-15, the number of SNPs removed increases from 1.66% at n=10,000 to 18.86% at n=486,178 in a multi-ancestry cohort and from 0.002% at n=10,000 to 0.334% at n=300,000 in a European ancestry cohort. Greater reductions are shown in WES analysis with a 11.91% reduction in analyzed SNPs in a European ancestry cohort n=362,192, and a 32.70% reduction in SNPs in a multi-ancestry dataset n=463,605. Using a sample size specific HWE p-value cutoff removes ∼2.25% of SNPs in the all ancestry cohort across all sample sizes, but does not currently scale beyond 300,000 samples. A hard cutoff of +/- 20% deviation from HWE produces the most consistent results and scales across all sample sizes but requires additional user steps.

CONCLUSION Testing for deviance from HWE may still be an important quality control step in GWAS studies, however we demonstrate here that using an HWE p-value threshold that is acceptable for smaller sample sizes will be inappropriate for large sample studies due to an unnecessarily high number of variants removed prior to analysis. Rather than exclude variants that fail HWE prior to analysis it may be better to include all variants in the analysis and examine their deviation from HWE afterward. We believe that adjusting the cutoffs will be even more important for large whole genome sequencing results and more diverse population studies.

Competing Interest Statement: BB and AS are full time employees of DNAnexus, Inc. PG and DW are full time employees of Ariel Precision Medicine Inc.

Funding Statement: Authors were compensated for time contributed to the study by their respective institutions. Their respective institutions also paid for any compute needed to complete experiments on the UKBRAP.

All data produced are available in supplementary material.

Abbreviations:
HWE: Hardy Weinberg Equilibrium
GWAS: Genome Wide Association Study
WES: Whole Exome Sequencing
WGS: Whole Genome Sequencing
MAF: Minor Allele Frequency
SNP: Single Nucleotide Polymorphism
QC: quality control
EO: European Only
GSD: Gallstone Disease
UKB: UK Biobank
UKB RAP: UK Biobank Research Analysis Platform
MHC: Major Histocompatibility Complex
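
The sample-size effect described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline (they used PLINK2): it uses a plain 1-df chi-square HWE test, a hypothetical variant with allele frequency 0.3 and 5% fewer heterozygotes than HWE predicts, and `het_relative_deviation` is only one assumed reading of a "percent deviation from HWE" measure. The sample sizes 10,000 and 486,178 and the 1e-15 cutoff are taken from the abstract.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """1-df chi-square p-value for departure from Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A estimated from counts
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip((n_aa, n_ab, n_bb), expected))
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def het_relative_deviation(n_aa, n_ab, n_bb):
    """Observed heterozygotes relative to the HWE expectation, minus 1 (assumed metric)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    return n_ab / (2 * n * p * (1 - p)) - 1


def genotype_counts(n, p, het_ratio):
    """Hypothetical (fractional) counts with heterozygotes at het_ratio times HWE."""
    q = 1 - p
    het = het_ratio * 2 * p * q
    # Homozygote frequencies absorb the difference so the allele frequency stays p.
    return n * (p - het / 2), n * het, n * (q - het / 2)


if __name__ == "__main__":
    for n in (10_000, 486_178):
        counts = genotype_counts(n, p=0.3, het_ratio=0.95)
        pval = hwe_chi2_pvalue(*counts)
        dev = het_relative_deviation(*counts)
        print(n, pval, pval < 1e-15, round(dev, 3))
```

Under these assumptions the same 5% heterozygote deficit passes the p<1e-15 filter at n=10,000 and fails it at n=486,178, while the relative deviation is identical at both sample sizes, which is why a deviation-based cutoff does not tighten as n grows.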

📣 #postdoc in #personality #psychology at #UIUC

Work with faculty including @geneforanarchy, DA Briley, Chris Fraley, & @brentwroberts; teach 1 Intro Personality course/semester. Interests include personality #assessment; personality #development; #social, emotional, & behavioral skills; #behaviorgenetics, #statisticalgenetics; #gender identity; #sexuality; & #attachment. Start date flexible, as early as 8/16/23

psychology.illinois.edu/open-p

@academicchatter

psychology.illinois.edu · Open Positions | Psychology at Illinois

#postdocPosition alert! We have funding from project STEVE (advancing genotype-to-phenotype Studies by considering Transposable Elements Variability and Epivariability) for you to work with me at CBIO and with Vincent Colot & Pierre Baduel at IBENS on brand new sequencing data generated in their lab. Up to 36 months, in Paris. Details: cbio.mines-paristech.fr/pdf-fi
#bioinformatics #machineLearning #statistics #statisticalGenetics #populationGenetics #postdoc