Estimating Cell-Type Profiles and Cell-Type Proportions in Heterogeneous Gene Expression Data

Pritchard, David A.

Metadata Field	Value	Language
dc.contributor.advisor	Zeng, Peng
dc.contributor.advisor	Carpenter, Mark
dc.contributor.advisor	Billor, Nedret
dc.contributor.author	Pritchard, David A.
dc.date.accessioned	2012-05-16T19:01:51Z
dc.date.available	2012-05-16T19:01:51Z
dc.date.issued	2012-05-16
dc.identifier.uri	http://hdl.handle.net/10415/3156
dc.description.abstract	Understanding the mechanisms underlying natural variation in gene expression is an important question in medical and evolutionary genetics. Many studies intend to compare either (i) cell-type expression profiles across individuals for the same cell-type, or (ii) cell-type expression profiles within an individual for different cell-types (NIH 2012). Naturally, accurate estimates of these expression profiles are of great importance. However, the presence of heterogeneity of cell-types in gene expression data can result in inaccurate estimates of such cell-type expression profiles (Leek and Storey 2007). The standard statistical method for assaying gene expression data is to use a simple linear regression model, with the assumption that the presence of minor alleles in the genotype has an additive effect on gene expression levels (Veyrieras 2008). This method assumes that the observed gene expression data has a homogeneous composition of a single cell-type. However, there are many scenarios where it may be more appropriate to assume that observed gene expression data is composed of two cell-types; for example, a brain tissue sample would presumably have a heterogeneous mixture of neuron and glial cell-types (GeneNetwork 2012). Previous studies have developed methodologies for estimating cell-type expression profiles given prior information regarding individual cell-type proportions, or conversely, for estimating cell-type proportions with prior knowledge of cell-type expression profiles. This thesis derives a computational method for estimating both the cell-type expression profiles and the individual cell-type proportions for a two cell-type model, without any prior information. The parameter estimation techniques are based on an alternating-regression least-squares process. This methodology is applied to both simulated data and a real dataset, and the results are examined.	en_US
dc.rights	EMBARGO_NOT_AUBURN	en_US
dc.subject	Mathematics and Statistics	en_US
dc.title	Estimating Cell-Type Profiles and Cell-Type Proportions in Heterogeneous Gene Expression Data	en_US
dc.type	thesis	en_US
dc.embargo.length	NO_RESTRICTION	en_US
dc.embargo.status	NOT_EMBARGOED	en_US

Files in this item

Name: DPritchard_Thesis.pdf.txt
Size: 150.1Kb

Name: DPritchard_Thesis.pdf
Size: 3.041Mb
