A Spatial Econometric Analysis of the Effects of Subsidized Housing and Urban Sprawl on Property Values

by

Cephas Banlenan Naanwaab

A dissertation submitted to the Graduate Faculty of Auburn University in partial fulfillment of the requirements for the Degree of Doctor of Philosophy

Auburn, Alabama
May 9, 2011

Keywords: Subsidized housing, spatial dependence, difference-in-difference, propensity score matching, urban sprawl, generalized spatial model

Copyright 2011 by Cephas Banlenan Naanwaab

Approved by

Diane Hite, Chair, Professor of Agricultural Economics and Rural Sociology
James Novak, Professor of Agricultural Economics and Rural Sociology
Norbert Wilson, Associate Professor of Agricultural Economics and Rural Sociology
Asheber Abebe, Associate Professor of Mathematics and Statistics

Abstract

Property owners often resist the idea of siting public or subsidized housing in their neighborhood. The notion that subsidized housing exerts a negative externality on adjacent properties has been investigated elsewhere in the US, particularly in the Northeast, with inconclusive findings. In the first two chapters of this dissertation, I analyze the spatial effects of two Federal housing programs, namely Section 8 Vouchers and Low-Income Housing Tax Credits, on property values.

According to the Census 2000 report, the Southeast has one of the fastest-growing populations in the US. From 1990 to 2000, population growth rates in the Southeastern states were: Georgia (26.37%), Florida (23.57%), North Carolina (21.43%), South Carolina (15.07%), and Alabama (10.06%). The Southeast also has a higher concentration of minorities; this, coupled with a high incidence of poverty, implies a higher demand for public housing to serve low-income households. Policy makers need to be able to determine not only the housing needs of low-income populations, but also how the provision of such affordable housing impacts surrounding property values.
Chapter one offers an aggregated analysis of the impact of the Section 8 and Low-Income Housing Tax Credit programs on property values in Alabama. In chapter two, I perform a disaggregated analysis of the impact of Section 8 projects on proximate single-family home sales in Fulton County, Georgia. Chapter three explores the nexus between property values, commuting, and urban sprawl in the Birmingham metropolitan area, Alabama.

There is a common methodological linkage among all three chapters. I employ spatial econometric estimation techniques, first, to account for spill-over effects in house prices, and second, to overcome spatial autocorrelation and spatial heterogeneity that may result in biases if traditional OLS estimation is used. In chapter one, I specifically deal with the problem of causality between subsidized housing and property values. Two novel quasi-experimental methods, difference-in-difference and propensity score matching, are employed to test whether there is actually a causal relationship between subsidized housing and property values. The last chapter also offers an innovative combination of principal component analysis and generalized spatial two-stage least squares estimation techniques to investigate the relationship between urban sprawl, property values, and commute times.

Acknowledgments

I would like to offer special thanks to my academic advisor, Dr Diane Hite, for her guidance and insightful comments that have brought this work to a successful completion. I would also like to thank my committee members, Dr James Novak, Dr Norbert Wilson, and Dr Asheber Abebe, for their invaluable comments and criticisms. Special thanks to Dr Curtis Jolly, Chair of Agricultural Economics and Rural Sociology, and to Dr Henry Thompson, former GPO of the Applied Economics program. Finally, I owe my family a big debt of gratitude and appreciation for their patience and support during these four years. This piece of work is especially dedicated to little Irene L.
Naanwaab.

Table of Contents

Abstract
Acknowledgments
List of Tables
List of Figures
List of Abbreviations
Chapter 1 A Quasi-Experiment to Estimate the Spatial Effects of Subsidized Housing in Alabama
  1. Introduction
  2. Literature Review
    2.1. Low Income Housing Tax Credit Program
    2.2. Section 8 Voucher Program
  3. Data and Methods
    3.1. Data
    3.2. Measurement of School Quality and Air Pollution
    3.3. Subsidized Properties
    3.4. Neighborhood Demographics
    3.5. Comparing Neighborhood Characteristics by Federally Subsidized Programs
    3.6. Analytical Methods
      3.6.1. Contiguity Weight Matrices
      3.6.2. Model Specification and Estimation
      3.6.3. Moran's I Test of Spatial Autocorrelation
  4. Results and Discussion
    4.1. The Effects of LIHTC and Section 8 Projects on Property Values
    4.2. Property Values and Housing Characteristics
    4.3. The Effect of Ethnicity, Poverty, and other Demographic Factors
    4.4. Environmental Quality, School Quality, and Property Values
  5. Difference-in-Difference Model
  6. Propensity Score Analysis
    6.1. PSM Adjustment for Sample Selection
  7. Conclusion
Chapter 2 The Impact of Section 8 Public Housing on Property Values in Fulton County, Georgia
  1. Introduction
  2. Literature Review
    2.1. Section 8 Certificates and Voucher Program
  3. Data and Methods
    3.1. Data
      3.1.1. Study Area
    3.2. Analytical Methods
      3.2.1. Hedonic Price Model
      3.2.2. Spatial Dependence in House Prices
      3.2.3. Moran's I Test for Spatial Autocorrelation
      3.2.4. The Estimated Model and Specification
      3.2.5. Spatial Weight Matrices
      3.2.6. Distance Weight Matrices
  4.1. Results of Spatial Dependence Tests
  4.2. Results and Discussion
    4.2.1. The Effect of House Characteristics on House Price
    4.2.2. The Effect of Neighborhood Factors on House Price
  5. Conclusion
Chapter 3 Urban Sprawl, Property Values, and Commute Times in Birmingham Metropolitan Area
  3.1. Introduction
  3.2. Causes and Consequences of Sprawl
  3.3. Sprawl in the Birmingham-Hoover Metropolitan Area
  3.4. Data and Methods
    3.4.1. Data
    3.4.2. The Monocentric Model of Urban Development
    3.4.3. Estimated Monocentric Density Function
    3.4.4. Polycentric Model of Urban Form
    3.4.5. Identification of Employment Sub-centers
    3.4.6. Polycentric Density Function Results
    3.4.7. Constructing an Index of Sprawl
    3.4.8. Theoretical Generalized Spatial Model
    3.4.9. Empirical Generalized Spatial Model Specification
  3.5. Results
  3.6. Conclusion
References
Appendix 1 Scree Plots
Appendix 2 Principal Component Matrix Plot
Appendix 3 Principal Component Pattern Profiles

List of Tables

Table 1.1 Neighborhood characteristics and presence or absence of subsidized projects
Table 1.2 Variable Descriptions
Table 1.3 Summary Statistics of the Data
Table 1.4 Comparative estimates of HPM and Spatial models
Table 1.5 D-in-D model estimates of treatment effects: Outcome = property values ($)
Table 1.6a Propensity Score Matching, Treatment = LIHTC
Table 1.6b Propensity Score Matching, Treatment = Section 8
Table 2.1 Summary Statistics of Variables
Table 2.2 Main Differences between the models
Table 2.3 Model Diagnostics
Table 2.4 Comparative Results of HPM and Spatial models
Table 2.5 Comparative Results of HPM and Spatial models
Table 2.6 Comparative Results of HPM and Spatial models
Table 2.7 Comparative Results of HPM and Spatial models
Table 3.1 Employment Sub-centers in Birmingham Metro Area
Table 3.2 Density function estimates for 1990. Dependent variable: log (density)
Table 3.3 Density function estimates for 2000. Dependent variable: log (density)
Table 3.4 Polycentric Density Function: Five Employment Centers
Table 3.5 Polycentric Density Function: Two Employment Centers
Table 3.6 GS2SLS-SEM estimation Results Equation 1 (log Median House Values)
Table 3.7 GS2SLS-SEM estimation Results Equation 2 (Sprawl index)
Table 3.8 GS2SLS-SEM estimation Results Equation 3 (log Commute times)

List of Figures

Figure 1 Cross-walk of 1990 to 2000 census tracts
Figure 2.1 Map of Central Fulton County
Figure 2.2 Univariate Moran's I Test on House Prices
Figure 2.3 Clustering in House Prices
Figure 3.1 Mapping sprawl in Birmingham Metro area: Median year structure built by census block group
Figure 3.2 Mapping sprawl in Birmingham Metro area: Population change from 1990-2000 by census block group
Figure 3.3 Commute Times in the Birmingham Metro-Area
Figure 3.4 Sprawl Index for Birmingham Metro-Area

List of Abbreviations

CBD Central Business District
CBG Census Block Group
GS2SLS Generalized Spatial Two Stage Least Squares
HUD Housing and Urban Development
LIHTC Low Income Housing Tax Credit
MSA Metropolitan Statistical Area
OLS Ordinary Least Squares
PCA Principal Component Analysis
PSA Propensity Score Analysis
PSM Propensity Score Matching
SAR Spatial Autoregressive
SEM Simultaneous Equations Model/Spatial Error Model

CHAPTER 1

A Quasi-Experiment to Estimate the Spatial Effects of Subsidized Housing in Alabama

1. Introduction

This study seeks to understand the interrelationships among subsidized housing, ethnicity, poverty levels, and property values. It assesses the effect of subsidized housing (Section 8 and Low-Income Housing Tax Credits) on property values. It explores the relationship between subsidized housing and property values, controlling for ethnicity, on the one hand, and the relationship between neighborhood poverty levels and property values on the other.

The motivation for the current study stems from the fact that little previous research has been conducted on the relationship between ethnicity, poverty, and property values.[1] Moreover, previous studies of the effect of subsidized housing are inconclusive, and there have been no such studies in the Southeastern US in general or in Alabama in particular. According to the Census 2000 report, the Southeast has one of the fastest-growing populations in the US. From 1990 to 2000, population growth rates in the Southeastern states were: Georgia (26.37%), Florida (23.57%), North Carolina (21.43%), South Carolina (15.07%), and Alabama (10.06%). The Southeast also has a higher concentration of minorities; this, coupled with a high incidence of poverty, implies a higher demand for public housing to serve low-income households.
Policy makers need to be able to determine not only the housing needs of low-income populations, but also how the provision of such affordable housing impacts surrounding property values.

[1] Exceptions are Massey and Kanaiaupuni (1993), Bauman (1987), Bowly (1978), Carter et al. (1998), and Galster et al. (2006).

It has long been the contention of many that concentrated poverty negatively impacts property values. The objective of the current research, then, is to produce empirical evidence to support or refute this position. This paper thus inquires into how concentrated poverty affects surrounding homeowners. It is also important to control for the effect of race in this study, since race and poverty are intertwined.

The study also looks at the pertinent issue of environmental quality, which is not commonly examined in the public housing literature. The broader issue of environmental justice as it relates to disadvantaged populations has been extensively studied (Hite, 2000; Bullard, 1996). The EPA defines Environmental Justice as "the fair treatment for all people of all races, cultures, and incomes, regarding the development of environmental laws, regulations, and policies." In order to delineate the causal link between ethnicity, poverty levels, and property values, we have to control for as many factors as possible that could affect property values. One such important factor is the environmental quality of neighborhoods.

This study also differs from previous studies of the subject in its methodology. To analyze the causal effects of subsidized housing on property values, I adopt two novel methodological approaches: spatial econometric methods and quasi-experimental methods. Most previous researchers use OLS regression analysis to study the effect of subsidized housing, which tends to confuse association with causation.
In this study, I analyze the causal impact of subsidized housing using the quasi-experimental techniques of difference-in-difference and propensity score matching. Incorporating spatial methods overcomes biases due to spatial autocorrelation and spatial heterogeneity, while the difference-in-difference and propensity score methods help uncover the true causal impacts of subsidized housing.

To understand these relationships, data on neighborhood demographics, racial or ethnic composition, poverty levels, quality of schools, and property values are used. Other demographic data required for this project are population, households, incomes, educational levels, and employment. The main source of these data is the Census Bureau's database, a rich source of decennial census data. The American FactFinder and DataFerrett tools embedded in the Census Bureau database allow queries for decennial data down to the neighborhood level. The environmental quality data (air pollution) are obtained from the US EPA's Toxic Release Inventory (TRI) and matched to the census data.

2. Literature Review

The impact of subsidized housing projects on surrounding properties has long been a subject of interest in the regional science literature. A study of this nature is important, first and foremost, to property owners, who must live with the potential consequences of having a public project located in their neighborhood. Secondly, it is important to policy makers, who must decide where to locate subsidized properties.

But why would public housing projects exert an impact on surrounding properties? Schwartz et al. (2003) offer two avenues by which subsidized housing investments could positively or negatively affect surrounding properties in the neighborhood. The first avenue is that subsidized housing creates an external effect because of what it replaces.
If there is an unsightly or dilapidated structure in a neighborhood, the value of surrounding properties is depressed; thus, an attractive subsidized housing structure that replaces the poorly maintained structure tends to increase property values in the neighborhood. This effect has been described as the "flattening-out" of the house price gradient. Secondly, subsidized housing investment may exert a beneficial external effect because of what it creates. Even when a project does not replace blight, it can increase property values simply because its architecture is in keeping with the surrounding properties. In this case, the design of subsidized housing matters, whether high-rise, multi-family, or single-family.

In their study of Section 42 Low-Income Housing Tax Credit developments in Wisconsin, Green et al. (2002) did not find evidence that Section 42 developments cause property values to decrease. They did find evidence in Milwaukee County of a small and significant appreciation in properties that were distant from Section 42 developments. In Madison County, their results showed a rapid appreciation for properties near Section 42 developments. By way of policy, they conclude that "...it is only when low-income housing developments are located in areas that already have concentrated poverty that they have a negative impact on property values." Other studies that find positive effects of subsidized housing are those of Nourse (1963), DeSalvo (1974), Rabiega, Lin and Robinson (1984), and Warren, Aduddell and Tatlovich (1983).

Lee, Culhane and Wachter (1999) looked at the differential impact of federally assisted housing on nearby properties by program type: high-rise public housing, large-scale public housing, and homeownership public housing. Their methodological approach consisted of estimating the effect of federally assisted housing using dummies for individual house sales within 1/8 or 1/4
mile radii of a development and then carrying out hedonic analysis. Their results indicate a slight negative effect of federally assisted housing on surrounding property values. The breakdown of their results by program showed that Section 8 certificates and vouchers have a slight negative effect. Low-Income Housing Tax Credit sites also have a negative impact, while Federal Housing Administration-assisted units, public housing homeownership program units, and Section 8 New Construction and Rehabilitation units are found to have a slight positive impact.

Another study involving the Low-Income Housing Tax Credit program was undertaken by the Maxfield Research Group (2000), with support from the Family Housing Fund of Minneapolis, Minnesota. The research sought to determine whether there was evidence to support claims that tax-credit rental developments for families depress property values. The study focused on 12 neighborhoods in the Twin Cities where tax-credit rental housing was located. The researchers analyzed three indices of market performance for houses that transacted within those neighborhoods: sales price per square foot, sales price as a percentage of asking (list) price, and time on the market.

The methodology compared outcomes under a pre/post analysis with those under a control-group analysis. In the pre/post analysis, they compared these indices before and after the establishment of tax-credit housing, and found that homes that sold after the siting of tax-credit housing had "similar or stronger market performance" than homes that sold before the establishment of such projects. Similarly, their results showed stronger performance of houses in the experimental neighborhoods than in the control neighborhoods. Finally, their findings indicated a general upward price trend, declining time on market, and stable or improving sales-to-list price ratios over the six-year period in which tax-credit housing existed in the neighborhoods they studied.
Concerning the impact of poverty on property values, Galster et al. (2006) provide a thorough theoretical and empirical investigation into the potential consequences of concentrated poverty. They address the impact of spatial concentration of poverty on "proximate residents (socially harmful behaviors like crime) and property owners (reduced maintenance, and, in the extreme, abandonment)." They did not find any "substantial relationship between neighborhood poverty changes and property values or rents when poverty rates stay below ten (10) percent." However, they did find that when poverty is in the range of 10-20 percent, marginal increases in poverty have a dramatic negative effect on property values and rents.

Other researchers have studied the interrelationship between the location of public housing and poverty concentration (Carter et al., 1998; Massey and Kanaiaupuni, 1993; Bauman, 1987; Bowly, 1978). In particular, the concentration of racial minorities in public housing projects, mainly in central cities, has tended to increase neighborhood poverty in the inner cities (Carter et al., 1998). Massey and Kanaiaupuni (1993) found that public housing projects had been targeted to poor, black neighborhoods, thus substantially increasing the concentration of poverty in ensuing years.

A direct consequence of the concentration of racial minorities in public housing is that these neighborhoods descend into a vicious cycle of poverty. It is an indisputable fact that, in the US, where one lives is a proxy for economic opportunity, such as access to good-quality education, health care, and employment. In the case of public housing, a family that is already poor is placed in a neighborhood with no economic opportunities, which perpetuates its poverty.
In an attempt to deal with the problem of poverty concentration created by centralized public housing, the federal government initiated programs such as the Moving to Opportunity (MTO) demonstration program in 1992. Under this program, Section 8 subsidy recipients were given mobility counseling to enable them to relocate from inner-city public housing to suburbs where they could have better economic prospects. But as Galster et al. (1999) have pointed out, these programs have not always been as successful as initially thought, because in some areas landlords do not want to accept Section 8 tenants for fear that doing so could depress their property values.

2.1. Low Income Housing Tax Credit Program

The Low-Income Housing Tax Credit (LIHTC) is a Federal program created by the Tax Reform Act of 1986 (TRA 86). It is based on Section 42 of the Internal Revenue Code, enacted by Congress in 1986. HUD estimates that 31,251 projects and over 1,843,000 housing units were placed in service under the program between 1987 and 2007. The program gives state and local LIHTC-allocating agencies an annual budget of nearly $5 billion to issue tax credits for private-sector production of low- to moderate-income housing. The LIHTC provides private developers "up to 70% of the cost of new construction or 30% of the cost of acquisition of existing low income housing in return for limits on rent charged" (Green et al., 2002). Developers are required, under the law, to rent at least 20% of available units to households whose incomes are at most 50% of the county median income (adjusted for family size), or to rent at least 40% of available units to households making at most 60% of the county median income (Green et al., 2002).

McClure (2006) investigates the spatial distribution of LIHTC projects over the program's 20-year existence, and finds that these projects are increasingly being placed in the suburbs.
By comparing the distribution of LIHTC in inner cities versus suburbs, his findings point to a greater acceptability of LIHTC in the suburbs than was previously the case. Using data from HUD, he estimates that 43% of LIHTC housing is now placed in the suburbs, compared to 44% in the central cities.

Dating back to the 1950s, there has always been criticism that public housing tended to be targeted to central cities, thus contributing to concentrated poverty and racial segregation of African-American families. The Gautreaux Assisted Housing Program (GAHP) in Chicago, born out of lawsuits filed against the Chicago Housing Authority (CHA) and the Department of Housing and Urban Development (HUD), sought to put an end to the racial concentration of minority families in inner cities. As a consequence of GAHP's lawsuit settlement against HUD, many African-American families were given an opportunity to move to desegregated areas. This lawsuit probably spurred HUD to develop programs aimed at desegregating the minority poor in central cities. The LIHTC program sought to address the problems faced by early public housing programs by offering low-income people an opportunity to move to desegregated places with less poverty.

Rohe and Freedman (2001) study the role that race and ethnicity play in the placement of assisted housing developments. They consider five federal housing programs, and find that race and ethnicity matter in the placement of these assisted housing programs. In particular, they find that the percentage of African-American families in a neighborhood is a strong predictor of the placement of LIHTC housing in that neighborhood. Moreover, the higher the percentage of African-American and Hispanic families within a census tract, the more likely it is that public family and other HUD family developments will be sited there. In effect, it can be argued that LIHTC programs are no better at de-concentrating racial minorities than earlier programs.
Newman and Schnare (1997) investigate the quality of the neighborhoods in which assisted housing programs are located. For the purpose of their study, they define neighborhood quality in terms of several factors, including socio-economic status (median family incomes and poverty rates), quality of housing stock, concentration of assisted housing, and the racial or ethnic mix. They evaluate the relative performance of six assisted housing programs, including LIHTC, in improving the neighborhood conditions of recipients. They conclude that project-based assistance programs, such as LIHTC, do little to improve the quality of recipients' neighborhoods compared to those of welfare households, and public housing performs even worse.

2.2. Section 8 Voucher Program

Section 8, or the Housing Choice Voucher program, refers to a federally subsidized program enacted under the U.S. Housing and Community Development Act of 1974 to help low-income families and individuals find affordable housing. The program provides subsidized housing for extremely low-income families (income no more than 30% of county median income) or very low-income families (income no more than 50% of county median income). Under this program, families pay roughly 30% of their income toward rent, with the subsidy based on the Fair Market Rent (FMR), while the remainder is paid with federal money (the voucher).

Section 8 initially consisted of three programs: New Construction, Substantial Rehabilitation, and Existing Housing Certificates. Later, some programs were phased out and new ones added. Currently, Section 8 consists essentially of the voucher program, which is either "project-based" or "tenant-based." With a project-based voucher, a person is limited to specified apartment complexes, which may be administered by a Public Housing Authority (PHA), while a tenant-based voucher provides the freedom to move to any state and to any apartment where Section 8 vouchers are accepted.

3. Data and Methods

3.1. Data

Decennial housing and population survey data for census years 1990 and 2000 are used. A problem with using data from different census years is that census geography changes between censuses: 1990 census tract boundaries are not exactly the same as 2000 census tract boundaries. This makes it difficult to compare property value changes across census years. To overcome this, I had to "cross-walk," or match, 1990 census tracts to 2000 census tracts. The cross-walk was conducted in GIS, using the overlay and spatial join tools to identify the spatial relationship of 1990 census tracts to 2000 census tracts. This process generated 1,081 census tracts, the same number as in the 2000 census. The cross-walk result is presented in Figure 1. In 1990, the State of Alabama had a total of 1,066 census tracts; some of these were split in the 2000 census, bringing the number of census tracts in 2000 to 1,081.

Data for all of Alabama's census tracts for the 1990 and 2000 census years are obtained from the census database available at http://factfinder.census.gov/. The data consist of census tract observations on median property values for owner-occupied housing, median rent for renter-occupied housing, and characteristics of housing units, such as year built and number of rooms. These are matched to socio-economic and demographic conditions of the census tracts, in particular the percentage of people below the poverty line, unemployment, age and racial composition, and educational attainment. Other data included in the study are the quality of the school district in the census tract and air pollution levels.

3.2. Measurement of School Quality and Air Pollution

The measurement of school quality follows the approach adopted by Brasington and Hite (2005), in which school-district quality is approximated by the proportion of ninth grade students passing a proficiency test.
School-district quality in this study is the percentage of ninth grade students passing the Stanford Achievement Test in year 2000, averaged over reading, mathematics, language and science. The Alabama Board of Education publishes these data for all school districts at each grade. Air pollution data for year 2000 are obtained from the national Toxic Release Inventory (TRI) of the Environmental Protection Agency (EPA). Total releases (air pollution) from factories in the state are merged with the other census tract data. To locate factories in relation to census tracts, their X-Y geographic coordinates were used to identify the census tract in which each factory falls. The total releases variable was then coded as a dummy variable: it equals one if total releases within the tract exceed the state mean level of pollution by more than one standard deviation, and zero otherwise. Approximately 18% of census tracts have pollution levels more than one standard deviation above the mean level of pollution in the State.

3.3. Subsidized properties

Data on subsidized properties (Section 8 and LIHTC) in the State of Alabama are obtained from the Department of Housing and Urban Development (HUD). There were 516 Section 8 properties and 636 LIHTC properties in Alabama as of year 2000. In order to capture the effect of subsidized housing on property values, GIS software is used to determine the location of Section 8 and LIHTC properties by census tract. A count of these properties in each census tract is used as the variable of interest. Alternatively, a dummy variable approach is used, in which the variable takes the value one if Section 8 or LIHTC properties are present and zero otherwise. The summary statistics indicate that 35% of census tracts have at least one LIHTC property and 36% of census tracts have at least one Section 8 property.
20% of census tracts have both LIHTC and Section 8 properties, while 48% have neither.

3.4. Neighborhood Demographics

The unit of observation is the census tract, which is assumed to constitute a neighborhood. There is no single accepted definition of a neighborhood, but many published studies have used the census tract as a good approximation. Others define a neighborhood as a borough, a number of city blocks, or simply a community in which there is face-to-face interaction among the people living within that spatial unit. Neighborhood characteristics included in the analysis are summarized in Tables 1.1 and 1.2.

A typical census tract in Alabama is about 67% white, 30% black, 1.6% Hispanic, and a little over 1% other ethnic origins. The data also show that some neighborhoods are homogenous in racial composition, such as 99% white or 99% black. On average, a neighborhood has 13.5% of its population aged 65 and above, and about 31% aged 21 years or less. The average household income in the dataset is $33,990 (in year 2000 dollars), while average census tract unemployment is 2.9%. In predominantly black neighborhoods the unemployment rate is almost twice that in predominantly white neighborhoods. Across all census tracts, the average percentage of the population below the poverty line is 17.5%. Disaggregated by race, the poverty rate is 27% in mainly black census tracts and 12% in predominantly white census tracts.

Table 1.1: Neighborhood characteristics and presence or absence of subsidized projects

                                      Sec8                  LIHTC
Characteristic                  Present    Absent     Present    Absent
Percent white pop.              62.1       69.5       61.8       69.5
Percent black pop.              35.2       27.6       35.5       27.5
Percent Hispanic pop.           1.7        1.6        1.7        1.6
Percent below poverty           18.7       16.8       19         16.7
Household income                32,695     34,674     31,705     35,199
Percent unemployed              3.14       2.85       3.20       2.83
Percent with college degrees    7.44       7.5        6.8        7.8

3.5. Comparing Neighborhood Characteristics by Federally Subsidized Programs

Table 1.1 above presents the average census tract characteristics by presence of each Federal program type. This cross-tabulation describes the characteristics of neighborhoods where Section 8 and LIHTC properties are present versus absent. The table shows that subsidized properties are more likely to be found in predominantly black neighborhoods than in predominantly white neighborhoods. These projects are also more likely to be located in areas of high poverty, high unemployment, and lower-than-average incomes.

Table 1.2: Variable description.

Variable       Description
Value          Median property value (owner-occupied)
Rent           Median rent (renter-occupied)
Age_oc         Median age (owner-occupied, in years)
Age_ro         Median age (renter-occupied, in years)
Rooms          Median number of rooms
White          Percent white pop. in census tract
Black          Percent black pop. in census tract
Age_under21    Percent pop. 21 years and under
Age65_up       Percent pop. aged 65 or more
HH_size        Average household size
HH_income      Median household income
Mean_TT        Mean travel time to work (mins)
PctColDeg      Percent with college degrees
PctUnemp       Percent unemployed
PctBPov        Percent below poverty line
Sch_Quality    Percent of ninth graders passing SAT
LIHTC_D        Dummy = 1 if LIHTC properties present
LIHTC_#        Number of LIHTC properties
Sec8_D         Dummy = 1 if Section 8 properties present
Sec8_#         Number of Section 8 properties
AirPoll        Dummy = 1 if pollution > 1 std. dev. above mean

Table 1.3: Summary statistics of the data.

Variable       Min        Mean       Max
Value          $14,400    $78,423    $593,800
Rent           $99        $331       $1,964
Age_oc         3          27.58      61
Age_ro         1          28.5       61
Rooms          2.6        4.6        9.1
White          0          66.8       99.2
Black          0          30.3       99.3
Age_under21    11.1       31.2       85.1
Age65_up       0          13.5       35.7
HH_size        1.46       2.5        3.5
HH_income      $5,412     $33,990    $143,968
Mean_TT        10.56      24.7       54.3
PctColDeg      0          7.5        34.8
PctUnemp       0          2.9        31.7
PctBPov        0          17.5       77
Sch_Quality    30.75      51.6       84.5
LIHTC_D        0          .35        1
LIHTC_#        0          .68        9
Sec8_D         0          .36        1
Sec8_#         0          .62        7
AirPoll        0          .18        1

Figure 1. Crosswalk of 1990 to 2000 census tracts.

3.6. Analytical Methods

Spatial hedonic methods are used to analyze the impact of the Section 8 and Section 42 Low-Income Housing Tax Credit programs on property values in Alabama. This study is performed at the census tract level. It seeks to avoid some of the pitfalls of previous studies by employing a quasi-natural-experiment technique that incorporates difference-in-difference analysis.

Another crucial aspect of the relationship between public housing and property values is the issue of spatial dependence. The current study builds on the innovations of Galster et al. (1999), whose study of the effects of Section 8 certificates on property values used a spatial fixed effects model to capture proximity and trend effects. The econometric and spatial techniques incorporated into the current study help to delineate the causality between public housing and property values, while controlling for other effects on property values.

3.6.1. Contiguity Weight Matrices

For this study, the weights matrix is based on polygon features (census tracts), and this requires the use of a contiguity weight matrix to perform spatial econometric estimation. There are three possible types of contiguity relationships: rook, queen and bishop. Contiguity may also be of the first order, in which elements that share common borders are neighbors, or of the second order, in which both immediate and adjoining elements are neighbors. The following illustrations by Anselin (1988) distinguish the three possible types of contiguity relationships.

[Illustration: A: Rook's contiguity; B: Bishop's contiguity; C: Queen's contiguity]

Each element of the weights matrix is computed as follows (Anselin, 1988).

\[ w_{ij} = \frac{C_{ij}}{\sum_{j=1}^{N} C_{ij}} \qquad (1) \]

where \(C_{ij} = 1\) if i and j are contiguous and \(C_{ij} = 0\) if i and j are not contiguous. Equation (1) implies that the weight matrix is row-standardized such that the sum of elements in each row is one.
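To make the row-standardization concrete, the following is a minimal sketch, assuming a hypothetical 3-by-3 grid of square "tracts" in place of the actual Alabama tract polygons. The function names `rook_contiguity` and `row_standardize` are illustrative and are not taken from the study. Each binary contiguity indicator is divided by the sum of its row, so every row of the resulting matrix sums to one.

```python
def rook_contiguity(n_rows, n_cols):
    """Binary rook contiguity matrix C for the cells of an n_rows x n_cols grid.

    C[i][j] = 1 when cells i and j share an edge, and 0 otherwise.
    The grid is a hypothetical stand-in for census tract polygons.
    """
    n = n_rows * n_cols
    C = [[0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            # Rook neighbors: up, down, left, right (no diagonals).
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    C[i][rr * n_cols + cc] = 1
    return C


def row_standardize(C):
    """Divide each row of C by its row sum so that every row sums to one."""
    W = []
    for row in C:
        total = sum(row)
        W.append([c / total if total else 0.0 for c in row])
    return W


C = rook_contiguity(3, 3)
W = row_standardize(C)
```

In this sketch the center cell has four rook neighbors, each receiving a weight of 0.25, while a corner cell has two neighbors with a weight of 0.5 each. Diagonal cells, which would count under bishop or queen contiguity, receive zero weight.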
The weight matrix used in the analysis is based on rook's contiguity, so that census tracts that share borders (edges) are treated as neighbors and are assigned a value of one, and zero otherwise. Rook's contiguity is chosen because it provides a good description of neighbors without introducing unnecessary complexity.

3.6.2. Model specification and estimation

The theoretical OLS and spatial econometric models are described by the following equations.

\[ y = X\beta + \varepsilon \qquad (2) \]
\[ y = \rho W y + X\beta + \varepsilon \qquad (3) \]
\[ y = X\beta + u \qquad (4) \]
\[ u = \lambda W u + \varepsilon \qquad (5) \]

Here \(y\) is the vector of median house values, \(X\) is the matrix of explanatory variables, \(W\) is the spatial weights matrix, \(\rho\) and \(\lambda\) are the spatial lag and spatial error parameters, and \(\varepsilon\) is the error term. Equation (2) represents the traditional OLS or hedonic price model (HPM), which expresses the median house value as a function of characteristics of the house, location and neighborhood characteristics. Model (3) is the spatial autoregressive, or spatial lag, model. It relates median property value to its spatial lag (\(Wy\)) as well as to other factors typically included in a hedonic price model. Equations (4) and (5) constitute the spatial error model. The empirical model estimated takes the form given below. The model is estimated using the median value of owner-occupied housing or the median rent for renter-occupied housing as alternate measures of the dependent variable. I estimate and compare the results from three different model specifications, namely the OLS, spatial lag, and spatial error models.

(6) (7)

3.6.3. Moran's I test of spatial autocorrelation

Moran's I provides an estimate of the extent of spatial autocorrelation in the data. This gives an indication of whether using OLS might bias the estimates. The global Moran's I statistic is given by

\[ I = \frac{N}{\sum_{i}\sum_{j} w_{ij}} \cdot \frac{\sum_{i}\sum_{j} w_{ij}\,(y_i - \bar{y})(y_j - \bar{y})}{\sum_{i} (y_i - \bar{y})^2} \qquad (8) \]

where N is the total number of objects under consideration, \(w_{ij}\) is the weight for neighbors i and j, \(y_i\) and \(y_j\) are the observed values for i and j, and \(\bar{y}\) is the overall mean of the values. The estimate of Moran's I is 0.208, which is highly significant (p-value < 0.0001), as we would expect in housing studies. This estimate shows that there is positive spatial autocorrelation in house prices.
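As an illustration of how the global Moran's I statistic is computed, here is a minimal sketch using four hypothetical tracts arranged in a line, with row-standardized weights between adjacent tracts. The function name `morans_i` and the toy values are assumptions for illustration only; they are not the study's data or software.

```python
def morans_i(y, W):
    """Global Moran's I for values y and a spatial weights matrix W.

    W is a list of lists with W[i][j] the weight between units i and j.
    """
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    s0 = sum(sum(row) for row in W)  # sum of all weights
    # Cross-products of deviations, weighted by spatial proximity.
    num = sum(W[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


# Hypothetical chain of four tracts: 0 - 1 - 2 - 3, row-standardized.
W_chain = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]

smooth = morans_i([1, 2, 3, 4], W_chain)       # similar values are adjacent
alternating = morans_i([1, 4, 1, 4], W_chain)  # dissimilar values are adjacent
```

With these toy values, the smoothly increasing series gives a positive statistic (0.4) and the alternating series gives a negative one (-1.0), which matches the interpretation of positive spatial autocorrelation as neighboring units having similar values.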
Moreover, the estimates of rho and lambda in the spatial lag and spatial error models, respectively, are statistically significant and positive, a further indication of spatial dependence and spatial autocorrelation in house prices (LeSage, 1997; Anselin and Bera, 1998). This spatial dependence must be taken into consideration lest the estimates be biased. In the presence of spatial autocorrelation, therefore, the spatial models are preferred to OLS.

4. Results and Discussion

Table 1.4 presents the results of the OLS and spatial econometric model estimations.[2] Most of the estimated coefficients are statistically significant and almost all have the expected signs. The explanatory power improves as one moves from OLS estimation to spatial econometric estimation. The adjusted R-square from the OLS model is 0.65, which indicates that the explanatory variables, taken together, account for 65% of the variability in rents. The adjusted R-square from the spatial lag model is 0.66 and that of the spatial error model is 0.71, indicating the contribution of spatial information to explaining the variation in house rents. In other words, by accounting for spatial information, we are better able to explain the variation in house rents or property values. The second reason for using the spatial models is that we are able to correct for bias due to spatial dependence and heterogeneity. Comparing the parameter estimates of the OLS and spatial models, it can be seen that the OLS estimates generally have an upward bias in absolute terms. For instance, the OLS estimate of the effect of LIHTC is -0.054, while that from the spatial lag model is -0.037. Thus, using the OLS estimates for policy-making would tend to overstate the impact of the explanatory variables on property values. For policy-making purposes, we want estimates of the impacts that are free of such biases, so the spatial model estimates are preferable.
[2] Results presented in Table 1.4 are based on the model with log(median rent) as the dependent variable. Results using log(median value) are not shown but are quite comparable to those in Table 1.4.

Table 1.4: Comparative estimates of HPM and spatial models. Dependent variable = log(rent)

                      OLS                    Spatial lag            SEM
Variable        Estimate     t-ratio    Estimate     t-ratio    Estimate     t-ratio
LIHTC_dum       -0.054***    -3.410     -0.037**     -2.532     -0.034**     -2.326
Sec8_dum         0.039**      2.500      0.030**      2.086      0.018        1.268
Mean_TT         -0.011***    -7.340     -0.009***    -6.989     -0.009***    -5.874
Pct_colldeg      0.019***     8.670      0.013***     9.279      0.017***     7.602
Pct_bpoverty    -0.006***    -3.370     -0.006***    -5.971     -0.005***    -3.838
HH_inc(log)      0.377***     6.210      0.259***    33.811      0.262***     5.106
Med_Yrblt       -0.002***    -2.810     -0.003***    -3.229     -0.004***    -4.985
Med_rooms        0.047***     2.940      0.066***     4.524      0.084***     5.812
Sch_quality      0.005***     4.330      0.0008       0.386      0.0008       0.749
Pct_white       -0.004**     -2.030     -0.003       -1.430     -0.004**     -2.096
Pct_black       -0.003       -1.27      -0.002       -0.920     -0.004**     -2.072
Pct_under21      0.006***     3.120      0.003*       1.777      0.004**      2.092
Pct_65up        -0.007***    -3.080     -0.006**     -2.911     -0.006***    -2.836
Ave_hh_sz       -0.120**     -2.340     -0.096**     -2.461     -0.111**     -2.245
Air_poll        -0.038**     -2.060     -0.026       -1.503     -0.002       -0.119
Constant         2.259***     3.610      1.376***     5.971      3.518***     7.114
Rho              ---          ---        0.367***    13.236      ---          ---
Lambda           ---          ---        ---          ---        0.522***    66.306
R-sq             .65                     0.66                    0.71
AIC              -58.77                  -946.904                -939.464
N                1082                    1082                    1082

***, **, * denote significance at the 1%, 5% and 10% levels, respectively. Moran I = .208, statistic = 12.13, p-value = .00000

4.1. The effects of LIHTC and Section 8 projects on property values

This section and the next three discuss the baseline results of the analysis. Sections 5 and 6 discuss how the study avoids the potential problem of confusing association with cause by addressing the issue of causality between subsidized housing and property values.
There are two ways of measuring the effect of public projects on neighborhood property values in a regression framework. First, we can measure this effect using the median rent as the dependent variable in the case of rental housing. Second, we can use assessed property values of owner-occupied housing as the dependent variable. I have estimated the models using both measures of property values. The dataset includes observations on owner-occupied as well as renter-occupied housing, allowing estimation based on either. The dependent variable in the estimated models reported here is the log of house rent, as a measure of property values; that is, the results presented in Table 1.4 are based on renter-occupied housing. The results are fairly similar when the analysis uses owner-occupied housing data with assessed property values as the dependent variable.

In most empirical specifications, having a dummy explanatory variable in a semi-log model requires the following transformation.

\[ g = e^{\hat{\beta}} - 1 \qquad (9) \]

where \(\hat{\beta}\) is the estimated coefficient on the dummy variable and \(g\) is the proportional effect of the dummy on the dependent variable. I apply this correction to interpret the coefficients of the model where the predictor variable is a dummy. The most important variables of interest are LIHTC_Dum and Sec8_Dum. These dummy variables indicate the presence or absence of either type of subsidized project in a census tract. For LIHTC_Dum, the estimate is -0.037, which, when transformed by the above formula, yields a marginal effect of the presence of an LIHTC project on median rent of -0.036. Thus, the estimated effect of LIHTC is to decrease median rent by 3.6%. Put differently, census tracts that have LIHTC projects have a median rent that is 3.6% lower than census tracts where there is no LIHTC project. This finding is consistent with the results of the study by Lee, Culhane and Wachter (1999). Contrary to the effect of LIHTC, the transformed estimate of Sec8_Dum is positive (0.030), which is to say that rents are about 3% higher in census tracts with Section 8 projects than in census tracts with none. Thus, I find opposite impacts of LIHTC and Section 8 projects on property values in Alabama. The causality argument concerning the effect of LIHTC and Section 8 on property values is addressed in sections 5 and 6, which deal with difference-in-difference modeling and propensity score analysis.

4.2. Property values and housing characteristics

Most hedonic studies include the characteristics of the property because they tend to be capitalized into the overall price of the property. A hedonic price model expresses the price of a property as a function of its characteristics, as well as of environmental and other neighborhood qualities (Bartik, 1987; Hite, 2001). This is based on the assumption that households maximize their utility by choosing a bundle of characteristics of a house and other location-specific characteristics. In the spirit of hedonic modeling, this study includes two characteristics of properties: median number of rooms by census tract, and median age of property. Both of these variables have the expected signs, as seen in Table 1.4. The two coefficient estimates are highly significant, both statistically and economically. The semi-elasticity of rent with respect to median number of rooms is 0.047 in the HPM estimates and 0.066 in the spatial lag model. The estimates also show that older houses are valued less than newer ones, as we would expect. The semi-elasticity of house rent with respect to median age is -0.003.

4.3. The effect of ethnicity, poverty, and other demographic factors

Why might ethnicity be related to property values? Researchers have long investigated whether the entry of minorities (especially blacks) into previously all-white neighborhoods caused property values to decline.
Downs (1960), commenting on the relationship between race and property values, concluded that non-whites (blacks) entering an all-white neighborhood did not lead to declining property values, and may very well cause property values to rise in comparison to neighborhoods that remained all-white. In a study conducted in seven cities, including San Francisco, Oakland and Philadelphia, property values in twenty test neighborhoods were compared to those in nineteen control neighborhoods using sales of single-family residences from 1949 to 1955. Ratios of price changes between "test" and "control" neighborhoods before and after the entry of non-whites were compared. These ratios generally increased in the test neighborhoods, indicating a positive impact of the entry of non-whites (mainly black). People had previously believed that the entry of non-whites into all-white neighborhoods depressed property values for two reasons. First, entry supposedly converted neighborhoods from low-density, high-maintenance to high-density, low-maintenance neighborhoods, leading to deterioration of properties and declining values. Second, the entry of non-whites created panic among whites, leading to "white flight." As whites left, many properties were put on the market at once, flooding the market and causing property values to decline.

This paper controls for the effect of ethnicity by including in the models the percentage of the census tract population that is white or black. The estimated coefficients on both race variables (pct_white and pct_black) are negative, and their magnitudes are roughly the same. A higher percentage of either white or black population in a census tract is negatively associated with property values. This may be an indication that ethnic diversity is good for property values; that is, property values appear to be higher in ethnically diverse communities than in ethnically homogenous ones.
The linkage between poverty and property values seems evident: poor neighborhoods lack essential amenities like transportation, good schools, parks, and health facilities. Consequently, the property market is depressed in this type of neighborhood because people do not want to live there. We should therefore expect a negative relationship between the poverty rate and property values. This study finds that a one percentage point increase in the poverty rate in a census tract is associated with a 0.6% decline in rental values, other factors remaining the same. In a separate regression analysis, the interaction between the poverty rate and the presence of a public project (LIHTC) has a negative effect on property values. This reaffirms the findings of Carter et al. (1998) that the concentration of racial minorities in public housing projects, mainly in central cities, has tended to depress property values in these inner cities by increasing neighborhood poverty. In a related study, Massey and Kanaiaupuni (1993) found that public housing projects had been targeted to poor, black neighborhoods, thus substantially increasing the concentration of poverty, which may in turn lead to declining property values.

Other demographic factors included in this study are the percent of population with college degrees (pct_colldeg), percent of population aged 65 and older (pct_65up), percent of population under 21 (pct_under21), household income (HH_inc), average household size (Ave_hh_sz), and mean travel time (mean_TT, a proxy for distance to the CBD). Income, college degrees and population under 21 are positively related to property values in the census tracts, and these effects are all significant. The estimated elasticity of housing rents with respect to household income is 0.26, and the semi-elasticity with respect to pct_colldeg is 0.013. Mean travel time to work is negatively associated with rental values, as are average household size and the percent of population aged 65 and above.
4.4. Environmental quality, school quality, and property values

Hedonic models have been widely used to estimate the marginal implicit prices that households are willing to pay for environmental and other neighborhood characteristics such as air quality, water quality, school quality, and distance from a disamenity. Examples of environmental studies looking at the impacts of landfills, air quality, and other disamenities on property values include Nelson et al. (1992), Kohlhase (1991), Nelson (1978), Hite et al. (2001), and Brasington and Hite (2005). Studies relating school quality to property values are equally numerous (Brasington, 1999; Brasington and Hite, 2001; Kane et al., 2006). Most school quality studies have found a positive impact on property values. What constitutes the quality of a school district varies across studies, but most use student performance on a standardized test. In line with these earlier studies, I estimate the effect of school quality using results on the Stanford Achievement Test (SAT), a standardized index of school district quality in the State of Alabama. I find that school district quality positively affects house rents, with a semi-elasticity of 0.005 in the OLS model. This estimate compares favorably with the results found by Brasington and Hite (2005) in Ohio, in which school quality had effects on property values of 0.002 in Akron, 0.008 in Cleveland, and 0.009 in Columbus.

If households value environmental quality, as earlier studies have found, then we expect them to be willing to pay a premium for a house located in an area of higher quality. We therefore expect that as the level of air pollution in a census tract increases, property values should decline.
This paper confirms that poor air quality negatively affects housing values in Alabama: for census tracts with a level of air pollution more than one standard deviation above the mean level in the state, housing rents are lower by about 3.8%, based on the OLS estimate.

5. Difference-in-Difference Model

In order to capture whether there is a true causal effect of public housing on property values, a quasi difference-in-difference model is estimated that accounts for trends in property values over the decade 1990-2000. The problem is to estimate the true effect of public housing (treatment) on property values (outcome) between 1990 and 2000. The difference-in-difference model can be expressed as

\[ Y_i = \alpha + \beta T_i + \gamma t_i + \delta\,(T_i \cdot t_i) + \varepsilon_i \qquad (10) \]

\(T_i\) takes values (0, 1), where 0 indicates the control group (census tracts where there is no public housing) and 1 indicates the treatment group (census tracts where there is public housing). We have observations on two time periods, 1990 and 2000, so that \(t_i = 0, 1\), where 0 indicates pre-treatment (1990) and 1 indicates post-treatment (2000). The pre-treatment period of 1990 is chosen because most of the public properties in the dataset were placed in service after 1990, whereas the post-treatment date of year 2000 was chosen on the basis of data availability. Every unit (census tract) i, for i = 1, 2, ..., N, is assumed to have two observations. We let \(\bar{Y}_0^T\) and \(\bar{Y}_1^T\) denote the sample averages of the outcome (property values) for the treatment group (census tracts with public housing) before and after treatment. Correspondingly, let \(\bar{Y}_0^C\) and \(\bar{Y}_1^C\) represent the averages of the outcome for the control group (census tracts without public housing) before and after treatment.
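The double difference built from these four group averages can be illustrated with a minimal sketch. It assumes the usual dummy-interaction form of the model (a constant, a treatment-group dummy, a time dummy, and their interaction); the names `alpha`, `beta`, `gamma`, `delta` and `did_estimate` are labels chosen here for illustration. The coefficient values are the LIHTC-only estimates reported in Table 1.5, used to back out the implied group means.

```python
def did_estimate(treat_pre, treat_post, control_pre, control_post):
    """Double difference of group means: (treated change) - (control change)."""
    return (treat_post - treat_pre) - (control_post - control_pre)


# LIHTC-only coefficients as reported in Table 1.5 (dollars).
# Assumed roles: constant, treatment-group effect, time trend, interaction.
alpha, beta, gamma, delta = 70_721, -7_493, 15_109, -1_480

# Group means implied by the assumed dummy-interaction regression.
control_pre = alpha
control_post = alpha + gamma
treat_pre = alpha + beta
treat_post = alpha + beta + gamma + delta

effect = did_estimate(treat_pre, treat_post, control_pre, control_post)
```

Under these assumptions the double difference recovers the interaction coefficient exactly: the permanent group difference and the common time trend both cancel, leaving only the change that is specific to treated tracts after treatment.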
The parameters in equation (10) can be interpreted as follows:

\(\alpha\) = constant term
\(\beta\) = treatment group specific effect (accounts for permanent differences between treatment and control)
\(\gamma\) = time trend common to control and treatment groups
\(\delta\) = true effect of treatment

The model assumes that there is no specification error and that the error has mean zero and is uncorrelated with the other variables. Under these assumptions, we can derive the expected values from the outcome equation as follows:

\[ E[Y_i \mid T_i = 0, t_i = 0] = \alpha, \quad E[Y_i \mid T_i = 0, t_i = 1] = \alpha + \gamma, \]
\[ E[Y_i \mid T_i = 1, t_i = 0] = \alpha + \beta, \quad E[Y_i \mid T_i = 1, t_i = 1] = \alpha + \beta + \gamma + \delta \qquad (11) \]

Estimators for simple pre- versus post-treatment comparisons of the average outcome \(Y_i\) for the treatment group are derived as follows.

Treatment group:

\[ \hat{\delta}_1 = \bar{Y}_1^T - \bar{Y}_0^T \]

The expected value of this estimator:

\[ E[\hat{\delta}_1] = (\alpha + \beta + \gamma + \delta) - (\alpha + \beta) = \gamma + \delta \qquad (12) \]

Similar derivations give the estimators for the control group alone, as well as for the treatment versus control comparison. The difference-in-difference estimator is a double-difference estimator, defined as the difference in average outcome in the treatment group before and after treatment minus the difference in average outcome in the control group before and after treatment (Imbens and Wooldridge, 2007).

\[ \hat{\delta}_{DD} = (\bar{Y}_1^T - \bar{Y}_0^T) - (\bar{Y}_1^C - \bar{Y}_0^C) \qquad (13) \]

Taking the expected value yields

\[ E[\hat{\delta}_{DD}] = (\gamma + \delta) - \gamma = \delta \qquad (14) \]

Applying OLS estimation to equation (10) yields the estimates presented in Table 1.5. The model is fitted three times: first using only LIHTC as the treatment, second using only Section 8 as the treatment, and third using either LIHTC or Section 8 as the treatment. To understand the treatment effects of either LIHTC or Section 8, we must interpret both the sign and magnitude of the parameters \(\beta\), \(\gamma\), and \(\delta\) in Table 1.5. Using LIHTC only as treatment, both \(\beta\) and \(\delta\) have negative signs, indicating that LIHTC causes property values to decline. Interpreting the coefficient \(\beta\) as the difference due to the treatment effect of LIHTC means that, on average, census tracts that have LIHTC have property values lower than tracts with no LIHTC by $7,493.
This estimate is highly significant, both statistically and economically. The estimate of \(\delta\) is negative, but not statistically significant. The estimate of the trend term \(\gamma\) is positive and statistically significant, implying that property values were rising from 1990 to 2000. This allows us to isolate the trends in property value changes from the causal impacts of either LIHTC or Section 8. This is an innovation that overcomes a weakness of earlier studies of this subject, in which trends in property values were not accounted for.

Table 1.5: D-in-D model estimates of treatment effects. Outcome = property values ($)

Treatment            Parameter      Estimate    t-statistic    t-probability
LIHTC only           \(\alpha\)       70,721       48.29           .000
                     \(\beta\)        -7,493       -2.85           .004
                     \(\gamma\)       15,109        7.42           .000
                     \(\delta\)       -1,480        -.39           .695
Section 8 only       \(\alpha\)       67,174       46.02           .000
                     \(\beta\)         4,078        1.53           .126
                     \(\gamma\)       17,084        8.43           .000
                     \(\delta\)       -7,537       -1.96           .050
Either LIHTC/Sec8    \(\alpha\)       68,936       66.62           .000
                     \(\beta\)        -1,757        -.94           .348
                     \(\gamma\)       16,099       11.20           .000
                     \(\delta\)       -4,511       -1.68           .094

The estimate of \(\beta\) using Section 8 as the treatment is not statistically significant. The estimate of \(\delta\) is negative and statistically significant. To investigate the effect of either LIHTC or Section 8, I estimated equation (10) allowing for the presence of either of these two types of public properties in a census tract. The estimate of \(\beta\) is negative, albeit statistically insignificant. Moreover, the estimate of \(\delta\) is negative with borderline statistical significance. These findings provide evidence of a negative causal effect of LIHTC public housing on neighborhood property values, after controlling for time trends.

6. Propensity Score Analysis

The estimation of treatment effects from non-experimental or observational data is problematic because of self-selection. Unlike in randomized experiments, individual units may self-select into a program, and this tends to bias the estimates of the true effect of participating in the program.
A similar issue may arise in trying to ascertain the causal impact of subsidized housing on property values. Since the data are observational, the census tracts that have subsidized housing might have been chosen out of necessity to receive these Federal programs, which is by definition a selection mechanism. One way of dealing with this is the difference-in-difference estimator already described above. An alternative way of capturing the treatment (causal) effects of Section 8 and LIHTC is propensity score analysis, an effective strategy for correcting selection biases in the estimation of treatment effects. To carry out propensity score analysis (PSA), we assume a dichotomous treatment variable \(T_i\), an outcome variable \(Y_i\), and a set of covariates \(X_i\). In this paper, the treatment \(T_i\) takes the value one if a subsidized or public project is present in a census tract and zero otherwise. The outcome \(Y_i\) refers to the median property value or rent in the census tract. The covariates are characteristics of the tract, including socio-demographic factors, quality of the school district, and environmental quality.

The selection bias is described below following the Rubin Causal Model (Rubin, 1974). For observation unit i, the observed outcome \(Y_i\) can be expressed in the following way.

\[ Y_i = Y_{0i} + (Y_{1i} - Y_{0i})\,T_i \qquad (15) \]

where \((Y_{1i} - Y_{0i})\) is the causal effect of the treatment \(T_i\). In practice, however, direct computation of this treatment effect is not possible because either \(Y_{1i}\) or \(Y_{0i}\) is not observable. That is to say, an individual either receives the treatment, in which case we observe \(Y_{1i}\), or does not, in which case we observe \(Y_{0i}\). Thus, we can only estimate the treatment effect by comparing the outcomes from a treated sample to those from an untreated or "control" sample. In this spirit, the average sample treatment effect can be expressed as

\[ E[Y_i \mid T_i = 1] - E[Y_i \mid T_i = 0] = E[Y_{1i} - Y_{0i} \mid T_i = 1] + \left\{ E[Y_{0i} \mid T_i = 1] - E[Y_{0i} \mid T_i = 0] \right\} \qquad (16) \]

where E[.] denotes the expectation operator.
The first term on the right is the average causal effect of the treatment we are interested in: the difference between the average outcome of those receiving the treatment and the average outcome they would have had without it. The second term on the right can be interpreted as the bias due to self-selection. If the assignment of the treatment were random, the second term would be zero. In using propensity score matching, we hope to eliminate this bias by computing the probability of receiving the treatment conditional on the characteristics of the individual observational units. The propensity score is then the conditional probability of a unit receiving the treatment given the covariates, expressed as

$$p(X_i) = p(T_i = 1 \mid X_i) \tag{17}$$

In estimating treatment effects, two assumptions are made (Imbens and Wooldridge, 2009): (1) unconfoundedness (conditional independence) and (2) covariate overlap.

The unconfoundedness assumption implies that the probabilities of assignment to treatment do not depend on the outcomes, conditional on observed covariates. That is,

$$T_i \perp (Y_{0i}, Y_{1i}) \mid X_i \tag{18}$$

where $A \perp B \mid C$ denotes the conditional independence of $A$ and $B$ given $C$. By way of interpretation, the probability of finding a subsidized property in a census tract does not depend on property values in that tract, given the characteristics of the tract.

Imbens and Wooldridge (2009) define covariate overlap as

$$0 < p(T_i = 1 \mid X_i) < 1 \tag{19}$$

where $p$ denotes probability. This assumption implies that the covariate distribution in the treated group overlaps that in the untreated group. Put simply, the characteristics of the treated and untreated should be sufficiently similar. Whether the overlap condition is met can be checked by estimating the propensity scores in a logistic regression: the predicted probabilities (propensity scores), given the covariates, must lie strictly between zero and one, with treated and untreated units covering a common range. If these two assumptions hold, propensity score matching can approximate a "randomized" assignment of treatments.

Conducting a PSM entails two stages. In stage one, we model the probability of receiving the treatment, conditional on the characteristics (covariates) of the individual unit, as stated in equation (17). The estimated (predicted) probability from stage one constitutes the propensity score for the ith entity, conditional on its covariates. Typically, stage one is estimated as a logistic regression, although other methods such as classification trees and discriminant analysis have also been used effectively.

Stage two of PSM involves sorting the individual observational units into a relatively small number of strata, such that each stratum is relatively homogeneous in terms of propensity scores. Since the members of each stratum have nearly the same propensity scores, it follows that they should have similar covariate distributions. If two entities did not have similar covariates, they could not possibly have similar propensity scores, or be in the same stratum for that matter. If the covariate distribution is similar within a stratum, the treatment assignment in that stratum can be viewed as "random" insofar as whether a unit is treated or not is independent of its characteristics (covariates).

In each stratum, the treatment group is compared with the non-treated group on the outcome measure of interest. That is, we compare the mean outcomes of the treated and untreated in each stratum. The average treatment effect across strata is called the Direct Adjustment Estimator (DAE), computed as the average of the mean differences across strata. The unadjusted treatment effect is computed from the full sample as the difference between the raw mean outcomes of the treated and the untreated.
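As a minimal sketch of the stratified comparison just described, the Python fragment below averages the within-stratum differences in mean rent using the stratum means reported in Table 1.6a (treatment = LIHTC). The function name is hypothetical, and the equal weighting of strata is an assumption; it is chosen here because the strata are of roughly equal size and because it reproduces the DAE reported in that table.

```python
# Stratum-level mean rents copied from Table 1.6a (treatment = LIHTC).
# Each tuple is (mean rent in LIHTC tracts, mean rent in non-LIHTC tracts).
strata_means = [
    (297.321, 326.504),
    (350.704, 359.696),
    (299.173, 376.361),
    (321.188, 334.251),
    (283.514, 319.250),
]


def direct_adjustment_estimate(means):
    """Unweighted average of treated-minus-untreated differences across strata.

    Equal weights are an assumption of this sketch (strata of similar size).
    """
    diffs = [treated - untreated for treated, untreated in means]
    return sum(diffs) / len(diffs)


dae = direct_adjustment_estimate(strata_means)

# Raw (unadjusted) sample means reported alongside Table 1.6a.
unadjusted = 309.10 - 344.15

print(round(dae, 2))         # -32.83
print(round(unadjusted, 2))  # -35.05
```

With strata of unequal size, a size-weighted average would be the natural alternative; the unweighted version is shown only because it matches the figures in the table.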
The difference between the DAE and the unadjusted treatment effect is the selection bias.

6.1. PSM adjustment for sample selection

Following Cochran (1968) and Rosenbaum and Rubin (1983), I group the propensity scores into five strata of roughly equal size, presented in Tables 1.6a and 1.6b. The sample has 1,081 observations, and thus the average number of elements in a stratum is about 216. Table 1.6a uses the presence of LIHTC as the treatment, while Table 1.6b uses Section 8 as the treatment. Within each stratum, I compute the counts of treated and untreated tracts (i.e. presence or absence of either LIHTC or Section 8) and the corresponding mean outcome (median rent in the census tract).

The unadjusted (raw) mean rent for census tracts that have a LIHTC property is $309, and for those with no LIHTC property, $344, implying a difference of -$35. Thus, the raw treatment effect of LIHTC is to lower average rent by $35. From Table 1.6a, the mean difference accounting for selection bias, or DAE estimate, is -$32.83, which is statistically significant (t = -2.70, p = 0.027). This implies that, after adjusting for sample selection bias, census tracts with LIHTC properties have a mean rent $32.83 lower than tracts with similar characteristics but no LIHTC properties.

Similarly, census tracts that have Section 8 properties have an unadjusted (raw) mean rent of $331.29 and those with no Section 8 properties have a mean rent of $331.95. The difference is very small (-$0.66), indicating that the unadjusted estimate of the treatment effect of Section 8 is negligible. From Table 1.6b, when we adjust for selection bias, the DAE estimate is $5.72 but statistically insignificant, implying that the presence of Section 8 properties in a census tract does not necessarily have a causal effect on average rents.

7.0. Conclusion

Many authors have studied the effects of public housing on neighboring properties, but mainly in the Northeast.
Most of these studies focused on a single city or county. Generalizing the findings from such single-city or single-county studies to other parts of the country is problematic. First, cities and counties vary within a given state, let alone across the country as a whole. Secondly, property values vary markedly across counties within a state. Finally, even within the same city or county, property values follow trends over time. To overcome these weaknesses, this study presents results from a state-wide dataset for the State of Alabama. The study also incorporates property value trends over the decade from 1990 to 2000.

The paper also demonstrates that it is possible to uncover the causal impacts of public housing on property values by using difference-in-difference methods and propensity score matching. Further, I employ spatial methods in addition to OLS regressions to complement the quantitative techniques for evaluating the causal effects of two federal low-income housing schemes on property values.

The results show that Section 42 LIHTC projects cause property values to decline. This effect is found under three different methodological approaches, i.e. regression analysis, difference-in-difference, and propensity score matching. On the other hand, I find in the regression analysis that Section 8 projects have a positive effect on property values, but this is not the case in the difference-in-difference and propensity score analyses. Plausible explanations for the opposite effects of these two programs on property values include the management type, the style of the structure, and the state of maintenance. While Section 8 projects are typically managed by a housing authority, LIHTC projects tend to be privately managed.

Finally, the study explores the relationships between ethnicity, poverty, and property values. I find that high poverty rates within a neighborhood depress property values.
Contrary to popular belief, this study shows that ethnically diverse neighborhoods tend to have higher property values than homogeneous ones.

Table 1.6a: Propensity Score Matching, Treatment = LIHTC

| Propensity score range (stratum) | Count: LIHTC | Count: Non-LIHTC | Mean rent: LIHTC | Mean rent: Non-LIHTC | Difference |
|---|---|---|---|---|---|
| (.080, .265) | 56 | 159 | 297.321 | 326.504 | -29.18 |
| (.265, .331) | 71 | 145 | 350.704 | 359.696 | -8.99 |
| (.331, .392) | 69 | 147 | 299.173 | 376.361 | -77.19 |
| (.392, .455) | 85 | 131 | 321.188 | 334.251 | -13.07 |
| (.455, .620) | 103 | 116 | 283.514 | 319.25 | -35.74 |

Unadjusted means: mean rent for tracts with LIHTC = $309.10; mean rent for tracts with no LIHTC = $344.15; unadjusted treatment effect = -$35.05.
Adjusted mean (DAE) = -$32.83; one-sided t-test on this difference: t = -2.70, p-value = 0.027.

Table 1.6b: Propensity Score Matching, Treatment = Section 8

| Propensity score range (stratum) | Count: SEC8 | Count: Non-SEC8 | Mean rent: SEC8 | Mean rent: Non-SEC8 | Difference |
|---|---|---|---|---|---|
| (.080, .265) | 41 | 174 | 323.731 | 317.764 | 5.97 |
| (.265, .331) | 62 | 154 | 412.193 | 334.415 | 77.78 |
| (.331, .392) | 81 | 135 | 336.864 | 360.608 | -23.74 |
| (.392, .455) | 94 | 122 | 320.531 | 335.721 | -15.19 |
| (.455, .620) | 113 | 106 | 294.601 | 310.801 | -16.20 |

Unadjusted means: mean rent for tracts with Section 8 = $331.29; mean rent for tracts with no Section 8 = $331.95; unadjusted treatment effect = -$0.66.
Adjusted mean (DAE) = $5.72; one-sided t-test on this difference: t = 0.343, p-value = 0.374.

CHAPTER 2

The Impact of Section 8 Public Housing on Property Values in Fulton County, Georgia

1. Introduction

Having looked at an aggregate analysis of the effects of subsidized housing in chapter one, I now turn attention to a disaggregated analysis, this time focusing on the impact of Section 8 projects on single family home sales in Fulton County, Georgia.
Many property owners resist the idea of public housing in their neighborhood, no matter what type of program it is, and there are different federal programs giving rise to different kinds of public or affordable housing.[3] Public, subsidized, and affordable housing has often been viewed as a bad that should be kept at bay. This "Not In My Backyard" or NIMBY mentality stems from the belief that public housing has a negative externality effect on neighboring properties. For most people, their home is their only investment, and anything that is likely to decrease the value of their property can provoke heated debate. Public housing may depress property values through a range of factors, such as poor quality of public housing, concentrated poverty, and perceived increases in crime. In all fairness, property owners object to anything that is likely to impact negatively on the value of their property, including landfills, power lines, community care facilities, and even churches (Green et al., 2002).

[3] The distinction between affordable, subsidized, and public housing depends on the ownership of the property. The common denominator in all these programs is that they provide affordable housing to low-income people. Public housing is typically owned by the government and may be managed by a housing authority or a private company on behalf of the housing authority, while subsidized housing may be privately managed. The Department of Housing and Urban Development (HUD) identifies about ten public housing programs managed by the PHA. Section 8 housing (now called Housing Choice Vouchers) is just one example of a public housing program.

But is this NIMBY opposition founded in reality, or is it based purely on fears and stereotypes? The research so far can, at best, be described as contradictory or inconclusive. The present study focuses on the impact of Section 8 multi-family dwelling units on neighboring property values in Fulton County, Georgia.
The primary source of data is the County Tax Assessor records of Fulton County, Georgia. In addition, I use micro-data from the decennial population and housing census on the socioeconomic conditions of the neighborhoods in which these properties are located.

Much controversy surrounds the impact of affordable or subsidized housing on property values. While many researchers have devoted time and resources to unraveling this relationship, the results so far have been contradictory and inconclusive. Most previous studies have been conducted in single cities or counties located in the Northeastern US, while a few have been conducted in the West and Midwest; to the best of my knowledge, such studies have not been conducted in the Southeast.

2. Literature Review

Previous studies fall into two main categories: those that find a positive effect of public housing on neighboring properties and those that find a negative relationship. A third group finds neither a positive nor a negative effect. The best-known studies reporting a positive effect are those of Nourse (1963), De Salvo (1974), Rabiega, Lin and Robinson (1984), and Warren, Aduddell and Tatlovich (1983). Among the studies reporting a negative association are Lee, Culhane and Wachter (1999), Lyons and Loveridge (1993), Goetz, Lam, and Heitlinger (1996), and Cummings and Landis (1993).

Lyons and Loveridge (1993) investigate whether the presence of subsidized housing leads to negative externalities on neighboring properties in Minnesota. Their models are constructed to capture the effects at five distance radii, namely 300 feet, a quarter mile, a half mile, one mile, and two miles. They found a small and statistically significant negative effect of subsidized housing on neighboring properties, and that the effect diminishes further away from the subsidized housing.
In particular, they stressed that "...adding one subsidized unit within a quarter mile radius of a house has the same dollar impact on that house's value as removing half a square foot of its living space."

Galster, Tatian and Smith (1999) studied the impact of neighbors who use Section 8 Certificates on property values and found mixed results. "If only a few Section 8 sites were located within 500 feet [of a property], we found a strong positive impact on property values in higher-valued, real appreciation, predominantly white census tracts" (Galster et al., 1999). On the other hand, in low-valued, higher-density census tracts with declining property values, they found that Section 8 developments had an adverse effect on property values within 2,000 feet, with this effect diminishing after 500 feet. This result indicates that Section 8 developments do not necessarily depress property values; if anything, these developments seem to accentuate the trend in property values that existed before the developments were established. So, if Section 8 properties are sited in blighted neighborhoods, they seem to have a depressing effect, but if sited in wealthy suburban neighborhoods they seem to have a positive effect. If we control for trends in property values before and after establishment (as in Galster et al.), it should then be the case that these developments have a neutral effect in aggregate.

One of the criticisms leveled against previous studies is that they have relied on cross-sectional analysis in which the property value impacts are modeled as if they were a one-time effect. In reviewing the literature on this subject, Nguyen (2005) observes two "waves" of studies. In the first, researchers used a "test versus control" methodology in which the property values of a neighborhood containing public housing are compared to those of another neighborhood with similar characteristics but no public housing.
The shortcoming of the cross-sectional studies is that they fail to capture any trends in property values that existed before the siting of public property. The second "wave" of studies utilizes multiple regression techniques (hedonic models) that also incorporate advances in GIS-enabled spatial analysis.[4]

[4] GIS proximity analysis tools are used to construct distance measures to assess the impact of public housing on nearby properties. In some of the studies reviewed, GIS software has been used to construct rings of differing radii around transacted houses to estimate the effect of public housing within those distance rings. Galster, Tatian, and Smith (1999), Santiago, Galster, and Tatian (2001), Ellen et al. (2001, 2002), and Schwartz et al. (2003) utilized this technique to study the effect of subsidized housing projects.

2.1. Section 8 Certificates and Voucher Program

Section 8, or the Housing Choice Voucher program, refers to a federally subsidized program enacted under the U.S. Housing and Community Development Act of 1974 to help low-income families and individuals find affordable housing. Under the Section 8 program, tenants pay about 30% of their income toward rent, while the remainder is paid with federal money (the voucher). Section 8 initially consisted of three programs: New Construction, Substantial Rehabilitation, and Existing Housing Certificates. Later, some programs were phased out and new ones added. Currently, Section 8 consists essentially of the voucher program, which is either "project-based" or "tenant-based." With a project-based voucher, a person is limited to specified apartment complexes, which may be administered by a Public Housing Authority (PHA), while a tenant-based voucher provides the freedom to move to any state and to any apartment where Section 8 vouchers are accepted.

3. Data and Methods

3.1. Data

The data on property transactions were purchased from the Fulton County tax assessor. They cover properties that transacted within fiscal year 2007.
Only single family units are considered; properties in non-residential uses are excluded from the analyses. After further cleaning, there are 7,211 usable observations. One problem with these data is that they lack locational information: only street names were provided, with neither zip codes nor geographic coordinates. I therefore used the street names to locate the zip codes of the transacted properties in Google Earth. Then, using the zip codes together with the street names, the houses were geo-coded using GIS software. With the geographic coordinates of the transacted properties and the Section 8 properties determined, I created a layer for each and used these in GIS for proximity analysis.

The data on Section 8 multi-family housing were obtained from the US Department of Housing and Urban Development (HUD). There are 635 Section 8 properties located across Fulton County. The Section 8 data consist of the year that the property was placed in service, the number of units, and the XY-coordinates of the geographical location. Using the geographic coordinates, GIS software is used to locate these properties and find their spatial relationship with the transacted properties. Data on neighborhood characteristics, i.e. socio-demographic information at the census tract level, were gathered from the Census Bureau's database of the decennial population and housing census.

In order to carry out the analysis, all of the data on transacted properties have to be linked to neighborhood characteristics. Thus, a spatial join was performed to match these data using ArcGIS overlay tools. Proximity analysis to find the relationship between transacted properties and Section 8 properties was likewise done using ArcGIS tools. Two measures of proximity are used: a dummy indicating whether there is a Section 8 property within a given distance band, and a numerical count of Section 8 properties within that distance.
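The two proximity measures can be illustrated outside of ArcGIS with a short, hypothetical sketch. It assumes that house and Section 8 locations are already expressed in a projected coordinate system measured in feet; the function name, the sample coordinates, and the use of straight-line distance are illustrative assumptions, not the ArcGIS workflow used in this study.

```python
import math

HALF_MILE_FT = 2640.0
ONE_MILE_FT = 5280.0


def section8_proximity(houses, sites, radius):
    """Return (dummies, counts) of Section 8 sites within `radius` of each house.

    `houses` and `sites` are lists of (x, y) pairs in the same planar units
    as `radius` (feet in this sketch).
    """
    dummies, counts = [], []
    for hx, hy in houses:
        # Count the sites whose straight-line distance falls inside the ring.
        n = sum(1 for sx, sy in sites if math.hypot(hx - sx, hy - sy) <= radius)
        counts.append(n)
        dummies.append(1 if n > 0 else 0)
    return dummies, counts


# Hypothetical coordinates (feet), for illustration only.
houses = [(0.0, 0.0), (10000.0, 0.0)]
sec8_sites = [(1000.0, 0.0), (0.0, 2000.0), (4000.0, 0.0)]

dummy_half, count_half = section8_proximity(houses, sec8_sites, HALF_MILE_FT)
dummy_one, count_one = section8_proximity(houses, sec8_sites, ONE_MILE_FT)

print(dummy_half, count_half)  # [1, 0] [2, 0]
print(dummy_one, count_one)    # [1, 0] [3, 0]
```

The dummy and the count are derived from the same distance test, so the four model specifications differ only in the radius passed in and in which of the two returned lists enters the regression.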
The buffer tool in ArcGIS is used to construct rings of differing radii around the transacted properties so that the number of Section 8 properties falling within each ring can be calculated. These rings were constructed at two different radii, a half mile and one mile.

Summary statistics of the variables used in the analysis are presented in Table 2.1. The average sale price of a house in the data is $272,207. The standard deviation of sale price is very high, at $483,881, indicating a very wide distribution of house prices in the data. The minimum sale price is $1,000 and the maximum is $10 million. Some of the properties at the lower end of the price distribution might have been dilapidated structures that lacked maintenance, or had been foreclosed upon and resold at auction prices. The descriptive statistics indicate that 42% of all the transacted properties had at least one Section 8 public housing property located within a half mile. Similar rings constructed at the one-mile radius show that 80% of the transacted properties had at least one Section 8 public housing property in their proximity.

It is customary in hedonic studies to include measures of property size because, as we would expect, larger properties fetch higher prices than smaller properties, ceteris paribus. In this spirit, the measures of house size included are lot size, number of rooms, number of baths, and number of stories. The average number of rooms in the data is 6.37 with a standard deviation of 1.72, while the average lot size is 12,089 square feet.

In hedonic analysis, it is equally important to include variables that capture location or neighborhood factors that may affect property values. Research has shown that neighborhood factors like incomes and poverty status tend to be capitalized into house values.
To account for these effects, I include socio-demographic factors measured at the census tract level to isolate the impact of location from property-specific factors. This also helps to identify the causal effect of the variable of interest: Section 8 properties within half-mile and one-mile radii. I also include the racial composition of the neighborhood, the age composition, the average household size, and a measure of population density in the analysis. The summary statistics of these variables show that the average census tract population is 68% black, 20% white, and 3% Hispanic. The average household size is 2.59 and the percentage of the population aged 65 and above is 10%. On average, 29% of the population in each census tract is 21 years old or below.

3.1.1. Study Area

The study area is Fulton County, Georgia. Figure 2.1 depicts the map of Fulton County displaying transacted properties and Section 8 public housing. The demographic data obtained from the Census Bureau show that in 2009, the population of Fulton County was 1,033,756. The median property value (owner-occupied) in Fulton County is $180,700 and the average commute time to the Central Business District in Atlanta is 29 minutes. The county is relatively heterogeneous in terms of race: 42.9% white, 43.1% black, and 14% other races. The median household income in 2008 was $62,682 and the poverty rate in the county is 14.9%.

Table 2.1: Summary statistics of variables (percent variables are expressed as proportions)

| Variable | Mean | Standard Deviation |
|---|---|---|
| Sale price | $272,207 | $483,881 |
| Rooms | 6.37 | 1.72 |
| Square footage of lot | 12,089 | 11,360 |
| Baths | 1.64 | 0.88 |
| Stories | 1.18 | 0.38 |
| Age of structure (years) | 53.54 | 48.88 |
| Attic | 1.26 | 0.78 |
| Section 8 dummy | 0.42 | 0.49 |
| Section 8 count | 1.185 | 0.85 |
| Percent white | 0.20 | 0.29 |
| Percent black | 0.68 | 0.33 |
| Percent Hispanic | 0.03 | 0.04 |
| Percent aged 65+ | 0.10 | 0.05 |
| Percent aged 21 or below | 0.29 | 0.10 |
| Average household size | 2.59 | 0.46 |
| Pop Density | 4,156 | 1,927 |

3.2. Analytical Methods

Housing price determination, much like that for other goods, involves analyzing what consumers are willing to pay for a bundle of house characteristics. This process, known as hedonic price modeling, originated with Rosen (1974) and has gained widespread acceptance in the economics profession as a tool for predicting housing prices, given the structural and neighborhood characteristics of the house. Lately, the need to exploit information contained in spatial relationships has been gaining prominence in the field of spatial econometrics.[5]

[5] See Anselin (1988), LeSage (1999), and LeSage and Pace (2009) for a detailed treatment of spatial econometrics.

Figure 2.1: Map of Central Fulton County

3.2.1. Hedonic Price Model

The basic hedonic price model (HPM) expresses the price of a commodity as a function of the structural characteristics of the commodity,

$$y_i = f(x_i) + \varepsilon_i \tag{1}$$

where $y_i$ is house price, $x_i$ is a set of house characteristics, and $\varepsilon_i$ is the error term. Utility theory posits that individuals value a good based on a bundle of characteristics, each of which has an implicit price. The utility-maximizing household thus chooses the bundle of housing characteristics that maximizes its utility subject to a binding budget constraint. Consequently, the utility maximization problem facing the individual household can be expressed in the form

$$\max_{X,\,G} \; U(X, G) \quad \text{subject to} \quad P_x X + P_g G = Y \tag{2}$$

where $X$ is the vector of housing and neighborhood characteristics, $G$ is a vector of other goods, $P_x$ is the price vector of housing and neighborhood characteristics, $P_g$ is the price vector of other goods, and $Y$ denotes household income. The parameters of the HPM can be conceived of as implicit prices. Thus, the hedonic model for housing expresses the price of a house as a weighted sum of the implicit prices of its structural features,

$$P = X\beta + \varepsilon \tag{3}$$

where $P$ is price, and $\beta$
refers to the set of implicit prices to be estimated, and $X$ is the vector of observable house characteristics such as number of bathrooms, number of bedrooms, square footage, and lot size; it could also include locational factors like the neighborhood crime rate, air pollution, and the quality of the school district.

3.2.2. Spatial Dependence in House Prices

Following Tobler's (1979) first law of geography, "everything is related to everything else, but near things are more related than distant things," it stands to reason that adjacent houses exert influences on one another. Information contained in space is critical in spatial econometrics, much as temporal information is in time series econometrics. Two issues arise in estimation with spatial data: spatial dependence and spatial heterogeneity.

Spatial dependence is said to exist when there are spillover effects. Anselin (1988) defines spatial dependence as "the lack of independence in cross-sectional [spatial] observations." He identifies two factors that give rise to spatial dependence: measurement errors and spatial organization. Measurement errors in spatial units tend to spill over into other units. For this reason, the error of one observation, spatial unit i, will likely affect the errors in neighboring unit j. This is referred to as spatial autocorrelation. Failing to account for this spatial autocorrelation, as traditional econometric approaches do, violates the Gauss-Markov assumptions and could render econometric results biased and inconsistent. This calls for better estimation techniques than traditional ordinary least squares.

A second type of spatial effect, less prominent than spatial dependence, is spatial heterogeneity. This is said to exist if the relationship being modeled varies across space, or if sample moments of the phenomenon, such as the sample variance, are not constant across space (Anselin, 1988).
For example, states, counties, and census tracts tend to exhibit unequal populations, crime rates, incomes, and unemployment levels. Spatial heterogeneity also violates the Gauss-Markov assumptions, and thus its presence requires better estimation techniques than standard OLS.

Ignoring spatial dependence and spatial heterogeneity when estimating models on spatial data produces residuals that are spatially correlated, i.e. spatially autocorrelated. LeSage (1997) illustrates how to incorporate spatial information in regression relationships that exhibit spatial autocorrelation. It can be shown that using OLS in the presence of spatial dependence and/or heterogeneity generally results in biased and inconsistent estimates. To overcome this, different spatial models have been proposed. Anselin (1988) details different specifications of spatial models that overcome spatial autocorrelation. The simplest of these is the Cliff-Ord-type model, also known as the spatial error model (SEM):

$$y = X\beta + u, \qquad u = \lambda w u + \varepsilon \tag{4}$$

where $y$ is an $N \times 1$ vector of observations across space, such as house prices, crime rates, or unemployment rates. The matrix $X$ is an $N \times k$ set of explanatory variables, $\varepsilon$ is the Gaussian random error term, $u$ is the vector of spatial error terms for neighbors, and $w$ is the spatial weight matrix, determined by contiguity relationships or distances between neighbors. The parameters of interest are $\beta$ and $\lambda$. The parameter $\lambda$ is referred to as the spatial autocorrelation parameter, analogous to the serial autocorrelation coefficient in time series. $\beta$ can be estimated consistently by ordinary least squares, while both $\beta$ and $\lambda$ can be derived by maximum likelihood techniques (LeSage, 1997).

A second spatial model, known as the spatial autoregressive (SAR) model or spatial lag model (Anselin, 1988), is also quite commonly used in the literature to model spatial dependence:

$$y = \rho w y + X\beta + \varepsilon \tag{5}$$

The SAR model is analogous to the lagged dependent variable model in time series regressions.
In this case, however, the parameter ρ is a measure of spatial dependence, and Wy is the spatial lag of the dependent variable y. The third spatial model is a combination of the spatial error and spatial autoregressive models previously defined. This model incorporates two spatial weights matrices, which makes it more complex than either of its component forms.

\[ y = \rho W_1 y + X\beta + u, \qquad u = \lambda W_2 u + \varepsilon \tag{6} \]

A fourth and final spatial model has been dubbed the Spatial Durbin Model. This model includes the spatial lag of the dependent variable as well as spatial lags of the explanatory variables (Anselin, 1988).

\[ y = \rho W y + X\beta + W X \theta + \varepsilon \tag{7} \]

The parameters ρ and β are as previously defined, WX is the matrix of lag terms for the explanatory variables, and θ is the vector of coefficients on the lagged X.

3.2.3. Moran's I Test for Spatial Autocorrelation

Moran's I has gained popularity in the literature as a measure of spatial autocorrelation in property values, that is, of any spatial clustering of house prices. There are global and local Moran's I tests, for testing autocorrelation over an entire region or over localized areas, respectively. The global Moran's I statistic is given by the formula below.

\[ I = \frac{N}{\sum_{i}\sum_{j} w_{ij}} \cdot \frac{\sum_{i}\sum_{j} w_{ij}\,(y_i - \bar{y})(y_j - \bar{y})}{\sum_{i} (y_i - \bar{y})^2} \tag{8} \]

Here N is the total number of objects under consideration, w_ij is the spatial weight for neighbors i and j, y_i and y_j are the observed values at i and j, and \bar{y} is the overall mean of the values. The statistic generally takes values between -1 and +1, much like the serial correlation coefficient in time series. A positive Moran's I means that there is a positive spill-over effect among neighbors. For house prices, this implies that a higher valued property in the neighborhood would positively influence the prices of neighboring properties, and a lower valued one would have a negative effect. Negative spatial autocorrelation, on the other hand, exists if higher valued and lower valued properties are neighbors, a phenomenon not so common in the housing market.

3.2.4. The estimated model and specification

There are no rules of thumb on the exact specification of a hedonic model. However, the housing literature does offer some insights into variables that ought to be present in the model. Structural characteristics of the house, location or neighborhood effects, proximity to central business districts (CBD), and environmental quality are a few of the most important covariates that may be included. Hedonic theory provides little help with regard to the "correct" functional form (Butler, 1982), and one must rely on intuition and empirical evidence, such as goodness-of-fit measures. I propose and estimate the following model for single-family home sales.

\[
\begin{aligned}
\ln(\text{Sale\_Price}) ={}& \beta_0 + \beta_1 \text{Sec8} + \beta_2 \text{Rooms} + \beta_3 \text{Rooms}^2 + \beta_4 \text{Lotsize} + \beta_5 \text{AgeStruc} + \beta_6 \text{Stories} \\
&+ \beta_7 \text{Fixbath} + \beta_8 \text{Attic} + \beta_9 \text{Pctwhite} + \beta_{10} \text{Pctblack} + \beta_{11} \text{Pcthisp} + \beta_{12} \text{Pct65up} \\
&+ \beta_{13} \text{Pct21under} + \beta_{14} \text{RailRoad} + \beta_{15} \text{Popdens} + \beta_{16} \text{HHsize} + \varepsilon
\end{aligned}
\tag{9}
\]

I compare the spatial models (spatial lag and spatial error) to the traditional OLS or Hedonic Price Model (HPM). I conduct a sensitivity analysis to ascertain the robustness of the model estimates under different specifications of the Section 8 policy variable. Changing the measurement of the Section 8 variable gives rise to four models. In the first model, I define the Section 8 variable as a dummy indicating the presence of Section 8 public housing within a half-mile radius of a transacted property. In the second, I define Section 8 as a count of the number of Section 8 public projects within a half-mile radius. The third and fourth models include a dummy and a count of Section 8 public projects, respectively, within a one-mile radius of each transacted property. I then apply OLS, spatial lag, and spatial error estimation techniques to all four specifications.

3.2.5. Spatial Weight matrices

One of the important concepts in spatial econometric estimation is the sparse weight matrix. A sparse representation stores only the non-zero elements of the weight matrix, which reduces its size and makes it feasible to estimate models with large data sets. The weight matrix summarizes information about the relationship between spatial neighbors. Routines are available in MATLAB to create sparse weight matrices. Spatial weight matrices may be based on contiguity or on distances. In the current study, the weight matrix is constructed from Euclidean distances, since the data I use involve observable points in space, i.e., the individual transacted houses.

3.2.6. Distance Weights matrices

Distance-based weights matrices are based on either Euclidean distances or inverse distances between points in space. XY-coordinates or centroids of polygons are used to calculate the distances. For point data, off-diagonal elements of the weight matrix are based on the distances between pairs of points. A weight matrix is typically row-standardized, that is, for each observation or row in the data, the elements across the row must sum to unity, and the diagonal elements must be zero.

4.1. Results of Spatial Dependence Tests

Moran's I test of spatial autocorrelation indicates a positive spill-over effect in house prices. The univariate Moran's I (see Figure 2.2) was computed on the sale prices of all the transacted properties in the sample, and the statistic is 0.581 (p-value < 0.001), an indication of strong spatial dependence. The multivariate Moran's I statistic is 0.109, and this is also highly significant (p-value < 0.0001). Other tests of spatial dependence were performed, namely the Lagrange multiplier (LM) test and the robust LM test, all of which show strong evidence of spatial dependence. In addition to these parametric tests, cluster analysis was conducted, the result of which is presented in Figure 2.3. This figure indicates that there is significant clustering of house prices in the sample under consideration.
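As an illustration of Sections 3.2.3 and 3.2.6, the following is a minimal sketch of a row-standardized, distance-based weights matrix and the global Moran's I statistic. It is not the routine used in this study: the function names are hypothetical, inverse-distance weighting is assumed, the coordinates are assumed to be planar (projected) and distinct, and the matrix is kept dense for clarity, whereas a large sample would call for a sparse representation.

```python
import numpy as np

def row_standardized_weights(coords):
    """Inverse-distance weights (an assumption): zero diagonal, rows sum to one."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))       # pairwise Euclidean distances
    w = np.zeros_like(dist)
    off_diag = ~np.eye(len(coords), dtype=bool)
    w[off_diag] = 1.0 / dist[off_diag]            # assumes no coincident points
    return w / w.sum(axis=1, keepdims=True)       # row-standardize

def morans_i(y, w):
    """Global Moran's I: (N / sum of weights) * (z'Wz) / (z'z), z = y - mean(y)."""
    y = np.asarray(y, dtype=float)
    z = y - y.mean()
    return (len(y) / w.sum()) * (z @ w @ z) / (z @ z)

# Toy data (hypothetical): two pairs of houses, the pairs ten units apart.
coords = [(0, 0), (0, 1), (10, 0), (10, 1)]
w = row_standardized_weights(coords)
i_clustered = morans_i([10, 10, 1, 1], w)     # similar prices are neighbors
i_alternating = morans_i([10, 1, 10, 1], w)   # dissimilar prices are neighbors
```

With similar prices located next to each other the statistic is positive, and with high and low prices alternating between nearest neighbors it is negative, which matches the interpretation given in Section 3.2.3.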
The red dots show clusters of high prices and the blue dots show clusters of low prices.

Figure 2.2. Univariate Moran's I Test on House Prices

Figure 2.3. Clustering in House Prices. Red dots indicate clustering of high-high values and blue dots indicate clustering of low-low values.

Table 2.2 Main Differences between the models

                     Half-Mile Ring              One-Mile Ring
Model                Sec8 dummy   Sec8 count     Sec8 dummy   Sec8 count
OLS I                x
OLS II                            x
OLS III                                          x
OLS IV                                                        x
Spatial lag I        x
Spatial lag II                    x
Spatial lag III                                  x
Spatial lag IV                                                x
Spatial error I      x
Spatial error II                  x
Spatial error III                                x
Spatial error IV                                              x

4.2. Results and Discussion

I estimate and compare the results of the traditional OLS (HPM) to the spatial lag and spatial error models. In all, four models in each category are presented in Tables 2.4 through 2.7. Table 2.2 presents the main distinguishing features of the models; each one differs in the way the policy variable of interest is measured. Section 8 public housing is measured by constructing a radius of a half mile or one mile around a transacted property. Within the half-mile radius I have two ways of measuring the effect of the Section 8 variable: a dummy indicating the presence or absence of Section 8 housing, and a numerical count of Section 8 properties. The same is repeated at the one-mile radius.

I compare the adequacy of each of these models by computing log-likelihood ratios and the Akaike Information Criterion (AIC). The model diagnostics are presented in Table 2.3. Among the OLS models, model I has the smallest AIC value, 13,586.9. Generally, the spatial models fit better than the OLS models, because they have smaller AIC values.
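The AIC comparison can be sketched as follows. The log-likelihoods are taken from Table 2.3; the parameter counts k are assumptions made for this illustration (17 for the constant plus sixteen covariates, 18 when the spatial lag coefficient is added), chosen because they reproduce the tabulated AIC values. The function name is hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: AIC = 2k - 2 ln L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood

# name: (log-likelihood from Table 2.3, assumed number of parameters k)
models = {
    "OLS I": (-6776.44, 17),
    "Spatial lag I": (-6735.72, 18),
    "Spatial error III": (-6573.95, 17),
}

aic_values = {name: aic(ll, k) for name, (ll, k) in models.items()}
best_model = min(aic_values, key=aic_values.get)  # model with the smallest AIC
```

Because the criterion penalizes each additional parameter by two units, a spatial model is preferred only if its gain in log-likelihood exceeds that penalty, which is the case for every spatial model in Table 2.3.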
Table 2.3 Model Diagnostics

Model                Log-Likelihood   AIC
OLS I                -6776.44         13,586.9
OLS II               -6777.07         13,588.1
OLS III              -6778.2          13,590.4
OLS IV               -6781.88         13,597.8
Spatial lag I        -6735.72         13,507.4
Spatial lag II       -6736.16         13,508.3
Spatial lag III      -6738.10         13,512.2
Spatial lag IV       -6741.01         13,518.0
Spatial Error I      -6574.05         13,182.1
Spatial Error II     -6574.83         13,183.7
Spatial Error III    -6573.95         13,181.9
Spatial Error IV     -6575.20         13,184.4

Overall, the models fit very well, judging from the likelihood ratios and AIC values presented in Table 2.3. Regarding the choice of model, the AIC criterion says that the model with the smallest AIC value is best. Thus, within each category, Table 2.3 indicates a preference for OLS I, spatial lag I, and spatial error III. The parameters of the spatial models show that there is indeed spatial dependence, which complements the evidence from the Moran's I tests presented earlier. The estimated coefficient of spatial dependence (rho) is approximately 0.089 and that of spatial autocorrelation (lambda) is 0.474; both are statistically significant at the 1% level. In terms of explanatory power, the models are quite consistent, with R-squared of about 0.60.

Results presented in Tables 2.4 and 2.5 are for the models in which the Section 8 effect is measured at the half-mile ring. Comparable results with the Section 8 variable measured at a one-mile radius are presented in Tables 2.6 and 2.7. In each of Tables 2.4, 2.5, 2.6, and 2.7, I present the results from the traditional OLS and from the two spatial models, spatial lag and spatial error. Generally, the OLS estimates appear to overstate the effects of the variables, possibly due to bias resulting from spatial autocorrelation.
In particular, the OLS estimate in Table 2.4 shows that the presence of Section 8 public housing within a half-mile radius of a single-family home is associated with a value about 3.9% higher than that of single-family houses with the same characteristics. This effect is slightly lower in the spatial lag and spatial error models, at 3.6% and 3.1%, respectively. The effect is non-trivial in economic magnitude and is statistically significant at the 5% level in the OLS and spatial lag models, although the spatial error estimate is not significant at conventional levels. This finding supports that of Galster et al. (1999), who found a positive effect of Section 8 projects within 500 feet but an adverse effect within 2000 feet. Broadly speaking, the results found here are consistent with much earlier research, such as Rabiega, Lin and Robinson (1984) and Warren, Aduddell and Tatalovich (1983), showing a positive effect of public housing on neighboring properties in general.

The positive externality effect of Section 8 housing on neighboring property values found here has two possible explanations. First, these public housing projects may have been constructed to replace previously unsightly structures, so that the removal of blight caused surrounding property values to appreciate. Second, it could be due to the architectural design of the Section 8 housing: a nice-looking design that is in sync with neighboring properties may positively affect adjacent property values.

A policy question of interest concerns the impact of having more Section 8 projects located within the same half-mile radius, compared to the case where there are just one or two. Alternatively, policy makers might ask whether having more units per project (say, in a high-rise project) has a different effect. To investigate the former question, I include the count of Section 8 properties within a half-mile radius of a single-family home.
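The two measurements of the policy variable, the dummy and the count within a given ring, can be sketched as follows. This is only an illustration of the idea, not the GIS procedure used in this study: the function name and the toy coordinates are hypothetical, and the coordinates are assumed to be planar and expressed in miles.

```python
import numpy as np

def section8_within_radius(houses_xy, projects_xy, radius):
    """Return (dummy, count) of Section 8 projects within `radius` of each house."""
    houses_xy = np.asarray(houses_xy, dtype=float)
    projects_xy = np.asarray(projects_xy, dtype=float)
    diff = houses_xy[:, None, :] - projects_xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))   # house-to-project distances
    count = (dist <= radius).sum(axis=1)      # number of projects in the ring
    dummy = (count > 0).astype(int)           # 1 if at least one project
    return dummy, count

# Hypothetical example: two houses and three projects, coordinates in miles.
houses = [(0.0, 0.0), (5.0, 5.0)]
projects = [(0.3, 0.0), (0.0, 0.4), (0.9, 0.0)]
dummy_half, count_half = section8_within_radius(houses, projects, 0.5)
dummy_one, count_one = section8_within_radius(houses, projects, 1.0)
```

Widening the ring from a half mile to one mile can only add projects to the count, so the one-mile variables nest the half-mile ones.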
Data constraints did not allow me to look at the impacts of high-rise public projects versus low-rise projects. Results from using the count of Section 8 properties as the policy variable are shown in Table 2.5; the effect is positive and smaller in economic significance than in the previous case of using the dummy. One additional Section 8 project within a half-mile radius of a single-family home is associated with a 1.9% increase in the price of the single-family home. Taken at face value, this finding implies that placing ten more Section 8 projects within a half-mile radius can be expected to increase average property values by about 19%.

What is clear from the two measures of Section 8 explained above is that, at the half-mile radius, public projects do exert a positive effect on single-family sales. This positive effect, however, diminishes and turns negative if we include projects located within a one-mile radius of single-family homes. Specifically, at the one-mile radius, each additional Section 8 project is associated with a 1.7% lower sale price (OLS IV, Table 2.7), and the presence of any Section 8 project with a 6.9% lower sale price (OLS III, Table 2.6). These findings are consistent with those of other researchers, such as Santiago et al. (2001), whose study of public housing in Denver found positive and negative effects depending on the location and proximity of public housing to single-family homes.

4.2.1. The effect of house characteristics on house price

As in other hedonic studies, I have included characteristics of the house in all of the estimated models. This helps to isolate the own-price effects of factors intrinsic to the house from extrinsic factors such as Section 8 projects and other neighborhood effects. The most important factors typically included in hedonic housing models are the number of rooms, baths, lot size, and age of the structure. In all the models, the "number of rooms" variable enters the equation with a quadratic term, because most previous studies have found that the number of rooms increases house price up to some threshold, beyond which the effect turns negative. Both rooms and roomssq are statistically significant with their expected signs, positive and negative, respectively. Sale price increases with the number of rooms up to about the 12th room (0.096/(2 × 0.004) = 12) and then starts to decrease with additional rooms beyond that point. Similarly, lotsize, fixbath, attic, and stories also have statistically significant positive effects on house prices. The elasticity of price with respect to lotsize is 0.11, and the semi-elasticities with respect to fixbath, attic, and stories are 0.16, 0.04, and 0.36, respectively.

4.2.2. The effect of neighborhood factors on house prices

Neighborhood factors that are included in the analysis are racial and age compositions, average household size, and population density. Of the racial variables, the percentage of white residents in the census tract is positively associated with property values, while the percentage of Hispanic residents is negatively associated with property values. There is no statistically significant relationship between property values and the percentage of black residents, of people aged 65 and over, or of people aged 21 and under. The average household size and population density are negatively associated with property values. Increasing average household size tends to increase population density, and both effects tend to depress housing values. The proximity of the transacted properties to the railroad line is also found to positively affect property values, possibly due to the ease of transportation or commuting.

5.0. Conclusion

This paper analyzes the impact of Section 8 public housing on single-family home sales in Fulton County, Georgia. The data were obtained from the County Tax Assessor of Fulton County and involve single-family houses that transacted in fiscal year 2007.
Data on Section 8 properties in the county were obtained from the Department of Housing and Urban Development (HUD). The study incorporates spatial econometric methodology, which seeks to address the twin issues of spatial autocorrelation and spatial heterogeneity in housing data. Typically, house prices exhibit spill-over effects on neighboring properties due to proximity. The use of a spatial econometric estimation procedure in this study is novel in that many previous studies of the subject do not use spatial econometric techniques and thus fail to tackle this all-important problem of spatial dependence. The study finds a positive effect of Section 8 projects on nearby houses at the half-mile radius, and that as one goes farther away from the transacted property, this positive effect diminishes and turns negative at the one-mile radius.

Table 2.4 Comparative Results of HPM and Spatial models: Dep Var: log(sale Price)
Policy variable: Section 8 public housing within a half-mile radius dummy

Variable       OLS I                SPATIAL LAG I        SEM I
               Estimate    t-stat   Estimate    z-stat   Estimate     z-stat
Sec8_dum       0.039**     2.436    0.036**     2.260    0.031        1.373
Lotsize(log)   0.111***    7.591    0.107***    7.387    0.101***     6.272
Fixbath        0.166***    12.163   0.159***    11.774   0.149***     10.986
Attic          0.044***    4.448    0.044***    4.457    0.033***     3.393
Age of struc   0.0003**    2.031    0.003*      1.731    -1.03e-005   -0.063
Stories        0.360***    14.422   0.355***    14.317   0.318***     13.004
Rooms          0.096***    5.174    0.094***    5.101    0.089***     4.963
Roomssq        -0.004***   -3.239   -0.004***   -3.201   -0.004***    -3.301
Pct_white      0.013***    4.133    0.011***    3.414    0.01***      3.138
Pct_black      -0.003      -1.067   -0.004      -1.253   -0.002       -0.485
Pct_Hisp       -0.012***   -4.048   -0.011***   -3.924   -0.101**     -2.359
Pct65_up       0.0006      0.369    -3.25e-005  -0.019   0.002        0.835
Pct21_under    -0.0005     -0.801   -0.0005     -0.377   0.004        1.641
Ave_HH_Sz      -0.102***   -3.454   -0.967***   -3.308   -0.113**     -2.546
Popdens        -0.013**    -2.139   -0.009      -1.451   -0.010       -0.988
RR_Halfm       0.072***    4.402    0.066***    4.096    0.042*       1.783
Constant       9.959***    28.552   9.032***    24.893   10.004***    20.068
Rho            -           -        0.089***    8.707    -            -
Lambda         -           -        -           -        0.474***     22.008
R-sq           0.59        -        0.59        -        0.62         -
N              7,211       -        7,211       -        7,211        -

***, **, * denote significance at the 1%, 5% and 10% levels, respectively.

Table 2.5 Comparative Results of HPM and Spatial models: Dep Var: log(sale Price)
Policy variable: Count of Section 8 public housing within a half-mile radius

Variable       OLS II               SPATIAL LAG II       SEM II
               Estimate    t-stat   Estimate    z-stat   Estimate     z-stat
Sec8_count     0.019**     2.162    0.018**     2.053    0.006        0.569
Lotsize(log)   0.110***    7.557    0.106***    7.360    0.100***     6.224
Fixbath        0.167***    12.235   0.160***    11.839   0.149***     11.024
Attic          0.044***    4.443    0.044***    4.451    0.033***     3.390
Age of struc   0.0003**    2.044    0.0003*     1.741    -7.7e-006    -0.047
Stories        0.360***    14.414   0.355***    14.307   0.318***     13.012
Rooms          0.096***    5.182    0.094***    5.107    0.090***     4.969
Roomssq        -0.004***   -3.251   -0.004***   -3.212   -0.004***    -3.307
Pct_white      0.013***    4.188    0.011***    3.461    0.015***     3.185
Pct_black      -0.003      -1.013   -0.004      -1.205   -0.002       -0.429
Pct_Hisp       -0.011***   -3.982   -0.011***   -3.864   -0.010**     -2.311
Pct65_up       0.0008      0.479    0.0001      0.079    0.002        0.893
Pct21_under    -0.0004     -0.237   -0.0005     -0.312   0.004        1.756
Ave_HH_Sz      -0.101***   -3.418   -0.096***   -3.269   -0.115**     -2.582
Popdens        -0.014**    -2.166   -0.009      -1.466   -0.011       -1.110
RR_Halfm       0.072***    4.407    0.066***    4.102    0.042*       1.779
Constant       9.944***    28.501   9.015***    24.845   9.994***     20.052
Rho            -           -        0.090***    8.727    -            -
Lambda         -           -        -           -        0.475***     22.051
R-sq           0.59        -        0.59        -        0.62         -
N              7,211       -        7,211       -        7,211        -

***, **, * denote significance at the 1%, 5% and 10% levels, respectively.

Table 2.6 Comparative Results of HPM and Spatial models: Dep Var: log(sale Price)
Policy variable: Section 8 public housing within a one-mile radius dummy

Variable       OLS III              SPATIAL LAG III      SEM III
               Estimate    t-stat   Estimate    z-stat   Estimate     z-stat
Sec8_dum       -0.069***   -3.229   -0.060***   -2.795   -0.064**     -2.011
Lotsize(log)   0.100***    6.925    0.097***    6.767    0.097***     6.039
Fixbath        0.169***    12.438   0.163***    12.036   0.150***     11.053
Attic          0.045***    4.489    0.044***    4.483    0.033***     3.412
Age of struc   0.0004**    2.233    0.0003*     1.915    -4.85e-007   -0.003
Stories        0.363***    14.514   0.357***    14.397   0.319***     13.037
Rooms          0.100***    5.367    0.098***    5.279    0.091***     5.032
Roomssq        -0.004***   -3.417   -0.004***   -3.366   -0.004***    -3.366
Pct_white      0.014***    4.432    0.012***    3.670    0.015***     3.248
Pct_black      -0.002      -0.613   -0.003      -0.858   -0.001       -0.291
Pct_Hisp       -0.010***   -3.579   -0.010***   -3.508   -0.010**     -2.193
Pct65_up       0.002       0.972    0.0008      0.525    0.003        1.029
Pct21_under    0.0001      0.098    5.75e-006   0.004    0.004        1.766
Ave_HH_Sz      -0.111***   -3.775   -0.106***   -3.625   -0.117**     -2.639
Popdens        -0.021**    -3.130   -0.015**    -2.366   -0.017*      -1.645
RR_mile        0.063***    3.253    0.056***    2.936    0.033        1.103
Constant       9.976***    28.523   9.062***    24.928   10.025***    19.988
Rho            -           -        0.089***    8.635    -            -
Lambda         -           -        -           -        0.475***     22.085
R-sq           0.59        -        0.59        -        0.62         -
N              7,211       -        7,211       -        7,211        -

***, **, * denote significance at the 1%, 5% and 10% levels, respectively.

Table 2.7 Comparative Results of HPM and Spatial models: Dep Var: log(sale Price)
Policy variable: Count of Section 8 public housing within a one-mile radius

Variable       OLS IV               SPATIAL LAG IV       SEM IV
               Estimate    t-stat   Estimate    z-stat   Estimate     z-stat
Sec8_count     -0.017*     -1.755   -0.014      -1.415   -0.015       -1.235
Lotsize(log)   0.102***    7.050    0.099***    6.876    0.099***     6.111
Fixbath        0.169***    12.435   0.163***    12.036   0.150***     11.053
Attic          0.044***    4.383    0.044***    4.483    0.033***     3.412
Age of struc   0.0004**    2.171    0.0003*     1.856    -4.85e-007   -0.003
Stories        0.362***    14.475   0.356***    14.358   0.319***     13.037
Rooms          0.099***    5.378    0.097***    5.244    0.091***     5.032
Roomssq        -0.004***   -3.373   -0.004***   -3.328   -0.004***    -3.366
Pct_white      0.013***    4.220    0.011***    3.476    0.015***     3.248
Pct_black      -0.003      -0.887   -0.003      -1.106   -0.001       -0.291
Pct_Hisp       -0.011***   -3.853   -0.010***   -3.508   -0.010**     -2.193
Pct65_up       0.001       0.795    0.0008      0.525    0.003        1.029
Pct21_under    0.0002      0.123    5.75e-006   0.004    0.004        1.766
Ave_HH_Sz      -0.115***   -3.923   -0.106***   -3.625   -0.121**     -2.719
Popdens        -0.018***   -2.728   -0.013**    -1.984   -0.014       -1.421
RR_mile        0.063***    3.248    0.056***    2.928    0.032        1.063
Constant       10.018***   28.644   9.088***    24.996   10.045***    19.975
Rho            -           -        0.090***    8.716    -            -
Lambda         -           -        -           -        0.477***     22.234
R-sq           0.59        -        0.59        -        0.62         -
N              7,211       -        7,211       -        7,211        -

***, **, * denote significance at the 1%, 5% and 10% levels, respectively.

CHAPTER 3
Urban Sprawl, Property Values, and Commute Times in Birmingham Metro-Area

3.1. Introduction

The rate of suburbanization in U.S. cities in the past century has been a cause of concern for city planning agencies and local governments. In 1970, 69 percent of the population lived in Metropolitan Statistical Areas (MSAs); this percentage increased to 75 percent in 1980 and 77 percent in 1990 (Mieszkowski and Mills, 1993). What is even more intriguing about these statistics is that the percentage of people living in city centers has actually been dropping since the 1950s, with the percentage of MSA population in inner cities dropping from 57 percent in the 1950s to 43 percent in the 1970s and 37 percent in the 1990s. Projections are that the percentage of inner city residents will soon decline to less than a third of the MSA population. Some have described this trend of increasing urbanized population and decreasing inner city population as a post-war phenomenon, while others assert that this has always been the case throughout modern history. International observers of suburbanization also point out that it is not an issue exclusive to the U.S. and that most industrialized countries are facing similar rates of urban growth, although the factors driving these trends do seem to vary from country to country.
It has been argued by some that increasing suburbanization is a natural process of growth, the result of growing incomes and a preference for the suburban lifestyle, symbolized by large single-family houses with large lot sizes. But the decline of inner cities also raises questions about whether suburbanization is a wise process of development. The cause and effect of suburbanization and the decline of inner cities present a chicken-and-egg situation. Did the decline of inner cities cause the middle class to flee to the suburbs, or did the flight of the middle class to the suburbs cause inner cities to decline?

A number of factors have been adduced to explain increasing suburbanization since WWII. Home mortgage insurance provided by the federal government in the 1950s is known to have increased the desire of the middle class to move to the suburbs. During this era, the American Dream came to be synonymous with owning a house in the suburbs. The automobile, the interstate highway system, and racial tensions were known causes of flight to the suburbs in the 1960s. In recent times, the most often quoted reasons for moving to the suburbs are high crime rates in the inner cities, poor quality of schools, and lack of employment opportunities.

Mieszkowski and Mills (1993) explore at length two theories of the causes of increasing suburbanization in the U.S. The first, referred to as the "natural evolution theory", explains the process of suburbanization as resulting from development taking place at the city center and then spreading outward. According to this paradigm, when the city center has been all but built up, new developments take place at the outer fringes of the city. Higher income groups who can afford the newer and perhaps larger houses at the periphery move to the suburbs, leaving behind lower income people to occupy the older city-center houses. This process leads to income stratification or "Tiebout Sorting", where the rich live in the suburbs and the poor live in the inner cities.

The second theory posits that the growth of suburbanization is the consequence of fiscal and/or social problems in the inner cities. High rates of poverty in the inner cities, high crime rates, racial tensions, poor quality of schools, and a general lack of other services drive the affluent to the suburbs. This "flight from blight" reinforces the decline of the inner city in the sense that the rich middle class leaves the inner cities, depleting the tax base and further eroding the quality of schools and general neighborhood conditions, thus fuelling more flight to the suburbs.

Jargowsky (2002) explores the linkages between urban sprawl, the decline of inner cities, and the concentration of poverty at the city center. Sprawl is related to the decline of inner cities because it depletes the tax base of inner cities as the middle class leaves for the suburbs. As city population shifts to the suburbs, so do employment opportunities, and this further erodes the income base of inner-city dwellers. Sprawl is also related to the concentration of poverty by virtue of the fact that it separates the income classes, with the rich living in the suburbs and the poor in the city core. That said, it has also been observed that gentrification of some inner cities is driving the poor out of the city center. Examples of cities and metropolitan areas that have witnessed gentrification of inner cities include Atlanta, Cleveland, Columbus, Washington D.C., and the San Francisco Bay Area (Kennedy and Leonard, 2001), where the wealthy have started returning and rehabilitating the inner city, thus raising property values beyond the reach of the poor.
Suburban sprawl, more commonly called urban sprawl, is viewed by many as an environmental catastrophe whose risks range from encroachment on agricultural land, increased traffic congestion, traffic fatalities, longer commutes, air and water pollution, flooding, and concentration of inner city poverty to wildlife destruction.[6] This has attracted a significant number of researchers to the subject. Local and state governments have been urged to address issues related to urban sprawl. Urban sprawl has become such a hot button issue that a federally funded[7] study was commissioned to examine it, a number of books[8] and articles have been written about it, and journalists and researchers have spilled much ink over it.

[6] The literature on the causes and consequences of urban sprawl is large. See Ewing (1994, 1997), Sierra Club (1998), Wasserman (2000), Siegel (1999), Burchell et al. (2000), Gordon & Richardson (1997, 2000), Galster et al. (2001), Glaeser & Kahn (2003), Wassmer (2008), and Nechyba and Walsh (2004).
[7] The Cost of Sprawl Revisited (Sponsored by Federal Transit Administration, 1998).
[8] For example, Squires (2002) and Downs (1994).
[9] Burchell (1998), Ewing et al. (2002), and Galster et al. (2001) offer a comprehensive survey of this literature.

Despite all these efforts, the literature is awash with different definitions of what constitutes "urban sprawl." According to Downs (1999), "sprawl is not any form of suburban growth, but a particular form" characterized by low density, leapfrog development, dominance of the automobile for transportation, lack of centralized planning and land uses, and great disparities among localities. The general lack of agreement as to what sprawl looks like is evident in the multiplicity of definitions given to it in the literature.[9] Ewing et al. (2002) identify urban sprawl as "...the process in which the spread of development across the landscape far outpaces population growth." Because of the difficulty of measuring sprawl, Ewing et al. (2002) construct four indices of sprawl composed from a total of 22 sprawl-related factors. These four sprawl indices are (1) residential density; (2) neighborhood mix of homes, jobs, and services; (3) strength of activity centers and downtowns; and (4) accessibility of the street network. An overall index of sprawl computed from these four factors was then used as a final measure to rank 83 of the nation's most sprawling cities.

The issue of sprawl has gained interest in the popular press. For example, the Sierra Club (1998) issued a report titled "The Dark Side of the American Dream: The Costs and Consequences of Suburban Sprawl." In this report they rank American cities on the degree to which they sprawl, defining sprawl as "low-density development beyond the edge of service and employment, which separates where people live from where they shop, work, recreate and educate, thus requiring cars to move between zones." USA Today (2001) also did a study on sprawling cities in America, using two density-related measures to construct an index on which they ranked 271 sprawling cities in the nation. These two measures of sprawl were (1) the percentage of a metro area's population living in urbanized areas, and (2) the change in the percentage of metropolitan population living in urbanized areas between 1990 and 1999.

Besides Ewing et al. (2002), Galster et al. (2001) offer the most comprehensive characterization of sprawl, based on eight distinct dimensions of land use patterns: density, continuity, concentration, clustering, centrality, nuclearity, mixed uses, and proximity. They define sprawl as a "condition of land use that is represented by low values on one or more of these dimensions." Using data on a sample of 13 urbanized areas (UAs), they computed scores to rank each. The UAs with the greatest sprawl (i.e., the lowest scores on the composite index) were Atlanta, Miami, Detroit, and Denver.
Few studies have investigated urban sprawl using data on a single MSA. Most of the studies cited thus far were conducted in multiple metropolitan areas. The exceptions are Song (1996) on the Reno-Sparks metropolitan area in Nevada, and Weber and Sultana (2005) on the impact of sprawl on commuting in Alabama. But Song (1996) deals with only one measure of sprawl: accessibility to economic opportunity, measured by distance to the Central Business District (CBD). Weber and Sultana (2005) are concerned only with sprawl's impact on commuting, and use a univariate measure of sprawl: the percentage change in the population of UAs between 1990 and 2000 that is above the average for each MSA. None of this literature investigates the linkages between sprawl and property values.

3.2. Causes and Consequences of Sprawl

Downs (1999) criticizes many authors for confusing what sprawl is with its causes and consequences. Squires (2002) offers, arguably, the best illustration of the distinction between sprawl, its causes, and its consequences. He explores the causes and consequences of urban sprawl as they relate to the structural, spatial, and social dimensions of metropolitan communities. To avoid the trap of confusing the definition of sprawl with its causes and consequences, Squires offers the following: "Sprawl can be defined as a pattern of urban and metropolitan growth that reflects low-density, automobile-dependent, exclusionary new development on the fringe of settled areas often surrounding a deteriorating city." He further identifies the traits of metropolitan growth frequently associated with sprawl as unlimited outward extension of development; low-density housing and commercial development; leapfrog development; edge cities; reliance on private automobiles; fragmentation of land use planning among municipalities; and race and class-based exclusionary housing and employment (Squires, 2002).
The causes of sprawl can be traced back to the immediate post-World War II period, beginning with the provisions of the GI Bill and federal subsidies for homeownership to the middle class. These gave people an opportunity to own their own homes in the suburbs and also to escape from interacting with the "poor" in the inner city. The influx of poor immigrants from other countries, mostly into central cities, also helped to push many middle-class families to the suburbs. Not least in spurring the wave of migration to the suburbs was the coming of the automobile, which made commuting from the suburbs to employment centers more affordable. Contemporary causes of sprawl include rising incomes and the consequent desire to enjoy low-density, large single-family housing, as well as high crime rates in inner cities, deteriorating quality of schools and other social infrastructure, and the relocation of businesses and economic opportunities to the suburbs. The Sierra Club (1998) report indicates that in some regions the main cause of sprawl in US metropolitan areas is flight from central cities, while in others it is population growth.

Many authors discuss the consequences of sprawl by focusing only on the costs, but as Squires (2002), Siegel (1999), and Burchell et al. (2000) point out, sprawl has its benefits, at least to those who think it is their right to live in low-density areas. The most obvious consequences of sprawl are the environmental problems it creates or exacerbates. These include air and water pollution, flooding, and loss of biodiversity. By separating where people live from where they work, shop, and recreate, sprawl increases dependence on automobiles. This increases air pollution and contributes to trans-boundary problems such as global warming, climate change, and ozone depletion. The dependence on automobiles increases dependence on fossil fuels, leading to more greenhouse gas emissions and climate change.
A case in point to illustrate the negative consequences of over-reliance on automobiles and fossil fuels is the recent BP oil spill in the Gulf of Mexico in the summer of 2010, as well as the 1989 Exxon Valdez accident in Prince William Sound. Critics of sprawl have pointed to these accidental spills of massive amounts of oil into the oceans as examples of the environmental disasters that result from fossil fuel dependence. Traffic congestion and accidents have been known to worsen with the extent of suburbanization. In many metropolitan areas, commute times have been increasing over the years, and cities like Los Angeles, Atlanta, and Chicago are gaining notoriety for having the longest traffic lines and commute times. Air quality in these cities also happens to be worse than in less sprawling cities in America. Urban sprawl is causing farm land to be converted to non-agricultural uses at an astounding rate. Every year, an estimated 400,000 acres of farm land is lost to urbanized uses, and about 70% of prime farmland is now in the path of rapid development.

3.3. Sprawl in the Birmingham-Hoover Metropolitan Area

The Birmingham-Hoover metropolitan area (i.e., the Greater Birmingham Area) is the second most sprawling metropolitan area in Alabama, after Mobile. The city of Mobile was ranked the 36th most sprawling city in the nation, while Birmingham ranks 99th (Weber and Sultana, 2005). The Greater Birmingham area comprises seven contiguous counties (Jefferson, Shelby, Bibb, Blount, Chilton, St. Clair, and Walker), and its population is about one-fourth of the population of the State of Alabama. In the year 2000, this metro area ranked 48th nationally in terms of population, and from 1990 to 2000 the area saw a 10 percent increase in population. The 2000 population and housing census reports show that the population of the Birmingham area was 1,052,238, estimated to increase by 15% between 2000 and 2009, to 1,212,848.
Much of the population growth now occurs in two of these counties, Jefferson and Shelby, which together account for 78% of the population of the Birmingham Metro Area. Shelby County is known to be the fastest growing county in Alabama, and it is also the most sprawling part of the Birmingham metropolitan area. Between 1990 and 2000, the population of Shelby County increased 44%.

The maps in Figures 3.1 and 3.2 illustrate some indicators of sprawl in the Birmingham area. In Figure 3.1, the temporal pattern of sprawl is depicted by the median year in which structures were built, an indicator of housing development from the inner city to the suburbs. The data are from the 1990 census, and thus, as can be seen from Figure 3.1, the median year built in all census block groups (CBGs) is prior to 1990.

Figure 3.1. Mapping sprawl in Birmingham Metro area: Median year structure built by census block group.

The shading indicates the median year in which houses were built in each census block group. The lighter shading indicates CBGs in which the median year built is before 1950, and the very dark shading indicates CBGs in which the median year built is after 1970. Clearly, neighborhoods in the central city of Birmingham were built prior to 1950. At the periphery of the metro area, the median year built is generally after 1970, which indicates a pattern of development from the inner city outwards. Of course, there are some dark shaded areas at the center and light shaded areas at the fringes. Dark shaded areas at the city center indicate that older houses were probably renovated or razed and rebuilt. The lighter shading at the fringes may indicate that these were once separate towns that later merged into the suburban Birmingham area as the city expanded outward. Figure 3.2 shows the changes in the metropolitan population from 1990 to 2000.
Generally, the inner-city CBGs lost population between 1990 and 2000, while CBGs at the periphery of the metropolitan area saw a net gain in population. This is particularly the case in the southern part of the metro area, which coincides with the CBGs in Shelby County. This lends further credence to the assertion that much of the sprawl in the Birmingham metro area is taking place in Shelby County.

Figure 3.2. Mapping sprawl in Birmingham Metro area: Population change from 1990-2000 by census block group.

3.4. Data and Methods

3.4.1. Data

The study utilizes census block group (CBG) data from the 1990 and 2000 population and housing censuses. Because of changes in census block demarcations, a cross-walk is performed to match 1990 CBGs to those in 2000. In 1990, there were 641 CBGs in Jefferson County and 56 in Shelby County. In the 2000 census, there were 452 CBGs in Jefferson and 57 in Shelby. In all, there are 697 observations in the 1990 dataset and 509 in the 2000 dataset. For each census year and each CBG, I obtained the population counts, population in urbanized areas, population outside urbanized areas, commute times, transportation type, housing units and housing characteristics, and other socio-demographic data including incomes, employment, education, race, and poverty levels. Since the study involves analyzing density gradients, population densities and distances to the central business district (CBD) are required. Thus, I calculated population density per CBG as the number of persons per square mile of land area. I also computed the Euclidean distance from each CBG to the CBD using GIS software. Measuring distances involving polygon feature classes is not as straightforward as with point feature classes. With polygons, the distance is computed by first determining the centroid of each CBG and then measuring the distance from that centroid to the CBD.

3.4.2. The Monocentric Model of Urban Development

The monocentric model has been used to study the spatial structure of urban areas and patterns of suburban population gradients. The model, due to Alonso (1964), Mills (1972), Muth (1969), and Wheaton (1974), is cast in the spirit of the natural evolution theory of suburban growth. It assumes that all employment opportunities are located in the CBD and that all workers commute to the CBD. Housing is assumed to be expensive near the CBD, and households that choose to live farther from the CBD incur higher commuting costs but enjoy larger houses and lower housing costs at the periphery. The model expresses population density as a negative exponential function of the distance to the CBD (Mieszkowski and Mills, 1993):

$$D(u) = D_0 e^{-\gamma u} \qquad (1)$$

where $D(u)$ is the population density at distance $u$ from the CBD, $D_0$ is the population density at the edge of the CBD, and $\gamma$ is the gradient, i.e. the constant percentage change in population density per unit change in distance from the CBD. Log-transforming the model yields a form that can be estimated by OLS:

$$\ln D(u) = \ln D_0 - \gamma u + \varepsilon \qquad (2)$$

There have been many applications of the density gradient function in empirical studies of suburban growth and expansion. Examples are Malpezzi and Guo (2001), Clark (1951), Mills (1972), Mills and Tan (1980), and Edmonston (1975). Most of these studies used population density data at the census tract level, together with the distances of the census tracts to the CBD. The estimate of interest is the coefficient on distance, $\gamma$, the density gradient. Mieszkowski and Mills (1993) state that the estimate of the density gradient, the slope of the log density function ($\gamma$), has been used by many researchers as a measure of the degree of decentralization of MSAs.
"The more uniform the population density as a function of distance from the central business district, the smaller the gradient, so decreases in the value of the gradient over time have been taken as increases in decentralization or suburbanization of urban areas" (Mieszkowski and Mills, 1993).

Malpezzi and Guo (2001), commenting on the functional form of the density gradient, stated that in some cases the model fit the data perfectly in the linear form in (2). In other cases, non-linear versions with polynomials in $u$ fit the data better. The non-linear form of the log-transformed density gradient function is stated as:

$$\ln D(u) = \ln D_0 - \gamma_1 u + \gamma_2 u^2 + \dots + \gamma_k u^k + \varepsilon \qquad (3)$$

Brueckner and Fansler (1983) applied the more flexible Box-Cox transformation to data on 40 urbanized areas and found that a square-root function fitted the data more appropriately.

3.4.3. Estimated Monocentric Density Function

In analyzing sprawl in the Birmingham area, I first study the population density gradient from the central city to the outer fringes of the metro area. I explored different functional form specifications of the density function, including different polynomial terms in the distance variable as well as the Box-Cox transformation, and found that the best model is one that is linear in distance (equation 4), where d(.) is the population density as a function of the distance (u) to the CBD. Results of the density function estimations are presented in Tables 3.2 and 3.3.

3.4.4. Polycentric Model of Urban Form

In the urban economics literature, the monocentric model developed by Alonso (1964), Muth (1969), and Mills (1972) has been used to study urban form. The methodology typically involves estimating a negative exponential density function or log density function. This model assumes that urban form is "monocentric", i.e. that the urban structure of a city can be modeled as having one central business district.
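As a rough illustration of the log-density regression just described, the following Python sketch fits ln D(u) = ln D0 - γu by ordinary least squares. It is a sketch under stated assumptions, not the procedure used in this study: the centroid coordinates, the CBD location, and the "true" parameter values are synthetic and hypothetical, the function name `estimate_density_gradient` is introduced here for illustration only, and no covariates or spatial terms are included.

```python
import numpy as np


def estimate_density_gradient(distance, density):
    """OLS fit of ln D(u) = ln D0 - gamma * u.

    Returns (D0_hat, gamma_hat). Hypothetical helper for illustration.
    """
    X = np.column_stack([np.ones_like(distance), distance])  # intercept + distance
    y = np.log(density)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    intercept, slope = coef
    return float(np.exp(intercept)), float(-slope)


# Synthetic data (assumption): 500 block-group centroids scattered around a CBD at the origin.
rng = np.random.default_rng(0)
cbd = np.array([0.0, 0.0])
centroids = rng.uniform(-20.0, 20.0, size=(500, 2))      # coordinates in miles
distance = np.linalg.norm(centroids - cbd, axis=1)       # Euclidean centroid-to-CBD distance

# Densities generated from the negative exponential model with multiplicative noise.
true_d0, true_gamma = 5000.0, 0.15                       # hypothetical parameter values
density = true_d0 * np.exp(-true_gamma * distance + rng.normal(0.0, 0.2, size=500))

d0_hat, gamma_hat = estimate_density_gradient(distance, density)
print(f"D0 = {d0_hat:.0f} persons per sq mile, gamma = {gamma_hat:.3f} per mile")
```

Because the slope of the fitted line is -γ, a flatter fitted line (a smaller γ) corresponds to a more decentralized density pattern.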
But in recent times, urban structure has been changing, and with the growth of suburbia, employment centers have been shifting from the central business district to the suburbs. This makes the assumption of monocentricity less appealing. The more recent approach is to adopt a polycentric model that assumes decentralized employment centers. The monocentric model has been found to perform poorly in predicting the spatial patterns of employment and population distribution of modern cities. The "ring-cities" or "polycentric" model seems to provide a better characterization of modern cities. Hamilton (1982) and Small and Song (1994) found that the monocentric model did not adequately represent the distribution of employment and population in cities like Los Angeles, Boston, Pittsburgh, and Phoenix. Therefore, they proposed the polycentric model, which expresses density as a function of distances to all employment centers. The idea of the polycentric model is that households value access to all employment centers, not just the CBD as the monocentric model assumes. The polycentric model is a generalization of the negative exponential density function to include multiple centers. This is expressed as:

$$D_m = \sum_{n=1}^{N} D_n e^{-\gamma_n u_{mn}} + \varepsilon_m \qquad (5)$$

where $D_m$ is the density at location $m$; $N$ is the number of employment centers; $\gamma_n$ is the density gradient to center $n$; $u_{mn}$ is the distance from location $m$ to center $n$; $D_n$ and $\gamma_n$ are parameters to be estimated for each center $n$; and $\varepsilon_m$ is an additive error term associated with location $m$. Thus, the density at any spot depends on its location relative to all of the employment centers. This polycentric model can be estimated by nonlinear least squares.

Table 3.1. Employment Sub-centers in Birmingham Metro Area

Center Location    Total Businesses    Total Employment    Emp. Density (Emp/Acre)
Downtown           2318                49331               24.54279
Hoover             739                 6604                3.285572
Bessemer           663                 4408                2.193035
Gardendale         391                 3249                1.616418
Pelham             294                 3039                1.51194
Trussville         329                 2901                1.443284
Center point       329                 1974                0.98209
Montevallo         155                 1308                0.650746
Pinson             127                 1224                0.608955
Alabaster          226                 1217                0.605473
Calera             75                  845                 0.420398
Adamsville         74                  500                 0.248756
Warrior            78                  489                 0.243284

Data Source: BAO-ESRI

3.4.5. Identification of Employment Sub-centers

To identify "ring-cities" or employment sub-centers within the Birmingham metropolitan area, I use the Business Analyst Online (BAO) software provided by the Environmental Systems Research Institute (ESRI). BAO provides business summary data, including the location of businesses, total employees, and demographic information on the surrounding neighborhood. I define an employment center as one having a large concentration of businesses and thus a high employment density. Employment density is defined as the number of employees per unit area (employees/acre). Table 3.1 shows the 13 largest concentrations of businesses and employment in the Birmingham metro area. Whether each of these sub-centers qualifies as an employment center on its own is a matter of empirics.

Different criteria have been used to define employment centers. McDonald (1987) suggests two approaches to identifying employment centers, namely local peaks in employment density and in the employment-population ratio. Small and Song (1994) define a center as a set of contiguous zones, each with density above some cutoff $\bar{D}$, and with total employment above some cutoff $\bar{E}$. Using criteria of $\bar{D}$ = 20 employees per acre and $\bar{E}$ = 20,000, they identified 7 employment centers for 1970 and 10 for 1980 in Los Angeles. Applying these criteria to the Birmingham area, I find that only the CBD/Downtown qualifies as a center (see Table 3.1). Relaxing these criteria to half the values used by Small and Song (1994), i.e. $\bar{D}$ = 10 employees per acre and $\bar{E}$ = 10,000, I still get only Downtown Birmingham as the sole center of employment.

3.4.6. Polycentric Density Function Results

I estimate equation 5 by nonlinear least squares using Birmingham metro area population density for 1990 and 2000 by census block group. Only five centers, whose densities exceed 1.5, are included in the estimation to avoid high multicollinearity in the independent variables, since distances to most of the centers are highly collinear. In the five-center model, none of the parameters is statistically significant (Table 3.4). Using only two centers in the polycentric model, the gradient of the first center is statistically significant but that of the second center is not (Table 3.5). This result could be interpreted to mean that the polycentric model does not adequately describe the spatial pattern of the Birmingham metro area. As Table 3.1 further reveals, most of the businesses and employment are concentrated in the downtown area (CBD). The employment density is 24.5 in the CBD, while most of the other possible employment sub-centers have employment densities below 2. This is possibly an indication that Downtown Birmingham (CBD) is the prominent center of employment in the metro area.

3.4.7. Constructing an Index of Sprawl

As already discussed at length, the literature on urban sprawl does not give a single accepted measure of sprawl. The norm, it seems, is to define sprawl in terms of several dimensions. My definition of urban sprawl involves five dimensions of population and housing density, following Malpezzi and Guo (2005) and Ewing et al. (2002). The five measures of sprawl I construct are (1) population density (persons/sq mile), (2) housing density (houses/sq mile), (3) percentage change in population from 1990 to 2000, (4) distance to the CBD, and (5) percentage of people living in urbanized areas.
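Five measures of this kind can be collapsed into a single index by taking the first principal component of the data. The Python sketch below illustrates the mechanics on synthetic data and should be read as an illustration under assumptions rather than as the procedure used in this study. The data are generated from a hypothetical latent "compactness" factor, the function name `first_principal_component` is introduced here for illustration, and the variables are standardized before the decomposition (an assumption made because the five measures are in different units; the text does not state whether this was done).

```python
import numpy as np


def first_principal_component(X):
    """First principal component of the standardized columns of X.

    Returns (scores, loadings, share), where share is the proportion of
    total variance explained by the first component.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize each measure
    corr = np.cov(Z, rowvar=False)                     # correlation matrix of X
    eigvals, eigvecs = np.linalg.eigh(corr)            # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]                  # sort largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, 0]                           # eigenvector e_1
    scores = Z @ loadings                              # Y_1 = e_1' X for each observation
    share = eigvals[0] / eigvals.sum()                 # lambda_1 / (lambda_1 + ... + lambda_p)
    return scores, loadings, share


# Synthetic data (assumption): 300 block groups, five measures driven by one latent factor.
# Columns: population density, housing density, population change, distance to CBD, pct urbanized.
rng = np.random.default_rng(42)
n = 300
latent = rng.normal(size=n)
signs = np.array([1.0, 1.0, -1.0, -1.0, 1.0])          # growth and distance fall with compactness
X = latent[:, None] * signs + 0.4 * rng.normal(size=(n, 5))

scores, loadings, share = first_principal_component(X)
print("loadings:", np.round(loadings, 3))
print(f"share of variance explained by first PC: {share:.2f}")
```

The sign of an eigenvector is arbitrary, so whether low scores indicate more or less sprawl has to be settled by inspecting the signs of the loadings on the density measures.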
To obtain a univariate measure of urban sprawl, the technique of principal component analysis (hereafter, PCA) is applied to construct a sprawl index from the above named variables [10]. PCA as a data reduction method is used to express multi-dimensional data in terms of a few linear combinations of the original variables without losing much information [11]. Assuming a random vector of variables $X' = [X_1, X_2, \dots, X_p]$ having covariance matrix $\Sigma$ with eigenvalue-eigenvector pairs $(\lambda_1, e_1), (\lambda_2, e_2), \dots, (\lambda_p, e_p)$, where $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0$, PCA can be used to obtain the linear combinations:

$$Y_i = e_i' X = e_{i1} X_1 + e_{i2} X_2 + \dots + e_{ip} X_p, \quad i = 1, 2, \dots, p \qquad (6)$$

where $Y_i$ is the ith principal component. Thus, each principal component is a linear combination of the original variables, weighted by the elements of the corresponding eigenvector. The eigenvectors (loadings) convey information from the original Xs into the principal components: the more important a variable is, the more weight it receives by way of a larger element in the eigenvector. Empirically, the researcher must determine how many principal components (PCs) to retain, depending on how much information one wants to keep in those PCs. In order to construct a univariate measure of sprawl, i.e. a sprawl index, the PCA procedure is programmed to retain only one PC, meaning that we are interested in only the first PC, corresponding to the pair $(\lambda_1, e_1)$. The resulting PC of interest is given by:

$$Y_1 = e_1' X = e_{11} X_1 + e_{12} X_2 + \dots + e_{15} X_5 \qquad (7)$$

where $X_1, X_2, \dots, X_5$ are the various measures of sprawl listed above, and $Y_1$ is, by interpretation, the desired sprawl index. We can determine how good an index we have constructed, i.e. how much information from the original sprawl measures is contained in the index, by calculating the percentage of the variance of $X$ that is explained by the first PC ($Y_1$). The total variance (information) in the $p$ variables is given by the sum of the eigenvalues of $\Sigma$. Johnson and Wichern (2007) derive the total variance of $X$ as follows. First define $\Sigma = P \Lambda P'$, where $\Lambda$ is the diagonal matrix of eigenvalues, $P = [e_1, e_2, \dots, e_p]$, and $PP' = P'P = I$. Then

$$\mathrm{tr}(\Sigma) = \mathrm{tr}(P \Lambda P') = \mathrm{tr}(\Lambda P'P) = \mathrm{tr}(\Lambda) = \lambda_1 + \lambda_2 + \dots + \lambda_p$$

Thus, the total variance can be compactly expressed as:

$$\sum_{i=1}^{p} \mathrm{Var}(X_i) = \lambda_1 + \lambda_2 + \dots + \lambda_p = \sum_{i=1}^{p} \mathrm{Var}(Y_i) \qquad (8)$$

and the proportion of the variance explained by the ith PC is:

$$\frac{\lambda_i}{\lambda_1 + \lambda_2 + \dots + \lambda_p}, \quad i = 1, 2, \dots, p$$

If most (80% or more) of the total variance can be explained by the first one or two PCs, then these PCs can be used to "replace" the original set of variables "without much loss of information" (Johnson and Wichern, 2007).

[10] Factor analysis, another dimension reduction technique, could also be used, especially when interpretation of the resulting factors is desired.
[11] For a comprehensive treatment of PCA and other dimension reduction techniques, see Applied Multivariate Statistical Analysis, 6th ed., by Johnson and Wichern (2007).

3.4.8. Theoretical Generalized Spatial Model

In what follows I present a structural model that is used to study the relationship between urban sprawl, property values, and commute times. It is postulated that these three are jointly determined within the model and thus exhibit right-hand-side endogeneity. Estimating each equation separately would result in simultaneity bias. Consequently, I propose and estimate a generalized spatial two stage least squares (GS2SLS) model. Kelejian and Prucha (2004) offer the theoretical foundations of the GS2SLS model in the context of spatially interrelated cross-sectional equations. The following is a description of the simultaneous system of spatially interrelated cross-sectional equations formulated by Kelejian and Prucha (2004).
$$Y_n = Y_n B + X_n C + \bar{Y}_n \Lambda + U_n \qquad (9)$$

with

$$Y_n = (y_{1,n}, \dots, y_{m,n}), \quad X_n = (x_{1,n}, \dots, x_{k,n}), \quad U_n = (u_{1,n}, \dots, u_{m,n}), \quad \bar{Y}_n = (\bar{y}_{1,n}, \dots, \bar{y}_{m,n}), \quad \bar{y}_{j,n} = W_n y_{j,n}, \quad j = 1, \dots, m$$

where $y_{j,n}$ is the $n \times 1$ vector of cross-sectional observations on the dependent variable in the jth equation, $x_{l,n}$ is the $n \times 1$ vector of cross-sectional observations on the lth exogenous explanatory variable, $u_{j,n}$ is the $n \times 1$ disturbance vector in the jth equation, $W_n$ is an $n \times n$ weight matrix of the spatial relationship among neighbors, and $B$, $C$, and $\Lambda$ are the corresponding $m \times m$, $k \times m$, and $m \times m$ matrices of parameters, respectively (Kelejian and Prucha, 2004). As is typical in the literature, $\bar{y}_{j,n}$ is referred to as the spatial lag of the dependent variable in the jth equation. The weight matrix $W_n$ has zero elements on the diagonal, since an observational unit cannot have a spatial relationship with itself, and non-zero off-diagonal elements if any two observational units share a meaningful spatial relationship, as measured by geographical proximity or by common boundaries. In the latter case, such observational units are called spatial neighbors. I construct $W_n$ based on the distances of the observational units (CBGs) from each other. Distances between the centroids of the CBGs are measured and used to create the weight matrix.

The model also allows for spatial autocorrelation in the disturbance terms, which can be expressed by the following autoregressive process (Kelejian and Prucha, 2004):

$$U_n = \bar{U}_n R + E_n \qquad (10)$$

with

$$E_n = (\varepsilon_{1,n}, \dots, \varepsilon_{m,n}), \quad R = \mathrm{diag}_{j=1}^{m}(\rho_j), \quad \bar{U}_n = (\bar{u}_{1,n}, \dots, \bar{u}_{m,n}), \quad \bar{u}_{j,n} = W_n u_{j,n}$$

where $\varepsilon_{j,n}$ is the vector of error terms and $\rho_j$ is the spatial autoregressive parameter in the jth equation. Similar to the spatial lag of the dependent variable stated above, the vector $\bar{u}_{j,n}$ is known as the spatial lag of the disturbance terms in the jth equation and $W_n$ is the corresponding weight matrix.

3.4.9. Empirical Generalized Spatial Model Specification

The three-equation simultaneous equations model (SEM) to be estimated by GS2SLS is given by:

$$y_i = \lambda_i W y_i + \sum_{j \neq i} \alpha_{ij} y_j + X_i \beta_i + u_i, \quad i = 1, 2, 3 \qquad (11)$$

where $y_1$, $y_2$, and $y_3$ are median house values, the sprawl index, and commute times, respectively, and the sum runs over the endogenous variables that appear on the right-hand side of equation $i$. $W$ is the weight matrix, $\lambda_i$ is the coefficient on the spatial lag $W y_i$, $\alpha_{ij}$ is the parameter on the jth right-hand-side endogenous variable in the ith equation, $\beta_i$ is the vector of coefficients on exogenous variables, and $X_i$ is the matrix of exogenous variables in the ith equation. The model allows the disturbances to be correlated spatially and across equations, where $u_i$ refers to the spatial error term in the ith equation. These spatial errors follow an autoregressive process as follows:

$$u_i = \rho_i W u_i + \varepsilon_i, \quad i = 1, 2, 3 \qquad (12)$$

where $\rho_i$ is the spatial autoregressive parameter of the disturbances in the ith equation, reported in the Lambda row of Tables 3.6 to 3.8.

In order to obtain unique and consistent estimates of the parameters in this SEM, each equation must be identified. The order condition for identification states that the number of exogenous variables excluded from equation j must be at least as large as the number of endogenous variables included in equation j (Greene, 2003). But this condition is only necessary, and by no means sufficient, to obtain unique estimates of the parameters. The rank, or sufficiency, condition is concerned with whether there is a sufficient number of exogenous variables in the SEM that can serve as instruments for each right-hand-side endogenous variable. This ensures that there is exactly one solution for the structural parameters, given the reduced-form parameters (Greene, 2003). Each of the equations specified in the SEM above meets both the order and rank conditions. Thus, I am able to obtain unique and consistent estimates of the parameters in each equation by applying the generalized spatial two-stage least squares procedure in MATLAB.

3.5. Results

One of the objectives of the study is to construct a sprawl index by applying PCA. The resulting sprawl index is used in further analysis in the generalized spatial two stage least squares section. Appendix 1 shows the eigenvalue plot of the PCA, and it can be seen that one principal component is enough to express much of the information from the five measures of sprawl. Appendix 2 shows the principal component scores matrix plot, showing the correlation between all principal components.
The principal component pattern profile is shown in Appendix 3. About 90% of the total variance of $X$ is explained by the first principal component. This principal component is the desired sprawl index, which I then use as a univariate measure of urban sprawl in the Birmingham Metro area. Figure 3.4 maps the sprawl index scores by census block group. As can be seen from the map, the index ranges from a minimum of -4.21 to a maximum of 5.00. The smallest values of this index occur at the outer fringes, consistent with low densities on all five measures of sprawl in the sprawling parts of the metro area. Put differently, smaller scores on the index indicate more sprawling areas (see Figure 3.4). Ewing et al. (2002) construct sprawl index scores for different cities in the US, and they find that the most sprawling cities tend to have smaller scores on their index. Similarly, in their study of thirty-five large metropolitan areas, Malpezzi and Guo (2001) found that the more sprawling metro areas score low on the index they constructed. Thus, these findings are consistent with the notion that sprawl is characterized by low-density development.

The results of the density function estimation from OLS and spatial estimators are presented in Tables 3.2 and 3.3. Table 3.2 presents results based on fitting these models to 1990 data, while Table 3.3 presents the results based on 2000 data, along with model diagnostics. For the 1990 data, the AIC shows that the spatial error model fits the data better than the spatial lag model, which in turn fits better than the OLS model; for the 2000 data, the two spatial models have nearly identical AIC values (1356.96 and 1358.20), both well below that of OLS (1830.69). For 1990 data, all estimates are statistically significant at the 1% level, and the signs are typically what would be expected. The parameter of interest from the density models is the density gradient ($\gamma$), the coefficient on distance. The models predict a negative density gradient, i.e. density declines with distance from the CBD.
This is consistent with the theory of the monocentric city, which states that density declines monotonically with distance from the CBD. The estimated elasticity of density with respect to distance from the CBD is -0.72; that is, a 1 percent increase in distance from the CBD is associated with a 0.72 percent decline in density. This is also consistent with most findings on sprawling cities, which indicate lower residential densities as one moves from the inner city to the periphery.

The GS2SLS results of the SEM are presented in Tables 3.6, 3.7, and 3.8. In Table 3.6, I present the results for equation 1, with the log of property values as the dependent variable. The coefficient on the spatial lag of the dependent variable is positive and highly statistically significant, an indication of spatial dependence in house values. This is consistent with my findings in chapters 1 and 2 that house values in general have spill-over effects on others. The sprawl index has an insignificant association with property values. As shown above, more sprawling areas tend to have lower scores on the index, and thus a negative association of the index with property values would mean that more sprawling areas have higher property values, whereas a positive association would mean that more sprawling areas have lower property values. There is no a priori expectation for the sign of this effect. Since sprawling areas consist of newly built houses with larger lot sizes and modern architecture, property values could well be higher there. The median year built for houses in the dataset is in the 1980s for the sprawling areas, compared to the 1940s for the inner city of Birmingham (see Figure 3.1). On the other hand, people may be attracted to the sprawling areas by lower property values. The prediction of the monocentric city model is that households trade off cheaper housing at the periphery for longer commutes to the CBD.
The effect of commute times on property values in this equation is positive, which is opposite to the sign expected from the monocentric city model, where house prices are negatively related to distance to the CBD to compensate for travel costs. Other things equal, sprawling areas have longer commute times. Figure 3.3 shows that the average commute time for the sprawling areas of metropolitan Birmingham is 30-50 minutes, compared to less than 20 minutes in less sprawling areas. The other important and significant explanatory variables in this equation are property characteristics, including median year built and median number of rooms. The median year built is negatively associated with property values. That means that the housing market values older houses less than newer ones. Median number of rooms is found to positively impact property values, consistent with expectation.

Table 3.7 shows the results of equation 2 of the GS2SLS-SEM model, with the sprawl index as the dependent variable. The higher the percentage of people living in urbanized areas (UAs), the higher the index, which makes intuitive sense. If more people lived in densely concentrated areas, we would expect less sprawl. Since higher values of the sprawl index indicate less sprawling areas, it follows that a larger share of people in UAs reduces sprawl. For census year 2000, the Census Bureau defines urbanized areas as consisting of census blocks or block groups that have a population density of at least 1000 people per square mile. The results also show that in areas of high population density the index is higher, and vice versa. This effect corroborates other findings that sprawling areas (areas with low-density development) tend to occur in the suburbs. The coefficient on median house values is positive, contrary to my expectation for the sign of this variable. Median household income is negatively associated with the sprawl index. This supports the prediction of the "monocentric model" that as a city expands and grows outwards, upper-middle-class households tend to locate away from the inner city, in the suburbs, i.e. the sprawling areas. Distance to the central business district is significantly and negatively associated with the sprawl index. The farther one goes from the CBD, the smaller the sprawl index and, by implication, the greater the sprawl.

Equation 3 of the SEM fits commute times as a function of distance to the CBD and transportation type. The effect of distance on commute times is not statistically significant. The effect of the sprawl index is also not significant. Carpooling and driving alone are positively associated with commute times, while truck or van (approximating for mass transit) seems to decrease commute times, although this last coefficient is not statistically significant (Table 3.8). The public transportation variable was dropped from the analysis because it was not significant either statistically or economically, owing to the large number of zeros in it. The data indicate that many people either do not have access to or choose not to use public transit. This also underscores the reliance on the automobile as the main means of commuting for people in sprawling areas.

3.6. Conclusion

In this paper I have explored the multidimensionality of sprawl in the Birmingham Metro area using census block group level data. Applying the technique of principal component analysis to five measures of sprawl, I construct an index of sprawl for each census block group in the metro area. These five measures of sprawl are (1) population density (persons/sq mile), (2) housing density (houses/sq mile), (3) percentage change in population from 1990 to 2000, (4) distance to the CBD, and (5) percentage of people living in urbanized areas. Peripheral census block groups score low on the index, indicating more sprawl occurring at the fringes. I also estimate log density functions and find that the density gradient is -0.72, implying that density declines with distance from the central business district.
This result is consistent with the tenets of the "monocentric-city" model of Alonso (1964), Muth (1969), and Mills (1972). Finally, in a generalized spatial two stage least squares model of urban sprawl, property values, and commute times, I find that, other things equal, sprawling areas have longer commute times and higher property values than non-sprawling areas.

Table 3.2. Monocentric density function estimates for 1990. Dependent variable: log (density)

Variable          OLS                    SAR                    SEM
Distance (log)    -0.495*** (-9.789)     -0.189*** (-4.27)      -0.720*** (-6.18)
HH Income         0.353*** (9.376)       0.316*** (7.00)        0.356*** (12.10)
Pct Pop in UA     0.024*** (19.470)      0.012*** (8.83)        0.015*** (9.054)
Poverty rate      1.162*** (9.559)       1.007*** (9.39)        1.015*** (10.46)
Owner occupied    -0.002*** (-9.031)     -0.001*** (-6.925)     -0.001*** (-6.32)
Constant          -7.564*** (-7.027)     -9.589*** (-8.48)      -5.216*** (-5.62)
R-sq              0.62                   0.57                   0.76
AIC               1753.62                1102.58                1081.11
Rho                                      0.626*** (34.31)
Lambda                                                          0.735*** (33.94)
N                 694                    694                    694

***, **, * denote 1%, 5%, and 10% statistical significance respectively; t-ratios in parentheses.

Table 3.3. Density function estimates for 2000. Dependent variable: log (density)

Variable          OLS                    SAR                    SEM
Distance          -0.088 (-0.82)         -0.025 (-0.28)         -0.19 (-1.05)
HH Income         -0.363 (-1.47)         -0.19 (-0.86)          -0.12* (-1.88)
Pct Pop in UA     0.012*** (5.01)        0.006*** (2.98)        0.008** (2.56)
Below Poverty     0.34 (1.55)            0.38** (2.48)          0.421*** (4.39)
Owner occupied    0.0009*** (4.45)       0.0009*** (5.43)       0.0009*** (5.09)
Constant          17.03*** (5.45)        6.528** (2.156)        14.82*** (10.83)
R-sq              0.13                   0.13                   0.36
AIC               1830.69                1356.96                1358.20
Rho                                      0.579*** (13.129)
Lambda                                                          0.59*** (31.76)
N                 509                    509                    509

***, **, * denote 1%, 5%, and 10% statistical significance respectively; t-ratios in parentheses.

Table 3.4. Polycentric Density Function: Five Employment Centers

                  1990                          2000
Location          Intercept      Gradient       Intercept      Gradient
Downtown          3918           3.090          668            -13.35
                  (4373)         (3.917)        (288909)       (778)
Hoover            -5840          0.010          -536           11.929
                  (16959314)     (2.9560)       (8052847)      (436376)
Bessemer          1081           8.6145         -2589          0.00678
                  (404.1)        (7.232)        (1.012E1)      (25072)
Gardendale        56050          -0.0177        2680           -0.1460
                  (16960418)     (5.3499)       (1.011E10)     (537134)
Trussville        529            9.493          83.50          -11.3609
                  (438)          (20.131)       (159636)       (2973)

Note: Standard errors in parentheses. Dependent variable is population density. Model estimated by nonlinear least squares. Convergence criterion not met.

Table 3.5. Polycentric Density Function: Two Employment Centers

                  1990                          2000
Location          Intercept      Gradient       Intercept      Gradient
Downtown          2312†          9.187†         1363137†       0.019†
                  (136)          (0.947)        (233942)       (0.0093)
Hoover            995†           12.69†         36116          -0.0669
                  (175)          (4.616)        (86299)        (0.048)

Note: Standard errors in parentheses. Dependent variable is population density. Model estimated by nonlinear least squares. Convergence criterion met. † denotes 5% statistical significance.

Table 3.6. GS2SLS-SEM estimation results, Equation 1 (log Median House Values)

Variable          Estimate       SE             t-ratio
Wy1               0.6572         0.0534         12.299
Sprawl Index      -0.005         0.0474         -0.1084
Commute           0.3881         0.0587         6.6067
Pctbpov           -0.0026        0.0045         -0.5888
Pctbachelo        0.0106         0.0065         1.6275
Popdens           -1.46E-05      3.88E-05       -0.3755
Median year       -0.0208        0.00462        -4.5096
Median rms        0.1829         0.0718         2.5469
Plumbing          0.00021        0.00079        0.2719
Median HH Inc     2.15E-08       5.28E-06       0.0040
Housing Units     0.00022        0.00077        0.2957
Occupied          -0.0002        0.00029        -0.8314
Unemployed        -0.0016        0.00083        -2.0307
Constant          40.066         8.97546        4.4639
Lambda            -0.298
N                 509

Table 3.7. GS2SLS-SEM estimation results, Equation 2 (Sprawl index)

Variable          Estimate       SE             t-ratio
Wy2               0.0022         0.0396         0.0577
Median Value      0.0736         0.0127         5.7569
Pct pop in UA     0.0171         0.0010         16.985
Median HH Inc     -7.29E-06      1.32E-06       -5.5352
Popdens           0.00045        1.17E-05       38.7097
Dis2cbd           -0.0668        0.0049         -13.5346
Owner_occ         0.00024        8.10E-05       3.02566
Pctbpov           0.0014         0.00165        0.8809
Constant          -1.745         0.1978         -8.8215
Lambda            0.3121
N                 509

Table 3.8. GS2SLS-SEM estimation results, Equation 3 (log Commute Times)

Variable          Estimate       SE             t-ratio
Wy3               0.7942         0.0396         21.369
Sprawl Index      0.0548         0.0513         1.0679
Truck_Van         -1.88E-05      0.00019        -0.095
Carpooled         0.00568        0.0005         11.1103
Drovealone        0.00035        0.0002         1.7157
Popdens           7.33E-06       3.23E-05       0.2269
Dis2cbd           0.0058         0.0062         0.9297
Constant          1.2667         0.3150         4.0206
Lambda            -0.5575
N                 509

Figure 3.3. Commute times in the Birmingham Metro Area

Figure 3.4. Sprawl Index for Birmingham Metro Area

REFERENCES

Alonso, W. (1964). Location and land use: Towards a general theory of land rent. Cambridge, MA: Harvard University Press.
Anselin, L. (1988). "Spatial Econometrics: Methods and Models." Studies in Operational Regional Science. Kluwer Academic Publishers.
Anselin, L., and A. K. Bera, "Spatial Dependence in Linear Regression Models with an Introduction to Spatial Econometrics." In A. Ullah and D. Giles, eds., Handbook of Applied Economic Statistics, New York, N.Y.: Marcel Dekker, 1998, pp. 237-289.
Bartik, T.J. (1987). "The Estimation of Demand Parameters in Hedonic Price Models." The Journal of Political Economy, Vol. 95, No. 1, pp. 81-88.
Bauman, J.F. (1987). "Public Housing, Race, and Renewal: Urban Planning in Philadelphia, 1920-1974." Philadelphia: Temple University Press.
Bowly, Devereux, Jr. (1978). "The Poorhouse: Subsidized Housing in Chicago 1895-1976." Carbondale: Southern Illinois University Press.
Brasington, D.M. and Hite, D. (2005). "Demand for Environmental Quality: A Spatial Hedonic Analysis." Regional Science and Urban Economics, Vol. 35, pp. 57-82.
Brasington, D. M. (1999). "Which Measures of School Quality Does the Housing Market Value? Spatial Evidence vs. Non-Spatial Evidence." Journal of Real Estate Research 18, 395-413.
Brueckner, J. K. (2000). Urban sprawl: Diagnosis and remedies. International Regional Science Review, 23, 160-171.
Brueckner, J. K., & Fansler, D. A. (1983). The economics of urban sprawl: Theory and evidence on the spatial size of cities. The Review of Economics and Statistics, 65, 479-482.
Bullard, R.D. (1996). "Environmental Justice: It's more than waste facility siting." Social Science Quarterly 77: 493-499.
Burchell, R. W., Lowenstein, G., Dolphin, W. R., Galley, C. C., Downs, A., Seskin, S., & Moore, T. (2000). The benefits of sprawl. In The costs of sprawl - Revisited (pp. 351-391). Washington, DC: Transportation Research Board and National Research Council.
Burchell, R.W., Shad, N.A., Listokin, D., Phillips, H., Downs, A., Siskin, S., Davis, J.S., Moore, T., Helton, D., Gall, M., and ECONorthwest (1998). Costs of Sprawl - Revisited. Washington, DC: National Academy Press.
Butler, R.V. (1982). "The Specification of Hedonic Indexes for Urban Housing." Land Economics, Vol. 58, No. 1.
Carter, W.H., Schill, M.H. and Wachter, S. M. (1998). "Polarization, Public Housing and Racial Minorities in US Cities." Urban Studies, Vol. 35, No. 10, 1889-1911.
Clark, C. (1951). Urban Population Densities. Journal of the Royal Statistical Society, Series A, December 1951, 490-96.
Cochran, W. G. (1968). "The effectiveness of adjustment by sub-classification in removing bias in observational studies." Biometrics, 24, 205-213.
Cummings, P. and Landis, J. (1993). "Relationships between affordable housing development and neighboring property values." Working Paper 599, Berkeley, CA: Institute of Urban and Regional Development.
DeSalvo, J. S. (1974). "Neighborhood Upgrading Effects of Middle Income Housing Projects in New York City." Journal of Urban Economics, 1(3), pp. 269-77.
Downs, A. (1960). "An Economic Analysis of Property Values and Race." Reports and Comments. Land Economics. 35:36, pp. 180-188.
Downs, A. (1994). New Visions for Metropolitan America. Published by The Brookings Institution and Lincoln Institute of Land Policy.
Downs, A. (1999). Some realities about sprawl and urban decline. Housing Policy Debate, 10, 955-974.
Edmonston, B. (1975). Population Distribution in American Cities. Lexington: Heath and Company.
Ewing, R. H. (1994). Characteristics, causes, and effects of sprawl: A literature review. Environmental and Urban Issues, Winter, 1-15.
Ewing, R. H. (1997). Is Los Angeles-style sprawl desirable? American Planning Association Journal, 63, 107-126.
Ewing, R., Pendall, R., and Chen, D. (2002). Measuring Sprawl and Its Impacts. Smart Growth America. Retrieved at http:/smartgrowthamerica.org.
Family Housing Fund. (2000, September). "A Study of the Relationship Between Affordable Family Rental Housing and Home Values in the Twin Cities: Final Report." Minneapolis, MN: http://www.fhfund.org/whatsnew.htm.
Fischer, Manfred M. and Getis, Arthur (2010). Handbook of Applied Spatial Analysis: Software Tools, Methods and Applications. Springer.
Fotheringham, A. Stewart and Rogerson, Peter A. (2009). The SAGE Handbook of Spatial Analysis. SAGE Publications Ltd.
Galster, G., Hanson, R., Ratcliffe, M. R., Wolman, H., Coleman, S., & Freihage, J. (2001). Wrestling sprawl to the ground: Defining and measuring an elusive concept. Housing Policy Debate, 12, 681-717.
Galster, G.C., Cutsinger, J.M. and Malega, R. (2006). "The Social Costs of Concentrated Poverty: Externalities to Neighboring Households and Property Owners and the Dynamics of Decline." National Poverty Center Working Paper Series, #06-42.
Galster, G.C., Tatian, P. and Smith, R. (1999). "The Impact of Neighbors Who Use Section 8 Certificates on Property Values." Housing Policy Debate, Vol. 10, Issue 4.
Glaeser, E. L., & Kahn, M. E. (2003). Sprawl and urban growth.
Harvard Institute of Economic Research Discussion Paper, Number 2004. Goetz, E.G., Lam, H. K. and Heitlinger, A. (1996). ?There goes the neighborhood? The impact of subsidized multi-family housing on urban neighborhoods.? Minneapolis- St. Paul: University of Minnesota, Center for Urban and Regional Affairs. Gordon, P., & Richardson, H. W. (1997). Are compact cities a desirable planning goal? Journal of the American Planning Association, 63, 89?106. Gordon, P., & Richardson, H. W. (2000). Critiquing sprawl?s critics. Policy Analysis, 365, 1?18. Green, R.K., Malpezzi, S. and Seah, K. Y. (2002). ?Low Income Housing Tax Credit Housing Developments and Property values.? The University of Wisconsin. The Center for Urban Land Economics. www.bus.wisc.edu/realestate Greene, W.H. (2003). Econometric Analysis. Fifth Edition. Prentice Hall. Hamilton, B.W., & Roell,A. (1982). Wasteful Commuting. The Journal of Political Economy, Vol. 90, No.5, pp 1035-1053. Hite, D (2000). ?A Random Utility model of environmental equity.? Growth and Change 31:40- 58. Hite, D.,Chern, W., Hitzhusen, F.,and Randall, A. (2001). ?Property-Value Impacts of an Environmental Disamenity: The case of Landfills.? Journal of Real Estate Finance and Economics, 22:2/3, 185-202. Imbens, G. and Wooldridge J., (2007). ?What?s New in Econometrics? Lecture 10: Difference- in-Differences Estimation.? NBER Summer Institute. Imbens, G. and Wooldridge J., (2009). ?Recent Developments in the Econometrics of Program Evaluation.? Journal of Economic Literature 47:5-86. 104 Jargowsky, P. (2002). Sprawl, Concentration of Poverty, and Urban Inequality. In ?Squires (2002), Urban Sprawl: Causes, Consequences, and Policy Responses. The Urban Institute Press. Washington, D.C.? Johnson, R.A., and Wichern, D.W. (2007). Applied Multivariate Statistical Analysis, 6th edition. Pearson Education. Kane, T.J., Riegg, S.K., & Staiger, D. O.(2006). 
"School Quality, Neighborhoods, and Housing Prices," American Law and Economics Review, Oxford University Press, vol. 8(2), pages 183- 212. Kennedy, Maureen and Leonard, Paul (2001). ?Dealing with Neighborhood Change: A Primer on Gentrification and Policy Choices.? A Discussion Paper Prepared for the Brookings Institution Center on Urban and Metropolitan Policy. www.brookings.edu/urban. Kohlhase, J. (1991). ?The Impact of ToxicWaste Sites on Housing Values.? Journal of Urban Economics 30, 1-26. Lee, C-M., Culhane, D. P., and Wachter, S.M. (1999). "The Differential Impacts of Federally Assisted Housing Programs on Nearby Property Values: A Philadelphia Case Study" Departmental Papers (SPP). Available at: http://works.bepress.com/dennis_culhane/11. LeSage, J. P. and R. K. Pace (2009). ?Introduction to Spatial Econometrics.? Volume 196 of Statistics, Textbooks and Monographs, CRC Press, 2009. LeSage, James, P. (1997). ?Regression Analysis of Spatial Data.? The Journal of Regional Analysis and Policy, JRAP 27, 2:83-94. LeSage, James, P. (1999). ?The Theory and Practice of Spatial Econometrics.? Department of Economics, University of Toledo. Lyons, R.F. and Loveridge, S. (1993). ?An hedonic estimation of the effect of federally subsidized housing on nearby residential property values.? Staff Paper P93-6, Minneapolis- St. Paul: University of Minnesota, Department of Agricultural and Applied Economics. 105 Malpezzi, S. & Guo,W-K.(2001). Measuring ?Sprawl?: Alternative Measures of Urban Form in US Metropolitan areas. The Center for urban Land Economics Research, The University of Wisconsin. http://wiscinfo.doit.wisc.edu/realestate. Malpezzi, S. (1999). Estimates of the Measurement and Determinants of Urban Sprawl in US Metropolitan areas. The Center for urban Land Economics Research, The University of Wisconsin. http://wiscinfo.doit.wisc.edu/realestate. Massey, D.S., and Kanaiaupuni, S. M. (1993). ?Public Housing and the Concentration of Poverty.? 
Social Science Quarterly, Vol. 74, No. 1. McClure,K. (2006). The Low-Income Tax Housing Credit Program goes Mainstream and Moves to the Suburbs. Housing Policy Debate, Vol 17, Issue 3. Fannie Mae Foundation. McDonald, J.F. (1987). The Identification of Urban Employment Subcenters. Journal of Urban Econmics 21:242-258. Mieszkowski, P., & Mills, E. S. (1993). The causes of metropolitan suburbanization. Journal of Economic Perspectives, 7, 135?147. Mills, E. S. (1967). An aggregative model of resource allocation in a metropolitan area. American Economic Review Papers and Proceedings, 57, 197?210. Mills, E. S. (1972). Studies in the Structure of the Urban Economy. Baltimore: John Hopkins Press. Mills, E. S.,and Tan, J.P.(1980). A Comparison of Urban Population Density Functions in Developed and Developing Countries. Urban Studies, 17:3,313-21. Muth, R. F. (1969). Cities and housing. Chicago: University of Chicago Press. Nechyba,T.J., & Walsh, R.P. (2004). Urban Sprawl. Journal of Economic Perspectives,Vol 18 No 4, Pages 177-200. Nelson, A. C., J. Genereux, and M. Genereux (1992). ?Price Effects of Landfills on House Values.? Land Economics 68, 359-365. 106 Nelson, J. P. (1978). ?Residential Choice, Hedonic Prices and the Demand for Urban Air Quality.? Journal of Urban Economics 5, 357-369. Newman,S.J.and Schnare, A.B. (1997). ??And a Suitable Living Environment?: The Failure of Housing Programs to deliver on neighborhood Quality. Housing Policy Debate, Vol 8, Issue 4. Fannie Mae Foundation. Nguyen, M.T.(2005). ?Does affordable housing detrimentally affect property values? A review of the literature.? Journal of Planning Literature 20(1): 15-28. Nourse, H. O. (1963). ?The Effect of Public Housing on Property values in St Louis.? Land Economics, 39, pp. 434-41. Rabiega, W. A., Ta-Win, L. and Robinson, L. (1984). ?The Property Value Impacts of Public Housing Projects in Low and Moderate Density Residential Neighborhoods.? Land Economics, 60, pp. 174-9. Rohe, W. M. 
and Freeman, L. (2001). ?Assisted Housing and Residential Segregation: The Role of Race and Ethnicity in the Siting of Assisted Housing Developments?, Journal of the American Planning Association,67: 3, 279 ? 292. Rosen, Sherwin. (1974). ?Hedonic Prices and Implicit markets: Product Differentiation in Pure Competition.? Journal of Political Economy. Vol. 82, pp.34-55. Rosenbaum, P. R. and Rubin, D. B. (1983). ?The central role of the propensity score in observational studies for causal effects.? Biometrika, 70, 41-55. Rubin, D. (1974). ?Estimating Causal Effects of Treatments in Randomized and Non- Randomized Studies.? Journal of Educational Psychology, 66: 688-701. Santiago, A.M., Galster, G.C., & Tatian, P. (2001). ?Assessing the Property Value Impacts of the Dispersed Housing Subsidy Program in Denver.? Journal of Policy Analysis and Management, Vol.20, No.1, 65-88. 107 Schwartz, A.E., Ellen, I.G., Voicu, I., and Schill, M.H. (2003). ?Estimating the External Effects of Subsidized Housing Investment on Property Values.? Lincoln Institute of Land Policy: Working Paper: WP03AS1. Siegel, F. (1999). The sunny side of sprawl. The New Democrat, March/April, 21?22. Sierra Club (1998). The dark side of the American dream: The costs and consequences of suburban sprawl. Washington, DC. Small, K.A., & Song, S. (1994). Population and Employment Densities: Structure and Change. Journal of Urban Economics, 36, 292-313. Song,S. (1996). Some Tests of Alternative Accessibility Measures: A population Density Approach. Land Economics, 72(4): 474-82. Squires, G.D. (2002). ?Urban Sprawl: Causes, Consequences, and Policy Responses. The Urban Institute Press.? Washington, D.C. Tobler, W. (1979). ?Cellular Geography.? In Spatial Econometrics: Methods and Models, by Anselin, 1988, Kluwer Academic Publishers. Pp. 8. USA Today (2001). A Comprehensive look at Sprawl in America. Available at http://www.usatoday.com/news/sprawl/main.htm. 
Warren, Elizabeth, Aduddell, R.M., and Tatlovich, Raymond (1983). The impact of subsidized housing on property values: A two-pronged analysis of Chicago and Cook County suburbs. Center for Urban Policy, Loyola University of Chicago. Wasserman, M. (2000). Urban sprawl: American cities just keep growing, growing, and growing. Regional Review, 10, Federal Reserve Bank of Boston. Retrieved December 21, 2007, from http://www.bos.frb.org/economic/nerr/rr2000/q1/wass00_1.htm. Wassmer, R. W. (2008). Causes of Urban Sprawl in the United States: Auto Reliance as Compared to Natural Evolution, Flight from Blight, and Local Revenue Reliance. Journal of Policy Analysis and Management, Vol. 27, No. 3, 536?555. 108 Weber, J., and Sultana,S. (2005). The Impact of Sprawl on Commuting in Alabama. University Transportation Center for Alabama. UTCA Report 04108. Unpublished. Wheaton, W.C. (1974). A Comparative Static Analysis of Urban Spatial Structure. Journal of Economics Theory, 9, 223-237. 109 Appendix 1. Scree Plots Appendix 2. Principal Component Score Matrix Plot 110 Appendix 3. Principal Component Pattern Profiles