United States


http://www.census.gov/geo/www/tiger/tgrshp2009/tgrshp2009.html
(or directly from http://www2.census.gov/geo/tiger/)

New! 2009 version just came out on 10/1/09! Previous versions are also linked from this site.

The TIGER/Line Shapefiles are extracts containing selected geographic and cartographic information from the Census Bureau’s MAF/TIGER® database. Unlike the previous Cartographic Boundary Files site, this site includes block boundary files as well as hydrography, transportation/streets and landmark data files. Users can also download multiple TIGER/Line Shapefiles at a time via this FTP site.

Again, these files contain no demographic information but are designed to be used with decennial census population and housing data as well as other related federal datasets.

http://www.icpsr.umich.edu/cocoon/ICPSR/SERIES/00059.xml

[Available to ICPSR members only - most U.S. higher education institutions are member institutions!]

“The United States Historical Election Returns series was developed by ICPSR and was supported by grants from the National Science Foundation and the National Endowment for the Humanities. ICPSR’s holdings of historical election data cover the years 1788-1990 and consist of several discrete datasets that contain county- and state-level returns for all elections to the offices of president, governor, United States senator, and United States representative.”

I haven’t used the files yet. I will post the resulting maps later.

http://www.americaview.org/

“AmericaView (AV) is a nationwide program that focuses on satellite remote sensing data and technologies in support of applied research, K-16 education, workforce development, and technology transfer.” Most states’ AmericaView programs provide access to their satellite imagery and remotely sensed data collections. Visit the member state websites below to find out more about the state programs.

Current full member states are: Alabama, Alaska, Arkansas, California, Georgia, Hawaii, Idaho, Indiana, Iowa, Kansas, Kentucky, Louisiana, Maryland, Michigan, Minnesota, Mississippi, Montana, Nebraska, New Hampshire, New Mexico, North Dakota, Ohio, Pennsylvania, South Dakota, Texas, Virginia, West Virginia, Wisconsin, Wyoming

Current affiliated member states are: Colorado, Nevada, New York, North Carolina, Utah, Vermont, Washington

http://www.huduser.org/Datasets/nsp_foreclosure_data.html

HUD’s Neighborhood Stabilization Program (NSP) local-level foreclosure data is a statistically generated dataset that estimates foreclosure rates down to the census tract level. The data was created under a new law, the Housing and Economic Recovery Act of 2008, to assist local efforts to stabilize high-risk neighborhoods by acquiring and redeveloping foreclosed properties. Since no public, nationwide source of foreclosure data is readily available, some users may find this data useful. Remember: these are not actual counts but estimates derived from public datasets – read the methodology document!

[Image left - HUD NSP Foreclosure Rate in Chicago metropolitan area (sorry, I forgot to add a legend!) - click on the image to enlarge]

http://www4.uwm.edu/eti/PurchasingPower/purchasing.htm

[Image left: Estimated annual expenditures for food away from home per square mile, Metro Chicago by zipcode - click on the image to enlarge.]

Marketing people are very interested in learning about consumers’ spending patterns. The main data source for this is the Bureau of Labor Statistics (BLS) Consumer Expenditure (CE) Survey, which is available only at a national level, not at any smaller geographic level.

But many people were also interested in learning “where” people might spend money – i.e. the geographic patterns of consumer spending – so marketing/business data companies developed a way to create such data by adjusting the BLS Consumer Expenditure Survey data to local demographics from the decennial censuses. This data is now widely available for a fee from various companies, such as ESRI (data methodology), Applied Geographic Solutions (data methodology – page 14), and Geolytics (the Northwestern University Library has this CD, 2003/Estimates and 2008/Projections, which includes consumer expenditure at small geographic levels down to the block group, BG).
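To make the idea of “adjusting national survey data to local demographics” concrete, here is a deliberately simplified sketch. It is not any vendor’s actual methodology (those are described in the methodology documents mentioned above); it simply assumes that average spending varies by household income bracket and weights a national average by local household counts. The bracket names, the dollar amounts, and the household counts are all made up for illustration.

```python
# Hypothetical national average annual spending on food away from home,
# by household income bracket (made-up numbers, NOT actual BLS figures).
NATIONAL_AVG_SPENDING = {"low": 1200.0, "middle": 2500.0, "high": 4500.0}


def estimate_local_spending(households_by_bracket, avg_spending=NATIONAL_AVG_SPENDING):
    """Estimate total annual spending for one area.

    households_by_bracket maps an income bracket to the number of
    households in that bracket (e.g. taken from decennial census counts).
    """
    return sum(avg_spending[bracket] * count
               for bracket, count in households_by_bracket.items())


def spending_per_square_mile(households_by_bracket, land_area_sq_mi):
    """Normalize the estimate by land area, as a density map would."""
    if land_area_sq_mi <= 0:
        raise ValueError("land area must be positive")
    return estimate_local_spending(households_by_bracket) / land_area_sq_mi


# Made-up ZIP code: 1,000 low-, 2,000 middle- and 500 high-income households
# on 4 square miles of land.
example_zip = {"low": 1000, "middle": 2000, "high": 500}
total = estimate_local_spending(example_zip)
density = spending_per_square_mile(example_zip, land_area_sq_mi=4.0)
```

Real products use many more demographic variables than income alone, so treat this only as a picture of the general approach.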

GIS users who do not wish to spend money on this kind of data can use this Free Purchasing Power Profiles and Workforce Density Data for All Census Tracts and Residential ZIP Codes in U.S. instead. This dataset, based on the 2002 CE survey plus 2000 census data, is provided by the University of Wisconsin-Milwaukee Employment and Training Institute. It’s totally free, but the interface is not so flexible – I still don’t know the best way to download this data. Oh, well! It’s free – what can we say. The data methodology page is available from here.

http://www.nass.usda.gov/research/Cropland/SARS1a.htm
(Note 1: CDL is available for selected states only. Note 2: CDL data prior to 2007 is available at the Geospatial Data Gateway)

Cropland Data Layer (CDL) is a raster data set showing types of agricultural products grown on the land. The land use/cover analysis was conducted using satellite imagery and remote sensing analysis software (so I understand – more information about methodology & classification process is available from the methodology & FAQ pages.)

This dataset can be used as an alternative source for land use/land cover data, which I learned from our geography instructor, Professor Greene. The focus of the NLCD (National Land Cover Dataset) and the CDL is different: the NLCD covers more general land use/cover types, with more classes for urbanized areas, while the CDL has many classes for crop types instead of just one class like “82: cultivated crops”. Still, it’s possible to create a “crosswalk” to compare land use/cover changes over time since both datasets use similar classifications – like the 1992/2001 NLCD crosswalk and the Land Cover Institute’s Classification System Crosswalk table (scroll down a bit!). As you see, the main land use/cover classes are basically the same: water, developed/urban land, barren land, forests, shrub land, grassland, cropland/agricultural land, wetland, others/clouds – all we need is a crosswalk to create common classes. (Note: of course, we still need to consider differences in data accuracy and methodology.)
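A crosswalk is really just a lookup table from each dataset’s class codes to a shared set of classes. Below is a minimal sketch of the idea on tiny in-memory grids. The NLCD codes follow the NLCD 2001 legend; the CDL codes (1 = corn, 5 = soybeans, 24 = winter wheat) are only an illustrative subset, so check the CDL legend for your state and year before relying on them. The function names and the sample grids are made up for this example.

```python
# NLCD 2001 class codes mapped to broad common classes.
NLCD_TO_COMMON = {
    11: "water",
    21: "developed", 22: "developed", 23: "developed", 24: "developed",
    31: "barren",
    41: "forest", 42: "forest", 43: "forest",
    52: "shrubland",
    71: "grassland",
    81: "agriculture", 82: "agriculture",
    90: "wetland", 95: "wetland",
}

# Illustrative subset of CDL crop codes (assumption: verify against the
# CDL legend for the state/year in use): 1 corn, 5 soybeans, 24 winter wheat.
CDL_TO_COMMON = {1: "agriculture", 5: "agriculture", 24: "agriculture"}


def reclassify(grid, crosswalk, default="other"):
    """Replace every class code in a 2-D grid with its common class."""
    return [[crosswalk.get(code, default) for code in row] for row in grid]


def changed_cells(before, after):
    """Count cells whose common class differs between two same-sized grids."""
    return sum(a != b
               for row_a, row_b in zip(before, after)
               for a, b in zip(row_a, row_b))


# Made-up 2x2 grids covering the same area.
nlcd_grid = [[82, 82], [41, 11]]
cdl_grid = [[1, 5], [24, 999]]   # 999 is an unmapped code -> "other"

nlcd_common = reclassify(nlcd_grid, NLCD_TO_COMMON)
cdl_common = reclassify(cdl_grid, CDL_TO_COMMON)
n_changed = changed_cells(nlcd_common, cdl_common)
```

With real rasters you would apply the same lookup to arrays read by your GIS software; the dictionary is the part that carries over.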

http://www.irs.gov/efile/article/0,,id=118376,00.html#DR

[Image left: showing % change in average adjusted gross income (AGI) between 2004 and 2007 (tax year) - click on the image to enlarge.]

Another IRS data source for community (zipcode) economic/income levels. Unfortunately, this file doesn’t include income range data, which is available for a fee. See: SOI Tax Stats – Individual Tax Statistics – Zip Code Data

Pros: available annually since tax year 2004 at a relatively small geographic level (i.e. zip code.)

Cons: incomplete data – the data doesn’t include people who didn’t file tax returns.

Data fields included are:

  • Age of primary taxpayers by range (<30, 30-44, 45-60, >60)
  • Average adjusted gross income (AGI)
  • Average refund
  • County
  • Town
  • Zipcode
  • and many tax form related fields
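The percent change in average AGI shown in the map above is a simple calculation once two tax years are matched by zip code. Here is a small sketch; the dollar figures are made up, and zip codes that appear in only one year are skipped rather than guessed at.

```python
def percent_change(old, new):
    """Percent change from old to new; None when the base value is zero."""
    if old == 0:
        return None
    return (new - old) * 100.0 / old


# Made-up average AGI by zip code for two tax years (not actual IRS values).
avg_agi_2004 = {"60601": 80000.0, "60602": 50000.0}
avg_agi_2007 = {"60601": 100000.0, "60602": 45000.0, "60603": 70000.0}

# Only zip codes present in both years can be compared.
agi_change = {
    zip_code: percent_change(avg_agi_2004[zip_code], avg_agi_2007[zip_code])
    for zip_code in avg_agi_2004.keys() & avg_agi_2007.keys()
}
```

Keeping zip codes as strings avoids losing leading zeros (e.g. in New England zip codes) when joining the result to a boundary file.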

http://www.thearda.com/

“The Association of Religion Data Archives (ARDA) strives to democratize access to the best data on religion. Founded as the American Religion Data Archive in 1997 and going online in 1998, the initial archive was targeted at researchers interested in American religion. The targeted audience and the data collection have both greatly expanded since 1998, now including American and international collections and developing features for educators, journalists, religious congregations, and researchers.”

ARDA is a great place to look for religion information and datasets. The data includes U.S. county-level denominational profile data, which is now also accessible via Social Explorer.

http://www.irs.gov/taxstats/charitablestats/article/0,,id=97186,00.html

If you are looking for list of nonprofit, charitable or religious organizations, this is a good starting point. This dataset includes name and address of organizations, classification/foundation codes and organizational codes (note: this is useful but incomplete!), etc.  See the current data document – http://www.irs.gov/pub/irs-soi/eobk09.doc

Here is one example. On May 17, 2009, the Chicago Tribune published a story about a distressed Chicago neighborhood, North Lawndale (where homicide was the 4th leading cause of death in 2003 – above accidents, according to the Chicago Dept of Public Health report). The center of the story was storefront churches: “Aldermen and business leaders in the North Lawndale area of Chicago are questioning whether the area’s numerous churches are causing economic harm.” (Storefront churches: Causing more economic harm than spiritual good? 05/17/2009) A map in the paper showing the locations of churches along Roosevelt and Pulaski roads (not available online) intrigued me, so I mapped the locations of Chicago churches found in the IRS Exempt Organizations master file – see the map on the left (click on the map to enlarge). Comparing the Chicago Tribune map with mine, it is obvious that my church data (extracted from the IRS file where foundation code = 10) missed the storefront churches the Tribune found. The data probably needs further investigation.

According to my professor, Dr. Michael Barndt of the Nonprofit Center of Milwaukee, the dataset is fairly comprehensive, but some information may not be up to date. So be aware of such shortcomings if you use this data.

Other sources for nonprofit organizations are:

  • National Center for Charitable Statistics (NCCS)/Urban Institute – http://nccs.urban.org – is said to have more comprehensive and up-to-date data, but this dataset is not free (thank you for the information, Professor Barndt).
  • Phone/yellow books – yes, old fashioned but still vital source of organization listings.

http://www.historicaerials.com/

“HistoricAerials.com provides free online access to historic and current aerial photography. You can view aerial photography from the 1930s through today. Use our multi-year comparison tools to detect changes in property. Come and explore your favorite points of interest at HistoricAerials.com.”

Keywords: historical, historic, airphotos, aerial photographs, aerial photos, orthophotos, orthoimagery, ortho photographs

http://www-fars.nhtsa.dot.gov/

FARS (Fatality Analysis Reporting System) contains data on all fatal traffic crashes in U.S. and Puerto Rico. The data system was designed to “assist the traffic safety community in identifying traffic safety problems, developing and implementing vehicle and driver countermeasures, and evaluating motor vehicle safety standards and highway safety initiatives.”

FARS is a relational database containing four normalized tables (crashes, persons, vehicles, drivers) that are related to one another. You can either download the entire annual databases from an FTP site, or select cases by data field and geography. To select cases, click on the Query tab and follow the instructions (i.e. Step 1: select a year – submit; Step 2: select fields – submit, etc.)
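If you download the full tables, working with them means joining them on the key they share. The sketch below shows that pattern with Python’s built-in sqlite3 module and two tiny made-up tables. The table layout, the column names (st_case, per_no, per_type) and the text values are placeholders for illustration; the real field names and coded values are in the FARS data manuals.

```python
import sqlite3

# In-memory database with two made-up tables standing in for the
# crash-level and person-level FARS files.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE crashes (st_case INTEGER PRIMARY KEY, year INTEGER, fatals INTEGER);
    CREATE TABLE persons (st_case INTEGER, per_no INTEGER, per_type TEXT);
    INSERT INTO crashes VALUES (1, 2007, 1), (2, 2007, 2);
    INSERT INTO persons VALUES (1, 1, 'driver'), (1, 2, 'pedalcyclist'),
                               (2, 1, 'driver'), (2, 2, 'passenger');
""")

# Join persons to their crash and keep only crashes involving a pedalcyclist.
rows = conn.execute("""
    SELECT c.st_case, c.year, p.per_no
    FROM crashes AS c
    JOIN persons AS p ON p.st_case = c.st_case
    WHERE p.per_type = 'pedalcyclist'
    ORDER BY c.st_case
""").fetchall()
conn.close()
```

The same join works in a desktop database or GIS package; the point is that one crash record can match many person or vehicle records.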

Make sure to read the data manuals to understand this dataset – they are available from this FTP site, ftp://ftp.nhtsa.dot.gov/FARS/FARS-DOC/.

[Image above: pedalcycle fatal collisions 2001-07, over Chicago bike paths.]

http://lehdmap3.did.census.gov/themap3/

LED (Local Employment Dynamics) OnTheMap is “a web-based mapping and reporting application that shows where workers are employed and where they live. It also provides companion reports on age, earnings, industry distributions, and local workforce indicators.” Unlike other federal programs, the main source for this application is state data – the payroll tax payment records (Unemployment Insurance, also known as the Employment Security or ES-202 program) maintained by each state – and the information is available for the participating states only (LED version 3 has 46 states). It’s a fascinating program! Learn more about this tool and related programs, starting from this site: http://lehd.did.census.gov/led/datatools/onthemap3.html

[Image left: Laborshed analysis map, showing the residential pattern of City of Chicago employees who work in City Hall.]

You can make maps and a variety of data reports on the OnTheMap site (use of the web tool is highly recommended) but *experienced* data users may choose to use the public use data instead, which is available from http://vrdc.ciser.cornell.edu/onthemap/doc/index.html (user registration is required!) The LED OnTheMap data is available at a census block level.  Note: it’s not easy to use this dataset – use the OnTheMap web tool instead and stay away unless you are comfortable with large dataset handling and statistics!
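For those who do brave the block-level public use data, one common first step is rolling blocks up to census tracts. A census block GEOID has 15 digits (2 state + 3 county + 6 tract + 4 block), so the first 11 digits identify the tract. The sketch below uses that fact; the block IDs and job counts are made up, and the function names are just for this example.

```python
from collections import defaultdict


def tract_geoid(block_geoid):
    """Return the 11-digit tract GEOID for a 15-digit census block GEOID."""
    if len(block_geoid) != 15 or not block_geoid.isdigit():
        raise ValueError("expected a 15-digit block GEOID: %r" % block_geoid)
    return block_geoid[:11]


def aggregate_to_tracts(block_counts):
    """Sum block-level counts into tract-level totals."""
    totals = defaultdict(int)
    for block, count in block_counts.items():
        totals[tract_geoid(block)] += count
    return dict(totals)


# Made-up block-level job counts in Cook County, IL (state 17, county 031).
block_jobs = {
    "170318391001000": 120,
    "170318391001001": 30,
    "170318392001000": 50,
}
tract_jobs = aggregate_to_tracts(block_jobs)
```

Store GEOIDs as text, not numbers; otherwise states with a leading zero in their code lose a digit and the slicing breaks.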

http://www.transtats.bts.gov/Tables.asp?DB_ID=630&DB_Name=Census%20Transportation%20Planning%20Package%20%28CTPP%29%202000&DB_Short_Name=CTPP%202000

(Check also: 1990 CTPP data from BTS transtats. Or, get the 1990 & 2000 data CDs from the BTS Bookstore.)

CTPP, the Census Transportation Planning Package, formerly known as “journey to work” data, is a collection of special summary tables tabulated from the decennial censuses for transportation planning. The smallest geographic levels are either census tracts, block groups, or TAZs (Traffic Analysis Zones), depending on the MPO (Metropolitan Planning Organization). In Chicago, the smallest geography is the census tract in 2000, and the TAZ in 1990.

History: In 1970 and 1980, this dataset was called the “Urban Transportation Planning Package (UTPP)”. The UTPP is not publicly available – check with local MPOs (Metropolitan Planning Organizations).

Use Part 2 data, place of work, to obtain job/employment counts. Part 1, place of residence, data are basically the same as decennial population census data with more detailed “transportation” related variables. Part 2 is uniquely tabulated – counting people not by place of residence (part 1) but by place of work. Hence, you can find how many people work in a specific census tract, or a specific place.

Though CTPP isn’t meant to be used to count jobs, part 2 data can offer good alternative estimates.  To learn more about the use of CTPP part 2 data as a source of employment estimates, I highly recommend this article by Nanda Srinivasan – CTPP Workers-at-Work Compared to Other Employment Estimates (on page 2).

[Image left: %  female and male employees - by place of employment. The maps show different employment/job location patterns by gender in space. Source: CTPP 2000, Part 2, Table 3.]

Part 3, journey to work, data is equally unique and useful if you are interested in commuting patterns. The Part 3 tells you how many people travel/commute from one place (residence) to another (work place.)
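The relationship between the three parts is easy to see if you think of Part 3 as a list of (residence, workplace, workers) flows: summing by workplace gives Part 2-style counts, and summing by residence gives Part 1-style counts. The sketch below illustrates this with made-up numbers. The workplace tract labels are borrowed from the campus example in this post, the residence tracts “A”, “B”, “C” are placeholders, and the actual CTPP table layouts differ, so this only shows the logic.

```python
from collections import Counter

# Made-up journey-to-work flows: (residence tract, workplace tract, workers).
flows = [
    ("A", "8087.02", 40),
    ("B", "8087.02", 25),
    ("A", "4113", 10),
    ("C", "4113", 60),
]


def workers_by_workplace(flows):
    """Total workers counted at each place of work (Part 2-style view)."""
    totals = Counter()
    for _home, work, workers in flows:
        totals[work] += workers
    return totals


def workers_by_residence(flows):
    """Total workers counted at each place of residence (Part 1-style view)."""
    totals = Counter()
    for home, _work, workers in flows:
        totals[home] += workers
    return totals


def laborshed(flows, work_tract):
    """Residence tracts sending workers to one workplace, largest first."""
    matches = [(home, workers) for home, work, workers in flows if work == work_tract]
    return sorted(matches, key=lambda pair: -pair[1])


jobs = workers_by_workplace(flows)
residents = workers_by_residence(flows)
shed = laborshed(flows, "8087.02")
```

A laborshed map like the ones shown here is essentially the output of the last function joined to tract boundaries.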

[Image right: Employee residential patterns - Northwestern University (Evanston Campus, tract 8087.02) & Univ of Chicago (North & The Quadrangle campuses, tract 4113). Source: CTPP 2000, Part 3, Table 3. ]

Another CTPP-like data source for employment analysis is the LED OnTheMap data. See the next posting.

http://www.cdc.gov/brfss/maps/gis_data.htm

Download the historical (2002-recent) BRFSS (Behavioral Risk Factor Surveillance System) GIS data from the Centers for Disease Control and Prevention website. This BRFSS data is available at a state level only. (Some states may provide county-level BRFSS data – check state public health departments.) The zip files contain BRFSS data that is mapped for both the states and metropolitan/micropolitan statistical areas (MMSAs).

[Image left - click here to enlarge : a map showing % of binge drinkers (adults having five or more drinks on one occasion).]

http://datawarehouse.hrsa.gov/

U.S. Department of Health & Human Services, Health Resources and Services Administration (HRSA)’s health service related geospatial data download and map viewing website. Datasets include: Primary Care Service Area (PCSA) boundaries, registered nurses database (down to county), Health Professional Shortage Area (HPSA) data (I haven’t checked this one yet.)

Another related popular dataset is HRSA Area Resource File (http://www.arfsys.com/) – county level health service related data (health care facilities, professionals and demographic data)  but this isn’t available online. Find a library that provides access to this dataset. Our library, the Northwestern University Library has this 2001-2005 & 2007 dataset, but the access is limited to the Northwestern affiliated members only.

[Image above: from the HRSA map tool help site, http://datawarehouse.hrsa.gov/DWOnlineMap/Help/]

http://www.bts.gov/publications/national_transportation_atlas_database/_database/

NTAD is a set of nationwide geographic databases of transportation facilities, transportation networks, including public transit networks and associated infrastructure. The latest dataset is available for downloading from this site (now 2008.)  If you want previous editions, check a local library. (For example, our library, the Northwestern University Library, has 1995 – 2007 CDs.)

http://www.mrlc.gov/

The Multi-Resolution Land Characteristics (MRLC) Consortium is specifically designed to meet the current needs of Federal agencies for nationally consistent satellite remote sensing and land-cover data and also provides imagery and land cover data as public domain information, all of which can be accessed through this website.

Currently available land cover datasets are

http://water.usgs.gov/maps.html

USGS Water-Resources offices provide water information that benefits the Nation’s citizens: Publications, data, maps, and applications software. Lists of water GIS datasets are available from this site.

Included are:

  • Water Data (Real-Time Data, Annual Water Data Reports, Streamflow Map of the United States)
  • WaterWatch (Floods and High Flow, Drought,  Monthly Streamflow, Ground Water, Water Quality)
  • Etc.

http://nhd.usgs.gov/

“The NHD is a comprehensive set of digital spatial data representing the surface water of the United States using common features such as lakes, ponds, streams, rivers, canals, and oceans. These data are designed to be used in general mapping and in the analysis of surface-water systems using geographic information systems (GIS).”

You can also download NHD data from the USDA Geospatial Data Gateway (my preference for downloading NHD). Note: this is actually quite a complicated dataset – be aware! You may want to use other, simplified hydrography data instead, such as US Census TIGER-derived data or data from the National Atlas Raw Data download website.

http://www.fema.gov/hazard/map/index.shtm

It’s been a while since I last retrieved Q3 floodplain GIS data from FEMA’s website, and I am a little confused by this new site – there are three similar GIS datasets with slightly different data descriptions and pricing. Here is what I’ve learned about the floodplain data so far.

DFIRM: Digital Flood Insurance Rate Map ($10.00 per community) http://msc.fema.gov/webapp/wcs/stores/servlet/CategoryDisplay?catalogId=10001&storeId=10001&categoryId=12001&langId=-1&userType=G&type=1&parent_category_rn=12009&dfirmCatId=12009
The DFIRM database is a collection of the digital data used in GIS systems for creating new Flood Insurance Rate Maps (FIRMs). These datasets cover a county or a community. The orthorectified aerial photography used to create the base maps, which were used in the creation of FIRMs, is not part of a DFIRM database. However, it may be purchased with the DFIRM database, if available. Those interested in purchasing these items should select the “CD with basemaps” option on the product selection screen. In some cases, partial DFIRM databases are available. The GIS data contained in those databases does not cover the entire county or community. Partial does not mean community-based; both community-based and countywide could be partial.

DFIRM NFHL: National Flood Hazard Layer ($0.00, free – why? – http://msc.fema.gov/webapp/wcs/stores/servlet/CategoryDisplay?catalogId=10001&storeId=10001&categoryId=12011&langId=-1&type=12 )
National Flood Hazard Layer (NFHL) dataset is a compilation of effective Digital Flood Insurance Rate Map (DFIRM) databases (a collection of the digital data that are used in GIS systems for creating new Flood Insurance Rate Maps) and Letters of Map Change (Letters of Map Amendment and Letters of Map Revision only) that create a seamless GIS data layer for a State or Territory. The dataset is updated on a quarterly basis and it is made available on DVD in Shapefile format. We can order the National Flood Hazard Layer (NFHL) GIS datasets by state on DVDs.

FEMA Digital Q3 Data ($50.00 per disc) http://www.fema.gov/hazard/map/q3.shtm
The Q3 Flood Data product is a digital representation of certain features of FEMA’s FIRM product, intended for use with desktop mapping and GIS technology. Digital Q3 Flood Data has been developed by scanning the existing FIRM hardcopy, vectorizing a thematic overlay of flood risks. The vector Q3 Flood Data files contain only certain features from the existing FIRM hardcopy.

Luckily, various local government sites and data portals redistribute Q3 floodplain data since it isn’t copyrighted. Perhaps we should get Q3 data from such sites. For example, Chicago area Q3 data is available from the Natural Connections Data Archive site (http://www.greenmapping.org/archive – select the “Floodplain” category).

[This isn't really GIS data - these are electronic maps] FEMA Issued Flood Maps – see the image above (Evanston, IL): http://msc.fema.gov/webapp/wcs/stores/servlet/CategoryDisplay?catalogId=10001&storeId=10001&categoryId=12001&langId=-1&userType=R&type=1
We can obtain electronic FEMA-issued flood maps over aerial photography from this site. They are more like electronic print maps than GIS data – I think you can download them as a PDF or an image file.
