UAP 5114, Spring 2018
Computer Application in Planning |
Planning Analytics and Visualization

Virginia Tech, College of Architecture and Urban Studies | School of Urban Affairs & Planning

11:00 AM - 12:15 PM, Architecture Annex Room 1, Mon & Wed
Instructor: Wenwen Zhang

Overview

We are in the era of big data, with 2.5 exabytes - that's 2.5 billion gigabytes (GB) - of data generated every day. Planners are expected to have the quantitative analysis skills and data visualization techniques needed to use urban big data to address, monitor, and measure complex urban problems. This class offers planners an introduction to urban analytics, with a focus on unstructured urban data acquisition and processing. The class also covers the fundamentals of machine learning and text mining algorithms that can be used to develop supervised and unsupervised predictive models to address real-world inquiries. The course further emphasizes data visualization and communication skills to strengthen planners' public engagement in the age of big data. By the end of the semester, students should be comfortable adapting Python scripts to obtain and process unstructured urban data, train and test machine learning models, visualize model outputs effectively, and communicate them to local stakeholders.

Office Hours

Wenwen Zhang @ Architecture Annex, Room 215
Wed 12:15 - 1:00 PM or by Appointment

Class Objective

The objective of this course is to familiarize students with modern unstructured data acquisition, processing, and machine learning techniques. The course also demonstrates how these tools can be applied in real-world contexts. The course is practice oriented, i.e., concepts and models are motivated and illustrated by application to urban problems and datasets. By the end of the semester, students will be able to:
  1. Use the basics of Python 2.7
  2. Download data using APIs
  3. Scrape data online
  4. Clean messy data
  5. Perform basic data analysis using machine learning algorithms
  6. Apply natural language processing (text mining)
  7. Evaluate model outputs
  8. Visualize model outputs

Course Procedure and Organization

In the course, students will work on a REAL-WORLD PROJECT: Evaluating the social and economic impacts of the Arlington Restaurant Initiative (ARI). Students are expected to become familiar with the project and develop a social and economic evaluation framework, based on a literature review, during the first 2-3 weeks of the class.
The course is organized as a series of lectures, hands-on exercises, and readings. The lectures and readings are designed to give students basic knowledge of unstructured urban data, statistical analysis, machine learning, and implementation. The exercise sessions will provide students with hands-on experience adjusting Python scripts for data acquisition, processing, model development, and analysis of model results. During the implementation process, students will learn to download data via APIs, scrape websites, query data, perform basic joins, run statistical tests, and develop and evaluate models in Python. Students will use Google Classroom to submit assignments and discuss questions on class materials and homework. The class materials and schedules are published on the Google Classroom website.

The challenge in learning planning analytics lies in combining a conceptual understanding of different data acquisition and processing techniques with Python training in implementation. Therefore, every attempt has been made, wherever possible, to provide a “hands-on” exercise that applies the concepts demonstrated in the preceding lecture sessions. Additionally, case studies (i.e., the ARI project for this semester) in homework assignments will provide examples of how the data and models learned in class may be applied in real-world practice, in data preparation, and in problem-solving, to bridge the gap between academic concepts and technical practice.

Readings and Materials

Data science, machine learning, data mining
Visualization
Readings will be posted to Google Classroom. Students are expected to finish the readings before coming to class.

Class Schedule (Tentative)

Date | Topic | Mon | Wed | Reminders
Jan. 17 | Welcome and Overview | | Course Introduction: Data-Analytics Thinking; Arlington Restaurant Initiative (ARI) Project Background |
Jan. 22, 24 | Intro. to Python | Python Basics I, Tutorial; Social Impact Analysis Framework | Python Basics II; Social Impact Analysis Group Presentation | HW 1 OUT
Jan. 29, 31 | Data Acquisition: APIs | Download Data using APIs; Economic Impact Analysis Framework | Download Data using APIs; Economic Impact Analysis Group Presentation |
Feb. 05, 07 | Data Acquisition: Webscraping | Download Data using Webscraping Scripts | Download Data using Webscraping Scripts | HW 1 DUE FEB 07TH, 11:59 PM
Feb. 12, 14 | Data Processing | Geocoding using Google API | Data Cleaning + Fusion | HW 2 OUT
Feb. 19, 21 | Data Analysis + Viz Basics | Numpy + Pandas: Data Descriptive Analysis | Introduction to Data Visualization |
Feb. 26, 28 | Data Viz | No Class | Tableau + GIS: Data Visualization |
Mar. 05, 07 | Spring Break | No Class | No Class |
Mar. 12, 14 | Midterm Review | HW 2 Group Presentations | Machine Learning vs. Statistics | HW 2 DUE MAR 14TH, 11:59 PM
Mar. 19, 21 | Machine Learning | Supervised Machine Learning: Linear, Logistic and SVM | Supervised Machine Learning: Trees and Boosting | HW 3 OUT
Mar. 26, 28 | Unsupervised + Reinforcement Learning | Unsupervised Machine Learning: Clustering | Reinforcement Learning |
Apr. 02, 04 | Natural Language Processing | Introduction to Text Mining | Text Mining Model Development |
Apr. 09, 11 | Model Evaluation | What Contributes to a Good Predictive Model? | How to Avoid Overfitting? |
Apr. 16, 18 | Model Results Visualization | Predictive Model Visualization | HW 3 Group Presentation | HW 3 DUE APR 18TH, 11:59 PM
Apr. 23, 25 | Project Wrapping Up | Data Dashboard Development | Mock Final Presentation |
Apr. 30, May 02 | Final Project | Preparation | Final Project Final Presentation |

Course Requirements and Grading

Homework Assignments [60%]

The BEST way to get help with homework assignments is to post your questions to Piazza. (Please review the posted questions first for potential answers.) If you prefer to keep your questions visible only to the TA and the instructor, you may use the private post feature (i.e., check the "Individual Student(s) / Instructor(s)" radio box).
Teams of 3-4 should be formed at the beginning of the semester. Most assignments are team projects. VT students MUST observe the Graduate Honor Code manual.
We plan to have three assignments (check the schedule for due dates):
  • HW1 [25%]: Social and Economic Impact Analysis Framework Development * Team Work
  • HW2 [25%]: Data Collection and Control Group Selection [Logistic Regression] * Team Work
  • HW3 [10%]: Real-time Twitter data scraping and Tableau * Individual Work
The detailed grading criteria will be provided in the assignment descriptions. Students will have approximately 1.5 - 2 weeks to finish each assignment. The in-class exercise sessions will demonstrate how to use/adjust Python scripts to finish the assignments!

Class Project [30%]

The objective of the final project is to develop a data dashboard for the Arlington County Police Department (ACPD) to check the long-term economic and social impact of the Arlington Restaurant Initiative (ARI). The final project is an extension of HW 1 and HW 2. The teams are expected to use the second half of the semester to improve and extend the impact evaluation framework and data from HW 1 and HW 2 and to develop a comprehensive and professional data dashboard to be presented to ACPD.

Participation [10%]

Hint: Piazza does track students' contributions.
Attendance, participation in class discussions and out-of-class discussions (on Piazza), and motivation may affect your grade and could influence any borderline grade. Roll will be called in class.

Final Evaluation

The breakdown of final grades:
  • A : [94, 100]
  • A- : [90, 94)
  • B+ : [87, 90)
  • B : [83, 87)
  • B- : [80, 83)
  • C+ : [77, 80)
  • C : [73, 77)
  • C- : [70, 73)
  • D : [60, 70)
  • F : [0, 60)
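
For students who want to estimate where they stand, the weights and grade bands above can be sketched in Python as follows. This is only an illustration, not the official grade calculation: it assumes every component is scored on a 0-100 scale and that no rounding is applied before the letter is looked up, and the names WEIGHTS, CUTOFFS, final_score, and letter_grade are hypothetical.

```python
# Illustrative sketch only; the official grade is determined by the instructor.
# Assumption: each component is scored on a 0-100 scale.

# Percentage weights from this syllabus (homework 60%, project 30%, participation 10%).
WEIGHTS = {"hw1": 25, "hw2": 25, "hw3": 10, "project": 30, "participation": 10}

# Lower bound of each letter band, highest first; each band is closed on the left.
CUTOFFS = [(94, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
           (77, "C+"), (73, "C"), (70, "C-"), (60, "D"), (0, "F")]


def final_score(scores):
    """Return the weighted final score (0-100) from a dict of component scores."""
    # Dividing by 100.0 keeps this a float division in both Python 2.7 and 3.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100.0


def letter_grade(score):
    """Map a final score in [0, 100] to a letter using the bands above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower, letter in CUTOFFS:
        if score >= lower:
            return letter


# Hypothetical example scores, not real data.
example = {"hw1": 90, "hw2": 85, "hw3": 100, "project": 88, "participation": 95}
print(letter_grade(final_score(example)))  # prints B+
```

Because each band includes its lower bound, a final score of exactly 90 maps to A- while 89.99 maps to B+.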

Late Submissions Policy

Class Accessibility

Any student with a disability requiring accommodations in this course is encouraged to contact me after class or during office hours. In addition, students should contact the Services for Students with Disabilities for Academic Success.

Finally, a couple of housekeeping things

  1. Check the class website frequently to see any changes, updates, tips, and announcements.
  2. When you submit your final work, please combine all your work into one PDF file, following a file naming convention like this: ‘ch1_Lastname.pdf’, ‘major1_Lastname.pdf’, ‘Exam_Lastname.pdf’. Word and Excel files are NOT allowed for submission.
  3. When you have questions, please e-mail me with “[UAP5114]” at the start of the subject line.

Welcome to the world of Planning Analytics and Visualization and good luck!