Data Mining for Transportation
1000+ students enrolled
Last updated: 2025/04/02
Course dates: 2025/02/20 - 2025/04/30
Course length: 10 weeks
Status: In session
Weekly hours: -
Course Introduction

This course is motivated by the development of information technology. The amount of traffic data collected is growing at an increasing rate. At the same time, users of these data expect more sophisticated analysis of these large data sets. The field of data mining has developed over the last decade to address this problem.

Data mining, one of the most active fields in computer science, is often defined as discovering useful but hidden patterns or relationships in a database. It is a worthwhile field of study not only for computer science students but also for transportation students and many other engineering students, because the same techniques can be applied to many data-related problems that may arise during their future careers.

This course covers the basic concepts of data mining as well as specific applications to transportation systems, including data preprocessing, instance-based learning, decision trees, support vector machines, neural networks, outlier detection, and ensemble learning. The instructors will introduce what the techniques are, what they can do, how they are used, and how they work.

We welcome you to join us.

Syllabus

Week 1. Introduction to data mining

1.1 What is data mining?

1.2 Data mining functionality

1.3 Data Mining Techniques

1.4 Summary

Slides

Topic for Discussion: Week 1

Python Foundations

Sklearn

Test 1

Term Project

Term Project

Week 2. Data pre-processing

2.1 Why preprocess the data?

2.2 Data cleaning

2.3 Data integration

2.4 Data reduction

2.5 Data transformation

2.6 Summary

Slides

Topic for Discussion: Week 2

Test 2

Week 3. Instance based learning

3.1 Overview of IBL

3.2 Components of kNN

3.3 Variants of kNN

3.4 Summary

Slides

Topic for Discussion: Week 3

Test 3

Week 4. Decision Trees

4.1 Decision Tree Representation

4.2 Constructing a Decision Tree

4.3 Overfitting and Tree Pruning

4.4 Pros and Cons of DTs

Slides

Topic for Discussion: Week 4

Test 4

Week 5. Support Vector Machine

5.1 Linear SVMs

5.2 Non-linear SVMs

5.3 Multiclass

5.4 Support Vector Regression

5.5 Summary

Slides

Topic for Discussion: Week 5

Test 5

Week 6. Outlier Mining

6.1 Background of Outlier Detection

6.2 Statistics-based Method

6.3 Distance-based Method

6.4 Density-based Method

6.5 Conclusions

Slides

Topic for Discussion: Week 6

Test 6

Week 7. Ensemble Learning

7.1 General Idea on Ensemble Methods

7.2 Popular Ensemble Methods

7.3 Class-Imbalanced Data

7.4 Summary

Slides

Topic for Discussion: Week 7

Test 7

Week 8. Clustering

8.1 Introduction to Clustering

8.2 K-means and K-medoids

8.3 DBSCAN

8.4 Model Based Clustering

Test 8

International professors' teaching resources

Satish

Fengxiang Qiao

Bilal

Course Projects

Detection of abnormal driving behavior

Analysis of shared parking choice behavior

Identify factors influencing drowsy driving

Fuel consumption estimation of vehicles

Emission estimation of vehicles

Emissions analysis for LNG bus

Code with Python and scikit-learn

Code

Academic Writing and Presentation

Skills to write

Skills to present