Continuing Learning Resources for Big Data Analysis

Big Data Analysis

- The Data Mining Specialization teaches data mining techniques for both structured data, which conforms to a clearly defined schema, and unstructured data, which exists as natural language text. Course topics include pattern discovery, clustering, text retrieval, text mining and analytics, and data visualization. In the Capstone project, learners solve real-world data mining challenges using a restaurant review data set from Yelp.

Master Recommender Systems (U Minnesota)

- This Specialization covers all the fundamental techniques in recommender systems, from non-personalized and product-association recommenders through content-based and collaborative techniques. Designed to serve both the data mining expert and the data-literate marketing professional, the courses offer interactive, spreadsheet-based exercises for mastering different algorithms, along with an honors track where learners can go into greater depth using the LensKit open-source toolkit. A Capstone Project brings together the course material with a realistic recommender design and analysis project.

Machine Learning (Stanford)

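- This course provides a broad introduction to machine learning, data mining, and statistical pattern recognition. Topics include: (i) supervised learning (parametric and non-parametric algorithms, support vector machines, kernels, and neural networks); (ii) unsupervised learning (clustering, dimensionality reduction, recommender systems, and deep learning); (iii) machine learning in practice (bias/variance theory; innovation in machine learning and AI). The course draws on many case studies and applications, so you will also learn how to apply learning algorithms in different areas, such as smart robots (perception and control), text understanding (web search and spam filtering), computer vision, medical informatics, audio, and database mining.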

Probabilistic Graphical Models (Stanford)

- Probabilistic graphical models (PGMs) are a rich framework for encoding probability distributions over complex domains: joint (multivariate) distributions over large numbers of random variables that interact with each other. These representations sit at the intersection of statistics and computer science, relying on concepts from probability theory, graph algorithms, machine learning, and more. They are the basis for the state-of-the-art methods in a wide variety of applications, such as medical diagnosis, image understanding, speech recognition, natural language processing, and many, many more. They are also a foundational tool in formulating many machine learning problems.
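
To make "encoding a joint distribution" concrete, here is a minimal sketch of a three-variable Bayesian network in the spirit of the medical diagnosis application mentioned above. It is not taken from the course: the structure (a disease influencing a test result and a symptom), the variable names, and all probability values are hypothetical and chosen only for illustration. The point is that the graph lets the full joint distribution be written as a product of small local tables.

```python
from itertools import product

# Hypothetical model: Disease -> Test, Disease -> Symptom.
# All numbers below are made up for illustration only.
P_D = {1: 0.01, 0: 0.99}                       # P(Disease)
P_T_GIVEN_D = {1: {1: 0.95, 0: 0.05},          # P(Test | Disease), indexed [d][t]
               0: {1: 0.10, 0: 0.90}}
P_S_GIVEN_D = {1: {1: 0.80, 0: 0.20},          # P(Symptom | Disease), indexed [d][s]
               0: {1: 0.05, 0: 0.95}}


def joint(d, t, s):
    """Joint probability from the graph factorization P(D) P(T|D) P(S|D)."""
    return P_D[d] * P_T_GIVEN_D[d][t] * P_S_GIVEN_D[d][s]


def posterior_disease(t, s):
    """P(Disease = 1 | Test = t, Symptom = s), by enumerating over Disease."""
    evidence = sum(joint(d, t, s) for d in (0, 1))
    return joint(1, t, s) / evidence


# The eight joint entries are defined by only five free parameters
# and still sum to one.
total = sum(joint(d, t, s) for d, t, s in product((0, 1), repeat=3))

if __name__ == "__main__":
    print("sum of joint:", total)
    print("P(disease | positive test, symptom present):", posterior_disease(1, 1))
```

Enumeration like this only works for very small models; with many interacting variables the number of joint entries grows exponentially, which is why dedicated inference algorithms are needed at that scale.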

Big Data Platform

Big Data Analysis with Apache Spark (UC Berkeley)

- Learn how to apply data science techniques using parallel programming in Apache Spark to explore big data.

Distributed Machine Learning with Apache Spark (UC Berkeley)

- Learn the underlying principles required to develop scalable machine learning pipelines and gain hands-on experience using Apache Spark.

Unlock Value in Massive Datasets (UC San Diego)

- Drive better business decisions with an overview of how big data is organized, analyzed, and interpreted. Apply your insights to real-world problems and questions.