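Schedule: 7 weeks of online group research study + 5 weeks of paper-writing guidance with no time limit
Osman: Tenured Full Professor at Carnegie Mellon University (CMU)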
Osman is a Full Research Professor of Electrical and Computer Engineering (ECE) at Carnegie Mellon University (CMU). Prior to joining the faculty of the ECE department in August 2013, he was a Postdoctoral Research Fellow in CyLab at CMU. He also held a visiting Postdoctoral Scholar position at Arizona State University during Fall 2011. Dr. Yağan received his Ph.D. degree in Electrical and Computer Engineering from the University of Maryland at College Park, MD in 2011, and his B.S. degree in Electrical and Electronics Engineering from the Middle East Technical University, Ankara (Turkey) in 2007.
Dr. Yağan's research focuses on modeling, analysis, and performance optimization of computing systems, and uses tools from applied probability, network science, data science, and machine learning. In the context of data science and ML, he is working on statistical inference and decision making using sequential samples (e.g., multi-armed bandits), and resilient distributed machine learning. On the network science side, he has broad interests including robustness of cyber-physical systems with emphasis on critical infrastructure systems; secure and reliable design of large-scale ad-hoc networks with an increasing focus on emerging applications of Internet of Things; and contagion processes in complex networks with a focus on modeling, analysis, and control of spread of viruses, (mis)information, and opinions.
Dr. Yağan is a senior member of IEEE, and a recipient of a CIT Dean's Early Career Fellowship, an IBM Academic Award, and best paper awards at ICC 2021 and IPSN 2022.
Institution
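Carnegie Mellon University (CMU), founded in 1900, is a world-renowned private research university and home to one of the world's oldest schools of computer science. It is listed in the CSRankings world ranking, and in the U.S. News rankings its undergraduate and master's computer science programs are tied nationally with those of Stanford University, the Massachusetts Institute of Technology, and the University of California, Berkeley. As of March 2019, 20 of the university's faculty and alumni had won Nobel Prizes, 13 had won Turing Awards, 22 had been named members of the American Academy of Arts and Sciences, 19 had joined the American Association for the Advancement of Science, and 72 had been elected to American academies. Carnegie Mellon is one of the leading U.S. universities for computer science, and its computer science program has been ranked for many consecutive years.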
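The multi-armed bandit problem is a classic problem in probability theory and an important building block of deep reinforcement learning. To solve this kind of sequential decision-making problem under uncertainty, researchers proposed the multi-armed bandit (MAB) algorithmic framework. In recent years the framework has attracted wide attention in application areas ranging from recommendation systems and information retrieval to healthcare and financial investment, thanks to its strong performance and its ability to learn from limited feedback. This project takes the framework as its core content: students will gain an in-depth understanding of the basic models and their applications, and will learn the widely used Upper Confidence Bound (UCB) and Thompson Sampling algorithms. The instructor will also present his own recent research results in this area.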
This is an introductory course on multi-armed bandits, which provides a sequential decision-making framework under uncertainty and has broad applications in recommendation systems, dynamic pricing, clinical trials, financial investments, etc. We will cover the classical multi-armed bandit model and its applications, several widely used algorithms proposed for its solution including the Explore-Then-Commit (ETC), Upper Confidence Bound (UCB), and Thompson Sampling (TS) algorithms, performance analysis of these algorithms, and conclude the lectures with the recent work of the instructor on correlated and structured bandits.
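For readers new to the topic, the three algorithms named above can be illustrated with a short sketch. This is not course material and not the instructor's implementation; it assumes Bernoulli (0/1) rewards, and the function names, the arm means, and the exploration length `m` are all hypothetical choices made for illustration. UCB is shown in its common UCB1 form, and Thompson Sampling uses a uniform Beta(1, 1) prior.

```python
import math
import random


def pull(mean, rng):
    """Bernoulli reward: 1 with probability `mean`, otherwise 0."""
    return 1 if rng.random() < mean else 0


def explore_then_commit(means, horizon, m, rng):
    """ETC: pull each arm m times (m >= 1), then commit to the best empirical arm."""
    k = len(means)
    counts, sums, total, best = [0] * k, [0.0] * k, 0, 0
    for t in range(horizon):
        if t < m * k:
            arm = t % k  # round-robin exploration phase
        else:
            if t == m * k:  # choose the committed arm once
                best = max(range(k), key=lambda a: sums[a] / counts[a])
            arm = best
        reward = pull(means[arm], rng)
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return total, counts


def ucb1(means, horizon, rng):
    """UCB1: play the arm with the largest empirical mean plus confidence bonus."""
    k = len(means)
    counts, sums, total = [0] * k, [0.0] * k, 0
    for t in range(horizon):
        if t < k:
            arm = t  # play every arm once so each count is positive
        else:
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t + 1) / counts[a]),
            )
        reward = pull(means[arm], rng)
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return total, counts


def thompson_sampling(means, horizon, rng):
    """TS: sample each arm's mean from its Beta posterior, play the largest sample."""
    k = len(means)
    wins, losses, total = [0] * k, [0] * k, 0
    for _ in range(horizon):
        arm = max(
            range(k),
            key=lambda a: rng.betavariate(wins[a] + 1, losses[a] + 1),
        )
        reward = pull(means[arm], rng)
        wins[arm] += reward
        losses[arm] += 1 - reward
        total += reward
    return total, [wins[a] + losses[a] for a in range(k)]


if __name__ == "__main__":
    rng = random.Random(0)
    means = [0.2, 0.5, 0.8]  # hypothetical arm means, unknown to the algorithms
    horizon = 5000
    runs = {
        "ETC": lambda: explore_then_commit(means, horizon, 100, rng),
        "UCB1": lambda: ucb1(means, horizon, rng),
        "TS": lambda: thompson_sampling(means, horizon, rng),
    }
    for name, run in runs.items():
        total, counts = run()
        # Regret here compares against the expected reward of always playing the best arm.
        regret = horizon * max(means) - total
        print(f"{name}: pulls per arm = {counts}, regret = {regret:.1f}")
```

Each function returns the total reward collected and the number of pulls per arm, so the printed regret gives a rough sense of how quickly each strategy concentrates on the best arm. The performance analysis mentioned in the course description concerns how this regret grows with the horizon.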