Cloud Computing Technology and Applications
School of Computer Science and Technology, Dalian University of Technology
Spring 2010

Course Information
- 申彦明, B810
- Teaching assistant: 齐恒, B812
- Office hour: Fri 3:30-4:30 PM
- Course website:
- Textbook content
- Project
- Papers

Textbook Content
- Overview of distributed systems
- Basic concepts of distributed systems and clusters
- Distributed databases
- Distributed file systems: GFS
- Distributed programming: introduction to the MapReduce algorithm
- Search engines and PageRank
- Other related technologies: Data Center, BigTable, AppEngine

Grading
- HW: 40%
- Final project: 60%
  - Final project proposal
  - Project reports
  - 12 teams, 4-5 students each

Syllabus (subject to change)
- Week 2
  - Mar 8: Lecture 1: Introduction
  - Mar 10: Lecture 2: Map/Reduce Theory and Implementation, Hadoop
- Week 3
  - Mar 15: Lectures 3 & 4: Guest Speaker (8:00 AM-11:35 AM, Room 102, 研教楼)
  - Mar 17: Lecture 5: Distributed File System and the Google File System
- Week 4
  - Mar 22: Lectures 6 & 7: Guest Speaker (8:00 AM-11:35 AM, Room 102, 研教楼)
  - Mar 24: Lecture 8: Distributed Graph Algorithms and PageRank
- Week 5
  - Mar 29: Lecture 9: Introduction to Some Projects
  - Mar 31: Lecture 10: Data Centers
- Week 6
  - Apr 5: Lecture 11: Some Google Technologies
  - Apr 7: Lecture 12: Virtualization
- Week 7: Lectures 13 & 14: Project Presentation
- Week 8: No class
- Week 9: Lectures 15 & 16: Project Presentation

Gartner Report
- Top 10 Strategic Technology Areas for 2009
  - Virtualization
  - Cloud Computing
  - Servers: Beyond Blades
  - Web-Oriented Architectures
  - Enterprise Mashups
  - Specialized Systems
  - Social Software and Social Networking
  - Unified Communications
  - Business Intelligence
  - Green Information Technology
- Top 10 Strategic Technology Areas for 2010
  - Cloud Computing
  - Advanced Analytics
  - Client Computing
  - IT for Green
  - Reshaping the Data Center
  - Social Computing
  - Security Activity Monitoring
  - Flash Memory
  - Virtualization for Availability
  - Mobile Applications

From Desktop/HPC/Grids to Internet Clouds in 30 Years
- Over the last 30 years, HPC has moved from centralized supercomputers to geographically distributed desktops, clusters, and grids, and then to clouds.
- R&D efforts on HPC, clusters, grids, P2P, and virtual machines have laid the foundation of cloud computing, which has been greatly advocated since 2007.
- Computing infrastructure is being located in areas with lower costs in hardware, software, datasets, space, and power requirements, moving from desktop computing to datacenter-based clouds.

What is Cloud Computing?
1. Web-scale problems
2. Large data centers
3. Different models of computing
4. Highly-interactive Web applications

1. “Web-Scale” Problems
- Characteristics:
  - Definitely data-intensive
  - May also be processing-intensive
- Examples:
  - Crawling, indexing, searching, mining the Web
  - Data warehouses
  - Sensor networks
  - “Post-genomics” life sciences research
  - Other scientific data (physics, astronomy, etc.)
  - Web 2.0 applications

How much data?
- Google processes 20 PB a day (2008)
- “All words ever spoken by human beings”: 5 EB
- CERN's LHC will generate 10-15 PB a year
- “640K ought to be enough for anybody.”

What to do with more data?
- Answering factoid questions
  - Pattern matching on the Web
  - Works amazingly well
- Learning relations
  - Start with seed instances
  - Search for patterns on the Web
  - Use patterns to find more instances

How do I make money?
- Petabytes of valuable customer data
  - Sitting idle in existing data warehouses
  - Overflowing out of existing data warehouses
  - Simply being thrown away
- Sources of data:
  - OLTP
  - User behavior logs
  - Call-center logs
  - Web crawls, public datasets
- Structured data (today) vs. unstructured data (tomorrow)
- How can an organization derive value from all this data?

2. Large Data Centers
- Web-scale problems? Throw more machines at it!
- Centralization of resources in large data centers
  - Necessary ingredients: fiber, juice, and land
  - What do Oregon, Iceland, and abandoned mines have in common?
- Important issues:
  - Efficiency
  - Redundancy
  - Utilization
  - Security
  - Management overhead

3. Different Computing Models
- Utility computing
  - Why buy machines when you can rent cycles?
  - Example: Amazon's EC2
- Platform as a Service (PaaS)
  - Give me a nice API and take care of the implementation
  - Example: Google App Engine
- Software as a Service (SaaS)
  - Just run it for me!
  - Example: Gmail
- “Why do it yourself if you can pay someone to do it for you?”
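
The utility-computing question above (why buy machines when you can rent cycles?) comes down to a break-even calculation. The following is a minimal sketch in Python; the function name `break_even_hours` and every price in it are hypothetical illustration values, not figures from the course or from any provider.

```python
def break_even_hours(purchase_cost, owned_hourly_cost, rental_hourly_rate):
    """Return the usage hours after which buying is cheaper than renting.

    purchase_cost      -- up-front price of an owned machine (hypothetical)
    owned_hourly_cost  -- power, cooling, and admin cost per hour when owned
    rental_hourly_rate -- price per hour when renting the same capacity
    Returns None when renting is never more expensive per hour than owning.
    """
    saving_per_hour = rental_hourly_rate - owned_hourly_cost
    if saving_per_hour <= 0:
        return None
    return purchase_cost / saving_per_hour


# Hypothetical numbers chosen only to make the arithmetic easy to follow.
hours = break_even_hours(purchase_cost=2000.0,
                         owned_hourly_cost=0.125,
                         rental_hourly_rate=0.375)
print(hours)
```

With these assumed values, renting stays cheaper until the machine has been used for 8000 hours. A workload that needs far fewer hours, or only needs them in bursts, favours renting.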

4. Web Applications
- What is the nature of future software applications?
  - From the desktop to the browser
  - SaaS = Web-based applications
  - Examples: Google Maps, Facebook
- How do we deliver highly-interactive Web-based applications?
  - AJAX (asynchronous JavaScript and XML)
  - A hack on top of a mistake built on sand, all held together by duct tape and chewing gum?

Some Cloud Definitions
- Ian Foster et al. defined cloud computing as a large-scale distributed computing paradigm, driven by economies of scale, in which a pool of abstracted, virtualized, dynamically scalable, managed computing power, storage, platforms, and services is delivered on demand to external customers over the Internet.
  (Cloud computing is a commercial computing model. It distributes computing tasks across a resource pool made up of a large number of computers, so that application systems can obtain computing power, storage space, and software services as needed.)
- IBM experts consider clouds that can:
  - Host a variety of different workloads, including batch-style back-end jobs and interactive, user-facing applications
  - Allow workloads to be deployed and scaled out quickly through the rapid provisioning of virtual machines or physical machines
  - Support redundant, self-recovering, highly scalable programming models that allow workloads to recover from HW/SW failures
  - Monitor resource use in real time to rebalance allocations on demand
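
The last capability in the IBM list, rebalancing allocations from monitored resource use, can be illustrated with a small sketch. The slides do not say which policy is used, so the proportional-share rule below is an assumption, and the function name `rebalance` and the workload names are hypothetical.

```python
def rebalance(total_capacity, demand):
    """Split total_capacity across workloads in proportion to measured demand.

    demand maps a workload name to its currently observed resource use.
    This proportional policy is only one simple possibility (an assumption).
    """
    if not demand:
        return {}
    total_demand = sum(demand.values())
    if total_demand == 0:
        # Nothing is busy: fall back to an equal split.
        share = total_capacity / len(demand)
        return {name: share for name in demand}
    return {name: total_capacity * used / total_demand
            for name, used in demand.items()}


# Hypothetical monitoring snapshot: a batch job and a user-facing web tier.
allocation = rebalance(100, {"batch": 30, "web": 10})
print(allocation)
```

Calling `rebalance` again with a fresh monitoring snapshot yields a new allocation, which is the "on demand" part of the bullet.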

Internet Cloud Goals
- Sharing of peak-load capacity among a large pool of users, improving overall resource utilization
- Separation of infrastructure maintenance duties from domain-specific application development
- Major cloud applications include upgraded web services, distributed data storage, raw supercomputing, and access to specialized Grid, P2P, data-mining, and content networking services

Three Aspects in Hardware that are New in Cloud Computing
- The illusion of infinite computing resources available on demand, thereby eliminating the need for cloud users to plan far ahead for provisioning
- The elimination of an up-front commitment by cloud users, thereby allowing companies to start small and increase hardware resources when needed
- The ability to pay for computing resources on a short-term basis as needed (e.g., processors by the hour and storage by the day) and to release them when done, thereby rewarding resource conservation

Some Innovative Cloud Services and Application Opportunities
- Smart and pervasive cloud applications for individuals, homes, communities, companies, governments, etc.
- Coordinated calendar, itinerary, job management, events, and consumer record management (CRM) services
- Coordinated word processing, online presentations, web-based desktops, and sharing of online documents, datasets, photos, video, databases, etc.
- Deploying conventional cluster, grid, P2P, and social networking applications in cloud environments more cost-effectively
- Earthbound applications that demand elasticity and parallelism rather than data movement costs

Operations in Cloud Computing
- Users interact with the cloud to request service
- A provisioning tool carves out the systems from the cloud for configuration, reconfiguration, or deprovisioning
- The servers can be either real or virtual machines
- Supporting resources include distributed storage systems, datacenters, security devices, etc.

Cloud Computing Instances
- Google
- Amazon
- Microsoft Azure
- IBM Blue Cloud

Google Cloud Infrastructure
[Figure: Google Cloud Infrastructure. Labels: Scheduler, Chubby, GFS master, Node, User Application, Scheduler slave, GFS chunkserver, Linux, MapReduce Job, BigTable Server]

Amazon Elastic Compute Cloud
[Figure: Amazon Elastic Compute Cloud. Labels: User, Developer, S3, EBS, EC2, SimpleDB, SQS]
- SQS: Simple Queue Service
- EC2: running instances of virtual machines
- EBS: Elastic Block Store, providing the block interface and storing virtual machine images
- S3: Simple Storage Service, SOAP, object interface
- SimpleDB: simplified database

Microsoft Azure Platform
[Figure: Microsoft Azure Platform. Labels: Services Platform, Azure]

IBM Blue Cloud
[Figure: IBM Blue Cloud. Labels: User, Developer, Monitoring, Application Server, Provisioning Manager, Open Source Linux with Xen, Tivoli Monitoring Agent]

Cost Considerations: Power, Cooling, Physical Plant, and Operational Costs
- Cost: technology costs, cost of security, etc.
- Benefits: availability, opportunity, consolidation, etc.

Cost Breakdown
+ Storage ($/MByte/year)
+ Computing ($/CPU cycles)
+ Networking ($/bit)

Research Challenges: Service Availability
- S3 outage: authentication service overload leading to unavailability
- AppEngine partial outage: programming error
- Gmail: site unavailable
- Solutions: The management of a Cloud Computing service by a single company results in a single point of failure (SPF). In the Internet, a large ISP uses multiple network providers so that failure by a single company will not take it off the air. Similarly, we need multiple Cloud Computing providers to support each other to eliminate the SPF.

Research Challenges: Data Security
- Current cloud offerings are essentially public rather than private networks, exposing the system to more attacks, such as DDoS attacks.
- Solutions: There are many well-understood technologies, such as encrypted storage, virtual local area networks, and network middleboxes.
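
The multi-provider idea from the Service Availability slide above (several Cloud Computing providers backing each other up so that no single company is a single point of failure) can be sketched as client-side failover. This is a minimal illustration, not a real provider API: `ProviderUnavailable`, `call_with_failover`, `provider_a`, and `provider_b` are all hypothetical names.

```python
class ProviderUnavailable(Exception):
    """Raised by a (hypothetical) provider client when its service is down."""


def call_with_failover(providers, request):
    """Try each provider in order and return the first successful result.

    providers is a list of (name, handler) pairs; handler(request) either
    returns a result or raises ProviderUnavailable.
    """
    errors = []
    for name, handler in providers:
        try:
            return name, handler(request)
        except ProviderUnavailable as exc:
            errors.append((name, str(exc)))
    raise ProviderUnavailable(f"all providers failed: {errors}")


def provider_a(request):
    # Simulates an outage like the overload case listed on the slide.
    raise ProviderUnavailable("authentication service overloaded")


def provider_b(request):
    return f"stored {request}"


result = call_with_failover(
    [("provider_a", provider_a), ("provider_b", provider_b)], "object-1")
print(result)
```

The request only fails when every provider in the list is down, which is the sense in which the single point of failure is removed. Keeping data consistent across providers is a separate problem that this sketch does not address.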

Research Challenges: Data Transfer Bottlenecks
- Applications continue to become more data-intensive. If we assume applications may be “pulled apart” across the boundaries of clouds, this may complicate data placement and transport.
- Both WAN bandwidth and intra-cloud networking technology are performance bottlenecks.
- Industrial solutions: It is estimated that 2/3 of the cost of WAN bandwidth is consumed by high-end routers, whereas only 1/3 is charged by the fiber industry. We can lower the cost by using simpler routers built from commodity components with centralized control, but research is heading towards using high-end distributed routers.

Research Challenges: Software Licensing
- Current software licenses commonly restrict the computers on which the software can run. Users pay for the software and then pay an annual maintenance fee.
- Many cloud computing providers originally relied on open source software, in part because the licensing model for commercial software is not a good match for utility computing.
- Some ideas: We can encourage the sales forces of software companies to sell products into Cloud Computing. Or they can apply a pay-per-use model to the software to adapt it to a cloud environment.

Research Challenges: Scalable Storage
- Differences between common storage and cloud storage:
  - The system is built from many inexpensive commodity components that often fail.
  - The system stores a modest number of large files.
  - The workloads primarily consist of both large streaming reads and small random reads.
  - The workloads also have many large, sequential writes that append data to files; once written, files are seldom modified again.
- The cloud storage (file) system needs to share many of the same goals as previous distributed file systems, such as performance, scalability, reliability, and availability. In addition, its design needs to be driven by key observations of the specific workloads and technological environment, both current and anticipated, that reflect a marked departure from some earlier file system design assumptions.
- GFS
  - Files are divided into fixed-size chunks. Chunk size is one of the key design parameters. GFS chooses 64 MB, which is much larger than typical file system block sizes.
  - The master stores three major types of metadata: the file and chunk namespaces, the mapping from files to chunks, and the locations of each chunk's replicas.
  - GFS supports the usual operations to create, delete, open, close, read, and write files.

Research Challenges: Transparent Programming Model
- Programs written for cloud implementation need to be automatically paralle