Hadoop的英特尔之道.docx

Uploaded by: 苏美尔  Document ID: 8939271  Upload date: 2021-01-26  Format: DOCX  Pages: 26  Size: 4.01 MB

Hadoop: the Intel Way (Hadoop的英特尔之道)
Bring New Analytics Capabilities to Hadoop Stack
何京翔 (He Jingxiang), General Manager, Intel Asia-Pacific R&D Ltd.
Software and Services Group

Cloud and IoT: More Users, More Devices, More Data
- Workload Consolidation
- Security & Trust
- Immersive Experiences
- Cloud Connectivity
- Open Cloud Architecture
- Data Analytics

Intel's Vision
This decade we will create and extend computing technology to connect and enrich the lives of every person on earth.

Our Big Data Goal: Make Hadoop the Foundation of the Next-Gen Data Analytics Platform
- Existing IT & data systems: RDBMS, EDW, data marts, BI
- Data mining and analytics: business intelligence, statistic modeling, machine learning
- All of your big data (structured & unstructured): table, sensor reading, log, document, image

Hadoop in Telecom
[Diagram labels: 3G base stations, ETL, HDFS, HBase, MapReduce, Hive]
- Carrier network optimizations
- User segmentation
- Instantaneous query of 3G records by subscribers

Hadoop in Smart City
[Diagram labels: HBase, MapReduce, Hive, legacy applications]
- Data mining (e.g., vehicle tracking)
- Instantaneous query (e.g., road image)
- Stream processing (e.g., real-time road conditions)

Hadoop: the Intel Way
Bring New Analytics Capabilities to Hadoop Stack
- Enterprise-Grade Solution: Intel Hadoop Distribution
  - Stable, enterprise-grade software product
  - Feature enhancements for vertical industries
- Advanced Development: "Project Panthera"
  - Advanced development and path-finding
  - Open source and community driven
- Three themes: Instantaneous Analysis, Reduced Complexity, Improved Efficiency

Intel Hadoop Distribution
An optimized software product for big data processing
- Stable, enterprise-grade Hadoop distribution
- Optimized to take advantage of new hardware technologies
- Brings instantaneous data processing to Hadoop
- Industry-specific enhancements that address the big data challenges of different industries
Stack components (data processing; data analytics, statistics, and mining toolset):
- Mahout: machine learning
- R (from Revolution Analytics): statistics
- Hive: interactive data warehouse
- Pig: data flow processing language
- Sqoop: ETL tool for relational data
- Flume: log collection tool
- MapReduce: stable, efficient distributed computing framework
- HBase: distributed, high-dimensional database; improvements and innovations on HBase 0.94 that provide instantaneous data processing
- HDFS: reliable distributed file system
- Zookeeper: distributed coordination service
- Intel Hadoop Manager: installation, deployment, configuration, monitoring, alerting, and access control

"Project Panthera"
Open source initiatives to enable advanced analytics capabilities on Hadoop
https:/
- Document store on HBase: document semantics and significantly faster query processing on HBase
- SQL engine for Hive/MapReduce: better integration with existing infrastructure using SQL
- Efficient utilization of new HW platform technologies

Instantaneous Analysis (即时分析)
Instantaneous analysis with a greatly enhanced HBase
- Stream new data into HBase for analysis in real time
- Support high update rate workloads (to keep the system always up to date)
- Allow very low latency, online data serving
- Etc.

Interactive Query on HBase (Intel Hadoop Distribution)
10X faster than MapReduce for certain queries on HBase (e.g., group-by aggregation)
HBase Query Engine Layer
- Fast, distributed aggregations directly inside HBase
- Parallel scanning over multiple regions
- Advanced, distributed filtering (CRC32 comparator, fuzzy row filter, etc.)
HBase Query Engine as New Hive Backend
- Most "SELECT" queries are automatically optimized to use the HBase Query Engine
- "WHERE" uses the advanced scanner/filter
- "GROUP-BY" uses distributed aggregations
- "JOIN" still goes to MapReduce

A Document Store on HBase ("Project Panthera")
Up to 3x storage reduction and 3x query speedup for Hive/MapReduce query processing on HBase
(See https:/ and HBASE-6800)
DOT (Document Oriented Table) on HBase
- Each row contains a collection of documents
- Each document contains a collection of fields
- A document is mapped to an HBase column and serialized using Avro
- Completely transparent to existing HBase applications

Reduced Complexity (更易用)
- Better data mining and statistics capabilities
  - Full-text indexing and search
  - Statistic modeling with the R language
- Better integration with existing infrastructures
  - Geo-distributed datacenters
  - Full SQL support for OLAP

Full-Text Indexing and Search (Intel Hadoop Distribution)
Full-text indexing and near real-time search for advanced data mining
(e.g., log and click stream analysis, healthcare record analysis, etc.)
Incremental full-text indexing on HBase
- Full-text indexing for semi-structured data (text, strings, numbers, etc.)
- Index built incrementally as records are inserted or updated
- Supports very high data insertion / update rates
Near real-time search
- Distributed search based on keywords or logical expressions
- No delay when searching data that has just been inserted

Bring R Statistics into Hadoop (Intel Hadoop Distribution)
Distributed statistic modeling on Hadoop using the R language

Cross-Datacenter BigTable/HBase (Intel Hadoop Distribution)
A virtual Big Table overlaid on existing geo-distributed data centers
- Global table view
- Data stored in geo-distributed data centers
- Better locality & higher availability
- Data transfer eliminated through distributed aggregation
[Diagram labels: Virtual Big Table; Data Center A, Data Center B, Data Center C; Async Replication]
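The distributed aggregation named on the cross-datacenter slide (and used for "GROUP-BY" in the HBase Query Engine) can be pictured as each site reducing its own rows to small partial results, which a coordinator then merges. The following is a minimal Python sketch of that general idea only, not the Intel Hadoop Distribution implementation; the function names, the sample rows, and the data center contents are all hypothetical.

```python
from collections import defaultdict


def local_aggregate(rows):
    """Runs inside one data center: reduce raw rows to per-key (sum, count)."""
    partial = defaultdict(lambda: [0, 0])
    for key, value in rows:
        partial[key][0] += value
        partial[key][1] += 1
    return {key: (s, c) for key, (s, c) in partial.items()}


def merge_partials(partials):
    """Runs at the coordinator: combine the small per-site partial results."""
    merged = defaultdict(lambda: [0, 0])
    for partial in partials:
        for key, (s, c) in partial.items():
            merged[key][0] += s
            merged[key][1] += c
    return {
        key: {"sum": s, "count": c, "avg": s / c}
        for key, (s, c) in merged.items()
    }


# Hypothetical (subscriber, bytes) rows held in three data centers.
data_centers = {
    "A": [("alice", 100), ("bob", 50)],
    "B": [("alice", 300)],
    "C": [("bob", 150), ("carol", 10)],
}

# Each site aggregates locally; only the partials travel to the coordinator.
partials = [local_aggregate(rows) for rows in data_centers.values()]
result = merge_partials(partials)
print(result["alice"])  # {'sum': 400, 'count': 2, 'avg': 200.0}
```

Only the per-key (sum, count) pairs leave each data center, so the volume crossing the wide-area link depends on the number of distinct keys rather than the number of raw rows. Carrying the sum and the count separately, instead of a local average, is what keeps the merged average exact.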

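An Analytical SQL Engine for Hive/MapReduce ("Project Panthera")
Goal: provide full SQL support for OLAP in Hadoop
Required by business users, enterprise applications, 3rd party tools (e.g., BI applications), etc.
(See https:/ and HIVE-3472)
[Diagram: the Query Driver accepts either SQL or HiveQL. SQL goes through the SQL Parser* to a SQL-AST, and the SQL-AST Analyzer & Translator turns it into a Hive-AST. HiveQL goes through the Hive Parser to a Hive-AST. The Hive-AST is handed to the Hive Semantic Analyzer in Hive (open source) and runs as Hadoop MR.]
Features added by the SQL-AST Analyzer & Translator:
- Subquery unnesting
- Multi-table SELECT
- INTERSECT support
- MINUS support
* https:/

Improved Efficiency (更高效)
- Performance benchmarks & tools
- Efficient utilization of new HW platform technologies (e.g., SSD, InfiniBand)

Intel Hadoop Distribution Efficiently Supports Analysis of Massive Mobile Internet Access Records
China Unicom nationwide mobile subscriber Internet access record query and analysis system
- The first commercial telecom service system in China based on Hadoop/HBase
- System deployment
  - Intel Hadoop Distribution
    - Meets the need for high-performance data import and fast queries
    - A stable enterprise-grade solution that is easy to deploy and manage
  - Hadoop/HBase cluster of 180+ nodes
- System performance metrics
  - Record ingestion time: generally under 30 minutes, about 10 minutes in practice
  - Capacity to store at least 6 months of raw Internet access records for mobile subscribers nationwide
  - Intermediate report data from statistical analysis retained for at least 5 years
  - Record query response time: no more than 1 second
  - Concurrent queries supported: 1000 requests/second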

HiBench & HiTune Performance Tools ("Project Panthera")
- HiBench: Hadoop Benchmark Suite (see https:/)
- HiTune: Hadoop Performance Analyzer (see https:/)

Trying is Believing
Intel Hadoop Distribution Free Edition v2.2 gives end users and application providers a powerful, easy-to-use entry-level platform for big data.
- The Free Edition and the Enterprise Edition share the same core code
- The Free Edition includes all core enhancements
- The Free Edition is limited in node count and system storage capacity
Intel Hadoop Distribution home page: http:/

Summary
- Immersive Computing = Big Data = Big Opportunities
- Intel is committed to delivering better and faster Hadoop solutions for big data analytics
- Intel Hadoop Distribution (IHD) Free Edition is here, try it out!
