SQL Server 2005 的商业智能和数据存储毕业论文外文翻译.doc

资源描述

《SQL Server 2005 的商业智能和数据存储毕业论文外文翻译.doc》由会员分享，可在线阅读，更多相关《SQL Server 2005 的商业智能和数据存储毕业论文外文翻译.doc（10页珍藏版）》请在三一文库上搜索。

1、 Business Intelligence and Data Warehousing in SQL Server 20051.Introduction Microsoft SQL Server 2005 is a complete business intelligence (BI) platform that provides the features, tools, and functionality to build both classic and innovative kinds of analytical applications. This paper provides an

2、introduction to the tools that you will use to build an analytical application, and highlights new functionality that makes it easier than ever to build and manage complex BI systems.The following table presents an overview of the components of a business intelligence system, and the corresponding M

3、icrosoft SQL Server 2000 and SQL Server 2005 components. Two components are new for SQL Server 2005: SQL Server Management Studio and SQL Server Business Intelligence Development Studio. The other primary BI components Integration Services, Analysis Services OLAP, Analysis Services Data Mining, and

4、Reporting Services are substantially different and improved in SQL Server 2005. The SQL Server 2005 relational database contains several significant new features. Although the Microsoft Office query and portal tools are not part of SQL Server, the current releases will continue to work with SQL Serv

5、er 2005. The BI functionality in Microsoft Office will evolve with the Office product release cycle.2.Where do I start with SQL Server 2005? The first thing youll notice upon installing SQL Server 2005 is that the installation experience is integrated. No longer do you need to run the installation p

6、rogram separately for some features, such as Analysis Services. If a feature such as Reporting Services is not available for installation, your computer may not meet the installation requirements for that feature. You can review the Readme file for a complete discussion of feature prerequisites. The

7、 installation process will install: SQL Server relational database engine Integration Services Analysis Services Reporting Services SQL Server Management Studio (database management toolset) Business Intelligence Development Studio (BI application development toolset) Reporting Services requires tha

8、t IIS be installed and correctly configured. We strongly recommend that you take the time to configure and install IIS, as Reporting Services is an integral part of the SQL Server 2005 Business Intelligence feature set. Customers who are familiar with Analysis Services may be surprised by the lack o

9、f an Analysis Services metadata repository. In SQL Server 2000 the Analysis Services repository was shipped as a Microsoft Access database, but Analysis Services 2005 does not contain a metadata repository. Instead, the Analysis Services database metadata information are stored as XML files and are

10、managed by Analysis Services. These XML files can be placed under source control, if desired. We recommend that you use the Business Intelligence Development Studio to develop and the SQL Server Management Studio to operate and maintain BI database objects. You can design Integration Services packag

11、es and Analysis Services cubes and mining models in the SQL Server Management Studio, but the Business Intelligence Development Studio offers a better experience for designing and debugging BI applications. You will probably learn more by beginning with a new application rather than upgrading existi

12、ng DTS packages and Analysis Services databases. You may find it useful to recreate an existing package or database, if you have one available. After you have become more familiar with the new tools, features, and concepts, it is easier to upgrade existing objects.Many customers will use the SQL Ser

13、ver tools to develop a system with the now-familiar business intelligence structure of one or more source systems using Integration Services to feed a dimensional relational data warehouse, which in turn is used to populate the Analysis Services database. However, SQL Server 2005 provides many optio

14、ns to deviate from this generic design, by eliminating or virtualizing different components.3.Relational data warehousing The SQL Server 2005 relational database engine contains several features of great interest for the design and maintenance of data warehouse style applications. These features inc

15、lude: Table partitions enable fast data load and simplified maintenance for very large tables.Easy creation of a reporting server Transact-SQL improvements including new data types and new analytic functionsOnline index operations Fine grained backup/restore operations Fast file initialization4.Repo

16、rting Server A common technique for offloading relational operational reporting from the transaction database is to maintain a reporting server. The reporting server maintains an image of the transaction database with some latency, most often of the previous day. The reporting server is used for mos

17、t reporting and data warehouse extracts. Microsoft SQL Server 2005 adds two new features, database mirroring and database snapshots, that make it very easy to create and maintain a reporting server. Now a SQL Server reporting server can have much lower latency than once a day. Also, the reporting se

18、rver is designed to act as a standby system for the transaction system.To create a reporting server, first create a database mirror, a new SQL Server 2005 feature that provides an instant standby system for high availability. You can read the SQL Server Books Online topic Database Mirroring Concepts

19、 for more information. The database mirror cannot be queried directly, which is where the second new feature, database snapshots, becomes important. Creating a database snapshot on the mirror provides an additional copy of the data for reporting purposes. A database snapshot is a read-only copy of t

20、he database at a particular point in time, and is not automatically updated with new information when the source database changes. Database snapshots are a multifaceted topic with numerous usage scenarios; you can read the SQL Server Books Online topic Understanding Database Views for more informati

21、on. For now, however, it is enough to note that the way snapshots store their information is extremely space efficient. Multiple database snapshots can exist to provide a full view of the data for reporting purposes, although maintaining a database snapshot does have some impact on the transaction d

22、atabase upon which the database snapshots is based. By creating a database snapshot on a database mirror, you effectively create a standby server for high system availability. The database snapshot can then serve double duty as a reporting server, as well as be used in high availability solutions.5.

23、Table partitions Partitioned tables and indexes have their data divided into horizontal units, so that groups of rows are mapped into individual partitions. Operations performed on the data, such as queries, are executed as if the entire table or index is a single entity.Partitioning can: Improve ta

24、ble and index manageability. Improve query performance on multiple-CPU machines. In a relational data warehouse, fact tables are the obvious candidate for table partitioning and partitioning by date range is the most common partitioning strategy. There are three steps to defining a partitioned table

25、, as described in the SQL Server Books Online topic “Creating Partitioned Tables and Indexes”. The three steps are as follows: Create a partition function specifying how a table that uses the function is partitioned.Create a partition scheme specifying how the partitions of the partition function ar

26、e placed on the filegroup. Create a table or index using the partition scheme. Multiple tables can use the same partition scheme. This paper discusses Range partitioning of fact tables and is not intended to be a complete discussion or tutorial for table partitioning. For more information about tabl

27、e partitioning see SQL Server Books Online. The most common partitioning scheme partitions the fact table by date range, such as, year, quarter, month, or even day. In most scenarios, date partitioning of the large fact table, or tables, provides the greatest manageability benefits. In order to get

28、improved query performance, the Time dimension table should be partitioned using the same partitioning scheme. A partitioned table behaves like an unpartitioned table.Queries to the table are resolved correctly. Direct inserts, updates, and deletes on the table are automatically resolved to the corr

29、ect partition or partitions.6.Using table partitions for fast data loads Most data warehouse applications struggle to load increasingly large volumes of data in a small and shrinking load window. The typical process begins with extracts from several source systems, followed by steps to clean, transf

30、orm, synthesize, and rationalize the data across these systems. The data management application is constrained to complete the entire extract, transformation, and loading process within the load window. Usually, the systems business users have a strong requirement that the system minimize the amount

31、 of time the data warehouse is unavailable for their queries. The write step of the data management application, in which the new data is inserted into the existing data warehouse, must be designed to occur quickly and with minimum user impact. In order to load data very fast, the database recovery

32、model must be either Bulk Logged or Simple, and the table must either be empty or contain data but no indexes. If these conditions are met, a non-logged load is possible. In SQL Server 2000, before partitioned tables existed, these conditions are typically met only in the initial historical data war

33、ehouse load. Some customers with large data warehouses have built a quasi-partitioned structure by constructing a UNION ALL view over separate physical tables; these tables were populated each load cycle using a non-logged technique. This approach was not entirely satisfactory, and SQL Server 2005 p

34、artitioned tables provide superior functionality. In SQL Server 2005, you cannot perform a non-logged load directly into a partition. However, you can load into a separate table that we will call the pseudo-partition. Under certain conditions, you can switch the pseudo-partition into the partitioned

35、 table as a metadata operation that occurs extremely quickly. This technique meets our two requirements of minimizing overall load time: the pseudo-partition load is performed without logging, and minimizing end user impact and ensuring data warehouse integrity: the pseudo-partitions can be loaded w

36、hile users are querying the data warehouse. The data management application can wait until all fact tables are loaded and ready before performing the partition switch. The partition switch operation occurs very quickly, on the order of sub-second response. In addition, the pseudo partition can be ba

37、cked up as a separate table, improving system manageability.7.Using table partitions for fast data deletes Many data warehouses keep a sliding window of detailed data active in the data warehouse. For example, the fact table may contain data for three, five, or ten years. Periodically, the oldest da

38、ta is removed from the table. The primary reasons for keeping data pruned are to improve query performance and minimize storage costs.SQL Server 2005 partitions make it very easy to prune old data from a large partitioned fact table. Simply create an empty pseudo-partition as described above, and th

39、en switch it into the partitioned table. The partitioned table has an empty partition where it once had a populated partition; the pseudo-partition has data where once it was empty. The pseudo-partition can be backed up, truncated, or dropped, as appropriate.Optionally, you may choose to redefine th

40、e partition function to merge all of the left-hand (empty) partitions together into one. SQL Server 2005 的商业智能和数据存储1、简介Microsoft SQL Server2005是一个完整的商业智能（BI）平台，它提供工具和功能来建立兼具经典和创新的各种分析应用。本文将介绍，你将用它来构建分析应用程序的工具，并会强调新的功能，使得它比以往更容易建立和管理复杂的BI系统。下表介绍了商业智能系统的组件的概述，以及相应的Microsoft SQL Server 2000和SQL Server

41、2005组件。两个组件是新的SQL Server 2005：SQL Server管理Studio和SQL Server商业智能开发工作室。其他主要的BI组件集成服务，分析服务OLAP，Analysis Services数据挖掘，和Reporting Services明显不同，在SQL Server 2005中改进的SQL Server 2005关系数据库包含几个显著的新功能。虽然微软Office的查询和门户工具不是SQL Server的一部分，目前的版本将继续在SQL Server 2005中的工作。在Microsoft Office的BI功能将与Office产品发布周期演变。2、我是从哪里启

42、动SQL Server 2005？你会发现在安装SQL Server 2005的第一件事就是安装体验集成。你不再需要单独的一些功能，如分析服务运行安装程序。如果某个功能，如报表服务不适用于安装，您的计算机可能无法满足该功能的安装要求。您可以查看自述文件功能的先决条件的完整讨论。安装过程中会安装：SQL Server关系数据库引擎集成服务分析服务报表服务SQL Server管理工具（数据库管理工具集）商业智能开发套件（BI应用程序开发工具集）报表服务需要IIS中安装并正确配置。我们强烈建议您花时间来配置和安装IIS，如报表服务是SQL Server 2005商业智能功能集的一个组成部分。客户熟

43、悉Analysis Services的可能由于缺少分析服务的元数据存储库而感到惊讶。在SQL Server 2000 Analysis Services存储库被运为Microsoft Access数据库，但Analysis Services 2005中不包含元数据信息库。相反Analysis Services数据库元数据信息都存储为XML文件，并通过分析服务的管理。这些XML文件可以放置在源代码控制，如果需要的话。我们建议您使用Business Intelligence Development Studio中开发和SQL Server Management Studio来操作和维护BI数据库对

44、象。您可以设计Integration Services包和Analysis Services多维数据集和挖掘模型在SQL Server Management Studio中，但商业智能开发套件提供了设计和调试BI应用带来更好的体验。你可能会学到更多与新的应用程序开始，而不是升级现有DTS包和Analysis Services数据库。如果你有一个可用的，您可能会发现它很有用可重新创建现有的包或数据库。当你更熟悉新的工具，功能和概念，很容易升级现有的对象。许多客户将使用SQL Server工具来开发一个系统，使用Integration Services养活一个维度关系型数据仓库，而这又是用来填充

45、Analysis Services数据库的一个或多个源系统的现在熟悉的商业智能架构。但是，SQL Server 2005提供了许多选项，从这个通用设计的偏离，通过消除或虚拟化不同的组件。3、Relational数据仓库在SQL Server 2005关系数据库引擎包含几个功能的极大兴趣为数据仓库风格的应用程序的设计和维护。这些功能包括：表分区实现快速数据加载和简化维护非常大的表。轻松创建一个报告服务器的Transact-SQL的改进，包括新的数据类型和新的分析功能：联机索引操作细粒度的备份/恢复操作快速初始化文件4、Reporting服务器一种常用方法卸载从交易数据库中的关系运营报

46、告是维持一个报告服务器。报告服务器维护交易数据库的图像存在一定的延迟，通常前一天。报告服务器用于大多数报告和数据仓库提取。微软SQL Server 2005中新增了两项功能，数据库镜像和数据库快照。这使得它非常容易创建和维护一个报告服务器。现在SQL Server报表服务器每一天可以有更低的延迟比。此外，报告服务器的目的是充当交易系统的备用系统。要创建一个报表服务器，首先创建一个数据库镜像，一个新的SQL Server 2005的功能，它提供了高可用性的即时备份系统。你可以阅读SQL Server联机丛书主题“数据库镜像概念”获取更多信息。数据库镜像不能直接查询，这就是第二次的新功能，数据

47、库快照，就变得很重要。在创建镜像数据库快照提供了额外的数据副本进行报告。数据库快照是数据库在特定时间点的只读副本，并且不会自动与新的信息源数据库的更改时更新。数据库快照是与众多使用场景多方面的主题;你可以阅读SQL Server联机丛书主题“了解数据库视图”了解更多信息。但就目前而言，它是足够注意，快照存储其信息的方式是非常节省空间。多个数据库快照可以存在，为报告目的的数据的完整视图，虽然维护一个数据库快照不会对交易数据库中的一些影响赖以数据库快照的基础。通过在数据库镜像创建一个数据库快照，您可以有效地创建一个备份服务器系统的高可用性。那么数据库快照可以成为双重责任作为报表服务器，以及在高

48、可用性解决方案中使用。5、Table分区分区表和索引都他们的数据分为水平单位，让行组映射到单个分区上的数据，如查询执行的操作，执行好像整个表或索引是一个单独的实体。分区可以：提高表和索引管理。提高多CPU机器的查询性能。在关系型数据仓库中，事实表是显而易见的候选表分区和分区按日期范围是最常见的分区策略。有三个步骤来定义一个分区表，在SQL Server联机丛书主题“创建分区表和索引”中所述。三个步骤如下：创建分区函数指定如何使用该函数的表进行分区。创建分区方案指定分区函数的分区如何被放置在文件组中。创建使用分区方案的表或索引。多个表可以使用相同的分区方案。本文讨论了事实表的范围分区，不打算成为一个完整的讨论或教程表分区。有关表分区的更多信息，请参阅SQL Server联机丛书。最常见的分区方案的事实表按

展开阅读全文

SQL Server 2005 的商业智能和数据存储 毕业论文外文翻译.doc

SQL Server 2005 的商业智能和数据存储毕业论文外文翻译.doc