1 Introduction to the Oracle Database


1

This chapter provides an overview of the Oracle database server. The topics include:

 

本章对Oracle数据库服务器进行概括性的介绍。本章的主题有:

2

Oracle Database Architecture

1.1 Oracle数据库体系结构

3

An Oracle database is a collection of data treated as a unit. The purpose of a database is to store and retrieve related information. A database server is the key to solving the problems of information management. In general, a server reliably manages a large amount of data in a multiuser environment so that many users can concurrently access the same data. All this is accomplished while delivering high performance. A database server also prevents unauthorized access and provides efficient solutions for failure recovery.

 

Oracle数据库是一个被作为整体对待的数据集合。数据库的用途是存储和检索相关的信息。数据库服务器是解决信息管理问题的核心组件。大体上说,数据库服务器的作用是可靠地管理多用户环境下的大规模数据,使多用户可以并发地访问相同的数据,同时实现系统的高性能。数据库服务器还要阻止未授权的操作,并提供高效的故障恢复解决方案。

4

Oracle Database is the first database designed for enterprise grid computing, the most flexible and cost-effective way to manage information and applications. Enterprise grid computing creates large pools of industry-standard, modular storage and servers. With this architecture, each new system can be rapidly provisioned from the pool of components. There is no need to size each system for peak workloads, because capacity can be easily added or reallocated from the resource pools as needed.

 

Oracle数据库是第一个为企业网格计算而设计的数据库,她为管理信息和应用提供了灵活、低成本、高效益的方式。企业网格计算把存储及服务能力转化为符合业界标准、模块化的资源池 (resource pool)。在这种体系结构之下,新系统可以从组件化的资源池中组合资源而迅速构成。企业也无需按尖峰负荷 (peak workloads) 建设系统,因为计算能力可以在需要时轻松地从资源池中获得或重新分配。

5

The database has logical structures and physical structures. Because the physical and logical structures are separate, the physical storage of data can be managed without affecting the access to logical storage structures.

 

数据库具备逻辑结构和物理结构。因为物理和逻辑结构是分离的,所以数据物理存储的变动不会影响基于逻辑存储结构的应用程序。

6

This section contains the following topics:

 

本节包含以下主题:

7

Overview of Oracle Grid Architecture

1.1.1 Oracle网格体系结构概述

8

Grid computing is a new IT architecture that produces more resilient and lower cost enterprise information systems. With grid computing, groups of independent, modular hardware and software components can be connected and rejoined on demand to meet the changing needs of businesses.

 

网格是新出现的IT体系结构,它可以提供更有弹性、成本更低的企业信息系统。在网格中,众多独立的、模块化的软硬件组件可以随时地被联接和重组,以满足业务 及业务变化的需要。

9

The grid style of computing aims to solve some common problems with enterprise IT: the problem of application silos that lead to underutilized, dedicated hardware resources, the problem of monolithic, unwieldy systems that are expensive to maintain and difficult to change, and the problem of fragmented and disintegrated information that cannot be fully exploited by the enterprise as a whole.

 

网格形式的计算系统是为了解决企业IT中的常见问题:由不同的应用系统独占硬件资源而导致的资源利用率低下;系统过于庞大而导致的难以 改进、维护昂贵;信息过于分散而导致的企业信息难以作为整体充分利用。

10

Benefits of Grid Computing

Compared to other models of computing, IT systems designed and implemented in the grid style deliver higher quality of service, lower cost, and greater flexibility. Higher quality of service results from having no single point of failure, a robust security infrastructure, and centralized, policy-driven management. Lower costs derive from increasing the utilization of resources and dramatically reducing management and maintenance costs. Rather than dedicating a stack of software and hardware to a specific task, all resources are pooled and allocated on demand, thus eliminating underutilized capacity and redundant capabilities. Grid computing also enables the use of smaller individual hardware components, thus reducing the cost of each individual component and providing more flexibility to devote resources in accordance with changing needs.

 

网格的优势

和其他体系结构相比,基于网格设计、实施的IT系统能够提供更高质量的服务,更低的成本,更大的灵活性。更高质量的服务来源于网格不存在单点脆弱性(single point of failure),健壮的安全基础结构,和基于策略的集中化管理方式。更低成本来源于软硬件资源利用水平的提高和管理、维护成本的显著降低。在以往的体系结构中,一个完成特定任务的系统要独占一系列软硬件资源,而网格体系中所有资源被统一储备随需分配,这就消除了资源利用不足和资源冗余的现象。网格可以使用更小型的硬件组件,这降低了每个组件的成本并使用户可根据需求的变化更灵活地分配资源。

11

Grid Computing Defined

 

1.1.1.1 网格的定义

 

12

The grid style of computing treats collections of similar IT resources holistically as a single pool, while exploiting the distinct nature of individual resources within the pool. To address simultaneously the problems of monolithic systems and fragmented resources, grid computing achieves a balance between the benefits of holistic resource management and flexible independent resource control. IT resources managed in a grid include:

  • Infrastructure: the hardware and software that create a data storage and program execution environment

  • Applications: the program logic and flow that define specific business processes

  • Information: the meanings inherent in all different types of data used to conduct business

 

网格将相似的IT资源整体地看做一个池,同时充分发挥池中每一个个体的独特功能。网格实现了整体资源管理和独立资源控制的平衡,解决了庞大系统和分散资源的矛盾。在网格中管理的IT资源包括:

  • 基础设施:组成数据存储、软件执行环境的硬件和软件。

  • 应用:定义业务过程(business process)的程序逻辑(program logic)和流程(flow)。

  • 信息:蕴含于各种数据中用于指导业务的数据的内在含义
     

13

Core Tenets of Grid Computing

Two core tenets uniquely distinguish grid computing from other styles of computing, such as mainframe, client-server, or multi-tier: virtualization and provisioning.

  • With virtualization, individual resources (e.g. computers, disks, application components and information sources) are pooled together by type, then made available to consumers (e.g. people or software programs) through an abstraction. Virtualization means breaking hard-coded connections between providers and consumers of resources, and preparing a resource to serve a particular need without the consumer caring how that is accomplished.

  • With provisioning, when consumers request resources through a virtualization layer, behind the scenes a specific resource is identified to fulfill the request and then it is allocated to the consumer. Provisioning as part of grid computing means that the system determines how to meet the specific need of the consumer, while optimizing operation of the system as a whole.

 

网格的核心理念

与大型机、C/S结构、多层结构等以往的体系结构不同,网格有两个独特的核心理念:虚拟化和资源供给。

 

  • 所谓虚拟化,就是将各类独立的资源(计算机,磁盘,应用组件,信息源等)视为一个池,经过抽象后提供给资源消费者(用户或软件程序)。虚拟化意味着打破了资源提供者和消费者之间的硬性联系(固化在程序代码中的(hard-coded)),在满足消费者对资源需求的同时无需消费者关心资源供给是如何实现的。

  • 所谓资源供给,就是当消费者通过虚拟层请求资源时,网格在幕后找出满足需求的资源并分配给消费者。在网格中供给意味着系统负责决定如何满足消费者的特定需求,同时还要从整体上对系统运转进行优化。
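
The two tenets above can be illustrated with a small sketch. This is not Oracle code: the `ResourcePool` class, its method names, and the node and consumer names are all hypothetical, chosen only to show a consumer asking a pool for "a server" (virtualization) while the pool decides which concrete machine fulfills the request (provisioning).

```python
from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    """Pools resources of one type behind a single abstraction (virtualization)."""
    kind: str
    free: list = field(default_factory=list)
    allocated: dict = field(default_factory=dict)  # resource name -> consumer

    def provision(self, consumer: str) -> str:
        """Pick a concrete resource for the consumer (provisioning)."""
        if not self.free:
            raise RuntimeError(f"no free {self.kind} resources in the pool")
        resource = self.free.pop(0)
        self.allocated[resource] = consumer
        return resource

    def release(self, resource: str) -> None:
        """Return a resource to the pool so another consumer can use it."""
        del self.allocated[resource]
        self.free.append(resource)


# Hypothetical pool of two servers shared by three workloads over time.
servers = ResourcePool("server", free=["node-a", "node-b"])
first = servers.provision("order-entry")
second = servers.provision("payroll")
servers.release(first)                      # order-entry no longer needs its node
third = servers.provision("reporting")      # the freed node is reused
```

The consumers never name a machine; they only name the kind of resource they need, which is the decoupling the text describes.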

14

The specific ways in which information, application or infrastructure resources are virtualized and provisioned are specific to the type of resource, but the concepts apply universally. Similarly, the specific benefits derived from grid computing are particular to each type of resource, but all share the characteristics of better quality, lower costs and increased flexibility.

 

信息、应用、基础设施资源的虚拟化与资源供给的具体方法各不相同,但思路是相通的。类似的,通过网格供给各种资源给用户带来的益处也各不相同,但都具备了高质量、低造价及灵活的特点。

15

Infrastructure Grid

Infrastructure grid resources include hardware resources such as storage, processors, memory, and networks as well as software designed to manage this hardware, such as databases, storage management, system management, application servers, and operating systems.

 

基础设施网格

基础设施网格资源包括存储、处理器、内存、网络等硬件资源,及管理这些硬件的软件资源,如数据库、存储管理、系统管理、应用服务器和操作系统。

16

Virtualization and provisioning of infrastructure resources mean pooling resources together and allocating to the appropriate consumers based on policies. For example, one policy might be to dedicate enough processing power to a web server that it can always provide sub-second response time. That rule could be fulfilled in different ways by the provisioning software in order to balance the requests of all consumers.

 

基础设施网格的虚拟化与资源供给意味着将所有资源视为池,并根据预定策略分配给适当的消费者。例如,针对web服务器的策略要提供足够的处理能力来保证响应时间。资源供给管理软件根据实际情况选择适当的方式实现预定策略,以满足所有消费者对资源的请求。
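
As a rough illustration of policy-based allocation, the sketch below guarantees each consumer a minimum share of a CPU pool and then spreads whatever is left. The function name, the consumer names, and the round-robin rule are assumptions made for the example; the source only says that provisioning software may satisfy such a policy in different ways.

```python
def allocate(total_cpus: int, minimums: dict) -> dict:
    """Give each consumer its guaranteed minimum, then share the rest round-robin."""
    reserved = sum(minimums.values())
    if reserved > total_cpus:
        raise ValueError("policies demand more CPUs than the pool holds")
    shares = dict(minimums)
    consumers = sorted(shares)          # fixed order keeps the result deterministic
    if not consumers:
        return shares
    for i in range(total_cpus - reserved):
        shares[consumers[i % len(consumers)]] += 1
    return shares


# Hypothetical policy: the web server is guaranteed 8 CPUs to protect response time.
plan = allocate(16, {"web": 8, "batch": 2, "reports": 2})
```

A different provisioning strategy (for example, weighting the spare capacity by current load) would honor the same policy, which is the point the paragraph makes.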

17

Treating infrastructure resources as a single pool and allocating those resources on demand saves money by eliminating underutilized capacity and redundant capabilities. Managing hardware and software resources holistically reduces the cost of labor and the opportunity for human error.

 

将基础设施资源视为一个池并随需分配,提高了资源的利用水平,减少了冗余资源,节约了软硬件购买资金。对软硬件资源整体的管理降低 了人力成本及人为错误发生的机会。

18

Spreading computing capacity among many different computers and spreading storage capacity across multiple disks and disk groups removes single points of failure so that if any individual component fails, the system as a whole remains available. Furthermore, grid computing affords the option to use smaller individual hardware components, such as blade servers and low cost storage, which enables incremental scaling and reduces the cost of each individual component, thereby giving companies more flexibility and lower cost.

 

将计算能力分散于不同的计算机,将存储能力分散于多个磁盘和磁盘组,消除了系统的单点脆弱性,即保证系统中的个体组件发生故障时系统整体还能保持可用。此外,网格体系可以基于众多小型的硬件组件,例如刀片服务器和低成本存储器,这增加了系统的伸缩性、降低硬件组件的成本,使企业 获取更低的成本及更大的灵活性。

19

Infrastructure is the dimension of grid computing that is most familiar and easy to understand, but the same concepts apply to applications and information.

 

基础设施是网格体系中最为人熟知也最易理解的范畴,其实类似的概念也适用于应用和信息。

20

Applications Grid

Application resources in the grid are the encodings of business logic and process flow within application software. These may be packaged applications or custom applications, written in any programming language, reflecting any level of complexity. For example, the software that takes an order from a customer and sends an acknowledgement, the process that prints payroll checks, and the logic that routes a particular customer call to a particular agent are all application resources.

 

应用网格

网格中的应用资源是蕴含于应用软件中反映业务逻辑(business logic)和处理流程(process flow)的程序代码。这些应用软件可以是套装的,也可以是定制的,可以以任何编程语言实现,表现任何层次的业务复杂度。举例来说,一个接受客户订单并发送反馈的程序,一个打印薪水册的工作流程,或一个将特定客户的服务请求传递给特定员工的逻辑,都属于应用资源。

21

Historically, application logic has been intertwined with user interface code, data management code, and process or page flow and has lacked well-defined interfaces, which has resulted in monolithic applications that are difficult to change and difficult to integrate.

 

历史上,应用逻辑往往和用户界面代码、数据管理代码、流程控制代码、页面流转代码交织在一起,并且缺乏完善的接口,这导致了庞大的孤岛系统,难于修改,难于集成。

22

Service-oriented architecture has emerged as a superior model for building applications, and service-oriented architecture concepts align exactly with the core tenets of grid computing. Virtualization and provisioning of application resources involve publishing application components as services for use by multiple consumers, which may be people or processes, then orchestrating those services into more powerful business flows.

 

基于服务的体系结构(Service Oriented Architecture,SOA)是一种更高级的应用构建模型,同时SOA的概念与网格的核心理念不谋而合。应用资源的虚拟化和资源供给意味着将应用组件发布为服务提供给众多消费者,这些消费者包括人或流程,进而用这些服务构建更为强大的业务流程。

23

In the same way that grid computing enables better reuse and more flexibility of IT infrastructure resources, grid computing also treats bits of application logic as a resource, and enables greater reuse of application functionality and more flexibility in changing and building new composite applications.

 

就像网格可以使基础设施资源更加灵活可重用性更好一样,网格将应用逻辑也视为资源,因此使应用程序功能的可重用性更强,使应用变更更加灵活,使通过组合已有应用构建新的 复合应用更简单。

24

Furthermore, applications that are orchestrated from published services are able to view activities in a business as a single whole, so that processes are standardized across geography and business units and processes are automated end-to-end. This generates more reliable business processes and lowers cost through increased automation and reduced variability.

 

由发布的服务构建而成的应用将整体地覆盖企业内的所有活动,由此业务处理在不同地理区域和不同业务部门间都是标准化的, 且从头到尾都是自动化的。由于自动化程度的提高和差异性的降低,将带来更可靠的业务流程和更低的管理成本。

25

Information Grid

The third dimension to grid computing, after infrastructure and applications, is information. Today, information tends to be fragmented across a company, making it difficult to see the business as a whole or answer basic questions about customers. Without information about who the customer is, and what they want to buy, information assets go underexploited.


信息网格

继基础设施和应用层之后,网格体系结构的第三个层次是信息网格。当前常见的现状是,信息零散地分布于企业内,使得从整体了解业务情况或回答一些基本的客户问题都很困难。当业务人员无法了解用户是谁,他们想购买什么产品时,信息的价值就被大大削弱了。

26

In contrast, grid computing treats information holistically as a resource, similar to infrastructure and applications resources, and thus extracts more of its latent value. Information grid resources include all data in the enterprise and all metadata required to make that data meaningful. This data may be structured, semi-structured, or unstructured, stored in any location, such as databases, local file systems, or e-mail servers, and created by any application.
 

与过去的情况相反,信息网格将信息视为一个完整统一的资源,以便获取其中的潜在价值。信息网格资源包括企业拥有的所有数据以及用于解释数据含义的所有元数据。这些数据可能是结构化的、半结构化的、或非结构化的,由各种应用生成,存储于如数据库、文件系统、email服务器等各种位置。

27

The core tenets of grid computing apply similarly to information as they do to infrastructure and applications. The infrastructure grid exploits the power of the network to allow multiple servers or storage devices to be combined toward a single task, then easily reconfigured as needs change. A service oriented architecture, or an applications grid, enables independently developed services, or application resources, to be combined into larger business processes, then adapted as needs change without breaking other parts of the composite application. Similarly, the information grid provides a way for information resources to be joined with related information resources to greater exploit the value of the inherent relationships among information, then for new connections to be made as situations change.


与基础设施层和应用层一样,网格的核心理念也适用于信息层。基础设施网格可以通过网络集合多个服务器或存储设备来完成同一个任务,并可在需要时轻易的重新配置。基于服务的体系结构(或称为应用网格),可以将多个独立开发的服务(或称为应用资源)集合为更大的业务过程,当需求变动时每个服务可以独立调整,不至于影响应用的其他部分。同样地,信息网格提供 了将相关信息资源整合的方法,以便发掘具有潜在联系的信息的价值,同时当形势变化时也易于在信息间建立新的联系。

28

The relational database, for example, was an early information virtualization technology. Unlike its predecessors, the network database and hierarchical database models, in which all relationships between data had to be predetermined, relational database enabled flexible access to a general-purpose information resource. Today, XML furthers information virtualization by providing a standard way to represent information along with metadata, which breaks the hard link between information and a specific application used to create and view that information.
 

例如,关系型数据库就是一种信息虚拟化的技术。她的前辈网络数据模型和层次数据模型要求数据间的关系必须预先确定,而关系型数据库是一个可以灵活访问的通用信息源。现今,通过XML技术可以标准化的展现信息和元数据,进一步发展了信息虚拟化,打破了信息和用于创建、展现信息的应用之间的硬性联系。

29

Information provisioning technologies include message queuing, data propagation, replication, extract-transform-load, as well as mapping and cleansing tools to ensure data quality. Data hubs, in which a central operational data store continually syncs with multiple live data sources, are emerging as a preferred model for establishing a single source of truth while maintaining the flexibility of distributed control.
 

信息供给技术包括消息队列(message queuing)、数据播送(data propagation)、数据复制 (replication)、抽取转换加载(ETL),及用于保证数据质量的映射、清洗工具。数据中心--一个随时与众多活动数据源同步数据的中央操作数据库--的出现,为保证分析所需数据来源的唯一性,同时保证数据分布式使用的灵活性,提供了一个首选的模型。

30

Grid Resources Work Well Independently and Best Together

By managing any single IT resource (infrastructure, applications, or information) using grid computing, regardless of how the other resources are treated, enterprises can realize higher quality, more flexibility, and lower costs. For example, there is no need to rewrite applications to benefit from an infrastructure grid. It is also possible to deploy an applications grid, or a service-oriented architecture, without changing the way information is managed or the way hardware is configured.

 

网格三层次资源的独立性与协调性

企业的基础设施、应用、信息这三个层次的IT资源中的任何一个改为通过网格管理,都能实现更好的服务质量,更大的灵活性,更低的成本。例如,建设了基础设施网格后无需改写程序就能感受到改变。同样,部署一个应用网格(或者说基于服务的体系结构)既无需改变信息层的管理方式,也无需改变基础设施层的硬件配置。

31

It is possible, however, to derive even greater benefit by using grid computing for all resources. For example, the applications grid becomes even more valuable when you can set policies regarding resource requirements at the level of individual services and have execution of different services in the same composite application handled differently by the infrastructure, something that can only be done by an applications grid in combination with an infrastructure grid. In addition, building an information grid by integrating more information into a single source of truth becomes tenable only when the infrastructure is configured as a grid, so it can scale beyond the boundary of a single computer.
 

但是如果将所有层次的资源均纳入网格体系将会获得更大的收益。例如,如果能为每个单独的服务(应用)设定相应的硬件资源(基础设施)分配策略,使组成同一个应用的不同服务 所需的硬件资源由基础设施网格根据情况单独分配,应用网格的优势就能更好地发挥,而这必须要求应用网格和基础设施网格协同工作。再比如,将众多信息源集成为唯一准确的数据源以建立信息网格时,必须依靠基础设施网格,才能超越单个计算机 硬件的界限。

32

Grid Computing in Oracle Database 10g

 

Oracle 10g中的网格

33

On the path toward this grand vision of grid computing, companies need real solutions to support their incremental moves toward a more flexible and more productive IT architecture. The Oracle Database 10g family of software products implements much of the core grid technology to get companies started. Oracle delivers this grid computing functionality in the context of a holistic enterprise architecture, providing a robust security infrastructure, centralized management, intuitive and powerful development tools, and universal access. The Oracle 10g family includes:

  • Oracle Database 10g

  • Oracle Application Server 10g

  • Oracle Enterprise Manager 10g

  • Oracle Collaboration Suite 10g

 

虽然网格具有美好前景,企业还是需要实实在在的解决方案才能确保实现这种更灵活更高效的IT体系结构。Oracle数据库10g家族的产品实现了大量网格核心技术。Oracle的网格解决方案包含于一个完整的企业信息平台之中,提供了健壮的安全机制、集中化的管理方式、强大的开发工具、全面的访问方式。Oracle数据库10g家族的产品包括:

  • 数据库服务器 Oracle Database 10g

  • 应用服务器 Oracle Application Server 10g

  • 企业管理器 Oracle Enterprise Manager 10g

  • 协作套件 Oracle Collaboration Suite 10g

34

Although the grid features of Oracle 10g span all of the products listed above, this discussion will focus on the grid computing capabilities of Oracle Database 10g.

 

Oracle 10g包含的网格特性覆盖了上面列出的所有产品,但是下面主要讨论数据库服务器(Oracle Database 10g)的网格特性。

35

Infrastructure Grid
  • Server Virtualization. Oracle Real Application Clusters 10g (RAC) enable a single database to run across multiple clustered nodes in a grid, pooling the processing resources of several standard machines. Oracle is uniquely flexible in its ability to provision workload across machines because it is the only database technology that does not require data to be partitioned and distributed along with the work. Oracle 10g Release 2 software includes enhancements for balancing connections across RAC instances, based on policies.
  • Storage Virtualization. The Oracle Automatic Storage Management (ASM) feature of Oracle Database 10g provides a virtualization layer between the database and storage so that multiple disks can be treated as a single disk group and disks can be dynamically added or removed while keeping databases online. Existing data will automatically be spread across available disks for performance and utilization optimization. In Oracle 10g Release 2, ASM supports multiple databases, which could be at different software version levels, accessing the same storage pool.
  • Grid Management. Because grid computing pools together multiple servers and disks and allocates them to multiple purposes, it becomes more important that individual resources are largely self-managing and that other management functions are centralized.
基础设施网格
  • 服务能力虚拟化。Oracle实时应用集群(RAC,Oracle Real Application Clusters)可以使一个数据库运行在网格中多个集群节点上,即把多个计算机的处理能力作为池。Oracle是目前唯一不需要将数据分区再分布处理就能利用多个计算机提供 的处理能力的数据库。Oracle 10g 版本2(Oracle 10g Release 2,Oracle 10g R2)还增加了基于策略来平衡RAC实例之间连接的功能。
  • 存储能力虚拟化。Oracle 数据库 10g的自动存储管理功能(ASM,Automatic Storage Management)在数据库与存储硬件之间建立了一个虚拟层,多个磁盘可以被视为一个磁盘组,而且磁盘可以在保持数据库联机的状态下动态地添加或移除。现有的数据自动的在可用磁盘间分布,以便获得性能和利用效率的优化。在Oracle 10g R2中的ASM支持不同版本的数据库使用同一个存储池。
  • 网格管理。由于网格将多个服务器和磁盘视为池,并分配给不同 业务需求,因此要求每个独立的资源有很强的自管理能力,同时还要提供集中化管理的功能。
 

36

The Grid Control feature of Oracle Enterprise Manager 10g provides a single console to manage multiple systems together as a logical group. Grid Control manages provisioning of nodes in the grid with the appropriate full stack of software and enables configurations and security settings to be maintained centrally for groups of systems.

 

Oracle 企业管理器 10g(Oracle Enterprise Manager,OEM)的网格控制功能通过一个控制台将多个系统作为一个逻辑组管理。 网格控制功能可以管理网格内各节点的资源供给,还能实现多组系统配置和安全设置的集中维护。

37

Another aspect of grid management is managing user identities in a way that is both highly secure and easy to maintain. Oracle Identity Management 10g includes an LDAP-compliant directory with delegated administration and now, in Release 2, federated identity management so that single sign-on capabilities can be securely shared across security domains. Oracle Identity Management 10g closely adheres to grid principles by utilizing a central point for applications to authenticate users (the single sign-on server) while, behind the scenes, distributing control of identities via delegation and federation to optimize maintainability and overall operation of the system.

 

网格管理的另一特性是以高度安全、易于维护的方式管理用户身份。Oracle 身份管理 10g(OIM,Oracle Identity Management)提供了兼容于LDAP的目录服务,并结合代理管理(delegated administration)功能(在R2中结合了统一身份管理(federated identity management)),实现了在整个安全域的单点登录能力。OIM 10g依据网格的原则,为应用验证用户身份提供了单一控制点-单点登录服务器(sign-on server),在底层通过代理管理和统一身份管理将身份控制分布到不同组件,实现了系统维护的统一性和全面性。

38

Applications Grid
应用网格

39

Standard Web Services Support. In addition to the robust web services support in Oracle Application Server 10g, Oracle Database 10g can publish and consume web services. DML and DDL operations can be exposed as web services, and functions within the database can make a web service appear as a SQL row source, enabling use of powerful SQL tools to analyze web service data in conjunction with relational and non-relational data.

 

支持标准的Web Services。除了Oracle 应用服务器 10g(Oracle Application Server)中对Web Services的强大支持之外,Oracle 10g 数据库也可以发布或使用Web Service。DDL 和 DML操作可以以Web Service的形式供用户使用;数据库可以将Web Service的输出虚拟为一个SQL数据源,因此可以用强大的SQL工具将Web Service数据与关系型数据或非关系型数据关联使用。

40

Oracle Enterprise Manager 10g enhances Oracle's support for service oriented architectures by monitoring and managing web services and any other administrator-defined services, tracking end-to-end performance and performing root cause analysis of problems encountered.

 

OEM 10g体现了Oracle管理功能对SOA的支持,她可以监视、管理Web Services和其他管理员定义的服务,可以进行端到端的性能监控,还可以对系统故障来源进行分析。

41

Information Grid
信息网格

42

  • Data Provisioning. Information starts with data, which must be provisioned wherever consumers need it. For example, users may be geographically distributed, and fast data access may be more important for these users than access to an identical resource. In these cases, data must be shared between systems, either in bulk or near real time. Oracle's bulk data movement technologies include Transportable Tablespaces and Data Pump.

    For more fine-grained data sharing, the Oracle Streams feature of Oracle Database 10g captures database transaction changes and propagates them, thus keeping two or more database copies in sync as updates are applied. It also unifies traditionally distinct data sharing mechanisms, such as message queuing, replication, events, data warehouse loading, notifications and publish/subscribe, into a single technology.

 

  • 数据供给。信息源自数据,而数据必须在用户需要的时候就能得到供给。例如,有些用户也许在地理上很分散,对他们来说数据访问的及时性比数据的唯一性更加重要。在这种情况下,数据需要在系统间以批量或接近实时的方式共享。Oracle的批量数据迁移技术包括 可移动表空间(Transportable Tablespaces)和数据泵(Data Pump)。

    对于粒度更细的数据共享,由Oracle Database 10g 的 Oracle 数据流(Oracle Streams)功能实现。Oracle 数据流能捕捉数据库事务并在数据库间传播,使多个数据库的数据在发生变化时可以同步的复制。Oracle 数据流将消息队列(message queuing)、复制(replication)、事件(events)、数据仓库加载(data warehouse loading)、通知(notifications )和发布/订阅(publish/subscribe)等传统上相互独立的数据共享机制整合成为一项统一的技术。
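
The capture-and-propagate idea behind fine-grained data sharing can be sketched in a few lines. This is a toy model, not the Oracle Streams API: the `Source` and `Replica` classes and the tuple format of a change record are assumptions made for illustration.

```python
class Replica:
    """A copy of the data that is kept current by applying captured changes."""

    def __init__(self):
        self.rows = {}

    def apply(self, change):
        operation, key, value = change
        if operation == "delete":
            self.rows.pop(key, None)
        else:
            self.rows[key] = value


class Source:
    """The primary copy: every modification is also captured as a change record."""

    def __init__(self):
        self.rows = {}
        self.changes = []  # captured changes, in the order they were made

    def upsert(self, key, value):
        self.rows[key] = value
        self.changes.append(("upsert", key, value))

    def delete(self, key):
        self.rows.pop(key, None)
        self.changes.append(("delete", key, None))

    def propagate(self, replicas):
        """Send the captured changes to every replica, then clear the queue."""
        for change in self.changes:
            for replica in replicas:
                replica.apply(change)
        self.changes.clear()


source = Source()
copy_one, copy_two = Replica(), Replica()
source.upsert(1, "Alice")
source.upsert(2, "Bob")
source.delete(1)
source.propagate([copy_one, copy_two])
```

Only the changes travel between systems, which is what distinguishes this style of sharing from the bulk movement of whole tablespaces or export files.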

43

  • Centralized Data Management. Oracle Database 10g manages all types of structured, semi-structured and unstructured information, representing, maintaining and querying each in its own optimal way while providing common access to all via SQL and XML Query. Along with traditional relational database structures, Oracle natively implements OLAP cubes, standard XML structures, geographic spatial data and unlimited sized file management, thus virtualizing information representation. Combining these information types enables connections between disparate types of information to be made as readily as new connections are made with traditional relational data.

 

  • 集中化的数据管理。Oracle 10g 数据库可以管理结构化、半结构化、非结构化等各种类型的信息,依据各种类型数据的特点以最适合的方式展示、维护、查询这些数据,同时以SQL和XML Query作为通用的访问接口。Oracle不仅提供了传统的关系型数据结构,还内置了OLAP立方体(OLAP cube)、标准XML数据结构、地理空间数据和无限制大小的文件管理等数据存储方式,这就使信息表现的虚拟化成为可能。通过对这些数据结构的内在支持,Oracle能够迅速地在不同类型的信息之间建立关联,就如同关联传统的关系型数据一样。

44

  • Metadata Management. Oracle Warehouse Builder is more than a traditional batch ETL tool for creating warehouses. It enforces rules to achieve data quality, does fuzzy matching to automatically overcome data inconsistency, and uses statistical analysis to infer data profiles. With Oracle 10g Release 2, its metadata management capabilities are extended from scheduled data pulls to handle a transaction-time data push from an Oracle database implementing the Oracle Streams feature.

    Oracle's series of enterprise data hub products (for example, Oracle Customer Data Hub) provide real-time synchronization of operational information sources so that companies can have a single source of truth while retaining separate systems and separate applications, which may include a combination of packaged, legacy and custom applications. In addition to the data cleansing and scheduling mechanisms, Oracle also provides a well-formed schema, established from years of experience building enterprise applications, for certain common types of information, such as customer, financial, and product information.

 

  • 元数据管理。Oracle Warehouse Builder(OWB)不仅是传统意义上批量处理ETL建立数据仓库的工具。她可以根据预定规则保证数据质量;进行模糊比对(fuzzy matching)自动消除数据不一致性;通过统计性的分析推断数据全貌。在Oracle 10g R2中,元数据管理功能得到进一步扩展,在周期性运行的“拉”数据(data pull)之外,还能处理由使用Oracle Streams功能的Oracle数据库在事务发生时发起的“推”数据(data push)。

    Oracle还具备一系列企业级的数据中心产品(例如,Oracle客户数据中心,Oracle Customer Data Hub)来实现与操作性信息源的实时同步,这保证了企业既可以拥有具有唯一真实性的数据源,又可以继续使用各种预制、定制、历史遗留的应用系统。Oracle根据多年来建设企业信息应用的经验,为一些通用的业务,如客户管理、财务管理、产品管理,提供了完善的数据模型。此外,Oracle还提供了数据清洗和数据调度管理功能。

45

  • Metadata Inference. Joining the Oracle 10g software family is the new Oracle Enterprise Search product. Oracle Enterprise Search 10g crawls all information sources in the enterprise, whether public or secure, including e-mail servers, document management servers, file systems, web sites, databases and applications, then returns information from all of the most relevant sources for a given search query. This crawl and index process uses a series of heuristics specific to each data source to infer metadata about all enterprise information that is used to return the most relevant results to any query.

 

  • 元数据推测。Oracle企业搜索(Oracle Enterprise Search,OES)是新加入Oracle 10g家族的产品。OES 10g 可以检索所有企业内公开或保密的信息源,包括email服务器、文档管理服务器、文件系统、网站、数据库和应用,返回与搜索请求相关的所有信息。检索和索引的过程中会针对不同数据源采取一系列探索性的方式推断所有企业信息的元数据,用来保证每次搜索都能得到最有用的结果。

46

Overview of Application Architecture

数据库应用的体系结构概述

47

There are two common ways to architect a database: client/server or multitier. As internet computing becomes more prevalent in computing environments, many database management systems are moving to a multitier environment.

 

数据库应用的体系结构有两种主要形式:客户端/服务器结构和多层结构。随着互联网(internet)的兴起,越来越多的数据库管理系统转变为多层结构。

48

Client/Server Architecture

 

1.1.2.1 客户端/服务器体系结构

49

Multiprocessing uses more than one processor for a set of related jobs. Distributed processing reduces the load on a single processor by allowing different processors to concentrate on a subset of related tasks, thus improving the performance and capabilities of the system as a whole.

 

多处理技术(multiprocessing)指使用多个处理器完成同一个任务。将处理能力分布 到不同处理器,使每个处理器专注于任务的一个子集能够减轻单个处理器的负担,从而提高系统整体的性能。

50

An Oracle database system can easily take advantage of distributed processing by using its client/server architecture. In this architecture, the database system is divided into two parts: a front-end or a client, and a back-end or a server.

 

Oracle数据库系统的客户端/服务器体系结构(client/server architecture)很容易发挥分布式处理(distributed processing)的优势。在这种体系结构下,数据库系统被分为两部分:前端,也称为客户端(client);后端,也称为服务端(server)。

51

The Client
客户端

52

The client is a database application that initiates a request for an operation to be performed on the database server. It requests, processes, and presents data managed by the server. The client workstation can be optimized for its job. For example, it might not need large disk capacity, or it might benefit from graphic capabilities.
 

客户端是一个数据库应用程序,她提交在数据库上执行操作的请求。她负责请求、处理、展现由数据库服务器管理的数据。运行客户端的计算机可以针对她自身的工作进行优化。例如, 客户端计算机不需要大容量的磁盘,但应该适当提高显示性能。

53

Often, the client runs on a different computer than the database server, generally on a PC. Many clients can simultaneously run against one server.

 

通常,客户端程序与数据库服务器运行在不同的计算机上,以PC机为主。多个客户端可以同时使用同一个服务器。

54

The Server
服务端

55

The server runs Oracle software and handles the functions required for concurrent, shared data access. The server receives and processes the SQL and PL/SQL statements that originate from client applications. The computer that manages the server can be optimized for its duties. For example, it can have large disk capacity and fast processors.

 

服务器运行Oracle数据库管理软件,处理并发、共享的数据访问。数据库服务器接收、处理由客户端应用程序提交的SQL或PL/SQL语句。运行数据库的计算机也可以根据她的职责进行优化,她应具备大容量存储和较快的处理能力。

56

Multitier Architecture: Application Servers

 

1.1.2.2 多层体系结构:应用服务器

57

A multitier architecture has the following components:

  • A client or initiator process that starts an operation

  • One or more application servers that perform parts of the operation. An application server provides access to the data for the client and performs some of the query processing, thus removing some of the load from the database server. It can serve as an interface between clients and multiple database servers, including providing an additional level of security.

  • An end or database server that stores most of the data used in the operation

 

多层体系结构具备以下组成部分:

  • 客户端程序,提交数据库操作

  • 一个或多个应用服务器处理一个操作请求的不同部分。应用服务器首先负责访问数据,再对查询结果进行处理,这就减轻了数据库服务器的负担。应用服务器可以作为客户端与数据库之间的接口,还可提供额外的 安全控制。

  • 一个数据库服务器,或称为服务端,存储用户操作所需的数据。

 

 

58

This architecture enables use of an application server to do the following:

  • Validate the credentials of a client, such as a Web browser

  • Connect to an Oracle database server

  • Perform the requested operation on behalf of the client

 

在这种体系结构下,应用服务器起到以下作用:

  • 验证客户端(例如Web浏览器)的身份

  • 连接Oracle数据库服务器

  • 代替用户执行对数据库的请求

 

59

If proxy authentication is being used, then the identity of the client is maintained throughout all tiers of the connection.
 

如果使用身份认证代理(proxy authentication ),多层体系结构中各层间用户身份认证可以被统一维护。
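
The three duties of the application server, and the way the client identity can be carried through to the database tier, can be sketched as follows. The class names, the in-memory credential table, and the audit list are hypothetical; a real deployment would use the database's own authentication mechanisms rather than plain-text passwords.

```python
class DatabaseServer:
    """Back end: runs statements and records which end user they were run for."""

    def __init__(self):
        self.audit = []

    def execute(self, statement, on_behalf_of):
        self.audit.append((on_behalf_of, statement))
        return f"result of {statement!r}"


class ApplicationServer:
    """Middle tier: validates the client, connects, and acts on its behalf."""

    def __init__(self, database, credentials):
        self.database = database
        self.credentials = credentials  # user -> password, for illustration only

    def handle(self, user, password, statement):
        # 1. Validate the credentials of the client.
        if user not in self.credentials or self.credentials[user] != password:
            raise PermissionError(f"authentication failed for {user!r}")
        # 2 and 3. Use the database connection and perform the operation,
        # passing the client identity along so it is preserved across tiers.
        return self.database.execute(statement, on_behalf_of=user)


db = DatabaseServer()
app = ApplicationServer(db, {"scott": "tiger"})
reply = app.handle("scott", "tiger", "SELECT 1 FROM dual")
```

Because the user name reaches the database tier, the audit trail shows the real end user rather than a shared middle-tier account, which is the benefit the text attributes to proxy authentication.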

60

Overview of Physical Database Structures

1.1.3 物理数据库结构概述

61

The following sections explain the physical database structures of an Oracle database, including datafiles, redo log files, and control files.
 

以下各节介绍Oracle数据库的各种物理结构,包括数据文件(datafile)、重做日志文件(redo log files)、和控制文件(control files)。

62

Datafiles

 

1.1.3.1 数据文件

63

Every Oracle database has one or more physical datafiles. The datafiles contain all the database data. The data of logical database structures, such as tables and indexes, is physically stored in the datafiles allocated for a database.
 

每个Oracle数据库使用一个或多个物理的数据文件(datafile)。数据文件中包含了所有的数据库数据。按表、索引等逻辑数据库结构组织的数据存储在数据库的数据文件中。

64

The characteristics of datafiles are:

  • A datafile can be associated with only one database.

  • Datafiles can have certain characteristics set to let them automatically extend when the database runs out of space.

  • One or more datafiles form a logical unit of database storage called a tablespace.

 

数据文件的特点有:

  • 一个数据文件只能属于一个数据库

  • 当数据库空间用完时,数据文件可以按照预定的设置自动扩展。

  • 一个或多个数据文件形成了数据库中的一种逻辑结构-表空间。
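
The relationship between datafiles, automatic extension, and tablespaces can be modeled in a short sketch. The field names (`next_mb`, `max_mb`) and file names are illustrative assumptions, not Oracle's internal structures; the model only shows a file growing in fixed increments up to a limit when space runs out.

```python
from dataclasses import dataclass, field


@dataclass
class Datafile:
    name: str
    size_mb: int
    autoextend: bool = False
    next_mb: int = 0      # how much to grow on each extension
    max_mb: int = 0       # upper limit for automatic growth
    used_mb: int = 0

    def allocate(self, mb: int) -> bool:
        """Reserve space, growing the file first if autoextend permits it."""
        size = self.size_mb
        while self.used_mb + mb > size:
            can_grow = self.autoextend and self.next_mb > 0
            if not can_grow or size + self.next_mb > self.max_mb:
                return False          # request does not fit; file is unchanged
            size += self.next_mb
        self.size_mb = size
        self.used_mb += mb
        return True


@dataclass
class Tablespace:
    """A logical storage unit made of one or more datafiles."""
    name: str
    datafiles: list = field(default_factory=list)

    def allocate(self, mb: int) -> bool:
        # Use the first datafile that can satisfy the request.
        return any(datafile.allocate(mb) for datafile in self.datafiles)


users = Tablespace("USERS", [
    Datafile("users01.dbf", size_mb=100, autoextend=True, next_mb=50, max_mb=200),
])
grew = users.allocate(120)       # forces one 50 MB extension
refused = users.allocate(100)    # would need 220 MB, above the 200 MB limit
```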

 

65

Data in a datafile is read, as needed, during normal database operation and stored in the memory cache of Oracle. For example, assume that a user wants to access some data in a table of a database. If the requested information is not already in the memory cache for the database, then it is read from the appropriate datafiles and stored in memory.

 

在数据库正常运行期间,数据文件中的数据会按需读出,并缓存于Oracle的内存结构中。例如,当用户需要访问数据库表中的数据时,如果用户请求的数据还没有放入缓存中,数据库就会把数据从相应的数据文件中读出再放入内存。

66

Modified or new data is not necessarily written to a datafile immediately. To reduce the amount of disk access and to increase performance, data is pooled in memory and written to the appropriate datafiles all at once, at a time determined by the database writer (DBWn) background process.
 

新建或修改的数据不一定立即被写入数据文件。为了减少磁盘访问、提高性能,变化的数据先暂存在内存中,再在适当时间集中写入相应的数据文件,写入时机由后台进程数据库写进程(database writer process,DBWn)决定。
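
The read-on-demand and deferred-write behavior described in the two paragraphs above can be illustrated with a small toy model. It is a sketch only, not Oracle's actual caching algorithm: the class `BufferCache`, its method names, and the sample block contents are hypothetical, and a plain dictionary stands in for the blocks of a datafile.

```python
class BufferCache:
    """Toy model of cached reads and deferred datafile writes."""

    def __init__(self, datafile):
        self.datafile = datafile  # dict standing in for blocks on disk
        self.cache = {}           # block id -> value currently held in memory
        self.dirty = set()        # block ids changed since the last flush
        self.disk_reads = 0
        self.disk_writes = 0

    def read(self, block_id):
        # Go to the datafile only if the block is not already cached.
        if block_id not in self.cache:
            self.cache[block_id] = self.datafile[block_id]
            self.disk_reads += 1
        return self.cache[block_id]

    def write(self, block_id, value):
        # Change the in-memory copy only; the datafile is not touched yet.
        self.cache[block_id] = value
        self.dirty.add(block_id)

    def flush(self):
        # Stand-in for the writer process: write all pooled changes at once.
        for block_id in sorted(self.dirty):
            self.datafile[block_id] = self.cache[block_id]
            self.disk_writes += 1
        self.dirty.clear()


disk = {1: "alice", 2: "bob"}
cache = BufferCache(disk)
cache.read(1)
cache.read(1)              # second read is served from memory
cache.write(2, "carol")
cache.write(2, "dave")     # two changes to the same block
before_flush = disk[2]     # the datafile still holds the old value
cache.flush()
```

In this run the two reads of block 1 cost a single disk read, and the two changes to block 2 cost a single disk write, which is the saving the text attributes to pooling data in memory.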

67

See Also:

"Overview of the Oracle Instance" for more information about Oracle's memory and process structures

另见:

“Oracle实例概述”,了解关于Oracle内存和进程结构的详细信息

68

Control Files

 

1.1.3.2 控制文件

69

Every Oracle database has a control file. A control file contains entries that specify the physical structure of the database. For example, it contains the following information:

  • Database name

  • Names and locations of datafiles and redo log files

  • Time stamp of database creation

 

每个Oracle数据库都有控制文件(control file)。控制文件中含有说明数据库物理结构的内容。例如,其中包含以下信息:

  • 数据库名

  • 数据文件、重做日志文件的名称和位置

  • 数据库创建的时间戳

70

Oracle can multiplex the control file, that is, simultaneously maintain a number of identical control file copies, to protect against a failure involving the control file.

 

Oracle可以使用多重控制文件,即同时维护多个完全相同的控制文件,以防止控制文件损坏造成的数据库故障。
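
Multiplexing can be pictured as writing the same record to several files so that losing one copy does not lose the information. The sketch below is a loose illustration under stated assumptions: the JSON layout, the file names `control01.ctl` and `control02.ctl`, the database name `DEMO`, and both helper functions are hypothetical. A real control file is a binary file maintained by Oracle, and how an instance reacts to a lost copy is not modeled here.

```python
import json
import os
import tempfile


def write_multiplexed(paths, record):
    """Write the identical record to every copy."""
    data = json.dumps(record, sort_keys=True)
    for path in paths:
        with open(path, "w", encoding="utf-8") as f:
            f.write(data)


def first_readable(paths):
    """Return the record from the first copy that can still be read."""
    for path in paths:
        try:
            with open(path, encoding="utf-8") as f:
                return json.load(f)
        except (OSError, ValueError):
            continue  # missing or damaged copy: try the next one
    raise RuntimeError("no intact copy found")


workdir = tempfile.mkdtemp()
copies = [os.path.join(workdir, name) for name in ("control01.ctl", "control02.ctl")]

# The kinds of entries listed above: database name, file names, creation time.
record = {
    "database_name": "DEMO",
    "datafiles": ["users01.dbf"],
    "redo_logs": ["redo01.log", "redo02.log"],
    "created": "2005-01-01T00:00:00",
}

write_multiplexed(copies, record)
os.remove(copies[0])               # simulate losing one copy
survivor = first_readable(copies)  # the other copy still holds the record
```

In practice the copies would be placed on different disks, so that a single disk failure cannot damage all of them.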

71

Every time an instance of an Oracle database is started, its control file identifies the database and redo log files that must be opened for database operation to proceed. If the physical makeup of the database is altered (for example, if a new datafile or redo log file is created), then the control file is automatically modified by Oracle to reflect the change. A control file is also used in database recovery.

 

Oracle数据库的实例每次启动时,通过控制文件中的内容来确定哪些数据库文件和重做日志文件是执行数据库操作所必需的。当数据库的物理构成发生变化时(例如创建了新的数据文件或重做日志文件),Oracle自动地修改控制文件以反映这些变化。此外,数据库恢复(database recovery)时也要用到控制文件。

72

See Also:

Chapter 3, "Tablespaces, Datafiles, and Control Files"

另见:

第三章,“表空间,数据文件和控制文件”

73

Redo Log Files

 

1.1.3.3 重做日志文件

74

Every Oracle database has a set of two or more redo log files. The set of redo log files is collectively known as the redo log for the database. A redo log is made up of redo entries (also called redo records).

 

每个Oracle数据库都有两个或多个重做日志文件(redo log file)。这组文件作为一个整体被称为数据库的重做日志(redo log)。重做日志由重做条目(redo entry)构成,重做条目也被称为重做记录(redo record)。

75

The primary function of the redo log is to record all changes made to data. If a failure prevents modified data from being permanently written to the datafiles, then the changes can be obtained from the redo log, so that completed work is not lost.
 

重做日志的主要功能是记录对数据所做的全部修改。如果某种故障导致无法将修改过的数据永久地写入数据文件,那么这些修改内容可以从重做日志中获得,用户已完成的工作不会丢失。

76

To protect against a failure involving the redo log itself, Oracle allows a multiplexed redo log so that two or more copies of the redo log can be maintained on different disks.

 

为了防止重做日志自身的问题导致故障,Oracle支持多重重做日志(multiplexed redo log)功能,即将内容相同的多份重做日志保存在不同的磁盘中。

77

The information in a redo log file is used only to recover the database from a system or media failure that prevents database data from being written to the datafiles. For example, if an unexpected power outage terminates database operation, then data in memory cannot be written to the datafiles, and the data is lost. However, lost data can be recovered when the database is opened, after power is restored. By applying the information in the most recent redo log files to the database datafiles, Oracle restores the database to the time at which the power failure occurred.

 

重做日志文件中的信息仅用于在系统故障或介质故障导致数据无法写入数据文件时恢复数据库。例如,如果突然断电导致数据库操作终止,内存中的数据无法写入数据文件,这些数据就会丢失。不过,在电力恢复、数据库重新打开时,丢失的数据可以被恢复。Oracle将最新的重做日志文件中的信息应用于数据文件,从而把数据库恢复到断电时的状态。

78

The process of applying the redo log during a recovery operation is called rolling forward.

 

在恢复操作中应用重做日志信息的过程叫做前滚(rolling forward)。
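
Rolling forward can be sketched as replaying logged changes, in order, against the last state that reached the datafiles. The model below is deliberately simplified and is not Oracle's recovery code: the function names `change` and `roll_forward`, the dictionaries, and the sample values are hypothetical, each redo entry is reduced to a `(block_id, new_value)` pair, and commits and the rollback of uncommitted changes are ignored.

```python
def roll_forward(datafile, redo_log):
    """Reapply every logged change, in order, to a copy of the datafile."""
    recovered = dict(datafile)
    for block_id, new_value in redo_log:
        recovered[block_id] = new_value
    return recovered


datafile = {1: "alice", 2: "bob"}  # what is safely on disk
redo_log = []                      # assumed to survive the failure
memory = dict(datafile)            # in-memory copy of the data


def change(block_id, value):
    # Record the change in the redo log before applying it in memory.
    redo_log.append((block_id, value))
    memory[block_id] = value


change(2, "carol")
change(3, "dave")

# Power failure: the in-memory changes never reached the datafile.
memory = None

# On restart, the redo log is applied to the datafile contents.
recovered = roll_forward(datafile, redo_log)
```

Here `recovered` contains both changes even though neither was ever written to `datafile`, which mirrors the power-outage example in the text above.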

79

See Also:

"Overview of Database Backup and Recovery Features"

另见:

“数据库备份与恢复功能概述”

80

Archive Log Files

 

1.1.3.4 归档日志文件

89

You can enable automatic archiving of the redo log. Oracle automatically archives log files when the database is in ARCHIVELOG mode.

 

重做日志文件可以被自动归档。当数据库运行在ARCHIVELOG模式下,Oracle将自动地归档重做日志文件。

90

Parameter Files

 

1.1.3.5 参数文件

91

Parameter files contain a list of configuration parameters for an instance and database.

 

参数文件包含了数据库与实例的配置参数列表。

92

Oracle recommends that you create a server parameter file (SPFILE) as a dynamic means of maintaining initialization parameters. A server parameter file lets you store and manage your initialization parameters persistently in a server-side disk file.

 

Oracle建议数据库管理员创建服务器参数文件(server parameter file,SPFILE),以便动态地维护初始化参数。服务器参数文件使用户可以在服务器端磁盘的文件中保存初始化参数,并进行管理。

94

Alert and Trace Log Files

 

1.1.3.6 告警和跟踪调试日志文件

95

Each server and background process can write to an associated trace file. When an internal error is detected by a process, it dumps information about the error to its trace file. Some of the information written to a trace file is intended for the database administrator, while other information is for Oracle Support Services. Trace file information is also used to tune applications and instances.
 

每一个服务进程和后台进程都可以向与之相关的跟踪调试文件(trace file)写入信息。当进程检查出一个内部错误时,就将错误信息转储到其跟踪调试文件中。跟踪调试文件中的一些信息供数据库管理员使用,还有一些是供Oracle技术支持(Oracle Support Services)使用的。跟踪调试文件的内容还可以用于应用与实例的调优。

96

The alert file, or alert log, is a special trace file. The alert log of a database is a chronological log of messages and errors.

 

告警文件,或称作告警日志,是一种特殊的跟踪调试文件。数据库的告警日志按时间顺序记录了数据库运行时产生的消息与错误信息。

97

See Also:

Oracle Database Administrator's Guide

另见:

 

Oracle数据库管理员指南

98

Backup Files

 

1.1.3.7 备份文件