OLAP for a small company. In the narrow sense of the word, OLAP means OLAP cubes


OLAP (On-Line Analytical Processing) is a method of electronic analytical data processing in which data is organized into hierarchical categories and totals are calculated in advance. OLAP data is stored in cubes rather than tables. An OLAP cube is a multidimensional data set whose axes contain parameters and whose cells contain aggregate data that depends on those parameters. Cubes are designed for complex multidimensional analysis of large volumes of data because they supply reports with summary results only, rather than a large number of individual records.

The concept of OLAP was described in 1993 by the famous database researcher and author of the relational data model E. F. Codd. Currently, OLAP support is implemented in many DBMSs and other tools.

An OLAP cube contains two types of data:

· total values: the values to be summarized, represented by calculated data fields;

· descriptive information: the dimensions. Descriptive information is typically organized into levels of detail, for example “Year”, “Quarter”, “Month” and “Day” in the “Time” dimension. Organizing fields into levels of detail lets report users choose the level they want to view, starting with high-level summary data and then drilling down to a more detailed view, and vice versa.
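As a rough illustration of levels of detail, the sketch below rolls the same data field up to different levels of a “Time” dimension. The rows, the `roll_up` helper and the level names are hypothetical, chosen only to mirror the example above; a real cube precalculates such totals rather than computing them on demand.

```python
from collections import defaultdict
from datetime import date

# Hypothetical fact rows: (order date, amount). The amount is the data field;
# the date supplies the members of the "Time" dimension.
SALES = [
    (date(2004, 1, 15), 100.0),
    (date(2004, 2, 3), 50.0),
    (date(2004, 5, 20), 70.0),
    (date(2005, 1, 9), 30.0),
]


def time_key(d, level):
    """Return the Time-dimension member that date d belongs to at a level."""
    quarter = (d.month - 1) // 3 + 1
    if level == "Year":
        return (d.year,)
    if level == "Quarter":
        return (d.year, quarter)
    if level == "Month":
        return (d.year, quarter, d.month)
    if level == "Day":
        return (d.year, quarter, d.month, d.day)
    raise ValueError(f"unknown level: {level}")


def roll_up(rows, level):
    """Sum the amounts for every member at the requested level of detail."""
    totals = defaultdict(float)
    for d, amount in rows:
        totals[time_key(d, level)] += amount
    return dict(totals)


by_year = roll_up(SALES, "Year")        # high-level summary
by_quarter = roll_up(SALES, "Quarter")  # one level of drill-down
```

Moving from `by_year` to `by_quarter` corresponds to drilling down; going the other way is a roll-up.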

Microsoft Query tools also allow you to create OLAP cubes from a query that loads data from a relational database such as Microsoft Access, transforming a linear table into a structured hierarchy (cube).

The Create OLAP Cube Wizard is a built-in Microsoft Query tool. To create an OLAP cube based on a relational database, you must complete the following steps before running the wizard.

1. Determine the data source (see Figure 6.1).

2. Using Microsoft Query, create a query, including only those fields that will be either data fields or dimension fields of an OLAP cube; if a field in a cube is used more than once, then it must be included in the query the required number of times.

3. At the last step of the Query Wizard, select the option Create an OLAP Cube from this query (see Fig. 6.2); alternatively, once the query has been created, choose the Create OLAP Cube command from the File menu in Microsoft Query. Either way, the Create OLAP Cube Wizard is launched.

The Create OLAP Cube Wizard consists of three steps.

At the first step of the wizard (see Fig. 6.6), you define the data fields: the calculated fields for which totals must be computed.

Fig. 6.6. Defining Data Fields

The wizard places the fields it expects to be calculated (usually numeric fields) at the top of the list, checks them, and assigns each a summary function, usually Sum. At least one field must be selected as a calculated field, and at least one field must be left unchecked so that it can serve as a dimension.

When creating an OLAP cube, you can use four summary functions for numeric fields (Sum, Count (the number of values), Min and Max) and one function, Count, for all other fields. To apply several different summary functions to the same field, include that field in the query the required number of times.
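The rule about repeating a field can be sketched in plain SQL, run here through Python's `sqlite3` module rather than Microsoft Query. The `orders` table, its columns and the `amount_*` aliases are hypothetical; the point is only that the source query lists the same field once per summary function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (department TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("Sales", 10.0), ("Sales", 30.0), ("IT", 5.0)],
)

# The inner query plays the role of the source query: `amount` is listed
# once per summary function, each time under its own name. The outer query
# then applies one summary function to each copy.
SUMMARY_SQL = """
SELECT department,
       SUM(amount_sum), COUNT(amount_count),
       MIN(amount_min), MAX(amount_max)
FROM (SELECT department,
             amount AS amount_sum, amount AS amount_count,
             amount AS amount_min, amount AS amount_max
      FROM orders)
GROUP BY department
ORDER BY department
"""

summary = conn.execute(SUMMARY_SQL).fetchall()
for row in summary:
    print(row)
```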

The name of a calculated field can be changed in the Data field name column.

At the second step of the wizard, you define the descriptive data and its dimensions (see Fig. 6.7). To select a dimension field, drag the desired top-level dimension field from the Source fields list to the Dimensions list, into the area marked Drag fields here to create dimensions. To create an OLAP cube, you must define at least one dimension. At the same step, you can use the context menu to rename a dimension or level field.

Fig. 6.7. Defining Dimension Fields

Fields that contain isolated or discrete data and do not belong to a hierarchy can be defined as single-level dimensions. However, the cube will be more efficient if some of the fields are organized into levels. To create a level within a dimension, drag a field from the Source fields list onto a field that is already a dimension or level. Fields containing more detailed information should be placed at lower levels. For example, in Fig. 6.7 the Job title field is a level under the Department name field.

To move a field to a lower or higher level, drag it onto a lower or higher field within the dimension. To show or hide levels, use the expand or collapse button next to the dimension.

If you use date or time fields as the top-level dimension, the OLAP Cube Wizard automatically creates levels for those dimensions. The user can then select which levels should appear in the reports. For example, you can select weeks, quarters and years, or months (see Figure 6.7).

Remember that the wizard automatically creates levels for date and time fields only when you add them as a top-level dimension; when these fields are added as sublevels of a dimension, automatic levels are not created.

At the third step of the wizard, the type of cube created by the wizard is determined, with three options possible (see Fig. 6.8).

Fig. 6.8. Selecting the type of cube to be created at the third step of the wizard

· The first two options create the cube each time a report is opened (when the cube is viewed from Excel, the report is a pivot table). In this case, the query file and a cube definition file (*.oqy), which contains the instructions for creating the cube, are saved. The *.oqy file can be opened in Excel to create reports based on the cube; if you need to change the cube, you can open the file in Microsoft Query and run the Create OLAP Cube Wizard again.

By default, cube definition files, like query files, are stored in the user profile folder under Application Data\Microsoft\Queries. When a *.oqy file is saved in this standard folder, its name appears on the OLAP Cubes tab when you open a new query in Microsoft Query or choose the New Database Query command (Data menu, Import External Data submenu) in Microsoft Excel.

· With the third option, Save a cube file containing all data for the cube, all data for the cube is retrieved and a cube file with the extension *.cub, which stores this data, is created in a location that you specify. The file is not created as soon as you click Finish; it is created either when you save the cube definition to a file or when you create a report based on the cube.

The choice of cube type is determined by several factors: the amount of data the cube contains; the type and complexity of reports that will be created based on the cube; system resources (memory and disk space), etc.

A separate *.cub cube file should be created in the following cases:

1) for frequently changed interactive reports if there is sufficient disk space;

2) when you need to save the cube on a network server so that other users can access it when creating reports. A cube file can expose specific data from the source database while omitting sensitive or confidential data that other users should not see.

Blue arrows indicate the paths through which information enters the system; green arrows indicate how the information is subsequently used.

  1. Information about orders is entered into the 1C system (DBF version).
  2. The data is exported through “auto-exchange”. Strictly speaking, this step is redundant: the data could be read directly from the DBF database. But the 1C programmers decided that the standard (for 1C) export mechanism would do less harm.
  3. Once a day, the changes for the past day are loaded into a specially prepared MS SQL database, the data warehouse. Not all information is loaded, only what the cubes need.

    In principle, it is not necessary to build a warehouse. Data for the cube can be taken directly from the 1C database (MS SQL or DBF). But in my case, data from previous periods is periodically deleted from 1C and its directories are cleared. In addition, the data is “cleaned” a little before it is loaded into the warehouse.

  4. The cube is recalculated, and the data goes into the cube.
Information from the warehouse is used not only by the cubes but also by external applications: for example, it is needed for payroll calculation, for tracking payments and deliveries, and for planning a manager's work. Data from these external programs, in turn, also enters the cubes.
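The daily load described in step 3 might look roughly like the sketch below. It uses an in-memory SQLite database in place of MS SQL, and every table name, column name and cleaning rule (`source_orders`, `warehouse_sales`, trimming product names) is an assumption made for illustration; the article does not describe the actual schema or cleaning logic.

```python
import sqlite3


def make_demo_db():
    """Create a hypothetical source table and an empty warehouse table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE source_orders (doc_date TEXT, product TEXT, amount REAL, note TEXT);
        CREATE TABLE warehouse_sales (doc_date TEXT, product TEXT, amount REAL);
    """)
    conn.executemany(
        "INSERT INTO source_orders VALUES (?, ?, ?, ?)",
        [
            ("2008-03-01", " Tea ", 10.0, "old"),
            ("2008-03-02", "Tea", 20.0, "new"),
            ("2008-03-02", "Coffee  ", 5.0, "new"),
        ],
    )
    return conn


def load_day(conn, day):
    """Reload one day's rows into the warehouse and return how many it holds.

    Only the columns the cube needs are copied, and product names are
    trimmed as a stand-in for "cleaning". Deleting the day's rows first
    makes the load safe to repeat.
    """
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM warehouse_sales WHERE doc_date = ?", (day,))
        conn.execute(
            """INSERT INTO warehouse_sales (doc_date, product, amount)
               SELECT doc_date, TRIM(product), amount
               FROM source_orders WHERE doc_date = ?""",
            (day,),
        )
    return conn.execute(
        "SELECT COUNT(*) FROM warehouse_sales WHERE doc_date = ?", (day,)
    ).fetchone()[0]


demo = make_demo_db()
print(load_day(demo, "2008-03-02"))
```

Keeping a separate copy in this way is what lets the warehouse retain history even after old periods are deleted from the source system.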

Office employees work with the cubes: management, managers, marketing and accounting. Information is also sent to suppliers and to sales representatives in different cities of the region.

Any user can obtain information in several ways:

  1. Build a report yourself on a web page or in Excel

    At first only Excel was used, but many problems arose because the Excel files were “scattered”; a single “entry point” for selecting information was needed.
    Therefore a local site was created on which pages with PivotTable components were published. An employee who needs a couple of numbers “here and now” goes to this site and builds a report in the form required. If the report will be needed again, the employee can submit a request to have it published in SSRS, or save it in Excel.

  2. View a standard report published to SQL Server Reporting Services (SSRS)
  3. Get a local cube and “rotate” the data in Excel outside the office
  4. Subscribe to the mailing list and receive standard reports from SSRS by e-mail
  5. The marketing department also uses the CubeSlice program. It lets you create local cubes yourself, and it is much more convenient than Excel for this

Local Cubes

Sometimes a user needs to receive reports containing large amounts of data on a regular basis. For example, the marketing department used to send suppliers reports in the form of Excel files several dozen pages long.
OLAP is not designed for extracting information in this form, and the reports took a very long time to generate.

As a rule, large reports are also inconvenient for the supplier. Therefore most suppliers, having tried working with local cubes, agreed to receive reports in this form. The list of reports generated by the marketing department shrank significantly. The remaining heavy reports were implemented in SSRS, and subscriptions were created (the reports are generated automatically and sent to suppliers on a schedule).

Basic system parameters

Server configuration:

processor: 2 x AMD Opteron 280
memory: 4 GB
disk arrays:
operating system: RAID 1 (mirror), 2 x SCSI 15k
data: RAID 0+1, 4 x SCSI 10k

You will agree that such a machine can hardly be called a “powerful” server.

Data volume:

warehouse: 10 GB, data since 2002
aggregation: 30%
multidimensional database size: 350 MB
members in the “large dimensions”: goods, 25 thousand; addresses, 20 thousand
documents per day: 400; average number of lines per document: 30

What the company ended up with:

Pros:

  • For the management of the enterprise
    Lets management look at the situation “from above” and identify general patterns of business development.
    Helps track the dynamics of the organization's main performance indicators as a whole and quickly evaluate the performance of subordinates.
  • For the manager
    The ability to obtain the information needed for a decision independently and quickly.
    Ease of use. All actions are intuitive.
  • For suppliers
    The ability to work with information interactively.
  • From the point of view of an IT specialist
    Less routine work. Users produce most reports themselves.

Cons:

  • Implementation cost. Additional hardware and software are required.
  • Lack of trained specialists. IT department employees have to be trained, which costs money.

In a standard pivot table, the source data is stored on your local hard drive. This way, you can always manage and reorganize them, even without access to the network. But this in no way applies to OLAP pivot tables. In OLAP pivot tables, the cache is never stored on the local hard drive. Therefore, immediately after disconnecting from the local network, your pivot table will no longer work. You will not be able to move a single field in it.

If you still need to analyze OLAP data after going offline, create an offline data cube. An offline data cube is a separate file that serves as the pivot table cache and stores the OLAP data for viewing after you disconnect from the local network. OLAP data copied into a pivot table can be printed; this is described in detail on the website http://everest.ua.

To create an offline data cube, first create an OLAP pivot table. Place the cursor within the pivot table and click the OLAP Tools button on the Tools contextual tab, which is part of the PivotTable Tools contextual tab group. Select the Offline OLAP command (Fig. 9.8).

The Offline OLAP Data Cube Settings dialog box appears on the screen. Click the Create Offline Data File button. This launches the Create Data Cube File Wizard. Click Next to continue.

First, specify the dimensions and levels to include in the data cube. In the dialog box, select the data to be imported from the OLAP database. The idea is to specify only those dimensions that will be needed after the computer is disconnected from the local network. The more dimensions you specify, the larger the offline data cube will be.
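To see why each extra dimension matters, a back-of-the-envelope estimate can help. The sketch below computes an upper bound on the number of cube cells as the product of the member counts of the chosen dimensions. The dimension names and member counts are invented for illustration, and a real cube file is normally far smaller than this bound because most combinations of members contain no data.

```python
from math import prod


def max_cells(member_counts):
    """Upper bound on cube cells: the product of the dimensions' member counts."""
    return prod(member_counts.values())


# Hypothetical member counts, not taken from any real cube.
all_dims = {"Product": 200, "Customer": 500, "Month": 12}
needed = {"Product": 200, "Month": 12}

print(max_cells(all_dims))  # bound with every dimension included
print(max_cells(needed))    # bound after dropping the Customer dimension
```

Dropping a single dimension divides the bound by that dimension's member count, which is why selecting only the dimensions needed offline keeps the file small.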

Click Next to move to the next wizard dialog box. Here you can specify the members or data items that will not be included in the cube. In particular, you won't need the Internet Sales-Extended Amount measure, so clear its check box in the list. A cleared check box means that the item will not be imported and will not take up space on your local hard drive.

In the last step, specify the location and name of the data cube. In our case, the cube file will be named MyOfflineCube.cub and will be located in the Work folder.

Data cube files have the extension .cub.

After some time, Excel saves the offline data cube in the specified folder. To test it, double-click the file; Excel automatically generates a workbook containing a pivot table linked to the selected data cube. Once the offline data cube has been created, you can distribute it to any interested users who work disconnected from the local network.

Once you are connected to the local network again, you can open the offline data cube file and refresh it together with the corresponding pivot table. The basic principle is that the offline data cube is used only while the local network is disconnected and must be refreshed once the connection is restored. An attempt to refresh an offline data cube while the connection is down will fail.

An offline cube file (.cub) stores data in the form of an online analytical processing (OLAP) cube. This data may represent part of an OLAP database from an OLAP server, or it may have been created independently of any OLAP database. To continue working with PivotTable and PivotChart reports when the server is unavailable or when you are offline, use an offline cube file.

Learn more about offline cubes

When you work with a PivotTable or PivotChart report that is based on a data source from an OLAP server, use the Offline Cube Wizard to copy the source data to a separate offline cube file on your computer. To create these offline files, you must have an OLAP data provider that supports these capabilities, such as MSOLAP from Microsoft SQL Server Analysis Services, installed on your computer.

Note: Creating and using stand-alone cube files from Microsoft SQL Server Analysis Services is subject to Microsoft SQL Server installation terms and licensing. Review the appropriate licensing information for your version of SQL Server.

Using the Offline Cube Wizard

To create an offline cube file, use the Offline Cube Wizard to select a subset of the data in the OLAP database, and then save that subset. Your reports do not have to use every field that you include in the file, and you can choose from any of the dimensions and data fields available in the OLAP database. To keep the file small, include only the data that you want to be able to display in the report. You can omit entire dimensions and, for most types of dimensions, you can also omit lower-level detail and top-level items that you don't need to display. For every item that you do include, the property fields that are available in the database for that item are also saved in the offline file.

Taking data offline and then bringing data back online

To do this, first create a PivotTable report or a PivotChart report that is based on the server database, and then create an offline cube file from the report. Afterwards, you can switch the report between the server database and the offline file at any time (for example, when you work on a laptop at home or on the road and later reconnect the computer to the network).

The following describes the basic steps for taking data offline and bringing it back online.

    Click the PivotTable report. If this is a PivotChart report, select the associated PivotTable report.

    On the Analyze tab, in the Calculations group, click OLAP Tools, and then click Offline OLAP.

    Select On-line OLAP, and then click OK.

    If prompted to locate the data source, click Browse to find source and locate the OLAP server on your network.

    Click the PivotTable report that is based on the offline cube file.

    In Excel 2016: On the Data tab, in the Queries & Connections group, click the arrow next to Refresh All, and then click Refresh.

    In Excel 2013: On the Data tab, in the Connections group, click the arrow next to Refresh All, and then click Refresh.

    On the Analyze tab, in the Calculations group, click OLAP Tools, and then click Offline OLAP.

    Click Offline OLAP, and then choose the command for the offline data file.

Note: To cancel saving the file, click Stop in the Create Cube File - Progress dialog box.

Creating an offline cube file from an OLAP server database

Note: If the OLAP database is large and the cube file must provide access to a large subset of the data, a lot of disk space will be required, and saving the file may take a long time. To improve performance, consider creating offline cube files by using an MDX script.

Problem: My computer does not have enough disk space when saving a cube.

OLAP databases are designed to manage large amounts of detailed data, so a database hosted on a server can take up significantly more space than is available on your local hard drive. If you select a large amount of data for an offline data cube, you may not have enough free disk space. The following approaches can help reduce the size of the offline cube file.

Free up disk space or select a different disk. Before saving the cube file, remove unnecessary files from the disk, or save the file on a network drive.

Include less data in the offline cube file. Consider how to minimize the amount of data in the file while still including everything needed for the PivotTable or PivotChart report. Try the steps below.

Connecting an offline cube file to an OLAP server database

Updating and re-creating an offline cube file

Refreshing an offline cube file, which re-creates it from the latest data in the server cube or in a new offline cube file, can take a significant amount of time and require a large amount of temporary disk space. Run this process when you don't need immediate access to other files, after making sure you have enough space on your hard drive.

Problem: New data does not appear in the report when refreshed.

Check that the source database is available. The offline cube file may be unable to connect to the source server database to obtain new data. Make sure that the original database on the server that is the data source for the cube has not been renamed or moved to another location. Make sure the server is accessible and can be connected to.

Check for new data. Ask your database administrator whether the data that should be included in the report has been updated.

Check whether the database organization has changed. If the OLAP server cube has been modified, you may need to reorganize the report, create an offline cube file, or run the Create OLAP Cube Wizard to access the changed data. To learn about database changes, contact your database administrator.

Including other data in the offline cube file

Saving a modified offline cube file can be time-consuming, and you cannot work in Microsoft Excel while the file is being saved. Run this process when you don't need immediate access to other files, after making sure you have enough space on your hard drive.

    Verify that there is a network connection and that the source OLAP server database from which the offline cube file obtained data is accessible.

    Click a PivotTable report created from the offline cube file, or the associated PivotTable report for a PivotChart report.

    On the Options tab, in the Tools group, click OLAP Tools, and then click Offline OLAP.

    Click Offline OLAP, and then click Edit Offline Data File.

    Follow the steps of the Offline Cube Wizard to select other data to include in the file. In the last step, specify the name and path of the file that you are changing.

Note: To cancel saving the file, click Stop in the Create Cube File - Progress dialog box.

Deleting an offline cube file

Warning: If you delete an offline cube file for a report, you can no longer use that report offline and you can no longer create an offline cube file for that report.

    Close any workbooks that contain reports that use the offline cube file, or ensure that all such reports are deleted.

    On Microsoft Windows, locate and delete the offline cube file (CUB file).


In the previous article in this series (see No. 2’2005), we discussed the main innovations in the Analysis Services of SQL Server 2005. Today we take a closer look at the tools for creating OLAP solutions included in this product.

Briefly about the basics of OLAP

Before we start talking about tools for creating OLAP solutions, let us recall that OLAP (On-Line Analytical Processing) is a technology for complex multidimensional data analysis, the concept of which was described in 1993 by E.F. Codd, the famous author of the relational data model. Currently, OLAP support is implemented in many DBMSs and other tools.

OLAP cubes

What is OLAP data? To answer this question, consider a simple example. Suppose the corporate database of some enterprise contains a set of tables with information about sales of goods or services, and that an Invoices view has been created on top of them with the fields Country, City, CustomerName (name of the client company), Salesperson (sales manager), OrderDate (date the order was placed), CategoryName (product category), ProductName (product name), ShipperName (carrier company) and ExtendedPrice (payment for the goods). The last of these fields is, in fact, the object of analysis.

Selecting data from such a view can be done using the following query:

SELECT Country, City, CustomerName, Salesperson,
       OrderDate, CategoryName, ProductName, ShipperName, ExtendedPrice
FROM Invoices

Suppose we are interested in the total value of orders made by customers from different countries. To get an answer to this question you need to make the following request:

SELECT Country, SUM(ExtendedPrice) FROM Invoices
GROUP BY Country

The result of this query will be a one-dimensional set of aggregate data (in this case, sums):

Country SUM (ExtendedPrice)
Argentina 7327.3
Austria 110788.4
Belgium 28491.65
Brazil 97407.74
Canada 46190.1
Denmark 28392.32
Finland 15296.35
France 69185.48
209373.6
...

If we want to know the total cost of orders placed by customers from different countries and delivered by different delivery services, we must run a query containing two parameters in the GROUP BY clause:

SELECT Country, ShipperName, SUM(ExtendedPrice) FROM Invoices
GROUP BY Country, ShipperName

Based on the results of this query, you can create a table that looks like this:

This set of data is called a pivot table.

If we add a third parameter, for example the salesperson, the query becomes:

SELECT Country, ShipperName, Salesperson, SUM(ExtendedPrice) FROM Invoices
GROUP BY Country, ShipperName, Salesperson

Based on the results of this query, a three-dimensional cube can be constructed (Fig. 1).

By adding further parameters for analysis, you can create a cube with theoretically any number of dimensions. Along with sums, the cells of an OLAP cube can contain the results of other aggregate functions (for example, the average, maximum or minimum value, or the number of records of the original view that correspond to a given set of parameters). The fields from which the results are calculated are called the measures of the cube.
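The progression from a one-dimensional result to a three-dimensional cube can be tried out with a few lines of Python and an in-memory SQLite database. This is only a sketch: the four sample rows are made up, the cut-down `Invoices` table keeps just the columns used here, and the `aggregate` helper is a hypothetical name introduced for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Invoices "
    "(Country TEXT, ShipperName TEXT, Salesperson TEXT, ExtendedPrice REAL)"
)
conn.executemany(
    "INSERT INTO Invoices VALUES (?, ?, ?, ?)",
    [
        ("Austria", "Speedy Express", "Nancy Davolio", 100.0),
        ("Austria", "United Package", "Nancy Davolio", 50.0),
        ("Brazil", "Speedy Express", "Andrew Fuller", 70.0),
        ("Brazil", "Speedy Express", "Nancy Davolio", 30.0),
    ],
)

DIMENSIONS = ("Country", "ShipperName", "Salesperson")


def aggregate(conn, dims):
    """Return {dimension members: SUM(ExtendedPrice)} for the given dimensions."""
    for name in dims:
        if name not in DIMENSIONS:  # only known column names reach the SQL text
            raise ValueError(f"unknown dimension: {name}")
    cols = ", ".join(dims)
    sql = f"SELECT {cols}, SUM(ExtendedPrice) FROM Invoices GROUP BY {cols}"
    return {tuple(row[:-1]): row[-1] for row in conn.execute(sql)}


by_country = aggregate(conn, ["Country"])                          # one dimension
pivot = aggregate(conn, ["Country", "ShipperName"])                # two dimensions
cube = aggregate(conn, ["Country", "ShipperName", "Salesperson"])  # three dimensions
```

Each key of `cube` addresses one cell, and the value is the measure stored in that cell; summing the cells that share a country reproduces `by_country`.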

Hierarchies in dimensions

Suppose we are interested not only in the total value of orders made by customers in different countries, but also in the total value of orders made by customers in different cities of the same country. In this case, we can use the fact that the values plotted on the axes have different levels of detail; this is described by the concept of a dimension hierarchy. Say that countries are at the first level of the hierarchy and cities are at the second. Note that starting with SQL Server 2000, Analysis Services supports so-called unbalanced hierarchies, which contain, for example, members whose “children” are not at the adjacent level of the hierarchy or are missing altogether for some members of the dimension. A typical example is a geographic hierarchy that allows for the fact that some countries have administrative units such as a state or region between the country and city levels, while others do not (Fig. 2).

Note that recently it has been common to distinguish typical hierarchies, for example those containing geographical or temporal data, and also to support the existence of several hierarchies in one dimension (in particular, for the calendar and fiscal year).
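A hierarchy in which some members skip a level can be modeled quite simply. The sketch below computes totals for every node of a country / region / city hierarchy in which the region level is optional. The order rows and the `hierarchy_totals` helper are hypothetical and are not how Analysis Services stores hierarchies; they only show the roll-up idea.

```python
from collections import defaultdict

# Hypothetical orders: (country, region or None, city, amount).
# Some countries have a state/region level between country and city, others do not.
ORDERS = [
    ("USA", "Oregon", "Portland", 120.0),
    ("USA", "Oregon", "Eugene", 30.0),
    ("USA", "Washington", "Seattle", 50.0),
    ("Austria", None, "Graz", 80.0),
    ("Austria", None, "Salzburg", 20.0),
]


def hierarchy_totals(orders):
    """Total the amounts for every node on each order's path through the hierarchy."""
    totals = defaultdict(float)
    for country, region, city, amount in orders:
        # Skip the missing level so that cities hang directly off the country.
        path = [member for member in (country, region, city) if member is not None]
        for depth in range(1, len(path) + 1):
            totals[tuple(path[:depth])] += amount
    return dict(totals)


totals = hierarchy_totals(ORDERS)
print(totals[("USA",)], totals[("USA", "Oregon")], totals[("Austria", "Graz")])
```

Here `("Austria", "Graz")` sits two levels deep while `("USA", "Oregon", "Portland")` sits three levels deep, which is exactly the situation an unbalanced hierarchy has to accommodate.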

Creating OLAP cubes in SQL Server 2005

SQL Server 2005 cubes are created using SQL Server Business Intelligence Development Studio. This tool is a special version of Visual Studio 2005 designed for this class of tasks (if a development environment is already installed, its list of project templates is extended with projects for building solutions based on SQL Server and its Analysis Services). In particular, the Analysis Services Project template is intended for creating solutions based on Analysis Services (Fig. 3).

To create an OLAP cube, you first need to decide which data it will be built from. Most often, OLAP cubes are built on relational data warehouses with star or snowflake schemas (we discussed them in the previous part of the article). SQL Server ships with an example of such a warehouse, the AdventureWorksDW database. To use it as a source, find the Data Sources folder in Solution Explorer, select the New Data Source context menu item, and work through the corresponding wizard (Fig. 4).

It is then recommended to create a Data Source View on which the cube will be created. To do this, you need to select the appropriate context menu item in the Data Source Views folder and consistently answer the wizard’s questions. The result of these actions will be a data schema, with the help of which a representation of data sources will be built, and in the resulting schema, instead of the original ones, you can specify “friendly” table names (Fig. 5).

The cube described in this way can be transferred to the Analysis Services server by selecting the Deploy option from the project context menu, after which its data can be viewed (Fig. 7).

Cube creation now takes advantage of many of the features of the new version of SQL Server, such as the data source view. The description of the source data for constructing a cube, as well as the description of the structure of the cube, is now done using the Visual Studio tool familiar to many developers, which is a significant advantage of the new version of this product - the study of new tools by developers of analytical solutions in this case is minimized.

Note that in the created cube you can change the composition of measures, delete and add dimension attributes, and add calculated attributes of dimension members based on existing attributes (Fig. 8).

Fig. 8. Add a calculated attribute

In addition, SQL Server 2005 cubes can automatically group or sort dimension members by attribute value, define relationships between attributes, implement many-to-many relationships, define key business metrics, and much more (details on how to perform all of these steps can be found in the SQL Server Analysis Services Tutorial in the product's Help).

In subsequent parts of this publication, we will continue to explore the Analysis Services of SQL Server 2005 and find out what is new in the area of Data Mining support.
