Functional and stochastic relationships. The problem of mathematical modeling (approximation). The stochastic dependence formula

Between phenomena and their characteristics, two types of relationship must be distinguished first of all: functional (rigidly determined) and statistical (stochastically determined).

The relationship of feature y to feature x is called functional if each possible value of the independent feature x corresponds to one or more strictly defined values of the dependent feature y. The definition of a functional relationship generalizes easily to the case of many features $x_1, x_2, \dots, x_n$.

A characteristic feature of functional connections is that in each individual case a complete list of factors that determine the value of the dependent (resultative) characteristic is known, as well as the exact mechanism of their influence, expressed by a certain equation.

The functional relationship can be represented by the equation:

$$y_i = f(x_i),$$

where $y_i$ is the resultant characteristic ($i = 1, \dots, n$);

$f(x_i)$ is the known function relating the resultant and factor characteristics;

$x_i$ is the factor characteristic.

A stochastic relationship is a relationship between quantities in which one of them, the random quantity y, responds to a change in another quantity x or in other quantities $x_1, x_2, \dots, x_n$ (random or non-random) by a change in its distribution law. This happens because the dependent variable (the resultant characteristic) is influenced not only by the independent variables under consideration but also by a number of unaccounted or uncontrolled (random) factors, as well as by unavoidable errors in the measurement of the variables. Since the values of the dependent variable are subject to random scatter, they cannot be predicted with sufficient accuracy, but only indicated with a certain probability.

A characteristic feature of stochastic relationships is that they manifest themselves in the population as a whole rather than in each of its units (and neither the complete list of factors that determine the value of the resultant characteristic nor the exact mechanism of their functioning and interaction with the resultant characteristic is known). The influence of chance is always present. The various values of the dependent variable that appear are realizations of a random variable.

The stochastic relationship model can be represented in general form by the equation:

$$y_i = f(x_i) + \varepsilon_i,$$

where $y_i$ is the calculated value of the resultant characteristic;

$f(x_i)$ is the part of the resultant characteristic formed under the influence of the known factor characteristics taken into account (one or many), which are stochastically related to the resultant characteristic;

$\varepsilon_i$ is the part of the resultant characteristic that arises from the action of uncontrolled or unaccounted factors, and from the random errors that inevitably accompany the measurement of the characteristics.
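The contrast between the two kinds of relationship can be illustrated with a small simulation. This is only a sketch: the linear rule `f`, the noise level `sigma` and the normal distribution of the random part are assumptions made for illustration and are not taken from the text.

```python
import random
import statistics


def f(x):
    """Hypothetical systematic part of the relationship (assumed linear)."""
    return 2.0 + 0.5 * x


def functional(x):
    # Functional relationship: y is completely determined by x.
    return f(x)


def stochastic(x, rng, sigma=1.0):
    # Stochastic relationship: y = f(x) + eps, where eps is a random part
    # (assumed here to be normal with zero mean and standard deviation sigma).
    return f(x) + rng.gauss(0.0, sigma)


rng = random.Random(0)
x = 10.0

# Repeating the "experiment" at the same x always gives the same y ...
functional_values = [functional(x) for _ in range(5)]

# ... whereas in the stochastic case the same x gives a whole distribution of y.
stochastic_values = [stochastic(x, rng) for _ in range(10_000)]
mean_y = statistics.fmean(stochastic_values)
sd_y = statistics.pstdev(stochastic_values)

print(functional_values)
print(round(mean_y, 2), round(sd_y, 2))
```

For a fixed x the functional rule returns a single value, while the stochastic rule returns values scattered around f(x); only their distribution (here its mean and spread) is tied to x.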

In considering the dependence between characteristics, let us first single out the dependence between changes in the factor and resultant characteristics in which a definite value of the factor characteristic corresponds to many possible values of the resultant characteristic. In other words, each value of one variable corresponds to a certain (conditional) distribution of the other variable. Such a dependence is called stochastic. The concept of stochastic dependence arises because the dependent variable is influenced by a number of uncontrolled or unaccounted factors, and because the measurement of the values of the variables is inevitably accompanied by some random errors. An example of a stochastic relationship is the dependence of the crop yield Y on the mass of applied fertilizer X. We cannot predict the yield exactly, since it is influenced by many factors (precipitation, soil composition, etc.). However, it is clear that the yield will change as the mass of fertilizer changes.

In statistics, observed values of characteristics are studied, so stochastic dependence is usually called statistical dependence.

Because the statistical relationship between the values of the resultant characteristic Y and the values of the factor characteristic X is not one-to-one, what is of interest is the dependence averaged over X, i.e. the pattern expressed by the conditional mathematical expectation $M(Y \mid X = x)$ (calculated at a fixed value of the factor characteristic, X = x). Dependencies of this kind are called regression, and the function $\varphi(x) = M(Y \mid X = x)$ is called the regression function of Y on X, or the forecast of Y from X (notation: $y_x = \varphi(x)$). The resultant characteristic Y is also called the response function or the explained, output, resultant, or endogenous variable, and the factor characteristic X is also called the regressor or the explanatory, input, predictive, predictor, or exogenous variable.

In Section 4.7 it was proved that the conditional mathematical expectation $M(Y \mid X) = \varphi(X)$ gives the best forecast of Y from X in the mean-square sense, i.e. $M[(Y - \varphi(X))^2] \le M[(Y - g(X))^2]$, where $g(x)$ is any other forecast of Y from X.
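The optimality of the conditional expectation can be checked numerically on a small example. The joint distribution `joint` and the competing forecast `g` below are hypothetical, chosen only to keep the arithmetic exact; the sketch illustrates the inequality and is not a proof of it.

```python
from fractions import Fraction as F

# Hypothetical joint distribution P(X = x, Y = y) on a small grid.
joint = {
    (0, 0): F(1, 4), (0, 2): F(1, 4),
    (1, 1): F(1, 8), (1, 5): F(3, 8),
}


def regression_function(x):
    """phi(x) = M(Y | X = x), computed from the joint table."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return sum(y * p for (xi, y), p in joint.items() if xi == x) / p_x


def mean_squared_error(predict):
    """M[(Y - predict(X))^2] under the joint distribution."""
    return sum(p * (y - predict(x)) ** 2 for (x, y), p in joint.items())


def g(x):
    # An arbitrary competing forecast, used only for comparison.
    return 1 + 2 * x


mse_phi = mean_squared_error(regression_function)
mse_g = mean_squared_error(g)
print(regression_function(0), regression_function(1))
print(mse_phi, mse_g)
```

Here the regression function takes the values 1 and 4, and its mean squared error (2) is smaller than that of the competing forecast (5/2).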

So, regression is a one-way statistical relationship that establishes a correspondence between characteristics. Depending on the number of factor characteristics describing the phenomenon, a distinction is made between paired (simple) and multiple regression. For example, paired regression is a regression between production costs (factor characteristic X) and the volume of products produced by the enterprise (resultant characteristic Y). Multiple regression is a regression between labor productivity (resultant characteristic Y) and the level of mechanization of production processes, working hours, material intensity, and worker qualifications (factor characteristics $X_1, X_2, X_3, X_4$).

By form, a distinction is made between linear and nonlinear regression, i.e. regressions expressed by linear and nonlinear functions.

For example, $f(X) = aX + b$ is a paired linear regression; $f(X) = aX^2 + bX + c$ is a quadratic regression; $f(X_1, X_2, \dots, X_p) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p$ is a multiple linear regression.

The problem of identifying a statistical dependence has two sides: establishing the closeness (strength) of the relationship and determining the form of the relationship.

Correlation analysis is devoted to establishing the closeness (strength) of the relationship. Its purpose is to obtain, from the available statistical data, answers to the following basic questions:

  • how to choose a suitable measure of statistical association (correlation coefficient, correlation ratio, rank correlation coefficient, etc.);
  • how to test the hypothesis that the numerical value obtained for the measure of association really indicates the presence of a statistical relationship.
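As an illustration of these two questions, the sketch below takes the Pearson correlation coefficient as the measure of association and uses the usual t-statistic $t = r\sqrt{(n-2)/(1-r^2)}$ to test the hypothesis of no linear relationship. The data are invented for the example, and the choice of this particular measure and test is an assumption; the text does not prescribe them.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


def t_statistic(r, n):
    """t-statistic for H0: no linear relationship (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical observations of a factor x and a result y.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 2.9, 3.2, 4.8, 5.1, 5.9, 7.2, 7.8, 9.1, 9.9]

r = pearson_r(x, y)
t = t_statistic(r, len(x))

# Two-sided 5% critical value of Student's t with 8 degrees of freedom.
T_CRITICAL = 2.306
relationship_detected = abs(t) > T_CRITICAL
print(round(r, 3), relationship_detected)
```

If |t| exceeds the tabulated critical value, the hypothesis that there is no linear relationship is rejected at the chosen significance level.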

Regression analysis determines the form of the relationship. Its purpose is to solve the following problems on the basis of the available statistical data:

  • choosing the type of regression function (model selection);
  • finding unknown parameters of the selected regression function;
  • analysis of the quality of the regression function and verification of the adequacy of the equation to empirical data;
  • forecasting unknown values of the resultant characteristic from given values of the factor characteristics.
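The four tasks can be traced on a minimal paired linear regression. In the sketch below the model type is fixed in advance as $f(X) = aX + b$, the parameters are found by ordinary least squares, the quality is summarized by the coefficient of determination, and a forecast is made for a new value of the factor. The data and the names `costs` and `output` are hypothetical, and least squares is assumed here as the estimation method.

```python
def fit_line(xs, ys):
    """Ordinary least squares estimates (intercept, slope) for y = b + a * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Coefficient of determination: share of the variance of y explained."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data: factor characteristic X and resultant characteristic Y.
costs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
output = [2.1, 2.9, 3.2, 4.8, 5.1, 5.9, 7.2, 7.8, 9.1, 9.9]

intercept, slope = fit_line(costs, output)          # parameter estimation
quality = r_squared(costs, output, intercept, slope)  # quality of fit
forecast = intercept + slope * 11                   # forecast for a new X

print(round(intercept, 3), round(slope, 3), round(quality, 3), round(forecast, 3))
```

A coefficient of determination close to one indicates that the chosen linear form describes these particular data well; for other data a different model type might have to be selected at the first step.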

At first glance it may seem that the concept of regression is similar to the concept of correlation, since in both cases we are dealing with a statistical dependence between the characteristics under study. In reality there are significant differences between them. Regression implies a causal relationship, in which the conditional mean of the resultant characteristic changes because the factor characteristics change. Correlation says nothing about a causal relationship between the characteristics: if X and Y are correlated, this does not imply that changes in the values of X determine the change in the conditional mean of Y. Correlation merely states that changes in one quantity are, on average, associated with changes in the other.

Federal State Educational Institution of Higher Professional Education

Academy of Budget and Treasury of the Ministry of Finance of the Russian Federation

Kaluga branch

Subject: Econometric method and the use of stochastic dependencies in econometrics

Faculty of Accounting

accounting, analysis and audit

Part-time department

Scientific supervisor: Shvetsova S.T.

Kaluga 2007


1. Analysis of various approaches to determining probability: a priori approach, a posteriori-frequency approach, a posteriori-model approach

2. Examples of stochastic dependencies in economics, their features and probability-theoretic methods of studying them

3. Testing a number of hypotheses about the properties of the probability distribution for the random component as one of the stages of econometric research




The formation and development of the econometric method took place on the basis of the so-called higher statistics - on the methods of paired and multiple regression, paired, partial and multiple correlation, identification of trends and other components of the time series, and statistical estimation. R. Fisher wrote: “Statistical methods are an essential element in the social sciences, and it is mainly with the help of these methods that social teachings can rise to the level of sciences.”

The purpose of this essay was to study the econometric method and the use of stochastic dependencies in econometrics.

The objectives of this essay are to analyze various approaches to determining probability, give examples of stochastic dependencies in economics, identify their features and give probability-theoretic methods for studying them, and analyze the stages of econometric research.

1. Analysis of various approaches to determining probability: a priori approach, a posteriori-frequency approach, a posteriori-model approach

To fully describe the mechanism of the random experiment under study, it is not enough to specify only the space of elementary events. Obviously, along with listing all the possible outcomes of the random experiment under study, we must also know how often in a long series of such experiments certain elementary events can occur.

To construct (in the discrete case) a complete mathematical theory of a random experiment, that is, probability theory, we need, in addition to the initial concepts of random experiment, elementary outcome and random event, one more initial assumption (an axiom) postulating the existence of probabilities of elementary events (satisfying a certain normalization), and a definition of the probability of any random event.

Axiom. Each element $\omega_i$ of the space of elementary events Ω is assigned a non-negative numerical characteristic $p_i$ of the chances of its occurrence, called the probability of the event $\omega_i$, and

$$p_1 + p_2 + \dots + p_n + \dots = \sum_i p_i = 1 \qquad (1.1)$$

(from which it follows, in particular, that $0 \le p_i \le 1$ for all i).

Definition of the probability of an event. The probability of any event A is defined as the sum of the probabilities of all elementary events that make up A, i.e. if we write P(A) for "the probability of the event A", then

$$P(A) = \sum_{i:\, \omega_i \in A} P\{\omega_i\} = \sum_{i:\, \omega_i \in A} p_i \qquad (1.2)$$

From this and from (1.1) it follows immediately that $0 \le P(A) \le 1$, that the probability of the certain event equals one, and that the probability of the impossible event equals zero. All other concepts and rules for working with probabilities and events are derived from the four initial definitions introduced above (random experiment, elementary outcome, random event and its probability) and the one axiom.

Thus, for an exhaustive description of the mechanism of the random experiment under study (in the discrete case), it is necessary to specify a finite or countable set Ω of all possible elementary outcomes and to associate with each elementary outcome $\omega_i$ a non-negative numerical characteristic $p_i$ (not exceeding one), interpreted as the probability of occurrence of the outcome $\omega_i$ (we denote this probability by $P\{\omega_i\}$); the correspondence $\omega_i \leftrightarrow p_i$ so established must satisfy the normalization requirement (1.1).

A probability space is precisely the concept that formalizes such a description of the mechanism of a random experiment. To specify a probability space means to specify the space of elementary events Ω and to define on it a correspondence of the above type,

$$\omega_i \leftrightarrow p_i = P\{\omega_i\}. \qquad (1.3)$$
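A discrete probability space of this kind can be written down directly as a table of outcomes and their probabilities. The sketch below uses one roll of a fair die as a hypothetical example; the helper names `is_probability_space` and `probability` are introduced here for illustration only.

```python
from fractions import Fraction

# Hypothetical probability space: one roll of a fair six-sided die.
# Keys are the elementary outcomes, values are their probabilities p_i.
space = {face: Fraction(1, 6) for face in range(1, 7)}


def is_probability_space(p):
    """Check non-negativity and the normalization requirement (1.1)."""
    return all(p_i >= 0 for p_i in p.values()) and sum(p.values()) == 1


def probability(event, p):
    """Definition (1.2): sum the probabilities of the outcomes in the event."""
    return sum((p[w] for w in event), Fraction(0))


even = {2, 4, 6}        # a random event
certain = set(space)    # the certain event: all outcomes
impossible = set()      # the impossible event: no outcomes

print(is_probability_space(space))
print(probability(even, space), probability(certain, space), probability(impossible, space))
```

The certain event receives probability one and the impossible event probability zero, as follows from (1.1) and (1.2).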

To determine the probabilities $P\{\omega_i\}$ of individual elementary events from the specific conditions of the problem being solved, one of the following three approaches is used.

The a priori approach to calculating the probabilities $P\{\omega_i\}$ consists of a theoretical, speculative analysis of the specific conditions of the particular random experiment (before the experiment itself is carried out). In a number of situations this preliminary analysis makes it possible to justify theoretically a method for determining the desired probabilities. For example, the space of all possible elementary outcomes may consist of a finite number N of elements, and the conditions of the random experiment may be such that the probabilities of these N elementary outcomes appear equal to us (this is exactly the situation when tossing a symmetric coin, throwing a fair die, randomly drawing a playing card from a well-shuffled deck, etc.). By virtue of axiom (1.1), the probability of each elementary event is then equal to 1/N. This gives a simple recipe for calculating the probability of any event: if the event A contains $N_A$ elementary events, then in accordance with definition (1.2)

$$P(A) = N_A / N. \qquad (1.2’)$$

The meaning of formula (1.2’) is that the probability of an event in this class of situations can be defined as the ratio of the number of favorable outcomes (i.e., elementary outcomes included in this event) to the number of all possible outcomes (the so-called classical definition of probability). In its modern interpretation, formula (1.2’) is not a definition of probability: it is applicable only in the particular case when all elementary outcomes are equally probable.
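The classical formula amounts to counting favorable outcomes. A minimal sketch for the card-drawing situation mentioned above follows; the representation of the deck and the function name `classical_probability` are assumptions made for the example, and the formula is valid only because all 52 outcomes are taken to be equally probable.

```python
from fractions import Fraction
from itertools import product

RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]

# All N = 52 equally probable elementary outcomes of drawing one card.
deck = list(product(RANKS, SUITS))


def classical_probability(event, outcomes):
    """P(A) = N_A / N, where event(o) tells whether outcome o belongs to A."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))


def is_ace(card):
    return card[0] == "A"


def is_heart(card):
    return card[1] == "hearts"


p_ace = classical_probability(is_ace, deck)
p_heart = classical_probability(is_heart, deck)
print(p_ace, p_heart)
```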

The a posteriori-frequency approach to calculating the probabilities $P\{\omega_i\}$ is based, in essence, on the definition of probability adopted by the so-called frequency concept of probability. According to this concept, the probability $P\{\omega_i\}$ is defined as the limit of the relative frequency of occurrence of the outcome $\omega_i$ as the total number n of random experiments increases without bound, i.e.

$$p_i = P\{\omega_i\} = \lim_{n \to \infty} \frac{m_n(\omega_i)}{n}, \qquad (1.4)$$

where $m_n(\omega_i)$ is the number of random experiments (out of the total number n performed) in which the elementary event $\omega_i$ was recorded. Accordingly, for a practical (approximate) determination of the probabilities $p_i$, it is proposed to take the relative frequencies of occurrence of the event $\omega_i$ in a sufficiently long series of random experiments.
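The practical recipe can be imitated with a pseudo-random simulation. In the sketch below a fair die is assumed, so the "unknown" probability of a two is in fact 1/6; the function name `relative_frequency` and the series lengths are chosen only for illustration.

```python
import random


def relative_frequency(outcome, n, rng):
    """m_n(outcome) / n for n simulated rolls of a fair six-sided die."""
    hits = sum(1 for _ in range(n) if rng.randint(1, 6) == outcome)
    return hits / n


rng = random.Random(42)  # fixed seed so that the run is reproducible

# Relative frequency of a "two" in series of increasing length.
frequencies = {n: relative_frequency(2, n, rng) for n in (100, 10_000, 100_000)}

for n, freq in frequencies.items():
    print(n, round(freq, 4), "deviation from 1/6:", round(abs(freq - 1 / 6), 4))
```

The relative frequency fluctuates noticeably in a short series and settles near 1/6 in a long one, which is what justifies using it as an approximate value of the probability.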

The definitions of probability in these two concepts differ: according to the frequency concept, probability is not an objective property of the phenomenon under study that exists before the experiment, but arises only in connection with an experiment or observations; this leads to a confusion of the theoretical probabilistic characteristics (the true ones, determined by the real set of conditions under which the phenomenon under study "exists") with their empirical (sample) analogues.

The a posteriori-model approach to specifying the probabilities $P\{\omega_i\}$ that correspond to the particular real set of conditions under study is at present perhaps the most widespread and the most convenient in practice. The logic of this approach is as follows. On the one hand, within the framework of the a priori approach, i.e. a theoretical, speculative analysis of the possible specific features of hypothetical real sets of conditions, a set of model probability spaces (binomial, Poisson, normal, exponential, etc.) is developed. On the other hand, the researcher has the results of a limited number of random experiments. Then, using special mathematical and statistical techniques, the researcher fits, as it were, the hypothetical models of probability spaces to the available observations and retains for further use only the model or models that do not contradict these results and, in a certain sense, correspond to them best.
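One possible way to carry out this "fitting" step is sketched below. Two candidate models for count data, a Poisson and a geometric distribution, are adjusted to hypothetical observations by maximum likelihood, and the one with the larger log-likelihood is retained. Both the data and the choice of the likelihood as the criterion of "best correspondence" are assumptions made for illustration; the text does not specify which statistical techniques are used.

```python
import math

# Hypothetical observations: counts recorded in 15 random experiments.
counts = [2, 3, 1, 4, 2, 3, 2, 5, 3, 2, 1, 3, 4, 2, 3]
mean = sum(counts) / len(counts)


def poisson_loglik(data, lam):
    """Log-likelihood of a Poisson model with parameter lam."""
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1) for k in data)


def geometric_loglik(data, p):
    """Log-likelihood of a geometric model P(k) = p * (1 - p)**k, k = 0, 1, ..."""
    return sum(math.log(p) + k * math.log(1 - p) for k in data)


# Maximum likelihood estimates of the parameters of each candidate model.
lam_hat = mean
p_hat = 1 / (1 + mean)

loglik = {
    "Poisson": poisson_loglik(counts, lam_hat),
    "geometric": geometric_loglik(counts, p_hat),
}

# Keep the model that corresponds best to the observations.
best = max(loglik, key=loglik.get)
print(best, {name: round(value, 2) for name, value in loglik.items()})
```

In practice the retained model would also be checked against the data with a goodness-of-fit test, since the best of several candidates may still contradict the observations.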

Stochastic dependence is a dependence between random variables in which a change in one of them brings about a change in the distribution law of the other.

Probability theory is often perceived as a branch of mathematics that deals with the “calculus of probabilities.”

And all this calculation actually comes down to a simple formula:

“The probability of any event is equal to the sum of the probabilities of the elementary events that make it up.” In effect, this formula repeats a “spell” familiar to us since childhood:

“The mass of an object is equal to the sum of the masses of its constituent parts.”

Here we will discuss less trivial facts from probability theory. We will talk, first of all, about dependent and independent events.

It is important to understand that the same terms in different branches of mathematics can have completely different meanings.

For example, when we say that the area S of a circle depends on its radius R, we mean, of course, the functional dependence $S = \pi R^2$.

The concepts of dependence and independence have a completely different meaning in probability theory.

Let's start getting acquainted with these concepts with a simple example.

Imagine that you are conducting a die-rolling experiment in this room while your colleague in the next room is tossing a coin. Suppose you are interested in event A, that you roll a “two”, and event B, that your colleague gets “tails”. Common sense dictates that these events are independent!

Although we have not yet introduced the concept of dependence/independence, it is intuitively clear that any reasonable definition of independence must be designed so that these events are defined as independent.

Now let us turn to another experiment. A die is rolled; event A is that a two comes up, and event B is that an odd number of points comes up. Assuming that the die is symmetric, we can say at once that P(A) = 1/6. Now imagine that you are told: “As a result of the experiment, event B occurred: an odd number of points came up.” What can we now say about the probability of event A? Clearly, this probability has now become zero.

The most important thing for us is that it has changed.

Returning to the first example, we can say that the information that event B happened in the next room will not affect your idea of the probability of event A. This probability will not change because you learned something about event B.

We arrive at a natural and extremely important conclusion:

if the information that event B has occurred changes the probability of event A, then events A and B should be considered dependent; if it does not, they should be considered independent.

These considerations should be given a mathematical form: the dependence and independence of events should be defined by formulas.

We will proceed from the following thesis: “If A and B are dependent events, then event A contains information about event B, and event B contains information about event A.” How can we find out whether such information is contained or not? The answer to this question is given by information theory.

From information theory we need only one formula, which allows us to calculate the amount of mutual information I(A, B) for events A and B:

$$I(A, B) = \log \frac{P(AB)}{P(A)\,P(B)}.$$

We will not calculate the amount of information for various events or discuss this formula in detail.

What matters to us is that if

$$P(AB) = P(A)\,P(B),$$

then the amount of mutual information between events A and B is equal to zero, and events A and B are independent. If

$$P(AB) \ne P(A)\,P(B),$$

then the amount of mutual information is non-zero, and events A and B are dependent.

Appeal to the concept of information is of an auxiliary nature here and, as it seems to us, allows us to make the concepts of dependence and independence of events more tangible.

In probability theory, the dependence and independence of events is described more formally.

First of all, we need the concept of conditional probability.

The conditional probability of event A, given that event B has occurred (P(B) ≠ 0), is the quantity P(A|B) calculated by the formula

$$P(A \mid B) = \frac{P(AB)}{P(B)}.$$


In the spirit of our approach to the dependence and independence of events, we can expect conditional probability to have the following property: if events A and B are independent, then

$$P(A \mid B) = P(A).$$

This means that the information that event B has occurred has no effect on the probability of event A.

And so it is!

If events A and B are independent, then

$$P(AB) = P(A)\,P(B).$$

For independent events A and B we have

$$P(A \mid B) = \frac{P(AB)}{P(B)} = \frac{P(A)\,P(B)}{P(B)} = P(A).$$
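Both experiments described earlier can be checked by direct enumeration of equally probable outcomes. The sketch below is only an illustration; the helper names `probability` and `conditional` are introduced here and compute P(A) and P(A|B) = P(AB)/P(B) by counting.

```python
from fractions import Fraction
from itertools import product


def probability(event, outcomes):
    """P(A) for equally probable outcomes: favorable / total."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def conditional(a, b, outcomes):
    """P(A | B) = P(AB) / P(B); requires P(B) != 0."""
    return probability(lambda o: a(o) and b(o), outcomes) / probability(b, outcomes)


# Experiment 1: a die is rolled in one room, a coin is tossed in the next.
pairs = list(product(range(1, 7), ["heads", "tails"]))


def rolled_two(o):
    return o[0] == 2


def got_tails(o):
    return o[1] == "tails"


# Experiment 2: a single die; A = "a two", B = "an odd number of points".
faces = list(range(1, 7))


def is_two(k):
    return k == 2


def is_odd(k):
    return k % 2 == 1


# Independent events: learning B does not change the probability of A.
print(probability(rolled_two, pairs), conditional(rolled_two, got_tails, pairs))

# Dependent events: learning B changes the probability of A from 1/6 to 0.
print(probability(is_two, faces), conditional(is_two, is_odd, faces))
```

In the first experiment P(A|B) = P(A) = 1/6, so the product rule P(AB) = P(A)P(B) holds; in the second P(A|B) = 0 while P(A) = 1/6, so the events are dependent.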


