Motivation
==========
Many physical phenomena (and financial ones) can be modelled as a
`stochastic process

Now let's look at what happens when we use the Stratonovich convention (using the :math:`\circ` operator to denote it) with :math:`s_j = \frac{t_j + t_{j+1}}{2}`:

.. math::

    &\int_0^t W(s) \circ dW(s) \\
    &= \lim_{||\Pi|| \to 0} \sum_{j=0}^{n-1} W(s_j)[W(t_{j+1}) - W(t_j)] \\
    &= \lim_{||\Pi|| \to 0} \sum_{j=0}^{n-1} \big[W(s_j)W(t_{j+1}) - W(s_j)W(t_j) + W(t_j)W(s_j) - W(t_j)W(s_j) \\
    &\qquad + W(t_j)^2 - W(t_j)^2 + W(s_j)^2 - W(s_j)^2 \big] \\
    &= \lim_{||\Pi|| \to 0} \sum_{j=0}^{n-1} \big[W(t_j)(W(s_j) - W(t_j)) + W(s_j)(W(t_{j+1}) - W(s_j)) \big] \\
    &\qquad + \sum_{j=0}^{n-1}\big[ W(s_j) - W(t_j) \big]^2 \\
    &= \int_0^t W(s) dW(s) + \lim_{||\Pi|| \to 0} \sum_{j=0}^{n-1}\big[ W(s_j) - W(t_j) \big]^2 && \text{Itô integral with partitions } t_0, s_0, t_1, s_1, \ldots \\
    &= \frac{W(t)^2}{2} - \frac{t}{2} + \lim_{||\Pi|| \to 0} \sum_{j=0}^{n-1}\big[ W(s_j) - W(t_j) \big]^2 && \text{Equation 3.9} \\
    &= \frac{W(t)^2}{2} - \frac{t}{2} + \frac{t}{2} && \text{Half-sample quadratic variation} \\
    &= \frac{W(t)^2}{2} \\
    \tag{3.10}

We use the fact that the half-sample quadratic variation is equal to :math:`\frac{t}{2}`, which can be shown with a proof similar to that of Theorem 1. What we see here is that the Stratonovich integral follows the regular rules of calculus more closely, which is why it's used in certain domains. In many domains such as finance, however, it is not appropriate. This is because the integrand represents a decision we make for the time interval :math:`[t_j, t_{j+1}]`, such as a position in an asset, and we have to make that decision *before* the interval starts, not midway through it. Using the Stratonovich convention would be analogous to deciding in the middle of the day, after the price has already gone up, that we should have bought more of the stock at the start of the day.
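To make the difference between the two conventions concrete, here is a small simulation sketch (not from the original text; it uses only the Python standard library and hypothetical names) that approximates :math:`\int_0^t W \, dW` along a single sample path, evaluating the integrand at the left endpoint :math:`t_j` (Itô) and at the midpoint :math:`s_j` (Stratonovich). The path is sampled on the refined grid :math:`t_0, s_0, t_1, s_1, \ldots` used in the derivation of Equation 3.10.

```python
import random

def simulate_both_integrals(t=1.0, n=100_000, seed=42):
    """Approximate int_0^t W dW as a Riemann sum under the Ito
    convention (integrand at the left endpoint t_j) and under the
    Stratonovich convention (integrand at the midpoint s_j).

    The Wiener path is sampled on the refined grid
    t_0, s_0, t_1, s_1, ... so that W(s_j) is available directly.
    """
    rng = random.Random(seed)
    h = t / (2 * n)  # half-step: s_j - t_j
    w = [0.0]
    for _ in range(2 * n):
        # Wiener increments over a step of length h have variance h
        w.append(w[-1] + rng.gauss(0.0, h ** 0.5))
    # Even indices are the t_j's, odd indices are the midpoints s_j
    ito = sum(w[2 * j] * (w[2 * j + 2] - w[2 * j]) for j in range(n))
    strat = sum(w[2 * j + 1] * (w[2 * j + 2] - w[2 * j]) for j in range(n))
    return w[-1], ito, strat
```

For a fine partition, the Itô sum lands near :math:`\frac{W(t)^2}{2} - \frac{t}{2}` (Equation 3.9) while the Stratonovich sum lands near :math:`\frac{W(t)^2}{2}` (Equation 3.10), the extra :math:`\frac{t}{2}` coming from the half-sample quadratic variation.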
Quadratic Variation of Stochastic Integrals with Brownian Motion
****************************************************************

Let's look at the quadratic variation (or sum of squared incremental differences) along a particular path for the stochastic integral we just defined above, and a related property. Note: the "output" of the stochastic integral is a stochastic process.

.. admonition:: **Theorem 3**

    *The quadratic variation accumulated up to time* :math:`t` *by the Itô integral with the Wiener process* (*denoted by* :math:`I`) *from Equation 3.2 is*:

    .. math::

        [I, I](t) = \int_0^t H^2(s) ds \tag{3.11}

.. admonition:: **Theorem 4 (Itô isometry)**

    *The Itô integral with the Wiener process from Equation 3.2 satisfies*:

    .. math::

        Var(I(t)) = E[I^2(t)] = E\big[\int_0^t H^2(s) ds\big] \tag{3.12}

A couple of things to notice. First, the quadratic variation is "scaled" by the underlying integrand :math:`H(t)`, as opposed to accumulating quadratic variation at one unit per unit of time as the Wiener process does. Second, we start to see the difference between the path-dependent quantity of quadratic variation and variance. The former depends on the path taken by :math:`H(t)` up to time :math:`t`: if :math:`H(t)` takes large values, the quadratic variation will be large, and similarly small with small values. Variance, on the other hand, is a fixed quantity up to time :math:`t` that is averaged over all paths and does not change (given the underlying distribution).

Finally, let's gain some intuition about the quadratic variation by utilizing the informal differential notation from Equations 2.26-2.28. We can re-write our stochastic integral from Equation 3.2:

.. math::

    I(t) = \int_0^t H(s) dW(s) \tag{3.13}

as:

.. math::

    dI(t) = H(t)dW(t) \tag{3.14}

Equation 3.13 is the *integral form* while Equation 3.14 is the *differential form*, and they have identical meaning. The differential form is a bit easier to understand intuitively.
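As an illustrative sanity check of Theorem 4 (not from the original text; names and parameters are made up for the sketch), we can estimate both sides of Equation 3.12 by Monte Carlo for the particular choice :math:`H(t) = W(t)`, for which :math:`E\big[\int_0^t W^2(s) ds\big] = \int_0^t s \, ds = \frac{t^2}{2}`:

```python
import random

def ito_isometry_mc(t=1.0, n_steps=100, n_paths=10_000, seed=0):
    """Monte Carlo estimate of both sides of the Ito isometry
    (Equation 3.12) for the choice H(t) = W(t):

        E[I(t)^2]  vs.  E[int_0^t W^2(s) ds]   (both equal t^2/2)
    """
    rng = random.Random(seed)
    dt = t / n_steps
    lhs = rhs = 0.0
    for _ in range(n_paths):
        w = 0.0
        ito_sum = 0.0  # Ito sum: integrand evaluated at the left endpoint
        h2_area = 0.0  # pathwise Riemann sum for int_0^t H^2(s) ds
        for _ in range(n_steps):
            dw = rng.gauss(0.0, dt ** 0.5)
            ito_sum += w * dw
            h2_area += w * w * dt
            w += dw
        lhs += ito_sum ** 2
        rhs += h2_area
    return lhs / n_paths, rhs / n_paths
```

Note that the left-hand side averages :math:`I(t)^2` over paths (a variance, since :math:`E[I(t)] = 0`), while the right-hand side averages the *pathwise* quantity :math:`\int_0^t H^2(s) ds`, which by Theorem 3 is exactly the quadratic variation :math:`[I, I](t)` of each path.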
We can see that it matches the approximation (Equation 3.4) that we discussed in the previous subsection. Using this differential notation and the informal notation we defined above in Equations 2.26-2.28, we can "calculate" the quadratic variation as:

.. math::

    dI(t)dI(t) = H^2(t)dW(t)dW(t) = H^2(t)dt \tag{3.15}

using the fact that the quadratic variation of the Wiener process accumulates at one unit per unit of time (:math:`dW(t)dW(t) = dt`) from Theorem 1. We'll utilize this differential notation more in the following subsections as we move into stochastic differential equations.

Itô Processes and Integrals
---------------------------

In the previous subsections, we only allowed integrators that were Wiener processes, but we'd like to extend that to a more general class of stochastic processes called Itô processes [2]_:

    Let :math:`W(t)`, :math:`t \geq 0`, be a Wiener process with an associated filtration :math:`\mathcal{F}(t)`. An **Itô process** is a stochastic process of the form:

    .. math::

        X(t) = X(0) + \int_0^t \mu(s) ds + \int_0^t \sigma(s) dW(s) \tag{3.16}

    where :math:`X(0)` is nonrandom and :math:`\mu(s)` and :math:`\sigma(s)` are adapted stochastic processes.

Equation 3.16 can also be written in its more natural (informal) differential form:

.. math::

    dX(t) = \mu(t)dt + \sigma(t)dW(t) \tag{3.17}

A large class of stochastic processes are Itô processes. In fact, any square-integrable stochastic process that is measurable with respect to a filtration generated by a Wiener process can be represented by Equation 3.16 (see the `martingale representation theorem