$\newcommand{\ds}{\displaystyle}$ $\newcommand{\Rr}{\mathbb{R}}$ $\newcommand{\Qq}{\mathbb{Q}}$ $\newcommand{\Nn}{\mathbb{N}}$ $\newcommand{\lint}{\underline{\int}}$ $\newcommand{\uint}{\overline{\int}}$ $\newcommand{\sgn}{\mathrm{sgn}}$ $\newcommand{\ve}{\varepsilon}$ $\newcommand{\norm}[1]{\left\| #1 \right\|}$ $\newcommand{\mbf}{\mathbf}$ $\newcommand{\vx}{\mbf{x}}$ $\newcommand{\ve}{\mbf{e}}$ $\newcommand{\vh}{\mbf{h}}$ $\newcommand{\va}{\mbf{a}}$

Inverse and Implicit Functions

A natural question about continuity (differentiability) that we have not yet addressed concerns inverse functions.

We start by investigating the single-variable case.

Let $I$ be an open interval. Suppose $f \colon I \to \Rr$ is 1-to-1 and continuous (differentiable). Let $J = f(I)$.

When will $f^{-1} \colon J \to I$ be continuous (differentiable)?

Continuity

Note first that differentiability of $f$ on $I$ does not guarantee that $J:=f(I)$ is open.

For example, $x^2$ maps $(-1,1)$ to $[0,1)$ which is not open.

However, we have

Proposition 1. An injective continuous function $f$ on an interval $I$ must be strictly monotonic.

Proof. If not, then there exist three points $a < b < c$ in $I$ such that $f(b)$ is not between $f(a)$ and $f(c)$. In other words, either $f(a)$ is between $f(b)$ and $f(c)$, or $f(c)$ is between $f(a)$ and $f(b)$. By the intermediate value theorem, in the first case there is some $a'$ with $b < a' < c$ such that $f(a')=f(a)$, and in the second case there is some $c'$ with $a < c' < b$ such that $f(c')=f(c)$. Either way, this contradicts the injectivity of $f$.

Corollary 2. Suppose $f$ is an injective continuous function on an open interval $I$. Then $f(I)$ is an open interval.

Proof. By Proposition 1, $f$ is strictly monotonic. Since $f$ is strictly monotonic and $I$ is open, it follows that $\inf f(I) \notin f(I)$. Likewise, $\sup f(I) \notin f(I)$. Thus, it follows from the IVT that $f(I) = (\inf f(I), \sup f(I))$ (here the inf and the sup are taken as extended reals).

Proposition 3. (3.6.6) If $f$ is strictly monotone on an interval $I$, then $f^{-1} \colon f(I) \to I$ is continuous.

Note that we do not assume $f$ is continuous on $I$, nor do we assume $I$ is open. See Example 3.6.7.

Theorem 4. If $f$ is an injective continuous function on an interval $I$, then its inverse $f^{-1} \colon f(I) \to I$ is continuous.

Proof. Follows immediately from Proposition 1 and Proposition 3.

Differentiability

Suppose $f \colon I \to J$ is a differentiable bijection between intervals. Then, unlike continuity, differentiability need not pass to the inverse: $f^{-1}$ need not be differentiable on $J$.

To see why, suppose the inverse function $f^{-1} \colon J \to I$ is differentiable. Since $x = (f^{-1}\circ f)(x)$ for all $x \in I$, the Chain Rule gives

$$ 1 = (x)' = (f^{-1}\circ f)'(x) = (f^{-1})'(f(x))f'(x)$$

But that forces $f'(x) \neq 0$ for every $x \in I$.

Here is a concrete example: $f(x) = x^3$ is a differentiable bijection from $(-1,1)$ to itself with $f'(0) = 0$, but $f^{-1}(x) = \sqrt[3]{x}$ is not differentiable at $x=0$.
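The failure of differentiability of the cube root at $0$ can be seen numerically. The following is a minimal sketch, not part of the original notes: the helper names `cbrt` and `difference_quotient` are made up for illustration. It prints the difference quotients $(\sqrt[3]{h} - 0)/h = h^{-2/3}$ for shrinking $h$, which grow without bound instead of approaching a limit.

```python
import math

def cbrt(y: float) -> float:
    """Real cube root, also valid for negative inputs (illustrative helper)."""
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

def difference_quotient(g, y0: float, h: float) -> float:
    """Forward difference quotient (g(y0 + h) - g(y0)) / h."""
    return (g(y0 + h) - g(y0)) / h

# At y0 = 0 the quotient equals h**(-2/3), so it blows up as h -> 0.
quotients = [difference_quotient(cbrt, 0.0, 10.0 ** (-k)) for k in range(1, 7)]
for k, q in zip(range(1, 7), quotients):
    print(f"h = 1e-{k}: quotient = {q:.3e}")
```

Each tenfold decrease in $h$ multiplies the quotient by $10^{2/3} \approx 4.64$, so no finite derivative exists at $0$.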

However, the vanishing of the derivative is the only obstacle: if we require $f'(x_0) \neq 0$, then $f^{-1}$ is differentiable at $f(x_0)$.

Proposition 4 (cf. Lemma 4.4.1)

Let $I,J \subset \Rr$ be intervals. Suppose $f \colon I \to J$ is a continuous bijection (i.e. $f$ is injective and $f(I) = J$) that is differentiable at $x_0 \in I$ with $f'(x_0) \neq 0$. Then $f^{-1}$, the inverse of $f$, is differentiable at $y_0:=f(x_0)$ and

$$ (f^{-1})'(y_0) = \frac{1}{f'(f^{-1}(y_0))} = \frac{1}{f'(x_0)}. $$

If $f$ is (continuously) differentiable and $f'$ is never zero on $I$, then $f^{-1}$ is (continuously) differentiable.

Proof. Under the assumptions, $f$ has a continuous inverse (Theorem 4); call it $g \colon J \to I$ for convenience. Since the functions $f$ and $g$ are inverse to each other, for any $x \in I$ and $y \in J$, $y = f(x)$ means the same as $x = g(y)$.

By differentiability of $f$ at $x_0$, there exists a function $\varphi(x)$ continuous at $x_0$ such that $\varphi(x_0) = f'(x_0)$ and that for $x \in I$,

$$ f(x) - f(x_0) = \varphi(x)(x-x_0). $$

Thus, for $y \in J$,

$$ y - y_0 = \varphi(g(y))(g(y)-g(y_0)). $$

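By continuity of $g$ at $y_0$ and of $\varphi$ at $x_0 = g(y_0)$, the composition $\varphi(g(y))$ is continuous at $y_0$. Moreover, $\varphi(g(y)) \neq 0$ for every $y \in J$: at $y_0$ we have $\varphi(g(y_0)) = \varphi(x_0) = f'(x_0) \neq 0$ by assumption, and for $y \neq y_0$ the left-hand side of the previous identity is nonzero. Hence the function $\psi(y) = 1/\varphi(g(y))$ is defined on $J$ and continuous at $y_0$. Therefore,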

$$ g(y) - g(y_0) = \psi(y)(y-y_0) $$

and so $g$ is differentiable at $y_0$, moreover,

$$ g'(y_0) = \psi(y_0) = \frac{1}{\varphi(g(y_0))} = \frac{1}{\varphi(x_0)} = \frac{1}{f'(x_0)} = \frac{1}{f'(g(y_0))}. $$

If $f'(x) \neq 0$ on $I$, then the argument above applies for all $x \in I$, hence $f^{-1}$ is differentiable on $J$. If in addition $f'$ is continuous on $I$, then as $g'(y) = 1/(f'\circ g(y))$ and $g$ is differentiable (hence continuous), we conclude that $g'$ is also continuous on $J$.
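The formula $(f^{-1})'(y_0) = 1/f'(x_0)$ can be sanity-checked numerically. The sketch below is only an illustration under assumptions that are not in the notes: the sample function $f(x) = x^3 + x$ (strictly increasing, since $f'(x) = 3x^2 + 1 > 0$), the bracket $[-2, 2]$, and the helper `inverse_by_bisection` are all chosen here for convenience. It inverts $f$ by bisection and compares a central difference quotient of the inverse with $1/f'(x_0)$.

```python
def f(x: float) -> float:
    # Sample strictly increasing function (an assumption for this sketch).
    return x ** 3 + x

def f_prime(x: float) -> float:
    return 3.0 * x ** 2 + 1.0

def inverse_by_bisection(func, y: float, lo: float, hi: float, iters: int = 200) -> float:
    """Solve func(x) = y on [lo, hi], assuming func is continuous and strictly increasing."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def g(y: float) -> float:
    """Numerical inverse of f on the bracket [-2, 2]."""
    return inverse_by_bisection(f, y, -2.0, 2.0)

x0 = 1.0
y0 = f(x0)
h = 1e-5
numeric = (g(y0 + h) - g(y0 - h)) / (2.0 * h)  # central difference for g'(y0)
predicted = 1.0 / f_prime(x0)                  # value given by Proposition 4
print(numeric, predicted)
```

Bisection is used here because it needs only continuity and monotonicity, which are exactly the properties the proof relies on for the existence of a continuous inverse.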

Now we can give a local criterion for the existence of a differentiable inverse.

Note first that just the fact that $f'(x_0) \neq 0$ is not enough. For example, the function

\begin{align*} f(x) = \begin{cases} x + x^2\sin(1/x) & x \neq 0 \\ 0 & x = 0 \end{cases} \end{align*}

has $f'(0) = 1$ but $f$ is not even 1-to-1 on any neighborhood of $0$.
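One way to see the failure of injectivity is that $f'$ takes both signs arbitrarily close to $0$, so $f$ is not monotonic on any neighborhood of $0$ and hence, by Proposition 1, not injective there. The following sketch checks this numerically. The sample points are a choice made here, not taken from the notes: at $1/x = 2\pi k - 1/(\pi k)$ the derivative $f'(x) = 1 + 2x\sin(1/x) - \cos(1/x)$ is slightly negative, while at $1/x = 2\pi k + \pi$ it is close to $2$.

```python
import math

def f_prime(x: float) -> float:
    """Derivative of x + x**2 * sin(1/x) for x != 0."""
    return 1.0 + 2.0 * x * math.sin(1.0 / x) - math.cos(1.0 / x)

def negative_slope_point(k: int) -> float:
    """Point x > 0 with 1/x = 2*pi*k - 1/(pi*k) (hypothetical sample point)."""
    return 1.0 / (2.0 * math.pi * k - 1.0 / (math.pi * k))

def positive_slope_point(k: int) -> float:
    """Point x > 0 with 1/x = 2*pi*k + pi, where cos(1/x) = -1."""
    return 1.0 / (2.0 * math.pi * k + math.pi)

for k in (1, 10, 100):
    xn, xp = negative_slope_point(k), positive_slope_point(k)
    print(f"k={k}: f'({xn:.5f}) = {f_prime(xn):.3e}, f'({xp:.5f}) = {f_prime(xp):.3f}")
```

Both families of points tend to $0$ as $k \to \infty$, so every neighborhood of $0$ contains points where $f$ is increasing and points where it is decreasing.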

However, if in addition we require $f'$ to be continuous at $x_0$, then that is enough.

Theorem 5. (Single Variable Inverse Function Theorem)

Let $f \colon (a,b) \to \Rr$ be $C^1$, $x_0 \in (a,b)$ a point where $f'(x_0) \neq 0$. Then there exists an open interval $I \subseteq (a,b)$ containing $x_0$ such that $f\mid_I$ is injective with a $C^1$ inverse $g \colon J:=f(I) \to I$ and

$$ g'(y) = \frac{1}{f'(g(y))} \qquad \forall y \in J. $$

Proof. As $f'$ is continuous and $f'(x_0) \neq 0$, there is an open interval $I \subseteq (a,b)$ containing $x_0$ on which $f'$ is nonzero and has constant sign. So $f$ is strictly monotonic on $I$, and hence the restriction $f\mid_I$ is a bijection from $I$ onto $J:=f(I)$. Also, as $f$ is continuous, it is a consequence of the IVT that $J$ is an interval. Now the theorem follows from Proposition 4.

As an application, one can establish (see Corollary 4.4.3) the existence and uniqueness of the $n$-th root $(n \ge 1)$ of any nonnegative real number $x$.

Read Examples 4.2 and 4.3. They show that $I$ can be strictly smaller than $(a,b)$ and that the assumption $f'(x_0) \neq 0$ is necessary for the conclusion that the inverse of $f$ is differentiable at $y_0 = f(x_0)$.

Here we state the inverse function theorem for functions of several variables. Its proof is considerably more sophisticated. I encourage you to consult Spivak's classic book Calculus on Manifolds for a nice proof.

Theorem 6. (Inverse Function Theorem, Spivak 2-11)

Suppose that $f \colon \Rr^n \to \Rr^n$ is continuously differentiable in an open set containing $a$, and $\det Df(a) \neq 0$. Then there is an open set $V$ containing $a$ and an open set $W$ containing $f(a)$ such that $f \colon V \to W$ has a continuous inverse $f^{-1} \colon W \to V$ which is differentiable and for all $y \in W$ satisfies

$$ Df^{-1}(y) = Df(f^{-1}(y))^{-1}. $$
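The identity $Df^{-1}(y) = Df(f^{-1}(y))^{-1}$ can be checked numerically on a concrete map. The sketch below is an illustration under assumptions that do not come from the notes: the sample map $f(x,y) = (e^x\cos y, e^x\sin y)$, its explicit local inverse on $-\pi < y < \pi$, the base point $(0.3, 0.7)$, and all helper names are chosen here. It compares a finite-difference Jacobian of the local inverse with the inverse of the Jacobian of $f$.

```python
import math

def f(x, y):
    # Sample map R^2 -> R^2 with det Df = exp(2x) != 0 everywhere.
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

def jacobian_f(x, y):
    ex = math.exp(x)
    return [[ex * math.cos(y), -ex * math.sin(y)],
            [ex * math.sin(y), ex * math.cos(y)]]

def local_inverse(u, v):
    """Inverse of f restricted to the strip -pi < y < pi."""
    return (0.5 * math.log(u * u + v * v), math.atan2(v, u))

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def numeric_jacobian(func, u, v, h=1e-6):
    """Central-difference Jacobian of a map R^2 -> R^2 at (u, v)."""
    up, um = func(u + h, v), func(u - h, v)
    vp, vm = func(u, v + h), func(u, v - h)
    return [[(up[0] - um[0]) / (2 * h), (vp[0] - vm[0]) / (2 * h)],
            [(up[1] - um[1]) / (2 * h), (vp[1] - vm[1]) / (2 * h)]]

a = (0.3, 0.7)
w = f(*a)
lhs = numeric_jacobian(local_inverse, *w)          # approximates D(f^{-1})(w)
rhs = inverse_2x2(jacobian_f(*local_inverse(*w)))  # Df(f^{-1}(w))^{-1}
max_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_error)
```

This sample map also shows that the theorem is genuinely local: $f(x, y + 2\pi) = f(x, y)$, so $f$ is not injective on all of $\Rr^2$ even though $\det Df$ never vanishes.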

As an application, we use the inverse function theorem to deduce the implicit function theorem.

Implicit Function Theorem

Consider the function $f \colon \Rr^2 \to \Rr$ defined by $f(x,y)=x^2+y^2-1$.

Pick a point $(a,b)$ on the level set $f(x,y)=0$ (the unit circle).

If $a \neq \pm 1$, then there are open intervals $A$ containing $a$ and $B$ containing $b$ with the following property: if $x \in A$, there is a unique $y \in B$ such that $f(x,y) = 0$.

In other words, there is a function $g \colon A \to B$ such that $f(x,g(x)) = 0$ on $A$.

If we require $g(a) = b$, then $g(x) = \sqrt{1-x^2}$ if $b > 0$ and $g(x) = -\sqrt{1-x^2}$ if $b < 0$.

We say that these functions are implicitly defined by the equation $f(x,y)=0$.

If $a = \pm 1$, then it is impossible to find any such function $g(x)$ defined on an open interval containing $a$.

A simple criterion for deciding when, in general, such a function exists is provided by the implicit function theorem.

Theorem 7 (Single Variable Implicit Function Theorem)

Suppose $f \colon \Rr^2 \to \Rr$ is continuously differentiable in an open set containing $(a,b)$ and $f(a,b) = 0$. If

$$ \frac{\partial f}{\partial y}(a,b) \neq 0$$

then there exist an open neighborhood $A$ of $a$ and an open neighborhood $B$ of $b$ with the following property:

for each $x \in A$, there is a unique $g(x) \in B$ such that $f(x,g(x)) = 0$.

Moreover, the function $g(x)$ is differentiable.

Proof. Let $F \colon \Rr^2 \to \Rr^2$ be the function defined by

$$ F(x,y) = (x,f(x,y)) $$

Then $$ \det DF(a,b) = \det \begin{bmatrix} 1 & 0 \\ \frac{\partial f}{\partial x}(a,b) & \frac{\partial f}{\partial y}(a,b) \end{bmatrix} = \frac{\partial f}{\partial y}(a,b) \neq 0 $$

By the inverse function theorem, there is an open set $W$ in $\Rr^2$ containing $F(a,b) = (a,0)$ and an open set in $\Rr^2$ containing $(a,b)$ of the form $A \times B$ (because open sets in $\Rr^n$ are unions of products of open intervals), such that

$$ F \colon A \times B \to W$$

has a differentiable inverse $H \colon W \to A \times B$. So, $H(x,y) = (h(x,y), k(x,y))$ for some differentiable functions $h$ and $k$. Since $H$ is the inverse of $F(x,y) = (x,f(x,y))$, which leaves the first coordinate unchanged, we must have $h(x,y) = x$.

Let $\pi \colon \Rr^2 \to \Rr$ be the projection $\pi(x,y) = y$. Then $\pi \circ F = f$.

Therefore,

\begin{align*} f(x,k(x,y)) & = f\circ H(x,y) = (\pi \circ F) \circ H(x,y) \\ &=\pi \circ (F \circ H)(x,y) = \pi(x,y) = y. \end{align*}

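Thus, $f(x,k(x,0))=0$ whenever $(x,0) \in W$. Shrinking $A$ if necessary so that $(x,0) \in W$ for all $x \in A$, we can take $g(x) = k(x,0)$, which is differentiable because $k$ is. Uniqueness of $g(x)$ in $B$ follows from the injectivity of $F$ on $A \times B$.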
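For the unit circle example, the implicit function can also be computed numerically without knowing the closed form. The sketch below is only an illustration: the base point $(a,b) = (0.6, 0.8)$, the use of Newton's method in the $y$ variable, and the helper `implicit_g` are assumptions made here, not part of the proof. It also checks the slope formula $g'(x) = -\frac{\partial f/\partial x}{\partial f/\partial y}$, which is not stated above but follows from differentiating $f(x,g(x)) = 0$ with the chain rule.

```python
import math

def f(x, y):
    return x * x + y * y - 1.0

def f_x(x, y):
    return 2.0 * x

def f_y(x, y):
    return 2.0 * y

def implicit_g(x, y_start, iters=50):
    """Solve f(x, y) = 0 for y by Newton's method in y, starting from y_start.

    Assumes f_y stays nonzero along the iteration, as it does near (a, b) here.
    """
    y = y_start
    for _ in range(iters):
        y -= f(x, y) / f_y(x, y)
    return y

a, b = 0.6, 0.8  # a point on the unit circle with f_y(a, b) != 0
xs = [a + 0.01 * k for k in range(-5, 6)]
errors = [abs(implicit_g(x, b) - math.sqrt(1.0 - x * x)) for x in xs]

h = 1e-6
slope_numeric = (implicit_g(a + h, b) - implicit_g(a - h, b)) / (2.0 * h)
slope_formula = -f_x(a, b) / f_y(a, b)  # chain-rule formula for g'(a)
print(max(errors), slope_numeric, slope_formula)
```

Starting Newton's method at $b$ selects the branch of the circle through $(a,b)$, which mirrors the role of the neighborhood $B$ in the theorem.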