Experiments in Certified Digital Audio Processing

Emilio J. Gallego Arias
(joint work with Pierre Jouvelot)

MINES ParisTech, PSL Research University, France

11 Avril 2016 - Specfun - Palaiseau

Real Time Signal Processing
$\cap$
Programming Language Theory
$\cap$
Theorem Proving

FEEVER ANR Project

Audio Objects

Recursive 2nd order Filter

Image Credits: Julius Orion Smith III

Waveguide Resonator

Image Credits: Julius Orion Smith III

Questions?

Implementation ease?
Composability?
Mathematical Semantics?
FP Semantics?
PL Semantics!
Domain practice.
Frequency response.
State Space.
Optimization.

A Possible Roadmap

Julius Orion Smith III audio-focused book's series:

Mathematics of the Discrete Fourier Transform (DFT)

[Partially handled + missing inf. sum.]

Introduction to Digital Filters

[Focus of this talk]

Physical Audio Signal Processing

[Minor bits in this talk + missing ODE]

Spectral Audio Signal Processing

[Future work]

Main Points of the Series:

Free mix of mathematics and a little computation.
Linear algebra is pervasive.
Proofs of uneven difficulty, not constructive-friendly.

Our View

Audio processing:

repeated linear algebra.

Audio programs:

finite approximation of the mathematics.

Topic of interest:

reconcile the algebraic and synchronous worlds.

Where to Look?

DSP & Theorem Proving

Isabelle/HOL[Akbarpour, Tahar et al.]

Focus on fixed systems, numerical issues.

Lucid, Velus[Boulmé, Auger, ...]

Safety Property, ~~some work unpublished?~~

Kahn Networks[Paulin-Mohring]

Domain Theory point of view, setoid-rewriting based.

VeriDrone Project[Ricketts, Machela, Lerner, ...]

Work in progress, from LTL to Lyapunov!

FFT in Coq [Capretta]

Main successor of our current world view.

Likely more work that we are unaware of...

DSP & Programming Languages

Arrows/FRP [Elliot, Hughes, Hudak, Nilsson, ...]

Too abstract, not convenient for sample-based DSP view.

Synchronous Languages [Berry, Caspi, Halbwachs, Pouzet, ...]

Far from a "linear algebra" view, not particularly well suited for DSP, (state of the art: [Guatto]'s PhD).
Data-intensive vs control-intensive require quite different control techniques. [Berry, 2000]

DataFlow Languages [Lee, Thies, ...]

IMHO most successful approach so far, weak on formalization?

String Diagrams [Bonchi, Sobocinski, Zanasi]

Very interesting recent research, covers a limited subset of DSP.

Other Approaches

VOBLA, Ziria, Halide, Darkroom, Julia, Spiral, FeldSpar
Varying degrees of formal semantics.

A First Try: Mini-Faust & Coq

Start from Faust [Orlarey 2002]: functional DSL for DSP based on arrow-like point-free combinators.
Implement in Coq: step-indexing, math-comp.
Define a (sample-based) logic for reasoning about this programs.
Works nicely for easy examples; supports self-composition, etc...

$$ \newcommand{\ip}[2]{\langle{#1},{#2}\rangle} \newcommand{\inferrule}[2]{\frac{#1}{#2}} \newcommand{\fpo}{\varphi} \newcommand{\fpw}{\psi} \newcommand{\fpt}{\theta} \newcommand{\fjud}[3]{\{ #1 \} ~~ {#2} ~~ \{ #3 \}} \fjud{\fpo}{f}{\fpw} \iff \forall t, \fpo(i(t)) \Rightarrow \fpw(f(i)(t)) $$

Problems with the approach:

Theorems in DSP not expressed in point-free combinators.
PL methods awkward for DSP experts.
Doesn't work so well for more complex examples.

https://github.com/ejgallego/mini-faust-coq

Example: Filter Stability

In PL terms:

$$ \fjud{x \in [a,b]}{\mathsf{*(1-c) : + \sim *(c)}}{x \in [a,b]} $$

In DSP terms:

The impulse response decays to 0 as time goes to infinity.

High number of components make PL proof methods impractical.

What now? Wagner

Simple λ-calculus with feedback, variables carry a time index.

Types and Syntax:

$$ \begin{array}{lrll} T & := & R \mid I_n \mid R[n] & \text{Samples, Ordinals, Arrays} \\ & ∣& R@r & \text{rated stream} \\ & ∣& T ⊗ T & \text{product} \\ & ∣& R → T & \text{simple function} \\ & ∣& R@r ⇒_m T & \text{stream processor} \\ \end{array} $$ $$ \newcommand{\R}{\mathbf{R}} \newcommand{\feed}[2]{\text{feed}~{#1} = {#2}} \newcommand{\let}[3]{\text{let}~{#1}={#2}~\text{in}~{#3}} \newcommand{\feedin}[3]{\text{feed}~{#1} = {#2}~\text{in}~{#3}} \begin{array}{lrll} e & := & x \mid c \mid p & \text{Var, Const, Prim} \\ & \mid & \pi_n~e \mid (e, e) & \text{Products} \\ & \mid & λ x.~e \mid (e_f~e_a ) & \text{Regular functions} \\ & \mid & Λ x.~e \mid \{e_1, …, e_n\} & \text{Stream functions} \\ & \mid & \feed{x}{e} \mid e_{\{j\}} & \text{Causal Feedback, Previous} \end{array} $$

Wagner: Operational Semantics

We use a big-step, time-indexed style. Every program reduces to a value at a given time step $n$. Equivalent to synchronous dataflow.

$$ \newcommand{\ev}[3]{{#1} \downarrow_{#2} {#3}} \begin{gather} \inferrule { } {\ev{x}{n}{x}} \qquad \inferrule {\ev{e_1}{n}{v_1} \quad … \quad \ev{e_k}{n}{v_k} \quad P(v_1, …, v_1) ↓ v } {\ev{P(e_1,…,e_k)}{n}{v}} \\[1em] \inferrule {\ev{eₐ}{n}{vₐ} \quad \ev{e_f[x/vₐ]}{n}{v}} {\ev{(λ~x.e_f)~eₐ}{n}{v}} \qquad \inferrule {\ev{eₐ}{n}{vₐ} \quad \ev{e_f[x/vₐ]}{n}{v}} {\ev{(Λ~x.e_f)~eₐ}{n}{v}} \\[1.5em] \inferrule { \ev{e_1}{n}{v_1} ~\ldots~ \ev{e_k}{n}{v_k} } { \ev{ \{e_1, \ldots, e_k \} }{n}{\{v_1, \ldots, v_k \} } } \qquad \inferrule { } {\ev{e_{\{k+1\}}}{0}{0_e}} \qquad \inferrule {\ev{e_{\{k\}}}{n}{v}} {\ev{e_{\{k+1\}}}{n+1}{v}} \\[2em] \inferrule {\ev{e[x/0_e]}{0}{v}} {\ev{\feed{x}{e}}{0}{v}} \qquad \inferrule {\ev{\feed{x}{e}}{n}{v_f} \quad \ev{e[x/v_f]}{n+1}{v}} {\ev{\feed{x}{e}}{n+1}{v}} \end{gather} $$

Examples: DF-I Filter

$$ \mathsf{df1} ≡ λ a b : R[k]. Λ x.~\feed{y}{ \{ b \cdot x + a \cdot y} \} $$

Note the overload of $x \cdot y$! The types of $a$ and $b$ will fix the op.

$$ x \cdot y ≡ \sum_i x[i] ⬝ y[i] $$

Examples: DF-II

$$ \mathsf{df2} ≡ Λ x.~ \feedin{v}{ \{ x + a \cdot v \}}{ \{ b \cdot v \} } $$

Compare with df1:

$$ \mathsf{df1} ≡ Λ x.~\feed{y}{ \{ b \cdot x + a \cdot y} \} $$

Examples: WaveGuide OSC

$$ \begin{array}{rcl} \text{feed}~x & = & C \cdot (G \cdot x + y) - y \\ y & = & C \cdot (G \cdot x + y) + G \cdot x \end{array} $$

or without sugar:

$$ \feed{z}{ \{ (C \cdot (G \cdot z_1 + z_2) - z_2, C \cdot (G \cdot z_1 + z_2) + G \cdot z_1) \} } $$

Typing and Equational Theory

Typing is almost standard: we restrict pre to variables and use coeffect-inspired annotations to keep track of history-access. We also subtype rated streams to arrays:

$$ \begin{gather} \inferrule { } { \Gamma, x :_n T \vdash x_{\{n\}} : T} \qquad \inferrule { \Gamma, x :_n R@r \vdash x : R@r } { \Gamma, x :_n R@r \vdash x : R[rn] } \end{gather} $$

We can recover the general pre as:

$$ \mathbf{pre}~e ≡ (Λ x.~x_{\{1\}})~e $$

Application has to distinguish between positive and negative types to correctly propagate environment annotations. For this talk we use a restricted functional type.

Note: Work in coeffects, focusing, and duality is relevant and moslty subsumes our rules.

Now the fun starts

Lightweight embedding into Coq, use a mildy-typed syntax for a reduced subset (recall that: $ R ⇒ R ⇒ R ≈ R ⊗ R → R $):

Interpreting the Programs:

We want an easy embedding and "executable" embedding, our evaluation relation is not. Basically we are looking for something such as:

$$ ⟦ \_ ⟧ : ∀ t : ty, \mathsf{seq}~ℝ → \mathsf{expr}~t → ℕ → \mathsf{tyI}~t $$

However, what should the value of:

$$ ⟦ Γ ⊢ x : ℝ ⟧_n = ~?? $$

Indeed, we need to equiq the environments with history, thus they become list of lists of values. A first try for the interpretation could be:

$$ \begin{array}{lcl} \mathsf{tyI}(⊗~n) &= & \mathsf{cV[ℝ]_n} \\ \mathsf{tyI}(n_a ⇒ n_r) &= & \mathsf{seq~cV[ℝ]_{n_a} → cV[ℝ]_{n_r}} \\ \end{array} $$

Step-Indexing

However it is not enough, as when interpreting λ we don't have enough history!

$$ ⟦ Γ ⊢ Λ x . e ⟧_n = \mathsf{fun}~x ⇒ ⟦ Γ, x ⊢ e ⟧_n $$

We must use a strong induction principle so the application rule can compute all previous values:

Many more refinements are possible!

Case Study I: Linearity

The first application is to study the set of linear Wagner programs, as a first step towards the theory for linear systems and in particular, the Z-transform.

Adding programs:

What should linearity mean for our programs? The proper definition involves something known as a logical relation:

$$ \begin{array}{lcl} \mathsf{linR}_{⊗ n}(e) &≡& \forall Γ_1, Γ_2.~ ⟦Γ_1 ⊢ e⟧ - ⟦Γ_2 ⊢ e⟧ = ⟦Γ_1 - Γ_2 ⊢ e⟧ \\ \mathsf{linR}_{n1 ⇒ n2}(f) &≡& \forall Γ_1, Γ_2, r_1, r_2.\\ && \quad ⟦Γ_1 ⊢ f⟧ r_1 - ⟦Γ_2 ⊢ f⟧ r_2 = ⟦Γ_1 - Γ_2 ⊢ f⟧ (r_1 - r_2) \end{array} $$

The relation is simple but would also work at higher-types. If we restrict the syntax to linear programs, we every program satisfies the relation!

The proof proceeds first, by induction on the typing derivation, then by induction on the number of steps.

Case Study I: Linearity

What about multi-linearity?

Compare: $$ \begin{array}{l} λ x. Λ y . x \cdot y : R → S ⇒ T \\ Λ x. Λ y . x \cdot y : R ⇒ S ⇒ T \end{array} $$

Work in progress: Relate to the completeness theorem for signal dataflow! [Bonchi, Sobociński, Zanasi 2015/2016] [Baez]

Case Study II: Frequency Domain

Quick Review

The frequency response of a filter = output for the impulse response.
A filter is stable = its Frequency Response is summable.
Stability is implied by the existence of the DTFT.
The DFT approximates the DTFT.
The DTFT is the evaluation of the Z-transform for the unit circle.

Crucial Prerrequisites:

Previous linearity theorem.
A notion of DFT in Coq.

Geometric Signal Theory

The DFT:

$$ X(\omega_k) = \ip{x^T, \omega_k} ~~\text{where}~~ \omega_k = \omega^{k*0},\ldots,\omega^{k*(N-1)} $$

Most properties are an immediate consequency of matrix multiplication lemmas. Again life is easy here.

In matrix form:

Primitive roots:

The constructive algebraic numbers in mathcomp provides us with a primitive root of the unity such that $ω^N = 1$.

In an Ideal World...

... we'd have everything we need in our library.

We are not so far from it.
Took the definitions from classfun.v and proved:

Energy theorems are trivial corollaries. Quite compact development for now.

Transfer Functions

We want to relate programs to their frequency semantics; it is well known that:

$$ H(e^{jωT}) = \text{DTFT}_{ωT}(h) $$

We'd like to relate programs to their transfer function:

Z-interpretation of programs

We tried to build an effective procedure, however it is difficult.

Current approach: Using a relation

Z-interpretation of programs

We can thus prove:

We show then that evalution of the impulse response DFT corresponds to the polynomial plus a residual based on the poles.

Stability is implied if the magnitude of all the poles is less than 1. This also implies the existence of the DTFT.

Implementation: The SSSM Abstract Machine

Open Topics

Multirate processing: [quite a bit working]
State Space analysis, control theory, MIMO systems
Floating Point Interpretation.
Schur-Cohn Stability Test
Verification of transforms optimization. [Püschel]
Computational Z-interpretation:
Connection to more mainstream numerical approximation literature?

Conclusions

Implementation in Ocaml with a core type system. Full type system still not fully settled, interesting desing space.
Same for Coq formalization, theorems are working but more experience is needed to choose some implementation details (for instance, choice of environment representation
Next logical step: COLA/OLA.
We are particularly excited to see what the relation with SPIRAL could be.
Numerical issues are pervasive in this domain, but trying to get there. Library for floating point aritmetic in progress.
Missing an integrated library for true real and complex numbers, recent effort by PY and Assia? Should we port Coquelicot?

Thanks!

Context of the ANR project:

Practice of real time DSP still far from convenient.

Starting point, Faust:

Faust [Orlarey 2002], a functional PL for audio programming.
Abstracts away low-level complexity, efficient execution.
A success, good number of users, high-interest topic. (CCRMA workshops, Kadenze course, etc...)

Faust's Future:

Extend Faust to multirate processing.
Reasoning about audio programs is not easy.

A note on streams.

The beggining of it all:

First exercise, port of Lucid paper to modern Coq/Ssreflect; // broken with coinductives! Dependent types not very scalable; decidability/extensionality of streams: bad.

Move to sequences!

"Inversion views", but better, wf_str is decidable now!

Experiments in Certified Digital Audio Processing

Emilio J. Gallego Arias (joint work with Pierre Jouvelot)

MINES ParisTech, PSL Research University, France

11 Avril 2016 - Specfun - Palaiseau

Audio Objects

Recursive 2nd order Filter

Waveguide Resonator

Questions?

A Possible Roadmap

Main Points of the Series:

Our View

Audio processing:

repeated linear algebra.

Audio programs:

finite approximation of the mathematics.

Topic of interest:

reconcile the algebraic and synchronous worlds.

Where to Look?

DSP & Theorem Proving

Isabelle/HOL[Akbarpour, Tahar et al.]

Lucid, Velus[Boulmé, Auger, ...]

Kahn Networks[Paulin-Mohring]

VeriDrone Project[Ricketts, Machela, Lerner, ...]

FFT in Coq [Capretta]

Likely more work that we are unaware of...

DSP & Programming Languages

Arrows/FRP [Elliot, Hughes, Hudak, Nilsson, ...]

Synchronous Languages [Berry, Caspi, Halbwachs, Pouzet, ...]

DataFlow Languages [Lee, Thies, ...]

String Diagrams [Bonchi, Sobocinski, Zanasi]

Other Approaches

A First Try: Mini-Faust & Coq

Problems with the approach:

Example: Filter Stability

In PL terms:

In DSP terms:

What now? Wagner

Types and Syntax:

Wagner: Operational Semantics

Examples: DF-I Filter

Examples: DF-II

Examples: WaveGuide OSC

Typing and Equational Theory

Now the fun starts

Interpreting the Programs:

Step-Indexing

Case Study I: Linearity

Adding programs:

Case Study I: Linearity

Case Study II: Frequency Domain

Quick Review

Crucial Prerrequisites:

Geometric Signal Theory

The DFT:

In matrix form:

Primitive roots:

In an Ideal World...

... we'd have everything we need in our library.

Transfer Functions

Z-interpretation of programs

Current approach: Using a relation

Z-interpretation of programs

Implementation: The SSSM Abstract Machine

Open Topics

Conclusions

Thanks!

Thanks!

Context of the ANR project:

Starting point, Faust:

Faust's Future:

A note on streams.

The beggining of it all:

Emilio J. Gallego Arias
(joint work with Pierre Jouvelot)