Experiments in Certified Digital Audio Processing
    
    Emilio J. Gallego Arias 
      (joint work with Pierre Jouvelot)
    
    MINES ParisTech, PSL Research University, France
     
    11 Avril 2016 - Specfun - Palaiseau 
  
  
    Real Time Signal Processing 
    $\cap$                      
    Programming Language Theory 
    $\cap$                      
    Theorem Proving             
 
    
      FEEVER ANR Project
    
  
  Audio Objects
  
  
    Recursive 2nd order Filter
     
    Image Credits: Julius Orion Smith III
   
  
    Waveguide Resonator
     
    Image Credits: Julius Orion Smith III
   
   
  
    Questions?
    
      - Implementation ease?
- Composability?
- Mathematical Semantics?
- FP Semantics?
- PL Semantics!
- Domain practice.
- Frequency response.
- State Space.
- Optimization.
 
  A Possible Roadmap
  
    Julius Orion Smith III audio-focused book's series:
  
  
    - Mathematics of the Discrete Fourier Transform (DFT)[Partially handled + missing inf. sum.]
- Introduction to Digital Filters[Focus of this talk]
- Physical Audio Signal Processing[Minor bits in this talk + missing ODE]
- Spectral Audio Signal Processing[Future work]
  Main Points of the Series:
  
    - 
      Free mix of mathematics and a little computation.
    
- 
      Linear algebra is pervasive.
    
- 
      Proofs of uneven difficulty, not constructive-friendly.
    
 
  Our View
  
  Audio processing: 
  repeated linear algebra.
  
  Audio programs: 
  finite approximation of the mathematics.
  
  Topic of interest: 
  reconcile the algebraic and synchronous worlds.
  DSP & Theorem Proving
  
    Isabelle/HOL[Akbarpour, Tahar et al.]
    
      Focus on fixed systems, numerical issues.
    
   
  
    Lucid, Velus[Boulmé, Auger, ...]
    
      Safety Property, some work unpublished?
    
   
  
    Kahn Networks[Paulin-Mohring]
    
      Domain Theory point of view, setoid-rewriting based.
    
   
  
    VeriDrone Project[Ricketts, Machela, Lerner, ...]
    Work in progress, from LTL to Lyapunov!
   
  
    FFT in Coq [Capretta]
    
      Main successor of our current world view.
    
    Likely more work that we are unaware of...
   
  DSP & Programming Languages
  
    Arrows/FRP [Elliot, Hughes, Hudak, Nilsson, ...]
    Too abstract, not convenient for sample-based DSP view.
   
  
    Synchronous Languages [Berry, Caspi, Halbwachs, Pouzet, ...]
    
      Far from a "linear algebra" view, not particularly well suited
      for DSP, (state of the art: [Guatto]'s PhD). 
      
        Data-intensive vs control-intensive require quite
        different control techniques. [Berry, 2000]
      
    
   
  
    DataFlow Languages [Lee, Thies, ...]
    
      IMHO most successful approach so far, weak on formalization?
    
   
  
    String Diagrams [Bonchi, Sobocinski, Zanasi]
    
      Very interesting recent research, covers a limited subset of DSP.
    
   
  
    Other Approaches
    
      VOBLA, Ziria, Halide, Darkroom, Julia, Spiral, FeldSpar
      
      Varying degrees of formal semantics.
    
   
  A First Try: Mini-Faust & Coq
  
    - 
      Start from Faust [Orlarey 2002]: functional DSL for DSP based on
      arrow-like point-free combinators.
    
- Implement in Coq: step-indexing, math-comp.
- Define a (sample-based) logic for reasoning about this programs.
- Works nicely for easy examples; supports self-composition, etc...
$$
  \newcommand{\ip}[2]{\langle{#1},{#2}\rangle}
  \newcommand{\inferrule}[2]{\frac{#1}{#2}}
  \newcommand{\fpo}{\varphi}
  \newcommand{\fpw}{\psi}
  \newcommand{\fpt}{\theta}
  \newcommand{\fjud}[3]{\{ #1 \} ~~ {#2} ~~ \{ #3 \}}
  \fjud{\fpo}{f}{\fpw} \iff \forall t, \fpo(i(t)) \Rightarrow \fpw(f(i)(t))
  $$
  Problems with the approach:
  
    - Theorems in DSP not expressed in point-free combinators.
- PL methods awkward for DSP experts.
- Doesn't work so well for more complex examples.
 
  
  Example: Filter Stability
  In PL terms:
  $$ \fjud{x \in [a,b]}{\mathsf{*(1-c) : + \sim *(c)}}{x \in [a,b]} $$
  
  
    In DSP terms:
    
      The impulse response decays to 0 as time goes to infinity.
    
    
      High number of components make PL proof methods impractical.
    
   
  What now? Wagner
  Simple λ-calculus with feedback, variables carry a time index.
  Types and Syntax:
  
    $$
    \begin{array}{lrll}
    T & := & R \mid I_n \mid R[n] & \text{Samples, Ordinals, Arrays} \\
        & ∣& R@r               & \text{rated stream}                   \\
        & ∣& T ⊗ T             & \text{product}                        \\
        & ∣& R   → T           & \text{simple function}                \\
        & ∣& R@r ⇒_m T         & \text{stream processor}               \\
    \end{array}
    $$
    $$
    \newcommand{\R}{\mathbf{R}}
    \newcommand{\feed}[2]{\text{feed}~{#1} = {#2}}
    \newcommand{\let}[3]{\text{let}~{#1}={#2}~\text{in}~{#3}}
    \newcommand{\feedin}[3]{\text{feed}~{#1} = {#2}~\text{in}~{#3}}
    \begin{array}{lrll}
    e &   := & x \mid c \mid p     & \text{Var, Const, Prim}    \\
      & \mid & \pi_n~e \mid (e, e)         & \text{Products}            \\
      & \mid & λ x.~e \mid (e_f~e_a     )  & \text{Regular functions}   \\
      & \mid & Λ x.~e \mid \{e_1, …, e_n\} & \text{Stream functions}    \\
      & \mid & \feed{x}{e} \mid e_{\{j\}}  & \text{Causal Feedback, Previous}
    \end{array}
    $$
  
  Wagner: Operational Semantics
  
    We use a big-step, time-indexed style. Every program reduces to a
    value at a given time step $n$. Equivalent to synchronous dataflow.
  
  
  $$
  \newcommand{\ev}[3]{{#1} \downarrow_{#2} {#3}}
  \begin{gather}
    \inferrule
    { }
    {\ev{x}{n}{x}}
    \qquad
    \inferrule
    {\ev{e_1}{n}{v_1} \quad … \quad \ev{e_k}{n}{v_k} \quad P(v_1, …, v_1) ↓ v }
    {\ev{P(e_1,…,e_k)}{n}{v}}
    \\[1em]
    \inferrule
    {\ev{eₐ}{n}{vₐ} \quad \ev{e_f[x/vₐ]}{n}{v}}
    {\ev{(λ~x.e_f)~eₐ}{n}{v}}
    \qquad
    \inferrule
    {\ev{eₐ}{n}{vₐ} \quad \ev{e_f[x/vₐ]}{n}{v}}
    {\ev{(Λ~x.e_f)~eₐ}{n}{v}}
    \\[1.5em]
    \inferrule
    { \ev{e_1}{n}{v_1} ~\ldots~ \ev{e_k}{n}{v_k} }
    { \ev{ \{e_1, \ldots, e_k \} }{n}{\{v_1, \ldots, v_k \} } }
    \qquad
    \inferrule
    { }
    {\ev{e_{\{k+1\}}}{0}{0_e}}
    \qquad
    \inferrule
    {\ev{e_{\{k\}}}{n}{v}}
    {\ev{e_{\{k+1\}}}{n+1}{v}}
    \\[2em]
    \inferrule
    {\ev{e[x/0_e]}{0}{v}}
    {\ev{\feed{x}{e}}{0}{v}}
    \qquad
    \inferrule
    {\ev{\feed{x}{e}}{n}{v_f} \quad \ev{e[x/v_f]}{n+1}{v}}
    {\ev{\feed{x}{e}}{n+1}{v}}
  \end{gather}
  $$
  
  Examples: DF-I Filter
  
  
   
  
  
  
  $$
  \mathsf{df1} ≡ λ a b : R[k]. Λ x.~\feed{y}{ \{ b \cdot x + a \cdot y} \}
  $$
  
  Note the overload of $x \cdot y$! The types of $a$ and $b$ will fix the op.
  
    $$ x \cdot y ≡ \sum_i x[i] ⬝ y[i] $$
  
  Examples: DF-II
  
  
   
  
  
  $$
  \mathsf{df2} ≡ Λ x.~ \feedin{v}{ \{ x + a \cdot v \}}{ \{ b \cdot v \} }
  $$
  
  Compare with df1: 
  
  $$
  \mathsf{df1} ≡ Λ x.~\feed{y}{ \{ b \cdot x + a \cdot y} \}
  $$
  
  
  Examples: WaveGuide OSC
  
  
   
  
  
  $$
  \begin{array}{rcl}
  \text{feed}~x & = & C \cdot (G \cdot x + y) - y         \\
              y & = & C \cdot (G \cdot x + y) + G \cdot x
  \end{array}
  $$
  
  or without sugar:
  
  $$
  \feed{z}{ \{ (C \cdot (G \cdot z_1 + z_2) - z_2, C \cdot (G \cdot z_1 + z_2) + G \cdot z_1) \} }
  $$
  
  Typing and Equational Theory
  
    Typing is almost standard: we restrict pre to variables and use coeffect-inspired
    annotations to keep track of history-access. We also subtype rated
    streams to arrays:
  
  
    $$
    \begin{gather}
    \inferrule
    { }
    { \Gamma, x :_n T \vdash x_{\{n\}} : T}
    \qquad
    \inferrule
    { \Gamma, x :_n R@r \vdash x : R@r }
    { \Gamma, x :_n R@r \vdash x : R[rn] }
    \end{gather}
    $$
  
  
    We can recover the general pre as:
  
  
    $$ \mathbf{pre}~e ≡ (Λ x.~x_{\{1\}})~e $$
  
  
    Application has to distinguish between positive
    and negative types to correctly propagate
    environment annotations. For this talk we use a restricted
    functional type.
  
  
    
      Note: Work in coeffects, focusing, and duality is
      relevant and moslty subsumes our rules.
    
  
  Now the fun starts
  
    Lightweight embedding into Coq, use a mildy-typed syntax for a
    reduced subset (recall that: $ R ⇒ R ⇒ R ≈ R ⊗ R → R $):
  
  
  Interpreting the Programs:
  
    We want an easy embedding and "executable" embedding, our
    evaluation relation is not. Basically we are looking for something
    such as:
  
  
    $$
    ⟦ \_ ⟧ : ∀ t : ty, \mathsf{seq}~ℝ → \mathsf{expr}~t → ℕ → \mathsf{tyI}~t
    $$
  
  However, what should the value of:
  
    $$
    ⟦ Γ ⊢ x : ℝ ⟧_n = ~??
    $$
  
  
    Indeed, we need to equiq the environments with history, thus they
    become list of lists of values. A first try for the interpretation could be:
  
  
    $$
    \begin{array}{lcl}
    \mathsf{tyI}(⊗~n)       &= & \mathsf{cV[ℝ]_n} \\
    \mathsf{tyI}(n_a ⇒ n_r) &= & \mathsf{seq~cV[ℝ]_{n_a} → cV[ℝ]_{n_r}}      \\
    \end{array}
    $$
  
  Step-Indexing 
  
    However it is not enough, as when interpreting λ we don't have
    enough history!
  
  
    $$
    ⟦ Γ ⊢ Λ x . e ⟧_n = \mathsf{fun}~x ⇒ ⟦ Γ, x ⊢ e ⟧_n
    $$
  
  
  
    We must use a strong induction principle so the application rule
    can compute all previous values:
  
  
  Many more refinements are possible!
   
  Case Study I: Linearity
  
    The first application is to study the set of linear Wagner
    programs, as a first step towards the theory for linear systems
    and in particular, the Z-transform.
  
  Adding programs:
  
    What should linearity mean for our programs? The proper definition
    involves something known as a logical relation:
  
  
    $$
    \begin{array}{lcl}
    \mathsf{linR}_{⊗ n}(e)     &≡& \forall Γ_1, Γ_2.~ ⟦Γ_1 ⊢ e⟧ - ⟦Γ_2 ⊢ e⟧ = ⟦Γ_1 - Γ_2 ⊢ e⟧ \\
    \mathsf{linR}_{n1 ⇒ n2}(f) &≡& \forall Γ_1, Γ_2, r_1, r_2.\\
     && \quad ⟦Γ_1 ⊢ f⟧ r_1 - ⟦Γ_2 ⊢ f⟧ r_2 = ⟦Γ_1 - Γ_2 ⊢ f⟧ (r_1 - r_2)
    \end{array}
    $$
  
  
    The relation is simple but would also work at higher-types. If we
    restrict the syntax to linear programs, we every program satisfies
    the relation!
  
  
    The proof proceeds first, by induction on the typing derivation,
    then by induction on the number of steps.
  
  Case Study I: Linearity
  
    What about multi-linearity?
  
  
    
      Compare:
      $$
      \begin{array}{l}
      λ x. Λ y . x \cdot y : R → S ⇒ T \\
      Λ x. Λ y . x \cdot y : R ⇒ S ⇒ T
      \end{array}
      $$
    
    
    
      Work in progress:
      Relate to the completeness theorem for signal dataflow!
      [Bonchi, Sobociński, Zanasi 2015/2016] [Baez]
    
   
  Case Study II: Frequency Domain
  Quick Review
  
    - 
      The frequency response of a filter = output for the impulse
      response.
    
- 
      A filter is stable = its Frequency Response is summable.
    
- 
      Stability is implied by the existence of the DTFT.
    
- 
      The DFT approximates the DTFT.
    
- 
      The DTFT is the evaluation of the Z-transform for the unit circle.
    
Crucial Prerrequisites:
  
    - Previous linearity theorem.
- A notion of DFT in Coq.
Geometric Signal Theory
  The DFT:
  $$
  X(\omega_k) = \ip{x^T, \omega_k}
  ~~\text{where}~~ \omega_k = \omega^{k*0},\ldots,\omega^{k*(N-1)}
  $$
  
    Most properties are an immediate consequency of matrix
    multiplication lemmas. Again life is easy here.
  
  
    In matrix form:
    
  Primitive roots:
  
    The constructive algebraic numbers in mathcomp provides us with a
    primitive root of the unity such that $ω^N = 1$.
  
   
  In an Ideal World...
  ... we'd have everything we need in our library.
  
    - We are not so far from it.
- Took the definitions from classfun.v and proved:
    Energy theorems are trivial corollaries. Quite compact development
    for now.
  
  Transfer Functions
  
    We want to relate programs to their frequency semantics; it is
    well known that:
  
  $$ H(e^{jωT}) = \text{DTFT}_{ωT}(h) $$
  
    We'd like to relate programs to their transfer function:
  
  
  Z-interpretation of programs
  
    We tried to build an effective procedure, however it is difficult.
  
  Current approach: Using a relation
  
  Z-interpretation of programs
  We can thus prove:
  
  
    We show then that evalution of the impulse response DFT
    corresponds to the polynomial plus a residual based on the poles.
  
  
    Stability is implied if the magnitude of all the poles is less
    than 1. This also implies the existence of the DTFT.
  
  
  Implementation: The SSSM Abstract Machine
  
  Open Topics
  
    - 
      Multirate processing: [quite a bit working]
    
- 
      State Space analysis, control theory, MIMO systems
    
- 
      Floating Point Interpretation.
    
- 
      Schur-Cohn Stability Test
    
- 
      Verification of transforms optimization. [Püschel]
    
- 
      Computational Z-interpretation:
    
- 
      Connection to more mainstream numerical approximation literature?
    
Conclusions
  
    - 
      Implementation in Ocaml with a core type system. Full type
      system still not fully settled, interesting desing space.
    
- 
      Same for Coq formalization, theorems are working but more
      experience is needed to choose some implementation details (for
      instance, choice of environment representation
    
- 
      Next logical step: COLA/OLA.
    
- 
      We are particularly excited to see what the relation with SPIRAL
      could be.
    
- 
      Numerical issues are pervasive in this domain, but trying to get
      there. Library for floating point aritmetic in progress.
    
- 
      Missing an integrated library for true real and complex numbers,
      recent effort by PY and Assia? Should we port Coquelicot?
    
Thanks!
  Context of the ANR project:
  Practice of real time DSP still far from convenient.
  Starting point, Faust:
  
    
    - Faust [Orlarey 2002], a functional PL for audio programming.
- Abstracts away low-level complexity, efficient execution.
- 
      A success, good number of users, high-interest topic. (CCRMA
      workshops, Kadenze course, etc...)
    
Faust's Future:
  
    - 
      Extend Faust to multirate processing.
    
- 
      Reasoning about audio programs is not easy.
    
A note on streams.
  The beggining of it all:
  
    First exercise, port of Lucid paper to modern Coq/Ssreflect; // broken
    with coinductives! Dependent types not very scalable;
    decidability/extensionality of streams: bad.
  
  Move to sequences!
  
  
  "Inversion views", but better, wf_str is decidable now!
    
    
    
    
    
    
      
      /