Picard ASR

Picard ASR is an advanced automatic speech recognition (ASR) system designed to run on platforms with extremely low resources. Because the amount of on-chip RAM largely determines the price of a microcontroller (µC), Picard is strongly optimized for low RAM consumption. As a result, Picard ASR is the first high-quality, state-of-the-art speech recognition system that is able to run on low-price µCs, with the world's lowest RAM consumption of 15 kB. This enables speech-controlled human-machine interfaces (HMIs) not only in high-end lifestyle products but also in medium- and low-price consumer products. Picard ASR thus gives speech HMIs the potential to appear in everyday life, making our world more interactive.

Product information

Technical Data

  • continuous ASR of words, phrases or sentences
  • recognition accuracy up to 98 %
  • speaker independent
  • main world languages available
  • unlimited vocabulary size
  • platform independent
  • min CPU clock: 40 MHz
  • FLASH: 90 kB (60 kB program memory, 30 kB ASR model data)
  • RAM: 15 kB (7 kB static variables, 5 kB heap, 3 kB stack)
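
The memory figures above can be turned into a quick sizing check for a candidate microcontroller. The following is a minimal sketch, not part of the Picard ASR tooling: the function name `fits`, the example target (128 kB flash, 20 kB RAM) and the application's own memory needs are assumptions made purely for illustration.

```python
# Memory figures taken from the technical data above, in kB.
FLASH_KB = {"program memory": 60, "ASR model data": 30}
RAM_KB = {"static variables": 7, "heap": 5, "stack": 3}


def fits(flash_available_kb, ram_available_kb, app_flash_kb=0, app_ram_kb=0):
    """Return True if the recognizer plus the application fit on the target.

    app_flash_kb and app_ram_kb are the (hypothetical) needs of the
    application code that runs next to the recognizer.
    """
    flash_needed = sum(FLASH_KB.values()) + app_flash_kb
    ram_needed = sum(RAM_KB.values()) + app_ram_kb
    return flash_needed <= flash_available_kb and ram_needed <= ram_available_kb


# The parts add up to the totals quoted in the list: 90 kB flash, 15 kB RAM.
print(sum(FLASH_KB.values()), sum(RAM_KB.values()))

# Hypothetical target with 128 kB flash and 20 kB RAM, where the
# application itself needs another 20 kB flash and 4 kB RAM.
print(fits(128, 20, app_flash_kb=20, app_ram_kb=4))
```

The check covers memory only; the 40 MHz minimum CPU clock has to be verified separately.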

Speech Recognition Technology

  • phoneme-based Hidden Markov Model (HMM) recognizer
  • Mel-Frequency Cepstral Coefficients (MFCC) with velocity and acceleration features (delta and delta-delta)
  • Gaussian Mixture Model (GMM) with 64, 128 or 256 Gaussians
  • 3-state linear phoneme HMMs
  • monophone and triphone HMMs
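
To illustrate the velocity and acceleration features named in the list, the sketch below appends delta and delta-delta coefficients to a sequence of static feature frames. It uses the common regression formula with a window of two frames and edge padding; the source does not state which formula or window Picard ASR actually uses, so both are assumptions, as are the function names `deltas` and `add_dynamic_features`.

```python
def deltas(frames, window=2):
    """Regression-based delta features for a list of equal-length frames.

    d[t] = sum_{n=1..window} n * (c[t+n] - c[t-n]) / (2 * sum_{n} n^2)
    The first and last frame are repeated at the edges.
    """
    if not frames:
        return []
    denom = 2 * sum(n * n for n in range(1, window + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, window + 1):
                ahead = frames[min(t + n, last)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def add_dynamic_features(static):
    """Append delta (velocity) and delta-delta (acceleration) to each frame."""
    velocity = deltas(static)
    acceleration = deltas(velocity)
    return [s + v + a for s, v, a in zip(static, velocity, acceleration)]


# Toy input: one coefficient per frame, rising by 1 each frame.
ramp = [[float(t)] for t in range(7)]
features = add_dynamic_features(ramp)
print(features[3])  # static value, velocity, acceleration of the middle frame
```

Each output frame is three times as long as the static frame, which is why the choice of feature dimension directly affects the RAM needed per frame.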

Usage of Picard ASR
Application designers can use Picard ASR to introduce smart speech HMIs into machines, devices, entertainment systems, home appliances, technical toys and many other products. To design the interaction, the designer defines the grammar of Picard ASR. An ASR grammar is a set of semantic units (words, phrases or sentences) within a speech dialog menu. For example, a grammar for a home automation system might consist of four semantic units:

  1. run-microwave-three-minutes
  2. please-start-the-microwave-oven-for-three-minutes
  3. lights-on
  4. could-you-switch-to-the-menu

These units are recognized in parallel (the maximum number of parallel units is 64). A complete dialog structure consists of several grammars, which are loaded one after another depending on the current dialog menu. Using a software tool provided by Linguwerk GmbH, the application designer can generate custom ASR grammars, including phoneme transcriptions of the words (a phoneme dictionary).
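
The relationship between semantic units, grammars and dialog menus can be modelled as follows. This is only a sketch of the concept and does not reflect the format produced by the Linguwerk grammar tool or the library's API: the classes `Grammar` and `Dialog`, the second grammar "settings" with its units, and the assumption that hyphens separate the words of a unit are all hypothetical.

```python
MAX_PARALLEL_UNITS = 64  # limit stated in the text


class Grammar:
    """A named set of semantic units that are recognized in parallel."""

    def __init__(self, name, units):
        if not units:
            raise ValueError("a grammar needs at least one semantic unit")
        if len(units) > MAX_PARALLEL_UNITS:
            raise ValueError("at most %d units per grammar" % MAX_PARALLEL_UNITS)
        if len(set(units)) != len(units):
            raise ValueError("duplicate semantic unit")
        self.name = name
        self.units = list(units)

    def vocabulary(self):
        """Distinct words that would need a phoneme transcription, sorted."""
        return sorted({word for unit in self.units for word in unit.split("-")})


class Dialog:
    """Holds several grammars; one is active for the current dialog menu."""

    def __init__(self, grammars, start):
        self.grammars = {g.name: g for g in grammars}
        self.active = self.grammars[start]

    def switch_to(self, name):
        """Load the grammar that belongs to the menu called `name`."""
        self.active = self.grammars[name]
        return self.active


# The four units from the home automation example above.
home = Grammar("home", [
    "run-microwave-three-minutes",
    "please-start-the-microwave-oven-for-three-minutes",
    "lights-on",
    "could-you-switch-to-the-menu",
])
# Hypothetical second menu, invented for this sketch.
settings = Grammar("settings", ["volume-up", "volume-down", "back"])

dialog = Dialog([home, settings], start="home")
print(dialog.active.vocabulary())
dialog.switch_to("settings")
print(dialog.active.units)
```

Keeping one small grammar per menu, rather than one large grammar for the whole dialog, keeps the number of units that must be matched at any moment well below the limit of 64.
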
Picard ASR is delivered as a compiled software library. The most important library functions are illustrated in the picture on the left.