Houjun Liu

#Ntj

Jonell 2021

Last edited: June 6, 2022

DOI: 10.3389/fcomp.2021.642633

One-Liner

Developed a kitchen sink of diagnostic tools and correlated them with biomarkers.

Novelty

The kitchen sink of data collection (phones, tablet, eye tracker, microphone, wristband) and the kitchen sink of noninvasive data: imaging, psych, speech assessment, clinical metadata.

Notable Methods

Here’s their kitchen sink

I have no idea why a thermal camera is needed

Key Figs

Here are the features they extracted

Developed the features collected via a method similar to action research: did two passes and refined/added information after preliminary analysis. The figure above also includes info about whether or not each measurement was task-specific.

Laguarta 2021

Last edited: June 6, 2022

DOI: 10.3389/fcomp.2021.624694

One-Liner

Proposed a large multimodal approach to embed auditory info + biomarkers for baseline classification.

Novelty

Developed a massively multimodal audio-to-embedding correlation system that maps audio to biomarker information collected (mood, memory, respiratory) and demonstrated its ability to discriminate cough results for COVID. (they were looking for AD; whoopsies)

Notable Methods

  • Developed a feature extraction model for AD detection named Open Voice Brain Model
  • Collected a dataset on people coughing and correlated it with biomarkers

Key Figs

Figure 2

This is MULTI-MODAL as heck

Martinc 2021

Last edited: June 6, 2022

DOI: 10.3389/fnagi.2021.642647

One-Liner

Combined bag-of-words on transcript + ADR on audio, fed to various classifiers for AD; ablated BERT’s decision space for attention to make easier models in the future.

Novelty

  • Pre-processed each of the two modalities before fusing it (late fusion)
  • Achieved \(93.75\%\) accuracy on AD detection
  • Because the data are force-aligned and fed in with late fusion, one can see which sounds/words the BERT model was focusing on just by looking at the attention on the words

Notable Methods

  • Used classic cookie theft data
  • bag of words to do ADR but for words
  • multimodality but late fusion with one (hot-swappable) classifier
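
A minimal sketch of the fusion idea in the bullets above, not the authors' code: each modality is turned into its own bag-of-units vector first, the two vectors are concatenated, and the result goes to a single classifier that can be swapped out. The names `bag_of_units`, `fuse`, and `nearest_centroid` are made up for illustration, and the nearest-centroid classifier is only a stand-in for whichever classifier is plugged in.

```python
from __future__ import annotations

from collections import Counter
from typing import Callable, Sequence


def bag_of_units(units: Sequence[str], vocab: Sequence[str]) -> list[float]:
    """Relative frequency of each vocab item in one sequence (words or audio units)."""
    counts = Counter(units)
    total = max(len(units), 1)  # avoid dividing by zero on empty input
    return [counts[v] / total for v in vocab]


def fuse(text_vec: Sequence[float], audio_vec: Sequence[float]) -> list[float]:
    """Fuse the two modalities by concatenating their already-processed vectors."""
    return list(text_vec) + list(audio_vec)


def nearest_centroid(train_x: Sequence[Sequence[float]],
                     train_y: Sequence[str]) -> Callable[[Sequence[float]], str]:
    """Toy classifier (an assumption, not from the paper): label of the closest class mean."""
    sums: dict[str, list[float]] = {}
    counts = Counter(train_y)
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    centroids = {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

    def predict(x: Sequence[float]) -> str:
        # Squared Euclidean distance to each class centroid.
        return min(centroids,
                   key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))

    return predict
```

Since `predict` is just a callable over the fused vector, any other classifier with the same shape could be dropped in without touching the per-modality processing.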

Key Figs

How they did it

This is how they combined the force-aligned (:tada:) audio and transcript together.

Meghanani 2021

Last edited: June 6, 2022

DOI: 10.3389/fcomp.2021.624558

One-Liner

analyzed spontaneous speech transcripts (only!) from TD and AD patients with fastText and CNN; best was \(83.33\%\) acc.

Novelty

  • threw the NLP kitchen sink at transcripts
    • fastText
    • CNN (with varying n-gram kernel sizes: 2, 3, 4, 5)

Notable Methods

  • embeddings seeded by GloVe
  • fastText is much faster, but CNN won out
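
To make the "n-gram kernel" idea concrete, here is a rough pure-Python sketch of the forward pass of a text CNN with one filter per kernel width, followed by ReLU and max-over-time pooling. It is an illustration under assumptions, not the paper's model: the function names are hypothetical, there is one filter per width instead of many, and the embeddings are whatever vectors are passed in (the paper seeds them with GloVe).

```python
from __future__ import annotations

KERNEL_SIZES = (2, 3, 4, 5)  # the n-gram widths listed in the notes


def conv_max_pool(embedded: list[list[float]], kernel: list[list[float]]) -> float:
    """Slide one filter over the token sequence and keep the largest ReLU activation.

    `embedded` is one vector per token; `kernel` is k weight vectors, so the
    filter looks at k consecutive tokens (a k-gram) at a time.
    """
    k = len(kernel)
    best = 0.0  # ReLU floor; also the result when the text is shorter than k
    for start in range(len(embedded) - k + 1):
        window = embedded[start:start + k]
        activation = sum(w * x
                         for k_row, x_row in zip(kernel, window)
                         for w, x in zip(k_row, x_row))
        best = max(best, activation)
    return best


def ngram_features(embedded: list[list[float]],
                   kernels: dict[int, list[list[float]]]) -> list[float]:
    """One pooled feature per kernel width; a classifier layer would sit on top."""
    return [conv_max_pool(embedded, kernels[k]) for k in KERNEL_SIZES]
```

The pooled features are what a final dense layer would classify; in a trained model the kernel weights are learned rather than supplied by hand.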

Key Figs

the qual results

PAR (participant), INV (investigator)

Notes

Hey look a review of the field:

Shah 2021

Last edited: June 6, 2022

DOI: 10.3389/fcomp.2021.624659

One-Liner

Multi-feature late fusion of NLP results (by normalizing text and n-gram processing) with OpenSMILE embedding results.

Novelty

NLP transcript normalization (see methods) and OpenSMILE; otherwise similar to Martinc 2021. Same gist but different data-prep.

Notable Methods

  • N-gram processed the input features
  • Used WordNet to replace words with roots
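
A small sketch of the two text-prep steps listed above: swap each word for its root, then break the normalized transcript into n-grams. The `TOY_ROOTS` table is a hypothetical stand-in for a WordNet lookup (the paper uses WordNet itself), and `normalize` / `ngrams` are illustrative names, not anything from the paper.

```python
from __future__ import annotations

# Hypothetical stand-in for a WordNet root lookup; the real lexicon is far larger.
TOY_ROOTS = {"boys": "boy", "are": "be", "stealing": "steal", "cookies": "cookie"}


def normalize(tokens: list[str], roots: dict[str, str] = TOY_ROOTS) -> list[str]:
    """Lowercase each token and replace it with its root when one is known."""
    return [roots.get(t.lower(), t.lower()) for t in tokens]


def ngrams(tokens: list[str], n: int) -> list[tuple[str, ...]]:
    """Every contiguous run of n tokens, in order of appearance."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


if __name__ == "__main__":
    words = normalize("The boys are stealing cookies".split())
    print(ngrams(words, 2))
```

Normalizing before building n-grams means inflected variants of the same phrase collapse onto the same feature.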

Key Figs

New Concepts