About Me

I’m Nianlong Gu, a Machine Learning Scientist at Universität Zürich. I earned my PhD in Computer Science from ETH Zürich, where I developed advanced NLP models for scientific literature retrieval, extractive summarization, and LLM-based citation text generation. My current research focuses on AI-based vocal segmentation and speech recognition, which are crucial for bioacoustic studies and language evolution research. I also develop web platforms that help researchers annotate audio files using AI.

Research Interests

  • Vocal Segmentation and Speech Recognition
    • AI-based tools for human and animal voice activity detection
    • Enhancing bioacoustic research and cross-species vocal comparisons
    • Utterance extraction and transcription pipelines for minor languages
  • Information Retrieval and Document Summarization
    • Efficient retrieval systems for large-scale scientific databases
    • Summarization models for scientific articles
    • Large language models for scientific writing and discovery
    • Deep learning and reinforcement learning applications

Education

ETH Zürich

Dr. Sc. in Computer Science
Institute of Neuroinformatics, Department of Information Technology and Electrical Engineering
Zürich, Switzerland | Aug. 2019 – Dec. 2022

RWTH Aachen University

M.Sc. in Communications Engineering
Faculty of Electrical Engineering and Information Technology
Aachen, Germany | Sep. 2016 – Jul. 2019

Fudan University

B.Sc. in Microelectronics
Shanghai, China | Sep. 2011 – Jul. 2016

Publications

ICASSP 2024

Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection
Nianlong Gu, Kanghwi Lee, Maris Basha, Sumit Kumar Ram, Guanghao You, and Richard H. R. Hahnloser
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2024).

Code

Under Review

Large language models surpass human experts in predicting neuroscience results
Xiaoliang Luo, Akilles Rechardt, Guangzhi Sun, ..., Nianlong Gu, ... et al.
arXiv preprint (2024).

EACL 2024

Evaluating Unsupervised Argument Aligners via Generation of Conclusions of Structured Scientific Abstracts
Yingqiang Gao, Nianlong Gu, Jessica Lam, James Henderson, and Richard Hahnloser
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics. (2024).

SDProc 2024

Controllable Citation Sentence Generation with Language Models
Nianlong Gu and Richard H. R. Hahnloser
Proceedings of the 4th Workshop on Scholarly Document Processing at ACL 2024. (2024).

Code

SwissText 2024

Sentiment- and Keyword-Controllable Text Generation in German with Pre-trained Language Models
Paulina Aleksandra Zal, Nianlong Gu, and Guang Lu
SwissText 2024. (2024).

Code

CODI 2024

SciPara: A New Dataset for Investigating Paragraph Discourse Structure in Scientific Papers
Anna Kiepura, Yingqiang Gao, Jessica Lam, Nianlong Gu, and Richard H.R. Hahnloser
Proceedings of the 5th Workshop on Computational Approaches to Discourse (CODI 2024). (2024).

Code

ACL Demo 2023

SciLit: A Platform for Joint Scientific Literature Discovery, Summarization and Citation Generation
Nianlong Gu and Richard H.R. Hahnloser
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations) (2023).

Code    Online Demo

DocIU 2023

2023_memsum_dqa

EMNLP 2023

GreedyCAS: Unsupervised Scientific Abstract Segmentation with Normalized Mutual Information
Yingqiang Gao, Jessica Lam, Nianlong Gu, and Richard Hahnloser
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP) (2023).

Code

LIRAI 2023

Legal extractive summarization of US court opinions
Emmanuel Bauer, Dominik Stammbach, Nianlong Gu, and Elliott Ash
Proceedings of the 1st Legal Information Retrieval meets Artificial Intelligence Workshop (LIRAI) (2023).

Code

Under Review

Primate origins of human event cognition
Vanessa AD Wilson, Sebastian Sauppe, Sarah Brocard, Erik Jacob Ringen, Moritz M Daum, Stephanie Wermelinger, Nianlong Gu, Caroline Andrews, Arrate Isasi-Isasmendi, Balthasar Bickel, Klaus Zuberbuehler
arXiv preprint (2023).

ACL 2022

MemSum: Extractive Summarization of Long Documents Using Multi-Step Episodic Markov Decision Processes
Nianlong Gu, Elliott Ash, and Richard Hahnloser
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022. (2022).

Code    Online Demo

ECIR 2022

Local Citation Recommendation with Hierarchical-Attention Text Encoder and SciBERT-Based Reranking
Nianlong Gu, Yingqiang Gao, and Richard Hahnloser
Proceedings of ECIR 2022. (2022).

Code

Argument Mining Workshop 2022

Do Discourse Indicators Reflect the Main Arguments in Scientific Papers?
Yingqiang Gao, Nianlong Gu, Jessica Lam, and Richard H R Hahnloser
Proceedings of the 9th Workshop on Argument Mining. (2022).

Code

ACL Demo 2020

Embedding-based Scientific Literature Discovery in a Text Editor Application
Onur Gökçe, Jonathan Prada, Nikola I. Nikolov, Nianlong Gu, and Richard H.R. Hahnloser
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations. (2020).

WACV 2020

Reverse Variational Autoencoder for Visual Attribute Manipulation and Anomaly Detection
Lydia Gauerhof and Nianlong Gu
Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV). (2020).

Code