About Me
I’m Nianlong Gu, a Machine Learning Scientist at Universität Zürich. I earned my PhD in Computer Science from ETH Zürich, where I developed NLP models for scientific literature retrieval, extractive summarization, and LLM-based citation text generation. My current research focuses on AI-based vocal segmentation and speech recognition, which are crucial for bioacoustic studies and language evolution research. I also develop a web platform that helps researchers annotate audio files using AI.
Research Interests
- Vocal Segmentation and Speech Recognition
  - AI-based tools for human and animal voice activity detection
  - Enhancing bioacoustic research and cross-species vocal comparisons
  - Utterance extraction and transcription pipelines for minor languages
- Information Retrieval and Document Summarization
  - Efficient retrieval systems for large-scale scientific databases
  - Summarization models for scientific articles
- Large language models for scientific writing and discovery
- Deep learning and reinforcement learning applications
Education
ETH Zürich
Dr. Sc. in Computer Science
Institute of Neuroinformatics, Department of Information Technology and Electrical Engineering
Zürich, Switzerland | Aug. 2019 – Dec. 2022
RWTH Aachen University
M.Sc. in Communications Engineering
Faculty of Electrical Engineering and Information Technology
Aachen, Germany | Sep. 2016 – Jul. 2019
Fudan University
B.Sc. in Microelectronics
Shanghai, China | Sep. 2011 – Jul. 2016
Publications
ICASSP 2024

Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2024).
Under Review

Large language models surpass human experts in predicting neuroscience results
arXiv preprint (2024).
EACL 2024

Evaluating Unsupervised Argument Aligners via Generation of Conclusions of Structured Scientific Abstracts
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics. (2024).
SDProc 2024

Controllable Citation Sentence Generation with Language Models
Proceedings of the 4th Workshop on Scholarly Document Processing at ACL 2024. (2024).
SwissText 2024

Sentiment- and Keyword-Controllable Text Generation in German with Pre-trained Language Models
SwissText 2024. (2024).
CODI 2024

SciPara: A New Dataset for Investigating Paragraph Discourse Structure in Scientific Papers
Proceedings of the 5th Workshop on Computational Approaches to Discourse (CODI 2024). (2024).
ACL Demo 2023

SciLit: A Platform for Joint Scientific Literature Discovery, Summarization and Citation Generation
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations) (2023).
DocIU 2023

EMNLP 2023

GreedyCAS: Unsupervised Scientific Abstract Segmentation with Normalized Mutual Information
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP) (2023).
LIRAI 2023
Legal extractive summarization of US court opinions
Proceedings of the 1st Legal Information Retrieval meets Artificial Intelligence Workshop (LIRAI) (2023).
Under Review
Primate origins of human event cognition
arXiv preprint (2023).
ACL 2022

MemSum: Extractive Summarization of Long Documents Using Multi-Step Episodic Markov Decision Processes
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022. (2022).
ECIR 2022

Local Citation Recommendation with Hierarchical-Attention Text Encoder and SciBERT-Based Reranking
Proceedings of ECIR 2022. (2022).
Argument Mining Workshop 2022

Do Discourse Indicators Reflect the Main Arguments in Scientific Papers?
Proceedings of the 9th Workshop on Argument Mining. (2022).
ACL Demo 2020

Embedding-based Scientific Literature Discovery in a Text Editor Application
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations. (2020).
WACV 2020

Reverse Variational Autoencoder for Visual Attribute Manipulation and Anomaly Detection
Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV). (2020).