Hello, I am Sotaro Takeshita. I am a second-year Ph.D. student at the University of Mannheim focusing on text summarization. I am interested in:
- text summarization
- scholarly document processing
- information extraction
- multilinguality in NLP models
I like reading (both papers and books) and programming. Check out my paper search system and its extension with an LLM, as well as my OSS projects. I speak Japanese (native), English (fluent), and Spanish (basic), and I am now learning German. For programming, I like to use (neo)vim to write Python. Feel free to get in touch by email (oh.sore.sore.soutarou at gmail.com).
Education
- Sep. 2021 - present
- The Data and Web Science Group, University of Mannheim
- Advisor: Prof. Dr. Simone Paolo Ponzetto
- Apr. 2013 - Mar. 2018, B.A. in Information Science
- Faculty of Informatics and Engineering, A National University of Electro-Communications
- Advisor: Minami Yasuhiro
- Apr. 2018 - Mar. 2020, M.S. in Computer Science
- Faculty of Informatics and Engineering, A National University of Electro-Communications
- Advisor: Minami Yasuhiro
Experience
- Feb. 2016 - Apr. 2016, Research Internship
- Sección de Estudios de Posgrado e Investigación de ESIME Culhuacan, Instituto Politécnico Nacional
- Advisor: Mariko Nakano Miyatake
- funded by A National University of Electro-Communications
- Aug. 2017 - Sep. 2017, Recruit Holdings, data scientist internship
- Best work prize
- Aug. 2018 - Sep. 2018, NTT Media Intelligence Lab, NLP research internship
- Advisor: Dr. Ryuichiro Higashinaka
- Nov. 2018 - Dec. 2018, Research Internship
- The University of Campinas, Faculty of Electrical and Computer Engineering
- Advisor: Prof. Dr. Eric Rohmer
- funded by A National University of Electro-Communications
- Apr. 2018 - Jun. 2021, BuildIt, data scientist
- Sep. 2019, University of Mannheim, research visit
- Advisor: Dr. Goran Glavaš
- funded by A National University of Electro-Communications
Projects
- GenGO
- Paper exploration system with NLP technologies.
- GenGO Chat
- LLM-powered RAG system to support literature search for NLPers.
- The Token
- “The open community for all NLP people”, where I contribute on the technical side.
- NLP TLDRs
- A list of major NLP conference proceedings with one-sentence summaries.
- schnitsum
- Easy-to-use Python package for generating summaries with state-of-the-art neural network models.
- tofunlp/sister
- Very simple, easy-to-use package for encoding sentences in various languages into vector representations.
- sobamchan/pytorch-lightning-transformers
- Clean, readable code for fine-tuning transformers with pytorch-lightning.