CV

Yejin Cho

Ph.D. in Computational Linguistics
University of Texas at Austin

ycho@utexas.edu
Curriculum Vitae [pdf]


Research Interests


  • Computational Semantics
    • Representation learning, language modeling, natural language understanding
    • Inference over knowledge graphs, representation learning from graphs

Publications


  • Yejin Cho*, Juan Diego Rodriguez*, Yifan Gao, and Katrin Erk. 2020. Leveraging WordNet Paths for Neural Hypernym Prediction. In Proceedings of the 28th International Conference on Computational Linguistics (COLING), pages 3007-3018, Barcelona, Spain (Online). (*: Equal contribution) [pdf]
  • Heejo You, Hyungwon Yang, Jaekoo Kang, Youngsun Cho, Sunghah Hwang, Yeonjung Hong, Yejin Cho, Seohyun Kim, and Hosung Nam. 2016. Development of Articulatory Estimation Model using Deep Neural Network. Phonetics and Speech Sciences. 8:31-38. 10.13064/KSSS.2016.8.3.031. [pdf]

Education


University of Texas at Austin
Ph.D. in Computational Linguistics
Fall 2018 – Present
Advisor: Katrin Erk
Teaching Assistantship (TA):
– LIN 350 Computational Semantics (Spring 2020)
(Instructor: Katrin Erk) [course website]

– LIN 313 Language and Computers (Fall 2018, Fall 2019)
(Instructor: Jessy Li) [course website]

Research Assistantship (RA):
– AIDA project (Spring 2020 – Present) (Advisor: Katrin Erk)

– Hypernym Prediction project (Spring 2019 – Fall 2019) (Advisor: Katrin Erk)

Selected Courses (Instructor):
– Topics in Natural Language Processing (Eunsol Choi)
– Philosophy of Language (Joshua Dever & Ray Buchanan)
– Natural Language Processing (Greg Durrett)
– Computational Discourse (Jessy Li)


Korea University
Master’s in English Language and Literature
Fall 2015 – Fall 2017
Total GPA of 4.5 / 4.5 (100 / 100)
Advisor: Hosung Nam

Master’s thesis (Fall 2017):
Functional awareness in RNNLM-based word embeddings [pdf]

Courses in Linguistics:
– Practice in Phonetics, Special topics in Phonetics I & II, Special topics in Phonology I, Seminar in Phonology, History of Syntactic Theory, Second Language Acquisition
Courses in Computer Science / Brain and Cognitive Engineering / Psychology:
– Natural Language Processing, Introduction to Neural Networks, Numerical Linear Algebra, Introduction to Applied Mathematics (audited), Applied Mathematics for Brain and Cognitive Engineering I, Advanced Experimental Design


Yonsei University

Bachelor of Arts in Korean Language and Literature,
and English Language and Literature (Double major)
Spring 2011 – Spring 2015
Total GPA of 3.87 / 4.3 (90 / 100)
Courses (undergraduate level):
– Language and its Application (Introduction to General Linguistics), Korean Phonetics, Introduction to Korean Linguistics, Introduction to French Linguistics, Use of English and Society (Sociolinguistics), Korean Semantics, English Computational Linguistics, Korean Morphology, Teaching Method of Korean Language Curriculum, History of Research in Korean Grammar, Korean Syntax
Courses (graduate level):
– Studies in Korean Phonetics, Acoustic Analysis of Speech and Construction of Speech Corpora, Studies in English Phonology I


University of California, Los Angeles (UCLA)
Exchange Student, Linguistics
Fall 2013 – Spring 2014
Total GPA of 3.97 / 4.0 (99 / 100)
Listed on the Dean’s Honors list for three quarters
Research Assistant (RA) for Kie Zuraw and Yu Tanaka at UCLA Phonetics Laboratory, Spring 2014
Courses in Linguistics:
– Introduction to Linguistics, Syntax I, Introduction to General Phonetics, Phonology I, Individual Studies in Linguistics, Phonetic Theory (Graduate level)


Research Experience


  • Hypernym Prediction in WordNet, UT Austin (Spring 2019 – Fall 2019)
    Graduate Research Assistant
    • Advisor: Katrin Erk
    • Task: Given a node in WordNet (e.g., daisy), predict its direct hypernym (e.g., flower) within the graph.
    • Idea: Framed the task as a sequence generation problem in which a model generates, from the encoded representation of a given hyponym (e.g., daisy), the entire taxonomy path to the root (e.g., flower → flowering plant → plant → organism → object → entity). Evaluated the first node in the generated chain, which corresponds to the direct hypernym.
    • Ran experiments with a sequence-to-sequence encoder-decoder model with attention using OpenNMT.
    • Replicated five benchmark systems for the link prediction task on the WN18RR dataset by adapting the original source code to our dataset and task setup:
      • TransE (Bordes et al., NeurIPS 2013)
      • M3GM (Pinter & Eisenstein, EMNLP 2018)
      • Poincaré embeddings (Nickel & Kiela, NeurIPS 2017)
      • CRIM for hypernym discovery (Bernier-Colborne & Barriere, SemEval-2018)
      • text2edges (Prokhorov et al., NAACL 2019)
    • Achieved new state-of-the-art performance on the hypernym prediction task on the WN18RR-hp dataset.

  • EMCS Laboratory, Korea University (Fall 2015 – Present)
    (Education, Mathematics, Computer science and Speech Laboratory)
    Research Member

    • Advisor: Hosung Nam
    • Built Korean Large Vocabulary Continuous Speech Recognition (LVCSR) system (800k vocabulary) from raw text and transcribed audio corpora using the Kaldi Speech Recognition Toolkit
    • Subword (Pseudo-morpheme) language modeling for building Korean LVCSR system
    • Language modeling experiments using SRILM and RNNLM Toolkit
    • Designed and developed Korean text normalization and language preparation package for LM in Kaldi-based ASR system (KoLM) [github]
    • Designed and developed rule-based Korean Grapheme-to-Phone conversion system (KoG2P) [github]
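
The sequence-generation framing used in the Hypernym Prediction project above can be illustrated with a minimal sketch. It is not the project's actual code: the toy taxonomy, the function names, and the one-hypernym-per-node simplification are all assumptions made for illustration (real WordNet synsets can have several hypernyms). The sketch only shows how a hyponym maps to a target sequence (its path to the root) and how the direct hypernym is read off as the first node of that sequence.

```python
# Toy taxonomy (hypothetical, hand-written): node -> direct hypernym.
# Assumption: each node has exactly one hypernym, unlike real WordNet.
TOY_HYPERNYMS = {
    "daisy": "flower",
    "flower": "flowering plant",
    "flowering plant": "plant",
    "plant": "organism",
    "organism": "object",
    "object": "entity",
}


def path_to_root(node, hypernyms):
    """Return the chain of hypernyms from `node` up to the root (node excluded)."""
    path = []
    seen = {node}
    while node in hypernyms:
        node = hypernyms[node]
        if node in seen:
            raise ValueError(f"cycle detected at {node!r}")
        seen.add(node)
        path.append(node)
    return path


def make_training_pair(hyponym, hypernyms):
    """Build one (source, target) pair: the hyponym and its full path to the root."""
    return hyponym, path_to_root(hyponym, hypernyms)


def direct_hypernym(generated_path):
    """Read off the first node of a generated path, or None if it is empty."""
    return generated_path[0] if generated_path else None


source, target = make_training_pair("daisy", TOY_HYPERNYMS)
print(source, "->", " | ".join(target))
print("direct hypernym:", direct_hypernym(target))
```

In a sequence-to-sequence setup, pairs like `(source, target)` would serve as training examples, and `direct_hypernym` would be applied to the model's generated path at evaluation time.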

Teaching


  • MATLAB Programming for Humanities majors, Korea University (January 2016)
    Teaching Assistant

    • Covered basics of MATLAB programming
    • Ran hands-on classes for over 40 students, answered questions in person and online, scored and commented on student assignments for 4 weeks
    • Led team project LyricsAnalyzer
      • Web crawling and text mining of Korean lyrics [slide]

Honors and Awards


  • Graduate School Fellowship, University of Texas at Austin, 2018
  • National Humanities Scholarship: Graduate Research Scholarship for Humanities and Social Sciences, Korea Student Aid Foundation (KOSAF), 2016–2017
  • Honors Scholarships, Korea University, 2016
  • Teaching Assistant Scholarships, Korea University, 2015–2017

Technical Skills


Languages: Python, MATLAB, UNIX shell scripting, R. Reading knowledge of Java and JavaScript.
Toolkits: PyTorch, TensorFlow, Kaldi Speech Recognition Toolkit

Language Proficiency


  • Native in Korean
  • Fluent in English
  • Intermediate in French (DELF B1)
  • Basic conversational and reading ability in Japanese and Modern Standard Arabic (MSA)