2023:Program/Submissions/Improve Your Research with Natural Language Processing - VE8KM3
Title: Improve Your Research with Natural Language Processing
Speakers:
Kevin Chang
Kevin is the founder and CEO of Kai Analytics and he has more than a decade of experience in market research. His work with survey and qualitative data research inspired him to share the science of natural language processing and make qualitative data analysis more accessible. Since founding Kai Analytics in 2018, Kevin and his team have led global surveys focused on climate adaptation, youth empowerment, and economic development.
Room:
Start time:
End time:
Type: Workshop
Track: Technology
Submission state: submitted
Duration: 60 minutes
Do not record: false
Presentation language: en
Abstract & description
Abstract
Natural Language Processing (NLP), or computational linguistics, is a field of data science that uncovers the nuance and context of everyday language. Through NLP, we can understand the underlying sentiments and personas of qualitative research participants. This workshop will help ensure you are fully informed about using this technology.
Description
Analyzing open-ended comments from surveys can be daunting. Researchers often spend many hours manually grouping responses into themes before they can even start any qualitative analysis. Fortunately, natural language processing (NLP) techniques can help us gain deeper insights with significantly less effort, while also helping researchers engage more responsibly with the data by extracting meaning from all parts of each response, not just what happens to catch our eye while skimming.
NLP is a popular and rapidly growing field of computational linguistics, which focuses on statistically uncovering themes within large bodies of text. Many of us are already familiar with word clouds, which are a common NLP technique for visualizing word frequency. Despite the popularity of word clouds, understanding language requires context, making single-word representations difficult to interpret.
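To illustrate why single-word counts can mislead, here is a minimal sketch using only the Python standard library. The three review comments are invented for illustration and are not taken from the workshop dataset; `tokenize` is a hypothetical helper, not an NLTK function.

```python
import re
from collections import Counter

# Hypothetical course-review comments, invented for illustration only.
reviews = [
    "The lectures were not helpful",
    "The quizzes were very helpful",
    "Not helpful at all",
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

word_counts = Counter()
bigram_counts = Counter()
for review in reviews:
    tokens = tokenize(review)
    # Single-word frequencies: roughly what a word cloud visualizes.
    word_counts.update(tokens)
    # Pair each token with its right-hand neighbour to keep local context.
    bigram_counts.update(zip(tokens, tokens[1:]))

print(word_counts["helpful"])               # how often "helpful" appears overall
print(bigram_counts[("not", "helpful")])    # how often it is negated
print(bigram_counts[("very", "helpful")])   # how often it is intensified
```

A word cloud built from `word_counts` would show "helpful" prominently, while the bigram counts show that most of those mentions in this toy sample are negated.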
The aim of this workshop is to explore essential text analytics concepts in NLP and linguistics through detailed, hands-on examples. Specifically, our objective is to illustrate how to analyze and visualize over 100,000 Coursera course reviews in Google Colab using NLP functions written in Python with the popular Natural Language Toolkit (NLTK) package.
This workshop will cover various NLP concepts including pre-processing (tokenization, stop-word removal, and lemmatization), automatic part-of-speech tagging, n-gram analysis, and visualization as a network graph. We will also explore how researchers can address problematic issues with real-life textual data, including domain-specific terminology, lexical diversity, and unreliable spelling/grammar.
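The pre-processing and n-gram steps can be sketched roughly as follows. This is a standard-library stand-in, not the workshop's actual code: the stop-word set, the `LEMMAS` lookup table, the helper names, and the sample reviews are all assumptions made for illustration. In NLTK, the corresponding pieces would typically be `nltk.word_tokenize`, `nltk.corpus.stopwords`, `nltk.stem.WordNetLemmatizer`, and `nltk.ngrams`, each of which needs the package installed (and, for most, its data downloaded). Part-of-speech tagging is left out of this sketch.

```python
import re
from collections import Counter

# Tiny hand-picked stop-word list (assumption; NLTK ships a much fuller one).
STOP_WORDS = {"the", "a", "an", "was", "were", "is", "and", "but", "to", "of", "very"}

# Toy lemma lookup standing in for a real lemmatizer (assumption).
LEMMAS = {
    "lectures": "lecture",
    "instructors": "instructor",
    "explained": "explain",
    "concepts": "concept",
}

def preprocess(text):
    """Tokenize, drop stop words, then map each token to its lemma."""
    tokens = re.findall(r"[a-z]+", text.lower())
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return [LEMMAS.get(t, t) for t in tokens]

def bigrams(tokens):
    """Return adjacent token pairs (n-grams with n = 2)."""
    return list(zip(tokens, tokens[1:]))

# Hypothetical reviews, invented for illustration only.
reviews = [
    "The instructors explained the concepts clearly.",
    "Great instructor, but the lectures were long.",
    "The instructor explained concepts clearly and the lecture was great.",
]

# Count how often each bigram occurs across all reviews.
edge_weights = Counter()
for review in reviews:
    edge_weights.update(bigrams(preprocess(review)))

# Each (word_a, word_b, weight) triple is one weighted edge of a network graph.
edges = [(a, b, w) for (a, b), w in edge_weights.most_common()]
for edge in edges:
    print(edge)
```

Because stop words are removed before pairing, a bigram here can join words that were separated by a stop word in the original text; whether that is desirable depends on the analysis. The `edges` list could then be passed to a graph library such as NetworkX for drawing, which is not shown here.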
During Q&A, we'll help attendees address pain points in adapting these techniques for their own analyses.
Further details
Qn. How does your session relate to the event themes: Diversity, Collaboration, Future?
Bias in machine learning (or AI) technologies is a major obstacle to ensuring that diverse groups are represented in data. Knowing the benefits and limitations of such technologies, and addressing issues of social bias in our data, is crucial for research involving marginalized groups. We will be using Google Colab during the workshop, which supports open-source, collaborative programming and learning.
Qn. What is the experience level needed for the audience for your session?
Everyone can participate in this session
Qn. What is the most appropriate format for this session?
- Onsite in Singapore
- Remote online participation, livestreamed
- Remote from a satellite event
- Hybrid with some participants in Singapore and others dialing in remotely
- Pre-recorded and available on demand