
Natural Language Processing

Zhomartkyzy Gulnaz

Description: The course is dedicated to the fundamentals of Natural Language Processing (NLP): from text preprocessing and language models to vector representations, sentiment analysis, and machine translation technologies. It covers key methods of classification, dimensionality reduction, and the development of efficient NLP systems.

Number of credits: 6

Prerequisites:

Software Engineering

Course Workload:

Types of classes	hours
Lectures	30
Practical works
Laboratory works	30
SAWTG (Student Autonomous Work under Teacher Guidance)	30
SAW (Student autonomous work)	90
Form of final control	Exam
Final assessment method

Component: University component

Cycle: Profiling disciplines

Goal

To develop students' theoretical knowledge and practical skills in the field of Natural Language Processing (NLP), necessary for the design and application of algorithms, methods, and models for automatic text data analysis, as well as to teach them to use modern NLP tools and models for solving practical tasks.

Objective

To study the basic concepts, methods, and technologies of text and speech data processing.
To develop skills in analyzing and preprocessing text corpora and in assessing the quality of NLP models.

Learning outcome: knowledge and understanding

Theoretical knowledge and practical skills in the field of natural language processing (NLP).

Learning outcome: applying knowledge and understanding

The ability to process and analyze large amounts of data using modern software.

Learning outcome: formation of judgments

The ability to independently apply methods and means of cognition, learning, and self-control; to be aware of the prospects for intellectual, cultural, moral, physical, and professional self-development and self-improvement; and to critically assess one's own strengths and weaknesses.

Learning outcome: communicative abilities

The ability to communicate in the professional sphere and in society as a whole, including in a foreign language; to analyze existing technical documentation and independently develop new documentation; and to clearly present and defend the results of complex engineering work in the field of IT.

Assessment of the student's knowledge

The teacher oversees the tasks that make up ongoing assessment and determines students' current performance twice during each academic period. Ratings 1 and 2 are based on the outcomes of this ongoing assessment. The student's learning achievements are assessed on a 100-point scale, and the final grades P1 and P2 are calculated as the average of the student's ongoing performance evaluations. The teacher evaluates the student's work throughout the academic period according to the assignment submission schedule for the discipline. The assessment system may combine written and oral, group and individual formats.

Period	Type of task	Total
1 rating	Laboratory work 1	0-100
	Laboratory work 2
	Laboratory work 3
	Laboratory work 4
2 rating	Laboratory work 5	0-100
	Laboratory work 6
	Laboratory work 7
	Laboratory work 8
Total control	Exam	0-100

Policy for evaluating learning outcomes by type of task

Type of task	90-100	70-89	50-69	0-49
Type of task	Excellent	Good	Satisfactory	Unsatisfactory

Evaluation form

The student's final grade in the course is calculated on a 100-point grading scale and consists of:

40% of the examination result;
60% of the current control result.

The final grade is calculated by the formula:

$$FG = 0.6 \cdot \frac{MT_1 + MT_2}{2} + 0.4 \cdot E$$

where MT₁ and MT₂ are the numerical equivalents of the grades for Midterm 1 and Midterm 2;

E is the numerical equivalent of the exam grade.
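
The arithmetic of the formula above can be sketched in a few lines of Python. This is only an illustration: the function name `final_grade` and the sample scores are assumptions made here, not part of the syllabus, and no rounding rule is applied because the syllabus does not state one.

```python
def final_grade(mt1: float, mt2: float, exam: float) -> float:
    """Return the final grade on the 100-point scale.

    Implements FG = 0.6 * (MT1 + MT2) / 2 + 0.4 * E, where MT1 and MT2
    are the two midterm ratings and E is the exam score.
    """
    # All three inputs are expected on the 100-point scale.
    for score in (mt1, mt2, exam):
        if not 0 <= score <= 100:
            raise ValueError("scores must be between 0 and 100")
    return 0.6 * (mt1 + mt2) / 2 + 0.4 * exam


# Hypothetical scores, chosen only to illustrate the calculation.
print(final_grade(mt1=80, mt2=90, exam=70))
```

With these sample scores the current control part contributes 0.6 × 85 = 51 points and the exam contributes 0.4 × 70 = 28 points, for a final grade of about 79.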

Final alphabetical grade and its equivalent in points:

The letter grading system for students' academic achievements, corresponding to the numerical equivalent on a four-point scale:

Alphabetical grade	Numerical value	Points (%)	Traditional grade
A	4.0	95-100	Excellent
A-	3.67	90-94	Excellent
B+	3.33	85-89	Good
B	3.0	80-84	Good
B-	2.67	75-79	Good
C+	2.33	70-74	Good
C	2.0	65-69	Satisfactory
C-	1.67	60-64	Satisfactory
D+	1.33	55-59	Satisfactory
D	1.0	50-54	Satisfactory
FX	0.5	25-49	Unsatisfactory
F	0	0-24	Unsatisfactory
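
The table above can be read as a lookup from points to a letter grade and its value on the four-point scale. The following Python sketch shows one way to express that lookup; the names `GRADE_BANDS` and `letter_grade` are assumptions made for illustration. The table lists whole-point ranges only, so treating each band as starting at its lower bound (for example, 94.5 falls under A-) is also an assumption.

```python
from typing import Tuple

# (lower bound in points, letter grade, value on the four-point scale),
# copied from the grading table and ordered from highest to lowest band.
GRADE_BANDS = [
    (95, "A", 4.0),
    (90, "A-", 3.67),
    (85, "B+", 3.33),
    (80, "B", 3.0),
    (75, "B-", 2.67),
    (70, "C+", 2.33),
    (65, "C", 2.0),
    (60, "C-", 1.67),
    (55, "D+", 1.33),
    (50, "D", 1.0),
    (25, "FX", 0.5),
    (0, "F", 0.0),
]


def letter_grade(points: float) -> Tuple[str, float]:
    """Map a score on the 100-point scale to (letter, four-point value)."""
    if not 0 <= points <= 100:
        raise ValueError("points must be between 0 and 100")
    # The first band whose lower bound does not exceed the score applies.
    for lower_bound, letter, value in GRADE_BANDS:
        if points >= lower_bound:
            return letter, value
    raise AssertionError("unreachable: the last band starts at 0")


# Hypothetical score used only as an example.
print(letter_grade(79))
```

For a score of 79 the lookup lands in the 75-79 band, which the table lists as B- with a numerical value of 2.67.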

Topics of lectures

Introduction to NLP technology
Text pre-processing techniques
Part of speech tagging
Term frequency and weighting
Word vector representation methods in NLP
Feature extraction based on n-grams
Methods for reducing the dimensionality of the feature space
Sentiment analysis using logistic regression
Sentiment Analysis of Texts Using the Naïve Bayes Classifier
Similarity measures and dimensionality reduction in NLP: Euclidean distance, Cosine similarity and PCA
Part of speech tagging
Architecture of the CBOW Model
Neural Networks and Recurrent Models in Text Processing

Key reading

Sunil Patel. Getting Started with Deep Learning for Natural Language Processing, BPB PUBLICATIONS, ISBN: 978-93-89898-11-8, 2021.
Ekaterina Kochmar. Getting Started with Natural Language Processing, Manning Publications Co., ISBN: 9781617296765, 2022
Materials https://www.deeplearning.ai/
Francesco Mosconi. Zero to Deep Learning, 2019
Hobson Lane. Natural Language Processing in Action. 2020
