NLProc-IRTM-B: Information Retrieval and Text Mining (SS25)

Enrolment options

Want to learn how to build an Internet search engine from scratch? Want to learn the fundamentals of natural language processing and textual document processing?

This class is offered as a lecture in a lecture hall; all lectures are recorded and made available as videos. We discuss fundamental data structures for information retrieval, as well as ranking, classification, and clustering of documents. The class also lays the foundation for further natural language processing methods, including natural language understanding and deep learning for natural language processing.

Participation in the lectures is not mandatory, but if you would like to interact, ask questions, and take an active part in discussions, that is very much appreciated. Active participation in the exercises is expected, but likewise not mandatory.

In more detail, the following topics will be part of this class:

• Boolean Retrieval, Term Vocabularies and Postings Lists, Dictionaries and Tolerant Retrieval, Spelling Correction, Index Construction, Compression, Scoring, Ranking, Evaluation, Query Expansion, Probabilistic IR
• Text Classification, Naïve Bayes, MaxEnt Classifier, kNN, Neural Networks, Feature Selection, Vector Space Classification, Document Similarities
• Learning to Rank, Learning to Score
• Flat Clustering, Hierarchical Clustering, Evaluation

Moderator: Joy Kearney
Moderator: Roman Klinger

Semester: Summer Semester 2025
