CS 3308 Information Retrieval
Course Outline
CS 3308: Information Retrieval
Prerequisites:
CS 3303: Data Structures
Course Description:
This course introduces the fundamental concepts of information retrieval (IR) systems, which provide the ability to search for and find specific data or information within a collection. IR technology has many implementations; web search engines such as Google.com, Altavista.com, bing.com, and ask.com are all examples of it applied to content on the World Wide Web.
Required Textbook and Materials:
The main required textbook for this course is listed below, and can be readily accessed using the provided link. There may be additional required/recommended readings, supplemental materials, or other resources and websites necessary for lessons; these will be provided for you in the course’s General Information and Forums area, and throughout the term via the weekly course Unit areas and the Learning Guides.
Manning, C.D., Raghavan, P., & Schütze, H. (2009). An Introduction to Information Retrieval (Online ed.). Cambridge, England: Cambridge University Press. Available at http://nlp.stanford.edu/IR-book/information-retrieval-book.html
Software Requirements/Installation:
This course provides learning experiences that address both the theory and practice of information retrieval systems. Students will learn the fundamental theories of information retrieval and put them into practice by constructing elements of an information retrieval system. Students will be required to construct a parser, an indexer, and a search interface using the Python language.
For these programming assignments you must download and install the appropriate Python interpreter for your computer and operating system. Versions of the software are available for Windows (XP, Vista, Windows 7), Linux distributions, and Mac OS. Most popular Linux distributions either include Python or provide an installation option for it in the software management utility.
You can find available downloads for Python v2.7.x at the following URL: http://www.python.org/download/
Instructions to install and configure Python can be found in the Python setup and usage section of this page.
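To give a sense of what a parser, indexer, and search interface involve, the following is a minimal sketch and not the course's assignment code. The function names (tokenize, build_index, search) and the three sample documents are hypothetical, chosen only for illustration. The sketch splits text into terms, builds an inverted index that maps each term to the documents containing it, and answers a query by Boolean AND.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Parser step: lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Indexer step: map each term to the sorted list of document IDs containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search(index, query):
    """Search step: return IDs of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical sample collection, for illustration only.
docs = {
    1: "Information retrieval systems search a collection.",
    2: "Web search engines apply information retrieval to the web.",
    3: "A crawler collects pages from the web.",
}
index = build_index(docs)
print(search(index, "information retrieval"))  # [1, 2]
print(search(index, "web search"))             # [2]
```

The sketch avoids version-specific syntax, so it should run under Python 2.7.x as well as later interpreters.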
Learning Objectives and Outcomes:
By the end of this course students will be able to:
- Explain fundamental concepts and theories of information retrieval.
- Differentiate between and apply index compression and search effectiveness techniques.
- Compute weights and scores of documents within an IR system.
- Determine the effectiveness of an information retrieval system using a known document corpus.
- Construct a complete information retrieval system.
- Construct a web search system by integrating indexer, search engine, and web crawler (spider) components.
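As an illustration of what computing weights and scores can look like, the sketch below uses one common tf-idf formulation: the score of a document is the sum, over the query terms, of the term's count in that document multiplied by log10(N / df), where N is the number of documents and df is the number of documents containing the term. The function name tf_idf_scores and the sample data are hypothetical, and other weighting variants exist.

```python
import math
from collections import Counter


def tf_idf_scores(docs, query_terms):
    """Score each document by summing tf * idf over the query terms.

    docs maps a document ID to its list of tokens.
    """
    n_docs = len(docs)
    term_counts = {doc_id: Counter(tokens) for doc_id, tokens in docs.items()}
    scores = {}
    for doc_id, counts in term_counts.items():
        score = 0.0
        for term in query_terms:
            # Document frequency: how many documents contain the term.
            df = sum(1 for c in term_counts.values() if term in c)
            if df == 0:
                continue  # a term absent from the collection adds nothing
            idf = math.log10(float(n_docs) / df)
            score += counts[term] * idf
        scores[doc_id] = score
    return scores


# Hypothetical tokenized documents, for illustration only.
sample_docs = {
    1: ["web", "search", "web"],
    2: ["index", "search"],
    3: ["index", "compression"],
}
scores = tf_idf_scores(sample_docs, ["web", "index"])
for doc_id in sorted(scores, key=scores.get, reverse=True):
    print(doc_id, round(scores[doc_id], 3))
```

Document 1 ranks first here because "web" is both frequent in it and rare across the collection, while "index" appears in two of the three documents and so carries less weight.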
Course Schedule and Topics:
This course will cover the following topics in eight learning sessions, with one Unit per week. The Final Exam will take place during Week/Unit 9.
Week 1: Unit 1 – Introduction to IR, Boolean Retrieval, and Terms and Postings (Chapters 1 & 2)
Week 2: Unit 2 – Dictionaries and Index Construction (Chapters 3 & 4)
Week 3: Unit 3 – Index Compression (Chapter 5)
Week 4: Unit 4 – Scoring, Term Weighting, and the Vector Space Model (Chapter 6)
Week 5: Unit 5 – Scoring and Ranking in a Complete Search System (Chapter 7)
Week 6: Unit 6 – Evaluation in Information Retrieval (Chapter 8)
Week 7: Unit 7 – Introduction to Web Search (Chapter 19)
Week 8: Unit 8 – Web Crawling (Chapters 20 & 21)
Week 9: Unit 9 – Course Review and Final Exam