Week 2, Day 4: Thursday, July 20
Synopsis
Week 2, Day 4: Mike Kestemont joins the instructional team to teach bag of words and text processing. Unfortunately, Mike cannot publish the materials used during this institute online because of copyright restrictions, so for now we link to his excellent presentations instead: Authenticity criticism | Stylometry with R
Outcome goals
- Grasping the concept of modeling text as trees and graphs
- Understanding annotation as a way of adding layers to text
- Recognizing the varieties of layered editions
- Discussing the alignment step of the Gothenburg Model (GM) in greater depth
- Understanding, from a computational point of view, why we do near-matching late in the pipeline: for reasons of efficiency
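
The efficiency argument in the last goal can be illustrated with a minimal Python sketch. It is not the institute's material or any collation tool's actual implementation: the function names, the 0.7 threshold, the use of `difflib.SequenceMatcher` as the similarity measure, and the sample witnesses are all assumptions chosen for illustration. It also ignores token order and repeated tokens, which a real alignment step would have to handle. The idea shown is only that cheap exact matching runs first, so the expensive fuzzy comparison is applied to the few tokens that remain unmatched.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0..1 similarity ratio between two strings."""
    return SequenceMatcher(None, a, b).ratio()


def match_vocabulary(tokens_a, tokens_b, threshold=0.7):
    """Pair distinct tokens of A with tokens of B: exact first, near last.

    Hypothetical helper for illustration; token positions are ignored.
    """
    remaining = set(tokens_b)
    matches = {}
    leftovers = []

    # Early, cheap step: exact matching is a set lookup per token.
    for token in dict.fromkeys(tokens_a):  # distinct tokens, original order
        if token in remaining:
            matches[token] = (token, "exact")
            remaining.discard(token)
        else:
            leftovers.append(token)

    # Late, expensive step: fuzzy comparison only for what is still unmatched.
    comparisons = 0
    for token in leftovers:
        best, best_score = None, 0.0
        for candidate in sorted(remaining):
            comparisons += 1
            score = similarity(token, candidate)
            if score > best_score:
                best, best_score = candidate, score
        if best is not None and best_score >= threshold:
            matches[token] = (best, "near")
            remaining.discard(best)

    return matches, comparisons


witness_a = ["the", "gray", "koala", "sleeps"]
witness_b = ["the", "grey", "koala", "sleeps"]
matches, comparisons = match_vocabulary(witness_a, witness_b)
print(matches)
print("fuzzy comparisons:", comparisons)
```

With these two sample witnesses, three tokens are settled by exact matching and only one fuzzy comparison is needed ("gray" against "grey"), whereas comparing every token of one witness fuzzily with every token of the other would take 16.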
Legend
- Presentation: by instructors
- Discussion: instructors and participants
- Talk lab: participants discuss or plan in small groups
- Code lab: participants code alone or in small groups
9:00–10:30: Text analytics 1
Text analytics 1
Time | Topic | Type |
---|---|---|
15 min | Bag of words | Presentation |
30 min | Text processing | Presentation |
15 min | Text as tables | Code lab |
30 min | Query the tables | Code lab |
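
As a rough illustration of the topics in this session, the sketch below builds a bag of words for two tiny documents, lays the counts out as a table, and runs one query against it. It is not taken from the session materials: the sample documents, the simple lowercase-letters tokenizer, and the names `bag_of_words`, `table`, and `docs_with_dog` are assumptions made for the example.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Count word occurrences, ignoring case, punctuation, and word order."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)


# Hypothetical sample documents.
docs = {
    "doc1": "The cat sat on the mat.",
    "doc2": "The dog sat.",
}

bags = {name: bag_of_words(text) for name, text in docs.items()}

# Text as a table: one row per document, one column per vocabulary word.
vocabulary = sorted(set().union(*bags.values()))
table = {name: [bags[name][word] for word in vocabulary] for name in docs}

# Query the table: which documents contain the word "dog"?
dog_column = vocabulary.index("dog")
docs_with_dog = [name for name, row in table.items() if row[dog_column] > 0]

print(vocabulary)
for name, row in table.items():
    print(name, row)
print(docs_with_dog)
```

A missing word simply counts as zero, because a `Counter` returns 0 for keys it has not seen; that is what lets every row share the same columns.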
10:30–11:00: Coffee break
11:00–12:30: Text analytics 1 (cont.)
Time | Topic | Type |
---|---|---|
90 min | Bag of words, text processing, text as tables, query the tables (continued) | Code lab |
12:30–2:00: Lunch
2:00–3:30: Modeling: annotations as layers to the text
Modeling: annotations as layers to the text
Time | Topic | Type |
---|---|---|
15 min | [Review of tokenization, normalization, and collation from the point of view of annotations] | Discussion |
15 min | Envisioning your edition as a layered model | Talk lab |
15 min | Existing models (e.g. computational linguistics) | Presentation |
15 min | Hands-on: identify your own layers | Talk lab |
30 min | Hands-on: model your edition’s pipeline | Code lab |
3:30–4:00: Coffee break
4:00–5:30: Collation 2
Collation 2
Time | Topic | Type |
---|---|---|
15 min | Advanced collation: Alignment in the Gothenburg Model | Presentation |
45 min | Near-matching: theory (as a step in the computational pipeline) | Code lab |
30 min | Review | Talk lab |
We’ll end each day with a request for feedback, based on a general version of the day’s outcome goals, and we’ll try to adapt on the fly to your responses. Please complete the Week 2, Day 4 feedback (just copy and paste it into a plain-text document) and email your response to Kaylen at kaylensanders@pitt.edu with the subject line “Week 2, Day 4 feedback”.