Week 2, Day 3: Wednesday, July 19
Synopsis
Week 2, Day 3 expands upon the idea of digital editions as text processing
pipelines. After a short recap of Day 2, we continue with the normalization step. We
will show how tokenization and normalization, as two pipeline stages, prepare texts for
automated collation. We also discuss automated collation from a modeling perspective, using the Gothenburg Model.
Participants learn that their research goals and questions shape the design of their
computational pipelines.
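As a rough illustration of how tokenization and normalization feed into alignment, the following minimal sketch compares two short witnesses. It is not the tooling used in the sessions: the function names `tokenize`, `normalize`, and `align` are hypothetical, lowercasing stands in for whatever normalization a given research question calls for, and Python's standard-library `difflib.SequenceMatcher` stands in for a dedicated collation engine.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    """Split a witness into word tokens, keeping punctuation as separate tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def normalize(token):
    """Lowercase a token (an assumed, deliberately simple normalization)."""
    return token.lower()


def align(witness_a, witness_b):
    """Align two witnesses on their normalized tokens, reporting the original tokens."""
    tokens_a = tokenize(witness_a)
    tokens_b = tokenize(witness_b)
    norm_a = [normalize(t) for t in tokens_a]
    norm_b = [normalize(t) for t in tokens_b]
    # Comparison happens on the normalized forms only.
    matcher = SequenceMatcher(a=norm_a, b=norm_b, autojunk=False)
    rows = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # The output keeps the original spellings so no information is lost.
        rows.append((tag, tokens_a[i1:i2], tokens_b[j1:j2]))
    return rows


rows = align("The Black cat sat.", "The black dog sat.")
for tag, a, b in rows:
    print(f"{tag:8} {a} {b}")
```

Because the comparison runs on normalized tokens, "Black" and "black" are treated as a match, while "cat" and "dog" are reported as a variant. Changing `normalize` changes what counts as a variant, which is one way research goals shape the pipeline.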
Outcome goals
- Understanding the principles of basic text transformations like normalization and how they serve different objectives
- Bringing together tokenization and normalization as individual pipeline steps and seeing how they can be implemented in the act of collation
- Normalizing, tokenizing, and collating text
- Fundamentals of Text as Graph (TAG): the hypergraph model
- Modeling discontinuity
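To make the last two goals more concrete, here is a toy sketch of the general idea of a hypergraph for text: markup is treated as a hyperedge that can connect any set of text nodes, so a discontinuous element does not have to be split in two. This is an assumption-laden illustration only, not the TAG data model or its implementation; the class `Hypergraph` and its methods are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Hypergraph:
    """Toy model: text nodes in document order, markup as sets of node indices."""
    text_nodes: list = field(default_factory=list)
    markup: dict = field(default_factory=dict)

    def add_text(self, content):
        """Append a text node and return its index."""
        self.text_nodes.append(content)
        return len(self.text_nodes) - 1

    def add_markup(self, label, node_ids):
        """Attach a markup hyperedge to any set of text nodes."""
        self.markup[label] = frozenset(node_ids)

    def text_of(self, label):
        """Concatenate the text covered by a markup hyperedge, in document order."""
        return "".join(self.text_nodes[i] for i in sorted(self.markup[label]))

    def is_discontinuous(self, label):
        """True if the markup skips over at least one text node."""
        ids = sorted(self.markup[label])
        return any(b - a > 1 for a, b in zip(ids, ids[1:]))


graph = Hypergraph()
first = graph.add_text("I never, ")
middle = graph.add_text("she said, ")
last = graph.add_text("saw it.")

# The quotation is interrupted by the narrator but remains one markup element.
graph.add_markup("quote", {first, last})
graph.add_markup("sentence", {first, middle, last})

print(graph.text_of("quote"))
print(graph.is_discontinuous("quote"), graph.is_discontinuous("sentence"))
```

In a strictly hierarchical encoding the interrupted quotation would typically need two elements linked by attributes; here a single hyperedge covers both parts.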
Legend
- Presentation: by instructors
- Discussion: instructors and participants
- Talk lab: participants discuss or plan in small groups
- Code lab: participants code alone or in small groups
9:00–10:30: Normalization
10:30–11:00: Coffee break
Time | Topic | Type
15 min | Modeling and collation | Presentation
15 min | Collation within editorial theory | Talk lab
30 min | Collation practice | Code lab
30 min | Tokenization and normalization for collation purposes | Code lab
12:30–2:00: Lunch
Time | Topic | Type
90 min | Challenging textual phenomena: Introducing Text as Graph (TAG) | Presentation
3:30–4:00: Coffee break
4:00–5:30: Review
Time | Topic | Type
90 min | Review | Talk lab
We’ll end each day with a request for feedback, based on a general version of the day’s outcome goals, and we’ll try to adapt on the fly to your responses. Please complete Week 2, Day 3 feedback (just copy and paste it into a plain-text document) and email your response to Kaylen at kaylensanders@pitt.edu with the subject heading “Week 2, Day 3 feedback”.