GIGA Event
Data Management Workshop
12 September 2018
10:00 a.m. (CEST)
Empirical social scientists spend a large share of their time preparing and managing data for analysis. Whether you are a qualitative scholar who wants to analyze text data or a quantitative researcher who aims to run a regression analysis, much of your work goes into getting the data into shape first.
This workshop will help participants solve thorny data management problems. It introduces general guidelines for good data management and enables participants to get started quickly on their own data management tasks. Participants are encouraged to bring their own data management challenges and work on them during the workshop.
The workshop will be problem-oriented and offer solutions to frequent issues such as non-matching ID variables, duplicate data entries, selecting relevant cases, lagging variables, aggregating data across hierarchies, and harmonizing different data structures and variable types. In doing so, it will draw on typical conflict data sets such as Correlates of War, the UCDP/PRIO Armed Conflict Dataset, the Ethnic Power Relations data, and others.
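To give a flavor of two of the issues listed above, the following is a minimal sketch in base R of removing duplicate entries and lagging a variable within units. The data frame `panel` and its columns `country`, `year`, and `gdp` are made up for illustration and are not taken from the workshop materials or from any of the data sets named above.

```r
# Hypothetical country-year panel; all names and values are illustrative only.
panel <- data.frame(
  country = c("A", "A", "A", "A", "B", "B"),
  year    = c(2000, 2001, 2001, 2002, 2000, 2001),
  gdp     = c(10, 11, 11, 12, 20, 21)
)

# Drop duplicate entries, defined here as repeated country-year combinations.
panel <- panel[!duplicated(panel[, c("country", "year")]), ]

# Sort so that lagging within each country follows the time order.
panel <- panel[order(panel$country, panel$year), ]

# Lag gdp by one observation within each country; the first observation
# of each country has no predecessor and becomes NA.
# Assumption: years are consecutive. With gaps, this lags by row, not by year.
panel$gdp_lag <- ave(panel$gdp, panel$country,
                     FUN = function(x) c(NA, head(x, -1)))
```

The lag is computed per country so that the last value of one country never spills over into the first row of the next.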
Participants will receive problem sets and attempt to solve them in teams using the statistical computing language R. Example code will be provided by the workshop organizers. The workshop will focus equally on conceptual issues of data management (for example, the steps required to merge two datasets at different levels of aggregation) and on their technical implementation in R.
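As an illustration of the merging example mentioned above, one common approach is to first aggregate the finer-grained data to the coarser level and then merge on the shared ID variables. The sketch below uses base R only; the data frames `groups` and `countries` and all of their columns are hypothetical and do not represent the organizers' example code.

```r
# Hypothetical group-year data (finer level); values are illustrative only.
groups <- data.frame(
  country  = c("A", "A", "B", "B", "B"),
  year     = c(2000, 2000, 2000, 2000, 2001),
  group    = c("g1", "g2", "g3", "g4", "g3"),
  excluded = c(1, 0, 1, 1, 0)
)

# Hypothetical country-year data (coarser level).
countries <- data.frame(
  country  = c("A", "B", "B"),
  year     = c(2000, 2000, 2001),
  conflict = c(0, 1, 0)
)

# Step 1: aggregate the group-level variable up to the country-year level.
n_excluded <- aggregate(excluded ~ country + year, data = groups, FUN = sum)

# Step 2: merge on the shared ID variables. all.x = TRUE keeps country-years
# that have no match in the aggregated data (their values become NA).
merged <- merge(countries, n_excluded, by = c("country", "year"), all.x = TRUE)
```

Checking the number of rows before and after the merge is a quick way to catch non-matching IDs or unintended duplication.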
Participants who have no background in R are welcome to join the workshop, as they will benefit from the discussion of conceptual issues. Moreover, they will be paired with more experienced R users and receive hands-on instruction in managing data. The organizers also offer a one-hour drop-in session before the workshop begins to help participants set up R and to give a short introduction.
Why R? We rely on the statistical computing software R because it is freely available, enables a transparent workflow, and has a large and supportive community.
The number of workshop spaces is limited to 20 for logistical reasons, and spaces will be filled on a first-come, first-served basis. We will inform you in a separate email whether you can participate or have been placed on the waiting list.
ORGANIZERS: Nils-Christian Bormann (Exeter), Sabine Otto (Uppsala), Sebastian Schutte (Konstanz)
CONTACT: Nils-Christian Bormann ([email protected])
TIME TABLE:
10:00-11:00 Drop-in session to set up R and short introduction
11:00-12:30 Morning session: basic issues of data import, setup, and merging
12:30-14:00 Lunch break and time for individual data issues
14:00-15:30 Afternoon session 1: Data hierarchies, variable transformation
15:30-16:00 Coffee break
16:00-17:00 Afternoon session 2: Different data types (text and event data)
Hamburg
English