Notes from Jason
- Humanities perspective and a technical perspective
- Wanting to bring the linked database (connecting multiple databases together)
- Streamline the Marriage stuff.
- Lay out the humanities research question and the technical research questions. What are the exact things you need the grant to do?
- Productive way of combining the databases to answer the questions we want to answer, then we’d be in a position to show how the smaller (joint-db) part fits into the bigger research questions with the marriage project.
- Practice of polygamy and its origins, and scholarly history
- One component of that project is this piece…
- Need some sense of what this metadata looks like, and of the ways people have gotten around crosswalking data
- Want the reviewers to be captivated by:
- Humanities interesting history stuff
- Very pragmatic nature of the need (ex: differing datasets and it’s hard to coordinate these together and this will help)
- Should include both the humanities and technical aspects of the previous works
- Talk about doing: interface, schema to join the disparate datasets
- This is allowed and encouraged to be experimental (ex: We think this will be better, and here’s how we’re going to design and test)
- Stage out the workplan so that they can see how you’re putting it together and going to move forward (piece by piece)
- They streamlined and expanded what’s allowed. Removed a lot of emphasis on innovation, but added a framework and require a way to disseminate the knowledge learned (any form is allowed as an outcome, however you feel is best to disseminate that knowledge: scripts, samples, data, schema, etc).
- At this funding level (40-70k), no one expects you to have a formal, polished tool. Have a realistic set of expectations as to what the outputs are
- “Here’s how we better understand the relationships between these partners…”
- “This is how one dataset represents a relationship, and here’s how another one does it.” Go on to understand how they are different and linked better, and we can then better understand how those relationships are detailed.
- People are interested in interesting arguments and details
- Just need to get into enough detail that people are captivated by it, without tripping over it or being overwhelmed by it.
- This kind of data curation component is the least sexy thing of this kind of scholarship, but it’s the most important. This is a common problem that’s rarely addressed but commonly raised.
- Need environmental scan (what’s been currently done and what’s out there)
What Luther Suggests
- Prototype (initial)
- Match table
- (DB, table, row) matches to (DB, table, row), metadata: who said they were the same, when, and notes about the match
- Need to know how to identify a thing in a generalizable form (db entry, xml entry, XLS entry, etc)
- Lots of idiosyncratic things
- Need somewhere to store and host the database (backend)
- API
- Into the new database
- Into all the other databases that are linked from this one
- User Interface (to the back end)
- Edit and query interfaces
- Good query language (cross-database) for humanities researchers
- Where to go from here:
- Populate
- Programmatic, human, from logs
- We have the BYU to UVA connector
- Cross-database stats and visualizations
- Consistency checks
- Do all the sources and matches actually agree?
- Do all the assertions check out?
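
Rough sketch of the match table and one consistency check, to make the idea concrete. It is only an illustration: SQLite, the table and column names, and the functions open_store, add_match, and conflicting_matches are all assumptions, not anything decided in the meeting. Row identifiers are stored as text so that a database key, an XML id, or a spreadsheet cell reference can all fit in the same column.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption.
# One row records that (src_db, src_table, src_row) is the same thing as
# (dst_db, dst_table, dst_row), plus who said so, when, and any notes.
SCHEMA = """
CREATE TABLE match (
    id          INTEGER PRIMARY KEY,
    src_db      TEXT NOT NULL,
    src_table   TEXT NOT NULL,
    src_row     TEXT NOT NULL,
    dst_db      TEXT NOT NULL,
    dst_table   TEXT NOT NULL,
    dst_row     TEXT NOT NULL,
    asserted_by TEXT NOT NULL,
    asserted_at TEXT NOT NULL,
    notes       TEXT
);
"""


def open_store(path=":memory:"):
    """Create a new match store (in memory by default) with the schema above."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_match(conn, src, dst, asserted_by, asserted_at, notes=""):
    """Record one match; src and dst are (db, table, row) triples."""
    conn.execute(
        "INSERT INTO match (src_db, src_table, src_row, "
        "dst_db, dst_table, dst_row, asserted_by, asserted_at, notes) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (*src, *dst, asserted_by, asserted_at, notes),
    )
    conn.commit()


def conflicting_matches(conn, target_db):
    """Find source records matched to more than one record in target_db.

    Returns a dict mapping each (db, table, row) source triple to the set of
    (table, row) targets it has been matched to, keeping only disagreements.
    """
    targets = {}
    query = (
        "SELECT src_db, src_table, src_row, dst_table, dst_row "
        "FROM match WHERE dst_db = ?"
    )
    for row in conn.execute(query, (target_db,)):
        targets.setdefault(row[:3], set()).add(row[3:])
    return {src: dsts for src, dsts in targets.items() if len(dsts) > 1}
```

Example use, with made-up table names and row ids: after two people match ("byu", "person", "12") to different rows in "uva", conflicting_matches(conn, "uva") returns that source triple with both targets, which is the kind of "do all the sources and matches actually agree?" check listed above.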