Discuss
- Something else happened with the Brown import. Zina Prescendia Young 54895 was overwritten by Abigail Abbot in the database. She is also listed as Male in the DB.
- Meeting with Kathleen tomorrow at 1pm at IATH. Anything I should bring up?
To Do
- Continue thinking about equivalence classes of identities
- Write things down, type them out
- What are some other examples?
- Circuses
- Marriages
- Citation graph: equivalence knob = institution level
- With equivalence classes as institutions (corps or schools), the classes are evolving nodes: people come and go from an institution, and cite both within it and across institutions
- With equivalence classes as departments (within institutions), we can see how the smaller units inter-cite and interact as people move and cite.
- There are some interesting metrics to be had here, such as which departments/institutions are the most similar across time (cosine similarity)
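A rough sketch of the similarity metric above, assuming each department (or institution) in a time slice is represented as a vector of citation counts toward other units. The department names and counts are made up for illustration, and `similarity_over_time` is a hypothetical helper, not anything from an existing codebase.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a unit with no citations is treated as similar to nothing
    return dot / (norm_u * norm_v)

# Fabricated example data: citations[year][unit] maps cited unit -> count.
citations = {
    2001: {
        "CS": {"Math": 4, "Stats": 2},
        "Stats": {"Math": 2, "CS": 1},
    },
    2002: {
        "CS": {"Math": 8, "Stats": 4},
        "Stats": {"Math": 1, "CS": 5},
    },
}

def similarity_over_time(citations, a, b):
    """Cosine similarity between units a and b in each time slice."""
    return {
        year: cosine_similarity(units.get(a, {}), units.get(b, {}))
        for year, units in sorted(citations.items())
    }

if __name__ == "__main__":
    for year, sim in similarity_over_time(citations, "CS", "Stats").items():
        print(year, round(sim, 3))
```

Turning the "equivalence knob" from institution to department would only change how the rows of `citations` are aggregated; the comparison itself stays the same.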
- Enron data: departments
- Others? Can be real or fabricated.
- SNAC data?
- What happens when these things don’t define equivalence classes?
- With binary marriages as nodes, and the EC collecting marriages to a particular man (or woman), it’s easy to see this is an equivalence class
- As time goes on, is it still an equivalence class? Worthy seemed to not want to define it as the same class. Its identity (definition = marriages to person X) stays the same across time, but its content (marriages included in the class) changes as time goes on. What can we gather from this?
- What about the citation graph, when people can be in multiple institutions (affiliated with UVA and Google, for example)? The institutions aren’t really equivalence classes because they overlap some (without being equal). Do we define it in a different way here? Do we relax some of the constraints of equivalence classes?
- What about special cases when we’re looking at binary relations to build our equivalence class, and we have issues with transitivity. (I didn’t completely follow the example). But say we have A <--> B <--> C and they are an equivalence class if we look at them in the right order, A <--> B and B <--> C, but if we look at them differently, say A ??? C, they don’t appear to be in an equivalence class together.
- I took from this (which Worthy suggested isn’t the right meaning) to consider if we collapse all nodes (binary relations) down to their equivalence classes and relations between classes, then consider how those change over time as opposed to doing the computation on the original graph over time, and computing the equivalence classes at each step from the graph itself.
- That is, doing the simpler EC only computation vs the more complex overall-graph computations and overlaying the EC boundaries after computing
- Overlay the EC boundaries and then compute vs compute and then overlay the EC boundaries
- There are interesting computational questions here as well
- The important point is to note, I think, how we are constructing and defining our equivalence classes (or whatever we call them). If we construct them based on pre-defined binary relations, we need to be careful to define them properly or to relax the transitivity property to ensure we capture the classes we’re intending to capture.
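A small sketch of the issues raised above, under my own reading of what "building classes from binary relations" means. It shows three things: closing the given pairs under transitivity with union-find (so A <--> B and B <--> C always end up in one class), listing where the raw relation is not transitive on its own (the A ??? C case), and checking whether a grouping with overlaps, such as dual affiliations, is a partition at all. All function names and data are illustrative assumptions.

```python
from itertools import combinations

def equivalence_classes(items, pairs):
    """Group items by the transitive closure of the given pairs (union-find)."""
    parent = {x: x for x in items}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    classes = {}
    for x in items:
        classes.setdefault(find(x), set()).add(x)
    return sorted(sorted(c) for c in classes.values())

def transitivity_violations(pairs):
    """Pairs (a, c) that share a neighbor b but are not directly related."""
    related = {frozenset(p) for p in pairs}
    neighbors = {}
    for a, b in pairs:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    missing = set()
    for linked in neighbors.values():
        for a, c in combinations(sorted(linked), 2):
            if frozenset((a, c)) not in related:
                missing.add((a, c))
    return sorted(missing)

def is_partition(items, groups):
    """True if every item belongs to exactly one group (no overlap, no gaps)."""
    counts = {x: 0 for x in items}
    for members in groups.values():
        for x in members:
            counts[x] = counts.get(x, 0) + 1
    return all(n == 1 for n in counts.values())

if __name__ == "__main__":
    pairs = [("A", "B"), ("B", "C")]
    print(equivalence_classes(["A", "B", "C", "D"], pairs))
    print(transitivity_violations(pairs))
    # Made-up affiliations: one person in two institutions breaks the partition.
    print(is_partition(["ann", "bob"], {"UVA": {"ann", "bob"}, "Google": {"bob"}}))
```

If `transitivity_violations` returns anything, the classes produced by the closure are larger than what the raw relation states directly, which is exactly where the definition needs care or a relaxed notion of class.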
- Continue reading papers, make notes, be ready to share on Tuesday afternoon
- Start building these networks to do some of these metrics, specifically betweenness and closeness centrality
- What are the computational challenges along the way?
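To get a feel for the two metrics named above and their cost, here is a dependency-free sketch for an undirected, unweighted graph. Closeness is computed within each node's reachable component, and betweenness uses Brandes' algorithm and is left unnormalized. A graph library such as NetworkX provides both metrics, so this is only meant to expose where the work goes; the edge list is a toy example.

```python
from collections import deque

def build_adjacency(edges):
    """Undirected adjacency sets from a list of (a, b) edges."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def closeness_centrality(adj):
    """(number of reachable nodes) / (sum of BFS distances), 0 if isolated."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result

def betweenness_centrality(adj):
    """Unnormalized betweenness via Brandes' algorithm (one BFS per node)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # nodes in non-decreasing distance from s
        preds = {v: [] for v in adj}     # shortest-path predecessors
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):        # accumulate dependencies back toward s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected shortest path was counted once from each endpoint.
    return {v: c / 2 for v, c in score.items()}

if __name__ == "__main__":
    adj = build_adjacency([("A", "B"), ("B", "C")])
    print(closeness_centrality(adj))
    print(betweenness_centrality(adj))
```

Both functions run one breadth-first search per node, so the cost grows roughly with nodes times edges. Recomputing them at every time step of an evolving graph is one of the computational challenges worth writing down.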
- Start writing!
- Force-directed graph ported from SNAC over to the Mormon DB. Can I just use a SQL API, as I have been, to create the nodes/edges the way I have been using Neo4j for SNAC?
- Don’t get distracted too much by the data entry or Mormon example specifically. Go for dissertation!