egor2 format specification
The purpose of this page is to hold and discuss the current detailed specification for the relational egor's internal representation.
Note that the table of contents does not update automatically. To refresh it, go to https://tableofcontents.herokuapp.com/, paste in the contents of the page that follow the table, then replace the table with the output. (You might also need to delete the first four spaces on each line.)
An egor object is a subclass of list with certain special-purpose elements and additional attributes.
The egos table is a srvyr object containing information about how the egos were selected. Its columns contain ego attributes.
.egoID
integer: a unique identifier for each ego. Always the last column.
The alts table is a tibble containing the alter data, with columns containing alter attributes or attributes of the ego-alter relation.
.egoID
integer: an identifier of the ego that nominated that alter. Joins with egos$.egoID. Always the penultimate column.
.altID
integer: a unique (within a given .egoID) identifier for each alter. Always the last column.
The aaties table is a tibble containing the alter-alter tie data, with columns containing attributes of the alter-alter relation.
.egoID
integer: an identifier of the ego that nominated both alters in the tie. Joins with egos$.egoID and alts$.egoID. Always the third-to-last column.
.srcID
, .tgtID
integer: identifiers of the two alts whose relation is being stored. Joins with alts$.altID
. Always the last column.
The alter design is a list containing information about how the data about alts were collected. Currently, this includes:
- max (required): maximum number of alters an ego was allowed to nominate. Set to +Inf if no limit.
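The layout described above can be pictured with a small toy example. This is only an illustrative sketch in Python using plain lists of dicts; the real object is an R list holding a srvyr object and tibbles. The data values, the `alter_design` key name, and the helper `alters_of_tie` are all assumptions made up for the illustration.

```python
import math

# Hypothetical toy data. Special columns come last in each table.
egos = [
    {"age": 34, ".egoID": 1},
    {"age": 51, ".egoID": 2},
]
alts = [
    {"sex": "f", ".egoID": 1, ".altID": 1},
    {"sex": "m", ".egoID": 1, ".altID": 2},
    {"sex": "f", ".egoID": 2, ".altID": 1},
]
aaties = [
    {"weight": 0.5, ".egoID": 1, ".srcID": 1, ".tgtID": 2},
]
alter_design = {"max": math.inf}  # assumed name; +Inf means no nomination limit


def alters_of_tie(tie, alts):
    """Look up the two alter rows a tie refers to, keyed on (.egoID, .altID)."""
    index = {(a[".egoID"], a[".altID"]): a for a in alts}
    src = index[(tie[".egoID"], tie[".srcID"])]
    tgt = index[(tie[".egoID"], tie[".tgtID"])]
    return src, tgt


src, tgt = alters_of_tie(aaties[0], alts)
```

Because .altID is unique only within an ego, the lookup has to use the pair (.egoID, .altID) rather than .altID alone.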
Since the special columns are meant to be keys for joining the tables, accessors and modifiers must preserve certain invariants:
- No two egos rows may have the same .egoID: any operations that duplicate ego rows must also create new ego IDs.
- No two alts rows may have the same (.egoID, .altID) combination: any operations that duplicate ego rows must also create new ego IDs and copy their alters.
- No two aaties rows may have the same (.egoID, .srcID, .tgtID) combination: any operations that duplicate ego or alter rows must also create new ego IDs and copy their alters.
- Special columns must always be the last columns in their respective tibble. Transformation and subsetting methods must resist attempts to remove or reorder them.
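As a rough illustration of what checking these invariants involves, here is a minimal sketch in Python over plain lists of dicts. It is not egor's implementation (egor is an R package); the function name `check_invariants` and the toy rows are hypothetical.

```python
def check_invariants(egos, alts, aaties):
    """Return a list of violation messages; an empty list means all checks passed."""
    problems = []

    def check_keys(name, rows, keys):
        seen = set()
        for row in rows:
            # Special columns must be the trailing columns, in the given order.
            if list(row)[-len(keys):] != list(keys):
                problems.append(f"{name}: special columns are not last")
            # The key combination must be unique within the table.
            key = tuple(row.get(k) for k in keys)
            if key in seen:
                problems.append(f"{name}: duplicate key {key}")
            seen.add(key)

    check_keys("egos", egos, (".egoID",))
    check_keys("alts", alts, (".egoID", ".altID"))
    check_keys("aaties", aaties, (".egoID", ".srcID", ".tgtID"))
    return problems


# Toy input with two deliberate violations.
bad_egos = [{"age": 34, ".egoID": 1}, {"age": 51, ".egoID": 1}]  # duplicated ego ID
bad_alts = [{".egoID": 1, "sex": "f", ".altID": 1}]               # .egoID not penultimate
problems = check_invariants(bad_egos, bad_alts, [])
```

A check like this only detects violations. The requirement that duplicating operations create new ego IDs and copy alters would have to be handled inside each accessor or modifier.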
In general, the end-user should not have persistent data columns whose names begin with a dot (.). This helps ensure that data columns do not accidentally mask variables under non-standard evaluation, which subset.egor() uses. When using placeholder or ephemeral variables, the user should also be aware that the following names have been reserved for egor's use:
- .egoID, .altID, .srcID, .tgtID, .egoRow, .altRow, .srcRow, .tgtRow
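A naming check along these lines could flag problematic column names before they reach the tables. The following is a hedged sketch in Python rather than anything egor provides; `classify_column_names` is a hypothetical helper, and only the reserved names themselves come from the list above.

```python
RESERVED = {
    ".egoID", ".altID", ".srcID", ".tgtID",
    ".egoRow", ".altRow", ".srcRow", ".tgtRow",
}


def classify_column_names(names):
    """Split names into reserved, other dot-prefixed, and ordinary names."""
    reserved = [n for n in names if n in RESERVED]
    dotted = [n for n in names if n.startswith(".") and n not in RESERVED]
    ordinary = [n for n in names if not n.startswith(".")]
    return reserved, dotted, ordinary


reserved, dotted, ordinary = classify_column_names(["age", ".tmp", ".altRow"])
```

Here `.altRow` collides with a reserved name, `.tmp` is dot-prefixed and so acceptable only as an ephemeral variable, and `age` is an ordinary data column.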
- Should alter design be an egor list element or an attribute?
- To implement tidygraph-style semantics, should the currently activated attribute be a list element or an attribute?
- Are the invariants too strict?
- Should the user be able to manually specify the .egoID, .altID, etc., and should they be allowed to be characters as well?