Difference between revisions of "GEPS 047: Select data from SQL API"
(→Basic Usage) |
|||
| Line 56: | Line 56: | ||
You can access nested attributes using dot notation: | You can access nested attributes using dot notation: | ||
| − | + | <pre> | |
# Get surnames | # Get surnames | ||
surnames = list(db.select_from_person(what="person.primary_name.surname_list[0].surname")) | surnames = list(db.select_from_person(what="person.primary_name.surname_list[0].surname")) | ||
| Line 65: | Line 65: | ||
# Get primary names | # Get primary names | ||
names = list(db.select_from_person(what="person.primary_name.first_name")) | names = list(db.select_from_person(what="person.primary_name.first_name")) | ||
| − | + | </pre> | |
=== Extracting from Arrays (List Comprehensions) === | === Extracting from Arrays (List Comprehensions) === | ||
| Line 71: | Line 71: | ||
You can extract values from arrays using list comprehensions: | You can extract values from arrays using list comprehensions: | ||
| − | + | <pre> | |
# Extract all role values from event_ref_list | # Extract all role values from event_ref_list | ||
role_values = list( | role_values = list( | ||
| Line 92: | Line 92: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
| − | + | '''Note''': List comprehensions in `what` return one row per matching array element. If a person has 3 matching event_refs, you'll get 3 rows. | |
=== Array Expansion Pattern === | === Array Expansion Pattern === | ||
| Line 100: | Line 100: | ||
For simpler cases, you can use the array expansion pattern: | For simpler cases, you can use the array expansion pattern: | ||
| − | + | <pre> | |
# Extract role.value from all event_refs | # Extract role.value from all event_refs | ||
role_values = list( | role_values = list( | ||
| Line 108: | Line 108: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
You can also include person-level fields along with array element fields: | You can also include person-level fields along with array element fields: | ||
| − | + | <pre> | |
# Get both person.handle and role.value for each event_ref | # Get both person.handle and role.value for each event_ref | ||
# Use a list for multiple fields | # Use a list for multiple fields | ||
| Line 122: | Line 122: | ||
) | ) | ||
# Returns tuples: (handle, role_value) | # Returns tuples: (handle, role_value) | ||
| − | + | </pre> | |
This is equivalent to the list comprehension but uses a different syntax. When you include `person.handle`, each row will show which person the array element belongs to. | This is equivalent to the list comprehension but uses a different syntax. When you include `person.handle`, each row will show which person the array element belongs to. | ||
| Line 134: | Line 134: | ||
Simply reference other tables using their lowercase names (e.g., `family`, `person`, `event`): | Simply reference other tables using their lowercase names (e.g., `family`, `person`, `event`): | ||
| − | + | <pre> | |
# Join person with family - get person and family handles where person is the father | # Join person with family - get person and family handles where person is the father | ||
results = list( | results = list( | ||
| Line 142: | Line 142: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
=== JOIN with Additional Conditions === | === JOIN with Additional Conditions === | ||
| Line 148: | Line 148: | ||
You can combine JOIN conditions with other filters: | You can combine JOIN conditions with other filters: | ||
| − | + | <pre> | |
# Get person and family where person is father and family type is married | # Get person and family where person is father and family type is married | ||
results = list( | results = list( | ||
| Line 156: | Line 156: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
=== Reverse Direction (from Family to Person) === | === Reverse Direction (from Family to Person) === | ||
| Line 162: | Line 162: | ||
You can query from any table and JOIN to others: | You can query from any table and JOIN to others: | ||
| − | + | <pre> | |
# Get family and father's information | # Get family and father's information | ||
results = list( | results = list( | ||
| Line 170: | Line 170: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
=== JOIN with Mother Relationship === | === JOIN with Mother Relationship === | ||
| − | + | <pre> | |
# Get person and family where person is the mother | # Get person and family where person is the mother | ||
results = list( | results = list( | ||
| Line 182: | Line 182: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
=== JOIN with Multiple Conditions === | === JOIN with Multiple Conditions === | ||
| − | + | <pre> | |
# Get person and family with conditions on both tables | # Get person and family with conditions on both tables | ||
results = list( | results = list( | ||
| Line 194: | Line 194: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
=== JOIN with ORDER BY === | === JOIN with ORDER BY === | ||
| − | + | <pre> | |
# Join and sort results | # Join and sort results | ||
results = list( | results = list( | ||
| Line 207: | Line 207: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
=== JOIN with Variable-Index Array Access === | === JOIN with Variable-Index Array Access === | ||
| Line 213: | Line 213: | ||
You can join tables using variable-index array access, which is useful for joining through array references: | You can join tables using variable-index array access, which is useful for joining through array references: | ||
| − | + | <pre> | |
# Join person to event through the birth event reference | # Join person to event through the birth event reference | ||
# This uses person.birth_ref_index to find the birth event in event_ref_list | # This uses person.birth_ref_index to find the birth event in event_ref_list | ||
| Line 222: | Line 222: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
This is particularly useful for joining to related records through indexed array references. The system automatically handles the variable index (like `person.birth_ref_index`) and generates the appropriate SQL subquery. | This is particularly useful for joining to related records through indexed array references. The system automatically handles the variable index (like `person.birth_ref_index`) and generates the appropriate SQL subquery. | ||
| Line 230: | Line 230: | ||
You can combine variable-index array access joins with additional filters: | You can combine variable-index array access joins with additional filters: | ||
| − | + | <pre> | |
from gramps.gen.lib import EventType | from gramps.gen.lib import EventType | ||
| Line 240: | Line 240: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
This finds all persons with birth events that are actually marked as BIRTH type events. | This finds all persons with birth events that are actually marked as BIRTH type events. | ||
| Line 246: | Line 246: | ||
=== How JOIN Detection Works === | === How JOIN Detection Works === | ||
| − | + | * '''Table References''': When you use `family.handle`, `person.handle`, etc., the system detects these as table references | |
| − | + | * '''Automatic JOIN Generation''': JOIN clauses are automatically added to the SQL query | |
| − | + | * '''Join Conditions''': The system requires explicit join conditions in your WHERE clause (e.g., `person.handle == family.father_handle`). Join conditions must be explicitly specified - there are no fallback defaults. | |
| − | + | * '''Table Prefixes''': When JOINs are present, all table attributes automatically use table prefixes (e.g., `person.json_data`, `family.json_data`) | |
=== Available Tables === | === Available Tables === | ||
You can reference these tables in your queries: | You can reference these tables in your queries: | ||
| − | + | * `person` - Person records | |
| − | + | * `family` - Family records | |
| − | + | * `event` - Event records | |
| − | + | * `place` - Place records | |
| − | + | * `source` - Source records | |
| − | + | * `citation` - Citation records | |
| − | + | * `repository` - Repository records | |
| − | + | * `media` - Media records | |
| − | + | * `note` - Note records | |
| − | + | * `tag` - Tag records | |
=== Notes on JOINs === | === Notes on JOINs === | ||
| − | 1. | + | 1. '''Table Names''': Use lowercase table names (e.g., `family`, not `Family`) to reference tables |
| − | 2. | + | 2. '''Class Names''': PascalCase names like `Person`, `FamilyRelType` are classes from the environment, not table references |
| − | 3. | + | 3. '''Constants''': All Gramps constants are available (e.g., `Person.MALE`, `FamilyRelType.MARRIED`, `EventType.MARRIAGE`) |
| − | 4. | + | 4. '''Join Type''': Currently supports INNER JOIN. The system automatically determines the join condition based on handle relationships |
| − | 5. | + | 5. '''Multiple JOINs''': You can reference multiple tables in a single query |
| − | 6. | + | 6. '''Handle-Only Joins''': JOINs are only allowed between handle fields. Valid handle fields include: |
- `handle` (the primary handle, e.g., `person.handle`) | - `handle` (the primary handle, e.g., `person.handle`) | ||
- `ref` (reference handle, e.g., `event_ref.ref`, `citation_ref.ref`) | - `ref` (reference handle, e.g., `event_ref.ref`, `citation_ref.ref`) | ||
| Line 294: | Line 294: | ||
=== Basic Filtering === | === Basic Filtering === | ||
| − | + | <pre> | |
# Filter by gender | # Filter by gender | ||
males = list( | males = list( | ||
| Line 318: | Line 318: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
=== Filtering with `any()` and List Comprehensions === | === Filtering with `any()` and List Comprehensions === | ||
| Line 324: | Line 324: | ||
Check if any element in an array matches a condition: | Check if any element in an array matches a condition: | ||
| − | + | <pre> | |
# Find persons with any event_ref | # Find persons with any event_ref | ||
persons_with_events = list( | persons_with_events = list( | ||
| Line 348: | Line 348: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
=== Combining Filters === | === Combining Filters === | ||
| Line 354: | Line 354: | ||
You can combine `any()` filters with other conditions: | You can combine `any()` filters with other conditions: | ||
| − | + | <pre> | |
# Find male persons with any event_ref | # Find male persons with any event_ref | ||
male_with_events = list( | male_with_events = list( | ||
| Line 362: | Line 362: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
=== Array Expansion in `where` === | === Array Expansion in `where` === | ||
| Line 368: | Line 368: | ||
You can also use array expansion for filtering: | You can also use array expansion for filtering: | ||
| − | + | <pre> | |
# Get role values from event_refs where role.value == 1 | # Get role values from event_refs where role.value == 1 | ||
primary_roles = list( | primary_roles = list( | ||
| Line 384: | Line 384: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
## Combining `what` and `where` with List Comprehensions | ## Combining `what` and `where` with List Comprehensions | ||
| Line 390: | Line 390: | ||
You can combine list comprehensions in both `what` and `where` for powerful filtering and extraction: | You can combine list comprehensions in both `what` and `where` for powerful filtering and extraction: | ||
| − | + | <pre> | |
# Filter persons with any PRIMARY event_ref, then extract role values from PRIMARY event_refs | # Filter persons with any PRIMARY event_ref, then extract role values from PRIMARY event_refs | ||
primary_roles = list( | primary_roles = list( | ||
| Line 414: | Line 414: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
## The `order_by` Parameter | ## The `order_by` Parameter | ||
| Line 420: | Line 420: | ||
Sort results using the `order_by` parameter: | Sort results using the `order_by` parameter: | ||
| − | + | <pre> | |
# Sort by surname | # Sort by surname | ||
sorted_persons = list( | sorted_persons = list( | ||
| Line 436: | Line 436: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
## Common Patterns | ## Common Patterns | ||
| Line 442: | Line 442: | ||
=== Getting All Handles === | === Getting All Handles === | ||
| − | + | <pre> | |
all_handles = list(db.select_from_person(what="person.handle")) | all_handles = list(db.select_from_person(what="person.handle")) | ||
| − | + | </pre> | |
=== Filtering by Array Length === | === Filtering by Array Length === | ||
| − | + | <pre> | |
# Persons with events | # Persons with events | ||
with_events = list( | with_events = list( | ||
| Line 464: | Line 464: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
=== Extracting from Nested Structures === | === Extracting from Nested Structures === | ||
| − | + | <pre> | |
# Get all citation handles from persons | # Get all citation handles from persons | ||
citation_handles = list( | citation_handles = list( | ||
| Line 482: | Line 482: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
=== Working with Event References === | === Working with Event References === | ||
| − | + | <pre> | |
# Get all event reference handles | # Get all event reference handles | ||
event_ref_handles = list( | event_ref_handles = list( | ||
| Line 500: | Line 500: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
=== Variable-Index Array Access === | === Variable-Index Array Access === | ||
| Line 506: | Line 506: | ||
You can access array elements using a variable index (like `person.birth_ref_index`): | You can access array elements using a variable index (like `person.birth_ref_index`): | ||
| − | + | <pre> | |
# Get the birth event reference using birth_ref_index | # Get the birth event reference using birth_ref_index | ||
birth_refs = list( | birth_refs = list( | ||
| Line 536: | Line 536: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
## Limitations | ## Limitations | ||
| − | 1. | + | 1. '''List Comprehensions in `what`''': When using list comprehensions in `what`, you get one row per matching array element. If a person has 5 matching event_refs, you'll get 5 rows. |
| − | 2. | + | 2. '''Array Expansion vs List Comprehension''': |
- Array expansion (`item in person.array_path`) works in `where` and expands the result set | - Array expansion (`item in person.array_path`) works in `where` and expands the result set | ||
- List comprehensions in `what` extract values from arrays | - List comprehensions in `what` extract values from arrays | ||
- You cannot combine array expansion in `where` with list comprehensions in `what` effectively | - You cannot combine array expansion in `where` with list comprehensions in `what` effectively | ||
| − | 3. | + | 3. '''Complex Boolean Logic''': You can use `and` and `or` in `where` clauses with full structure preservation. Most combinations are supported: |
| − | + | '''All cases work correctly:''' | |
- Simple combinations: `condition1 and condition2` or `condition1 or condition2` | - Simple combinations: `condition1 and condition2` or `condition1 or condition2` | ||
- Same operator throughout: `A and B and C` or `A or B or C` | - Same operator throughout: `A and B and C` or `A or B or C` | ||
| Line 559: | Line 559: | ||
- Array expansion with OR: `(A and B) or (item in person.array)` | - Array expansion with OR: `(A and B) or (item in person.array)` | ||
| − | + | '''Array Expansion in OR Expressions:''' | |
When array expansion is used in an OR expression, the query uses a UNION to combine results: | When array expansion is used in an OR expression, the query uses a UNION to combine results: | ||
| − | + | <pre> | |
# This works correctly, including persons with empty arrays: | # This works correctly, including persons with empty arrays: | ||
db.select_from_person( | db.select_from_person( | ||
| Line 569: | Line 569: | ||
where="(person.gender == Person.MALE and len(person.family_list) > 0) or (item in person.event_ref_list)" | where="(person.gender == Person.MALE and len(person.family_list) > 0) or (item in person.event_ref_list)" | ||
) | ) | ||
| − | + | </pre> | |
| − | + | '''Note:''' If you need one row per person (not per array element), use `any()` instead: | |
| − | + | <pre> | |
# Returns one row per person (not per array element): | # Returns one row per person (not per array element): | ||
db.select_from_person( | db.select_from_person( | ||
| Line 579: | Line 579: | ||
where="(person.gender == Person.MALE and len(person.family_list) > 0) or any([item for item in person.event_ref_list])" | where="(person.gender == Person.MALE and len(person.family_list) > 0) or any([item for item in person.event_ref_list])" | ||
) | ) | ||
| − | + | </pre> | |
The difference: `any()` returns one row per person, while array expansion returns one row per array element. | The difference: `any()` returns one row per person, while array expansion returns one row per array element. | ||
| − | 4. | + | 4. '''Single Condition in List Comprehensions''': List comprehensions support only one `if` condition. Multiple conditions should be combined with `and`: |
| − | + | <pre> | |
# This works: | # This works: | ||
what="[item.field for item in person.array if item.field1 == 1 and item.field2 == 2]" | what="[item.field for item in person.array if item.field1 == 1 and item.field2 == 2]" | ||
| Line 591: | Line 591: | ||
# Multiple if clauses are not supported: | # Multiple if clauses are not supported: | ||
# what="[item.field for item in person.array if item.field1 == 1 if item.field2 == 2]" # Not supported | # what="[item.field for item in person.array if item.field1 == 1 if item.field2 == 2]" # Not supported | ||
| − | + | </pre> | |
| − | 5. | + | 5. '''Nested Arrays''': Nested array access (arrays within arrays) is '''not supported'''. |
| − | + | '''What works:''' | |
- Single-level arrays: `person.event_ref_list`, `person.citation_list` | - Single-level arrays: `person.event_ref_list`, `person.citation_list` | ||
- Accessing attributes of array elements: `item.role.value` where `item` is from `person.event_ref_list` | - Accessing attributes of array elements: `item.role.value` where `item` is from `person.event_ref_list` | ||
| − | + | '''What doesn't work:''' | |
- Nested arrays: Accessing arrays that are properties of array elements | - Nested arrays: Accessing arrays that are properties of array elements | ||
- Example: If an event_ref had a `sub_items` array, `item.sub_items` would not work | - Example: If an event_ref had a `sub_items` array, `item.sub_items` would not work | ||
| Line 605: | Line 605: | ||
- Arrays within nested objects: `person.some_object.array_field` where `some_object` itself contains arrays | - Arrays within nested objects: `person.some_object.array_field` where `some_object` itself contains arrays | ||
| − | 6. | + | 6. '''Lambda Functions''': Lambda functions (callables) are '''not supported'''. All arguments (`what`, `where`, `order_by`) must be strings or lists of strings. |
## Hints and Best Practices | ## Hints and Best Practices | ||
| − | 1. | + | 1. '''Use Constants''': All of the Gramps objects and types are available in the environment |
| − | + | <pre> | |
# Good: | # Good: | ||
where="eref.role.value == EventRoleType.PRIMARY" | where="eref.role.value == EventRoleType.PRIMARY" | ||
| Line 618: | Line 618: | ||
# Avoid: | # Avoid: | ||
where="eref.role.value == 1" # What does 1 mean? | where="eref.role.value == 1" # What does 1 mean? | ||
| − | + | </pre> | |
| − | 2. | + | 2. '''Empty Arrays''': To check for empty arrays, use `len()` or `not any()`: |
| − | + | <pre> | |
# Both work: | # Both work: | ||
where="len(person.event_ref_list) == 0" | where="len(person.event_ref_list) == 0" | ||
where="not any([eref for eref in person.event_ref_list])" | where="not any([eref for eref in person.event_ref_list])" | ||
| − | + | </pre> | |
| − | 3. | + | 3. '''Performance''': |
- List comprehensions in `what` can return many rows if arrays are large | - List comprehensions in `what` can return many rows if arrays are large | ||
- `any()` in `where` is efficient for filtering | - `any()` in `where` is efficient for filtering | ||
- Consider filtering in `where` before extracting in `what` to reduce result set size | - Consider filtering in `where` before extracting in `what` to reduce result set size | ||
| − | 4. | + | 4. '''Testing for Existence''': Use `any()` to test if any element exists: |
| − | + | <pre> | |
# Check if person has any event_refs | # Check if person has any event_refs | ||
where="any([eref for eref in person.event_ref_list])" | where="any([eref for eref in person.event_ref_list])" | ||
| − | + | </pre> | |
| − | 5. | + | 5. '''Extracting Multiple Fields''': |
- For array expansion, you can include both person-level and array element fields: | - For array expansion, you can include both person-level and array element fields: | ||
| − | + | <pre> | |
# Get person.handle along with array element fields | # Get person.handle along with array element fields | ||
results = list( | results = list( | ||
| Line 648: | Line 648: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
- For list comprehensions, you can extract multiple fields using tuples: | - For list comprehensions, you can extract multiple fields using tuples: | ||
| − | + | <pre> | |
# Extract multiple fields at once using tuples | # Extract multiple fields at once using tuples | ||
refs_and_roles = list(db.select_from_person(what="[(eref.ref, eref.role.value) for eref in person.event_ref_list]")) | refs_and_roles = list(db.select_from_person(what="[(eref.ref, eref.role.value) for eref in person.event_ref_list]")) | ||
| Line 658: | Line 658: | ||
refs = list(db.select_from_person(what="[eref.ref for eref in person.event_ref_list]")) | refs = list(db.select_from_person(what="[eref.ref for eref in person.event_ref_list]")) | ||
roles = list(db.select_from_person(what="[eref.role.value for eref in person.event_ref_list]")) | roles = list(db.select_from_person(what="[eref.role.value for eref in person.event_ref_list]")) | ||
| − | + | </pre> | |
| − | 6. | + | 6. '''Order of Operations''': |
- `where` filters which records (persons) to include | - `where` filters which records (persons) to include | ||
- `what` determines what data to extract from those records | - `what` determines what data to extract from those records | ||
- When both use list comprehensions, `where` filters persons first, then `what` extracts from matching array elements | - When both use list comprehensions, `where` filters persons first, then `what` extracts from matching array elements | ||
| − | 7. | + | 7. '''Debugging''': If a query doesn't work as expected: |
- Check that attribute paths are correct (use dot notation) | - Check that attribute paths are correct (use dot notation) | ||
- Verify array names (e.g., `event_ref_list`, not `event_refs`) | - Verify array names (e.g., `event_ref_list`, not `event_refs`) | ||
| Line 675: | Line 675: | ||
=== Find All Persons with Birth Events === | === Find All Persons with Birth Events === | ||
| − | + | <pre> | |
birth_handles = list( | birth_handles = list( | ||
db.select_from_person( | db.select_from_person( | ||
| Line 682: | Line 682: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
=== Extract All Event Reference Handles === | === Extract All Event Reference Handles === | ||
| − | + | <pre> | |
# Just the handles | # Just the handles | ||
all_event_refs = list( | all_event_refs = list( | ||
| Line 701: | Line 701: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
=== Find People with Families === | === Find People with Families === | ||
| − | + | <pre> | |
have_family = list( | have_family = list( | ||
db.select_from_person( | db.select_from_person( | ||
| Line 712: | Line 712: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
=== Find People Who Are Fathers (Using JOIN) === | === Find People Who Are Fathers (Using JOIN) === | ||
| − | + | <pre> | |
# Using JOIN to find people who are fathers | # Using JOIN to find people who are fathers | ||
fathers = list( | fathers = list( | ||
| Line 724: | Line 724: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
=== Find Married Families with Male Fathers === | === Find Married Families with Male Fathers === | ||
| − | + | <pre> | |
married_families = list( | married_families = list( | ||
db.select_from_person( | db.select_from_person( | ||
| Line 735: | Line 735: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
=== Get Surnames of All Males, Sorted === | === Get Surnames of All Males, Sorted === | ||
| − | + | <pre> | |
male_surnames = list( | male_surnames = list( | ||
db.select_from_person( | db.select_from_person( | ||
| Line 747: | Line 747: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
=== Extract Role Values from Event References with Conditions === | === Extract Role Values from Event References with Conditions === | ||
| − | + | <pre> | |
# Get PRIMARY role values from persons who have PRIMARY events | # Get PRIMARY role values from persons who have PRIMARY events | ||
primary_roles = list( | primary_roles = list( | ||
| Line 759: | Line 759: | ||
) | ) | ||
) | ) | ||
| − | + | </pre> | |
Revision as of 13:49, 21 December 2025
GEP to document discussions about creating an API to allow fast database queries, targeting SQLite and Postgresql.
Contents
- 1 Design Goals
- 2 API
- 2.1 Basic Usage
- 2.2 Simple Examples
- 2.3 The `what` Parameter
- 2.4 Extracting Attributes
- 2.5 Extracting from Arrays (List Comprehensions)
- 2.6 Array Expansion Pattern
- 2.7 JOINing Tables
- 2.8 Basic JOIN Syntax
- 2.9 JOIN with Additional Conditions
- 2.10 Reverse Direction (from Family to Person)
- 2.11 JOIN with Mother Relationship
- 2.12 JOIN with Multiple Conditions
- 2.13 JOIN with ORDER BY
- 2.14 JOIN with Variable-Index Array Access
- 2.15 JOIN with Variable-Index Array Access and Additional Conditions
- 2.16 How JOIN Detection Works
- 2.17 Available Tables
- 2.18 Notes on JOINs
- 2.19 Basic Filtering
- 2.20 Filtering with `any()` and List Comprehensions
- 2.21 Combining Filters
- 2.22 Array Expansion in `where`
- 2.23 Getting All Handles
- 2.24 Filtering by Array Length
- 2.25 Extracting from Nested Structures
- 2.26 Working with Event References
- 2.27 Variable-Index Array Access
- 2.28 Find All Persons with Birth Events
- 2.29 Extract All Event Reference Handles
- 2.30 Find People with Families
- 2.31 Find People Who Are Fathers (Using JOIN)
- 2.32 Find Married Families with Male Fathers
- 2.33 Get Surnames of All Males, Sorted
- 2.34 Extract Role Values from Event References with Conditions
- 3 See also
Design Goals
1. Easy for Gramps developers to use (no new syntax, no ORMs) 2. Provide fast access to select data, targeting filters 3. Ability to be used from SQLite and Postgresql 4. Easily used by Gramps Web 5. Targeting exploiting Gramps 6.0 selected_handles in filters
API
This section shows examples using `db.select_from_person()`, `db.select_from_family()`, and other `select_from_*` functions to query the database using Python string expressions.
Basic Usage
The `select_from_*` functions accept three main parameters:
- what: What data to extract (defaults to the entire object)
- where: Filter conditions (optional)
- order_by: Sorting order (optional)
Simple Examples
# Get all person handles
handles = list(db.select_from_person(what="person.handle"))
# Get handles of all males
male_handles = list(
db.select_from_person(
what="person.handle",
where="person.gender == Person.MALE"
)
)
# Get handles sorted by surname
sorted_handles = list(
db.select_from_person(
what="person.handle",
order_by="person.primary_name.surname_list[0].surname"
)
)
The `what` Parameter
The `what` parameter specifies what data to extract from each matching record.
Extracting Attributes
You can access nested attributes using dot notation:
# Get surnames surnames = list(db.select_from_person(what="person.primary_name.surname_list[0].surname")) # Get birth dates birth_dates = list(db.select_from_person(what="person.birth_ref.ref")) # Get primary names names = list(db.select_from_person(what="person.primary_name.first_name"))
Extracting from Arrays (List Comprehensions)
You can extract values from arrays using list comprehensions:
# Extract all role values from event_ref_list
role_values = list(
db.select_from_person(
what="[eref.role.value for eref in person.event_ref_list]"
)
)
# Extract role values with a condition
primary_roles = list(
db.select_from_person(
what="[eref.role.value for eref in person.event_ref_list if eref.role.value == EventRoleType.PRIMARY]"
)
)
# Extract multiple fields
event_data = list(
db.select_from_person(
what="[eref.ref for eref in person.event_ref_list if eref.role.value == EventRoleType.PRIMARY]"
)
)
Note: List comprehensions in `what` return one row per matching array element. If a person has 3 matching event_refs, you'll get 3 rows.
Array Expansion Pattern
For simpler cases, you can use the array expansion pattern:
# Extract role.value from all event_refs
role_values = list(
db.select_from_person(
what="item.role.value",
where="item in person.event_ref_list"
)
)
You can also include person-level fields along with array element fields:
# Get both person.handle and role.value for each event_ref
# Use a list for multiple fields
results = list(
db.select_from_person(
what=["person.handle", "item.role.value"],
where="item in person.event_ref_list"
)
)
# Returns tuples: (handle, role_value)
This is equivalent to the list comprehension but uses a different syntax. When you include `person.handle`, each row will show which person the array element belongs to.
JOINing Tables
You can reference other tables in your queries using natural table prefixes. The system automatically detects table references and generates appropriate JOIN clauses.
Basic JOIN Syntax
Simply reference other tables using their lowercase names (e.g., `family`, `person`, `event`):
# Join person with family - get person and family handles where person is the father
results = list(
db.select_from_person(
what=["person.handle", "family.handle"],
where="person.handle == family.father_handle"
)
)
JOIN with Additional Conditions
You can combine JOIN conditions with other filters:
# Get person and family where person is father and family type is married
results = list(
db.select_from_person(
what=["person.handle", "family.handle", "family.type.value"],
where="person.handle == family.father_handle and family.type.value == FamilyRelType.MARRIED"
)
)
Reverse Direction (from Family to Person)
You can query from any table and JOIN to others:
# Get family and father's information
results = list(
db.select_from_family(
what=["family.handle", "person.handle", "person.primary_name.first_name"],
where="family.father_handle == person.handle"
)
)
JOIN with Mother Relationship
# Get person and family where person is the mother
results = list(
db.select_from_person(
what=["person.handle", "family.handle"],
where="person.handle == family.mother_handle"
)
)
JOIN with Multiple Conditions
# Get person and family with conditions on both tables
results = list(
db.select_from_person(
what=["person.handle", "family.handle"],
where="person.handle == family.father_handle and person.gender == Person.MALE and family.type.value == FamilyRelType.MARRIED"
)
)
JOIN with ORDER BY
# Join and sort results
results = list(
db.select_from_person(
what=["person.handle", "family.handle"],
where="person.handle == family.father_handle",
order_by=["person.handle", "family.handle"]
)
)
JOIN with Variable-Index Array Access
You can join tables using variable-index array access, which is useful for joining through array references:
# Join person to event through the birth event reference
# This uses person.birth_ref_index to find the birth event in event_ref_list
results = list(
db.select_from_person(
what=["person.handle", "event.handle"],
where="person.event_ref_list[person.birth_ref_index].ref == event.handle"
)
)
This is particularly useful for joining to related records through indexed array references. The system automatically handles the variable index (like `person.birth_ref_index`) and generates the appropriate SQL subquery.
JOIN with Variable-Index Array Access and Additional Conditions
You can combine variable-index array access joins with additional filters:
from gramps.gen.lib import EventType
# Join person to event via birth, and filter by event type
results = list(
db.select_from_person(
what=["person.handle", "event.handle", "event.type.value"],
where="person.event_ref_list[person.birth_ref_index].ref == event.handle and event.type.value == EventType.BIRTH"
)
)
This finds all persons with birth events that are actually marked as BIRTH type events.
How JOIN Detection Works
- Table References: When you use `family.handle`, `person.handle`, etc., the system detects these as table references
- Automatic JOIN Generation: JOIN clauses are automatically added to the SQL query
- Join Conditions: The system requires explicit join conditions in your WHERE clause (e.g., `person.handle == family.father_handle`). Join conditions must be explicitly specified - there are no fallback defaults.
- Table Prefixes: When JOINs are present, all table attributes automatically use table prefixes (e.g., `person.json_data`, `family.json_data`)
Available Tables
You can reference these tables in your queries:
- `person` - Person records
- `family` - Family records
- `event` - Event records
- `place` - Place records
- `source` - Source records
- `citation` - Citation records
- `repository` - Repository records
- `media` - Media records
- `note` - Note records
- `tag` - Tag records
Notes on JOINs
1. Table Names: Use lowercase table names (e.g., `family`, not `Family`) to reference tables 2. Class Names: PascalCase names like `Person`, `FamilyRelType` are classes from the environment, not table references 3. Constants: All Gramps constants are available (e.g., `Person.MALE`, `FamilyRelType.MARRIED`, `EventType.MARRIAGE`) 4. Join Type: Currently supports INNER JOIN. The system automatically determines the join condition based on handle relationships 5. Multiple JOINs: You can reference multiple tables in a single query 6. Handle-Only Joins: JOINs are only allowed between handle fields. Valid handle fields include:
- `handle` (the primary handle, e.g., `person.handle`) - `ref` (reference handle, e.g., `event_ref.ref`, `citation_ref.ref`) - Any field ending in `_handle` (e.g., `family.father_handle`, `family.mother_handle`)
Examples of valid joins: - `person.handle == family.father_handle` ✓ - `family.mother_handle == person.handle` ✓ - `event_ref.ref == event.handle` ✓ - `person.event_ref_list[person.birth_ref_index].ref == event.handle` ✓ (variable-index array access) - `person.event_ref_list[0].ref == event.handle` ✓ (constant-index array access)
Examples of invalid joins (will be ignored): - `person.gender == family.type.value` ✗ (not handle fields) - `person.gramps_id == family.gramps_id` ✗ (not handle fields)
- The `where` Parameter
The `where` parameter filters which records to include.
Basic Filtering
# Filter by gender
males = list(
db.select_from_person(
what="person.handle",
where="person.gender == Person.FEMALE"
)
)
# Filter by multiple conditions
married_males = list(
db.select_from_person(
what="person.handle",
where="person.gender == Person.UNKNOWN and len(person.family_list) > 0"
)
)
# String operations
smiths = list(
db.select_from_person(
what="person.handle",
where="person.primary_name.surname_list[0].surname.startswith('Smith')"
)
)
Filtering with `any()` and List Comprehensions
Check if any element in an array matches a condition:
# Find persons with any event_ref
persons_with_events = list(
db.select_from_person(
what="person.handle",
where="any([eref for eref in person.event_ref_list])"
)
)
# Find persons with any PRIMARY event_ref
persons_with_primary = list(
db.select_from_person(
what="person.handle",
where="any([eref for eref in person.event_ref_list if eref.role.value == EventRoleType.PRIMARY])"
)
)
# Negation: find persons with no event_refs
persons_without_events = list(
db.select_from_person(
what="person.handle",
where="not any([eref for eref in person.event_ref_list])"
)
)
Combining Filters
You can combine `any()` filters with other conditions:
# Find male persons with any event_ref
male_with_events = list(
db.select_from_person(
what="person.handle",
where="person.gender == Person.MALE and any([eref for eref in person.event_ref_list])"
)
)
Array Expansion in `where`
You can also use array expansion for filtering:
# Get role values from event_refs where role.value == 1
primary_roles = list(
db.select_from_person(
what="item.role.value",
where="item in person.event_ref_list and item.role.value == 1"
)
)
# Include person.handle to see which person each event_ref belongs to
results = list(
db.select_from_person(
what=["person.handle", "item.role.value"],
where="item in person.event_ref_list and item.role.value == 1"
)
)
- Combining `what` and `where` with List Comprehensions
You can combine list comprehensions in both `what` and `where` for powerful filtering and extraction:
# Filter persons with any PRIMARY event_ref, then extract role values from PRIMARY event_refs
primary_roles = list(
db.select_from_person(
what="[eref.role.value for eref in person.event_ref_list if eref.role.value == EventRoleType.PRIMARY]",
where="any([eref for eref in person.event_ref_list if eref.role.value == EventRoleType.PRIMARY])"
)
)
# Different conditions: filter persons with any event_ref, extract only PRIMARY roles
mixed = list(
db.select_from_person(
what="[eref.role.value for eref in person.event_ref_list if eref.role.value == EventRoleType.PRIMARY]",
where="any([eref for eref in person.event_ref_list])"
)
)
# Combine with person-level filtering
female_primary = list(
db.select_from_person(
what="[eref.role.value for eref in person.event_ref_list if eref.role.value == EventRoleType.PRIMARY]",
where="person.gender == Person.FEMALE and any([eref for eref in person.event_ref_list if eref.role.value == EventRoleType.PRIMARY])"
)
)
- The `order_by` Parameter
Sort results using the `order_by` parameter:
# Sort by surname
sorted_persons = list(
db.select_from_person(
what="person.handle",
order_by="person.primary_name.surname_list[0].surname"
)
)
# Sort by multiple fields (use a list)
sorted_persons = list(
db.select_from_person(
what="person.handle",
order_by=["person.primary_name.surname_list[0].surname", "person.primary_name.first_name"]
)
)
- Common Patterns
Getting All Handles
all_handles = list(db.select_from_person(what="person.handle"))
Filtering by Array Length
# Persons with events
with_events = list(
db.select_from_person(
what="person.handle",
where="len(person.event_ref_list) > 0"
)
)
# Persons without events
without_events = list(
db.select_from_person(
what="person.handle",
where="len(person.event_ref_list) == 0"
)
)
Extracting from Nested Structures
# Get all citation handles from persons
citation_handles = list(
db.select_from_person(
what="[cit.handle for cit in person.citation_list]"
)
)
# Get note handles
note_handles = list(
db.select_from_person(
what="[note.handle for note in person.note_list]"
)
)
Working with Event References
# Get all event reference handles
event_ref_handles = list(
db.select_from_person(
what="[eref.ref for eref in person.event_ref_list]"
)
)
# Get PRIMARY event reference handles only
primary_event_refs = list(
db.select_from_person(
what="[eref.ref for eref in person.event_ref_list if eref.role.value == EventRoleType.PRIMARY]"
)
)
Variable-Index Array Access
You can access array elements using a variable index (like `person.birth_ref_index`):
# Get the birth event reference using birth_ref_index
birth_refs = list(
db.select_from_person(
what="person.event_ref_list[person.birth_ref_index]"
)
)
# Get the role value from the birth event reference
birth_roles = list(
db.select_from_person(
what="person.event_ref_list[person.birth_ref_index].role.value"
)
)
# Filter persons who have a valid birth event reference
persons_with_birth = list(
db.select_from_person(
what="person.handle",
where="person.event_ref_list[person.birth_ref_index]"
)
)
# Filter persons whose birth event reference has PRIMARY role
persons_with_primary_birth = list(
db.select_from_person(
what="person.handle",
where="person.event_ref_list[person.birth_ref_index].role.value == EventRoleType.PRIMARY"
)
)
- Limitations
1. List Comprehensions in `what`: When using list comprehensions in `what`, you get one row per matching array element. If a person has 5 matching event_refs, you'll get 5 rows.
2. Array Expansion vs List Comprehension:
- Array expansion (`item in person.array_path`) works in `where` and expands the result set - List comprehensions in `what` extract values from arrays - You cannot combine array expansion in `where` with list comprehensions in `what` effectively
3. Complex Boolean Logic: You can use `and` and `or` in `where` clauses with full structure preservation. Most combinations are supported:
All cases work correctly: - Simple combinations: `condition1 and condition2` or `condition1 or condition2` - Same operator throughout: `A and B and C` or `A or B or C` - Mixed operators: `A and B or C and D` - Nested expressions: `(A and B) or (C and D)` - Deeply nested structures: `(A and (B or C)) and D` - `any()` with any boolean structure: `(A and B) or any([...])` or `(A or any([...])) and B` - Array expansion with AND: `(A and B) and (item in person.array)` - Array expansion with OR: `(A and B) or (item in person.array)`
Array Expansion in OR Expressions:
When array expansion is used in an OR expression, the query uses a UNION to combine results:
# This works correctly, including persons with empty arrays:
db.select_from_person(
what="person.handle",
where="(person.gender == Person.MALE and len(person.family_list) > 0) or (item in person.event_ref_list)"
)
Note: If you need one row per person (not per array element), use `any()` instead:
# Returns one row per person (not per array element):
db.select_from_person(
what="person.handle",
where="(person.gender == Person.MALE and len(person.family_list) > 0) or any([item for item in person.event_ref_list])"
)
The difference: `any()` returns one row per person, while array expansion returns one row per array element.
4. Single Condition in List Comprehensions: List comprehensions support only one `if` condition. Multiple conditions should be combined with `and`:
# This works: what="[item.field for item in person.array if item.field1 == 1 and item.field2 == 2]" # Multiple if clauses are not supported: # what="[item.field for item in person.array if item.field1 == 1 if item.field2 == 2]" # Not supported
5. Nested Arrays: Nested array access (arrays within arrays) is not supported.
What works: - Single-level arrays: `person.event_ref_list`, `person.citation_list` - Accessing attributes of array elements: `item.role.value` where `item` is from `person.event_ref_list`
What doesn't work:
- Nested arrays: Accessing arrays that are properties of array elements
- Example: If an event_ref had a `sub_items` array, `item.sub_items` would not work
- The code does not support `json_each` within `json_each` (nested array iteration)
- Arrays within nested objects: `person.some_object.array_field` where `some_object` itself contains arrays
6. Lambda Functions: Lambda functions (callables) are not supported. All arguments (`what`, `where`, `order_by`) must be strings or lists of strings.
- Hints and Best Practices
1. Use Constants: All of the Gramps objects and types are available in the environment
# Good: where="eref.role.value == EventRoleType.PRIMARY" where="person.gender == Person.MALE" # Avoid: where="eref.role.value == 1" # What does 1 mean?
2. Empty Arrays: To check for empty arrays, use `len()` or `not any()`:
# Both work: where="len(person.event_ref_list) == 0" where="not any([eref for eref in person.event_ref_list])"
3. Performance:
- List comprehensions in `what` can return many rows if arrays are large - `any()` in `where` is efficient for filtering - Consider filtering in `where` before extracting in `what` to reduce result set size
4. Testing for Existence: Use `any()` to test if any element exists:
# Check if person has any event_refs where="any([eref for eref in person.event_ref_list])"
5. Extracting Multiple Fields:
- For array expansion, you can include both person-level and array element fields:
# Get person.handle along with array element fields
results = list(
db.select_from_person(
what=["person.handle", "item.role.value", "item.ref"],
where="item in person.event_ref_list"
)
)
- For list comprehensions, you can extract multiple fields using tuples:
# Extract multiple fields at once using tuples
refs_and_roles = list(db.select_from_person(what="[(eref.ref, eref.role.value) for eref in person.event_ref_list]"))
# Returns list of tuples: [(ref1, role1), (ref2, role2), ...]
# You can also extract one field at a time if needed
refs = list(db.select_from_person(what="[eref.ref for eref in person.event_ref_list]"))
roles = list(db.select_from_person(what="[eref.role.value for eref in person.event_ref_list]"))
6. Order of Operations:
- `where` filters which records (persons) to include - `what` determines what data to extract from those records - When both use list comprehensions, `where` filters persons first, then `what` extracts from matching array elements
7. Debugging: If a query doesn't work as expected:
- Check that attribute paths are correct (use dot notation) - Verify array names (e.g., `event_ref_list`, not `event_refs`) - Ensure constants are imported and used correctly - Test with simpler queries first, then add complexity
- Examples by Use Case
Find All Persons with Birth Events
birth_handles = list(
db.select_from_person(
what="person.handle",
where="any([eref for eref in person.event_ref_list if eref.role.value == EventRoleType.PRIMARY])"
)
)
Extract All Event Reference Handles
# Just the handles
all_event_refs = list(
db.select_from_person(
what="[eref.ref for eref in person.event_ref_list]"
)
)
# With person.handle to see which person each belongs to (using array expansion)
all_event_refs_with_person = list(
db.select_from_person(
what=["person.handle", "item.ref"],
where="item in person.event_ref_list"
)
)
Find People with Families
have_family = list(
db.select_from_person(
what="person.handle",
where="len(person.family_list) > 0"
)
)
Find People Who Are Fathers (Using JOIN)
# Using JOIN to find people who are fathers
fathers = list(
db.select_from_person(
what=["person.handle", "family.handle"],
where="person.handle == family.father_handle"
)
)
Find Married Families with Male Fathers
married_families = list(
db.select_from_person(
what=["person.handle", "family.handle"],
where="person.handle == family.father_handle and person.gender == Person.MALE and family.type.value == FamilyRelType.MARRIED"
)
)
Get Surnames of All Males, Sorted
male_surnames = list(
db.select_from_person(
what="person.primary_name.surname_list[0].surname",
where="person.gender == Person.MALE",
order_by="person.primary_name.surname_list[0].surname"
)
)
Extract Role Values from Event References with Conditions
# Get PRIMARY role values from persons who have PRIMARY events
primary_roles = list(
db.select_from_person(
what="[eref.role.value for eref in person.event_ref_list if eref.role.value == EventRoleType.PRIMARY]",
where="any([eref for eref in person.event_ref_list if eref.role.value == EventRoleType.PRIMARY])"
)
)
See also
1. [Gramps-devel A DB select method that can be engine-optimized] 2. [Discourse Gramps, Next Generation] 3. Proposed API Examples
[[Category:GEPS|S]