Relations
This plugin adds support for generating relations either between multiple models (for example, from a dataset to an article), or intra-model relation (model defines a list of chemical compounds and a measurement defined later in the model references one of the compounds).
Installation
pip install oarepo-model-builder oarepo-model-builder-relations
Usage
Relations to another model
Model
Add the following snippet to your model file:
# article model
model:
properties:
metadata:
properties:
title: fulltext
# dataset model
model:
properties:
metadata:
title: fulltext
author: keyword
article:
type: relation
model: article
Compilation
Now compile the article model with:
oarepo-compile-model article.yaml --output-dir articles
and then compile the dataset model with:
oarepo-compile-model dataset.yaml --output-dir datasets --include article=articles/article/models/model.json
Sample doc
Suppose that you have the following article already stored in repository:
{
"id": "abcde-ghijk",
"metadata": {
"title": "Test Article",
"author": "John Smith"
}
}
If you store the following dataset:
{
"metadata": {
"title": "Test Dataset",
"article": {
"id": "abcde-ghijk"
}
}
}
and get it from the repository, you'll get:
{
"metadata": {
"title": "Test Dataset",
"article": {
"id": "abcde-ghijk",
"metadata": {
"title": "Test Article"
}
}
}
}
Note that the "title" property has been copied from the article.
Specifying fields to copy
You can specify which fields should be copied with keys
attribute:
article:
type: relation
model: article
keys: [id, metadata.title, metadata.author]
You should always include the id
field.
Omitting the internal metadata
wrapper
Use flatten: true
inside your model to remove the internal metadata
wrapper:
article:
type: relation
model: article
flatten: true
will give you:
{
"metadata": {
"title": "Test Dataset",
"article": {
"id": "abcde-ghijk",
"title": "Test Article" # <-- no metadata here
}
}
}
Internal references
You can also reference an internal part of the model. This might be useful for example when the model schema contains m:n relationship (for example, a list of chemical samples and measurements performed on pairs of them).
Then you might have the following schema:
# dataset model
model:
properties:
metadata:
samples[]:
type: object
id: sample
properties:
id: keyword
name: keyword
measurements[]:
type: object
properties:
sample1:
type: relation
model: "#sample"
keys: [id, name]
sample2:
type: relation
model: "#sample"
keys: [id, name]
This will generate an intra-document references from sample1
and sample2
to samples
. Note that you have to specify the keys as the default [id, metadata.title]
are not valid in this scenario.
With an input:
{
"metadata": {
"samples": [
{
"id": "231",
"name": "sea_water_sample_231"
},
{
"id": "233",
"name": "sea_water_sample_233"
},
],
"measurements": [
{
"sample1": {"id": "231"},
"sample2": {"id": "233"},
}
]
}
}
you would get in your elasticsearch/REST api:
{
"metadata": {
"samples": [
{
"id": "231",
"name": "sea_water_sample_231"
},
{
"id": "233",
"name": "sea_water_sample_233"
},
],
"measurements": [
{
"sample1": {"id": "231", "name": "sea_water_sample_231"},
"sample2": {"id": "233", "name": "sea_water_sample_233"},
}
]
}
}
Referencing custom fields
You might reference even custom fields, but in this case you have to provide your own schema for the field:
# referred document
model:
custom-fields:
- config: CF
element: custom_fields
properties:
metadata:
properties:
title: fulltext
# reference
reference:
type: relation
model: referred
flatten: true
keys:
- id
- metadata.title
- key: custom_fields.test # <-- you need to specify
model: # both key and model
type: keyword
How does it work?
On the record level, the generated json schema will contain copied parts from the referred model.
The mapping for "keys" from the referred model is copied into the generated mapping.
On record API, relations
field and dumper extension are added. The relations
specify what is the relation's source and destination, the dumper extension will make sure that before the record is put to the search engine, relation is fetched and inserted to the appropriate position.