oarepo-vocabularies
This library provides support for:
- defining extra metadata on invenio vocabularies (for example, in case the standard key->string mapping in props is not enough)
- adding hierarchy to your vocabulary items
Installation
To install the library, add oarepo-vocabularies to your dependencies (Pipfile, requirements.txt, or setup.cfg).
Usage
This library replaces the Invenio service and resource configuration with a custom-fields-enhanced one. To use it, add the following lines to your invenio.cfg:
# invenio.cfg
from oarepo_vocabularies.services.config import VocabulariesConfig
from oarepo_vocabularies.resources.config import VocabulariesResourceConfig
VOCABULARIES_SERVICE_CONFIG = VocabulariesConfig
VOCABULARIES_RESOURCE_CONFIG = VocabulariesResourceConfig
Because the enhanced vocabularies use custom fields, you must add those fields to your OpenSearch server before you start populating vocabularies. To do so, run on the command line:
invenio oarepo cf init
If you forget to do so, you will get errors while indexing some of the vocabulary items.
Apart from these steps, just use current_service/current_resource from invenio_vocabularies.proxies with the enhanced content; it will be handled appropriately. If you ever need to bypass the service and work at the database (record) level, use oarepo_vocabularies.records.api.Vocabulary.
Adding extra metadata
To add extra metadata, declare the corresponding custom fields in invenio.cfg:
# invenio.cfg
from invenio_records_resources.services.custom_fields.text import KeywordCF
from tests.customfields import NonPreferredLabelsCF
OAREPO_VOCABULARIES_CUSTOM_CF = [
    KeywordCF("blah"),
    NonPreferredLabelsCF("nonpreferredLabels"),
]
where tests/customfields.py is an example of a custom complex field:
from invenio_records_resources.services.custom_fields import BaseCF
import marshmallow as ma
from invenio_vocabularies.services.schema import i18n_strings
class NonPreferredLabelsCF(BaseCF):
    @property
    def mapping(self):
        """Return the mapping."""
        return {"type": "object", "dynamic": True}

    @property
    def field(self):
        """Marshmallow field for custom fields."""
        return ma.fields.List(i18n_strings)
Have a look at the Invenio sources for the list of predefined custom field types.
Then create your vocabulary item with the normal service; see the test for an example.
Using hierarchies in vocabulary items
Each vocabulary item might point to its parent item (within the same vocabulary type).
Adding vocabulary items
To add vocabulary items programmatically, specify the hierarchy.parent field pointing to the item's parent:
# parent item
{
    "id": "eng",
    "title": {"en": "English", "da": "Engelsk"},
    "type": "languages",
}
# child item
{
    "id": "eng.US",
    "title": {"en": "English (US)", "da": "Engelsk (US)"},
    "type": "languages",
    "hierarchy": {"parent": "eng"},
}
This way you can construct hierarchies with unlimited depth.
Getting vocabulary items
When you get a vocabulary item, you'll have a couple more properties:
{
    "links": {
        "self": "https://127.0.0.1:5000/api/vocabularies/languages/eng.US",
        "parent": "https://127.0.0.1:5000/api/vocabularies/languages/eng",
        "children": "https://127.0.0.1:5000/api/vocabularies/languages?h-parent=eng.US",
        "descendants": "https://127.0.0.1:5000/api/vocabularies/languages?h-ancestor=eng.US"
    },
    "id": "eng.US",
    "title": {"en": "English (US)", "da": "Engelsk (US)"},
    "type": "languages",
    "hierarchy": {
        "level": 2,
        "parent": "eng",
        "title": [
            {"da": "Engelsk (US)", "en": "English (US)"},
            {"da": "Engelsk", "en": "English"}
        ],
        "ancestors": ["eng"]
    }
}
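The children and descendants links in the response above are plain listing URLs with an h-parent or h-ancestor query parameter. As a rough illustration, a client could build the same URLs itself. The helper name hierarchy_links and the API_BASE constant are hypothetical and only mirror the example response; they are not part of the library.

```python
from urllib.parse import quote, urlencode

# Assumption: base URL copied from the example response above.
API_BASE = "https://127.0.0.1:5000/api/vocabularies"


def hierarchy_links(vocabulary_type, item_id, parent_id=None):
    """Build navigation URLs shaped like the "links" section shown above.

    Hypothetical helper for illustration; the server generates these links itself.
    """
    collection = f"{API_BASE}/{quote(vocabulary_type, safe='')}"
    # Query parameter names h-parent and h-ancestor are taken from the example.
    children_query = urlencode({"h-parent": item_id})
    descendants_query = urlencode({"h-ancestor": item_id})
    links = {
        "self": f"{collection}/{quote(item_id, safe='')}",
        "children": f"{collection}?{children_query}",
        "descendants": f"{collection}?{descendants_query}",
    }
    if parent_id is not None:
        links["parent"] = f"{collection}/{quote(parent_id, safe='')}"
    return links


links = hierarchy_links("languages", "eng.US", parent_id="eng")
```

In practice it is simpler to follow the links returned by the server; the sketch only shows what the query parameters mean.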
The links section is enhanced to help you navigate to the parent, children and descendants. The hierarchy section has been automatically enhanced with data from the ancestors:
- level is the actual level inside the taxonomy tree. Tree roots have level=1
- parent is the parent you have specified
- ancestors contains a list of identifiers of all the ancestors. The parent is the first item in the list, the root is the last item
- title contains the title of the item itself followed by the titles of all the ancestors (this is useful in the UI so that you do not have to perform extra requests)
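To make the derived fields concrete, here is a minimal sketch that computes level, ancestors and title by walking the parent pointers of an in-memory dictionary of items. It is not the library's implementation; the function name compute_hierarchy and the items dictionary are assumptions made for illustration, and omitting the parent key for root items is a guess.

```python
from typing import Dict, List


def compute_hierarchy(items: Dict[str, dict], item_id: str) -> dict:
    """Derive the hierarchy section for one item from its parent pointers.

    Illustrative only: `items` maps an item id to its payload.
    """
    item = items[item_id]
    parent_id = item.get("hierarchy", {}).get("parent")
    ancestors: List[str] = []
    titles = [item["title"]]  # own title first, then the ancestors' titles
    seen = {item_id}
    current = parent_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at {current!r}")
        seen.add(current)
        ancestors.append(current)  # parent first, root last
        ancestor = items[current]
        titles.append(ancestor["title"])
        current = ancestor.get("hierarchy", {}).get("parent")
    result = {"level": len(ancestors) + 1, "title": titles, "ancestors": ancestors}
    if parent_id is not None:
        result["parent"] = parent_id
    return result


items = {
    "eng": {"id": "eng", "title": {"en": "English", "da": "Engelsk"}},
    "eng.US": {
        "id": "eng.US",
        "title": {"en": "English (US)", "da": "Engelsk (US)"},
        "hierarchy": {"parent": "eng"},
    },
}
child = compute_hierarchy(items, "eng.US")
root = compute_hierarchy(items, "eng")
```

For the two example items, child reproduces the hierarchy section shown in the response above, and root has level 1 with an empty ancestors list.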
The "hierarchy" section is defined via custom fields, see the code.
In your use case you might want to add more information from the ancestors (such as icons or other properties). To do so, define your own custom field and add it to your invenio.cfg, but make sure to copy all the standard system fields.
Importing vocabularies from a file
You can import vocabularies from YAML, CSV, JSON Lines, or Excel (xlsx) files. To do so, call:
invenio oarepo vocabularies fixtures <folder>
The folder must contain a vocabularies.yaml file listing the vocabulary types to import:
access-rights:
  pid-type: v-ar
  title:
    cs: Přístupová práva
    en: Access rights
  data-file: accessRights.xlsx
contributor-types:
  pid-type: v-ct
  title:
    cs: Role přispěvatele
    en: Contributor Type
  data-file: contributorType.xlsx
The data files are then just serialized vocabulary items. For the YAML format, see for example the fixtures at RDM.
The Excel file is just a table whose column names are the fields. Nesting is represented with a dot or underscore, arrays with 0-based indices. See the examples at https://github.com/oarepo/oarepo-vocabularies/tree/main/tests/complex-data.
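As an illustration of the column naming convention, the following sketch turns one flat table row into a nested vocabulary item. It is not the importer's actual code: the function names are hypothetical, only the dot separator is handled (underscores are left alone because they may also appear inside field names), and the tags column is an invented example field.

```python
from typing import Any, Dict


def _listify(node: Any) -> Any:
    """Convert dicts whose keys are all numeric indices into lists."""
    if not isinstance(node, dict):
        return node
    converted = {key: _listify(value) for key, value in node.items()}
    if converted and all(key.isdigit() for key in converted):
        # Order by numeric index; gaps in the indices are collapsed.
        return [converted[key] for key in sorted(converted, key=int)]
    return converted


def unflatten_row(row: Dict[str, Any]) -> Dict[str, Any]:
    """Build a nested item from column names such as "title.en" or "tags.0"."""
    tree: Dict[str, Any] = {}
    for column, value in row.items():
        if value is None or value == "":
            continue  # skip empty cells
        *parents, leaf = column.split(".")
        node = tree
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return _listify(tree)


row = {
    "id": "eng.US",
    "title.en": "English (US)",
    "title.da": "Engelsk (US)",
    "hierarchy.parent": "eng",
    "tags.0": "language",  # hypothetical array column
    "tags.1": "regional",
    "note": "",
}
item = unflatten_row(row)
```

Here item contains a nested title dictionary, a hierarchy section with the parent, and tags as a two-element list; the empty note cell is dropped.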