github.com/caltechlibrary/aspace

Version: 0.0.15-rc1

Repository: https://github.com/caltechlibrary/aspace.git

Documentation: pkg.go.dev

# README

cait

cait is a set of utilities written in the Go language that work with and augment the ArchivesSpace API.

cait - a command line utility for ArchivesSpace interaction (basic CRUD operations and export)
cait-genpages - a simple static page generator based on exported ArchivesSpace content
cait-indexpages - for indexing exported JSON structures with Bleve
cait-servepages - a web service providing public search services and content browsing

Requirements

A working deployment of ArchivesSpace
Golang 1.8 or better to compile
3rd party Go packages
- Bleve by Blevesearch, Apache License, Version 2.0
Caltech Library's Go packages
- cait, Caltech Library's ArchivesSpace integration tools

Compiling

If you already have Go set up and installed, compiling the utilities is pretty straightforward.

Clone the git repository for the project
"Go get" the 3rd party libraries
Compile
Set up the necessary environment variables for using the utilities

Here's a typical example of setting things up.

    go get github.com/blevesearch/bleve/...
    git clone git@github.com:caltechlibrary/cait.git
    cd cait
    mkdir -p $HOME/bin
    export PATH=$HOME/bin:$PATH
    go build -o $HOME/bin/cait cmds/cait/cait.go
    go build -o $HOME/bin/cait-genpages  cmds/cait-genpages/cait-genpages.go
    go build -o $HOME/bin/cait-indexpages cmds/cait-indexpages/cait-indexpages.go
    go build -o $HOME/bin/cait-servepages cmds/cait-servepages/cait-servepages.go

At this point you should have your command line utilities ready to go in the bin directory. You are now ready to set up your environment variables.

Setting up your environment

The command line tools and services are configured via environment variables. Below is an example of setting things up under Bash running on your favorite Unix-like system.

    #!/bin/bash
    #
    # setup.sh - this script sets the environment variables for cait project.
    # You would source this file before using cait, cait-indexpages, or cait-servepages.
    #

    #
    # Local Development setup
    #
    export CAIT_API_URL=http://localhost:8089
    export CAIT_USERNAME=admin
    export CAIT_PASSWORD=admin
    export CAIT_DATASET=dataset
    export CAIT_SITE_URL=http://localhost:8501
    export CAIT_HTDOCS=htdocs
    export CAIT_BLEVE=htdocs.bleve
    export CAIT_TEMPLATES=templates/default

One time setup, create the directories matching your configuration.

    #!/bin/bash
    #
    # Create the necessary directory structure
    #
    mkdir -p $CAIT_DATASET
    mkdir -p $CAIT_HTDOCS
    mkdir -p $CAIT_TEMPLATES

Assuming you use Bash and have saved the file as etc/cait.bash, you can source it from your shell prompt by typing

    . etc/cait.bash

Setting up a dev box

I run ArchivesSpace in a Vagrant box for development use. You can find details to set that up at github.com/caltechlibrary/archivesspace_vagrant. I usually run the cait tools locally. You can see an example workflow in the document EXPORT-IMPORT.md.

Utilities

cait

This command is a general purpose tool for fetching ArchivesSpace data from the ArchivesSpace REST API, saving or modifying that data, and querying the locally captured output of the API.

Currently cait supports operations on repositories, subjects, agents, accessions and digital_objects.

These are the common actions that can be performed

create
list (individually or all ids)
update (can use a file instead of the command line, see -input option)
delete
export (useful for integrating into static websites or batch processing via scripts)

Here's an example session of using the cait command line tool on the repository object.

    . setup.sh # Source my setup file so I can get access to the API
    cait repository create '{"uri":"/repositories/3","repo_code":"My Archive","name":"My Archive"}' # Create an archive called My Archive
    cait repository list # show a list of archives, for example purposes we'll use archive ID of 3
    cait repository list '{"uri":"/repositories/3"}' # Show only the archive JSON for repository ID equal to 3
    cait repository list '{"uri":"/repositories/3"}' > repo3.json # Save the output to the file repo3.json
    cait repository update -input repo3.json # Save your changes back to ArchivesSpace
    cait repository export '{"uri":"/repositories/3"}' # export the repository metadata to data/repositories/3.json
    cait repository delete '{"uri":"/repositories/3"}' # remove repository ID 3

This is the general pattern also used with subject, agent, accession, digital_object.
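
The JSON arguments in the session above identify a record by its uri field, with the numeric ID as the last path segment. As an illustration only, here is a small Go sketch of decoding such an argument and pulling out that ID. The payload type and idFromPayload are hypothetical names for this example; the package's own URIToID, URIToRepoID and URIToVocabularyID functions are the real entry points, and their signatures are not shown in this document.

```go
package main

import (
	"encoding/json"
	"fmt"
	"path"
	"strconv"
)

// payload holds only the field this sketch needs; real ArchivesSpace
// records carry many more fields.
type payload struct {
	URI string `json:"uri"`
}

// idFromPayload is a hypothetical helper: it decodes a JSON argument and
// returns the trailing numeric segment of its uri field.
func idFromPayload(src string) (int, error) {
	var p payload
	if err := json.Unmarshal([]byte(src), &p); err != nil {
		return 0, fmt.Errorf("decoding payload: %w", err)
	}
	if p.URI == "" {
		return 0, fmt.Errorf("payload has no uri field")
	}
	id, err := strconv.Atoi(path.Base(p.URI))
	if err != nil {
		return 0, fmt.Errorf("uri %q does not end in a numeric id: %w", p.URI, err)
	}
	return id, nil
}

func main() {
	id, err := idFromPayload(`{"uri":"/repositories/3"}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id) // prints 3
}
```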

The cait command uses the following environment variables

CAIT_API_URL, the URL to the ArchivesSpace API (e.g. http://localhost:8089 in v1.4.2)
CAIT_USERNAME, username to access the ArchivesSpace API
CAIT_PASSWORD, password to access the ArchivesSpace API
CAIT_DATASET, the directory for exported content

cait-genpages

This command generates static webpages from exported ArchivesSpace content.

It relies on the following environment variables

CAIT_DATASET, where you've exported your ArchivesSpace content
CAIT_HTDOCS, where you want to write your static pages
CAIT_TEMPLATES, the templates to use (this defaults to template/defaults but you probably want custom templates for your site)

The typical process would use cait to export all your content and then run cait-genpages to generate your website content.

    cait archivesspace export # this takes a while
    cait-genpages # this is faster

Assuming the default settings you'll see new webpages in your local htdocs directory.

cait-indexpages

This command creates bleve indexes for use by cait-servepages.

Currently cait-indexpages operates on JSON content exported with cait. It expects a specific directory structure with each individual JSON blob named after its numeric ID and the extension .json. E.g. htdocs/repositories/2/accession/1.json would correspond to accession id 1 for repository 2.
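
To make the expected layout concrete, here is a hedged Go sketch that interprets a path relative to the htdocs root. It assumes exactly the four-segment shape of the example path (repositories, repository ID, record type, ID.json); blobRef and parseBlobPath are hypothetical names for this illustration, and the actual indexer may accept other shapes.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strconv"
	"strings"
)

// blobRef describes one exported JSON file. The type is hypothetical.
type blobRef struct {
	RepoID int
	Kind   string // record type directory, e.g. "accession"
	ID     int
}

// parseBlobPath interprets a path relative to the htdocs root, such as
// "repositories/2/accession/1.json".
func parseBlobPath(rel string) (blobRef, error) {
	parts := strings.Split(filepath.ToSlash(rel), "/")
	if len(parts) != 4 || parts[0] != "repositories" || !strings.HasSuffix(parts[3], ".json") {
		return blobRef{}, fmt.Errorf("unexpected layout: %q", rel)
	}
	repoID, err := strconv.Atoi(parts[1])
	if err != nil {
		return blobRef{}, fmt.Errorf("repository id in %q is not numeric: %w", rel, err)
	}
	id, err := strconv.Atoi(strings.TrimSuffix(parts[3], ".json"))
	if err != nil {
		return blobRef{}, fmt.Errorf("record id in %q is not numeric: %w", rel, err)
	}
	return blobRef{RepoID: repoID, Kind: parts[2], ID: id}, nil
}

func main() {
	ref, err := parseBlobPath("repositories/2/accession/1.json")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("repository %d, %s %d\n", ref.RepoID, ref.Kind, ref.ID)
}
```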

cait-indexpages depends on two environment variables

CAIT_HTDOCS, the root directory where the JSON blobs and HTML files are saved
CAIT_BLEVE, the name of the Bleve index (created or maintained)

cait-servepages

cait-servepages provides both a static web server and a web search service.

Currently cait-servepages uses the Bleve indexes created with cait-indexpages. It also uses the search page and results templates defined in CAIT_TEMPLATES.

It uses the following environment variables

CAIT_HTDOCS, the htdoc root of the website
CAIT_BLEVE, the Bleve index to use to drive the search service
CAIT_TEMPLATES, templates for search service as well as browsable static pages
CAIT_SITE_URL, the url you want to run the search service on (e.g. http://localhost:8501)

Assuming the default setup, you could start the service like this

    cait-servepages

Or you could add a startup script to /etc/init.d/ as appropriate.

Setting up a production box

The basic production environment would export the contents of ArchivesSpace nightly, regenerate the webpages, re-index the webpages and finally restart cait-servepages service.

The script in scripts/nightly-update.sh shows these steps based on the configuration in etc/setup.sh. This script is suitable for running from a cron job under Linux/Unix/Mac OS X.

# Packages

cmds

No description provided by the author

# Functions

CreateCollection

No description provided by the author

FlattenDates

FlattenDates takes an array of Date types and flattens it into a human-readable string.

GetKeys

No description provided by the author

IntListToString

IntListToString builds a string from an array of integers.

New

New creates a new ArchivesSpaceAPI object for use with most of the functions in this package.

OpenCollection

No description provided by the author

ReadJSON

ReadJSON reads a saved JSON file from a dataset collection.

URIToID

URIToID returns an integer ID from a URI for a given type.

URIToRepoID

URIToRepoID returns the repository ID from a URI.

URIToVocabularyID

URIToVocabularyID returns the vocabulary ID from a URI.

WriteJSON

WriteJSON writes out an ArchivesSpace data structure as a JSON file.

# Variables

LicenseText

Version of library.

TmplMap

TmplMap adds functions for working specifically with ArchivesSpace objects.

Version

Version of library.

# Structs

AbstractAgent

AbstractAgent JSONModel(:abstract_agent).

AbstractAgentRelationship

AbstractAgentRelationship JSONModel(:abstract_agent_relationship).

AbstractArchivalObject

AbstractArchivalObject JSONModel(:abstract_archival_object).

AbstractClassification

AbstractClassification JSONModel(:abstract_classification).

AbstractName

AbstractName JSONModel(:abstract_name).

AbstractNote

AbstractNote JSONModel(:abstract_note).

AcccessionPartsRelationship

AcccessionPartsRelationship JSONModel(:accession_parts_relationship).

Accession

Accession JSONModel(:accession).

AccessionSiblingRelationship

AccessionSiblingRelationship JSONModel(:accession_sibling_relationship).

ActiveEdits

ActiveEdits JSONModel(:active_edits).

AdvancedQuery

AdvancedQuery JSONModel(:advanced_query).

Agent

Agent represents an ArchivesSpace complete agent record from the client point of view.

AgentContact

AgentContact JSONModel(:agent_contact).

AgentCorporateEntity

AgentCorporateEntity JSONModel(:agent_corporate_entity).

AgentFamily

AgentFamily JSONModel(:agent_family).

AgentPerson

AgentPerson JSONModel(:agent_person).

AgentRelationshipAssociative

AgentRelationshipAssociative JSONModel(:agent_relationship_associative).

AgentRelationshipEarlierlater

AgentRelationshipEarlierlater JSONModel(:agent_relationship_earlierlater).

AgentRelationshipParentchild

AgentRelationshipParentchild JSONModel(:agent_relationship_parentchild).

AgentRelationshipSubordinatesuperior

AgentRelationshipSubordinatesuperior JSONModel(:agent_relationship_subordinatesuperior).

AgentSoftware

AgentSoftware JSONModel(:agent_software).

ArchivalObject

ArchivalObject JSONModel(:archival_object).

ArchivalRecordChildren

ArchivalRecordChildren JSONModel(:archival_record_children).

ArchivesSpaceAPI

ArchivesSpaceAPI is a struct holding the essentials for communicating with the ArchivesSpace REST API.

BooleanFieldQuery

BooleanFieldQuery JSONModel(:boolean_field_query).

BooleanQuery

BooleanQuery JSONModel(:boolean_query).

Classification

Classification JSONModel(:classification).

ClassificationTerm

ClassificationTerm JSONModel(:classification_term).

ClassificationTree

ClassificationTree JSONModel(:classification_tree).

CollectionManagement

CollectionManagement JSONModel(:collection_management).

Container

Container JSONModel(:container).

ContainerLocation

ContainerLocation JSONModel(:container_location).

ContainerProfile

ContainerProfile JSONModel(:container_profile).

Date

Date JSONModel(:date).

DateFieldQuery

DateFieldQuery JSONModel(:date_field_query).

Deaccession

Deaccession JSONModel(:deaccession).

Defaults

Defaults JSONModel(:defaults).

DefaultValues

DefaultValues JSONModel(:default_values).

DigitalObject

DigitalObject represents a digital object that will eventually become an EAD at COA.

DigitalObjectComponent

DigitalObjectComponent JSONModel(:digital_object_component).

DigitalObjectTree

DigitalObjectTree JSONModel(:digital_object_tree).

DigitalRecordChildren

DigitalRecordChildren JSONModel(:digital_record_children).

Enumeration

Enumeration JSONModel(:enumeration).

EnumerationMigration

EnumerationMigration JSONModel(:enumeration_migration).

EnumerationValue

EnumerationValue JSONModel(:enumeration_value).

Event

Event JSONModel(:event).

Extent

Extent represents an extent JSON model found in Accession records.

ExternalDocument

ExternalDocument is a pointer to external documents.

ExternalID

ExternalID represents an external ID as found in Accession records.

FieldQuery

FieldQuery JSONModel(:field_query).

FileVersion

FileVersion JSONModel(:file_version).

FindAndReplaceJob

FindAndReplaceJob JSONModel(:find_and_replace_job).

Group

Group JSONModel(:group).

ImportJob

ImportJob JSONModel(:import_job).

Instance

Instance JSONModel(:instance).

Job

Job JSONModel(:job).

Location

Location JSONModel(:location).

LocationBatch

LocationBatch JSONModel(:location_batch).

LocationBatchUpdate

LocationBatchUpdate JSONModel(:location_batch_update).

MergeRequest

MergeRequest JSONModel(:merge_request).

NameCorporateEntity

NameCorporateEntity JSONModel(:name_corporate_entity).

NameFamily

NameFamily JSONModel(:name_family).

NameForm

NameForm JSONModel(:name_form).

NamePerson

NamePerson JSONModel(:name_person).

NameSoftware

NameSoftware JSONModel(:name_software).

NavElementView

NavElementView defines the previous and next links used in paging results or browsable record lists.

NormalizedAccessionView

NormalizedAccessionView returns a structure suitable for templating public web content.

NormalizedDigitalObjectView

NormalizedDigitalObjectView returns a structure suitable for templating public web content.

NoteAbstract

NoteAbstract JSONModel(:note_abstract).

NoteBibliography

NoteBibliography JSONModel(:note_bibliography).

NoteBiogHist

NoteBiogHist JSONModel(:note_bioghist).

NoteChronology

NoteChronology JSONModel(:note_chronology).

NoteCitation

NoteCitation JSONModel(:note_citation).

NoteDefinedlist

NoteDefinedlist JSONModel(:note_definedlist).

NoteDigitalObject

NoteDigitalObject JSONModel(:note_digital_object).

NoteIndex

NoteIndex JSONModel(:note_index).

NoteIndexItem

NoteIndexItem JSONModel(:note_index_item).

NoteMultipart

NoteMultipart JSONModel(:note_multipart).

NoteOrderedlist

NoteOrderedlist JSONModel(:note_orderedlist).

NoteOutline

NoteOutline JSONModel(:note_outline).

NoteOutlineLevel

NoteOutlineLevel JSONModel(:note_outline_level).

NoteSinglepart

NoteSinglepart JSONModel(:note_singlepart).

NoteText

NoteText JSONModel(:note_text).

PageView

PageView is a simple container for rendering pages.

Permission

Permission JSONModel(:permission).

Preference

Preference JSONModel(:preference).

PrintToPDFJob

PrintToPDFJob JSONModel(:print_to_pdf_job).

RdeTemplate

RdeTemplate JSONModel(:rde_template).

RecordTree

RecordTree JSONModel(:record_tree).

ReportJob

ReportJob JSONModel(:report_job).

Repository

Repository represents an ArchivesSpace repository from the client point of view.

RepositoryWithAgent

RepositoryWithAgent JSONModel(:repository_with_agent).

Resource

Resource JSONModel(:resource).

ResourceTree

ResourceTree JSONModel(:resource_tree).

ResponseMsg

ResponseMsg is a structure to hold the JSON portion of a response from the ArchivesSpaceAPI.

RevisionStatement

RevisionStatement JSONModel(:revision_statement).

RightsRestriction

RightsRestriction JSONModel(:rights_restriction).

RightsStatement

RightsStatement JSONModel(:rights_statement).

SearchQuery

SearchQuery represents the query options supported by search.

SubContainer

SubContainer JSONModel(:sub_container).

Subject

Subject JSONModel(:subject).

Telephone

Telephone JSONModel(:telephone).

Term

Term JSONModel(:term).

TopContainer

TopContainer JSONModel(:top_container).

User

User is a JSONModel used to administer ArchivesSpace.

UserDefined

UserDefined JSONModel(:user_defined).

Vocabulary

Vocabulary JSONModel(:vocabulary).

# Type aliases

NavView

NavView is an array of NavElementViews.

Object

Object JSONModel(:object).