
Configuration Examples

Basics

At present, the two core applications, gleaner and nabu, use two similar but distinct configuration files. Both can be generated with the command line tool glcon. When generating with glcon, you edit a file called localConfig.yaml, and the generate command produces the two configuration files.

The plan for the future is to refactor these into two files: core services and sources.

Services and Sources

To load data, you need to know the services and the sources. The services can be remote (cloud-based) or local, usually running in a container. (Warning: a local setup is not always easy.)

Using glcon to generate configurations

Step overview:

  • ./glcon config init --cfgName {projectname}
  • edit configs/{projectname}/localConfig.yaml
  • ./glcon config generate --cfgName {projectname}
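
The three steps above can be sketched as a small Python helper that assembles the two glcon invocations and the path of the file to edit between them. This is only an illustration: the helper name glcon_steps and the project name "demo" are made up for the example, and the commands are built but not executed.

```python
from pathlib import Path


def glcon_steps(project: str, glcon: str = "./glcon") -> dict:
    """Return the two glcon commands and the file to edit between them."""
    return {
        # Step 1: create the configuration skeleton for the project.
        "init": [glcon, "config", "init", "--cfgName", project],
        # Step 2: the file to edit by hand.
        "edit": Path("configs") / project / "localConfig.yaml",
        # Step 3: generate the gleaner and nabu configuration files.
        "generate": [glcon, "config", "generate", "--cfgName", project],
    }


steps = glcon_steps("demo")  # "demo" is a hypothetical project name
print(" ".join(steps["init"]))
print("edit", steps["edit"].as_posix())
print(" ".join(steps["generate"]))
```

To actually run a step, the argument lists could be passed to subprocess.run(cmd, check=True), assuming the glcon binary is in the current directory.
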
# NOTE: while you can, it's not always a good pattern to put a comment after a property: value
#     property: value # comment
# sometimes things do not go well
---
minio:
  address: 0.0.0.0
# AWS needs the region included in the bucket address, e.g. s3.us-west-2.amazonaws.com
  port: 9000
  accessKey: worldsbestaccesskey
  secretKey: worldsbestsecretkey
  ssl: false
  bucket: gleaner
  # can be overridden with MINIO_BUCKET
sparql:
  endpoint: http://localhost/blazegraph/namespace/earthcube/sparql
s3:
  bucket: gleaner
  # keep in sync with minio.bucket above... can be overridden with MINIO_BUCKET... gets zapped if it's not here.
  domain: us-east-1

#headless field in gleaner.summoner
headless: http://127.0.0.1:9222
sourcesSource:
  type: csv
  location: sources.csv
# this can be a remote csv
#  type: csv
#  location: https://docs.google.com/spreadsheets/d/{key}/gviz/tq?tqx=out:csv&sheet={sheet_name}
# TBD -- Just use the sources in the gleaner file.
#  type: yaml
#  location: gleaner.yaml
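
The commented-out remote CSV option uses a Google Sheets URL template with two placeholders, {key} and {sheet_name}. A minimal sketch of filling in that template is shown here; the function name sheet_csv_url and the argument values are hypothetical, and percent-encoding the sheet name is an assumption made so that names containing spaces still yield a valid URL.

```python
from urllib.parse import quote


def sheet_csv_url(key: str, sheet_name: str) -> str:
    """Fill in the Google Sheets CSV template from the config comment."""
    return (
        "https://docs.google.com/spreadsheets/d/"
        f"{quote(key, safe='')}/gviz/tq?tqx=out:csv"
        f"&sheet={quote(sheet_name, safe='')}"
    )


# Both arguments are placeholders, not a real spreadsheet.
print(sheet_csv_url("SPREADSHEET_KEY", "my sources"))
```

The resulting string is what would go in the location field under sourcesSource when type is csv.
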

Examples

Demo

This is configured as a local

---
minio:
  address: 0.0.0.0
  port: 9000
  accessKey: worldsbestaccesskey
  secretKey: worldsbestsecretkey
  ssl: false
  bucket: gleaner # can be overridden with MINIO_BUCKET
sparql:
  endpoint: http://localhost/blazegraph/namespace/earthcube/sparql
s3:
  bucket: gleaner # keep in sync with minio.bucket above... can be overridden with MINIO_BUCKET... gets zapped if it's not here.
  domain: us-east-1

#headless field in gleaner.summoner
headless: http://127.0.0.1:9222
sourcesSource:
  type: csv
  location: sources.csv
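
The comments in the configuration say two things about the bucket: minio.bucket and s3.bucket should stay in sync, and the MINIO_BUCKET environment variable can override the value. The sketch below illustrates those two rules on a plain dictionary that mirrors the relevant keys. It is not gleaner's actual resolution logic; the function names are hypothetical, and treating an empty MINIO_BUCKET as unset is an assumption.

```python
import os


def effective_bucket(config: dict, env=None) -> str:
    """Resolve the bucket name, letting MINIO_BUCKET win over the file value."""
    env = os.environ if env is None else env
    # Assumption: an empty or missing MINIO_BUCKET falls back to the file.
    return env.get("MINIO_BUCKET") or config["minio"]["bucket"]


def buckets_in_sync(config: dict) -> bool:
    """True when minio.bucket and s3.bucket name the same bucket."""
    return config["minio"]["bucket"] == config.get("s3", {}).get("bucket")


# Mirrors the bucket-related keys of the configuration above.
demo = {
    "minio": {"bucket": "gleaner"},
    "s3": {"bucket": "gleaner", "domain": "us-east-1"},
}

print(buckets_in_sync(demo))
print(effective_bucket(demo, env={}))
print(effective_bucket(demo, env={"MINIO_BUCKET": "other"}))
```

In practice the dictionary would come from parsing localConfig.yaml, for example with PyYAML's yaml.safe_load; a literal is used here to keep the sketch free of third-party dependencies.
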

Flight Test

This