Configuration Examples
Basics
At present, there are two similar but different configuration files that are used by the two core applications: gleaner
and nabu
These can be generated using a command line tool: glcon
when generating usingglcon
, a file called localConfig.yaml
is edited, and a command generate generates
the two configuraiton files.
Plans for the future are to refactor into two files, core services and sources
Services and Sources
To load data you need to know the services and the sources. The services can be a remote cloud based, or local usually running in a container (warn local is not always easy.)
Using glcon to generate configurations
Step overview:
./glcon config init --cfgName {projectname}
- edit configs/projectname/localConfig.yaml
./glcon config generate --cfgName {projectname}
# NOTE: while you can, it's not always a good pattern to put a comment after a property: value
# property: value # comment
# sometimes things do not go well
---
minio:
address: 0.0.0.0
# aws need to include the region in the bucket. eg: s3.us-west-2.amazonaws.com
port: 9000
accessKey: worldsbestaccesskey
secretKey: worldsbestsecretkey
ssl: false
bucket: gleaner
# can be overridden with MINIO_BUCKET
sparql:
endpoint: http://localhost/blazegraph/namespace/earthcube/sparql
s3:
bucket: gleaner
# sync with above... can be overridden with MINIO_BUCKET... get's zapped if it's not here.
domain: us-east-1
#headless field in gleaner.summoner
headless: http://127.0.0.1:9222
sourcesSource:
type: csv
location: sources.csv
# this can be a remote csv
# type: csv
# location: https://docs.google.com/spreadsheets/d/{key}/gviz/tq?tqx=out:csv&sheet={sheet_name}
# TBD -- Just use the sources in the gleaner file.
# type: yaml
# location: gleaner.yaml
Examples
Demo
This is configured as a local
---
minio:
address: 0.0.0.0
port: 9000
accessKey: worldsbestaccesskey
secretKey: worldsbestsecretkey
ssl: false
bucket: gleaner # can be overridden with MINIO_BUCKET
sparql:
endpoint: http://localhost/blazegraph/namespace/earthcube/sparql
s3:
bucket: gleaner # sync with above... can be overridden with MINIO_BUCKET... get's zapped if it's not here.
domain: us-east-1
#headless field in gleaner.summoner
headless: http://127.0.0.1:9222
sourcesSource:
type: csv
location: sources.csv
Flight Test
This