Skip to content

Data Loading - Watching CLI/Manual Loading

When you are doing a manual load, things you can observe

  • console
  • logs
  • Minio/s3

Note

Remember to use screen for long data loads. This will make sure the process does not die when a terminal disconnccts. screen -S {SOME_NAME}

console

  • screen -ls
  • screen r {SOME_NAME} You should see a set of json records being reported.

Note

Nabu will provide a progress bar, but not anything into the main gleaner log.

Logs

  • cd indexing or whereever you ran glcon from
  • ls -l logs You will see a set of logs. initally, this will be there:
gleaner-2023-04-27-21-56-46.log

Then when sitemaps are loaded, reposiories will appear:

repo-magic-issues-2022-12-20-18-54-58.log
repo-magic-loaded-2022-10-06-15-40-07.log
repo-opentopography-issues-2022-10-05-22-04-03.log
repo-opentopography-loaded-2022-10-05-22-04-03.log

Note

Headless repositories run after the not headless repositories. They also run serially, so it can take a long time to run headless. You can just run the headless as separate runs using the --source {SOURCE/REPO}


gleaner-runstats-2023-04-27-05-38-59.log

You can tail -f logs/{file}

Minioadmin/s

In minioadmin you can see a bucket loading. Go to a minioadmin https://minioadmin.geocodes.ncsa.illinois.edu/

Go to the bucket you are loading, summoned path select a repo, Sort by date to see what is the latest loaded (click twice)

Run a missing report

If you run the missing_report you can do a quick idea of what did not make it in... but there is still alot to go the there will be a lot of missing.