GitHub recommendation, version 3
Idea: download data from the BigQuery GitHub table, then analyze and visualize it.
- Use github.com/ferryzhou/gcutil
- Download the permission JSON file and save it to ~/.config/gcloud/application_default_credentials.json; see https://godoc.org/golang.org/x/oauth2/google#DefaultTokenSource
- Enable the storage JSON API
- Set up the BigQuery project, dataset and storage bucket
Download data
./run.sh get_bq_data
- Result Data
- repos.csv: repo_url, name, owner, created_at, watchers, language, description, ...
- recs.csv: repo1_url, repo2_url, count
The raw data is large and we don't need all of it, so this step sequences the URLs and truncates the recommendations data.
Map each shortPath to an int and vice versa; recs[i] is a slice of
./run.sh process_data
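
The processing step can be sketched roughly as follows. This is an illustration under stated assumptions, not the code behind run.sh: it assumes shortPath means "owner/name", that recs.csv has no header row, that truncation means keeping the top_n highest counts per repo, and that each recs[i] entry is a (neighbour_id, count) pair (the note above does not say what the slice holds). All function names are hypothetical.

```python
import csv
import io
from collections import defaultdict


def short_path(repo_url):
    """Reduce a repo URL to 'owner/name' (an assumed reading of shortPath)."""
    parts = repo_url.rstrip("/").split("/")
    return "/".join(parts[-2:])


def read_recs(text):
    """Parse recs.csv content (no header row assumed) into tuples."""
    return [tuple(row) for row in csv.reader(io.StringIO(text)) if row]


def sequence_and_truncate(rec_rows, top_n=3):
    """Give each shortPath an int id and keep the top_n neighbours per repo.

    rec_rows yields (repo1_url, repo2_url, count) tuples as in recs.csv.
    Returns (path_to_id, id_to_path, recs); recs[i] is a list of
    (neighbour_id, count) pairs sorted by descending count (assumed layout).
    """
    path_to_id = {}
    id_to_path = []

    def intern(path):
        # The first sighting of a path gets the next free id.
        if path not in path_to_id:
            path_to_id[path] = len(id_to_path)
            id_to_path.append(path)
        return path_to_id[path]

    neighbours = defaultdict(list)
    for repo1_url, repo2_url, count in rec_rows:
        a = intern(short_path(repo1_url))
        b = intern(short_path(repo2_url))
        neighbours[a].append((b, int(count)))

    recs = [[] for _ in id_to_path]
    for repo_id, pairs in neighbours.items():
        # Stable sort, so ties keep their input order.
        pairs.sort(key=lambda p: -p[1])
        recs[repo_id] = pairs[:top_n]
    return path_to_id, id_to_path, recs
```

Storing ints instead of full URLs keeps the recommendation lists compact, and id_to_path gives the reverse lookup when results need to be shown.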
- Prerequisite
Install PostgREST.
- Serve repos data: /repos?
- Load data into Postgres
./run.sh csv2db
- Run postgrest
./run.sh serve_repos
./run.sh serve
- Test api
./run.sh test_api
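
Once PostgREST is serving the repos table, a query against /repos might look like the sketch below. It is an assumption-laden example, not what test_api does: the base URL (http://localhost:3000 is only PostgREST's default port), the chosen columns, and the helper names are all hypothetical. It relies only on PostgREST's standard query syntax (select=, column=eq.value, column=gte.value, order=column.desc, limit=).

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def repos_query_url(base_url, language=None, min_watchers=None, limit=10):
    """Build a PostgREST URL for the repos table (hypothetical helper)."""
    params = [("select", "name,owner,watchers,language")]
    if language is not None:
        params.append(("language", "eq." + language))
    if min_watchers is not None:
        params.append(("watchers", "gte." + str(min_watchers)))
    params.append(("order", "watchers.desc"))
    params.append(("limit", str(limit)))
    # Keep commas literal so the select list stays readable.
    return base_url.rstrip("/") + "/repos?" + urlencode(params, safe=",")


def fetch_json(url):
    """GET the URL and decode the JSON body (needs a running server)."""
    with urlopen(url) as resp:
        return json.load(resp)
```

For example, fetch_json(repos_query_url("http://localhost:3000", language="Go")) would request the most-watched Go repos, assuming the server is up and the table has those columns.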