Calculating Vulnerability Alerts with the Github API

At work I recently had to show our risk profile with Github vulnerability alerts and display them in Domo. Github's APIv3 (REST) doesn't allow you to query the vulnerability alerts, but APIv4 (GraphQL) does. I found the documentation around gathering those results very opaque due to being in Preview status, so here are some examples for pulling out the data you need.

Github Vulnerability Alerts can be enabled in the Settings component of a repository. It will scan the dependencies listed in various package managers (requirements.txt, package.json, etc) and build a dependency graph (super cool!). From there, each dependency is checked for CVEs posted on public vulnerability websites. This is useful information but it is difficult to visualize when you have many (in my case, hundreds) of repositories to watch.

It was my first time interacting with a GraphSQL API server and I went from "pfft, this is stupid, I like REST" to "Oh wow, this is surprisingly useful". The biggest painpoints are in switching your mindset (many calls into one call) and in building the query tree, but you will get the hang of it quickly.

Step 1

Run the first command. It is shown in curl format, but you can easily translate this to your favourite programming language.

You need to provide your Github username and a personal access token to authenticate to the API. The payload will be in JSON format. The tee command allows you to see the output from the curl command but also redirect it to a file for further processing.

NB: I'm using line breaks (\) to make the command more readable.

curl --silent \
  -H "Accept: application/vnd.github.vixen-preview" \
  -u $GITHUB_USERNAME:$GITHUB_PERSONAL_TOKEN \
  -X POST \
  -d "{ \"query\": \"{ organization(login:unbounce) { repositories(first: 100) { edges { cursor, node { name, vulnerabilityAlerts { totalCount } } } } } }\"" \
  https://api.github.com/graphql | tee payload.1.json

Step 2

Using the resulting file, grab the last record's cursor:

$ jq -r '.data.organization.repositories.edges[-1].cursor' payload.1.json
AQ

Step 3

A limitation of the Github API is that you can only request 100 repositories at a time. So you have to paginate your queries by asking for 100 results in the first attribute), and using the value of the cursor as the starting point in the after attribute.

Run another command and set the cursor to the right place. Don't forget to increment the JSON filename where the output is being written (e.g. change payload.1.json to payload.2.json).

curl --silent \
  -H "Accept: application/vnd.github.vixen-preview" \
  -u $GITHUB_USERNAME:$GITHUB_PERSONAL_TOKEN \
  -X POST \
  -d "{ \"query\": \"{ organization(login:unbounce) { repositories(first: 100, after: AQ) { edges { cursor, node { name, vulnerabilityAlerts { totalCount } } } } } }\"" \
  https://api.github.com/graphql | tee payload.2.json

Step 4

Repeat steps 2 and 3 until there are no more results.

Summary

The JSON file will contain data in the following format:

{
  "data": {
    "organization": {
      "edges": [
        {
          "cursor": "AQ",
          "node": {
            "name": "paw",
            "vulnerabilityAlerts": {
              "totalCount": 1
            }
          }
        }
      ]
    }
  }
}

You can then slice and dice the JSON file with jq to put the data in a more convenient format for uploading to visualization tools like Domo or Excel.