digipres/awesome-gdoc-indexer

Awesome Google Docs Indexer

An experiment towards integrating Google Documents with Awesome Indexes.

In this version, we attempt a much simpler task:

  • We start with a (potentially private) Google Document that is used to collect meeting notes.
  • We want to summarise what kinds of topics are covered in those meetings, without publishing anything sensitive.
  • Can we do that by just extracting any hyperlinks and their anchor text?

The current version does this. To set it up and run it:

$ python3.11 -m venv .venv
$ source .venv/bin/activate
$ pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib
$ python extract_links.py <DOCUMENT_ID> links.csv

The results are not suitable for publication without manual review! Some links may be sensitive. But they could be used to generate summaries from time to time, with manual oversight.

The script itself was initially created using Gemini 2.5 Pro.
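The script is not reproduced here. As a rough illustration of the approach, the sketch below walks a document in the JSON shape returned by the Google Docs API and collects each hyperlink with its anchor text. The function names `iter_links` and `write_links_csv` and the CSV column names are assumptions made for this example, not necessarily what `extract_links.py` uses. The sketch only handles paragraphs and tables.

```python
import csv
from typing import Iterator


def iter_links(content: list[dict]) -> Iterator[tuple[str, str]]:
    """Yield (anchor_text, url) pairs from a list of Docs API structural elements."""
    for element in content:
        if "paragraph" in element:
            for part in element["paragraph"].get("elements", []):
                run = part.get("textRun")
                if not run:
                    continue  # e.g. inline images or page breaks
                # A hyperlink is stored on the text run's style, not in its text.
                url = run.get("textStyle", {}).get("link", {}).get("url")
                if url:
                    yield run.get("content", "").strip(), url
        elif "table" in element:
            # Table cells hold their own list of structural elements, so recurse.
            for row in element["table"].get("tableRows", []):
                for cell in row.get("tableCells", []):
                    yield from iter_links(cell.get("content", []))


def write_links_csv(document: dict, path: str) -> int:
    """Write every link found in the document body to a CSV file; return the count."""
    rows = list(iter_links(document.get("body", {}).get("content", [])))
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["anchor_text", "url"])  # hypothetical column names
        writer.writerows(rows)
    return len(rows)
```

Because only the link URL and the linked text are kept, the surrounding meeting notes never reach the output file. The anchor text itself can still be sensitive, which is why the manual review step matters.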

Background & Credentials Setup

This script was largely based on the Google Docs Python quickstart. Setting up the credentials was the more time-consuming part. The basic flow is outlined in the quickstart document, but some additional steps were needed:

  • The first time I ran it, it complained that the credentials didn't have access to the Google Docs API. The error provided a link where that could be added in.
  • I added my own email address as a test user account.
  • After that, when first running the script, an instance of Chrome was started so I could log in as that test user and authorise the application.
  • This created a token.json file that the script could use.

At this point, I can run this script on any document that I (the test user) have access to.
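The authorisation flow described above can be sketched as follows, closely following the pattern in the Google Docs Python quickstart. This is a minimal sketch, not the repository's code: the function names `load_credentials` and `fetch_document`, the read-only scope, and the `credentials.json` file name (the quickstart's name for the downloaded OAuth client file) are assumptions.

```python
import os

# Assumption: read-only access to Google Docs is enough to extract links.
SCOPES = ["https://www.googleapis.com/auth/documents.readonly"]


def load_credentials(token_path="token.json", secrets_path="credentials.json"):
    """Reuse a saved token if possible; otherwise run the browser login flow."""
    # Imported here so the module still loads without the Google client libraries.
    from google.auth.transport.requests import Request
    from google.oauth2.credentials import Credentials
    from google_auth_oauthlib.flow import InstalledAppFlow

    creds = None
    if os.path.exists(token_path):
        creds = Credentials.from_authorized_user_file(token_path, SCOPES)
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            # Opens a browser window so the test user can authorise the app.
            flow = InstalledAppFlow.from_client_secrets_file(secrets_path, SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the token so later runs skip the browser step.
        with open(token_path, "w") as token:
            token.write(creds.to_json())
    return creds


def fetch_document(document_id, creds):
    """Fetch a document's JSON via the Docs API (requires the API to be enabled)."""
    from googleapiclient.discovery import build

    service = build("docs", "v1", credentials=creds)
    return service.documents().get(documentId=document_id).execute()
```

Deleting `token.json` forces the browser login to run again, which is useful when switching test users or changing scopes.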
