Python script to query GitHub code in your terminal

I was using GitHub's Advanced Search the other day and thought: what if I could use this from a terminal? So I started trying things out in Python, which led to the following script.

Some comments about the script

  • It is an interactive script that you run from a terminal. The main options are (n) for a new search and (s) to show more context from a script. You first enter a search term and, optionally, a programming language and the number of result pages to parse. See the example session at the end of this post.
  • It takes these arguments and builds the search URL (base URL = https://github.com/search?q=), then uses html2text.theinfo.org to strip out the HTML. I used the remote version here; for local use, just download and import the html2text Python module (see my last post for an example).
  • It filters the relevant text with re.split(r"seconds\)|## Breakdown", html). This is based on what html2text makes of GitHub's HTML markup.
  • When you choose (s) and then the number of a search result, the method show_script_context fetches the raw script and shows each line that matches the search string with the surrounding lines as context: 8 before and, because of how the slice is written, 7 after (much like grep -B8 -A7).
  • You can use Conque to run this in a split window in Vim, which lets you copy output into the script you are working on.
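
The URL building and filtering steps described above can be sketched in isolation. This is a minimal sketch, not the script itself: it is written for Python 3 (the script below is Python 2), the function names build_query_url and extract_results are made up for illustration, and it assumes the html2text service still accepts a ?url= parameter and that GitHub's search markup still produces the "seconds)" and "## Breakdown" markers, neither of which is guaranteed today.

```python
import re
from urllib.parse import quote_plus

GITHUB_SEARCH_URL = "https://github.com/search?q="
HTML2TEXT_URL = "http://html2text.theinfo.org/?url="


def build_query_url(term, page=1, lang=""):
    """Return the html2text URL that wraps a GitHub code-search URL."""
    # Spaces in the search term become '+', as the script does on input.
    term = term.strip().lower().replace(" ", "+")
    search_url = "%s%s&p=%s&ref=searchbar&type=Code&l=%s" % (
        GITHUB_SEARCH_URL, term, page, lang)
    # Percent-encode the whole GitHub URL so it travels as one parameter.
    return HTML2TEXT_URL + quote_plus(search_url)


def extract_results(text):
    """Return the '##'-separated chunks found between the two markers."""
    parts = re.split(r"seconds\)|## Breakdown", text)
    # parts[1] is the text after "seconds)" and before "## Breakdown".
    return parts[1].split("##") if len(parts) > 1 else []
```

Note that the closing parenthesis in the pattern has to be escaped; an unescaped ")" without a matching "(" makes re raise an error.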

The code

See below and on github:

#!/usr/bin/env python                                                                                                                                     
# -*- coding: utf-8 -*-
# Author: Bob Belderbos / written: Dec 2012
# Purpose: have an interactive github cli search app
#
import re, sys, urllib, pprint
# import html2text # -- to use local version

class GithubSearch:
  """ This is a command line wrapper around Github's Advanced Search
      https://github.com/search """

  def __init__(self):
    """ Setup variables """
    self.searchTerm = ""
    self.scripts = []
    self.show_menu()


  def show_menu(self):
    """ Show a menu to interactively use this program """
    prompt = """
      (N)ew search
      (S)how more context (github script)
      (Q)uit
      Enter choice: """
    while True:
      chosen = False 
      while not chosen:
        try:
          choice = raw_input(prompt).strip().lower()
        except (EOFError, KeyboardInterrupt):
          choice = 'q'
        except:
          sys.exit("Not a valid option")
        print '\nYou picked: [%s]' % choice
        if choice not in ('n', 's', 'q'): # a substring test on 'nsq' would also accept '' and 'ns'
          print "This is an invalid option, try again"
        else:
          chosen = True
      if choice == 'q': sys.exit("Goodbye!")
      if choice == 'n': self.new_search() 
      if choice == 's': self.show_script_context()

  
  def new_search(self):
    """ Take the input field info for the advanced git search """
    # reset script url tracking list and counter
    self.scripts = [] 
    self.counter = 0
    # take user input to define the search
    try:
      self.searchTerm = raw_input("Enter search term: ").strip().lower().replace(" ", "+")
    except:
      sys.exit("Error handling this search term, exiting ...")
    lang = raw_input("Filter on programming language (press Enter to include all): ").strip().lower()
    try:
      prompt = "Number of search pages to process (default = 3): "
      numSearchPages = int(raw_input(prompt).strip()[0])
    except:
      numSearchPages = 3
    # get the search results
    for page in range(1,numSearchPages+1):
      results = self.get_search_results(page, lang)
      for result in results[1].split("##"): # each search result is divided by ##
        self.parse_search_result(result)


  def get_search_results(self, page, lang):
    """ Query github's advanced search and re.split for the relevant piece of info 
        RFE: have a branch to use html2text local copy if present, vs. remote if not """
    githubSearchUrl = "https://github.com/search?q="
    searchUrl = urllib.quote_plus("%s%s&p=%s&ref=searchbar&type=Code&l=%s" % 
      (githubSearchUrl, self.searchTerm, page, lang))
    html2textUrl = "http://html2text.theinfo.org/?url="
    queryUrl = html2textUrl+searchUrl
    html = urllib.urlopen(queryUrl).read()
    return re.split(r"seconds\)|## Breakdown", html)


  def parse_search_result(self, result):
    """ Process the search results, also store each script URL in a list for reference """
    lines = result.split("\n")
    source = "".join(lines[0:2])
    pattern = re.compile(r".*\((.*?)\)\s+\((.*?)\).*")
    m = pattern.match(source)
    if m != None:
      self.counter += 1 
      url = "https://raw.github.com%s" % m.group(1).replace("tree/", "")
      lang = m.group(2)
      self.print_banner(lang, url)
      self.scripts.append(url) # keep track of script links 
      for line in lines[2:]:
        # ignore pagination markup
        if "github.com" in line or "https://git" in line or "[Next" in line: continue 
        if line.strip() == "": continue
        print line


  def print_banner(self, lang, url):
    """ Print the script, lang, etc. in a clearly formatted way """
    print "\n" + "+" * 125
    print "(%i) %s / src: %s" % (self.counter, lang, url)


  def show_script_context(self, script_num=""):
    """ Another menu option to show more context from the github script 
        surrounding or leading up to the search term """
    if len(self.scripts) == 0:
      print "There are no search results yet, so cannot show any scripts yet."
      return False
    script_num = int(raw_input("Enter search result number: ").strip())
    script = self.scripts[script_num-1] # list starts with index 0 = 1 less than counter
    a = urllib.urlopen(script)
    if a.getcode() != 200:
      print "The requested script did not give a 200 return code"
      return False
    lines = a.readlines() 
    a.close()
    if len(lines) == 0:
      print "Did not get content back from script, maybe it is gone?"
      return False
    num_context_lines = 8
    print "\nExtracting more context for search term <%s> ..." % self.searchTerm
    print "Showing %i lines before and after the match in the original script hosted here:\n%s\n" % \
      (num_context_lines, script)
    for i, line in enumerate(lines):
      if self.searchTerm.lower() in line.lower():
        print "\n... %s found at line %i ..." % (self.searchTerm, i)
        j = max(0, i - num_context_lines) # clamp: a negative slice start would wrap around
        for x in lines[j : i+num_context_lines]:
          if self.searchTerm.lower() in x.lower():
            print "%i ---> %s" % (j, x), # makes the match stand out
          else:
            print "%i      %s" % (j, x),        
          j += 1


### instant
github = GithubSearch()
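
The context display in show_script_context can also be written as a small standalone function, which makes the windowing easier to test. This is only a sketch in Python 3 syntax; the names context_window and num_context are hypothetical and not part of the script. Unlike the script, it includes the full num_context lines after the match as well as before, and it clamps both ends of the window to the list bounds.

```python
def context_window(lines, term, num_context=8):
    """Yield (index, is_match, text) for the lines around each match."""
    needle = term.lower()
    for i, line in enumerate(lines):
        if needle not in line.lower():
            continue
        # Clamp the start at 0 so a match near the top of the file
        # does not turn into a negative (wrap-around) slice index.
        start = max(0, i - num_context)
        # +1 so the window holds num_context lines after the match too.
        end = min(len(lines), i + num_context + 1)
        for j in range(start, end):
            yield j, needle in lines[j].lower(), lines[j]


# Small made-up input to show the call; indexes are 0-based like the script's.
sample = ["import os", "x = 1", "os.system('clear')", "y = 2"]
for idx, is_match, text in context_window(sample, "os.system", num_context=1):
    marker = "--->" if is_match else "    "
    print("%i %s %s" % (idx, marker, text))
```

Returning tuples instead of printing directly keeps the formatting (the ---> marker) separate from the search logic.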

See it in action

$ vi github_search.py 


      (N)ew search
      (S)how more context (github script)
      (Q)uit
      Enter choice: N

You picked: [n]
Enter search term: os.system
Filter on programming language (press Enter to include all): python
Number of search pages to process (default = 3): 

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
(1) Python / src: https://raw.github.com/fhopecc/stxt/1a14c802362047af4c9f6d5ec2312a57cbc9bca6/task/setup_win.py
    import os
    _os.system_(

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
(2) Python / src: https://raw.github.com/fhopecc/stxt/325dc6e2cbfecc9d071264f71aee7b156a8a6970/task/shutdown.py
    import os
    _os.system_('shutdown -s -f')

..
..
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
(9) Python / src: https://raw.github.com/rob0r/snmpnetif/b6228f3ba6c55a7f8119af3a1bd4c014f5533b9b/snmpnetif.py
    (True):
                try:
                    # clear the screen
                    if os.name == 'nt': clearscreen = _os.system_

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
(10) Python / src: https://raw.github.com/trey0/geocamShare/98029ffb1d26784346f7a2e5984048e8764df116/djangoWsgi.py
    .mkstemp('djangoWsgiSourceMe.txt')
        os.close(fd)
        _os.system_('bash -c "(source %s/sourceme.sh && printenv > %s

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
(11) Python / src: https://raw.github.com/jphcoi/MLMTC/48029dd647dc17173ed94693deccbb8d7bb42ed6/map_builder/CFpipe.py
    _sys
        try:
          _os.system_(command_sys)
        except:
          '----------------------------------detection de communaut

      (N)ew search
      (S)how more context (github script)
      (Q)uit
      Enter choice: s

You picked: [s]
Enter search result number: 9

Extracting more context for search term <os.system> ...
Showing 8 lines before and after the match in the original script hosted here:

https://raw.github.com/rob0r/snmpnetif/b6228f3ba6c55a7f8119af3a1bd4c014f5533b9b/snmpnetif.py

... os.system found at line 250 ...
242              ifidx = self.ifactive()
243              
244              # get active interface names
245              ifnames = self.ifnames(ifidx)
246              
247              while(True):
248                  try:
249                      # clear the screen
250 --->                 if os.name == 'nt': clearscreen = os.system('cls')
251 --->                 if os.name == 'posix': clearscreen = os.system('clear')
252                      
253                      # print the device name and uptime
254                      print(devicename)
255                      print('Device uptime: {0}\n').format(self.devuptime())
256                      
257                      # print stats if the first loop has run
..

      (N)ew search
      (S)how more context (github script)
      (Q)uit
      Enter choice: n

You picked: [n]
Enter search term: grep
Filter on programming language (press Enter to include all): perl
Number of search pages to process (default = 3): 

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
(1) Perl / src: https://raw.github.com/xPapers/xPapers/1fe2bf177e3d37f2024d00601340627a8ded85ad/lib/xPapers/Cat.pm
    ->catCount($me->catCount-1);
        $me->save;
        $me->clear_cache;
        # detach
        $me->cat_memberships([_grep_

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
(2) Perl / src: https://raw.github.com/roethigj/Lx-Office-Anpassungen/e06afb2fc94573bc4a305a41e95a8b7a812e2db0/SL/IS.pm
    ->{TEMPLATE_ARRAYS}->{$_} }, "") } _grep_({ $_ ne "description" } @arrays));
        }
        $form

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
(3) Perl / src: https://raw.github.com/roethigj/Lx-Office-Anpassungen/e06afb2fc94573bc4a305a41e95a8b7a812e2db0/SL/OE.pm
    ->{sort} && _grep_($form->{sort}, keys(%allowed_sort_columns))) {
        $sortorder = $allowed

      (N)ew search
      (S)how more context (github script)
      (Q)uit
      Enter choice: s

You picked: [s]
Enter search result number: 3

Extracting more context for search term <grep> ...
Showing 8 lines before and after the match in the original script hosted here:

https://raw.github.com/roethigj/Lx-Office-Anpassungen/e06afb2fc94573bc4a305a41e95a8b7a812e2db0/SL/OE.pm

... grep found at line 210 ...
202          "ordnumber"               => "o.ordnumber",
203          "quonumber"               => "o.quonumber",
204          "name"                    => "ct.name",
205          "employee"                => "e.name",
206          "salesman"                => "e.name",
207          "shipvia"                 => "o.shipvia",
208          "transaction_description" => "o.transaction_description"
209        );
210 --->   if ($form->{sort} && grep($form->{sort}, keys(%allowed_sort_columns))) {
211          $sortorder = $allowed_sort_columns{$form->{sort}} . " ${sortdir}";
212        }
213        $query .= qq| ORDER by | . $sortorder;
214      
215        my $sth = $dbh->prepare($query);
216        $sth->execute(@values) ||
217          $form->dberror($query . " (" . join(", ", @values) . ")");

... grep found at line 1135 ...
1127        my $sameitem = "";
1128        foreach $item (sort { $a->[1] cmp $b->[1] } @partsgroup) {
1129          $i = $item->[0];
1130      
1131          if ($item->[1] ne $sameitem) {
1132            push(@{ $form->{TEMPLATE_ARRAYS}->{description} }, qq|$item->[1]|);
1133            $sameitem = $item->[1];
1134      
1135 --->       map({ push(@{ $form->{TEMPLATE_ARRAYS}->{$_} }, "") } grep({ $_ ne "description" } @arrays));
1136          }
1137      
1138          $form->{"qty_$i"} = $form->parse_amount($myconfig, $form->{"qty_$i"});
1139      
1140          if ($form->{"id_$i"} != 0) {
1141      
1142            # add number, description and qty to $form->{number}, ....
..

      (N)ew search
      (S)how more context (github script)
      (Q)uit
      Enter choice: q

You picked: [q]
Goodbye!

shell returned 1
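
The "src:" links in the session above are assembled in parse_search_result from the header of each search result. Here is a sketch of that step on its own, in Python 3 syntax. It assumes html2text renders a result header as a Markdown link followed by the language in parentheses; that header format is inferred from the regular expression, and the sample header below is reconstructed for illustration rather than captured from a real run. The name parse_header is also made up.

```python
import re

# Path inside the first pair of parentheses, language inside the second.
HEADER = re.compile(r".*\((.*?)\)\s+\((.*?)\).*")


def parse_header(source):
    """Return (raw_url, language) for a result header, or None if no match."""
    m = HEADER.match(source)
    if m is None:
        return None
    # Dropping "tree/" turns the repository path into a raw.github.com path.
    raw_url = "https://raw.github.com%s" % m.group(1).replace("tree/", "")
    return raw_url, m.group(2)


# Hypothetical header, shaped the way the regex expects it.
header = ("[fhopecc/stxt](/fhopecc/stxt/tree/"
          "1a14c802362047af4c9f6d5ec2312a57cbc9bca6/task/setup_win.py) (Python)")
print(parse_header(header))
```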

© 2014 Bob Belderbos. All rights reserved.
