Thursday, December 04, 2008

How Python helped me grab all the results

The results are out again, and everyone is excited to ask "What was your GPA?". And there I am, sitting in my room cursing my own 8.07 and wondering how to grab everyone's GPA!
I've been learning Python on my own this winter break, so I just looked up on Google how to access the web from Python. I ended up at urllib2 - "extensible library for opening URLs".
And the function urlopen(url[, data]) is what looks promising for the hack.

The result page is simple - a page with an input field and instructions on how to access results. Once a BG10XYY$$$ style registration code is filled in, it shows you the result, again as formatted HTML with a lot of things that we don't want to see :)

Observation #1: hitting the submit button takes you to$$$

So how do you access results? Simply by pushing a variable called RegNo into the GET request to Grades.jsp. And by now you'd have guessed what we're gonna do! I will just drive the RegNo=BG10XYY$$$ part from the first registration number to the last, using the urlopen function you saw above.
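If you are following along on Python 3, urllib2 no longer exists under that name; its pieces live in urllib.request and urllib.parse. Here is a rough sketch of how the RegNo request could be put together there. The base address below is a made-up placeholder (the real one is not shown in this post), and build_url and fetch are names I picked just for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical placeholder: the real address of Grades.jsp is not given in this post.
BASE_URL = "http://results.example.invalid/Grades.jsp"

def build_url(code, base=BASE_URL):
    """Return the GET URL that asks Grades.jsp for one registration code."""
    return base + "?" + urlencode({"RegNo": code})

def fetch(code, base=BASE_URL):
    """Download the result page for one code and return it as text."""
    with urlopen(build_url(code, base)) as response:
        return response.read().decode("utf-8", errors="replace")

# Only builds the URL; no network access happens here.
print(build_url("bg107cs001"))
```

Using urlencode instead of plain string concatenation keeps the query string valid even if a code ever contains odd characters.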

Observation #2: take a look at the html we get back from Grades.jsp
 <p align="center"><font size="5" color="#000000"><b>
Name</b> :- </font><font size="4" color="#000000"> Abinaya A</font></td>


<table border="1" width="50%" align="center" >
<td width="64%">
<p align="center"><font size="4" color="#000000"><b>Grade Point Average</b></font></td>
<td width="36%">
<p align="center" ><font color="#000000">6.785714</font></p></td>


Obviously the information of interest is entangled in HTML code that has no meaning for our purpose. We can use split to take it out :)
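To see the split trick on its own, here is a small Python 3 sketch run against the HTML fragment above. The helper names between and parse_result are my own, and the markers are copied from the fragment, so this only works as long as the page keeps exactly that markup.

```python
SAMPLE = '''<p align="center"><font size="5" color="#000000"><b>
Name</b> :- </font><font size="4" color="#000000"> Abinaya A</font></td>
<table border="1" width="50%" align="center" >
<td width="64%">
<p align="center"><font size="4" color="#000000"><b>Grade Point Average</b></font></td>
<td width="36%">
<p align="center" ><font color="#000000">6.785714</font></p></td>
'''

def between(text, start, end):
    """Return the text between the first `start` marker and the next `end` marker."""
    after_start = text.split(start, 1)[1]
    return after_start.split(end, 1)[0]

def parse_result(html):
    """Pull (name, gpa) out of a result page, or return None if no GPA is shown."""
    if "Average" not in html:
        return None
    name = between(html, 'Name</b> :- </font><font size="4" color="#000000">', "</font></td>")
    gpa = between(html, '<p align="center" ><font color="#000000">', "</font")
    return name.strip(), float(gpa)

print(parse_result(SAMPLE))
```

Splitting on literal markers is fragile, but for a one-off hack against a page with fixed markup it gets the job done without an HTML parser.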

So now, for the sake of modularity and elegance, I divide my code into functions -
1. getresult(code) takes a registration code, fetches the page, splits the HTML to grab the result (name and GPA), and writes it to stdout
2. driveresult() generates registration codes for a branch and year and calls getresult in a loop.
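The code generation step from the list above can be sketched separately in Python 3. The name registration_codes is hypothetical, and I use zfill for the zero padding here, which gives the same three-digit result as rjust(3, "0").

```python
def registration_codes(prefix, branch, last):
    """Yield codes such as bg107cs001, bg107cs002, ... up to `last` for one branch."""
    for number in range(1, last + 1):
        yield prefix + branch + str(number).zfill(3)

codes = list(registration_codes("bg107", "cs", 180))
print(codes[0], codes[-1], len(codes))
```

Keeping the generator apart from the fetching makes it easy to check the codes by eye before hitting the server.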

Finally we have -
# looks pretty small right ;)

import urllib2

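def getresult(code):
    # the address of Grades.jsp with its RegNo= query goes inside the quotes (left out here)
    f = urllib2.urlopen(""+code)
    crap = f.read()
    # only pages that actually show a Grade Point Average are parsed
    if (crap.find("Average")!=-1):
        # the name sits between these two markers
        first,second = crap.split("Name</b> :- </font><font size=\"4\" color=\"#000000\">",1)
        first,second = second.split("</font></td>",1)
        name = first
        # the gpa sits between these two markers
        first,second = crap.split("<p align=\"center\" ><font color=\"#000000\">",1)
        first,second = second.split("</font",1)
        gpa = first
        print code,"\t\t",name,"\t\t\t",gpa

def driveresult (prefix,branch,max):
    counter = 1
    while (counter<=max):
        # pad the counter to three digits: 1 -> "001"
        num = str(counter)
        num = num.rjust(3,"0")
        #print num
        getresult (prefix+branch+num)
        # move on to the next registration number, otherwise the loop never ends
        counter = counter + 1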

driveresult ("bg107","cs",180)
print "---------------------------------------------------------------------"
driveresult ("bg107","ec",200)
print "---------------------------------------------------------------------"
driveresult ("bg107","ee",200)
print "---------------------------------------------------------------------"
driveresult ("bg107","it",200)
print "---------------------------------------------------------------------"
driveresult ("bg107","ei",200)
print "---------------------------------------------------------------------"
And here is the result -
Hope you enjoyed reading about this hack. Do let me know if you did something similar in your college/work/anywhere.



At December 19, 2008 at 1:55 PM , Blogger barun said...

Perfect application of one's knowledge! Btw, how do you add that date tag on the top left corner of your post? I was searching for that ... but didn't get yet.

At December 19, 2008 at 2:07 PM , Blogger Abhishek Mishra said...

Hey thanks... the date thingy is a feature of this blogger template i'm using.

Have a look at some css -

.post .date {
    height: 50px;
    width: 45px;
    font: normal 22px Arial, Helvetica, sans-serif;
    color: #666666;
    text-align: center;
    padding: 0px 2px 0 0;
    line-height: 100%;
    float: left;
}
.post .date span {
    height: 16px;
    display: block;
    font: normal 11px Arial, Helvetica, sans-serif;
    color: #ffffff;
    text-align: center;
    padding-top: 5px;
}

.post .date {
    background: url( no-repeat;
}

Basically the date is formatted to come over this image -

and basically the following javascript fixes everything :

<script type='text/javascript'>
var timestamp = "Thursday, December 04, 2008";
if (timestamp != '') {
    var timesplit = timestamp.split(",");
    var date_yyyy = timesplit[2];
    var timesplit = timesplit[1].split(" ");
    var date_dd = timesplit[2];
    var date_mmm = timesplit[1].substring(0, 3);
}
</script>
<div class='date'>
<span><script type='text/javascript'>document.write(date_mmm);</script></span>
<script type='text/javascript'>document.write(date_dd);</script>
</div>


