thoughts, ideas, code and other things...

Thursday, December 04, 2008

How Python helped me grab all the results

The results are out again, and everyone is eager to ask, "What was your GPA?" And there I am, sitting in my room, cursing my own 8.07 and wondering how to grab everyone's GPA.
I've been learning Python on my own this winter break, so I just looked up on Google how to access the web from Python. I ended up at urllib2, an "extensible library for opening URLs".
Its urlopen(url[, data]) function looks promising for the hack.

The result page is simple: an input field and instructions on how to access results. Once a BG10XYY$$$-style registration code is filled in, it shows you the result, again as formatted HTML with a lot of things that we don't want to see :)

Observation #1: hitting the submit button takes you to$$$

So how do you access results? Simply by pushing a variable called RegNo into the GET request to Grades.jsp. And by now you'll have guessed what we're gonna do! I will just drive the RegNo=BG10XYY$$$ part from the first registration number to the last, calling urlopen once for each.
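Building one GET URL per registration number could look roughly like this. It is a minimal sketch in Python 3, where the pieces of urllib2 now live in urllib.request and urllib.parse. The host name is a placeholder rather than the real results server, and build_result_url is a helper name made up for the example.

```python
from urllib.parse import urlencode

# Placeholder address (assumption): the real results server is not given here.
BASE_URL = "http://results.example.invalid/Grades.jsp"

def build_result_url(reg_no, base_url=BASE_URL):
    """Return the GET URL that asks Grades.jsp for one registration number."""
    # urlencode escapes the value and produces "RegNo=<code>"
    return base_url + "?" + urlencode({"RegNo": reg_no})

if __name__ == "__main__":
    print(build_result_url("bg107cs001"))
```

Passing the resulting string to urlopen is then all it takes to fetch one result page.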

Observation #2: take a look at the html we get back from Grades.jsp
 <p align="center"><font size="5" color="#000000"><b>
Name</b> :- </font><font size="4" color="#000000"> Abinaya A</font></td>


<table border="1" width="50%" align="center" >
<td width="64%">
<p align="center"><font size="4" color="#000000"><b>Grade Point Average</b></font></td>
<td width="36%">
<p align="center" ><font color="#000000">6.785714</font></p></td>


Obviously the information of interest (the name and the GPA) is entangled in HTML markup that means nothing for our purpose. We can use split to take 'em out :)
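As a rough illustration of the split trick, here is a Python 3 sketch run against a trimmed copy of the HTML above. The helper name between is invented for this example; it simply keeps whatever sits between two known markers.

```python
# Trimmed copy of the HTML returned by Grades.jsp, as quoted above.
SAMPLE = (
    '<p align="center"><font size="5" color="#000000"><b>\n'
    'Name</b> :- </font><font size="4" color="#000000"> Abinaya A</font></td>\n'
    '<p align="center"><font size="4" color="#000000"><b>Grade Point Average</b></font></td>\n'
    '<td width="36%">\n'
    '<p align="center" ><font color="#000000">6.785714</font></p></td>\n'
)

def between(text, start, end):
    """Return the text between the first `start` marker and the next `end`."""
    # split(marker, 1) cuts only at the first occurrence of the marker
    _, rest = text.split(start, 1)
    value, _ = rest.split(end, 1)
    return value.strip()

name = between(SAMPLE, 'Name</b> :- </font><font size="4" color="#000000">', "</font>")
gpa = between(SAMPLE, '<p align="center" ><font color="#000000">', "</font>")
print(name, gpa)
```

If a marker is missing from the page, the two-way unpacking raises ValueError, so it pays to check that the page actually contains a result before splitting.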

So now, for the sake of modularity and elegance, I divide my code into functions -
1. getresult(code) takes a registration code, fetches its result page, splits the HTML to grab the result (name and GPA), and writes it to stdout.
2. driveresult() generates the registration codes for a branch and year and calls getresult in a loop.
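The code-generating half of the second function can be sketched on its own. This Python 3 version assumes a registration code is a prefix, a two-letter branch, and a three-digit, zero-padded roll number; registration_codes is a name invented for the sketch.

```python
def registration_codes(prefix, branch, last):
    """Yield codes from <prefix><branch>001 up to <prefix><branch><last>."""
    for number in range(1, last + 1):
        # zfill(3) pads with leading zeros, the same as rjust(3, "0")
        yield prefix + branch + str(number).zfill(3)

if __name__ == "__main__":
    for code in registration_codes("bg107", "cs", 3):
        print(code)
```

Using a for loop over range means there is no counter to forget to increment.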

Finally we have -
# looks pretty small right ;)

import urllib2

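def getresult(code):
    # fetch the result page for this registration code
    f = urllib2.urlopen(""+code)
    crap = f.read()
    f.close()
    # only pages that actually hold a result mention "Average"
    if (crap.find("Average")!=-1):
        # the name sits between these two markers
        first,second = crap.split("Name</b> :- </font><font size=\"4\" color=\"#000000\">",1)
        first,second = second.split("</font></td>",1)
        name = first.strip()
        # the gpa sits in the cell that follows "Grade Point Average"
        first,second = crap.split("<p align=\"center\" ><font color=\"#000000\">",1)
        first,second = second.split("</font",1)
        gpa = first
        print code,"\t\t",name,"\t\t\t",gpa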

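def driveresult (prefix,branch,max):
    counter = 1
    while (counter<=max):
        # pad the roll number to three digits: 1 -> "001"
        num = str(counter)
        num = num.rjust(3,"0")
        #print num
        getresult (prefix+branch+num)
        # move on to the next registration number
        counter = counter + 1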

driveresult ("bg107","cs",180)
print "---------------------------------------------------------------------"
driveresult ("bg107","ec",200)
print "---------------------------------------------------------------------"
driveresult ("bg107","ee",200)
print "---------------------------------------------------------------------"
driveresult ("bg107","it",200)
print "---------------------------------------------------------------------"
driveresult ("bg107","ei",200)
print "---------------------------------------------------------------------"
And here is the result -
Hope you enjoyed reading about this hack. Do let me know if you did something similar in your college/work/anywhere.



At December 19, 2008 at 1:55 PM , Blogger barun said...

Perfect application of one's knowledge! Btw, how do you add that date tag in the top left corner of your post? I was searching for that ... but haven't found it yet.

At December 19, 2008 at 2:07 PM , Blogger Abhishek Mishra said...

Hey thanks... the date thingy is a feature of the Blogger template I'm using.

Have a look at some css -

.post .date {
  height: 50px;
  width: 45px;
  font: normal 22px Arial, Helvetica, sans-serif;
  color: #666666;
  text-align: center;
  padding: 0px 2px 0 0;
  line-height: 100%;
  float: left;
}
.post .date span {
  height: 16px;
  display: block;
  font: normal 11px Arial, Helvetica, sans-serif;
  color: #ffffff;
  text-align: center;
  padding-top: 5px;
}

.post .date {
  background: url( no-repeat;
}

Basically the date is formatted to come over this image -

and basically the following javascript fixes everything :

<script type='text/javascript'>
  var timestamp = "Thursday, December 04, 2008";
  if (timestamp != '') {
    var timesplit = timestamp.split(",");
    var date_yyyy = timesplit[2];
    var timesplit = timesplit[1].split(" ");
    var date_dd = timesplit[2];
    var date_mmm = timesplit[1].substring(0, 3);
  }
</script>
<div class='date'>
  <span><script type='text/javascript'>document.write(date_mmm);</script></span>
  <script type='text/javascript'>document.write(date_dd);</script>
</div>

