better living through python

An adventure in programming and recovery.

Connecting the final pieces and returning the csv data

February 07, 2012

 

The final steps needed to create the csv data in my createcsv() function are as follows.  First, set the subject variable, which consists of the team name and the tshirt color (in other words, whether the team is home or away that game).  Second, set the description variable, which includes the current game number of the season and who the competitor is.  Then identify whether this event is an “all day” event, an option that many calendars include.  This value is entered as either true or false; here it is false.  

function createcsv (schedule){

...see previous entries

//Other needed strings

  var subject = schedule[i].teamname + " " + schedule[i].tshirtcolor;

  var description = "Game #" + schedule[i].game + " vs " + schedule[i].competitor;

  var alldayevent = "False";

  var location = "Corvallis Sports Park";

  var pprivate = "True";

  var newcsventry = subject + "," + startdate + "," + starttime + "," + enddate + "," + endtime + "," + alldayevent + "," + description + "," + location + "," + pprivate + "\r\n";

  finalcsv += newcsventry;

}

return finalcsv

};


Next a variable called location is made, which is simply the string “Corvallis Sports Park”.  Then pprivate is set to true so that the event will automatically show up on your calendar as a private event.  The last part of setting up the CSV file is to concatenate all of these parts, following the original example shown here on the Google Help Forum, and add the result to the final string.  Please note that the last bits added to this string are ‘\r\n’, to make sure that the ‘carriage return’ and ‘line feed’ (read: hitting the enter button) occur.  Once the for loop has finished processing, the function returns one long string called finalcsv.  
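
One thing the concatenation above doesn't handle is a field that itself contains a comma (say, a team name with a comma in it), which would shift every column after it.  Here is a minimal sketch of one way around that, assuming the calendar you import into accepts standard double-quoted CSV fields.  The helper names csvfield() and csvrow() and all of the sample values are made up for illustration; they aren't part of my extension.

```javascript
// Hypothetical helper: wraps a field in double quotes when it contains
// a comma, a double quote, or a line break, doubling any inner quotes.
function csvfield(value) {
  var text = String(value);
  if (/[",\r\n]/.test(text)) {
    text = '"' + text.replace(/"/g, '""') + '"';
  }
  return text;
}

// Hypothetical helper: joins the fields with commas and ends the row with "\r\n".
function csvrow(fields) {
  return fields.map(csvfield).join(",") + "\r\n";
}

// Made-up sample values, in the same column order as the header line.
var samplerow = csvrow([
  "Example FC - Home", "02/07/12", "07:30:00 PM", "02/07/12", "08:20:00 PM",
  "False", "Game #3 vs Rivals, United", "Corvallis Sports Park", "True"
]);

console.log(samplerow);
```

Only the description gets quoted here, because it is the only field with a comma in it.  Everything else comes out exactly as plain concatenation would produce it.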

Up Next Time: Issues and edits of code thus far

 

Adjusting and setting the end date and time for CSV format

February 06, 2012

 

The main difference when creating the endtime variable is that you have to adjust the starting time that is stored in schedule.   To do that you must increase the start time by the total length of the event.  My soccer games are technically 50 minutes in length, so I need to increase the time by 50 minutes.  I use the getTime() method to do this.  It returns the date and time as a number of milliseconds.  Thus, to get from the starting time to the appropriate endtime I need to add the equivalent of 50 minutes in milliseconds, which is (50*60*1000).  Then I turn newtime back into a date by passing it to new Date().    

function createcsv (schedule){

...see previous entry

  //setting endtime

  var newtime = schedule[i].datetime.getTime() + (50*60*1000);

  newtime = new Date(newtime);

  var endhour = newtime.getHours();

  var endhourampm = adjusthour(endhour);

  endhour = lessthanten(endhourampm[0]);

  var minutes = schedule[i].datetime.getMinutes();

 minutes =  lessthanten(minutes);  

  var endminutes = newtime.getMinutes();

 endminutes =  lessthanten(endminutes);

  var seconds = schedule[i].datetime.getSeconds();

 seconds =  lessthanten(seconds);

  var starttime = hour + ":" + minutes + ":" + seconds + hourampm[1];

  var endtime = endhour + ":" + endminutes + ":" + seconds + endhourampm[1];

  …additional code explained in next entry

}

return finalcsv

};


Once the newtime variable is set I can use it to create the variables endhour and endminutes.  I create these end time variables the same way I created the start time variables in my previous entry, but using newtime instead of schedule.  Then I adjust endhour with the adjusthour() and lessthanten() functions.  

After I set the endhour variable, you can see that I set the minutes and endminutes variables from different sources.  The minutes variable comes from the original datetime information stored in schedule.  The endminutes variable, like endhour, comes from newtime.  Both minute variables are adjusted by the lessthanten() function.  Finally the seconds variable is set from schedule.  Adding exactly 50 minutes doesn’t change the seconds, so I use the same value for both starttime and endtime.  This variable is also adjusted by the lessthanten() function.

At the very end of the date and time code in this function I set the variables starttime and endtime.  I do this by simply concatenating the various time related variables I’ve created.  Next time I will continue with the createcsv() function.  I will also discuss some problems I’ve noticed since reworking this code, namely with the output of the adjusthour() function.    
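
To see the millisecond arithmetic on its own, here is a small sketch.  The addminutes() helper and the kickoff times are made up for illustration and aren't part of the extension.

```javascript
// Hypothetical helper: returns a new Date that is `minutes` later than `start`.
function addminutes(start, minutes) {
  var milliseconds = minutes * 60 * 1000; // 50 minutes is 3,000,000 ms
  return new Date(start.getTime() + milliseconds);
}

// Made-up kickoff: February 7, 2012 at 7:30 PM local time (months are 0-based).
var kickoff = new Date(2012, 1, 7, 19, 30, 0);
var finalwhistle = addminutes(kickoff, 50);

console.log(finalwhistle.getHours() + ":" + finalwhistle.getMinutes()); // 20:20

// A made-up late game: 11:30 PM plus 50 minutes lands on the next day.
var lategame = addminutes(new Date(2012, 1, 7, 23, 30, 0), 50);

console.log(lategame.getDate()); // 8
```

The second example is worth keeping in mind: a game that ends after midnight has an end date that is one day later than its start date.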


Up Next Time: Connecting the final pieces and returning the csv data

 

Setting the start date and time of my calendar event in CSV format

February 02, 2012

 

Now that most of the minor functions and other details have been explained, I can show how my function extracts the necessary data from the schedule list of dictionaries.  The first section I will explain sets the start date and time.  To begin with I set three variables: ampm, endampm, and month.  The first two are simply empty strings to be adjusted later on in the code.  The last uses the getMonth() method.        

function createcsv (schedule){

 ...see former entry regarding this section

   //adjusting time from dictionary 'schedule'

   //setting date

   var ampm = "";

   var endampm = "";

   var month = schedule[i].datetime.getMonth() + 1;

   month = lessthanten(month);

   var date = schedule[i].datetime.getDate();

   date = lessthanten(date);

   var year = schedule[i].datetime.getFullYear().toString();

   year = year.slice(2);

   var startdate = month + "/" + date + "/" + year;

   var enddate = startdate;

   var hour = schedule[i].datetime.getHours();

   var hourampm = adjusthour(hour);

   hour = lessthanten(hourampm[0]);

   …continuing with endtime, and other code bits

 }

 return finalcsv

};


The getMonth() method, explained on the Mozilla Developer Network, extracts the month from the datetime value stored in the schedule list.  I access this piece of information by first identifying the location (schedule[i]), then the variable in that dictionary (datetime), and then I apply the method and add 1 to the result.  I add one because getMonth() returns the months as 0 to 11, with 0 equaling January.  After the month is extracted it’s adjusted with the lessthanten() function explained in my last entry (Some simple functions to adjust my time formatting).  

After the month is set, I go through mostly the same process for the date and year information.  In the end I use the getMonth(), getDate(), and getFullYear() methods.  The first two are processed in almost the same way, but the year must be altered slightly.  The year, once collected, is 4 digits long.  For CSV formatting I need only the last two digits, so I have to slice it.  To do that I first change the year from a number to a string using toString(), because slice() is a string method and numbers don’t have it.  Then I slice my variable with this piece of code:

year = year.slice(2);


Once the year has been adjusted I combine all of my date variables into the startdate variable, which creates a date format of: 01/01/12.  Now I have to set the starttime of my soccer game.  To do this I use the getHours() method, which operates like the other get() methods listed above.  Then I adjust the hour variable with the adjusthour() and lessthanten() functions.  In the next entry I’ll be discussing the adjustments necessary to create the end time needed for csv calendar formatting.
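
Pulled out of createcsv(), the date steps look roughly like this.  The names twodigits() and csvdate() are made up for this sketch (twodigits() plays the role of lessthanten()), and the dates are just examples.

```javascript
// Pads a number to two digits, for example 1 becomes "01".
function twodigits(n) {
  return n < 10 ? "0" + n : String(n);
}

// Hypothetical helper: formats a Date as MM/DD/YY.
function csvdate(d) {
  var month = twodigits(d.getMonth() + 1);         // getMonth() returns 0 to 11
  var day = twodigits(d.getDate());
  var year = d.getFullYear().toString().slice(2);  // "2012" becomes "12"
  return month + "/" + day + "/" + year;
}

console.log(csvdate(new Date(2012, 0, 1)));   // 01/01/12
console.log(csvdate(new Date(2011, 11, 30))); // 12/30/11
```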

Up Next Time: Adjusting and setting the end date and time for CSV format

 

Some simple functions to adjust my time formatting

February 01, 2012

 

The main part of the createcsv() function is where the date and time information is adjusted to the appropriate format.  One of the main adjustments needed for csv calendar formatting is making sure every piece of date and time information is two digits long.  That is the job of the lessthanten() function.  For example, if the date as posted on the website is 1/1/12 (January 1st, 2012), this function adds a ‘0’ where there is none, making the date read 01/01/12.  Once complete, the function returns x with its new value.     

function lessthanten (x) {

  //add a leading zero to single digit values

  if (x < 10){

    x = "0" + x;

  }

  //return x whether or not it was changed

  return x;

}
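
As an aside, newer javascript engines (ES2017 and later) have a built-in string method, padStart(), that does the same padding.  This is only a sketch of that alternative, not what my extension uses, and the name padtwo() is made up.

```javascript
// Alternative sketch using String.prototype.padStart (ES2017 and later).
// Unlike lessthanten(), this always returns a string.
function padtwo(x) {
  return String(x).padStart(2, "0");
}

console.log(padtwo(7));  // 07
console.log(padtwo(12)); // 12
```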


The next simple function works out whether the time of day is AM or PM.  If the hour entered is greater than 12, the hour changes from military time to regular time and the variable ampm is set to PM.  Noon (12) stays 12 and is also PM.  If the hour is less than 12, ampm is set to AM, and midnight (0) is shown as 12.  Once complete, the function returns the new values of hour and ampm.


function adjusthour (hour) {

  //declared with var so ampm stays local to this function

  var ampm = " AM";

  if (hour >= 12){

    ampm = " PM";

  }

  if (hour > 12){

    hour = hour - 12;

  } else if (hour == 0){

    hour = 12;

  }

  return [hour, ampm];

}



Now that the dates have functions to help them be adjusted quickly and effectively, it’s time to explain the rest of the function.

Up Next Time: Setting the start date and time of my calendar event in CSV format

 

The history and meaning of the key 'enter'

January 31, 2012

 

When we simply used typewriters to create documents, there were limitations that the typewriters had to make adjustments for.  The first was that your sheet of paper was only so wide, so your typewriter was made to fit that width.  Once you reached the other side you had to move your carriage back to the left side of the paper and start again.  This action is called carriage return.  The second part was to create a new line.  It’s not enough that the carriage returned to the correct side of the piece of paper.  You also had to make sure your typing line moved down to the empty space, so you could begin typing on the new line.  This action is called line feed.  

Within programming there are some small bits of code that accomplish those same actions.  In the past the Apple operating system marked the end of a line (carriage return and line feed in one) with the code ‘\r’.  Now both the Linux and Apple operating systems use ‘\n’.  Microsoft Windows uses ‘\r\n’.

var csvfileentry = "Subject,Start Date,Start Time,End Date,End Time,All Day Event,Description,Location,Private";

finalcsv += (csvfileentry + "\r\n");

        
As I discussed yesterday, the first entry I needed in my CSV file for calendar import was the information: Subject,Start Date,Start Time,End Date,End Time,All Day Event,Description,Location,Private.  This information tells the calendar what each section of data means when it’s being imported.  However, for csv formatting this isn’t enough.  You also need to add the carriage return and line feed, so that each calendar event is properly separated from the next.

finalcsv += (csvfileentry + "\r\n");


In order to make this CSV file applicable for all calendar formats, we simply tack on both bits to the end of each calendar entry, as well as the starting entry.  This page on the Google help forums is where I found some helpful information on how to adjust the file to the appropriate format for import.  
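
If you want to see the two characters for yourself, here is a small sketch.  The two-line csv text is made up, and splitting on all three line ending styles is just one way a program reading the file might cope with them.

```javascript
var CR = "\r"; // carriage return, character code 13
var LF = "\n"; // line feed, character code 10

console.log(CR.charCodeAt(0), LF.charCodeAt(0)); // 13 10

// Made-up two-line csv text using the "\r\n" ending.
var csvtext = "Subject,Start Date" + CR + LF + "Example FC - Home,02/07/12" + CR + LF;

// Split on any of the three conventions ("\r\n" is tried first so it counts
// as one line ending), then drop the empty piece after the last line ending.
var rows = csvtext.split(/\r\n|\n|\r/).filter(function (row) {
  return row.length > 0;
});

console.log(rows.length); // 2
```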

Up Next Time: Some simple functions to adjust my time formatting

 

Creating a CSV calendar file with javascript, adjusting the date to the correct format

January 30, 2012

 

One of the tasks I had to undertake when creating my extension was writing a function that would adjust the date, time, and other information collected by my collectschedule() function.  To start with I create a variable called schedule that holds all the data returned from collectschedule(), so I can easily access the data by simply calling on the variable.

var schedule = collectschedule();


Next I started work on my createcsv() function.  This function takes all the data from the variable schedule and alters it as appropriate for CSV format.  Creating a file in CSV format means creating a file whose values are separated by commas (thus CSV = comma separated values).  After reading some documentation I learned that all CSV files meant to be imported into calendars follow a specific format.  The first line in the file must look like this: Subject,Start Date,Start Time,End Date,End Time,All Day Event,Description,Location,Private.  Each of those sections represents information to be input into the calendar.  For many who interact with Microsoft Outlook these terms will be fairly familiar.  Continuing on with the createcsv() function:

function createcsv (schedule){

 var finalcsv = "";

 var csvfileentry = "Subject,Start Date,Start Time,End Date,End Time,All Day Event,Description,Location,Private";

 finalcsv += (csvfileentry + "\r\n");

 for (var i = 0; i < schedule.length; i += 1){

   //adjusting time from dictionary 'schedule'

   //setting date

   //….there is more code here, but I will get to that later

 }

}


In order for my createcsv() function to run, it needs to take in the schedule variable so it can adjust the data as needed.  The first order of business in this function is to create an empty string called finalcsv.  This variable will be the actual contents of the csv file.  It needs to be a string because a csv file is plain text, which is what makes it easy for other programs to read.  

The second order is to input the first line of information that goes into the csv file.  As discussed above, that is a string that consists of: Subject,Start Date,Start Time,End Date,End Time,All Day Event,Description,Location,Private.  Here we call this variable csvfileentry.  Then we add this string to the finalcsv string variable.  We don’t enter it simply as the string listed above.  We also needed to add a couple of formatting quirks.

finalcsv += (csvfileentry + "\r\n");


We have the variable finalcsv, then use += to add this entry to the variable.  On the end we tack on “\r\n”.  This ending piece has some interesting history, but we’ll discuss it in the next entry.
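
Another way to build the same first line, shown here only as a sketch, is to keep the column names in a list and join them with commas.  The columns variable is made up for this example; my function uses the single string shown above.

```javascript
// The nine column names from the header line, kept as a list.
var columns = ["Subject", "Start Date", "Start Time", "End Date", "End Time",
               "All Day Event", "Description", "Location", "Private"];

var finalcsv = "";
finalcsv += columns.join(",") + "\r\n";

console.log(columns.length); // 9
console.log(finalcsv);
```

Keeping the names in a list makes it easy to check that every row added later has the same number of fields as the header.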

Up Next Time: The history and meaning of the key ‘enter’

 

The adjustdate() function that runs inside my collectschedule() function

January 27, 2012

 

The reason for this function, once again, is that the information I originally collect for the datetime variable is not in a format that is usable for conversion in my next functions (to be discussed next week).  It is also missing the year in which the date occurs.  Thus adjustments are made.    

This function works very simply.  It takes in the datetime information collected in my previous function and alters it so that it is easier to use.  First it passes the datetime text to new Date(), which turns it into a date object that is much more manageable.  Then it creates a second date, cdate, holding the current date, and pulls the current year out of it into cyear.  

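function adjustdate(datetime){
 //turn the scraped text, which has no year in it, into a date
 var d = new Date(datetime);
 var cdate = new Date();
 var cyear = cdate.getFullYear();
 //start by assuming the game is in the current year
 //(setFullYear() replaces the deprecated setYear())
 d.setFullYear(cyear);
 //compare the weekday of that guess with the weekday in the scraped text
 if (d.toDateString().slice(0,3) != datetime.slice(0,3)){
   if (d.getMonth() >= 9){
     //October through December: move the game to last year
     d.setFullYear(cyear - 1);
   }else {
     //any other month: move the game to next year
     d.setFullYear(cyear + 1);
   }
 }
 return d;
}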

I start by setting the year (in the date information) to the current year, based on the computer’s clock.  That way I have a year, and I can process the following if block.  The opening question in my if block asks whether the day of the week provided by the website is the same as the day of the week that date falls on in the year just set.  If it is not, then I adjust the year as appropriate.    

My soccer game schedules can sometimes span two years, as they are eight weeks in length and can start in November but continue through the end of January.  To pick the right year I ask whether the game’s month is October through December, and if it is I make the year one less than the current year.  Otherwise (in practice, the games that fall in January through March) I increase the year by one.    

Once all of these items are created I finish my variable d (the datetime information) and return it from the function.  Woo for a week full of programming blogs!  Expect more for next week!
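
Here is the weekday check boiled down to a sketch you can run with fixed inputs.  The inferyear() helper and its parameters are made up for illustration: it takes the weekday, month and day separately instead of parsing the website’s text, and the “current” year is passed in so the result doesn’t depend on today’s date.

```javascript
var DAYNAMES = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"];

// Hypothetical helper: weekday is a three letter name such as "Fri",
// month is 0-based, and currentyear stands in for the computer's clock.
function inferyear(weekday, month, day, currentyear) {
  var guess = new Date(currentyear, month, day);
  if (DAYNAMES[guess.getDay()] === weekday) {
    return currentyear; // the weekday matches, so the year is right
  }
  // Mismatch: October through December games go to last year, others to next year.
  return month >= 9 ? currentyear - 1 : currentyear + 1;
}

console.log(inferyear("Fri", 0, 27, 2012));  // 2012 (January 27, 2012 was a Friday)
console.log(inferyear("Fri", 11, 30, 2012)); // 2011 (a December game seen from January)
console.log(inferyear("Fri", 0, 27, 2011));  // 2012 (a January game seen from December)
```

The check works for neighboring years because the same calendar date shifts by one or two weekdays from one year to the next, so it can never land on the same weekday two years in a row.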

 

Misplaced anger

January 26, 2012

 

Do you ever get angry because you don’t feel someone was listening to you when you’re explaining something, only to realize that you weren’t asking the question correctly in the first place?  Then your anger, which is still there and active, doesn’t know what to do with itself.  It has no reason anymore.  

What do you do when you realize that your anger has no reason?  That you’re upset about something that you agree has no purpose?  You’re still angry, but you have to find a way to let the pointless anger go.

Some days it’s these things that are the hardest to push past.

 

My for loop in the collectschedule() function

January 25, 2012

 

The for loop is the really important part of my collectschedule() function, because it makes sure to scrape all the necessary data set in the HTML.  Here is only my for loop.

for (var rownumber = 1; rownumber < totalrows; rownumber += 2){
   var schedule = {};
   var tablerows = scheduletable['children'][0];
   var tablerow = tablerows.children[rownumber];
   var datetime = tablerow.children[0].innerText;
   var comp = tablerow.children[2].innerText;
   var tshirt = tablerow.children[2].innerHTML;
   var tshirtcolor = "- Home";
   if (tshirt.indexOf("<") != 0){
     var tshirtcolor = "- Away"
   };
   if (tshirtcolor == "- Away"){
     comp = comp.slice(3)
   };
   datetime = adjustdate(datetime);
   schedule['game'] = game;
   schedule['datetime'] = datetime;
   schedule['competitor'] = comp;
   schedule['tshirtcolor'] = tshirtcolor;
   schedule['teamname'] = teamname;
   finalschedule.push(schedule);
   game = game + 1;
 };
 console.log( finalschedule );

 return finalschedule;
};

A for loop in javascript is set up like this: for(initialization; test; iteration){where my code goes};  Initialization is the starting point of the for loop.  Here I designate my starting point as row number 1, which is also the first line of the schedule table.  The second part is a test: as long as this statement is true, go through the loop again.  Here I am stating that if rownumber is less than totalrows, I am to continue through the loop.  This makes sure that once I’ve gone through all the rows I stop.  The last part of a javascript for loop is the iteration, an action taken at the end of each pass.  Here I specify that after each pass the variable rownumber is to increase by 2.  This is because the schedule table information is only included in the odd numbered rows.   

Each time the for loop runs, the first thing to happen is the creation of schedule, an empty dictionary variable.  This dictionary will hold all the information needed from each row.  In order to reach the rest of the data, I have to ‘traverse the DOM tree’.  Some websites are made up of tables inside tables.  To reach information you sometimes have to go inside one table, then inside a table that is listed in that table, and keep going until you reach the area you’re looking for.  Luckily I only had three hops to do, shown below:
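
Here is the same loop shape on its own, as a sketch.  The rows list is a made-up stand-in for the table, with the games sitting at the odd positions.

```javascript
// Made-up stand-in for the table: games sit at positions 1, 3 and 5.
var rows = ["header", "game 1", "spacer", "game 2", "spacer", "game 3"];
var totalrows = rows.length;
var games = [];

// initialization; test; iteration
for (var rownumber = 1; rownumber < totalrows; rownumber += 2) {
  games.push(rows[rownumber]);
}

console.log(games.join(", ")); // game 1, game 2, game 3
```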

   var tablerows = scheduletable['children'][0];
   var tablerow = tablerows.children[rownumber];

I went into the original table variable, called scheduletable, and created a new variable tablerows.  Then I go further into tablerows and create a new variable called tablerow.  As mentioned above, each tablerow includes the necessary information for each game.  At this point I can easily scrape the rest of the needed data.  In order I collect the datetime of the event, the comp or the team you’re competing against, and whether or not you’re home or away in the tshirtcolor section.  To collect tshirtcolor I have to ‘traverse the DOM tree’ again, but that’s the last time.  
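
To show the hops without needing the real page, here is a sketch that uses plain objects in place of DOM elements.  Only the children and innerText properties are modeled, and the cell contents are made up; I’m not reproducing the actual text from the CSP site.

```javascript
// Plain objects standing in for DOM elements (made-up contents).
var scheduletable = {
  children: [{                          // the table body
    children: [
      { children: [] },                 // row 0: not a game row
      { children: [                     // row 1: a game row with three cells
        { innerText: "Tue Feb 7 7:30 PM" },
        { innerText: "Field 1" },
        { innerText: "Rivals United" }
      ] }
    ]
  }]
};

var tablerows = scheduletable["children"][0];   // hop 1: table to table body
var tablerow = tablerows.children[1];           // hop 2: table body to row
var datetime = tablerow.children[0].innerText;  // hop 3: row to the cell's text

console.log(datetime); // Tue Feb 7 7:30 PM
```

Note that scheduletable["children"] and scheduletable.children mean the same thing in javascript, which is why both styles show up in my code.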

Once all the data has been stored in variables I then insert it into the dictionary, after a few minor changes.  First I run a function against the datetime information I collect, as I need to alter it to make it more easily adjustable in my other functions I’ll be talking about over the next week or so.  I’ll explain how the adjustdate() function works tomorrow.      

   datetime = adjustdate(datetime);
   schedule['game'] = game;
   schedule['datetime'] = datetime;
   schedule['competitor'] = comp;
   schedule['tshirtcolor'] = tshirtcolor;
   schedule['teamname'] = teamname;
   finalschedule.push(schedule);
   game = game + 1;

Then I assign each variable of information as a piece of my dictionary.  To do that in javascript you simply follow this formula: dictionaryname[‘keyname’] = value to be added to dictionary.  Thus when I write schedule[‘teamname’] = teamname; I am adding an entry to my dictionary called ‘teamname’ that holds the same information my teamname variable had above.  Once the dictionary schedule has been filled with the row’s information I add it to my finalschedule list by doing this: finalschedule.push(schedule);.  The last odd thing I do is increase my variable game by one.  That way, as I go through the scheduletable and collect each game, I’m accurately identifying which game of the season it is.  

The last two things I do come after the for loop.  First I use the equivalent of a python ‘print’ statement to aid with debugging.  That is the: console.log(finalschedule);.  Once the loop has finished running it prints the contents of my finalschedule list so that I can figure out what’s going on.  Lastly I return the list I’ve created.  That way I can use the information I’ve collected.
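
As a small sketch of the dictionary-in-a-list pattern, with made-up opponents standing in for the scraped rows:

```javascript
var finalschedule = [];
var game = 1;
var opponents = ["Rivals United", "Another Team"]; // made-up sample data

for (var i = 0; i < opponents.length; i += 1) {
  var schedule = {};                   // a fresh dictionary for each game
  schedule["game"] = game;             // bracket notation, as in my code
  schedule.competitor = opponents[i];  // dot notation does the same thing
  finalschedule.push(schedule);        // add the dictionary to the list
  game = game + 1;
}

console.log(finalschedule.length);                                    // 2
console.log(finalschedule[1]["game"], finalschedule[1].competitor);  // 2 Another Team
```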

 console.log( finalschedule );
 return finalschedule;

Up Next Time: The adjustdate() function that runs inside my collectschedule() function

 

Scraping the CSP Mysam website for my soccer schedule

January 24, 2012

 

Yesterday I published my first Chrome Extension.  This extension collects the game schedule data from the Mysam account of our local Indoor Sports Park and creates a button called ‘Download Calendar’.  Once you click on the button it brings you to a popup that provides you with the option to select your calendar type and it then provides you with either a csv or ics file that you can import into the Calendar type of your choosing.

One of the first things I had to do, when starting this project, was scrape the information from the page.  Scraping, in this context, simply means identifying what information I want, where it’s stored, and then collecting it together in some fashion.  Identifying where the data is stored can sometimes be the most difficult part of the scraping process.  How long this takes is highly dependent on how well the site is put together.  

Luckily the CSP website was fairly well put together, so it wasn’t that difficult to do.  First off I created a function to house all this information.  I called it collectschedule(), as is appropriate.  

function collectschedule (){
 var name = $j(".header")[0].innerText;
 var teamname = name.slice(5)
 var scheduletable = $j(".table-data")[0];
 var totalrows = scheduletable.children[0].childElementCount;
 var finalschedule = [];
 var game = 1;
 for (var rownumber = 1; rownumber < totalrows; rownumber += 2){
   var schedule = {};
   var tablerows = scheduletable['children'][0];
   var tablerow = tablerows.children[rownumber];
   var datetime = tablerow.children[0].innerText;
   var comp = tablerow.children[2].innerText;
   var tshirt = tablerow.children[2].innerHTML;
   var tshirtcolor = "- Home";
   if (tshirt.indexOf("<") != 0){
     var tshirtcolor = "- Away"
   };
   if (tshirtcolor == "- Away"){
     comp = comp.slice(3)
   };
   datetime = adjustdate(datetime);
   schedule['game'] = game;
   schedule['datetime'] = datetime;
   schedule['competitor'] = comp;
   schedule['tshirtcolor'] = tshirtcolor;
   schedule['teamname'] = teamname;
   finalschedule.push(schedule);
   game = game + 1;
 };
 console.log( finalschedule );

 return finalschedule;
};

This function starts out by setting a bunch of variables.  The first one, name, is the first element in the list of elements with class header, and for programming that means position zero.  The position is designated by the zero in brackets.  The class is designated by the ‘.’, and as with all javascript variables it is first declared with var.  The variable name holds the first line, ‘Team *team name*’.  The second variable is that same line, but sliced so it only includes *team name*.  Then I create a variable that identifies the table where the schedule is included (scheduletable), the total rows in the table (totalrows), an empty list called finalschedule and a counter variable called game.  The last variable, game, is used to identify which game of the season is being played.

The most important data to collect is what is up next, the actual schedule.  The schedule is in a table, with each section of the table filled with the appropriate information.  In order to collect this information correctly, I made a for loop that goes through each line in the table and collects the necessary data.  Each line’s data is then put into its own dictionary.  Then, after each line is collected, that small dictionary is added to the list finalschedule.  
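
Since slice() turns up in several places in this project, here is a quick sketch of what it does.  The header text is made up; only the ‘Team ’ prefix comes from the real page.

```javascript
var name = "Team Example FC";       // made-up header text
var teamname = name.slice(5);       // drop the first 5 characters ("Team ")

console.log(teamname);              // Example FC

// With two arguments, slice(start, end) keeps a piece from the front instead.
var weekday = new Date(2012, 0, 27).toDateString().slice(0, 3);

console.log(weekday);               // Fri
```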

Up next time: A continuation of my for loop in the collectschedule() function  


 
