Parsing and Charting

Flatiron School / 9 July 2013

The following is a guest post by Kirin Masood and originally appeared on her blog. Kirin is currently a student at The Flatiron School. You can follow her on Twitter here.

Last week one of our assignments was to create a parser that returned the 100 most used words in Herman Melville’s Moby Dick. I thought this was really fascinating, and once I finally got my parser working I proceeded to parse through random material like a madwoman. I also used a simple Rails application and a gem to generate two pie charts.

1: Parsing the “I Have a Dream” Speech

First, I decided to parse various small items, including Martin Luther King Jr.’s “I Have a Dream” speech. I found a text version, put it into a Tumblr blog post, and added the marker word “parse” to the beginning and “parsefinish” to the end. This made it much easier for me to parse the data.
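For readers following along, here is a minimal Ruby sketch of the marker idea: keep only the text between “parse” and “parsefinish”, then count the words. It is not the original code from this post. The method names and the sample string are hypothetical, and it assumes the page has already been fetched into a string.

```ruby
# Hypothetical sketch, not the original parser from this post.
START_MARKER = "parse"
END_MARKER   = "parsefinish"

# Return only the text between the start marker and the end marker.
# Returns an empty string if either marker is missing.
def between_markers(page)
  start  = page.index(START_MARKER)
  finish = page.index(END_MARKER, start + START_MARKER.length) if start
  return "" unless start && finish
  page[(start + START_MARKER.length)...finish]
end

# Count how often each word appears, ignoring case and punctuation.
def word_counts(text)
  counts = Hash.new(0)
  text.downcase.scan(/[a-z']+/) { |word| counts[word] += 1 }
  counts
end

# The `limit` most frequent words as [word, count] pairs,
# most frequent first, ties broken alphabetically.
def top_words(text, limit = 100)
  word_counts(text).sort_by { |word, count| [-count, word] }.first(limit)
end

# Made-up page text standing in for the blog post.
page = "blog header parse I have a dream. I have a dream today! parsefinish blog footer"
top_words(between_markers(page), 3).each { |word, count| puts "#{word}: #{count}" }
```

One caveat with this sketch: if the word “parse” shows up in the page before the real marker, the wrong start is picked, so more unusual marker words would be safer.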

Code: This is the code that I used to parse through the speech:

This is the parser running:

I then decided to put this information in a pie chart in order to make it more digestible. I found a Ruby gem called googlecharts and put it into a simple Rails application. I then put the following code in the view to make the chart:
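As a rough illustration of the step before the chart, this sketch turns word counts into the parallel label and data arrays a pie chart needs. The method name and the counts are made-up placeholders, not results from the speech, and the googlecharts call in the closing comment is an assumption about how the view might use them rather than the code from this post.

```ruby
# Hypothetical helper: pick the biggest slices and split them into
# parallel arrays of labels and values for a pie chart.
def pie_chart_inputs(pairs, slices = 5)
  top = pairs.sort_by { |word, count| [-count, word] }.first(slices)
  {
    labels: top.map { |word, count| "#{word} (#{count})" },
    data:   top.map { |_word, count| count }
  }
end

# Placeholder counts, invented for this example.
sample = [["alpha", 12], ["beta", 7], ["gamma", 3]]
inputs = pie_chart_inputs(sample)
puts inputs[:labels].inspect
puts inputs[:data].inspect

# Assumption: with the googlecharts gem loaded in a Rails view,
# the arrays could be handed to it along these lines:
#   image_tag Gchart.pie(:title => "Most used words", :size => "400x200",
#                        :data => inputs[:data], :labels => inputs[:labels])
```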

This is the generated pie chart:

2: Parsing “Macbeth”

Next, I thought it would be interesting to check out the kind of vocabulary Shakespeare used, so I decided to parse through Act I of “Macbeth.” I found the full text version online in a simple format, so I didn’t have to resort to Tumblr.

Code: Here is the code that I used to parse through “Macbeth”: 

This is the parser running:


I decided to put some of this information in a pie chart as well. This time I decided to chart the number of times each character’s name appears in the text.
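Counting name mentions could look something like the sketch below. It is an illustration under assumptions, not the code behind the chart: the method name is hypothetical, the sample sentence is invented rather than taken from the play, and in a real script of the play the speaker headings would be counted as mentions too.

```ruby
# Count case-insensitive, whole-word mentions of each name in the text.
def name_mentions(text, names)
  names.each_with_object({}) do |name, tally|
    tally[name] = text.scan(/\b#{Regexp.escape(name)}\b/i).length
  end
end

# Invented sample text, not a quotation from "Macbeth".
sample = "Macbeth greets Banquo. BANQUO answers Macbeth, and Macbeth departs."
mentions = name_mentions(sample, ["Macbeth", "Banquo", "Duncan"])
mentions.each { |name, count| puts "#{name}: #{count}" }
```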

This is the code that generates the chart:

This is the pie chart:

3: Parsers Are Cool

I parsed through many, many things. A lot of the time I just got personal pronouns and generic terms. I was hunting for some text that would give me an interesting assortment of words. I wanted to be really classy, so I decided to parse through a chapter of Fifty Shades of Grey. (I feel the need to say that I have NOT read this. I swear.)
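One common way to get past the pronouns and generic terms is to drop a list of “stop words” before counting. The sketch below shows that idea only; it is not something the parser in this post did, and the stop word list is a tiny hypothetical sample rather than a complete one.

```ruby
require "set"

# A very small, hypothetical stop word list; a real one would be much longer.
STOP_WORDS = Set.new(%w[i you he she it we they me my the a an and of to in was that])

# Count words while skipping stop words, most frequent first,
# ties broken alphabetically.
def interesting_counts(text, stop_words = STOP_WORDS)
  counts = Hash.new(0)
  text.downcase.scan(/[a-z']+/) do |word|
    counts[word] += 1 unless stop_words.include?(word)
  end
  counts.sort_by { |word, count| [-count, word] }
end

# Invented sample sentence.
interesting_counts("She said that the parser was cool, and the parser agreed.").each do |word, count|
  puts "#{word}: #{count}"
end
```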

Code: Same thing again with the code. 

These are some of the words that were frequently used in the chapter: 


I’m pretty sure successfully implementing a parser is one of the sweetest feelings known to mankind, so I encourage everyone to give it a go.
