about me

I'm Thomas Patrick Horton. I'm a huge nerd with a love for anything internet: design, development, programming, gaming, you name it.

skills that kill

Everyone knows that web design and development are constantly evolving artforms. In my work, I have had the opportunity to explore the following languages:

I'm always looking for new solutions to old problems. My experience with a multitude of scripting and markup languages allows me to dynamically address client needs (and it's fun!)

In addition to design and development, I've had the opportunity to work in a variety of Social Media Marketing, Search Engine Marketing, and Search Engine Optimization environments. Want examples?

the blog

Ruby on Rails Introduction- The Hard Way

May 8, 2012 - 22:10

I was presented with a problem late last week: a website that displayed database records on individual pages. The information was important to a client who didn't have backend access to the site. Having seen The Social Network a couple too many times, I knew it was possible to scrape it straight off the pages. This would require new knowledge, so I set out on my quest.

I landed on Ruby as the language for my solution. I had already played with the basics (pretty much just helloworld.rb). I found a slightly dated but still great resource on scraping websites that had me running my test scripts in minutes. At the guide's suggestion I installed the Nokogiri gem to handle page parsing.

There wasn't much of a hard part to the process at all. Once I figured out where the information I needed lived within the DOM, I built my XPath queries and ran the script on a single record. The hardest part, honestly, was remembering the syntax for output. Like I said, I have almost no Ruby experience.
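One thing worth spelling out about queries like these: a position written inside an XPath expression counts from 1, while indexing the list of matches in Ruby counts from 0. Here's a minimal sketch of the difference. It uses REXML (which ships with Ruby) instead of Nokogiri so it runs without installing a gem, and the markup is made up for illustration, not taken from the real site. Nokogiri's node list indexes from 0 in the same way.

```ruby
require 'rexml/document'

# Made-up markup standing in for one record page.
html = <<~HTML
  <table><tr><td><div>
    <span>Record 1</span>
    <span>Jane</span>
    <span>Doe</span>
  </div></td></tr></table>
HTML

doc = REXML::Document.new(html)

# All matching spans come back as an array; Ruby indexes it from 0,
# so [1] is the second span.
spans = REXML::XPath.match(doc, '//td/div/span')
second_by_ruby_index = spans[1].text

# Inside the XPath itself, positions start at 1, so span[2] is also
# the second span.
second_by_xpath = REXML::XPath.first(doc, '//td/div/span[2]').text

puts second_by_ruby_index
puts second_by_xpath
```

Both lines print the same span, which is an easy thing to trip over when mixing the two styles in one query.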

I ended up with something like this:

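# open() here comes from open-uri; on Ruby 3.0+ call URI.open instead
raw_data = Nokogiri::HTML(open('http://www.example_website.com/?record=1'))

# [1] is the second match; the list of matches is indexed from 0
first_name = raw_data.xpath('//td/div/span')[1].content
last_name = raw_data.xpath('//td/div/span[2]')[1].content
#etc...

puts "First Name: "+first_name
puts "Last Name: "+last_name
#etc...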

The next step was introducing a variable to the URL.

active = 1
active_page = 'http://www.example_website.com/?record='+active.to_s()

The hangup here came from not converting active to a string. You can't concatenate strings with ints in Ruby; String#+ raises a TypeError if you try.
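A minimal sketch of that hangup and two ways around it, using the placeholder URL from this post (the variable names `base`, `with_to_s`, and `with_interpolation` are just for illustration):

```ruby
base = 'http://www.example_website.com/?record='
active = 1

# String#+ only accepts another string, so adding an integer raises a TypeError.
begin
  base + active
rescue TypeError => e
  puts "TypeError: #{e.message}"
end

# Option 1: convert explicitly with to_s.
with_to_s = base + active.to_s

# Option 2: string interpolation calls to_s for you.
with_interpolation = "#{base}#{active}"

puts with_to_s
puts with_interpolation
```

Interpolation tends to be the more common style in Ruby, but either one gets the job done.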
Now that there's a variable involved, we can wrap the whole thing in a for loop.

total = 100 #whatever the endpoint is

for i in 1..total
  active = i

  active_record = "http://www.example_website.com/?record="+active.to_s()

  raw_data = Nokogiri::HTML(open(active_record))

  first_name = raw_data.xpath('//td/div/span')[1].content
  last_name = raw_data.xpath('//td/div/span[2]')[1].content
  #etc...

  # progress message uses the same endpoint as the loop
  puts i.to_s()+" of "+total.to_s()+"....."
  puts "First Name: "+first_name
  puts "Last Name: "+last_name
  #etc...

end

What's the point of running this scraper if we can't store the data? I found a pretty cool gem called FasterCSV that lets Ruby export to CSV. Open the .csv before the loop starts, then add a row to it on each iteration. Easy as pie.

require 'rubygems'
require 'nokogiri'
require 'open-uri'
require 'fastercsv'

total = 100 #or whatever

FasterCSV.open("temp.csv","w") do |csv|

  for i in 1..total
    active = i

    active_record = "http://www.example_website.com/?record="+active.to_s()

    raw_data = Nokogiri::HTML(open(active_record))

    first_name = raw_data.xpath('//td/div/span')[1].content
    last_name = raw_data.xpath('//td/div/span[2]')[1].content
    #etc...

    puts i.to_s()+" of "+total.to_s()+"....."
    puts "First Name: "+first_name
    puts "Last Name: "+last_name
    #etc...

    # one row per record; add the other fields to this array
    csv << [first_name, last_name]

  end

end
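A note for anyone trying this on a newer Ruby: FasterCSV became the standard library's CSV as of Ruby 1.9, so `require 'csv'` takes the place of the gem there. Below is a minimal sketch of the same open-then-append pattern. The `records` array, the header row, and the temp-directory path are made up for illustration and stand in for the scraped data.

```ruby
require 'csv'
require 'tmpdir'

# Made-up rows standing in for the scraped first and last names.
records = [
  ['Jane', 'Doe'],
  ['John', 'Smith']
]

# Write to the system temp directory so the sketch doesn't clutter anything.
path = File.join(Dir.tmpdir, 'temp.csv')

# Open the file once, then append one row per iteration.
CSV.open(path, 'w') do |csv|
  csv << ['first_name', 'last_name'] # header row (optional)
  records.each do |first_name, last_name|
    csv << [first_name, last_name]
  end
end

puts File.read(path)
```

Opening the file once outside the loop matters: reopening it in "w" mode on every iteration would overwrite the previous row each time.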

And that's how I got my data. All in all, a pretty quick foray into some new (to me) corners of Ruby. As usual, by the time I got around to doing this, the client had already changed their mind about the information.