Ruby Intro

SENG2021

Parsing XML With XmlSimple

For the purpose of this exercise, I’ve downloaded and extracted a zip file from the AEC’s site. The extracted file can be fetched with:

wget wreckbea.ch/aec-mediafeed-results-standard-light-15508.xml

Ruby has an XML parser called REXML in its standard library, but it’s known to be very slow, some 50 times slower than Nokogiri. I would love to demonstrate Nokogiri, but unfortunately it’s more complex to use than XmlSimple: XmlSimple parses the data into a native Ruby hash, whereas Nokogiri has its own set of classes.

Third-Party Libraries

Third-party libraries in Ruby are referred to as gems. gem is also the name of an executable that comes with Ruby. Tell it to install, followed by the name of the gem you want, and it will download and install that gem as well as all of its dependencies:

gem install xml-simple
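If you’re not sure whether a gem is already installed, you can ask RubyGems from inside Ruby. This is only a minimal sketch: gem_installed? is a helper name I made up for illustration, and it relies on Gem::Specification.find_by_name raising an error when the gem is missing. Note that the gem is called xml-simple, while the file you require is xmlsimple.

```ruby
# Hypothetical helper (not part of RubyGems): true when an installed gem
# with the given name can be found, false otherwise.
def gem_installed?(name)
  Gem::Specification.find_by_name(name)
  true
rescue Gem::LoadError
  # find_by_name raises a Gem::LoadError subclass when nothing matches
  false
end

if gem_installed?("xml-simple")
  puts "xml-simple is ready to require"
else
  puts "run: gem install xml-simple"
end
```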

Let’s Parse!

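require 'xmlsimple'
# The trailing ";0" just stops irb from echoing the whole file back at you
xml = File.read 'aec-mediafeed-results-standard-light-15508.xml';0
# Parse the XML string into nested Ruby hashes and arrays
data = XmlSimple.xml_in xml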

Parsing XML takes a little while, and XmlSimple isn’t the most efficient of parsers. If speed is a concern at all, you should definitely look into Nokogiri.
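If you’d like to see how long “a little while” is on your machine, a rough way is to wrap the call in a timer. The sketch below uses only core Ruby; timed is a hypothetical helper of my own, and the block it times here is a stand-in calculation rather than the real parse.

```ruby
# Hypothetical helper: runs the block and returns [result, elapsed_seconds],
# using the monotonic clock so system clock changes don't skew the number.
def timed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

# Stand-in workload so the example runs without the XML file
result, seconds = timed { (1..100_000).reduce(:+) }
puts format("took %.3f seconds", seconds)
```

With the file from this exercise in place, the same helper could be used as data, seconds = timed { XmlSimple.xml_in xml }.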

Once it’s done, we can look at what’s in there one step at a time.

data.keys
# ["Id", "Created", "SchemaVersion", "EmlVersion", "xmlns", "xmlns:eml", "xmlns:ds", "xmlns:xal", "xmlns:xnl", "xmlns:ts", "xmlns:xs", "xs:schemaLocation", "ManagingAuthority", "MessageLanguage", "MessageGenerator", "Cycle", "Results"]

data["Results"].keys
# NoMethodError

data["Results"].class
# Array

data["Results"][0].class
# Hash

data["Results"][0]["Election"][0]["House"][0]["Contests"][0]["Contest"][0].keys
# ["Projected", "ContestIdentifier", "Enrolment", "FirstPreferences", "TwoCandidatePreferred", "TwoPartyPreferred", "PollingPlaces"]

data["Results"][0]["Election"][0]["House"][0]["Contests"][0]["Contest"][0]["TwoPartyPreferred"][0]["Coalition"][0]["Votes"][0]
# 0

data["Results"][0]["Election"][0]["House"][0]["Contests"][0]["Contest"][0]["Enrolment"][0]
# 124215
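Those long chains of ["Name"][0] get tiring quickly. Ruby’s built-in dig (on both Hash and Array) walks the same path and returns nil instead of raising if a step is missing. The sketch below runs against a small hand-built hash that only mimics the shape shown above; the value and its type are placeholders, not real AEC data, and first_path is a hypothetical helper that fills in the [0] steps for you.

```ruby
# Hand-built stand-in for the parsed structure: every child element is
# wrapped in an Array, even when there is only one of them.
sample = {
  "Results" => [
    { "Election" => [
      { "House" => [
        { "Contests" => [
          { "Contest" => [
            { "Enrolment" => ["124215"] }
          ] }
        ] }
      ] }
    ] }
  ]
}

# Built-in: dig takes the same keys and indexes, but is nil-safe
puts sample.dig("Results", 0, "Election", 0, "House", 0,
                "Contests", 0, "Contest", 0, "Enrolment", 0)

# Hypothetical helper: give only the element names, it takes the first
# child at each level and returns nil if any name is missing.
def first_path(data, *names)
  names.reduce(data) do |node, name|
    return nil unless node.is_a?(Hash)
    children = node[name]
    return nil unless children.is_a?(Array)
    children.first
  end
end

puts first_path(sample, "Results", "Election", "House",
                "Contests", "Contest", "Enrolment")
```

Taking the first child everywhere is only appropriate when you really do want the first contest; to visit all of them you would iterate over the "Contest" array instead.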