At Lexity (where I work and you should too!) we recently had a hack day. I’ve just put the results of my efforts up on Github in a repository called git_post_receive, available at http://github.com/pariser/git_post_receive.
I have always disliked the fact that Github and JIRA speak different languages. If you don’t know what I’m talking about, then you’re probably thankful that you are not the target demographic of this post. In case you want a bit more context: we use Github to manage our code and JIRA to manage our bugs and issues. That terrible “Men are from Mars…” idiom applies all too well to these two systems…
You would think that when a JIRA issue is referenced in a Github commit message, the bug would be updated with information about the associated commit. This project, git_post_receive, is a lightweight server I wrote which does just that, hence increasing my (and hopefully your) productivity!
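To give a flavor of the core step, here is a minimal sketch of pulling JIRA issue keys out of the commit messages in a push payload. It is not the code from git_post_receive: the function name, the regular expression, and the sample payload are assumptions for illustration, and only the "commits", "id" and "message" fields of the payload are used.

```python
import re

# Assumed key format: an uppercase project key, a dash, and a number (e.g. WEB-42).
ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def issues_by_commit(payload):
    """Map each commit id to the issue keys mentioned in its message."""
    result = {}
    for commit in payload.get("commits", []):
        keys = ISSUE_KEY.findall(commit.get("message", ""))
        if keys:
            # dict.fromkeys drops duplicates while keeping first-seen order.
            result[commit["id"]] = list(dict.fromkeys(keys))
    return result


# Hypothetical payload, trimmed to the fields used above.
sample_payload = {
    "commits": [
        {"id": "abc123", "message": "Fix login redirect, closes WEB-42 and WEB-7"},
        {"id": "def456", "message": "Tidy whitespace"},
    ]
}
mentions = issues_by_commit(sample_payload)
```

A server built along these lines would then post a comment on each mentioned issue; that part depends on your JIRA setup and is left out here.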
I’m going to give a quick introduction to a tool that I built to ease my own web scraping efforts in Python. Aspirationally, I’ve decided to call it PWSU, Python Web Scraping Utilities, thus allowing me to add more functions as time goes on…
It’s available at http://github.com/pariser/pwsu. It’s actually a pretty simple piece of code, with really only one feature right now: the HTMLCache.
When writing a web scraper, you’re rarely able to write the processing code correctly on the first try. The HTMLCache makes it easy to iterate on your processing code, because each page is only downloaded once no matter how many times you re-run the scraper. All of its methods are designated as @classmethod, so the HTMLCache need not be instantiated. To use it, you need only import it:
from pwsu import HTMLCache
html = HTMLCache.read_url("http://github.com/pariser/pwsu")
First, the HTMLCache will look in a local file cache to see if this URL has been downloaded before. If it has, it will read the HTML document from file. If not, it will make a live call to download the source of the given URL and save the HTML document to file before returning it to the user.
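For the curious, the read-through pattern described above can be sketched roughly like this. This is not PWSU's actual implementation: the function name, the SHA-1 file naming, and the injected fetch callable are all assumptions made to keep the example small and testable.

```python
import hashlib
import os


def read_url_cached(url, fetch, cache_dir):
    """Return the HTML for url, reading from cache_dir when possible.

    fetch is any callable that takes a URL and returns the page as a str.
    """
    os.makedirs(cache_dir, exist_ok=True)
    # Hash the URL to get a filesystem-safe cache file name (an assumed scheme).
    key = hashlib.sha1(url.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key + ".html")

    if os.path.exists(path):
        # Cache hit: read the saved copy, no network call.
        with open(path, encoding="utf-8") as f:
            return f.read()

    # Cache miss: download, save to file, then return.
    html = fetch(url)
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)
    return html
```

Passing the downloader in as an argument means you can swap in a fake one while testing your processing code, and deleting the cache directory forces a fresh download.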
HTMLCache also provides other conveniences beyond caching:
- It adds a user agent pretending to be Firefox 8.0 on Mac OS X.
- It correctly encodes incoming HTML documents in UTF-8.
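Here is one way those two conveniences might look using only the Python 3 standard library. Again, this is a sketch and not PWSU's code: the exact User-Agent string, the helper names, and the fallback behavior on undecodable bytes are my assumptions.

```python
import urllib.request

# Assumed User-Agent string for Firefox 8.0 on Mac OS X; PWSU's may differ.
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:8.0) "
    "Gecko/20100101 Firefox/8.0"
)


def build_request(url):
    """Build a request that identifies itself as a desktop browser."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def decode_html(raw, declared_charset=None):
    """Decode raw bytes to str so the page can be saved as UTF-8."""
    charset = declared_charset or "utf-8"
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong charset: fall back to UTF-8, replacing bad bytes.
        return raw.decode("utf-8", errors="replace")


def fetch(url):
    """Download url and return its body as a str."""
    with urllib.request.urlopen(build_request(url)) as resp:
        return decode_html(resp.read(), resp.headers.get_content_charset())
```

The returned str can then be written to disk with encoding="utf-8", which gives every cached page the same encoding regardless of what the server sent.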
Have fun scraping!
There are only a few things that frustrate me about traveling, and one of them is the guide book. Don’t get me wrong: there’s a ton of value to a guide book and I’d never leave home without one. But really, it’s not the book that’s of value but the content inside that book. In this day of the ubiquitous iPhone or Android-powered device, tell me why I should carry a 5-pound monster everywhere I go?
A screenshot of the Itinerary Builder prototype tool I've written, available at http://pariser.me/itinerary
This is the realization that led me to build a prototype tool, which I’ve blandly named Itinerary Builder (http://pariser.me/itinerary), with the hope of re-conceptualizing the format of a “guide book”, at least for the planning stages of a trip.
The stress alone of having to pick a catchy or clever title has stalled way too many potential projects; too many “Andrew’s super awesome site of awesome!” placeholder names have never been changed.
I’ve never been good at coming up with good names. I don’t even have an interesting online handle. You can find me on almost any service as pariser. Twitter? @pariser. Facebook? facebook.com/pariser. Damn Network Solutions is preventing me from complete domination by keeping pariser.com out of my hands. (Sorry, family.)
This blog was going to happen, and I resolved to give it a reason to exist by coming up with a good title. I’m pretty proud of the name “aware nerd rips.” If you didn’t get it yet, this is an anagram of my full name, and it’s just one of many good choices. Here’s a selection of my favorite anagrams of my name, and the hypothetical projects these anagrams suggest I should start working on ASAP:
- Wanderers Pair – honeymoon planning for backpackers
- Rapider Answer – when I build a faster-than-Google search engine
- Predawn Rears, I – the first volume of my poetry compilation
- Rarer I Spawned – a study of systematic bias in multiplayer FPS
- New Raider Raps – an Oakland hip-hop discovery blog
- Rearward Penis – (I’ll let you decide what this one is for)
Thanks, Internet Anagram Server. Here begins the Andrew Pariser web-ring.