PHP String Manipulation Performance Benchmarking

Posted by Chad

Believe it or not, but I’ve been off the Rails for a couple months now. I went back to PHP because of the speed issues with Rails (and Ruby in general). As I mentioned, video.conductr was so slow on Rails I had to rewrite the whole thing in PHP. It wasn’t that bad, it’s a dead simple application. But, that’s what got me working with PHP again – and I love the speed.

Now I’ve started working on a new project that is quite large. I learned a lot about OOP from using Ruby, which has made coming back to PHP tolerable now that it has better support for objects. (I jumped ship sometime before PHP5.)

Anyway, the reason for this article is that I always refactor my code for performance. My refactoring process often includes looking for double quotes that could be single quotes, rethinking my functions and loops, and reworking a whole bunch of string manipulations. But as I was refactoring, I started to wonder what kind of gains I was getting and whether they were even worth my time.

To further spark my curiosity, I ran across these tips today courtesy of DZone. So, I decided to write up a little benchmarking script to see how much difference my refactoring makes to overall performance.

Benchmarking PHP String Manipulations

After refreshing the page a few times it’s easy to see that the performance advantages are so small that they are not even consistent. The typical performance gain is a few tenths of a second per 100,000 executions.

Build Slide Show Movies from Flickr with Video.Conductr

Posted by Chad

In the past I mentioned an application that I was building using Rails and the Flickr API. After building it, it turned out to be way too resource intensive for me to host with Dreamhost. Since then, I rewrote the application in PHP and it is a bit faster. It’s still a bit of a CPU hog, and Dreamhost probably isn’t too happy about that, but I wanted to put it up anyway.

The application is video.conductr. It searches Flickr for images with your search terms in their tags, text descriptions, etc. and pulls the images together into a Flash video. The videos can be viewed on site, embedded on other pages, or even downloaded as MP4. No account registration is required. Registering does help, though: you can keep track of your movies, registered users’ movies have priority in the queue, and you get a friendly email notice when your movies have been produced. To keep resource use down, I am currently producing about 3-6 movies per hour maximum. If the site gets heavy usage, I will put it on a better machine and this limitation will be gone.

“Urban” was a recommended tag today when I visited the Flickr homepage. Here is the video, also embedded below.

TODO:
  • Fix movie introduction text
  • Fix view counter
  • Add commenting system

Dreamhost's "Fat Finger" Billing Cost Users $1.3 Million

Posted by Chad

Background

Earlier this week Dreamhost, the company hosting this blog, accidentally overbilled their users by $7.5 million. Over a couple of days, they looked into the issue, mostly corrected the problem, and now consider it solved.

Of course, this has resulted in thoroughly pissing off their users and the delivery of the explanation was so poor TechCrunch called them out on it. I wasn’t too upset, my credit card had expired and I didn’t have the auto-bill feature activated so I didn’t get hit by it. If I had, it would have been awful. Dreamhost bills my bank account via check card where I usually operate around a $1000 buffer/emergency balance. However, at this time I was shifting some funds around and arranging my various balances because I am gearing up to move to Dallas. Long story short, I only had $50 in my account at the time and a $120 hosting bill would have caused an overdraft.

My overdraft fee is $35, but there is more. I also had 4 small transactions pending, all of which would have triggered fees when they attempted to settle. So, I dodged a $175 bullet (5 fees at $35 each) fired by a trigger-happy “fat finger.” Luckily, I never trust any service enough to allow them to auto-bill me and my card was expired anyway, but for many users this event had a real cost.

Financial Analysis

This is a very rudimentary and somewhat ridiculous analysis of how much “fat fingers” can cost your users. I’ll explain my logic but you can get the Excel file and make up your own assumptions.

First, I identified three ways this event has real costs:
  1. Finance Charges
  2. Overdraft Charges
  3. Effect on Credit Rating

Finance Charges

This cost is associated with having an elevated credit card balance. Most credit cards calculate finance charges based on an average daily balance. This means an elevated balance for more than one day, even if refunded in the future, will cost you money.

Given:
  • $7.5 million worth of balance elevation
Assumptions:
  • 18% APR / 365 = 0.049% Daily rate; ignored compounding
  • 3 days elevated (days until refund)
  • 85% of balances were credit cards, otherwise no finance charge

Cost to Users: $9,432

The only given in this analysis is that $7.5 million was billed to users. I assumed the average APR is 18% and that it would take 3 days on average to refund the charges. Also, I figured about 85% of transactions were on credit cards, though it could be much more or less. Charges to non-credit card accounts are assumed to bear no cost.
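As a quick check of the arithmetic, here is the finance charge estimate as a few lines of Ruby. The method and parameter names are mine; the inputs are the given and the assumptions listed above, with simple (non-compounding) daily interest.

```ruby
# Simple (non-compounding) daily interest on the share of charges that hit credit cards.
def finance_charge(billed:, apr:, days:, credit_share:)
  daily_rate = apr / 365.0
  billed * credit_share * daily_rate * days
end

cost = finance_charge(billed: 7_500_000, apr: 0.18, days: 3, credit_share: 0.85)
puts "Cost to users: $#{cost.round}"  # Cost to users: $9432
```

Changing any one input scales the result linearly, so doubling the days until refund doubles the cost.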

Overdraft Charges

This cost is associated with those users who experienced overdraft, over-the-limit, or some type of fee due to the unexpected charge.

Assumptions:
  • $125 average bill
  • 60,000 bills
  • 10% of bills defaulted
  • $35 fee

Cost to Users: $210,000

Based on Dreamhost’s prices and users’ comments, I estimated $125 as an average billing price. With $7.5 million in total billing, that gives us 60,000 bills that were erroneously collected. The default rate is rather subjective; I wouldn’t be surprised if it were actually as low as 1% or as high as 20%. And $35 is on the high side of fees, but they are usually in that general range.
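Since the default rate is the shakiest input, a small sketch helps show how much the estimate moves with it. The method name is made up for illustration; the numbers are the assumptions above plus the 1% and 20% extremes mentioned in the text.

```ruby
# Overdraft cost: number of bills, times the share that triggered a fee, times the fee.
def overdraft_cost(total_billed:, average_bill:, default_rate:, fee:)
  bills = total_billed / average_bill  # 7,500,000 / 125 = 60,000 bills
  (bills * default_rate * fee).round
end

[0.01, 0.10, 0.20].each do |rate|
  cost = overdraft_cost(total_billed: 7_500_000, average_bill: 125, default_rate: rate, fee: 35)
  puts "#{(rate * 100).round}% default rate: $#{cost}"
end
# 1% default rate: $21000
# 10% default rate: $210000
# 20% default rate: $420000
```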

Effect on Credit Rating

This cost is associated with the long-term effects of a blemished credit rating. Specifically, the risk that your interest rates will be increased.

Assumptions:
  • 6,000 users affected
  • $9,000 average debt held
  • 2% rate increase
  • 1 year affected

Cost to Users: $1,080,000

If affected, this cost can be huge. However, it is difficult to estimate and the formula I used is extremely sensitive to change. I’ve read reports that put average U.S. consumer debt anywhere from $8,000 to $12,000, so I went with $9,000. I know the 2% rate increase and 6,000 users affected (all users who received overdraft charges) are probably over-estimates. But, I also only assumed they would be affected for one year. In reality, those affected may be paying up for several years, all the while interest is compounding. For that matter, I also ignored daily compounding throughout the year.
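The formula behind this number is just a product of the four assumptions, which is why it is so sensitive. A minimal sketch, with a method name of my own choosing and no compounding:

```ruby
# Simple interest at the increased rate on the average debt held, for each affected user.
def credit_rating_cost(users:, average_debt:, rate_increase:, years:)
  (users * average_debt * rate_increase * years).round
end

puts credit_rating_cost(users: 6_000, average_debt: 9_000, rate_increase: 0.02, years: 1)  # 1080000
# Halving the assumed rate increase to 1% halves the estimate.
puts credit_rating_cost(users: 6_000, average_debt: 9_000, rate_increase: 0.01, years: 1)  # 540000
```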

Result

Total Cost to Users: $1,299,432

with

Total Cost per User: $21.66
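For anyone who wants to plug in their own assumptions without the Excel file, the roll-up is easy to reproduce. This sketch hard-codes the three component estimates from the sections above and divides by the 60,000 bills to get the per-user figure; the constant names are mine.

```ruby
# The three component estimates from the sections above, in dollars.
COSTS = {
  finance_charges:   9_432,
  overdraft_charges: 210_000,
  credit_rating:     1_080_000
}.freeze
BILLS = 60_000  # erroneous bills, treated as one per user

total    = COSTS.values.sum
per_user = total.to_f / BILLS

puts "Total cost to users: $#{total}"                # Total cost to users: $1299432
puts format('Total cost per user: $%.2f', per_user)  # Total cost per user: $21.66
```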

With the continuous growth of large subscription-based sites, I wonder how often this happens? Dollar amount overbilled? Real cost to users? And lastly, how long until these issues turn into class action lawsuits?

Why Microsoft can't compete with Google

Posted by Chad

Honestly, I’m a big fan of both companies. Microsoft for Windows and Office. Google for search, Gmail and being a proponent of the Web 2.0 movement. I will admit that I admire Google as a corporation much more than Microsoft and as far as the future goes: I’m long GOOG and holding MSFT.

With that in mind, I came across an article today from BusinessWeek that reinforces my belief that Google will continue to innovate and grow while Microsoft is just a cash cow.

The meat of the article is an interview with Steve Ballmer (Microsoft’s CEO) and Lisa Brummel (Microsoft’s newly appointed HR executive). It’s fairly lengthy, but the last question asked to Ballmer shows just why Microsoft can’t compete with Google: they are out of touch.

The Question: You think Lisa’s [corporate] blog is a good idea?

The Answer: “I won’t blog ever … Writing is not a natural skill for me. It takes me a long time to write … I could say I speak very well. I could do a verbal blog every day, but it wouldn’t be the same.” (Emphasis mine).

So, I can relate to not being a natural writer. I’m not too fond of writing, as evidenced by the infrequency of my entries. But, my issue is with the term “verbal blog”. Maybe he meant podcast or video blog (vlog). But, verbal blog? Maybe he’s just out of touch.

Benefits of the Credit Crunch

Posted by Chad

The ongoing, getting worse as I type, credit crunch that is occurring in the U.S. may have a bright side.

Less of this:

... or any other rich media that has me select my state from a caterpillar’s abdomen.

Prosper API for Ruby

Posted by Chad

Update: This Prosper API project has been discontinued, and is not functional in its current state. Prosper makes huge updates to their API without warning, or even notifying the developer mailing list of the updates. The most recent change totally discontinued HTTP GET/POST access to the API (which this code used) and moved forward with a SOAP-only approach. If Prosper opens communication with the developers in the future, I may revisit this project. At this point, I cannot invest the time only to have them radically change the API without notice once again.

While working on ProsperK, I wrote an interface to the Prosper API. There was no such project on RubyForge so I decided to put mine up there.

It’s really simple to use, refer to the RDoc for some detailed examples.

Installation

Get the source and require the prosper.rb file.

I put prosper.rb in app/models and it will be loaded with the application.

svn co http://code.conductr.com/svn/prosper/

- or -

Just grab the http://code.conductr.com/svn/prosper/prosper.rb file and require it.

Usage

cd ./prosper
rdoc
Open the doc directory with your browser (or, the index file within). The documentation has usage examples for each method.

10^9: A One Billion Pixel Project - Beta

Posted by Chad

My latest project has just been released in Beta. It’s at 9figs.com and is a goal-oriented project. The goal is to receive 1 Billion pixels, uploaded from users, to display in our widget.

This is built with Ruby on Rails, except for 2 files. The scripts that serve up the widget content and handle the resulting click-throughs are PHP. As the project grows, these files will likely receive a very high number of requests, so RoR speed issues are a concern. Initially, I wrote them as part of the Rails app and then ported them over to PHP. Even though it has been over 2 years since I’ve touched anything ending with .php, the transition was easy. PHP still remains extremely easy to pick up. You just have to remember to end all your lines with a semicolon.

The widget serving file was extremely slow in Ruby. Each request generated several RMagick objects, and it would not have taken many requests to bring the server to a halt or to require a more advanced (read: expensive) hosting solution. I was impressed by how fast the same logic could execute when written in PHP. I do miss that benefit of PHP.

I was able to build the entire app in 28 hours of development time. Speaking as a freelancer who only codes a few hours a week, productivity is key. So far, I haven’t found a case where I am not more productive in Rails than I would be in PHP (and I’m not interested in learning anything else).

Go check out the project and get the widget up on your site.

Got my Click

Posted by Chad

I just thought this was a cool ad =p

Bears need Beer, too

Posted by Chad

Mr. Everhart was given a ticket for failing to secure his campsite

Good. Although, I wonder how long it took the authorities to come up with that charge.

Easy Keys using Strings that Succ!

Posted by Chad

Ruby has a cool method on the String object that allows you to make quick keys easily. The succ method recognizes the pattern you have created and returns the successor (the next in the pattern).

>> s='a99y';3.times do puts s.succ! end
a99z
b00a
b00b
=> 3

This comes in handy when trying to make short “keys” to reference. For example, TinyURL generates a small key to reference an entire URL which is saved in a database (presumably).

This is great, but what happens when we run out of 4 character successors?

>> s='z99y';3.times do puts s.succ! end
z99z
aa00a
aa00b
=> 3
>> s='99y';3.times do puts s.succ! end
99z
100a
100b
=> 3

Ruby has it all figured out. It prepends an alpha or numeric depending on which character was found first in the pattern and it never stops.

>> s='@a1';7019.times do |i| puts "#{s.succ!} ===> #{i}" end
.....
@zz8 ===> 7016
@zz9 ===> 7017
@aaa0 ===> 7018
=> 7019

Note: If you are using this with a database, it is important that you make sure the keys are unique.
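To illustrate the uniqueness point, here is a minimal sketch of a key generator that keeps calling succ until it lands on an unused key. KeyStore is a hypothetical in-memory stand-in for a database table; in a real application the check would more likely be a unique index on the key column.

```ruby
require 'set'

# Hypothetical in-memory stand-in for a database column with a unique index.
class KeyStore
  def initialize(seed)
    @last  = seed.dup
    @taken = Set.new
  end

  # Mark a key as already in use (e.g. one that was assigned by hand).
  def reserve(key)
    @taken << key.dup
  end

  # Advance with succ until an unused key is found, then claim it.
  def next_key
    candidate = @last.succ
    candidate = candidate.succ while @taken.include?(candidate)
    @taken << candidate
    @last = candidate
    candidate.dup
  end
end

store = KeyStore.new('a99y')
store.reserve('a99z')
puts store.next_key  # b00a ('a99z' is taken, so it is skipped)
puts store.next_key  # b00b
```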

The Hand-Fed API

Posted by Chad

While planning any application, there is one constant: I may open things up with an API in the future. Maybe I won’t go production without it, or maybe I put the API on the back burner for a future release. Either way, I try to keep that in mind while building the HTML controllers/views. Nothing new there, I’m sure everyone does this these days, but it helped me stumble upon an API method that I think would be pretty cool.

I, as the API provider, want to give the easiest use to all developers using my service. Naturally, that means encouraging the use of Ruby on Rails. Not entirely, it will be just as easy for non-RoR users to develop with the API but we can do so much more for those on Rails. For example, give them a model with the migration, tests, etc.

So here’s the plan, let’s start by looking at some code examples of how our API should be designed to help out a RoR developer.

Inside a rails app do script/generate scaffold_resource bid end.

Migration
create_table :bids do |t|
    t.column :amount, :decimal, :precision => 9, :scale=>4
    t.column :creation_date, :datetime
    t.column :key, :string
    t.column :last_modified_date, :datetime
    t.column :listing_key, :string
    t.column :member_key, :string
    t.column :minimum_rate, :decimal, :precision => 6, :scale=>5
    t.column :participation_amount, :decimal, :precision => 9, :scale=>4
    t.column :status, :integer
    # Your own stuff
end
Once the developer has this in place, then they just follow some simple logic.
# Make a Bid instance
require 'net/http'  # in case the app has not loaded it already
xml = Net::HTTP.get('api-host.cm', '/bids/1.xml')
hash = Hash.from_xml( xml )
@bid = Bid.new hash['bid']

# Developer saves it
@bid.save

# Developer uses it
puts "Winning" if @bid.status == 2
Easy, right? Let’s make it a one liner out of respect for all you Rubyist ninjas… or pirates? I won’t judge.
@bid = Bid.new( Hash.from_xml( Net::HTTP.get('api-host.cm', '/bids/1.xml') )['bid'] )

Fun stuff. Back to my perspective as an API provider. Wow. I did virtually nothing and people can tap into my API literally within 20 seconds of turning on their computer. That’s right, I timed myself to get that statistic and it assumes you have rails installed and ready to go.

The API’s XML file remains easy for any developer to use. It’s important to note that all of this works because the API service runs render :xml => @bid.to_xml, as the generated REST scaffolding controllers do, rendering the exact XML format that will be recognized by the developer’s app. If you are not running Rails on your API service, provide an endpoint with the XML formatted for RoR developers. I don’t see this being done anytime soon, but you never know with the growth of RoR thus far.
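To make the round trip concrete without needing Rails, here is a rough sketch of what the consuming side does with such a file. It is not how Hash.from_xml is implemented: flat_hash_from_xml is a made-up helper built on REXML, the sample payload is invented, and it assumes the feed uses dashed element names that map to underscored column names. Unlike Rails, it leaves every value as a string.

```ruby
require 'rexml/document'

# Hypothetical payload: field names follow the migration above, values are made up.
SAMPLE_BID_XML = <<~XML
  <bid>
    <amount>50.0</amount>
    <listing-key>abc123</listing-key>
    <status>2</status>
  </bid>
XML

# Rough stand-in for Hash.from_xml on a flat record:
# root element name => { underscored field name => text }.
def flat_hash_from_xml(xml)
  root = REXML::Document.new(xml).root
  fields = {}
  root.elements.each { |el| fields[el.name.tr('-', '_')] = el.text }
  { root.name => fields }
end

bid = flat_hash_from_xml(SAMPLE_BID_XML)['bid']
puts bid['listing_key']                  # abc123
puts 'Winning' if bid['status'] == '2'   # values are strings here, not integers
```

The point of the design is that the keys coming out of the XML line up one-to-one with the columns in the migration, so the hash can be handed straight to the model.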

Thought: For all I know this is what Action Web Services is all about. I’ve never read up on it so I don’t know anything about it. Not really investigating it either (shrug).

One Dollar Give Away! To the first person who correctly guesses which API was used in the above examples. Just leave your guess as a comment and I’ll send you a buck via Paypal.

AIM: killer app of Tech 1.0

Posted by Chad

Everyone reads about a great thing known as Web 2.0. It goes hand-in-hand with Bubble 2.0. Collectively, Tech 2.0. When thinking of Tech 1.0, reminiscing if you will, I remembered a point in time where I was unusually attached to AIM (AOL’s Instant Messenger). In particular, from about 1998/99ish, when I first started using the internet frequently, until about 2004, when I just got burned out on being interrupted while doing work on the computer. So, one day I just logged off of AIM. I wasn’t really trying to quit the service, but I just never went back to it, except on the rare occasion when I want to ask a friend something quickly: maybe someone I don’t talk to frequently, where a casual message doesn’t warrant the possible inconvenience of a phone call.

So, now it’s over 3 years since I’ve used the AIM application. Since then, I’ve got a Myspace and Facebook but I’m not really active on either. It’s nice being connected, but shit I’ve got better things to do than poke people and send sparkly image comments saying “Thanks for the Add”.

I’ve adopted text messaging as a preferred method of instant contact. Recently, however, I logged into my old AIM account and was reminded that there is one area where this Tech 1.0 favorite still prevails: separating regular “ChIcKz” from hot chicks.

Putting JavaScript Inline

Posted by Chad

Sometimes, usually in development, I find it useful to have my Javascripts printed inline rather than using the <script src="/path/to"> tags that are generated by javascript_include_tag.

With javascript_inline_tag you can simply declare your Javascripts just like you would with javascript_include_tag and it will spit out the script into your view/layout.

Put it where you like, but I chose to keep it with my application_helper.rb.

javascript_inline_tag
module ApplicationHelper
  # ..... your app helpers, per usual
end

module ActionView::Helpers::AssetTagHelper
  def javascript_inline_tag(*sources)
    if sources.include?(:defaults) 
      sources = sources[0..(sources.index(:defaults))] + 
      @@javascript_default_sources.dup + 
      sources[(sources.index(:defaults) + 1)..sources.length]
      sources.delete(:defaults) 
      sources << "application" if defined?(RAILS_ROOT) && File.exists?("#{RAILS_ROOT}/public/javascripts/application.js") 
    end

    sources.collect do |source|
      source = javascript_path(source).sub(/\?\d+/,'')
      contents = File.read("#{RAILS_ROOT}/public#{source}")
      javascript_tag(contents)
    end.join("\n")
  end
end
Sample Usage
<%= javascript_inline_tag :defaults,'niftycube' %>
Notes
  1. You cannot pass an options hash.
  2. For production, I’ll just say, consider caching (browser and server) and page size.
  3. Nifty Corners Cube is an awesome script that I recently discovered.
  4. You may find render :file => '' to work for you, I just thought it wasn’t flexible enough for this situation.
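Stripped of the Rails plumbing, the helper boils down to reading each file and wrapping its contents in a script tag. The sketch below shows just that core idea in plain Ruby. inline_script_tags and its js_dir argument are hypothetical names, and the markup it emits is a bare script tag, not necessarily what javascript_tag produces.

```ruby
require 'tmpdir'

# Read each named .js file from js_dir and wrap its contents in a script tag.
def inline_script_tags(js_dir, *names)
  names.map do |name|
    contents = File.read(File.join(js_dir, "#{name}.js"))
    "<script type=\"text/javascript\">\n#{contents}\n</script>"
  end.join("\n")
end

# Try it against a throwaway directory holding one small script.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'hello.js'), "alert('hi');")
  puts inline_script_tags(dir, 'hello')
end
```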

Acton MBA call to Action

Posted by Chad

Gee, I can’t wait to get to work on my obituary. I wonder what marketing genius gave the thumbs up on this?

Remove Files After a Destroy

Posted by Chad

This blog was started with the purpose of writing mostly about code. I have yet to do so until now. Here we go.

My current project, similar to AmieStreet, requires the uploading of files (e.g. MP3s) that are linked to a record in the database. While the database stores all of the information about the song (e.g. its title), the MP3 itself is stored on the filesystem. If, for whatever reason, we delete the record from the database, we should also remove the associated MP3 from the filesystem.

Rails makes this incredibly easy for us. As the convention goes, I will call the delete a destroy from here on out.

Version 1
def after_destroy
  if FileTest.exist?(source)
    FileUtils.rm self.source
  end
  if FileTest.exist?(clip)
    FileUtils.rm clip
  end
end

This was my original code. I have 2 mp3 files, the source (entire song) and the clip (sample from song), each file should be removed from the filesystem after the record is destroyed. Conveniently, the after_destroy callback lets me run my logic after the destroy.

Since I have 2 files, I test whether each one exists and, if so, remove it. Note: the source and clip methods just return full paths to the files, like RAILS_ROOT+"/path/to/file.mp3".

It could be so much better, let’s DRY things up a bit.

Version 2
def after_destroy
  removables [source, clip]
end
def removables(paths = [])
  paths.each { |p| FileUtils.rm(p) if FileTest.exist?(p) }
end

Here I wanted to make a method that I could feed multiple file paths and it would execute the logic. So I made the removables method, which steps through each path in the paths array and removes the file if it exists. This is pretty clean, but it gets better.

Version 3, Final
def after_destroy
  FileUtils.rm_f [source, clip]
end

Upon closer inspection, I noticed that the FileUtils.rm* methods actually accept a list of files to delete. And, with the force option the method does not throw errors if the file is not there. I was so used to always having to check whether a file existed, in order to avoid errors that come up if I tried to delete a nonexistent file, that I did not even think this could be done. Of course, with Ruby, the solution is simple.
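If you want to see the difference for yourself outside of a Rails model, this small sketch runs both calls against a temporary directory. The rm_f_demo method and the file names are made up for the demonstration; FileUtils.rm_f, FileUtils.rm and Dir.mktmpdir are standard library calls.

```ruby
require 'fileutils'
require 'tmpdir'

# Returns [source_still_exists, plain_rm_raised] for a throwaway directory.
def rm_f_demo
  Dir.mktmpdir do |dir|
    source = File.join(dir, 'source.mp3')
    clip   = File.join(dir, 'clip.mp3')  # deliberately never created
    FileUtils.touch(source)

    FileUtils.rm_f([source, clip])       # removes source, silently skips clip

    raised = begin
      FileUtils.rm(clip)                 # plain rm on a missing file
      false
    rescue Errno::ENOENT
      true
    end
    [File.exist?(source), raised]
  end
end

p rm_f_demo  # [false, true]
```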

Conclusion

This is a perfect example of how I am trying to think like a Rubyist. Getting from V1 to V3, Final took me about 5 minutes, so I feel that I am getting a good deal better … and oh so addicted.