Archive for the ‘EC2’ Category

PHP-JSON on Amazon EC2 Fedora Core4 AMI

Saturday, January 19th, 2008

I have been using a modified version of an old FC4 public AMI, which has too much software installed on it to be deserted now. For a project, I needed support for JSON. JavaScript Object Notation, or JSON, is a way of passing a string representation of JavaScript objects to user-agents. Straight-through serialization and transmittal of JavaScript objects is much more compact than passing XML and then converting it to JavaScript objects.

On the server side you create a string representation of a JavaScript object and return it to the caller. This is very similar to formatting data in XML and then sending the XML back to the requesting clients, but JSON takes away the overhead of XML (no parsing or DOM walking; instead you get a first-class JavaScript object). In PHP you create a JSON string by passing a PHP variable to an encoder function. As of PHP 5.2.0, JSON support is compiled in natively; it was contributed by Omar Kilani, who wrote the php-json extension.
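Here is a minimal sketch of the server side. The array contents and variable name are made up for illustration; json_encode() is the function the extension actually provides:

<?php
// any PHP value works; an associative array maps naturally to a JSON object
$user = array('name' => 'John', 'roles' => array('admin', 'editor'));

// tell the user-agent it is getting JSON, then emit the encoded string
header('Content-Type: application/json');
echo json_encode($user);   // {"name":"John","roles":["admin","editor"]}
?>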

You may skip the rest of the post if you already have PHP 5.2.0 or later. Keep on reading if you have PHP 5.0/5.1 on Fedora Core 4 and want JSON functionality.

The good news is that php-json has an RPM available in Fedora Extras, so you don't need to do the thingamajig of installing the RPM by hand. Instead, use yum to install it seamlessly from the Fedora Extras repository.

(assuming you have su privileges)

Step 1. Navigate to the /etc/yum.repos.d directory and check if you have a file called fedora-extras.repo. This file contains the information yum needs to look up the Extras repository.

Step 2. If you do not have the file then create the file with the following text:

[extras]
name=Fedora Extras $releasever - $basearch
baseurl=http://download.fedora.redhat.com/pub/fedora/linux/extras/$releasever/$basearch/
mirrorlist=http://fedora.redhat.com/download/mirrors/fedora-extras-$releasever
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-extras
gpgcheck=1

If you do have the file, make sure the "enabled" flag is set to 1; the Extras repository is often disabled by default.

Step 3. Run a yum search as "yum search php-json". If your fedora-extras.repo is set up correctly, you will see a matching result:

php-json.i386                            1.1.0-1.fc4            extras
Matched from:
php-json
php-json is an extremely fast PHP C extension for JSON (JavaScript Object
Notation) serialisation.
http://www.aurore.net/projects/php-json/

Step 4. If all looks good, run the install command: "yum install php-json". That's it. Have fun with JSON, and read the PHP Manual for usage.
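A quick way to confirm the extension actually loaded, assuming you also have the PHP CLI installed:

php -r 'echo json_encode(array("status" => "ok"));'
# prints {"status":"ok"} if php-json is working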

MySQL on EC2: Data backup strategy using replication on S3

Tuesday, May 1st, 2007

A few EC2 and S3 facts:
1. S3 (Amazon's storage-in-the-cloud infrastructure) cannot be natively mounted on EC2 (Amazon's cloud-computing infrastructure).
2. The maximum size of an "object" (the atomic unit of stored data on S3) is 5 GB.
3. Multiple EC2 instances (each a virtual machine with roughly 1.7 GHz of horsepower, a 160 GB ephemeral HDD, and 1500 MB of RAM) can be booted on demand.
I run a couple of EC2 instances in the cloud. The backup strategy (call it a layman's strategy or a lame strategy!) so far has been:
a) Freeze the database
b) Break the data files into 5 GB chunks
c) Move the chunks (or objects) onto S3
d) Unfreeze the database
e) Repeat
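Steps (b) and (c) are mechanical. A sketch of the chunk-and-reassemble part, assuming GNU split; the file and bucket paths are hypothetical:

# break the dump into 4 GiB pieces, safely under the 5 GB object limit
split -b 4096m /backup/mysql-data.tar /backup/mysql-data.tar.part-
# ...upload each mysql-data.tar.part-* object to S3...
# later, rebuild the original file from the downloaded pieces:
cat /backup/mysql-data.tar.part-* > /backup/mysql-data.tar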
The above approach takes the database offline for at least 4-6 minutes every cycle. So here's a new strategy I'm planning to test. The pseudo-algo is as follows (a rough shell sketch of the cycle follows the list):
1. Create an AMI which has a pre-configured MySQL slave
2. Boot a new instance from the AMI created in #1 whenever a backup is desired
3. Read the objects from S3 (if any) and coalesce them to rebuild the data file
4. Create an SSH tunnel to the master
5. Start the slave to catch up with replication
6. Stop the slave after some time
7. Break the fattened data file into chunks or objects (the S3 limitation of 5 GB)
8. Move the objects to S3
9. Shut down the instance
10. Go to 2
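A minimal shell sketch of steps 3-9, just to make the moving parts concrete. The s3get/s3put helpers, the bucket name, and the host names are placeholders for whatever S3 client and naming you use, and the pre-baked slave config is assumed to point at 127.0.0.1:3307 so it reaches the master through the tunnel:

#!/bin/sh
DATADIR=/var/lib/mysql
BUCKET=my-backup-bucket                  # hypothetical bucket name

# 3. pull previous chunks from S3 (if any) and rebuild the data file
s3get $BUCKET 'data.tar.part-*' /tmp/
cat /tmp/data.tar.part-* > /tmp/data.tar && tar -xf /tmp/data.tar -C $DATADIR

# 4. forward local port 3307 to the master's MySQL port
ssh -f -N -L 3307:127.0.0.1:3306 backup@master-host

# 5-6. let replication catch up, then stop the slave
mysql -e "START SLAVE;"
sleep 3600                               # crude; could poll SHOW SLAVE STATUS instead
mysql -e "STOP SLAVE;"

# 7. split the fattened data file into sub-5 GB objects
tar -cf /tmp/data.tar -C $DATADIR . && split -b 4096m /tmp/data.tar /tmp/data.tar.part-

# 8. push the objects to S3
for f in /tmp/data.tar.part-*; do s3put $BUCKET $f; done

# 9. shut the instance down
shutdown -h now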
The new algo requires quite a bit of automation, and there are some unanswered questions which I'm sure can be figured out after the first trial. The following areas need to be automated:
1. SSH tunneling between the slave and master EC2 instances. The trick is to figure out the host name of the newly booted instance and then tunnel from it (see the sketch after this list).
2. Client scripting for booting the slave and executing the scripts on it. I think the best way to address this is a cron job on the master server which initiates and completes the backup process.
3. Prevention of data corruption. Moving large objects to/from S3 could have its own issues; I need to figure out whether the REST API calls will guarantee data consistency.
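For items 1 and 2, the master's cron job could look up the fresh instance's public host name with the EC2 API tools and drive the whole cycle over SSH. A sketch; the instance ID is hypothetical, the awk field assumes the classic ec2-describe-instances output format, and backup-cycle.sh is the script sketched above:

# find the new slave's public DNS name (field 4 of the INSTANCE line)
SLAVE_HOST=$(ec2-describe-instances i-12345678 | awk '/^INSTANCE/ {print $4}')

# push the backup script to the slave and run it
scp backup-cycle.sh root@$SLAVE_HOST:/root/
ssh root@$SLAVE_HOST '/root/backup-cycle.sh'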

Cloud Computing Panel at TiE: Amazon, where are my candies?

Wednesday, April 18th, 2007

I was in the audience for a panel discussion on Cloud Computing hosted by TiE. The panel was moderated by Nimish Gupta of SAP and had people from Amazon Web Services, Google, Opus Capital, and SAP. The interesting thing to watch was how the panel agreed to disagree on the benefits and definition of Cloud Computing. Pavni Diwanji from Google mentioned that it's the tools on Google Apps and the API that matter to developers.
Dan Avida, a VC from Opus, seemed to have in-depth knowledge of EC2 and mentioned that there are interesting opportunities around EC2 waiting to be tapped. It may be worth looking into those areas.
According to Vishal Sikka, CTO of SAP:

Cloud computing is suitable for smaller applications but not for large applications like SAP.

Adam Selipsky, who represented Amazon, agreed with that statement and said the current shape of Amazon EC2 & S3 is a first cut and is still in limited private beta. He further mentioned that Amazon's prime focus is the stability of the platform, and that they haven't added any major features to EC2 and S3 in the last 12 months.
On a question about competition for EC2, he joked, "There are rumors that the company on my left (referring to Google, as Pavni Diwanji of Google Apps was seated there) is working on something." He then turned serious and said that educating developers to jump onto EC2 is the hardest part, and that he would love to have some competition so that they, too, could spend millions of dollars educating the customers.
On being asked whether Amazon is just utilizing the overcapacity available in their data centers, Adam responded, "Amazon has invested around $2B in Amazon Web Services, including EC2 and S3, and we are fully committed."
I took my turn from the audience, mentioned that not being able to natively mount S3 as a filesystem is a limitation of EC2, and asked about the oft-requested feature of supporting large databases on EC2. Adam quipped that he does not want to commit to a date, but they are working on it. Cool.
On a side note: Adam and his team (a couple of his colleagues were in the audience) were pitching people at the venue to sign up for the beta program, but did not bring any candies for existing customers like me. Too bad! After the meeting I even sold the idea of using EC2 to a gentleman who was still kicking the tires. Where's my referral fee? 🙂