Deploy 0xdata H2O 10 Node Cluster on Hadoop on EC2

I started this effort by logging into my AWS console account at amazon.com. If you have a credit card you can get an account too. The login screen:
https://aws.amazon.com/console/

Next, I searched Google:
http://www.google.com/search?q=How+to+Install+Cloudera+on+EC2

I found this:
http://blog.cloudera.com/blog/2013/03/how-to-create-a-cdh-cluster-on-amazon-ec2-via-cloudera-manager/

I followed its instruction:

"Get an AWS Access Key ID and AWS Secret Key"

The URL I used to get them:
https://console.aws.amazon.com/iam/home?#security_credential

I clicked on Access Keys and followed the wizard. I added an entry to my calendar to delete the key after I was done with this demo. To avoid unexpected AWS bills, it is wise to delete unused keys.

Next, I followed the instruction:

"Pick the Ubuntu Server 12.04 LTS 64-bit image"

I searched a bit via Google:
http://www.google.com/search?q=Ubuntu+Server+12.04+LTS+AMI

I eventually landed here:
https://aws.amazon.com/marketplace/pp/B007Z5YWX4

I followed the wizard in that page to deploy an m1.large server.

During the deployment effort I asked Amazon to generate a key file for me. I like key files with short names, so I called it ox.pem.

Deployment took about 7 minutes.

I added an /etc/hosts entry to my laptop for the new instance:

54.241.27.211 ox

I added ox.pem to ~/.ssh/ on my laptop.

I used both the ox.pem key and the hosts entry to login to the new server:

ssh -i .ssh/ox.pem ubuntu@ox

Next I followed these instructions in the Cloudera webpage:

cd ~ubuntu
wget http://archive.cloudera.com/cm4/installer/latest/cloudera-manager-installer.bin
chmod +x cloudera-manager-installer.bin
sudo ./cloudera-manager-installer.bin

I saw a series of installer screens.

Eventually it prompted me to browse this URL:
http://localhost:7180

I browsed this instead:
http://ox:7180

At first I tried browsing the URL with Firefox 17.0.8.

Firefox malfunctioned (or the server did) so I switched to Opera 12.16 and it worked well:
http://ox:7180

Next, I just followed the tutorial on the Cloudera webpage and after 40 minutes I had a 10 node hadoop cluster ready for action.

Two decision points I faced were 'How many nodes do I want?' and 'Do I want to encrypt traffic to the CM server?'

I decided I wanted 12 nodes. I really wanted 10 but if one or two nodes failed to deploy cleanly, I would still have a cluster with at least 10 nodes.

Also I decided I wanted to encrypt traffic.

But then I changed my mind and decided not to encrypt traffic, because the information I found about how to set it up was confusing.

That would be a good topic for another tech-tip:
http://www.google.com/search?q=How+to+Setup+TLS+on+Cloudera

I did, however, change the admin password of Cloudera Manager (CM) after I was done with the setup-wizard.

At this point hadoop was set up.

I turned my attention towards installing H2O on one of the cluster nodes.

A sweet feature of H2O is that I only need to install it on one node.

I then rely on H2O to propagate itself to the other nodes, which I specify in a file named flatfile.txt (described later).

Next, I made a note of all the IP-addresses of the nodes by clicking on the hosts link:
http://ox:7180/cmf/hardware/hosts

I used my mouse to collect 10 of the IP-addresses:

10.172.25.41
10.170.243.33
10.170.119.11
10.170.134.189
10.170.166.250
10.170.191.200
10.170.203.112
10.171.109.116
10.171.138.148
10.172.14.127

I added a line to /etc/hosts on ox (the m1.large CM server):

10.172.25.41    node1

I logged into node1 from ox:

ssh -i .ssh/ox.pem ubuntu@node1

I set up node1 for git clone and make:

sudo bash
apt-get install build-essential
apt-get install r-base
apt-get install git-core

I logged into a hadoop related account named 'hdfs' which has permission to write to a hadoop component named HDFS:

su - hdfs

Then I cloned a copy of the most recent commit on the master branch of H2O:

git clone https://github.com/0xdata/h2o.git

Next, I ran two shell commands:

cd h2o
make

After a while it was done.

I did this:

cd target

I needed to tell H2O where it should propagate itself to.

I created a file named flatfile.txt with a list of 10 hadoop nodes:

echo 10.172.25.41    > flatfile.txt
echo 10.170.243.33   >> flatfile.txt
echo 10.170.119.11   >> flatfile.txt
echo 10.170.134.189  >> flatfile.txt
echo 10.170.166.250  >> flatfile.txt
echo 10.170.191.200  >> flatfile.txt
echo 10.170.203.112  >> flatfile.txt
echo 10.171.109.116  >> flatfile.txt
echo 10.171.138.148  >> flatfile.txt
echo 10.172.14.127   >> flatfile.txt
wc -l flatfile.txt
cat   flatfile.txt
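
The same file can be built from a list instead of ten echo commands. Below is a minimal Python sketch, assuming the ten addresses collected above. The helper names flatfile_text and write_flatfile are made up for this illustration and are not part of H2O.

```python
from pathlib import Path

# The ten node addresses collected from the CM hosts page.
NODE_IPS = [
    "10.172.25.41",
    "10.170.243.33",
    "10.170.119.11",
    "10.170.134.189",
    "10.170.166.250",
    "10.170.191.200",
    "10.170.203.112",
    "10.171.109.116",
    "10.171.138.148",
    "10.172.14.127",
]

def flatfile_text(ips):
    """Return the file body: one address per line, newline-terminated."""
    return "".join(ip + "\n" for ip in ips)

def write_flatfile(ips, path="flatfile.txt"):
    """Write the addresses to `path` and return how many lines were written."""
    Path(path).write_text(flatfile_text(ips))
    return len(ips)

# Show what would be written (same content as `cat flatfile.txt`).
print(flatfile_text(NODE_IPS), end="")
```

Calling write_flatfile(NODE_IPS) from the target directory would create the file itself.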

Next, I referred to the H2O documentation to craft a command line to start the H2O cluster:
http://docs.0xdata.com/

I did this:


hdfs@ip-10-172-25-41:~/h2o/target$ 
hdfs@ip-10-172-25-41:~/h2o/target$ hadoop jar hadoop/h2odriver_cdh4.jar water.hadoop.h2odriver -files flatfile.txt -libjars h2o.jar -mapperXmx 1g -nodes 10 -output out100
Determining driver host interface for mapper->driver callback...
    [Possible callback IP address: 10.172.25.41]
    [Possible callback IP address: 127.0.0.1]
Using mapper->driver callback IP address and port: 10.172.25.41:55580
(You can override these with -driverif and -driverport.)
Driver program compiled with MapReduce V1 (Classic)
Memory Settings:
    mapred.child.java.opts:      -Xms1g -Xmx1g
    mapred.map.child.java.opts:  -Xms1g -Xmx1g
    Extra memory percent:        10
    mapreduce.map.memory.mb:     1126
Job name 'H2O_51360' submitted
JobTracker job ID is 'job_201308241920_0001'
Waiting for H2O cluster to come up...
H2O node 10.170.243.33:54321 requested flatfile
H2O node 10.172.14.127:54321 requested flatfile
H2O node 10.172.45.249:54321 requested flatfile
H2O node 10.170.203.112:54321 requested flatfile
H2O node 10.171.138.148:54321 requested flatfile
H2O node 10.170.134.189:54321 requested flatfile
H2O node 10.171.109.116:54321 requested flatfile
H2O node 10.174.119.151:54321 requested flatfile
H2O node 10.170.166.250:54321 requested flatfile
H2O node 10.172.25.41:54321 requested flatfile
Sending flatfiles to nodes...
    [Sending flatfile to node 10.170.243.33:54321]
    [Sending flatfile to node 10.172.14.127:54321]
    [Sending flatfile to node 10.172.45.249:54321]
    [Sending flatfile to node 10.170.203.112:54321]
    [Sending flatfile to node 10.171.138.148:54321]
    [Sending flatfile to node 10.170.134.189:54321]
    [Sending flatfile to node 10.171.109.116:54321]
    [Sending flatfile to node 10.174.119.151:54321]
    [Sending flatfile to node 10.170.166.250:54321]
    [Sending flatfile to node 10.172.25.41:54321]
H2O node 10.170.243.33:54321 reports H2O cluster size 1
H2O node 10.171.138.148:54321 reports H2O cluster size 1
H2O node 10.171.109.116:54321 reports H2O cluster size 1
H2O node 10.170.134.189:54321 reports H2O cluster size 1
H2O node 10.174.119.151:54321 reports H2O cluster size 1
H2O node 10.172.25.41:54321 reports H2O cluster size 1
H2O node 10.172.14.127:54321 reports H2O cluster size 1
H2O node 10.170.203.112:54321 reports H2O cluster size 1
H2O node 10.172.45.249:54321 reports H2O cluster size 1
H2O node 10.170.166.250:54321 reports H2O cluster size 1
H2O node 10.174.119.151:54321 reports H2O cluster size 2
H2O node 10.170.243.33:54321 reports H2O cluster size 10
H2O node 10.171.138.148:54321 reports H2O cluster size 10
H2O node 10.174.119.151:54321 reports H2O cluster size 10
H2O node 10.170.203.112:54321 reports H2O cluster size 10
H2O node 10.170.166.250:54321 reports H2O cluster size 10
H2O node 10.172.45.249:54321 reports H2O cluster size 10
H2O node 10.170.134.189:54321 reports H2O cluster size 10
H2O node 10.172.25.41:54321 reports H2O cluster size 10
H2O node 10.172.14.127:54321 reports H2O cluster size 10
H2O node 10.171.109.116:54321 reports H2O cluster size 10
H2O cluster (10 nodes) is up
(Note: Use the -disown option to exit the driver after cluster formation)
(Press Ctrl-C to kill the cluster)
Blocking until the H2O cluster shuts down...
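
One way to double-check driver output like the above is to compare the nodes that reported in against flatfile.txt. The Python sketch below assumes the driver output has been saved as text. The regular expression is based only on the lines shown above, and the helper names are made up for this illustration.

```python
import re

# Matches driver lines such as:
#   H2O node 10.170.243.33:54321 reports H2O cluster size 10
REPORT_RE = re.compile(
    r"H2O node (\d+\.\d+\.\d+\.\d+):\d+ reports H2O cluster size (\d+)"
)

def final_cluster_sizes(log_text):
    """Map each node IP to the last cluster size it reported."""
    sizes = {}
    for ip, size in REPORT_RE.findall(log_text):
        sizes[ip] = int(size)  # later lines overwrite earlier ones
    return sizes

def compare_with_flatfile(sizes, flatfile_ips):
    """Return (reported but not listed, listed but never reported)."""
    reported = set(sizes)
    listed = set(flatfile_ips)
    return sorted(reported - listed), sorted(listed - reported)

# A shortened sample in the same format as the output above.
SAMPLE_LOG = """\
H2O node 10.172.25.41:54321 reports H2O cluster size 1
H2O node 10.172.45.249:54321 reports H2O cluster size 1
H2O node 10.172.25.41:54321 reports H2O cluster size 10
H2O node 10.172.45.249:54321 reports H2O cluster size 10
"""

sizes = final_cluster_sizes(SAMPLE_LOG)
extra, missing = compare_with_flatfile(sizes, ["10.172.25.41", "10.170.119.11"])
print(sizes)
print("reported but not in flatfile:", extra)
print("in flatfile but never reported:", missing)
```

Applied to the full output above, this comparison shows 10.172.45.249 and 10.174.119.151 reporting in although they are not in flatfile.txt, while 10.170.119.11 and 10.170.191.200 are listed but never report.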

At this point the H2O cluster was ready to take in REST traffic from Data Scientists who wanted to both build and score some seriously large models.

I checked CM to see what H2O looked like from hadoop's perspective:
http://ox:7180/cmf/services/3/monitor/activities

Inspection of the CM activity screens told me that hadoop saw the H2O cluster as a set of 10 map jobs.

This is consistent with what I had read in the H2O documentation.

So, I was happy.





0xdata H2O on Linux Laptop


This techtip is related to my efforts to prepare 
my laptop for this Meetup:

http://www.meetup.com/H2Omeetup/events/132640822

Big Data Science from R
Tuesday, August 20, 2013 7:00 PM
0xdata Campus
1185 Terra Bella Ave, Mountain View, CA 

Learn how to invoke big data modeling entirely from R.

In this session our resident R & Math hacker, Anqi Fu will
demonstrate the R API for H2O.  Early users, community and
customers of H2O have been invoking GLM, Random Forest and
K-means from an RConsole or RStudio.

In this meetup we'll look at the API and take a small and
big dataset in HDFS and clean it using R, build models,
predict and plot entirely from R.  Users will experience
speed of algorithms in H2O, without having to learn a New
API.

I started by installing an instance of CentOS 6.4 on my laptop.

Then I installed R by following these steps:

  - Login as root
  - cd /tmp/
  - rpm -Uvh http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
  - yum -y install R

After I installed R, I installed RStudio:
  - wget http://download2.rstudio.org/rstudio-server-0.97.551-x86_64.rpm
  - yum install --nogpgcheck rstudio-server-0.97.551-x86_64.rpm


After I installed RStudio, I logged out of the root account.

I logged into a non-root account.

Then, I tried to install the R packages RCurl and rjson.

The R software has a nice feature which allows me to install R-packages
under my home directory if I lack permission to install them under /usr.

The feature is implemented by creating a directory named "R" under my
home directory.

The syntax I used is displayed in the screendump below:


cen96 oracle ~/oxdata $ 
cen96 oracle ~/oxdata $ 
cen96 oracle ~/oxdata $ R

R version 3.0.1 (2013-05-16) -- "Good Sport"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> install.packages(c("RCurl", "rjson"))
Installing packages into ‘/usr/lib64/R/library’
(as ‘lib’ is unspecified)
Warning in install.packages(c("RCurl", "rjson")) :
  'lib = "/usr/lib64/R/library"' is not writable
Would you like to use a personal library instead?  (y/n) y
Would you like to create a personal library
~/R/x86_64-redhat-linux-gnu-library/3.0
to install packages into?  (y/n) y
--- Please select a CRAN mirror for use in this session ---
also installing the dependency ‘bitops’

trying URL 'http://cran.stat.sfu.ca/src/contrib/bitops_1.0-5.tar.gz'
Content type 'application/x-gzip' length 8518 bytes
opened URL
==================================================
downloaded 8518 bytes

trying URL 'http://cran.stat.sfu.ca/src/contrib/RCurl_1.95-4.1.tar.gz'
Content type 'application/x-gzip' length 870915 bytes (850 Kb)
opened URL
==================================================
downloaded 850 Kb

trying URL 'http://cran.stat.sfu.ca/src/contrib/rjson_0.2.12.tar.gz'
Content type 'application/x-gzip' length 96906 bytes (94 Kb)
opened URL
==================================================
downloaded 94 Kb

* installing *source* package ‘bitops’ ...

snip ...

While the rjson package installed smoothly,
the RCurl package was blocked due to a missing Linux library.

I typed the error message into Google and then fixed the blockage with
the shell commands listed in the screen dump below:


cen96 root /home/oracle/bak # 
cen96 root /home/oracle/bak # yum install libcurl-devel.x86_64 libcurl.x86_64
Loaded plugins: fastestmirror, refresh-packagekit, security
Loading mirror speeds from cached hostfile
 * base: centos.mirror.facebook.net
 * epel: mirror.steadfast.net
 * extras: mirror.unl.edu
 * updates: mirror.steadfast.net
Setting up Install Process
Package libcurl-7.19.7-37.el6_4.x86_64 already installed and latest version
Resolving Dependencies
--> Running transaction check
---> Package libcurl-devel.x86_64 0:7.19.7-37.el6_4 will be installed
--> Processing Dependency: libidn-devel for package: libcurl-devel-7.19.7-37.el6_4.x86_64
--> Processing Dependency: automake for package: libcurl-devel-7.19.7-37.el6_4.x86_64
--> Running transaction check
---> Package automake.noarch 0:1.11.1-4.el6 will be installed
--> Processing Dependency: autoconf >= 2.62 for package: automake-1.11.1-4.el6.noarch
---> Package libidn-devel.x86_64 0:1.18-2.el6 will be installed
--> Running transaction check
---> Package autoconf.noarch 0:2.63-5.1.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package              Arch          Version                Repository      Size
================================================================================
Installing:
 libcurl-devel        x86_64        7.19.7-37.el6_4        updates        244 k
Installing for dependencies:
 autoconf             noarch        2.63-5.1.el6           base           781 k
 automake             noarch        1.11.1-4.el6           base           550 k
 libidn-devel         x86_64        1.18-2.el6             base           137 k

Transaction Summary
================================================================================
Install       4 Package(s)

Total download size: 1.7 M
Installed size: 4.8 M
Is this ok [y/N]: y
Downloading Packages:
(1/4): autoconf-2.63-5.1.el6.noarch.rpm                  | 781 kB     00:01     
(2/4): automake-1.11.1-4.el6.noarch.rpm                  | 550 kB     00:01     
(3/4): libcurl-devel-7.19.7-37.el6_4.x86_64.rpm          | 244 kB     00:04     
(4/4): libidn-devel-1.18-2.el6.x86_64.rpm                | 137 kB     00:00     
--------------------------------------------------------------------------------
Total                                           187 kB/s | 1.7 MB     00:09     
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : autoconf-2.63-5.1.el6.noarch                                 1/4 
  Installing : automake-1.11.1-4.el6.noarch                                 2/4 
  Installing : libidn-devel-1.18-2.el6.x86_64                               3/4 
  Installing : libcurl-devel-7.19.7-37.el6_4.x86_64                         4/4 
  Verifying  : libidn-devel-1.18-2.el6.x86_64                               1/4 
  Verifying  : libcurl-devel-7.19.7-37.el6_4.x86_64                         2/4 
  Verifying  : autoconf-2.63-5.1.el6.noarch                                 3/4 
  Verifying  : automake-1.11.1-4.el6.noarch                                 4/4 

Installed:
  libcurl-devel.x86_64 0:7.19.7-37.el6_4                                        

Dependency Installed:
  autoconf.noarch 0:2.63-5.1.el6          automake.noarch 0:1.11.1-4.el6       
  libidn-devel.x86_64 0:1.18-2.el6       

Complete!
cen96 root /home/oracle/bak # 
cen96 root /home/oracle/bak # 
cen96 root /home/oracle/bak # 


cen96 oracle ~/oxdata $ 
cen96 oracle ~/oxdata $ which curl-config
/usr/bin/curl-config
cen96 oracle ~/oxdata $ 
cen96 oracle ~/oxdata $ 

Then I retried my efforts to install RCurl:


> install.packages(c("RCurl", "rjson"))
Installing packages into ‘/home/oracle/R/x86_64-redhat-linux-gnu-library/3.0’
(as ‘lib’ is unspecified)
trying URL 'http://cran.stat.sfu.ca/src/contrib/RCurl_1.95-4.1.tar.gz'
Content type 'application/x-gzip' length 870915 bytes (850 Kb)
opened URL
==================================================
downloaded 850 Kb

trying URL 'http://cran.stat.sfu.ca/src/contrib/rjson_0.2.12.tar.gz'
Content type 'application/x-gzip' length 96906 bytes (94 Kb)
opened URL
==================================================
downloaded 94 Kb

* installing *source* package ‘RCurl’ ...
** package ‘RCurl’ successfully unpacked and MD5 sums checked
checking for curl-config... /usr/bin/curl-config
checking for gcc... gcc

snip......

** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
  converting help for package ‘rjson’
    finding HTML links ... done
    fromJSON                                html  
    newJSONParser                           html  
    rjson                                   html  
    toJSON                                  html  
** building package indices
** installing vignettes
   ‘json_rpc_server.Rnw’ 
** testing if installed package can be loaded
* DONE (rjson)

The downloaded source packages are in
	‘/tmp/RtmpJEhYC7/downloaded_packages’
> 


At this point I was ready to start step 3 listed on the Meetup page.

Sri wrote step 3 like this:

3) Download and install the H2O 
http://docs.0xdata.com/quickstart/quickstart_jar.html
If you already git pulled and ran make, then you can skip this step.

I had not already git pulled and run make.
But, I was curious about which git repository he was referring to.

After a web-search I found the repository to probably be this:

https://github.com/0xdata/h2o

At this point I was ready to start step 4 listed on the Meetup page.

Sri wrote step 4 like this:

4) Install the H2O R package. 
If you installed the executable in 3), 
then follow these steps:
http://docs.0xdata.com/quickstart/quickstart_jar.html
If you got H2O through Github, follow these steps:
http://docs.0xdata.com/quickstart/quickstart_git.html

I studied this page:

http://docs.0xdata.com/quickstart/quickstart_jar.html

I went on a quest for h2o.jar

Eventually I found it inside a ZIP file listed at this page:

  https://github.com/0xdata/h2o/wiki/How-To-Start-H2O

Here is a screendump of me wrestling with the ZIP file:

cen96 oracle ~/bak $   wget s3.amazonaws.com/h2o-release/fourier-6/h2o-1.5.6.137.zip
--2013-08-17 13:33:14--  http://s3.amazonaws.com/h2o-release/fourier-6/h2o-1.5.6.137.zip
Resolving s3.amazonaws.com... 207.171.185.200
Connecting to s3.amazonaws.com|207.171.185.200|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 61137583 (58M) [application/zip]
Saving to: “h2o-1.5.6.137.zip”

100%[======================================>] 61,137,583   992K/s   in 61s     

2013-08-17 13:34:16 (972 KB/s) - “h2o-1.5.6.137.zip” saved [61137583/61137583]

cen96 oracle ~/bak $ ll
total 77732
drwxrwxr-x.  2 oracle oracle     4096 Aug 17 13:33 ./
drwx------. 28 oracle oracle     4096 Aug 17 12:58 ../
-rw-rw-r--.  1 oracle oracle 61137583 Jul 29 21:04 h2o-1.5.6.137.zip
-rw-rw-r--.  1 oracle oracle 18445445 May 11 08:24 rstudio-server-0.97.551-x86_64.rpm
cen96 oracle ~/bak $ 
cen96 oracle ~/bak $


cen96 oracle ~/bak $ 
cen96 oracle ~/bak $ 
cen96 oracle ~/bak $ unzip h2o-1.5.6.137.zip 
Archive:  h2o-1.5.6.137.zip
   creating: h2o-1.5.6.137/
   creating: h2o-1.5.6.137/R/
  inflating: h2o-1.5.6.137/R/h2o_1.5.6.137.tar.gz  
  inflating: h2o-1.5.6.137/h2o.jar   
  inflating: h2o-1.5.6.137/h2o-sources.jar  
   creating: h2o-1.5.6.137/hadoop/
  inflating: h2o-1.5.6.137/hadoop/h2odriver_cdh3.jar  
  inflating: h2o-1.5.6.137/hadoop/h2odriver_mapr2.1.3.jar  
  inflating: h2o-1.5.6.137/hadoop/h2odriver_cdh4.jar  
  inflating: h2o-1.5.6.137/hadoop/README.txt  
  inflating: h2o-1.5.6.137/hadoop/h2odriver_cdh4_yarn.jar  
cen96 oracle ~/bak $ 
cen96 oracle ~/bak $ 
cen96 oracle ~/bak $ 


cen96 oracle ~/bak $ cd h2o-1.5.6.137
cen96 oracle ~/bak/h2o-1.5.6.137 $ 
cen96 oracle ~/bak/h2o-1.5.6.137 $ ls -la h2o.jar 
-rw-r--r--. 1 oracle oracle 60191282 Jul 29 17:11 h2o.jar
cen96 oracle ~/bak/h2o-1.5.6.137 $ 
cen96 oracle ~/bak/h2o-1.5.6.137 $ 

I finished step 4 by running a command from the R prompt as displayed
in the screendump below:


cen96 oracle ~/oxdata $ R

R version 3.0.1 (2013-05-16) -- "Good Sport"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> install.packages("/home/oracle/bak/h2o-1.5.6.137/R/h2o_1.5.6.137.tar.gz", repos = NULL, type = "source")
Installing package into ‘/home/oracle/R/x86_64-redhat-linux-gnu-library/3.0’
(as ‘lib’ is unspecified)
* installing *source* package ‘h2o’ ...
** R
** demo
** inst
** preparing package for lazy loading
Creating a generic function for ‘summary’ from package ‘base’ in package ‘h2o’
Creating a generic function for ‘colnames’ from package ‘base’ in package ‘h2o’
** help
*** installing help indices
  converting help for package ‘h2o’
    finding HTML links ... done
    H2OClient-class                         html  
    H2OGLMModel-class                       html  
    H2OKMeansModel-class                    html  
    H2OParsedData-class                     html  
    H2ORForestModel-class                   html  
    H2ORawData-class                        html  
    h2o-package                             html  
    h2o.getTree                             html  
    h2o.glm                                 html  
    h2o.kmeans                              html  
    h2o.randomForest                        html  
    importFile                              html  
    importFolder                            html  
    importHDFS                              html  
    importURL                               html  
    parseRaw                                html  
** building package indices
** testing if installed package can be loaded
* DONE (h2o)
> 

Next, I installed Java from the root account with the shell command listed below:

  yum -y groupinstall 'Java Platform'

Then, from a non-root account, I started the H2O server on my laptop:

cen96 oracle ~/bak $ cd h2o-1.5.6.137
cen96 oracle ~/bak/h2o-1.5.6.137 $ ll
total 59612
drwxr-xr-x. 4 oracle oracle     4096 Jul 29 17:11 ./
drwxrwxr-x. 3 oracle oracle     4096 Aug 17 14:25 ../
-rw-r--r--. 1 oracle oracle 60191282 Jul 29 17:11 h2o.jar
-rw-r--r--. 1 oracle oracle   826180 Jul 29 17:11 h2o-sources.jar
drwxr-xr-x. 2 oracle oracle     4096 Jul 29 17:11 hadoop/
drwxr-xr-x. 2 oracle oracle     4096 Jul 29 17:11 R/
cen96 oracle ~/bak/h2o-1.5.6.137 $ java -Xmx3g -jar h2o.jar -name mystats-cloud
02:26:08.483 main      INFO WATER: ----- H2O started -----
02:26:08.484 main      INFO WATER: Build git branch: (no branch)
02:26:08.484 main      INFO WATER: Build git hash: d78c92f3b8a4c765b2276a40aa2422ec396b62ce
02:26:08.484 main      INFO WATER: Build git describe: d78c92f
02:26:08.484 main      INFO WATER: Build project version: 1.5.6.137
02:26:08.490 main      INFO WATER: Built by: 'jenkins'
02:26:08.490 main      INFO WATER: Built on: 'Mon Jul 29 17:10:45 PDT 2013'
02:26:08.491 main      INFO WATER: Java availableProcessors: 1
02:26:08.589 main      INFO WATER: Java heap totalMemory: 0.02 gb
02:26:08.590 main      INFO WATER: Java heap maxMemory: 2.90 gb
02:26:08.648 main      INFO WATER: ICE root: '/tmp'
02:26:08.756 main      INFO WATER: Internal communication uses port: 54322
+                                  Listening for HTTP and REST traffic on  http://192.168.1.96:54321/
02:26:08.933 main      INFO WATER: H2O cloud name: 'mystats-cloud'
02:26:08.933 main      INFO WATER: (v1.5.6.137) 'mystats-cloud' on /192.168.1.96:54321, discovery address /236.151.114.91:60567
02:26:08.944 main      INFO WATER: Cloud of size 1 formed [/192.168.1.96:54321]
02:26:08.944 main      INFO WATER: Log dir: '/tmp/h2ologs'


At this point I considered my laptop ready for the Meetup:

http://www.meetup.com/H2Omeetup/events/132640822






Stock Market/MADlib Linear Regression


This tech-tip is a discussion about MADlib Linear Regression.

Information and discussion about MADlib can be found at the
URL listed below:


http://madlib.net/index.html


MADlib Linear Regression is documented here:


http://doc.madlib.net/latest/group__grp__linreg.html


A general writeup of Linear Regression as a predictive tool can be
found in Wikipedia:


http://en.wikipedia.org/wiki/Multivariate_linear_regression


To make this tech-tip interesting, I used MADlib to try to predict 
stock market prices in 2013 using data from 1993 through 2012.

I started this effort by installing MADlib on a CentOS Linux host.

The installation steps are listed here:


http://bikle.com/techtips/two#madlib



Next, I downloaded stockmarket data from Yahoo into a CSV file.
I downloaded dates and prices for a symbol named 'SPY'.
Shares in SPY can be bought on the stockmarket just like shares of AAPL or IBM.
The price of SPY mirrors the price of the S&P 500.
So, if I want to 'predict the stockmarket' I specifically mean that I want
to predict if SPY price will move up or down tomorrow.

I list below Linux shell commands to download SPY prices:

cd /tmp
rm -f temp.csv
wget --output-document=temp.csv 'http://ichart.finance.yahoo.com/table.csv?s=SPY'
# Remove the text-header:
grep -v Date temp.csv > /tmp/SPY.csv
ls -la /tmp/SPY.csv
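
For readers who prefer to drop the header outside the shell, here is a minimal Python sketch of the same step. The sample rows are made up. Only the header line (Date,Open,High,Low,Close,Volume,Adj Close) follows the layout of the downloaded file, and the function name data_rows is hypothetical.

```python
import csv
import io

# Made-up sample in the layout of the downloaded file: header, then data rows.
SAMPLE_CSV = """\
Date,Open,High,Low,Close,Volume,Adj Close
2013-01-03,100.0,101.0,99.5,100.5,1000,100.5
2013-01-02,99.0,100.5,98.5,100.0,1200,100.0
"""

def data_rows(csv_text):
    """Return the rows without the header row, similar to `grep -v Date`."""
    reader = csv.reader(io.StringIO(csv_text))
    return [row for row in reader if row and row[0] != "Date"]

rows = data_rows(SAMPLE_CSV)
print(len(rows), "data rows, each with", len(rows[0]), "columns")
```

The seven columns line up with the seven columns of the date_prices table created next.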

Next, I created a table in Postgres to hold this SPY data:

CREATE TABLE date_prices
(
ydate     DATE
,opn      DECIMAL
,hhigh    DECIMAL
,llow     DECIMAL
,closing_price DECIMAL
,volume   DECIMAL
,adjclose DECIMAL
)
;

Next, I copied the SPY-CSV data into the table:

--
-- The Postgres server process does the work.
-- It is running relative to another directory.
-- I need to give it the full path to the CSV file.
COPY date_prices (
ydate     
,opn      
,hhigh    
,llow     
,closing_price
,volume   
,adjclose
) FROM '/tmp/SPY.csv' WITH csv
;

Then, for each date, I worked towards building RSI values 
from price using window functions:

-- Ref:
-- http://www.postgresql.org/docs/9.2/static/tutorial-window.html
-- http://en.wikipedia.org/wiki/Relative_strength_index#Principles
--

CREATE TABLE spylin10 AS
SELECT
ydate
,closing_price
-- copy the previous row price to this row
,LAG(closing_price,1,NULL)OVER(ORDER BY ydate)  price_before
-- copy the next row price to this row
,LEAD(closing_price,1,NULL)OVER(ORDER BY ydate) price_after
FROM date_prices
;

CREATE TABLE spylin12 AS
SELECT
ydate
,closing_price
,(closing_price - price_before) d01
,(price_after - closing_price)  gain
FROM spylin10
;

CREATE TABLE spylin14 AS
SELECT
ydate
,closing_price
,gain
-- ngain is the normalized 1-day gain
,gain/closing_price AS ngain
,CASE WHEN d01 < 0 THEN -d01 ELSE 0 END dayloss
,CASE WHEN d01 > 0 THEN d01 ELSE 0 END  daygain
FROM spylin12
;
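
The three statements above (spylin10, spylin12, spylin14) can be read as one pass over the price list. The following Python sketch mirrors them on made-up prices. The function name one_day_features is hypothetical, and None stands in for SQL NULL.

```python
def one_day_features(prices):
    """Mirror spylin10..spylin14 for closing prices ordered by date."""
    rows = []
    for i, price in enumerate(prices):
        before = prices[i - 1] if i > 0 else None               # LAG(closing_price, 1)
        after = prices[i + 1] if i + 1 < len(prices) else None  # LEAD(closing_price, 1)
        d01 = None if before is None else price - before
        gain = None if after is None else after - price
        rows.append({
            "closing_price": price,
            "d01": d01,
            "gain": gain,
            "ngain": None if gain is None else gain / price,
            # As in the SQL CASE expressions, a NULL d01 falls through to ELSE 0.
            "dayloss": -d01 if d01 is not None and d01 < 0 else 0,
            "daygain": d01 if d01 is not None and d01 > 0 else 0,
        })
    return rows

rows = one_day_features([100.0, 102.0, 101.0])  # made-up prices
for row in rows:
    print(row)
```

Note that gain and ngain look one day ahead (they use the next row's price), while d01, dayloss and daygain look one day back.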

CREATE TABLE spylin16 AS
SELECT
ydate
,closing_price
,gain
,ngain
-- Moving average of closing_price over the current row and the 200 rows before it
-- (201 rows, roughly a 200-day average):
,AVG(closing_price)OVER(ORDER BY ydate ROWS BETWEEN 200 PRECEDING AND CURRENT ROW)ma200
-- Move towards calculating RSI over 5, 9, and 14 days:
,AVG(dayloss)OVER(ORDER BY ydate ROWS BETWEEN 4 PRECEDING AND CURRENT ROW)mv_avg_loss5
,AVG(daygain)OVER(ORDER BY ydate ROWS BETWEEN 4 PRECEDING AND CURRENT ROW)mv_avg_gain5
,AVG(dayloss)OVER(ORDER BY ydate ROWS BETWEEN 8 PRECEDING AND CURRENT ROW)mv_avg_loss9
,AVG(daygain)OVER(ORDER BY ydate ROWS BETWEEN 8 PRECEDING AND CURRENT ROW)mv_avg_gain9
,AVG(dayloss)OVER(ORDER BY ydate ROWS BETWEEN 13 PRECEDING AND CURRENT ROW)mv_avg_loss14
,AVG(daygain)OVER(ORDER BY ydate ROWS BETWEEN 13 PRECEDING AND CURRENT ROW)mv_avg_gain14
FROM spylin14
;
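
The window clause ROWS BETWEEN n PRECEDING AND CURRENT ROW averages over n + 1 rows, and over fewer rows near the start of the table. A small Python sketch of that behaviour, using made-up numbers and a hypothetical function name:

```python
def trailing_avg(values, preceding):
    """AVG(x) OVER (ORDER BY ... ROWS BETWEEN `preceding` PRECEDING AND CURRENT ROW)."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - preceding): i + 1]  # shorter near the start
        out.append(sum(window) / len(window))
    return out

# "4 PRECEDING" means a 5-row window once enough rows exist.
print(trailing_avg([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 4))
# → [1.0, 1.5, 2.0, 2.5, 3.0, 4.0]
```

So 4, 8 and 13 PRECEDING give 5-, 9- and 14-row averages, and 200 PRECEDING gives a 201-row average.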

CREATE TABLE spylin18 AS
SELECT
ydate
,closing_price
,gain
,ngain
,ma200
,CASE WHEN closing_price > ma200 THEN 1.0 ELSE -1.0 END above_ma200
,CASE WHEN mv_avg_loss5=0 THEN 100.0
 ELSE 0.1 + 100.0 * ( 1 - 1/(1 + mv_avg_gain5/(mv_avg_loss5+0.0001)))END rsi5
,CASE WHEN mv_avg_loss9=0 THEN 100.0
 ELSE 0.1 + 100.0 * ( 1 - 1/(1 + mv_avg_gain9/(mv_avg_loss9+0.0001)))END rsi9
,CASE WHEN mv_avg_loss14=0 THEN 100.0
 ELSE 0.1 + 100.0 * ( 1 - 1/(1 + mv_avg_gain14/(mv_avg_loss14+0.0001)))END rsi14
FROM spylin16
;
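
The rsi5, rsi9 and rsi14 expressions above all use the same arithmetic. Here is a Python sketch of that arithmetic, including the 0.1 offset and the 0.0001 constant from the SQL. The function name rsi_like is hypothetical. Without those two constants, the expression 100 * (1 - 1/(1 + RS)) is the usual RSI formula 100 - 100/(1 + RS), applied here to the simple moving averages from spylin16.

```python
def rsi_like(avg_gain, avg_loss):
    """Same arithmetic as the rsi5/rsi9/rsi14 CASE expressions in spylin18."""
    if avg_loss == 0:
        return 100.0
    rs = avg_gain / (avg_loss + 0.0001)  # small constant copied from the SQL
    return 0.1 + 100.0 * (1 - 1 / (1 + rs))

print(rsi_like(0.5, 0.0))  # no losses in the window
print(rsi_like(0.0, 1.0))  # no gains in the window
```

With no losses the value is pinned at 100.0, and with no gains it bottoms out at 0.1 rather than 0.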

CREATE TABLE spylin20 AS
SELECT
ydate
,closing_price
,gain
,ngain
,ma200
,above_ma200
,rsi5
,rsi9
,rsi14
,(rsi5 + rsi9 + rsi14) AS sum_rsi
,(rsi5 * rsi9 * rsi14) AS prod_rsi
,ydate - '1950-01-01' id
FROM spylin18
ORDER BY ydate
;

--
-- rsi14 needs 14 days of data.
-- So, ignore rows dated up to 15 calendar days after the first date
-- (roughly the first 11 trading days).
-- Also leave out 2013,
-- later I want to predict 2013 using data from before 2013:
--
CREATE TABLE spylin22 AS
SELECT
id
,ydate
,closing_price
,gain
,ngain
,ma200
,above_ma200
,rsi5
,rsi9
,rsi14
,sum_rsi
,prod_rsi
FROM spylin20
WHERE ydate > 15 + (SELECT MIN(ydate) FROM spylin20)
AND ydate < '2013-01-01'
ORDER BY ydate
;



Next, I studied the MADlib documentation:

http://doc.madlib.net/v1.1/group__grp__linreg.html#examples

I decided that I wanted to predict the value of a column named ngain
in the above table named spylin22.

Using jargon from the MADlib documentation, 
I say that ngain is the dependent variable.

Next, I made the assumption that ngain depends on 
the independent variables named rsi5, rsi9, rsi14, sum_rsi, and prod_rsi.

The variables correspond to columns in table spylin22.

To keep things in perspective, I conceptualize the idea that 
each morning I don't know the value of ngain but I do know the values
of rsi5, rsi9, rsi14, sum_rsi, and prod_rsi.


I followed the example in the MADlib documentation and wrote this SQL statement:

SELECT linregr_train( 'spylin22', 'spylinreg12', 'ngain', 'array[1,rsi5,rsi9,rsi14,sum_rsi,prod_rsi]' );

The first argument in the above statement is the name of the table which contains
both the dependent variable values and the independent variable values.

The second argument is a name of my choosing for the output table which holds the training model.

I chose to call it spylinreg12 for no particular reason.

The third argument is the name of the dependent variable which is the thing I want to predict.

The fourth and final argument is a list of independent variables.

When I ran the above statement I saw this:

PL/pgSQL function __internal_linregr_train_hetero(character varying,character varying,character varying,character varying,boolean) line 23 at EXECUTE statement
SQL statement "SELECT madlib.__internal_linregr_train_hetero(
        source_table, out_table, dependent_varname, independent_varname, False)"
PL/pgSQL function linregr_train(character varying,character varying,character varying,character varying) line 3 at PERFORM
 linregr_train 
---------------
 
(1 row)


The above output is a bit ugly but looks more informational than worrisome.
I said 'okay...'.


Next, following the documentation example, I ran this query and saw:

--
-- rpt, Look at the regression created by MADlib:
--
\x on
Expanded display is on.
SELECT * FROM spylinreg12;
-[ RECORD 1 ]+----------------------------------------------------------------------------------------------------------------------------------
coef         | {0.00344487738475487,-3.53653485919165e-05,-6.67879424124425e-05,-3.44699495848545e-11,7.80249719116958e-06,5.67899373282035e-09}
r2           | 0.00452818367761205
std_err      | {0.0010631426367793,2.08783563055287e-05,3.08650982623128e-05,1.06379615290428e-11,2.01362954804233e-05,2.79235011218835e-09}
t_stats      | {3.2402777064711,-1.69387609227416,-2.16386618454387,-3.2402777064711,0.387484242012402,2.0337685120616}
p_values     | {0.00120194969294453,0.0903511012117993,0.0305218174216153,0.00120194969294453,0.698414237680639,0.0420276212722458}
condition_no | 131471944.363643

\x off
Expanded display is off.



I liked the look of that.

Then, I created a table full of 2013 data that I wanted to predict:

CREATE TABLE spylin2013 AS
SELECT
id
,ydate
,closing_price
,gain
,ngain
,ma200
,above_ma200
,rsi5
,rsi9
,rsi14
,sum_rsi
,prod_rsi
FROM spylin20
WHERE ydate > '2013-01-01'
ORDER BY ydate
;

--
-- rpt, How many rows do I have?
-- 
SELECT MIN(ydate), COUNT(ydate), MAX(ydate) FROM spylin2013;
    min     | count |    max     
------------+-------+------------
 2013-01-02 |   159 | 2013-08-19
(1 row)


Next, I followed the documentation example to issue a few predictions:

SELECT
ydate
,closing_price
,gain
,ROUND(rsi5,4)
,ROUND(rsi9,4)
,ROUND(rsi14,4)
,linregr_predict(array[1,rsi5,rsi9,rsi14,sum_rsi,prod_rsi], m.coef) as predicted_ngain
FROM spylin2013 s,spylinreg12 m
WHERE ydate BETWEEN '2013-01-01' AND '2013-01-31'
ORDER BY ydate
;
   ydate    | closing_price | gain  |  round   |  round  |  round  |  predicted_ngain   
------------+---------------+-------+----------+---------+---------+--------------------
 2013-01-02 |        146.06 | -0.33 |  72.3112 | 52.7441 | 57.4633 |  3.36287368073e-05
 2013-01-03 |        145.73 |  0.64 |  74.7241 | 55.9591 | 56.2486 |  -0.00014090825312
 2013-01-04 |        146.37 | -0.40 |  78.2900 | 55.2655 | 60.6009 | -1.09867723904e-05
 2013-01-07 |        145.97 | -0.42 |  90.2290 | 65.7438 | 61.0459 | -0.000387208020353
 2013-01-08 |        145.55 |  0.37 |  78.9530 | 65.8733 | 55.5155 |  -0.00054400233881
 2013-01-09 |        145.92 |  1.16 |  46.8484 | 71.1329 | 51.9056 |  -0.00065489680265
 2013-01-10 |        147.08 | -0.01 |  72.6631 | 75.4614 | 59.2301 |   -0.0007025179779
 2013-01-11 |        147.07 | -0.10 |  64.9168 | 87.6984 | 56.8419 |  -0.00123607756307
 2013-01-14 |        146.97 |  0.10 |  74.3538 | 82.2929 | 67.1954 | -0.000599375317291
 2013-01-15 |        147.07 | -0.02 |  93.7512 | 64.3896 | 69.9571 |  6.87288528445e-06
 2013-01-16 |        147.05 |  0.95 |  90.7149 | 70.5772 | 73.5422 |  2.92364071926e-05
 2013-01-17 |        148.00 |  0.33 |  89.0454 | 73.1692 | 76.7909 |  0.000115099211462
 2013-01-18 |        148.33 |  0.80 |  92.0693 | 84.1822 | 88.3023 |  0.000517312166507
 2013-01-22 |        149.13 |  0.24 |  99.1684 | 96.6919 | 86.2939 |   0.00038050482447
 2013-01-23 |        149.37 |  0.04 |  99.2241 | 96.5726 | 78.2756 | -0.000116057754723
 2013-01-24 |        149.41 |  0.84 | 100.0000 | 95.0477 | 83.0541 |  0.000213242804434
 2013-01-25 |        150.25 | -0.18 | 100.0000 | 96.5658 | 83.6438 |   0.00023223865331
 2013-01-28 |        150.07 |  0.59 |  91.5068 | 94.3615 | 86.9486 |  0.000298805954988
 2013-01-29 |        150.66 | -0.59 |  90.5523 | 95.0660 | 94.6668 |  0.000708123395604
 2013-01-30 |        150.07 | -0.37 |  65.7104 | 83.1976 | 84.9540 |  2.66587438842e-05
 2013-01-31 |        149.70 |  1.54 |  55.7312 | 71.4407 | 75.4671 | -9.99272670506e-06
(21 rows)
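
For a linear regression, a prediction is the dot product of the feature array and
the coef array. Here is a small Python sketch, assuming that is what linregr_predict does here,
which re-computes the 2013-01-02 prediction from the rounded RSI values in the report
above. The function name predict is mine; the coefficients are copied from spylinreg12.

```python
# Coefficients copied from the coef column of spylinreg12
COEF = [
    0.00344487738475487,
    -3.53653485919165e-05,
    -6.67879424124425e-05,
    -3.44699495848545e-11,
    7.80249719116958e-06,
    5.67899373282035e-09,
]


def predict(rsi5, rsi9, rsi14, coef=COEF):
    """Dot product of array[1,rsi5,rsi9,rsi14,sum_rsi,prod_rsi] and coef."""
    features = [
        1.0,                  # intercept term
        rsi5,
        rsi9,
        rsi14,
        rsi5 + rsi9 + rsi14,  # sum_rsi
        rsi5 * rsi9 * rsi14,  # prod_rsi
    ]
    return sum(c * x for c, x in zip(coef, features))


# Rounded RSI values from the 2013-01-02 row of the report
pred_2013_01_02 = predict(72.3112, 52.7441, 57.4633)

if __name__ == "__main__":
    print(pred_2013_01_02)
```

Because the inputs are rounded to 4 places, the result should agree with the
predicted_ngain in the report only approximately.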

The above 21 predictions did not appear to be very predictive.
I was disappointed.

I was curious though: was the technology good enough to usually 
predict at least the direction of tomorrow's price move?


So, I collected predictions for all of 2013 through Aug 19:

CREATE TABLE spylin_pred10 AS
SELECT
ydate
,closing_price
,gain AS actual_gain
,rsi5
,rsi9
,rsi14
,linregr_predict(array[1,rsi5,rsi9,rsi14,sum_rsi,prod_rsi], m.coef) as predicted_ngain
FROM spylin2013 s,spylinreg12 m
ORDER BY ydate
;

Then, I ran a simple report on the collected predictions:

SELECT 
SIGN(predicted_ngain) updown_prediction
,MIN(actual_gain)
,MAX(actual_gain)
,SUM(actual_gain)
,AVG(actual_gain)
,CORR(predicted_ngain,actual_gain) AS predicted_actual_correlation
,MIN(ydate)
,COUNT(ydate)
,MAX(ydate)
FROM spylin_pred10
GROUP BY SIGN(predicted_ngain)
ORDER BY SIGN(predicted_ngain)
;
 updown_prediction |  min  | max  |  sum  |          avg           | predicted_actual_correlation |    min     | count |    max     
-------------------+-------+------+-------+------------------------+------------------------------+------------+-------+------------
                -1 | -4.05 | 2.25 |  4.12 | 0.07103448275862068966 |           -0.170719365677451 | 2013-01-03 |    58 | 2013-08-06
                 1 | -2.89 | 2.46 | 14.59 | 0.14590000000000000000 |           0.0150906303300563 | 2013-01-02 |   101 | 2013-08-19
(2 rows)



I noted that MADlib was bullish for 2013.
This pleased me because 2013 (through Aug 19) has been an 'Up' year.
Out of the 159 predictions, only 58 were bearish.

Also I noticed that MADlib correctly predicted both the largest down-day and largest up-day.

And I saw that the average gain corresponding to up-predictions was
$0.146 which is more than double the corresponding down-prediction gains.

Next I looked at gains for a 'buy/hold' strategy:

SELECT
SUM(gain) AS sum_actual_gain
,AVG(gain)
,MIN(ydate)
,COUNT(ydate)
,MAX(ydate)
FROM spylin2013
;
 sum_actual_gain |          avg           |    min     | count |    max     
-----------------+------------------------+------------+-------+------------
           18.71 | 0.11841772151898734177 | 2013-01-02 |   159 | 2013-08-19
(1 row)

A 'buy/hold' speculator would have earned an average of $0.118 per day.
A prediction-follower who is long on up-predictions and short on down-predictions
would have earned ($14.59 - $4.12)/159, which is $0.066 per day.

Based on the above analysis, I'd say that MADlib Linear Regression of RSI values
is somewhat predictive but it is probably safer to just buy and hold.


But, what about 2008?

Could the stock market crash of 2008 happen again?

If yes, could MADlib predictions protect me from a year like 2008?

The answer is 'yes'.

I walked through the steps listed above except that I used data from
before 2008 to predict gains in 2008.

The resulting report is displayed below:

SELECT 
SIGN(predicted_ngain) updown_prediction
,MIN(actual_gain)
,MAX(actual_gain)
,SUM(actual_gain)
,AVG(actual_gain)
,CORR(predicted_ngain,actual_gain) AS predicted_actual_correlation
,MIN(ydate)
,COUNT(ydate)
,MAX(ydate)
FROM spylin_pred10
GROUP BY SIGN(predicted_ngain)
ORDER BY SIGN(predicted_ngain)
;
 updown_prediction |  min  |  max  |  sum   |           avg           | predicted_actual_correlation |    min     | count |    max     
-------------------+-------+-------+--------+-------------------------+------------------------------+------------+-------+------------
                -1 | -7.98 |  3.30 | -38.29 | -0.61758064516129032258 |           -0.031477571915134 | 2008-01-31 |    62 | 2008-12-18
                 1 | -9.83 | 12.85 | -13.68 | -0.07162303664921465969 |            0.103635446618122 | 2008-01-02 |   191 | 2008-12-31
(2 rows)

--
-- Look at sum of actual gains for 'buy and hold' situation:
--
SELECT
SUM(gain) AS sum_actual_gain
,AVG(gain)
,MIN(ydate)
,COUNT(ydate)
,MAX(ydate)
FROM spylin2008
;
 sum_actual_gain |           avg           |    min     | count |    max     
-----------------+-------------------------+------------+-------+------------
          -51.97 | -0.20541501976284584980 | 2008-01-02 |   253 | 2008-12-31
(1 row)

I can see the buy/hold speculator would have lost $51.97 by holding 1 share of SPY during 2008.

The prediction-follower speculator (short 1 share on down-predictions, long 1 share
on up-predictions) would have gained $38.29 - $13.68, which is $24.61.

So, although MADlib linear regression is not wildly predictive, it is powerful enough to offer
significant protection from a year like 2008.
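
The comparison between the two speculators is simple arithmetic on the report sums.
Here is a Python sketch of it, under my assumption that the prediction-follower holds
1 share long on up-predictions and 1 share short on down-predictions, and that
trading costs are ignored. The names follower_gain and REPORTS are mine; the numbers
are copied from the reports above.

```python
def follower_gain(sum_up, sum_down):
    """Gain of a follower who is long on up-predictions, short on down-predictions.

    sum_up   : SUM(actual_gain) on days predicted up
    sum_down : SUM(actual_gain) on days predicted down (a short earns its negative)
    """
    return sum_up - sum_down


# Sums and day counts copied from the 2013 and 2008 reports
REPORTS = {
    2013: {"up": 14.59, "down": 4.12, "buy_hold": 18.71, "days": 159},
    2008: {"up": -13.68, "down": -38.29, "buy_hold": -51.97, "days": 253},
}

if __name__ == "__main__":
    for year, r in REPORTS.items():
        gain = follower_gain(r["up"], r["down"])
        print(
            year,
            "follower:", round(gain, 2),
            "per day:", round(gain / r["days"], 3),
            "buy/hold:", r["buy_hold"],
            "per day:", round(r["buy_hold"] / r["days"], 3),
        )
```

In both years the up and down sums add up to the buy/hold sum, which is a quick
check that the two reports cover the same set of days.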





Node.js For Rails on CentOS 6.4


A common issue a Rails developer sees in new development environments is this message:

cen96 oracle ~/sv $ 
cen96 oracle ~/sv $ bundle exec rake db:create
rake aborted!
Could not find a JavaScript runtime. See https://github.com/sstephenson/execjs for a list of available runtimes.
/home/oracle/spokenvote/config/application.rb:11:in `'
/home/oracle/spokenvote/Rakefile:5:in `require'
/home/oracle/spokenvote/Rakefile:5:in `'
(See full trace by running task with --trace)
cen96 oracle ~/sv $ 

The quickest way to deal with this is to declare the 'therubyracer' gem in the Gemfile of the Rails project.

For example:

# Gems used only for assets and not required
# in production environments by default.
group :assets do
  gem 'therubyracer'
  gem 'sass-rails',   '~> 3.2.5'
  gem 'coffee-rails', '~> 3.2.1'
  gem 'uglifier', '>= 1.0.3'
end

If your development environment is CentOS 6.4,
another quick way to deal with this issue is to
install Node.js on the development environment.

A screendump of shell commands to quickly install 
Node.js on CentOS 6.4 is listed below:

cen96 oracle ~/sv $ su
Password: 
cen96 root /home/oracle/sv # 
cen96 root /home/oracle/sv # cd /etc
cen96 root /etc # cd yum.repos.d/
cen96 root /etc/yum.repos.d # ls -la
total 36
drwxr-xr-x.   2 root root  4096 Aug 11 02:28 ./
drwxr-xr-x. 104 root root 12288 Aug 18 12:45 ../
-rw-r--r--.   1 root root  1946 Aug 11 02:28 CentOS-Base.repo
-rw-r--r--.   1 root root   638 Feb 25 00:57 CentOS-Debuginfo.repo
-rw-r--r--.   1 root root   630 Feb 25 00:57 CentOS-Media.repo
-rw-r--r--.   1 root root  3664 Feb 25 00:57 CentOS-Vault.repo
-rw-r--r--.   1 root root   442 Sep 23  2012 pgdg-92-centos.repo
cen96 root /etc/yum.repos.d # rpm -Uvh http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
Retrieving http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
warning: /var/tmp/rpm-tmp.5AA8EP: Header V3 RSA/SHA256 Signature, key ID 0608b895: NOKEY
Preparing...                ########################################### [100%]
   1:epel-release           ########################################### [100%]
cen96 root /etc/yum.repos.d # ll
total 44
drwxr-xr-x.   2 root root  4096 Aug 18 13:03 ./
drwxr-xr-x. 104 root root 12288 Aug 18 12:45 ../
-rw-r--r--.   1 root root  1946 Aug 11 02:28 CentOS-Base.repo
-rw-r--r--.   1 root root   638 Feb 25 00:57 CentOS-Debuginfo.repo
-rw-r--r--.   1 root root   630 Feb 25 00:57 CentOS-Media.repo
-rw-r--r--.   1 root root  3664 Feb 25 00:57 CentOS-Vault.repo
-rw-r--r--.   1 root root   957 Nov  4  2012 epel.repo
-rw-r--r--.   1 root root  1056 Nov  4  2012 epel-testing.repo
-rw-r--r--.   1 root root   442 Sep 23  2012 pgdg-92-centos.repo
cen96 root /etc/yum.repos.d # 
cen96 root /etc/yum.repos.d # 
cen96 root /etc/yum.repos.d # 

At this point I had EPEL installed.

Information about EPEL can be found here:

http://fedoraproject.org/wiki/EPEL/FAQ


Next I turned towards yum.

A bit of quality time with Google revealed to me that I actually
wanted to install 'npm' rather than Node.js.

Information about npm can be found here:

https://npmjs.org/

The main idea here is that the npm package depends on Node.js,
so yum installs Node.js for me when I install npm.

And, once I have Node.js installed, I probably want npm installed too.



Here is a screendump of me installing npm:

cen96 root /etc/yum.repos.d # yum search npm
Loaded plugins: fastestmirror, refresh-packagekit, security
Loading mirror speeds from cached hostfile
epel/metalink                                            |  14 kB     00:00     
 * base: yum.phx.singlehop.com
 * epel: linux.mirrors.es.net
 * extras: mirror.keystealth.org
 * updates: centos.vipernetworksystems.com
base                                                     | 3.7 kB     00:00     
epel                                                     | 4.2 kB     00:00     
http://linux.mirrors.es.net/fedora-epel/6/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
epel                                                     | 4.2 kB     00:00     
http://mirrors.kernel.org/fedora-epel/6/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
epel                                                     | 4.2 kB     00:00     
epel/primary_db                                          | 5.4 MB     00:17     
extras                                                   | 3.4 kB     00:00     
pgdg92                                                   | 2.8 kB     00:00     
updates                                                  | 3.4 kB     00:00     
updates/primary_db                                       | 3.9 MB     00:13     
epel/pkgtags                                             | 380 kB     00:01     
=============================== N/S Matched: npm ===============================
nodejs-fstream-npm.noarch : An fstream class for creating npm packages
nodejs-npm-registry-client.noarch : Client for the npm registry
nodejs-npm-user-validate.noarch : Username, password, and e-mail validation for
                                : the npm registry
nodejs-npmconf.noarch : npm configuration module
nodejs-npmlog.noarch : Logger for npm
nodejs-normalize-package-data.noarch : Normalizes npm/package.json metadata
nodejs-read-package-json.noarch : npm's package.json parser
nodejs-semver.noarch : Semantic versioner for npm
npm.noarch : Node.js Package Manager

  Name and summary matches only, use "search all" for everything.



Then, I installed npm:

cen96 root /etc/yum.repos.d # yum -y install npm.noarch
Loaded plugins: fastestmirror, refresh-packagekit, security

snip....

  nodejs-which.noarch 0:1.0.5-8.el6                                             
  v8.x86_64 1:3.14.5.10-1.el6                                                   
  v8-devel.x86_64 1:3.14.5.10-1.el6                                             

Complete!
cen96 root /etc/yum.repos.d # 
cen96 root /etc/yum.repos.d # 
cen96 root /etc/yum.repos.d # 



Next, I checked that Node.js was added to my path:


cen96 oracle ~/sv $ 
cen96 oracle ~/sv $ which node
/usr/bin/node
cen96 oracle ~/sv $ 
cen96 oracle ~/sv $ 
cen96 oracle ~/sv $ node -v
v0.10.14
cen96 oracle ~/sv $ 
cen96 oracle ~/sv $ 

Then, I checked if Rails was happy:

cen96 oracle ~/sv $ 
cen96 oracle ~/sv $ bundle exec rake db:create
spokenvote_development already exists
spokenvote_test already exists
cen96 oracle ~/sv $ 
cen96 oracle ~/sv $ bundle exec rake db:drop
cen96 oracle ~/sv $ bundle exec rake db:create
cen96 oracle ~/sv $ bundle exec rake db:migrate
==  CreateGoverningBodies: migrating ==========================================
-- create_table(:governing_bodies)
   -> 0.1523s
==  CreateGoverningBodies: migrated (0.1523s) =================================

snip ...

Rails was happy so I was happy.





Heroku Toolbelt


A mashup of a checklist and screendump of 
Heroku Toolbelt installation.

Here I show you how to install the Heroku Toolbelt
to a location under your HOME directory.



Visit: https://toolbelt.heroku.com/

bikle@z5 ~ $ cd /tmp
bikle@z5 /tmp $ wget https://toolbelt.heroku.com/install.sh
--2013-07-19 20:27:14--  https://toolbelt.heroku.com/install.sh
Resolving toolbelt.heroku.com... 50.16.215.67, 50.16.215.104, 107.21.95.3, ...
Connecting to toolbelt.heroku.com|50.16.215.67|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 849 [text/plain]
Saving to: “install.sh”

100%[======================================>] 849         --.-K/s   in 0s      

2013-07-19 20:27:15 (56.3 MB/s) - “install.sh” saved [849/849]

bikle@z5 /tmp $ 
bikle@z5 /tmp $ 

bikle@z5 /tmp $ ls -la install.sh 
-rw-rw-r--. 1 bikle bikle 849 Jul 19 20:27 install.sh
bikle@z5 /tmp $ 
bikle@z5 /tmp $ 


bikle@z5 /tmp $ cat install.sh 
HEROKU_CLIENT_URL="http://assets.heroku.com.s3.amazonaws.com/heroku-client/heroku-client.tgz"

echo "This script requires superuser access to install software."
echo "You will be prompted for your password by sudo."

# clear any previous sudo permission
sudo -k

# run inside sudo
sudo sh <<SCRIPT

  # download and extract the client tarball
  rm -rf /usr/local/heroku
  mkdir -p /usr/local/heroku
  cd /usr/local/heroku

  if [[ -z "$(which wget)" ]]; then
    curl -s $HEROKU_CLIENT_URL | tar xz
  else
    wget -qO- $HEROKU_CLIENT_URL | tar xz
  fi

  mv heroku-client/* .
  rmdir heroku-client

SCRIPT

# remind the user to add to $PATH
if [[ ":$PATH:" != *":/usr/local/heroku/bin:"* ]]; then
  echo "Add the Heroku CLI to your PATH using:"
  echo "$ echo 'PATH=\"/usr/local/heroku/bin:\$PATH\"' >> ~/.profile"
fi

echo "Installation complete"
bikle@z5 /tmp $ 
bikle@z5 /tmp $ 
bikle@z5 /tmp $ 
bikle@z5 /tmp $ 

bikle@z5 /tmp $ 
bikle@z5 /tmp $ 
bikle@z5 /tmp $ wget http://assets.heroku.com.s3.amazonaws.com/heroku-client/heroku-client.tgz
--2013-07-19 20:29:11--  http://assets.heroku.com.s3.amazonaws.com/heroku-client/heroku-client.tgz
Resolving assets.heroku.com.s3.amazonaws.com... 207.171.189.81
Connecting to assets.heroku.com.s3.amazonaws.com|207.171.189.81|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 659086 (644K) [application/x-gtar]
Saving to: “heroku-client.tgz”

100%[======================================>] 659,086      277K/s   in 2.3s    

2013-07-19 20:29:13 (277 KB/s) - “heroku-client.tgz” saved [659086/659086]

bikle@z5 /tmp $ 
bikle@z5 /tmp $ 
bikle@z5 /tmp $ 
bikle@z5 /tmp $ 


bikle@z5 /tmp $ 
bikle@z5 /tmp $ tar tf heroku-client.tgz 
heroku-client/
heroku-client/bin/
heroku-client/data/
heroku-client/lib/

... snip ...

heroku-client/lib/heroku/client/heroku_postgresql.rb
heroku-client/lib/heroku/client/pgbackups.rb
heroku-client/lib/heroku/client/rendezvous.rb
heroku-client/lib/heroku/client/ssl_endpoint.rb
heroku-client/data/cacert.pem
heroku-client/bin/heroku
bikle@z5 /tmp $ 


bikle@z5 /tmp $ 
bikle@z5 /tmp $ 
bikle@z5 /tmp $ cd ~bikle
bikle@z5 ~ $ 
bikle@z5 ~ $ 
bikle@z5 ~ $ tar xvf /tmp/heroku-client.tgz
heroku-client/
heroku-client/bin/
heroku-client/data/
heroku-client/lib/

... snip ...

heroku-client/lib/heroku/client/heroku_postgresql.rb
heroku-client/lib/heroku/client/pgbackups.rb
heroku-client/lib/heroku/client/rendezvous.rb
heroku-client/lib/heroku/client/ssl_endpoint.rb
heroku-client/data/cacert.pem
heroku-client/bin/heroku
bikle@z5 ~ $ 
bikle@z5 ~ $ 
bikle@z5 ~ $ 

vi .bashrc

Place ~/heroku-client/bin
at the beginning of PATH like this:

export PATH="${HOME}/heroku-client/bin:${PATH}"


bikle@z5 ~ $ 
bikle@z5 ~ $ which heroku
/usr/bin/which: no heroku in (/home/bikle/.rbenv/bin:/home/bikle/.rbenv/shims:/home/bikle/.rbenv/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/bikle/bin:/sbin:.:/home/bikle/bin:/home/bikle/bin:/sbin:.)
bikle@z5 ~ $ 
bikle@z5 ~ $ 
bikle@z5 ~ $ bash
bikle@z5 ~ $ 
bikle@z5 ~ $ which heroku
~/heroku-client/bin/heroku
bikle@z5 ~ $ 
bikle@z5 ~ $ 


bikle@z5 ~ $ heroku
Usage: heroku COMMAND [--app APP] [command-specific-options]

Primary help topics, type "heroku help TOPIC" for more details:

  addons    #  manage addon resources
  apps      #  manage apps (create, destroy)
  auth      #  authentication (login, logout)
  config    #  manage app config vars
  domains   #  manage custom domains
  logs      #  display logs for an app
  ps        #  manage dynos (dynos, workers)
  releases  #  manage app releases
  run       #  run one-off commands (console, rake)
  sharing   #  manage collaborators on an app

Additional topics:

  account      #  manage heroku account options
  certs        #  manage ssl endpoints for an app
  db           #  manage the database for an app
  drains       #  display syslog drains for an app
  fork         #  clone an existing app
  git          #  manage git for apps
  help         #  list commands and display help
  keys         #  manage authentication keys
  labs         #  manage optional features
  maintenance  #  manage maintenance mode for an app
  pg           #  manage heroku-postgresql databases
  pgbackups    #  manage backups of heroku postgresql databases
  plugins      #  manage plugins to the heroku gem
  regions      #  list available regions
  stack        #  manage the stack for an app
  status       #  check status of heroku platform
  update       #  update the heroku client
  version      #  display version

bikle@z5 ~ $ 
bikle@z5 ~ $ 
bikle@z5 ~ $ heroku version
heroku-toolbelt/2.39.4 (x86_64-linux) ruby/2.0.0
bikle@z5 ~ $ 
bikle@z5 ~ $ 


bikle@z5 ~ $ 
bikle@z5 ~ $ 
bikle@z5 ~ $ date
Fri Jul 19 20:41:40 UTC 2013
bikle@z5 ~ $ 
bikle@z5 ~ $ 




NOT IN to Left Outer Join


A demo of replacing a NOT IN SQL query with a left-outer-join.

Imagine I am hosting a BBQ two months from now.

I sent out e-mail invitations one month ago asking for RSVP.

Two of the five invitees replied with RSVP (one of them several times).

Now, I want to re-send invitations to invitees NOT IN the list of invitees who replied with RSVP.

I have this table structure:

CREATE TABLE friends (email VARCHAR2(22));

CREATE TABLE rsvps   (email VARCHAR2(22));

INSERT INTO friends VALUES ('alan@bikle.com');
INSERT INTO friends VALUES ('beth@bikle.com');
INSERT INTO friends VALUES ('catt@bikle.com');
INSERT INTO friends VALUES ('dani@bikle.com');
INSERT INTO friends VALUES ('eric@bikle.com');

INSERT INTO rsvps   VALUES ('beth@bikle.com');
INSERT INTO rsvps   VALUES ('dani@bikle.com');
INSERT INTO rsvps   VALUES ('dani@bikle.com');
INSERT INTO rsvps   VALUES ('dani@bikle.com');
INSERT INTO rsvps   VALUES ('dani@bikle.com');

This SQL will give me a list of invitees NOT IN the list of invitees who replied with RSVP:


03:35:23 SQL> 
03:35:24 SQL> SELECT email FROM friends
03:35:26   2  WHERE email NOT IN (SELECT email FROM rsvps);

EMAIL
----------------------
eric@bikle.com
catt@bikle.com
alan@bikle.com

03:35:35 SQL> 
03:35:44 SQL> 


How do I replace the above query with a left-outer-join
(which may be more performant)?

ref: http://www.w3schools.com/sql/sql_join_left.asp

Run this query:

03:35:44 SQL> SELECT f.email email_f, r.email email_r
03:37:58   2  FROM friends f
03:38:06   3  LEFT OUTER JOIN rsvps r
03:38:13   4  ON f.email = r.email;

EMAIL_F                EMAIL_R
---------------------- ----------------------
beth@bikle.com         beth@bikle.com
dani@bikle.com         dani@bikle.com
dani@bikle.com         dani@bikle.com
dani@bikle.com         dani@bikle.com
dani@bikle.com         dani@bikle.com
eric@bikle.com
catt@bikle.com
alan@bikle.com

8 rows selected.

Now I can see that this left-outer-join can replace the NOT IN query:

03:39:34 SQL> 
03:39:35 SQL> SELECT f.email
03:39:36   2  FROM friends f
03:39:42   3  LEFT OUTER JOIN rsvps r
03:39:47   4  ON f.email = r.email
03:39:54   5  WHERE r.email IS NULL;

EMAIL
----------------------
eric@bikle.com
catt@bikle.com
alan@bikle.com

03:40:00 SQL> 
03:40:01 SQL> 

If performance is not an issue, 
I would use the NOT IN query because it is easier to understand.

For some types of databases though,
a left-outer-join could be made faster via the use of indexes
and SQL-hints.
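
The two queries are interchangeable on this data, but they part ways once the rsvps
table holds a NULL email. Here is a Python sketch which shows both cases. It uses an
in-memory SQLite database as a stand-in for the Oracle session above (my assumption,
chosen so the sketch runs anywhere), with TEXT in place of VARCHAR2(22). The helper
name emails is mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE friends (email TEXT);
CREATE TABLE rsvps   (email TEXT);
INSERT INTO friends VALUES ('alan@bikle.com');
INSERT INTO friends VALUES ('beth@bikle.com');
INSERT INTO friends VALUES ('catt@bikle.com');
INSERT INTO friends VALUES ('dani@bikle.com');
INSERT INTO friends VALUES ('eric@bikle.com');
INSERT INTO rsvps   VALUES ('beth@bikle.com');
INSERT INTO rsvps   VALUES ('dani@bikle.com');
INSERT INTO rsvps   VALUES ('dani@bikle.com');
INSERT INTO rsvps   VALUES ('dani@bikle.com');
INSERT INTO rsvps   VALUES ('dani@bikle.com');
""")

NOT_IN = """
SELECT email FROM friends
WHERE email NOT IN (SELECT email FROM rsvps)
ORDER BY email
"""

LEFT_JOIN = """
SELECT f.email
FROM friends f
LEFT OUTER JOIN rsvps r
ON f.email = r.email
WHERE r.email IS NULL
ORDER BY f.email
"""


def emails(sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]


# Case 1: the data from the demo above. Both queries agree.
not_in_rows = emails(NOT_IN)
join_rows = emails(LEFT_JOIN)

# Case 2: one RSVP row with a NULL email.
# NOT IN now compares against NULL, which is never true, so no rows come back.
conn.execute("INSERT INTO rsvps VALUES (NULL)")
not_in_rows_with_null = emails(NOT_IN)
join_rows_with_null = emails(LEFT_JOIN)

if __name__ == "__main__":
    print(not_in_rows, join_rows)
    print(not_in_rows_with_null, join_rows_with_null)
```

So besides performance, the left-outer-join has the advantage of still returning
the three friends when a NULL sneaks into rsvps.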




Install MADlib on Postgres 9.2 on CentOS 6.4


This is a mashup of a checklist and screendump.

I use it as a guide to:
Install MADlib on Postgres 9.2 on CentOS 6.4

I started by finding my credit card.

Then I used my credit card to signup for a server at:
https://www.digitalocean.com/registrations/new

I created a server with 2GB of RAM:
https://www.digitalocean.com/pricing

When it asked for the type of image I wanted,
I selected: "CentOS 6.4 x64"

After 60 seconds, the server was ready.

Then, they mailed me the IP address and root password.

I logged in as root.

I ensured my copy of CentOS was up to date:

[root@madlib12 ~]# 
[root@madlib12 ~]# 
[root@madlib12 ~]# yum -y upgrade
yum -y upgrade
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirror.san.fastserv.com
 * extras: mirrors.sonic.net
 * updates: mirrors.usc.edu
base                                                     | 3.7 kB     00:00     
extras                                                   | 3.4 kB     00:00     
updates                                                  | 3.4 kB     00:00     
updates/primary_db                                       | 3.8 MB     00:00     
Setting up Upgrade Process
No Packages marked for Update
[root@madlib12 ~]# 
[root@madlib12 ~]# 
[root@madlib12 ~]# 

I studied this URL:
http://www.google.com/search?q=how+I+install+postgres+9+on+centos

I studied this URL:
http://wiki.postgresql.org/wiki/YUM_Installation

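I told yum to exclude the stock postgresql packages in the CentOS repo file.
Note that appending with echo places the exclude line in the last section of
CentOS-Base.repo; the wiki page above suggests adding it under [base] and [updates]: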
[root@madlib12 yum.repos.d]# 
[root@madlib12 yum.repos.d]# 
[root@madlib12 yum.repos.d]# cd /etc/yum.repos.d
[root@madlib12 yum.repos.d]# ls -la
total 24
drwxr-xr-x  2 root root 4096 Mar  9 14:56 .
drwxr-xr-x 60 root root 4096 Jul 31 20:46 ..
-rw-r--r--  1 root root 1926 Feb 25 08:57 CentOS-Base.repo
-rw-r--r--  1 root root  638 Feb 25 08:57 CentOS-Debuginfo.repo
-rw-r--r--  1 root root  630 Feb 25 08:57 CentOS-Media.repo
-rw-r--r--  1 root root 3664 Feb 25 08:57 CentOS-Vault.repo
[root@madlib12 yum.repos.d]# 
[root@madlib12 yum.repos.d]# grep postgresql CentOS-Base.repo
[root@madlib12 yum.repos.d]# 
[root@madlib12 yum.repos.d]# echo 'exclude=postgresql*' >> CentOS-Base.repo
[root@madlib12 yum.repos.d]# 
[root@madlib12 yum.repos.d]# 


I downloaded and installed the PGDG repository RPM for Postgres 9.2:
[root@madlib12 yum.repos.d]# 
[root@madlib12 yum.repos.d]# 
[root@madlib12 yum.repos.d]# cd /tmp/
[root@madlib12 tmp]# wget http://yum.postgresql.org/9.2/redhat/rhel-6-x86_64/pgdg-centos92-9.2-6.noarch.rpm
--2013-07-31 21:02:59--  http://yum.postgresql.org/9.2/redhat/rhel-6-x86_64/pgdg-centos92-9.2-6.noarch.rpm
Resolving yum.postgresql.org... 98.129.198.114, 2001:4800:7903:3::114
Connecting to yum.postgresql.org|98.129.198.114|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5288 (5.2K) [application/x-redhat-package-manager]
Saving to: “pgdg-centos92-9.2-6.noarch.rpm”

100%[======================================>] 5,288       --.-K/s   in 0s      

2013-07-31 21:02:59 (188 MB/s) - “pgdg-centos92-9.2-6.noarch.rpm” saved [5288/5288]

[root@madlib12 tmp]# rpm -ivh pgdg-centos92-9.2-6.noarch.rpm
warning: pgdg-centos92-9.2-6.noarch.rpm: Header V4 DSA/SHA1 Signature, key ID 442df0f8: NOKEY
Preparing...                ########################################### [100%]
   1:pgdg-centos92          ########################################### [100%]
[root@madlib12 tmp]# 
[root@madlib12 tmp]# 
[root@madlib12 tmp]# 

I installed the Postgres 9.2 server via yum:
[root@madlib12 tmp]# 
[root@madlib12 tmp]# 
[root@madlib12 tmp]# yum install postgresql92-server.x86_64
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: centos.mirrors.hoobly.com
 * extras: centos.mirror.freedomvoice.com
 * updates: mirror.linux.duke.edu
base                                                     | 3.7 kB     00:00     
extras                                                   | 3.4 kB     00:00     
pgdg92                                                   | 2.8 kB     00:00     
pgdg92/primary_db                                        | 106 kB     00:04     
updates                                                  | 3.4 kB     00:00     
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package postgresql92-server.x86_64 0:9.2.4-1PGDG.rhel6 will be installed
--> Processing Dependency: postgresql92 = 9.2.4-1PGDG.rhel6 for package: postgresql92-server-9.2.4-1PGDG.rhel6.x86_64
--> Processing Dependency: libpq.so.5()(64bit) for package: postgresql92-server-9.2.4-1PGDG.rhel6.x86_64
--> Running transaction check
---> Package postgresql92.x86_64 0:9.2.4-1PGDG.rhel6 will be installed
---> Package postgresql92-libs.x86_64 0:9.2.4-1PGDG.rhel6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package                  Arch        Version                 Repository   Size
================================================================================
Installing:
 postgresql92-server      x86_64      9.2.4-1PGDG.rhel6       pgdg92      3.8 M
Installing for dependencies:
 postgresql92             x86_64      9.2.4-1PGDG.rhel6       pgdg92      970 k
 postgresql92-libs        x86_64      9.2.4-1PGDG.rhel6       pgdg92      185 k

Transaction Summary
================================================================================
Install       3 Package(s)

Total download size: 4.9 M
Installed size: 21 M
Is this ok [y/N]: y
Downloading Packages:
(1/3): postgresql92-9.2.4-1PGDG.rhel6.x86_64.rpm         | 970 kB     00:42     
(2/3): postgresql92-libs-9.2.4-1PGDG.rhel6.x86_64.rpm    | 185 kB     00:05     
(3/3): postgresql92-server-9.2.4-1PGDG.rhel6.x86_64.rpm  | 3.8 MB     00:31     
--------------------------------------------------------------------------------
Total                                            63 kB/s | 4.9 MB     01:20     
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Warning: RPMDB altered outside of yum.
  Installing : postgresql92-libs-9.2.4-1PGDG.rhel6.x86_64                   1/3 
  Installing : postgresql92-9.2.4-1PGDG.rhel6.x86_64                        2/3 
  Installing : postgresql92-server-9.2.4-1PGDG.rhel6.x86_64                 3/3 
  Verifying  : postgresql92-server-9.2.4-1PGDG.rhel6.x86_64                 1/3 
  Verifying  : postgresql92-9.2.4-1PGDG.rhel6.x86_64                        2/3 
  Verifying  : postgresql92-libs-9.2.4-1PGDG.rhel6.x86_64                   3/3 

Installed:
  postgresql92-server.x86_64 0:9.2.4-1PGDG.rhel6                                

Dependency Installed:
  postgresql92.x86_64 0:9.2.4-1PGDG.rhel6                                       
  postgresql92-libs.x86_64 0:9.2.4-1PGDG.rhel6                                  

Complete!
[root@madlib12 tmp]# 
[root@madlib12 tmp]# 
[root@madlib12 tmp]# 


I initialized my new Postgres Server:
[root@madlib12 tmp]# 
[root@madlib12 tmp]# 
[root@madlib12 tmp]# cd /etc/init.d
[root@madlib12 init.d]# ls -la postgr*
-rwxr-xr-x 1 root root 9309 Apr  1 23:41 postgresql-9.2
[root@madlib12 init.d]# 

[root@madlib12 init.d]# /etc/init.d/postgresql-9.2 init
Usage: /etc/init.d/postgresql-9.2 {start|stop|status|restart|upgrade|condrestart|try-restart|reload|force-reload|initdb}
[root@madlib12 init.d]# /etc/init.d/postgresql-9.2 initdb
Initializing database: [  OK  ]
[root@madlib12 init.d]# 
[root@madlib12 init.d]# 
[root@madlib12 init.d]# 


I started my new Postgres Server:
[root@madlib12 init.d]# 
[root@madlib12 init.d]# 
[root@madlib12 init.d]# /etc/init.d/postgresql-9.2 start
Starting postgresql-9.2 service: [  OK  ]
[root@madlib12 init.d]# ps -ef|grep post
root      1014     1  0 20:37 ?        00:00:00 /usr/libexec/postfix/master
postfix   1024  1014  0 20:37 ?        00:00:00 pickup -l -t fifo -u
postfix   1025  1014  0 20:37 ?        00:00:00 qmgr -l -t fifo -u
postgres  1330     1  0 21:13 ?        00:00:00 /usr/pgsql-9.2/bin/postmaster -p 5432 -D /var/lib/pgsql/9.2/data
postgres  1332  1330  0 21:13 ?        00:00:00 postgres: logger process                                        
postgres  1334  1330  0 21:13 ?        00:00:00 postgres: checkpointer process                                  
postgres  1335  1330  0 21:13 ?        00:00:00 postgres: writer process                                        
postgres  1336  1330  0 21:13 ?        00:00:00 postgres: wal writer process                                    
postgres  1337  1330  0 21:13 ?        00:00:00 postgres: autovacuum launcher process                           
postgres  1338  1330  0 21:13 ?        00:00:00 postgres: stats collector process                               
root      1343  1059  0 21:13 pts/0    00:00:00 grep post
[root@madlib12 init.d]# 
[root@madlib12 init.d]# 
[root@madlib12 init.d]# 
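
The grep above also matched the postfix processes and the grep command itself. A minimal sketch of a narrower filter, assuming the ps -ef column layout shown above (owner in column 1). The helper name pg_procs is my own and not part of PostgreSQL:

```shell
# Keep only the lines of `ps -ef` output whose owner (column 1) is postgres.
# pg_procs is a hypothetical helper name.
pg_procs() {
  awk '$1 == "postgres"'
}

# Usage:
#   ps -ef | pg_procs
```

Matching on the owner column rather than on the substring "post" is what keeps postfix and grep out of the result.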


I installed plpython:
[root@madlib12 init.d]# 
[root@madlib12 init.d]# 
[root@madlib12 init.d]# yum search plpython
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: centos.mirrors.hoobly.com
 * extras: centos.mirror.freedomvoice.com
 * updates: mirror.san.fastserv.com
============================ N/S Matched: plpython =============================
postgresql-plpython.x86_64 : The Python procedural language for PostgreSQL
postgresql92-plpython.x86_64 : The Python procedural language for PostgreSQL

  Name and summary matches only, use "search all" for everything.
[root@madlib12 init.d]# 
[root@madlib12 init.d]# 
[root@madlib12 init.d]# 

[root@madlib12 init.d]# 
[root@madlib12 init.d]# 
[root@madlib12 init.d]# yum install postgresql92-plpython.x86_64
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: centos.mirrors.hoobly.com
 * extras: centos.mirror.freedomvoice.com
 * updates: mirror.san.fastserv.com
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package postgresql92-plpython.x86_64 0:9.2.4-1PGDG.rhel6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package                   Arch       Version                  Repository  Size
================================================================================
Installing:
 postgresql92-plpython     x86_64     9.2.4-1PGDG.rhel6        pgdg92      64 k

Transaction Summary
================================================================================
Install       1 Package(s)

Total download size: 64 k
Installed size: 233 k
Is this ok [y/N]: y
Downloading Packages:
postgresql92-plpython-9.2.4-1PGDG.rhel6.x86_64.rpm       |  64 kB     00:00     
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : postgresql92-plpython-9.2.4-1PGDG.rhel6.x86_64               1/1 
  Verifying  : postgresql92-plpython-9.2.4-1PGDG.rhel6.x86_64               1/1 

Installed:
  postgresql92-plpython.x86_64 0:9.2.4-1PGDG.rhel6                              

Complete!
[root@madlib12 init.d]# 
[root@madlib12 init.d]# 
[root@madlib12 init.d]# 

I gave a password to the postgres-linux account and then logged into it:
[root@madlib12 init.d]# 
[root@madlib12 init.d]# 
[root@madlib12 init.d]# grep postg /etc/passwd
postgres:x:26:26:PostgreSQL Server:/var/lib/pgsql:/bin/bash
[root@madlib12 init.d]# passwd postgres
Changing password for user postgres.
New password: 
Retype new password: 
passwd: all authentication tokens updated successfully.
[root@madlib12 init.d]# 
[root@madlib12 init.d]# 


[root@madlib12 init.d]# 
[root@madlib12 init.d]# 
[root@madlib12 init.d]# ssh postgres@localhost
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is c9:f5:8d:5a:6b:f7:e8:6a:a4:62:00:38:d3:d3:e8:ec.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
postgres@localhost's password: 
-bash-4.1$ 
-bash-4.1$ pwd
/var/lib/pgsql
-bash-4.1$ 
-bash-4.1$ 


-bash-4.1$ 
-bash-4.1$ 
-bash-4.1$ ls -la
total 16
drwx------  3 postgres postgres 4096 Jul 31 21:07 .
drwxr-xr-x 16 root     root     4096 Jul 31 21:07 ..
drwx------  4 postgres postgres 4096 Jul 31 21:12 9.2
-rw-r--r--  1 postgres postgres   88 Jul 31 21:07 .bash_profile
-bash-4.1$ 

-bash-4.1$ cd 9.2
-bash-4.1$ 

-bash-4.1$ ls
backups  data  pgstartup.log
-bash-4.1$ 
-bash-4.1$ cd data
-bash-4.1$ 

-bash-4.1$ ls -la
total 104
drwx------ 15 postgres postgres  4096 Jul 31 21:13 .
drwx------  4 postgres postgres  4096 Jul 31 21:12 ..
drwx------  5 postgres postgres  4096 Jul 31 21:12 base
drwx------  2 postgres postgres  4096 Jul 31 21:13 global
drwx------  2 postgres postgres  4096 Jul 31 21:12 pg_clog
-rw-------  1 postgres postgres  4232 Jul 31 21:12 pg_hba.conf
-rw-------  1 postgres postgres  1636 Jul 31 21:12 pg_ident.conf
drwx------  2 postgres postgres  4096 Jul 31 21:13 pg_log
drwx------  4 postgres postgres  4096 Jul 31 21:12 pg_multixact
drwx------  2 postgres postgres  4096 Jul 31 21:13 pg_notify
drwx------  2 postgres postgres  4096 Jul 31 21:12 pg_serial
drwx------  2 postgres postgres  4096 Jul 31 21:12 pg_snapshots
drwx------  2 postgres postgres  4096 Jul 31 21:25 pg_stat_tmp
drwx------  2 postgres postgres  4096 Jul 31 21:12 pg_subtrans
drwx------  2 postgres postgres  4096 Jul 31 21:12 pg_tblspc
drwx------  2 postgres postgres  4096 Jul 31 21:12 pg_twophase
-rw-------  1 postgres postgres     4 Jul 31 21:12 PG_VERSION
drwx------  3 postgres postgres  4096 Jul 31 21:12 pg_xlog
-rw-------  1 postgres postgres 19587 Jul 31 21:12 postgresql.conf
-rw-------  1 postgres postgres    71 Jul 31 21:13 postmaster.opts
-rw-------  1 postgres postgres    80 Jul 31 21:13 postmaster.pid
-bash-4.1$ 
-bash-4.1$ 

I created a database and a superuser role, both named 'madlib':

oracle@z3:~$ ssh postgres@madlib12
postgres@madlib12's password: 
Last login: Wed Jul 31 21:25:31 2013 from localhost
-bash-4.1$ 
-bash-4.1$ createdb madlib
-bash-4.1$ 
-bash-4.1$ psql
psql (9.2.4)
Type "help" for help.

postgres=# CREATE USER madlib SUPERUSER;
CREATE ROLE
postgres=# ALTER USER madlib PASSWORD 'madlib';
ALTER ROLE
postgres=# 
postgres=# \q
-bash-4.1$ 
-bash-4.1$ 

I edited pg_hba.conf so that TCP connections from 127.0.0.1 authenticate with an md5 password instead of ident.

before:
  host    all             all             127.0.0.1/32            ident

after:
  host    all             all             127.0.0.1/32            md5
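
The same one-line change can be scripted. This is only a sketch, assuming the line looks exactly like the "before" line above. The helper name hba_ident_to_md5 is hypothetical, and it works on stdin/stdout so the real file is only replaced when you redirect into it:

```shell
# Switch the auth method on the IPv4 loopback "host all all" line
# from ident to md5. Reads pg_hba.conf text on stdin, writes the
# edited text to stdout. All other lines pass through unchanged.
# hba_ident_to_md5 is a hypothetical helper name.
hba_ident_to_md5() {
  sed -E 's#^([[:space:]]*host[[:space:]]+all[[:space:]]+all[[:space:]]+127\.0\.0\.1/32[[:space:]]+)ident[[:space:]]*$#\1md5#'
}

# Usage (from the postgres account, data directory as listed earlier):
#   cd /var/lib/pgsql/9.2/data
#   cp pg_hba.conf pg_hba.conf.bak
#   hba_ident_to_md5 < pg_hba.conf.bak > pg_hba.conf
```

Keeping a .bak copy makes it easy to diff the two files before the server is reloaded.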



I reloaded the server to make the change active:
[root@madlib12 init.d]# 
[root@madlib12 init.d]# 
[root@madlib12 init.d]# /etc/init.d/postgresql-9.2 reload
[root@madlib12 init.d]# 
[root@madlib12 init.d]# 


I installed some MADlib software:
[root@madlib12 init.d]# 
[root@madlib12 init.d]# 
[root@madlib12 init.d]# cd /tmp
[root@madlib12 tmp]# wget http://www.madlib.net/files/madlib-1.0-Linux.rpm
--2013-07-31 21:39:14--  http://www.madlib.net/files/madlib-1.0-Linux.rpm
Resolving www.madlib.net... 74.50.54.170
Connecting to www.madlib.net|74.50.54.170|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 63691848 (61M) [application/x-rpm]
Saving to: “madlib-1.0-Linux.rpm”

100%[======================================>] 63,691,848  11.1M/s   in 5.9s    

2013-07-31 21:39:21 (10.2 MB/s) - “madlib-1.0-Linux.rpm” saved [63691848/63691848]

[root@madlib12 tmp]# 
[root@madlib12 tmp]# 
[root@madlib12 tmp]# yum install madlib-1.0-Linux.rpm --nogpgcheck
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: centos.mirror.freedomvoice.com
 * extras: centos.mirror.freedomvoice.com
 * updates: mirror.san.fastserv.com
Setting up Install Process
Examining madlib-1.0-Linux.rpm: madlib-1.0-1.x86_64
Marking madlib-1.0-Linux.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package madlib.x86_64 0:1.0-1 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package        Arch           Version          Repository                 Size
================================================================================
Installing:
 madlib         x86_64         1.0-1            /madlib-1.0-Linux         290 M

Transaction Summary
================================================================================
Install       1 Package(s)

Total size: 290 M
Installed size: 290 M
Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : madlib-1.0-1.x86_64                                          1/1 
  Verifying  : madlib-1.0-1.x86_64                                          1/1 

Installed:
  madlib.x86_64 0:1.0-1                                                         

Complete!
[root@madlib12 tmp]# 
[root@madlib12 tmp]# 

From the postgres-linux account,
I tested that I could connect to the madlib database as the madlib user:
-bash-4.1$ 
-bash-4.1$ 
-bash-4.1$ id
uid=26(postgres) gid=26(postgres) groups=26(postgres)
-bash-4.1$ 
-bash-4.1$ psql -d madlib -h 127.0.0.1 -U madlib
Password for user madlib: 
psql (9.2.4)
Type "help" for help.

madlib=# CREATE TABLE dropme (myd DATE);
CREATE TABLE
madlib=# DROP TABLE dropme;
DROP TABLE
madlib=# 
madlib=# \q
-bash-4.1$ 
-bash-4.1$ 


I studied this URL:
https://github.com/madlib/madlib/wiki/Installation-Guide


From the postgres-linux account, I ran a shell command:
-bash-4.1$ 
-bash-4.1$ id
uid=26(postgres) gid=26(postgres) groups=26(postgres)
-bash-4.1$ 
-bash-4.1$ /usr/local/madlib/bin/madpack -p postgres -c madlib@127.0.0.1/madlib install
Password for user madlib: 
madpack.py : INFO : Detected PostgreSQL version 9.2.
madpack.py : INFO : *** Installing MADlib ***
madpack.py : INFO : MADlib tools version    = 1.0 (/usr/local/madlib/Versions/1.0/bin/../madpack/madpack.py)
madpack.py : INFO : MADlib database version = None (host=127.0.0.1:5432, db=madlib, schema=madlib)
madpack.py : INFO : Testing PL/Python environment...
madpack.py : INFO : > Creating language PL/Python...
madpack.py : INFO : > PL/Python environment OK (version: 2.6.6)
madpack.py : INFO : Installing MADlib into MADLIB schema...
madpack.py : INFO : > Creating MADLIB schema
madpack.py : INFO : > Creating MADLIB.MigrationHistory table
madpack.py : INFO : > Writing version info in MigrationHistory table
madpack.py : INFO : > Creating objects for modules:
madpack.py : INFO : > - array_ops
madpack.py : INFO : > - bayes
madpack.py : INFO : > - cart
madpack.py : INFO : > - linalg
madpack.py : INFO : > - lda
madpack.py : INFO : > - prob
madpack.py : INFO : > - quantile
madpack.py : INFO : > - sketch
madpack.py : INFO : > - stats
madpack.py : INFO : > - svd_mf
madpack.py : INFO : > - svec
madpack.py : INFO : > - viterbi
madpack.py : INFO : > - validation
madpack.py : INFO : > - elastic_net
madpack.py : INFO : > - summary
madpack.py : INFO : > - assoc_rules
madpack.py : INFO : > - conjugate_gradient
madpack.py : INFO : > - data_profile
madpack.py : INFO : > - kernel_machines
madpack.py : INFO : > - utilities
madpack.py : INFO : > - crf
madpack.py : INFO : > - compatibility
madpack.py : INFO : > - convex
madpack.py : INFO : > - regress
madpack.py : INFO : > - sample
madpack.py : INFO : > - kmeans
madpack.py : INFO : MADlib 1.0 installed successfully in MADLIB schema.
-bash-4.1$ 
-bash-4.1$ 
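
The -c argument in the madpack command above packs the user, host, and database into one string (madlib@127.0.0.1/madlib). A small sketch that assembles that string from separate values. The function name is hypothetical, and only the user@host/database form seen above is assumed:

```shell
# Build a madpack connection string of the form user@host/database,
# the same shape used in the install command above.
# madpack_conn is a hypothetical helper name.
madpack_conn() {
  printf '%s@%s/%s\n' "$1" "$2" "$3"
}

# Usage:
#   /usr/local/madlib/bin/madpack -p postgres \
#     -c "$(madpack_conn madlib 127.0.0.1 madlib)" install
```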

Next, I started the task of building a simple MADlib demo.




Install Postgres 9.2.4 on Ubuntu 12.04.2


Checklist/Screendump for installing Postgres 9.2 on Ubuntu 12.04.2 Linux

I started by installing this image in my VirtualBox environment:

http://www.ubuntu.com/start-download?distro=desktop&bits=64&release=lts

Then I logged in as root and ran these shell commands:

apt-get update
apt-get upgrade
cd /etc/apt/sources.list.d/
echo 'deb http://apt.postgresql.org/pub/repos/apt/ precise-pgdg main' > pgdg.list
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc |  apt-key add -
apt-get update 
apt-get install postgresql-9.2

I studied the output from the last shell command.

I saw indications that it did these tasks:
  - install postgresql-9.2 software
  - create the postgres linux account
  - initialize the postgres server
  - start the postgres server
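
The pgdg.list line in the commands above hard-codes the release codename (precise, for Ubuntu 12.04). A sketch that builds the same line for whatever codename is passed in. The function name is hypothetical; the URL and the "-pgdg main" suffix are copied from the echo command above:

```shell
# Print the apt source line for the PostgreSQL apt repository
# for a given Ubuntu codename, e.g. "precise" for 12.04.
# pgdg_source_line is a hypothetical helper name.
pgdg_source_line() {
  printf 'deb http://apt.postgresql.org/pub/repos/apt/ %s-pgdg main\n' "$1"
}

# Usage (as root); lsb_release -cs prints the codename on Ubuntu:
#   pgdg_source_line "$(lsb_release -cs)" > /etc/apt/sources.list.d/pgdg.list
```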

When I used yum to install postgresql-9.2 on a CentOS 6.4 environment,
yum only installed the software and created the postgres linux account.
Yum did not initialize or start the postgres server.

So apt-get on Ubuntu did a more complete installation
than what I observed with yum on CentOS 6.4.

After I ran
apt-get install postgresql-9.2
on Ubuntu, I looked for the postgres processes:

root@ub94 /home/bikle # 
root@ub94 /home/bikle # 
root@ub94 /home/bikle # ps -ef | grep post
postgres  3788     1  0 17:35 ?        00:00:00 /usr/lib/postgresql/9.2/bin/postgres -D /var/lib/postgresql/9.2/main -c config_file=/etc/postgresql/9.2/main/postgresql.conf
postgres  3790  3788  0 17:35 ?        00:00:00 postgres: checkpointer process                                                                                              
postgres  3791  3788  0 17:35 ?        00:00:00 postgres: writer process                                                                                                    
postgres  3792  3788  0 17:35 ?        00:00:00 postgres: wal writer process                                                                                                
postgres  3793  3788  0 17:35 ?        00:00:00 postgres: autovacuum launcher process                                                                                       
postgres  3794  3788  0 17:35 ?        00:00:00 postgres: stats collector process                                                                                           
root      4143  3853  0 18:51 pts/6    00:00:00 grep post
root@ub94 /home/bikle # 
root@ub94 /home/bikle # 
root@ub94 /home/bikle # 

Next, I logged into the postgres-linux account and 
tried connecting to the postgres-database using the psql shell command:


root@ub94 /home/bikle # 
root@ub94 /home/bikle # 
root@ub94 /home/bikle # passwd postgres
Enter new UNIX password: 
Retype new UNIX password: 
passwd: password updated successfully
root@ub94 /home/bikle # 
root@ub94 /home/bikle # 
root@ub94 /home/bikle # 


root@ub94 /home/bikle # 
root@ub94 /home/bikle # 
root@ub94 /home/bikle # ssh postgres@localhost
The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is 4d:15:17:a7:5f:ce:c3:75:cc:d2:15:df:e5:11:8f:eb.
Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
postgres@localhost's password: 
Welcome to Ubuntu 12.04.2 LTS (GNU/Linux 3.5.0-23-generic x86_64)

 * Documentation:  https://help.ubuntu.com/

Last login: Sat Aug 10 17:43:45 2013 from localhost
postgres@ub94:~$ 
postgres@ub94:~$ 
postgres@ub94:~$ 
postgres@ub94:~$ id
uid=116(postgres) gid=125(postgres) groups=125(postgres),110(ssl-cert)
postgres@ub94:~$ 
postgres@ub94:~$ 
postgres@ub94:~$ psql
psql (9.2.4)
Type "help" for help.

postgres=# 
postgres=# 

At this point I was confident that PostgreSQL 9.2.4 
was installed correctly in my Ubuntu 12.04.2 environment.
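
The version can also be checked without opening a psql session. A minimal sketch, assuming psql --version prints a single line of the form "psql (PostgreSQL) 9.2.4" with the version in the third field. The helper name is hypothetical:

```shell
# Extract the version number from `psql --version` output.
# Assumes the form: psql (PostgreSQL) 9.2.4
# psql_version is a hypothetical helper name.
psql_version() {
  awk 'NR == 1 { print $3 }'
}

# Usage:
#   psql --version | psql_version
```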





Access Windows 8 BIOS on HP ENVY TS 15 Notebook PC


I just found a way to change the BIOS settings on my new HP laptop.

The laptop type is:
  HP ENVY TS 15 Notebook PC
  Product id: E0M24UA#ABA
  CPU: AMD A10-5750M APU with Radeon(tm) HD Graphics
  System BIOS: F.08

I bought it from Costco on 2013-08-27.

I started by hovering the mouse over the right corner of the screen.

I clicked Search
I clicked Settings
I searched for "BIOS"
I clicked Advanced Startup Options
I scrolled to the end 
I clicked Advanced Startup
I clicked Restart
I waited for the PC to shut down and restart
I clicked Troubleshoot

I noticed Reset PC (useful before returning a PC to Costco)
I clicked Advanced Options
I clicked UEFI Firmware Settings
I clicked Restart
I waited
I got the familiar black F-key-BIOS screen
I pressed F10 
I was in the familiar BIOS UI
I changed the BIOS so that Virtualization was Enabled
I saved and exited.

Another thing I could do in there is change UEFI so it supports Legacy OS.
Once Legacy OS is enabled, I could then try booting off a Linux CDROM.


