Sunday, 24 March 2013



Infographics 


Visual.ly is a one-stop shop for creating data visualizations and infographics, and it brings people together around shared interests.

The tool fetches data on your activities over a given period up to the present.

As part of this assignment, I looked at many of the top sites that help create a good infographic resume. There is a plethora of such sites, but I chose Visual.ly for a detailed study.

For those of us who want to design a resume that looks different from everyone else's, it helps a lot.
I went through the process and built a distinctive resume of my own.

The steps to follow to create a resume through Visual.ly:

1. Go to the following link: http://visual.ly/
2. Click on the Create option -> http://create.visual.ly/
3. I chose the resume by Kelly
4. I chose Helen Wheels, black gradient, to create my resume
5. I uploaded my details from my LinkedIn profile -> http://www.linkedin.com






Pros:

- Allows choosing between 4-5 themes.
- Options to tweet, share on FB, pin, and share on other social media sites.
- Provides options to download as PDF or mail to your email ID.
- Easy accessibility.
- Different gradient versions.
- Ease of data access, no need to edit/enter any data.

Cons:

- Doesn't allow playing around with the format of the resume.
- Fewer options to customise the graphics.
- Limited number of themes to select from.




Friday, 15 March 2013




                              IT Lab session 8



We will be doing panel data analysis of the "Produc" data.

We will be analysing three types of model:
      Pooled effects model
      Fixed effects model
      Random effects model

Then we will determine which model is best by using these functions:
       pFtest: for choosing between fixed and pooled
       plmtest: for choosing between pooled and random
       phtest: for choosing between random and fixed

Commands:

Loading the package and data:
> library(plm)
> data(Produc, package ="plm")
> head(Produc)



Pooled Effects Model:

> pool <- plm(log(pcap)~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp) , data =Produc, model=("pooling"), index = c("state","year"))
> summary(pool)


Fixed Effects Model:

> fixed <- plm(log(pcap)~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp) , data =Produc, model=("within"), index = c("state","year"))
> summary(fixed)




Random Effects Model:
> random <- plm(log(pcap)~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp) , data =Produc, model=("random"), index = c("state","year"))
> summary(random)


Comparison

The comparison between the models is a hypothesis test based on the following concept:

H0: Null hypothesis: the individual (index) and time-based parameters are all zero
H1: Alternative hypothesis: at least one of the individual (index) and time-based parameters is non-zero

Pooled vs Fixed

Null hypothesis: Pooled Effects Model
Alternative hypothesis: Fixed Effects Model

Command:
> pFtest(fixed,pool)
Result:
data:  log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) +      log(emp) + log(unemp) 
F = 56.6361, df1 = 47, df2 = 761, p-value < 2.2e-16
alternative hypothesis: significant effects 

Since the p-value is negligible, we reject the null hypothesis and accept the alternative hypothesis, i.e. the Fixed Effects Model.
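To see what this F test is doing, here is a minimal base-R sketch, assuming a small simulated panel rather than the "Produc" data. It compares pooled OLS against a regression with one dummy per unit (the least-squares dummy variable form of the fixed effects model) using anova(). The names panel, alpha_i, pooled and lsdv are made up for illustration, and this is only meant to mirror the idea behind pFtest, not to replace it.

```r
# Simulated panel: 6 units observed over 10 periods (illustrative data only)
set.seed(42)
n_id <- 6
n_t  <- 10
panel <- data.frame(
  id   = factor(rep(seq_len(n_id), each = n_t)),
  time = rep(seq_len(n_t), times = n_id)
)
alpha_i <- rnorm(n_id, sd = 3)                 # unit-specific intercepts
panel$x <- rnorm(n_id * n_t)
panel$y <- 1 + 2 * panel$x + alpha_i[as.integer(panel$id)] + rnorm(n_id * n_t)

pooled <- lm(y ~ x, data = panel)              # one common intercept
lsdv   <- lm(y ~ x + id, data = panel)         # one intercept per unit

# F test of H0: all unit intercepts are equal (pooled model is adequate)
f_test <- anova(pooled, lsdv)
print(f_test)
```

The numerator degrees of freedom equal the number of units minus one, which matches df1 = 47 for the 48 states in the result above.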



Pooled vs Random

Null hypothesis: Pooled Effects Model
Alternative hypothesis: Random Effects Model

Command :
> plmtest(pool)

Result:

        Lagrange Multiplier Test - (Honda)
data:  log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) +      log(emp) + log(unemp)
normal = 57.1686, p-value < 2.2e-16
alternative hypothesis: significant effects 


Since the p-value is negligible, we reject the null hypothesis and accept the alternative hypothesis, i.e. the Random Effects Model.



Random vs Fixed

Null hypothesis: no correlation between the individual effects and the regressors, i.e. Random Effects Model
Alternative hypothesis: Fixed Effects Model

Command:
> phtest(fixed,random)

Result:

        Hausman Test
data:  log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) +      log(emp) + log(unemp)
chisq = 93.546, df = 7, p-value < 2.2e-16
alternative hypothesis: one model is inconsistent 


Since the p-value is negligible, we reject the null hypothesis and accept the alternative hypothesis, i.e. the Fixed Effects Model.



Conclusion: 

After making all the comparisons, we conclude that the Fixed Effects Model is best suited to the panel data analysis of the "Produc" data set.

Hence, we conclude that each "state" has its own individual effect, which stays constant over time within that state.
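The three pairwise comparisons can be folded into one decision rule. The helper below is a hypothetical sketch (choose_panel_model is not part of plm) that takes the three p-values and a significance level and returns the model name in the same spelling plm uses for its model argument.

```r
# Hypothetical helper: combine the three test p-values into one choice.
# p_ftest   : pFtest  (pooled vs fixed)
# p_lmtest  : plmtest (pooled vs random)
# p_hausman : phtest  (random vs fixed)
choose_panel_model <- function(p_ftest, p_lmtest, p_hausman, alpha = 0.05) {
  fixed_beats_pooled  <- p_ftest  < alpha
  random_beats_pooled <- p_lmtest < alpha

  if (!fixed_beats_pooled && !random_beats_pooled) return("pooling")
  if (fixed_beats_pooled && !random_beats_pooled)  return("within")
  if (!fixed_beats_pooled && random_beats_pooled)  return("random")

  # Both beat pooled: the Hausman test decides between them
  if (p_hausman < alpha) "within" else "random"
}

# All three p-values reported above are below 2.2e-16
choice <- choose_panel_model(2.2e-16, 2.2e-16, 2.2e-16)
print(choice)
```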

Thursday, 14 February 2013



IT LAB ASSIGNMENT 6

Assignment 1: Find the historical volatility and log of returns data
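
A minimal sketch of the calculation, assuming a short vector of daily prices (the ten values below are NSE opening prices borrowed from the session 5 post purely for illustration, not the data used in this assignment). Log returns are the differences of log prices, and historical volatility is taken here as their standard deviation, annualised with 252 trading days.

```r
# Illustrative daily prices (assumption: not the assignment's actual data)
prices <- c(5242.75, 5232.35, 5228.05, 5199.10, 5249.85,
            5233.55, 5163.25, 5128.80, 5118.40, 5126.30)

log_ret    <- diff(log(prices))        # daily log returns
daily_vol  <- sd(log_ret)              # daily historical volatility
annual_vol <- daily_vol * sqrt(252)    # annualised, assuming 252 trading days

print(log_ret)
print(annual_vol)
```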





Assignment 2: Create ACF plot of log returns and do Augmented Dickey-Fuller test
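
A rough sketch of both steps, assuming a simulated price series in place of the assignment's data. acf() comes with base R; the Augmented Dickey-Fuller test is taken from the tseries package (adf.test), which is an assumption about the setup, so that call only runs if the package is installed.

```r
# Simulated prices (assumption: stands in for the real series)
set.seed(7)
prices  <- 5000 * exp(cumsum(rnorm(250, mean = 0, sd = 0.01)))
log_ret <- diff(log(prices))

# Autocorrelation function of the log returns, up to lag 20
acf_out <- acf(log_ret, lag.max = 20, plot = FALSE)
plot(acf_out, main = "ACF of log returns (simulated)")

# Augmented Dickey-Fuller test, only if tseries is available
if (requireNamespace("tseries", quietly = TRUE)) {
  print(tseries::adf.test(log_ret))
}
```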




Thursday, 7 February 2013


Assignment 5- session 5



Assignment1: To find and plot returns for NSE data of more than     months.

sol:

> z<-read.csv(file.choose(),header=T)
> head(z)
         Date    Open    High     Low   Close Shares.Traded Turnover..Rs..Cr.
1 02-Jul-2012 5283.85 5302.15 5263.35 5278.60      126161441          4991.57
2 03-Jul-2012 5298.85 5317.00 5265.95 5287.95      133117055          5161.82
3 04-Jul-2012 5310.40 5317.65 5273.30 5302.55      155995887          5750.10
4 05-Jul-2012 5297.05 5333.65 5288.85 5327.30      118915392          4709.79
5 06-Jul-2012 5324.70 5327.20 5287.75 5316.95      113300726          4760.51
6 09-Jul-2012 5283.70 5300.60 5257.75 5275.15      101169926          4189.25
> open<-z$Open[10:95]
> open.ts<-ts(open,deltat=1/252)
> open.ts
Time Series:
Start = c(1, 1)
End = c(1, 86)
Frequency = 252
 [1] 5242.75 5232.35 5228.05 5199.10 5249.85 5233.55 5163.25 5128.80 5118.40
[10] 5126.30 5124.30 5129.75 5214.85 5220.70 5233.10 5195.60 5260.85 5295.40
[19] 5345.25 5348.30 5308.20 5316.35 5343.25 5385.95 5368.60 5368.70 5395.75
[28] 5426.15 5392.60 5387.85 5348.05 5343.85 5268.60 5298.20 5276.50 5249.15
[37] 5243.90 5217.65 5309.45 5343.65 5361.90 5336.10 5404.45 5435.20 5528.35
[46] 5631.75 5602.40 5536.95 5577.00 5691.95 5674.90 5653.40 5673.75 5684.80
[55] 5704.75 5727.70 5751.55 5815.00 5751.85 5708.15 5671.15 5663.50 5681.70
[64] 5674.25 5705.60 5681.10 5675.30 5703.30 5667.60 5715.65 5688.80 5683.55
[73] 5665.20 5656.35 5596.75 5609.85 5696.35 5693.05 5694.10 5718.60 5709.00
[82] 5731.10 5688.45 5689.70 5650.35 5624.80
> summary(open.ts)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   5118    5281    5431    5474    5682    5815
> z.diff<-diff(open.ts)
> z.diff
Time Series:
Start = c(1, 2)
End = c(1, 86)
Frequency = 252
 [1] -10.40  -4.30 -28.95  50.75 -16.30 -70.30 -34.45 -10.40   7.90  -2.00
[11]   5.45  85.10   5.85  12.40 -37.50  65.25  34.55  49.85   3.05 -40.10
[21]   8.15  26.90  42.70 -17.35   0.10  27.05  30.40 -33.55  -4.75 -39.80
[31]  -4.20 -75.25  29.60 -21.70 -27.35  -5.25 -26.25  91.80  34.20  18.25
[41] -25.80  68.35  30.75  93.15 103.40 -29.35 -65.45  40.05 114.95 -17.05
[51] -21.50  20.35  11.05  19.95  22.95  23.85  63.45 -63.15 -43.70 -37.00
[61]  -7.65  18.20  -7.45  31.35 -24.50  -5.80  28.00 -35.70  48.05 -26.85
[71]  -5.25 -18.35  -8.85 -59.60  13.10  86.50  -3.30   1.05  24.50  -9.60
[81]  22.10 -42.65   1.25 -39.35 -25.55
> returns<-cbind(open.ts,z.diff,lag(open.ts,k=-1))
> returns
Time Series:
Start = c(1, 1)
End = c(1, 87)
Frequency = 252
         open.ts z.diff lag(open.ts, k = -1)
1.000000 5242.75     NA                   NA
1.003968 5232.35 -10.40              5242.75
1.007937 5228.05  -4.30              5232.35
1.011905 5199.10 -28.95              5228.05
1.015873 5249.85  50.75              5199.10
1.019841 5233.55 -16.30              5249.85
1.023810 5163.25 -70.30              5233.55
1.027778 5128.80 -34.45              5163.25
1.031746 5118.40 -10.40              5128.80
1.035714 5126.30   7.90              5118.40
1.039683 5124.30  -2.00              5126.30
1.043651 5129.75   5.45              5124.30
1.047619 5214.85  85.10              5129.75
1.051587 5220.70   5.85              5214.85
1.055556 5233.10  12.40              5220.70
1.059524 5195.60 -37.50              5233.10
1.063492 5260.85  65.25              5195.60
1.067460 5295.40  34.55              5260.85
1.071429 5345.25  49.85              5295.40
1.075397 5348.30   3.05              5345.25
1.079365 5308.20 -40.10              5348.30
1.083333 5316.35   8.15              5308.20
1.087302 5343.25  26.90              5316.35
1.091270 5385.95  42.70              5343.25
1.095238 5368.60 -17.35              5385.95
1.099206 5368.70   0.10              5368.60
1.103175 5395.75  27.05              5368.70
1.107143 5426.15  30.40              5395.75
1.111111 5392.60 -33.55              5426.15
1.115079 5387.85  -4.75              5392.60
1.119048 5348.05 -39.80              5387.85
1.123016 5343.85  -4.20              5348.05
1.126984 5268.60 -75.25              5343.85
1.130952 5298.20  29.60              5268.60
1.134921 5276.50 -21.70              5298.20
1.138889 5249.15 -27.35              5276.50
1.142857 5243.90  -5.25              5249.15
1.146825 5217.65 -26.25              5243.90
1.150794 5309.45  91.80              5217.65
1.154762 5343.65  34.20              5309.45
1.158730 5361.90  18.25              5343.65
1.162698 5336.10 -25.80              5361.90
1.166667 5404.45  68.35              5336.10
1.170635 5435.20  30.75              5404.45
1.174603 5528.35  93.15              5435.20
1.178571 5631.75 103.40              5528.35
1.182540 5602.40 -29.35              5631.75
1.186508 5536.95 -65.45              5602.40
1.190476 5577.00  40.05              5536.95
1.194444 5691.95 114.95              5577.00
1.198413 5674.90 -17.05              5691.95
1.202381 5653.40 -21.50              5674.90
1.206349 5673.75  20.35              5653.40
1.210317 5684.80  11.05              5673.75
1.214286 5704.75  19.95              5684.80
1.218254 5727.70  22.95              5704.75
1.222222 5751.55  23.85              5727.70
1.226190 5815.00  63.45              5751.55
1.230159 5751.85 -63.15              5815.00
1.234127 5708.15 -43.70              5751.85
1.238095 5671.15 -37.00              5708.15
1.242063 5663.50  -7.65              5671.15
1.246032 5681.70  18.20              5663.50
1.250000 5674.25  -7.45              5681.70
1.253968 5705.60  31.35              5674.25
1.257937 5681.10 -24.50              5705.60
1.261905 5675.30  -5.80              5681.10
1.265873 5703.30  28.00              5675.30
1.269841 5667.60 -35.70              5703.30
1.273810 5715.65  48.05              5667.60
1.277778 5688.80 -26.85              5715.65
1.281746 5683.55  -5.25              5688.80
1.285714 5665.20 -18.35              5683.55
1.289683 5656.35  -8.85              5665.20
1.293651 5596.75 -59.60              5656.35
1.297619 5609.85  13.10              5596.75
1.301587 5696.35  86.50              5609.85
1.305556 5693.05  -3.30              5696.35
1.309524 5694.10   1.05              5693.05
1.313492 5718.60  24.50              5694.10
1.317460 5709.00  -9.60              5718.60
1.321429 5731.10  22.10              5709.00
1.325397 5688.45 -42.65              5731.10
1.329365 5689.70   1.25              5688.45
1.333333 5650.35 -39.35              5689.70
1.337302 5624.80 -25.55              5650.35
1.341270      NA     NA              5624.80
> plot(returns)
> returns<-z.diff/lag(open.ts,k=-1)
> returns
Time Series:
Start = c(1, 2)
End = c(1, 86)
Frequency = 252
 [1] -1.983692e-03 -8.218105e-04 -5.537437e-03  9.761305e-03 -3.104851e-03
 [6] -1.343256e-02 -6.672154e-03 -2.027765e-03  1.543451e-03 -3.901449e-04
[11]  1.063560e-03  1.658950e-02  1.121796e-03  2.375160e-03 -7.165925e-03
[16]  1.255870e-02  6.567380e-03  9.413831e-03  5.706001e-04 -7.497710e-03
[21]  1.535360e-03  5.059862e-03  7.991391e-03 -3.221344e-03  1.862683e-05
[26]  5.038464e-03  5.634064e-03 -6.183021e-03 -8.808367e-04 -7.386991e-03
[31] -7.853330e-04 -1.408161e-02  5.618191e-03 -4.095731e-03 -5.183360e-03
[36] -1.000162e-03 -5.005816e-03  1.759413e-02  6.441345e-03  3.415269e-03
[41] -4.811727e-03  1.280898e-02  5.689756e-03  1.713828e-02  1.870359e-02
[46] -5.211524e-03 -1.168249e-02  7.233224e-03  2.061144e-02 -2.995458e-03
[51] -3.788613e-03  3.599604e-03  1.947566e-03  3.509358e-03  4.022963e-03
[56]  4.163975e-03  1.103181e-02 -1.085985e-02 -7.597556e-03 -6.481960e-03
[61] -1.348933e-03  3.213561e-03 -1.311227e-03  5.524959e-03 -4.294027e-03
[66] -1.020929e-03  4.933660e-03 -6.259534e-03  8.478015e-03 -4.697628e-03
[71] -9.228660e-04 -3.228616e-03 -1.562169e-03 -1.053683e-02  2.340644e-03
[76]  1.541931e-02 -5.793183e-04  1.844354e-04  4.302699e-03 -1.678733e-03
[81]  3.871081e-03 -7.441852e-03  2.197435e-04 -6.916006e-03 -4.521844e-03
> plot(returns)
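
The same simple returns can be checked on plain vectors without the ts and lag machinery. This is a small sketch using the first five opening prices from the series above; open_prices and simple_ret are names chosen for illustration.

```r
# First five opening prices from the series above
open_prices <- c(5242.75, 5232.35, 5228.05, 5199.10, 5249.85)

# Simple return: (today - yesterday) / yesterday
simple_ret <- diff(open_prices) / head(open_prices, -1)
print(simple_ret)
```

The first two values agree with the first two entries of the returns series printed above.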







Assignment 2: Do logit analysis for 700 data points and then predict for 150 data points.

sol:

z<-read.csv(file.choose(),header=T)

head(z)

z.data<-z[1:700,1:9]

sapply(z.data,mean)

z.data$ed<-factor(z.data$ed)

logit.est<-glm(default~age+employ+address+income+debtinc+creddebt+othdebt,data=z.data,family="binomial")

summary(logit.est)

confint.default(logit.est)

logit.eg2<-with(z[701:850,1:8],data.frame(age=mean(age),employ=mean(employ),address=mean(address),income=mean(income),debtinc=mean(debtinc),creddebt=mean(creddebt),othdebt=mean(othdebt),ed=factor(1:3)))

logit.eg2$prob<-predict(logit.est,newdata=logit.eg2,type="response")

head(logit.eg2)
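
The code above predicts at the mean of each predictor over rows 701 to 850. If the aim is a separate prediction for every one of the 150 held-out rows, a sketch along the following lines may help. It assumes simulated data with three of the column names used above (age, employ, debtinc) plus default; the coefficients in the simulation and the 0.5 cut-off are arbitrary choices.

```r
# Simulated stand-in for the 850-row file (assumption: illustrative only)
set.seed(1)
n <- 850
sim <- data.frame(
  age     = round(runif(n, 20, 60)),
  employ  = round(runif(n, 0, 30)),
  debtinc = runif(n, 0, 40)
)
sim$default <- rbinom(n, 1, plogis(-1 - 0.1 * sim$employ + 0.08 * sim$debtinc))

train   <- sim[1:700, ]      # estimation sample
holdout <- sim[701:850, ]    # 150 rows to predict

fit <- glm(default ~ age + employ + debtinc, data = train, family = "binomial")

# One predicted probability per held-out row, then a 0/1 class at 0.5
holdout$prob <- predict(fit, newdata = holdout, type = "response")
holdout$pred <- as.integer(holdout$prob > 0.5)
head(holdout)
```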







Tuesday, 22 January 2013


ASSIGNMENT 1a:

Fit ‘lm’ and comment on the applicability of ‘lm’
Plot 1: Residuals vs independent variable
Plot 2: Standardized residuals vs independent variable

> file<-read.csv(file.choose(),header=T)
> file
  mileage groove
1       0 394.33
2       4 329.50
3       8 291.00
4      12 255.17
5      16 229.33
6      20 204.83
7      24 179.00
8      28 163.83
9      32 150.33
> x<-file$groove
> x
[1] 394.33 329.50 291.00 255.17 229.33 204.83 179.00 163.83 150.33
> y<-file$mileage
> y
[1]  0  4  8 12 16 20 24 28 32
> reg1<-lm(y~x)
> res<-resid(reg1)
> res
         1          2          3          4          5          6          7          8          9
 3.6502499 -0.8322206 -1.8696280 -2.5576878 -1.9386386 -1.1442614 -0.5239038  1.4912269  3.7248633
> plot(x,res)
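
For the second plot, a sketch using the same nine observations typed in directly. rstandard() from base R gives the standardized residuals; the data frame name tyre is made up for illustration.

```r
# The nine observations printed above
tyre <- data.frame(
  mileage = seq(0, 32, by = 4),
  groove  = c(394.33, 329.50, 291.00, 255.17, 229.33,
              204.83, 179.00, 163.83, 150.33)
)

reg1    <- lm(mileage ~ groove, data = tyre)   # same model as lm(y ~ x) above
res     <- resid(reg1)
std_res <- rstandard(reg1)                     # standardized residuals

plot(tyre$groove, res, xlab = "groove", ylab = "Residual")
abline(h = 0, lty = 2)
plot(tyre$groove, std_res, xlab = "groove", ylab = "Standardized residual")
abline(h = 0, lty = 2)
```

The residuals are positive at both ends of the range and negative in the middle. That curved pattern suggests a straight line may not be fully adequate for this data.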


Assignment 1 (b) - Alpha-Pluto Data

Fit ‘lm’ and comment on the applicability of ‘lm’
Plot 1: Residuals vs independent variable
Plot 2: Standardized residuals vs independent variable

Also do:
Q-Q plot (qqnorm)
Q-Q line (qqline)

> file<-read.csv(file.choose(),header=T)
> file
   alpha pluto
1  0.150    20
2  0.004     0
3  0.069    10
4  0.030     5
5  0.011     0
6  0.004     0
7  0.041     5
8  0.109    20
9  0.068    10
10 0.009     0
11 0.009     0
12 0.048    10
13 0.006     0
14 0.083    20
15 0.037     5
16 0.039     5
17 0.132    20
18 0.004     0
19 0.006     0
20 0.059    10
21 0.051    10
22 0.002     0
23 0.049     5
> x<-file$alpha
> y<-file$pluto
> x
 [1] 0.150 0.004 0.069 0.030 0.011 0.004 0.041 0.109 0.068 0.009 0.009 0.048
[13] 0.006 0.083 0.037 0.039 0.132 0.004 0.006 0.059 0.051 0.002 0.049
> y
 [1] 20  0 10  5  0  0  5 20 10  0  0 10  0 20  5  5 20  0  0 10 10  0  5
> reg1<-lm(y~x)
> res<-resid(reg1)
> res
         1          2          3          4          5          6          7
-4.2173758 -0.0643108 -0.8173877  0.6344584 -1.2223345 -0.0643108 -1.1852930
         8          9         10         11         12         13         14
 2.5653342 -0.6519557 -0.8914706 -0.8914706  2.6566833 -0.3951747  6.8665650
        15         16         17         18         19         20         21
-0.5235652 -0.8544291 -1.2396007 -0.0643108 -0.3951747  0.8369318  2.1603874
        22         23
 0.2665531 -2.5087486
> plot(x,res)
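
The Q-Q plot and Q-Q line asked for above can be sketched as follows, with the 23 observations typed in directly. qqnorm() and qqline() are base R; the data frame name ap is made up for illustration.

```r
# The 23 observations printed above
ap <- data.frame(
  alpha = c(0.150, 0.004, 0.069, 0.030, 0.011, 0.004, 0.041, 0.109,
            0.068, 0.009, 0.009, 0.048, 0.006, 0.083, 0.037, 0.039,
            0.132, 0.004, 0.006, 0.059, 0.051, 0.002, 0.049),
  pluto = c(20, 0, 10, 5, 0, 0, 5, 20, 10, 0, 0, 10,
            0, 20, 5, 5, 20, 0, 0, 10, 10, 0, 5)
)

reg1 <- lm(pluto ~ alpha, data = ap)   # same model as lm(y ~ x) above
res  <- resid(reg1)

qqnorm(res)   # sample quantiles of the residuals vs normal quantiles
qqline(res)   # reference line through the first and third quartiles
```

Points lying close to the line support the normality assumption for the residuals; strong departures at the ends would argue against it.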




Assignment 2: Justify Null Hypothesis using ANOVA

> file<-read.csv(file.choose(),header=T)
> file

   Chair Comfort.Level Chair1
1      I             2      a
2      I             3      a
3      I             5      a
4      I             3      a
5      I             2      a
6      I             3      a
7     II             5      b
8     II             4      b
9     II             5      b
10    II             4      b
11    II             1      b
12    II             3      b
13   III             3      c
14   III             4      c
15   III             4      c
16   III             5      c
17   III             1      c
18   III             2      c

> file.anova<-aov(file$Comfort.Level~file$Chair1)
> summary(file.anova)

            Df Sum Sq Mean Sq F value Pr(>F)
file$Chair1  2  1.444  0.7222   0.385  0.687
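
To reproduce the test without the CSV file, here is a sketch with the 18 comfort scores typed in directly. The data frame name chairs and its column names are chosen for illustration.

```r
# The 18 observations printed above
chairs <- data.frame(
  chair   = factor(rep(c("I", "II", "III"), each = 6)),
  comfort = c(2, 3, 5, 3, 2, 3,
              5, 4, 5, 4, 1, 3,
              3, 4, 4, 5, 1, 2)
)

fit <- aov(comfort ~ chair, data = chairs)
tab <- summary(fit)[[1]]

f_value <- tab[["F value"]][1]
p_value <- tab[["Pr(>F)"]][1]
print(tab)
```

With a p-value of about 0.687, well above 0.05, we fail to reject the null hypothesis that the mean comfort level is the same for all three chairs.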

Wednesday, 16 January 2013


R lab_Assignment_15Jan

Question 1

Binding the columns of 2 matrices into a new matrix

a<-mat1[ ,3]
b<-mat2[ ,1]
c<-cbind(a,b)
c


Question 2

Multiplication of two matrices :

mul<-mat1%*%mat2
mul
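
Questions 1 and 2 assume mat1 and mat2 already exist. A self-contained sketch, assuming two arbitrary 3 x 3 matrices (the values are made up):

```r
# Assumed example matrices; the originals are not shown in the post
mat1 <- matrix(1:9, nrow = 3)
mat2 <- matrix(9:1, nrow = 3)

# Question 1: bind column 3 of mat1 and column 1 of mat2 into a new matrix
bound <- cbind(a = mat1[, 3], b = mat2[, 1])
print(bound)

# Question 2: matrix product (needs ncol(mat1) == nrow(mat2))
mul <- mat1 %*% mat2
print(mul)
```

Note that %*% is the matrix product, while a plain * would multiply element by element.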



Question 3

Regression model :
NSE<-read.csv(file.choose(),header=T)
NSE



open<-NSE[ ,2]
high<-NSE[ ,3]
reg<-lm(open~high, data=NSE)
reg
residuals(reg)


Question 4

Normal Distribution

x<-seq(0,100)
y<-dnorm(x, mean=50, sd=10)
plot(x,y, type="l", col="red")