---
title: "Mayhem at DinoFunWorld"
author: "Petra Isenberg"
date: "October 5, 2015"
output: html_document
---

# Merging Data Files with R

## Loading Files

First, we load a file that contains the attractions, their IDs, and their coordinates in the park:
```{r}
coordinates <- read.csv("ParkCoordinates.csv")
head(coordinates)
```

Next, we load our data from the data cleaning exercise:
```{r}
attractions <- read.csv("AttractionsOCR-txt.csv")
head(attractions)
```


## Merging both files

Next we need to merge both files into one that adds the coordinates to the park attractions file.
But first we should compare the structure of both files:

```{r}
str(coordinates)
str(attractions)
```

We notice that `AttractionID` is not a factor in the attractions dataset. Let's fix this:

```{r}
attractions$AttractionID <- as.factor(attractions$AttractionID)
```
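
The type mismatch can be reproduced on a small scale. The sketch below uses made-up inline data rather than the course files, so `toy_coords`, `toy_attr`, and their values are assumptions for illustration only. A column that holds only numbers is read as integer, while a column that also holds letters such as `N` is read as text. That text only becomes a factor when `stringsAsFactors = TRUE` is set, which has not been the default since R 4.0.

```r
# Toy data, not the course files: the IDs in toy_coords include a letter
toy_coords <- read.csv(text = "AttractionID,x,y\n1,10,20\n2,30,40\nN,50,99",
                       stringsAsFactors = TRUE)
toy_attr <- read.csv(text = "AttractionID,Attraction\n1,Coaster\n2,Carousel",
                     stringsAsFactors = TRUE)

class(toy_coords$AttractionID)  # factor, because of the letter "N"
class(toy_attr$AttractionID)    # integer, because every ID is a number

# Convert so that both key columns have the same type
toy_attr$AttractionID <- as.factor(toy_attr$AttractionID)
class(toy_attr$AttractionID)
```

If `str(coordinates)` shows `chr` instead of `Factor` on your machine, this default is the likely reason.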

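We can also see that the coordinates dataset contains a few more attractions than the attractions dataset. So we need to be careful when merging and specify `all.x=TRUE`, which keeps every row of the first data frame even when it has no match in the second (see the help page for the merge command by calling `?merge` in the R console):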



```{r}
library(xtable)

# you can also use the chunk option echo=FALSE to hide this code in the output
# to hide the output instead, use results="hide", e.g. {r sectionname, results="hide"}
fulldata <- merge(coordinates,attractions, by.x="AttractionID",by.y="AttractionID",all.x=TRUE)
fulldata <- fulldata[order(fulldata$AttractionID),] 
```
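
To see what `all.x=TRUE` changes, here is a minimal sketch on toy data. The data frames `toy_left` and `toy_right` and their values are hypothetical and only stand in for the two course files.

```r
# Toy data (hypothetical values): toy_left has one ID that toy_right lacks
toy_left  <- data.frame(AttractionID = c("1", "2", "N"), x = c(10, 30, 50))
toy_right <- data.frame(AttractionID = c("1", "2"),
                        ParkArea = c("Tundra Land", "Kiddie Land"))

# IDs that are present in the left table only
setdiff(toy_left$AttractionID, toy_right$AttractionID)

# Default merge keeps only IDs found in both tables, so "N" is dropped
inner <- merge(toy_left, toy_right, by = "AttractionID")
# all.x = TRUE keeps every row of the first table and fills the gaps with NA
left <- merge(toy_left, toy_right, by = "AttractionID", all.x = TRUE)

nrow(inner)
nrow(left)
left[is.na(left$ParkArea), ]
```

The rows that come back with `NA` are the ones that need to be filled in by hand afterwards.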

```{r, results="asis"}
xt <- xtable(fulldata)
print(xt, type="html")
```


## Modifying the Data
From the table above we can see that the park entrances have no park area (`ParkArea`) or attraction type (`CategoryNames`).
By looking at our park map we can fill in the missing values:

![Park Map](http://aviz.fr/wiki/uploads/TeachingVA2015/parkmap.png)

```{r}
fulldata[fulldata$AttractionID=="N",]$ParkArea = "Entry Corridor"
fulldata[fulldata$AttractionID=="E",]$ParkArea = "Kiddie Land"
fulldata[fulldata$AttractionID=="W",]$ParkArea = "Tundra Land"

#first we need to generate a new level for the CategoryNames column
categorynames <- c(levels(fulldata$CategoryNames),"Entry-Exit")
levels(fulldata$CategoryNames) <- categorynames

#only now can we assign the new value without getting NAs and an "invalid factor level" warning
fulldata[fulldata$AttractionID=="N",]$CategoryNames = "Entry-Exit"
fulldata[fulldata$AttractionID=="E",]$CategoryNames = "Entry-Exit"
fulldata[fulldata$AttractionID=="W",]$CategoryNames = "Entry-Exit"

#now let's check if this worked ok
tail(fulldata)
```
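
The reason for adding the level first can be shown with a small sketch. The factor `cats` and its category names are made up for illustration and are not taken from the course data.

```r
# Toy factor with hypothetical category names and one missing value
cats <- factor(c("Thrill Rides", "Kiddie Rides", NA))

# Assigning a value that is not a level yields NA and a warning
broken <- cats
broken[3] <- "Entry-Exit"   # warning: invalid factor level, NA generated
is.na(broken[3])

# Adding the level first makes the same assignment succeed
fixed <- cats
levels(fixed) <- c(levels(fixed), "Entry-Exit")
fixed[3] <- "Entry-Exit"
as.character(fixed[3])
```

Appending to `levels()` keeps the existing levels in their original order, so values that were already stored are not changed.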

Also, we really don't need the `Attraction.y` column if everything checks out so far. So let's get rid of it:

```{r}
fulldata$Attraction.y <- NULL
```


## Plotting the data
To double-check what we've done, we first draw a plot that shows the coordinates colored by park area:

```{r}
plot(fulldata$x, fulldata$y, col=fulldata$ParkArea)
```

You can try to make another plot colored by `CategoryNames`.
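
One possible way to approach such a plot, sketched on toy data (the data frame `toy`, its coordinates, and its categories are assumptions, not the course data). It also adds a legend, which base `plot()` does not draw on its own.

```r
# Toy data with hypothetical coordinates and categories
toy <- data.frame(
  x = c(10, 30, 50, 70),
  y = c(20, 40, 99, 60),
  CategoryNames = factor(c("Thrill Rides", "Kiddie Rides",
                           "Entry-Exit", "Thrill Rides"))
)

# A factor passed to col= is used as its integer codes, which index the palette
codes <- as.integer(toy$CategoryNames)
plot(toy$x, toy$y, col = codes, pch = 19, xlab = "x", ylab = "y")

# Legend built from the factor levels, in the same order as the codes
legend("topleft", legend = levels(toy$CategoryNames),
       col = seq_along(levels(toy$CategoryNames)), pch = 19)
```

Coloring only works this way when the column is a factor; a plain character column would be interpreted as color names.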



## Write a new data file
```{r}
write.csv(fulldata,file="Attraction-Coordinates.csv")
```
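
Note that `write.csv` also writes the row names by default. The sketch below, which uses a toy data frame and a temporary file instead of the course data, shows how that appears when the file is read back and how `row.names = FALSE` avoids it.

```r
# Toy data frame standing in for the merged data (hypothetical values)
toy <- data.frame(AttractionID = c("1", "N"), x = c(10, 50), y = c(20, 99))

out <- tempfile(fileext = ".csv")   # temporary path, not the course file

# Default: row names are written as an extra unnamed first column,
# which read.csv names "X" when the file is read back
write.csv(toy, file = out)
names(read.csv(out))

# row.names = FALSE writes only the data columns
write.csv(toy, file = out, row.names = FALSE)
names(read.csv(out))
```

Whether to keep the row names is a judgment call; dropping them keeps the file free of a column that carries no information.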

You may want to show the final data file again here