Use SQL To Operate R Data Frames | R-bloggers

In both research and application, we need to manipulate data frames by selecting desired columns, filtering records, and transforming and aggregating data. R provides built-in functions for data frame manipulation. Suppose df is the data frame we are dealing with. We use df[1:100,] to select the first 100 rows, df[,c("price","volume")] to select the price and volume columns, df[df$price >= mean(df$price),] to single out records whose price is no less than the average, transform(df, totalValue=price*volume) to add a new column totalValue to each record, and apply(df,2,mean) to calculate the mean of each column.

However, once we want to combine several of these steps, the R code quickly becomes a mess. Say we want to sort df by a new column totalValue, which equals price times volume, and then average the price and totalValue columns for the top 20 records. The R code, written over several lines, can look like this:

    df$totalValue <- df$price * df$volume
    df.sorted <- df[order(df$totalValue, decreasing = TRUE), ]
    df.subset [...] = 3000

Sorting can also be simple. Here we use ORDER BY to sort the records by totalValue in descending order.

    SELECT *, price * volume AS totalValue
    FROM df
    ORDER BY totalValue DESC

The code for subsetting a table is also intuitive. Here we use LIMIT to select only the top 30 records with the highest totalValue.

    SELECT *, price * volume AS totalValue
    FROM df
    ORDER BY totalValue DESC
    LIMIT 30

Note that we break the lines to make the statement clear. It works exactly the same way as a statement without line breaks.

The power of SQL may not be very clear until we combine these clauses. For example, if we want to finish the whole task described earlier (averaging price and totalValue over the top 20 records by totalValue) in one SQL statement, here it is:

    SELECT AVG(price), AVG(totalValue)
    FROM (SELECT *, price * volume AS totalValue
          FROM df
          ORDER BY totalValue DESC
          LIMIT 20)

Here we embed one SQL statement inside another.
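To make the combined statement above concrete, here is a minimal sketch that runs the same computation on a small made-up data frame, first with built-in R functions and then, if the sqldf package happens to be installed, with the nested SQL statement. The data values and the names df2, top20, base_result and sql_result are illustrative assumptions, not part of the original post.

```r
# Toy data frame (made-up values, for illustration only)
set.seed(1)
df <- data.frame(
  price  = round(runif(50, min = 10, max = 100), 2),
  volume = sample(100:1000, 50)
)

# Base R: add totalValue, sort descending, keep the top 20, then average
df2 <- transform(df, totalValue = price * volume)
top20 <- head(df2[order(df2$totalValue, decreasing = TRUE), ], 20)
base_result <- colMeans(top20[, c("price", "totalValue")])
print(base_result)

# The same computation as one nested SQL statement.
# Assumption: this branch only runs if the sqldf package is installed.
if (requireNamespace("sqldf", quietly = TRUE)) {
  library(sqldf)
  sql_result <- sqldf("
    SELECT AVG(price) AS price, AVG(totalValue) AS totalValue
    FROM (SELECT *, price * volume AS totalValue
          FROM df
          ORDER BY totalValue DESC
          LIMIT 20)
  ")
  print(sql_result)
}
```

The two printed results should agree up to floating-point rounding, provided there are no ties in totalValue at the 20th position, where the two sort orders could pick different rows.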
Another example is to select the top 100 records, ordered by totalValue in descending order, whose prices are no less than the average price.

    SELECT *, price * volume AS totalValue
    FROM df
    WHERE price >= (SELECT AVG(price) FROM df)
    ORDER BY totalValue DESC
    LIMIT 100

If you are familiar with SQL, the statement above is almost as friendly as plain English, and it does not matter whether we write it on one line or on several. Here we separate the different clauses of the statement for greater readability. Try to implement it with built-in R functions only and you will certainly find SQL a very powerful tool.

Here I should remark that sqldf is backed by an in-memory SQLite database and exposes its SELECT functionality. Since different database engines support the SQL standard to different degrees, we can only use the SELECT statements that the SQLite engine supports. You may get more information here.

In conclusion, SQL is a powerful tool that R users should pick up, and sqldf is the way to use this language with R to operate on data frames in a more decent
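For comparison with the top-100 query above, one possible version using only built-in R functions is sketched below on made-up data. The data frame contents and the names kept, base_top and sql_top are hypothetical, and the sqldf branch runs only if that package is installed.

```r
# Toy data frame (made-up values, for illustration only)
set.seed(42)
df <- data.frame(
  price  = round(runif(500, min = 10, max = 100), 2),
  volume = sample(100:1000, 500, replace = TRUE)
)

# Base R: filter on the average price, add totalValue, sort, keep the top 100
kept <- df[df$price >= mean(df$price), ]
kept$totalValue <- kept$price * kept$volume
base_top <- head(kept[order(kept$totalValue, decreasing = TRUE), ], 100)
print(head(base_top))

# The same query through sqldf.
# Assumption: this branch only runs if the sqldf package is installed.
if (requireNamespace("sqldf", quietly = TRUE)) {
  library(sqldf)
  sql_top <- sqldf("
    SELECT *, price * volume AS totalValue
    FROM df
    WHERE price >= (SELECT AVG(price) FROM df)
    ORDER BY totalValue DESC
    LIMIT 100
  ")
  print(head(sql_top))
}
```

The base R version needs three intermediate steps and two temporary objects, whereas the SQL statement expresses the filter, the derived column, the sort and the limit in a single query.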

Source: Use SQL To Operate R Data Frames | R-bloggers
