As mentioned earlier in related articles, functions written in other languages like R, matlab or ILNumerics can be integrated with table functions using the high-level constructs like function router1 and subtable transformer2.
Following example scenarios demonstrate how different estimation functions written with the popular free and open-source language R can be applied on selected parts (i.e. subtables) of an input table containing some historical data for price estimations.
You can find all the relevant code in R and C# at the download page, including the C# code for generating test data.
Price estimators in R
For the example scenarios below we have three estimation functions in R, all based on Linear Regression (LR) of the global indicators market index and oil price for delivering price predictions.
- PriceEstimatorA: Estimated product price based on estimated market index and oil price
- PriceEstimatorB: Estimated product price based on estimated market index only
- PriceEstimatorC: Estimated product price based on estimated oil price only
# help function for displaying variables display_variable = function(v, vname) { print(paste(vname,' ='),quote = FALSE) print(v,quote = FALSE) } # Price Estimation Function A # Estimation with 1st order Linear Regression # based on historical market index and oil price # # Columns of input matrix HistoricalData: # market index, oil price, product price # # MarketIndex: Estimated market index for the next year # OilPrice: Estimated oil price for the next year EstimatedProductPriceA = function(HistoricalData, MarketIndex, OilPrice) { # historical market index and oil price RowCount = nrow(HistoricalData) X_train = rbind(matrix(1,1,RowCount), t(HistoricalData[,1:2])) # hostorical product prices Y_train = matrix(HistoricalData[,3], RowCount, 1); # optimal coefficient vector to minimize MSE Bopt = solve(X_train %*% t(X_train)) %*% X_train %*% Y_train # get estimated product price X_test = matrix(c(1.0, MarketIndex, OilPrice), nrow=1) Y_test = X_test %*% Bopt return (Y_test) } # Price Estimator B # Estimation with 1st order Linear Regression # based on historical market index only # # Columns of input matrix HistoricalData: # market index, product price # # MarketIndex: Estimated market index for the next year EstimatedProductPriceB = function(HistoricalData, MarketIndex) { # historical market index and oil price RowCount = nrow(HistoricalData) X_train = rbind(matrix(1,1,RowCount), t(HistoricalData[,1])) # hostorical product prices Y_train = matrix(HistoricalData[,2], RowCount, 1); # optimal coefficient vector to minimize MSE Bopt = solve(X_train %*% t(X_train)) %*% X_train %*% Y_train # get estimated product price X_test = matrix(c(1.0, MarketIndex), nrow=1) Y_test = X_test %*% Bopt return (Y_test) } # Price Estimator C # Estimation with 1st order Linear Regression # based on historical oil price only # # Columns of matrix HistoricalData: # oil price, product price # # OilPrice: Estimated oil price for the next year EstimatedProductPriceC = function(HistoricalData, OilPrice) { # historical market index and oil price RowCount = nrow(HistoricalData) X_train = rbind(matrix(1,1,RowCount), t(HistoricalData[,1])) # hostorical product prices Y_train = matrix(HistoricalData[,2], RowCount, 1); # optimal coefficient vector to minimize MSE Bopt = solve(X_train %*% t(X_train)) %*% X_train %*% Y_train # get estimated product price X_test = matrix(c(1.0, OilPrice), nrow=1) Y_test = X_test %*% Bopt return (Y_test) } HistoricalData = matrix(c(1.48, 94.38, 2.89, 0.91, 91.68, 1.83, 1.27, 85.61, 2.74), nrow=3, byrow=TRUE) print('HistoricalData') print(HistoricalData) MarketIndex = 1.2 OilPrice = 100 # method A EstPrdPrice = EstimatedProductPriceA(HistoricalData, MarketIndex, OilPrice); display_variable(EstPrdPrice,'Estimated product price with estimator A') # method B EstPrdPrice = EstimatedProductPriceB(HistoricalData[,c(1,3)], MarketIndex); display_variable(EstPrdPrice,'Estimated product price with estimator B') # method C EstPrdPrice = EstimatedProductPriceC(HistoricalData[,c(2,3)], OilPrice); display_variable(EstPrdPrice,'Estimated product price with estimator C') |
Note that the details of the estimator functions (and whether they make sense in reality) are unimportant for the scenarios below. We will focus on the big picture and integration: How can we apply selected price predictors in R on selected subtables of an input table?
You may see the related articles listed below for more information about estimations based on Linear Regression (LR):
Historical data
Following tables with historical data are used as input for price estimations:
- PriceTable: Historical prices of primary products per country and product
- IndicatorsTable: Historical values of global indicators market index and oil price
Initializing R functions in C#
// initialize R connection var envPath = Environment.GetEnvironmentVariable("PATH"); var rBinPath = @"C:\Program Files\R\R-2.14.1\bin\x64"; // check this path on your computer Environment.SetEnvironmentVariable("PATH", envPath + System.IO.Path.PathSeparator + rBinPath); REngine r_engine = REngine.CreateInstance("RDotNet"); r_engine.Initialize(); // initialize price estimation functions in R // price estimator A string funcstr = @"EstProdPriceFuncA <- function(HistData, EstMarketInd, EstOilPrice) { RowCount = nrow(HistData); X_train = rbind(matrix(1,1,RowCount), t(HistData[,1:2])); Y_train = matrix(HistData[,3], RowCount, 1); Bopt = solve(X_train %*% t(X_train)) %*% X_train %*% Y_train; X_test = matrix(c(1.0, EstMarketInd, EstOilPrice), nrow=1); Y_test = X_test %*% Bopt; return (Y_test); }"; Function EstProdPriceFuncA = r_engine.Evaluate(funcstr).AsFunction(); // price estimator B funcstr = @"EstProdPriceFuncB <- function(HistData, EstMarketInd) { RowCount = nrow(HistData); X_train = rbind(matrix(1,1,RowCount), t(HistData[,1])); Y_train = matrix(HistData[,2], RowCount, 1); Bopt = solve(X_train %*% t(X_train)) %*% X_train %*% Y_train; X_test = matrix(c(1.0, EstMarketInd), nrow=1); Y_test = X_test %*% Bopt; return (Y_test); }"; Function EstProdPriceFuncB = r_engine.Evaluate(funcstr).AsFunction(); // price estimator C funcstr = @"EstProdPriceFuncC <- function(HistData, EstOilPrice) { RowCount = nrow(HistData); X_train = rbind(matrix(1,1,RowCount), t(HistData[,1])); Y_train = matrix(HistData[,2], RowCount, 1); Bopt = solve(X_train %*% t(X_train)) %*% X_train %*% Y_train; X_test = matrix(c(1.0, EstOilPrice), nrow=1); Y_test = X_test %*% Bopt; return (Y_test); }"; Function EstProdPriceFuncC = r_engine.Evaluate(funcstr).AsFunction(); |
Integration of R with table functions in C#
See related article for the installation of R.NET that we use here for the integration of C# and R.
Function routers and subtable transformers expect delegate table functions as input parameters in the following format:
// Generic delegate table function // InputTbl: Input table // OtherParameters: Other parameters of any type public delegate MatrixTable TransformTableFunc_OP(MatrixTable InputTbl, params Object[] OtherParameters); |
The initialized REngine and estimation functions are passed to the enveloping table functions (with the signature above) as additional parameters. The R functions are then called within the enveloping table functions. You may find the code examples below for these table functions:
C# code for table function PriceEstimatorA()using FinaquantProtos; using RDotNet; // Table function for price estimation, estimator A (LR based on global market index and oil price) // Input parameters: // InputTbl: Historical product prices with fields year and price // OtherParameters[0]: Table of historical indicators with fields year, market_index and oil_price // OtherParameters[1]: REngine // OtherParameters[2]: R Function // OtherParameters[3]: Estimated market index for next year (double) // OtherParameters[4]: Estimated oil price for next year (double) public static MatrixTable PriceEstimatorA(MatrixTable InputTbl, params Object[] OtherParameters) { // get all input parameters MatrixTable HistoricalProductPrices = InputTbl; MatrixTable HistoricalIndicators = (MatrixTable)OtherParameters[0]; REngine engine = (REngine)OtherParameters[1]; Function EstProdPriceFuncA = (Function)OtherParameters[2]; double[] EstMarketInd = {(double) OtherParameters[3]}; double[] EstOilPrice = {(double) OtherParameters[4]}; // combine tables MatrixTable HistoricalDataTbl = MatrixTable.CombineTables(HistoricalIndicators, HistoricalProductPrices); // get matrix of key figures from table KeyMatrix HistData = HistoricalDataTbl.KeyFigValues; // convert data types of input parameters from C# to R NumericMatrix HistDataR = engine.CreateNumericMatrix(HistData.toArray); engine.SetSymbol("HistDataR", HistDataR); NumericVector EstMarketIndR = engine.CreateNumericVector(EstMarketInd); engine.SetSymbol("EstMarketIndR", EstMarketIndR); NumericVector EstOilPriceR = engine.CreateNumericVector(EstOilPrice); engine.SetSymbol("EstOilPriceR", EstOilPriceR); // call function in R from c# NumericMatrix EstProductPrice = engine.Evaluate( @"EstProductPrice <- EstProdPriceFuncA(HistDataR,EstMarketIndR,EstOilPriceR)").AsNumericMatrix(); // return a single-element table with estimated product price double EstProdPrice = EstProductPrice[0, 0]; // return a single-element table with estimated product price TableFields tf = new TableFields(InputTbl.metaData); tf.AddNewField("product_price"); MatrixTable ResultTbl = MatrixTable.CreateTableWithElements_A(tf, EstProdPrice); return ResultTbl; } |
// Table function for price estimation, estimator B (LR based on global market index only) // Input parameters: // InputTbl: Historical product prices with fields year and price // OtherParameters[0]: Table of historical indicators with fields year, market_index and oil_price // OtherParameters[1]: REngine // OtherParameters[2]: R Function // OtherParameters[3]: Estimated market index for next year (double) // OtherParameters[4]: Estimated oil price for next year (double) public static MatrixTable PriceEstimatorB(MatrixTable InputTbl, params Object[] OtherParameters) { // get all input parameters MatrixTable HistoricalProductPrices = InputTbl; MatrixTable HistoricalIndicators = (MatrixTable)OtherParameters[0]; REngine engine = (REngine)OtherParameters[1]; Function EstProdPriceFuncB = (Function)OtherParameters[2]; double[] EstMarketInd = { (double)OtherParameters[3] }; double[] EstOilPrice = { (double)OtherParameters[4] }; // combine tables MatrixTable HistoricalDataTbl = MatrixTable.CombineTables(HistoricalIndicators, HistoricalProductPrices); // exclude column oil_price from table HistoricalDataTbl = MatrixTable.ExcludeColumns(HistoricalDataTbl, TextVector.CreateVectorWithElements("oil_price")); // get matrix of key figures from table KeyMatrix HistData = HistoricalDataTbl.KeyFigValues; // convert data types of input parameters from C# to R NumericMatrix HistDataR = engine.CreateNumericMatrix(HistData.toArray); engine.SetSymbol("HistDataR", HistDataR); NumericVector EstMarketIndR = engine.CreateNumericVector(EstMarketInd); engine.SetSymbol("EstMarketIndR", EstMarketIndR); NumericVector EstOilPriceR = engine.CreateNumericVector(EstOilPrice); engine.SetSymbol("EstOilPriceR", EstOilPriceR); // call function in R from c# NumericMatrix EstProductPrice = engine.Evaluate( @"EstProductPrice <- EstProdPriceFuncB(HistDataR,EstMarketIndR)").AsNumericMatrix(); // return a single-element table with estimated product price double EstProdPrice = EstProductPrice[0, 0]; // return a single-element table with estimated product price TableFields tf = new TableFields(InputTbl.metaData); tf.AddNewField("product_price"); MatrixTable ResultTbl = MatrixTable.CreateTableWithElements_A(tf, EstProdPrice); return ResultTbl; } |
// Table function for price estimation, estimator C (LR based on global oil price only) // Input parameters: // InputTbl: Historical product prices with fields year and price // OtherParameters[0]: Table of historical indicators with fields year, market_index and oil_price // OtherParameters[1]: REngine // OtherParameters[2]: R Function // OtherParameters[3]: Estimated market index for next year (double) // OtherParameters[4]: Estimated oil price for next year (double) public static MatrixTable PriceEstimatorC(MatrixTable InputTbl, params Object[] OtherParameters) { // get all input parameters MatrixTable HistoricalProductPrices = InputTbl; MatrixTable HistoricalIndicators = (MatrixTable)OtherParameters[0]; REngine engine = (REngine)OtherParameters[1]; Function EstProdPriceFuncC = (Function)OtherParameters[2]; double[] EstMarketInd = { (double)OtherParameters[3] }; double[] EstOilPrice = { (double)OtherParameters[4] }; // combine tables MatrixTable HistoricalDataTbl = MatrixTable.CombineTables(HistoricalIndicators, HistoricalProductPrices); // exclude column market_index from table HistoricalDataTbl = MatrixTable.ExcludeColumns(HistoricalDataTbl, TextVector.CreateVectorWithElements("market_index")); // get matrix of key figures from table KeyMatrix HistData = HistoricalDataTbl.KeyFigValues; // convert data types of input parameters from C# to R NumericMatrix HistDataR = engine.CreateNumericMatrix(HistData.toArray); engine.SetSymbol("HistDataR", HistDataR); NumericVector EstMarketIndR = engine.CreateNumericVector(EstMarketInd); engine.SetSymbol("EstMarketIndR", EstMarketIndR); NumericVector EstOilPriceR = engine.CreateNumericVector(EstOilPrice); engine.SetSymbol("EstOilPriceR", EstOilPriceR); // call function in R from c# NumericMatrix EstProductPrice = engine.Evaluate( @"EstProductPrice <- EstProdPriceFuncC(HistDataR,EstOilPriceR)").AsNumericMatrix(); // return a single-element table with estimated product price double EstProdPrice = EstProductPrice[0, 0]; // return a single-element table with estimated product price TableFields tf = new TableFields(InputTbl.metaData); tf.AddNewField("product_price"); MatrixTable ResultTbl = MatrixTable.CreateTableWithElements_A(tf, EstProdPrice); return ResultTbl; } |
Scenario 1: Applying PriceEstimatorA on a single subtable for country-product pair “Utopia” and “bread”
This example is designed rather for testing if the table function PriceEstimatorA() which envelopes the corresponding R function works as expected.
// apply table function on a single subtable for country-product pair "Utopia", "bread" // filter TemperatureTable with condition table to obtain a subtable var CondTblFields = TableFields.CreateEmptyTableFields(md); TableFields.AddNewField(CondTblFields, "country"); TableFields.AddNewField(CondTblFields, "product"); var CondTbl = MatrixTable.CreateTableWithElements_A(CondTblFields, "Utopia", "bread" ); var SubTable = MatrixTable.FilterTableA(PriceTable, CondTbl); // exclude fields country and product from subtable SubTable = MatrixTable.ExcludeColumns(SubTable, TextVector.CreateVectorWithElements("country", "product")); // view subtable // MatrixTable.View_MatrixTable(SubTable, "Subtable of PriceTable (Utopia, bread)"); // execute user-defined table function to calculate expected energy consumption (Sedrun, 2012) MatrixTable EstimatedPriceTbl = PriceEstimatorA(SubTable, GlobalIndTable, r_engine, EstProdPriceFuncA, 1.2, 90.0); // view result table MatrixTable.View_MatrixTable(EstimatedPriceTbl, "Estimated price for subtable (Utopia, bread)"); |
Result table:
Yes, the table function PriceEstimatorA() works as expected.
Scenario 2: Applying PriceEstimatorA on every subtable of price table
In this example, the same prediction function is applied for all countries and products.
Applying the same table (or matrix) function on all subtables of an input table… This is what a subtable transformer is created for.
// Apply the same table function on all subtables of Price Table // subtable transformer // define subtable fields var SubTblFields = TableFields.CreateEmptyTableFields(md); TableFields.AddNewField(SubTblFields, "year"); TableFields.AddNewField(SubTblFields, "product_price"); double EstMarketIndex = 1.2; double EstOilPrice = 100.0; // execute subtable transformer MatrixTable ResultTbl = MatrixTable.TransformSubTables(PriceTable, SubTblFields, PriceEstimatorA, GlobalIndTable, r_engine, EstProdPriceFuncA, EstMarketIndex, EstOilPrice); // view result table MatrixTable.View_MatrixTable(MatrixTable.Round(ResultTbl,2), "Estimator-A applied for all countries and products"); |
Result table:
Scenario 3: Applying different price estimators to different countries and products
This is where the fabulous function router comes into play.
What we want to do is summarized in the following table:
In the result table we want to see explicitly which estimation function is used to generate each estimation.
// Function Router and R // extend MetaData md.AddNewField("estimation_method", FieldType.TextAttribute); md.AddNewField("country_cell", FieldType.TextAttribute); md.AddNewField("product_cell", FieldType.TextAttribute); // create condition matrix table CondTblFields = TableFields.CreateEmptyTableFields(md); TableFields.AddNewField(CondTblFields, "country_cell"); TableFields.AddNewField(CondTblFields, "product_cell"); var CondMatTbl = MatrixTable.CreateTableWithElements_A(CondTblFields, "Utopia", "bread, cheese", // estimator A "Utopia", "butter, honig", // estimator B "Ignoria", "ALL", // estimator C "Euphoria", "ALL" // estimator A ); // create associated table to see the estimation method in the result table var AssocTblFields = TableFields.CreateEmptyTableFields(md); TableFields.AddNewField(AssocTblFields, "estimation_method"); var AssocTbl = MatrixTable.CreateTableWithElements_A(AssocTblFields, "estimator A", "estimator B", "estimator C", "estimator A" ); // array for delegate functions; assign a function for each row of condition matrix table var MyTableFuncList = new TransformTableFunc_OP[4]; MyTableFuncList[0] = PriceEstimatorA; MyTableFuncList[1] = PriceEstimatorB; MyTableFuncList[2] = PriceEstimatorC; MyTableFuncList[3] = PriceEstimatorA; // array for other parameters // pass same parameters for all subtables excluding R Function (estimators A, B, C) EstMarketIndex = 1.2; EstOilPrice = 100.0; var OtherParametersList = new object[4][]; OtherParametersList[0] = new object[] { GlobalIndTable, r_engine, EstProdPriceFuncA, EstMarketIndex, EstOilPrice }; OtherParametersList[1] = new object[] { GlobalIndTable, r_engine, EstProdPriceFuncB, EstMarketIndex, EstOilPrice }; OtherParametersList[2] = new object[] { GlobalIndTable, r_engine, EstProdPriceFuncC, EstMarketIndex, EstOilPrice }; OtherParametersList[3] = new object[] { GlobalIndTable, r_engine, EstProdPriceFuncA, EstMarketIndex, EstOilPrice }; // call function router C (with condition cell table) ResultTbl = MatrixTable.FunctionRouterC(PriceTable, SubTblFields, CondMatTbl, MyTableFuncList, AssocTbl, FirstMatchOnly: true, IgnoreHierarchy: true, ErrorIfConditionNotRelevant: false, OtherParametersList: OtherParametersList); // round all key figures to 2 digits after decimal point ResultTbl = MatrixTable.Round(ResultTbl, 2); // view result table MatrixTable.View_MatrixTable(ResultTbl, "Result table: Apply different estimators for different subtables"); |
Result table:
Scenario 4: Applying all price estimators for all countries and products
In this last scenario we want to apply all available estimation functions on all subtables of the price table in order to compare the estimation results. This is again a task for the function router.
// Function Router and R // Apply all price estimators to all countries and products // create condition table (not condition matrix table with condition cells!) CondTblFields = TableFields.CreateEmptyTableFields(md); TableFields.AddNewField(CondTblFields, "country"); CondTbl = MatrixTable.CreateTableWithElements_A(CondTblFields, "ALL", // estimator A "ALL", // estimator B "ALL" // estimator C ); // create associated table to see the estimation method in the result table AssocTbl = MatrixTable.CreateTableWithElements_A(AssocTblFields, "estimator A", "estimator B", "estimator C" ); // array for delegate functions; assign a function for each row of condition matrix table MyTableFuncList = new TransformTableFunc_OP[3]; MyTableFuncList[0] = PriceEstimatorA; MyTableFuncList[1] = PriceEstimatorB; MyTableFuncList[2] = PriceEstimatorC; // array for other parameters // pass same parameters for all subtables excluding R Function (estimators A, B, C) EstMarketIndex = 1.2; EstOilPrice = 100.0; OtherParametersList = new object[3][]; OtherParametersList[0] = new object[] { GlobalIndTable, r_engine, EstProdPriceFuncA, EstMarketIndex, EstOilPrice }; OtherParametersList[1] = new object[] { GlobalIndTable, r_engine, EstProdPriceFuncB, EstMarketIndex, EstOilPrice }; OtherParametersList[2] = new object[] { GlobalIndTable, r_engine, EstProdPriceFuncC, EstMarketIndex, EstOilPrice }; // call function router A (with condition table) ResultTbl = MatrixTable.FunctionRouterA(PriceTable, SubTblFields, CondTbl, MyTableFuncList, AssocTbl, JokerMatchesAllvalues: true, TextJoker: "ALL", NumJoker: 0, FirstMatchOnly: false, OtherParametersList: OtherParametersList); // round all key figures to 2 digits after decimal point ResultTbl = MatrixTable.Round(ResultTbl, 2); // view result table MatrixTable.View_MatrixTable(ResultTbl, "Result table: Apply all estimators for all subtables"); |
Result table:
Conclusions
In this article we demonstrated how selected functions in R can be applied on selected parts (subtables) of input tables using flexible constructs like subtable transformer and function router. In that sense, these constructs can be used to integrate R with data tables containing any number of attributes and key figures.
The same (or similar) integration approach can be used for other computing languages like matlab.
Copyright secured by Digiprove © 2013 Tunc Ali Kütükcüoglu