Using RLink
RLink lets us execute R code from AIMMS. That alone is of little use unless AIMMS and R can also exchange data. So to use RLink you need to know:
How to execute R code from AIMMS
How to pass data between AIMMS and R
Execute R code
DataLink has a function ExecScript that can start an R session and pass code to R for execution.
Calling R code
Suppose we have a line of R code that we want to execute:
write( "Hello World" , file = "helloworld.txt" )
This script just takes the string “Hello World” and saves it in the file helloworld.txt.
Using RLink we can call the same R code from AIMMS by doing:
dl::ExecScript("write(\"Hello World\",file=\"helloworld.txt\")",MapName,XA);
where MapName and XA are defined as:
DeclarationSection RLinkSetup {
StringParameter MapName {
InitialData: "MyRLink";
}
StringParameter DataMap {
IndexDomain: (dl::dt,dl::idn,dl::cn,dl::dn);
}
StringParameter XA {
IndexDomain: dl::rwattr;
Definition: {
{ 'DataProvider' : rlink::DataLink };
}
}
}
The first thing to notice is that ExecScript is a DataLink function and RLink is only attached as a provider. In this simple example that seems a bit silly. MapName is a string that DataLink uses to identify the data map, so before the call can be made, dl::AddDataSourceMapping has to be called to associate MapName with a data map. By making RLink a DataLink provider it seems that we added a lot of unnecessary overhead.
The second thing to notice is the backslashes (\) in front of the quotes (") in the R code around Hello World. They are needed because the command is passed as a string that itself starts and ends with a quote. The backslash indicates that the quote following it does not end the string. Using the backslash like this is called escaping.
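R string literals follow the same backslash convention, so the effect of escaping can be illustrated in R itself. The sketch below is only an illustration (the variable names are made up for this example): it prints the command as R would receive it and checks that a single-quoted variant parses to the same R code.

```r
# The command as it has to be written inside a double-quoted string.
escaped <- "write(\"Hello World\",file=\"helloworld.txt\")"

# Printing the string shows the text without the escaping backslashes.
writeLines(escaped)
# prints: write("Hello World",file="helloworld.txt")

# In R, single and double quotes delimit the same kind of string, so this
# variant needs no escaping inside a double-quoted string.
single <- "write('Hello World',file='helloworld.txt')"

# Both strings parse to the same R call.
same_code <- identical(parse(text = escaped)[[1]], parse(text = single)[[1]])
same_code
```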
Too much escaping is hard to read, and many lines of R code written like that quickly become unmaintainable. Typically you want to keep all R code in a file and then tell R to execute that file with the R function source. R also accepts single quotes, which do not have to be escaped. So we could do:
dl::ExecScript("source('savehelloworld.r')",MapName,XA);
Here the file savehelloworld.r contains the line of R code we want to execute.
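What happens on the R side can be tried out in plain R. The following sketch assumes a temporary directory instead of the project folder, so the paths differ from the ones in the AIMMS call above; it creates a one-line script file and then runs it with source.

```r
# Use a temporary directory (an assumption for this example) and make sure
# its path uses forward slashes on every platform.
tmp <- normalizePath(tempdir(), winslash = "/")
script_file <- file.path(tmp, "savehelloworld.r")
output_file <- file.path(tmp, "helloworld.txt")

# Create the script file containing the single line of R code.
writeLines(sprintf('write("Hello World", file = "%s")', output_file),
           script_file)

# Execute the file; this is what the string passed to ExecScript asks R to do.
source(script_file)

# The script has written the text to the output file.
content <- readLines(output_file)
content
```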
Usually the file would not contain just one line of code but many, and then the DataLink overhead starts to make sense. It becomes very likely that data has to be sent back and forth between AIMMS and R, and DataLink is then already set up for that.
Important
Always use the forward slash (/) as path separator, even on Windows. The backslash is the native path separator on Windows, but inside a string it has to be escaped. R accepts the forward slash on Windows, so the backslash is not needed for paths in ExecScript. This also keeps the project platform independent, which is important if you develop on Windows and want to publish the project in the cloud. Also spell file and folder names with consistent letter case, as the AIMMS installation on AIMMS Cloud is case sensitive.
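On the R side, paths with forward slashes can be built and checked with base R. The sketch below uses made-up folder names, and exists_exact_case is a hypothetical helper written for this example (it is not part of RLink); it only checks the last path component.

```r
# file.path joins path components with "/" on every platform.
script_path <- file.path("scripts", "savehelloworld.r")

# Convert an existing Windows-style path to forward slashes.
windows_path <- "C:\\Projects\\MyModel\\scripts"   # hypothetical folder
portable_path <- gsub("\\", "/", windows_path, fixed = TRUE)

# Hypothetical helper: TRUE only if the file name exists with exactly this
# letter case, which is what a case-sensitive file system requires.
exists_exact_case <- function(path) {
  basename(path) %in% list.files(dirname(path), all.files = TRUE)
}

script_path
portable_path
```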
The R session
RLink starts by searching the system for an installation of R. It looks at R_HOME, at some predefined locations, and on Windows it queries the registry. Once an installation of R is found, RLink establishes a connection.
The next step is to make sure that the Rcpp and aimms packages are loaded, by executing library(Rcpp) and library(aimms). If the rlink::CheckAndInstallPackage function is called, the aimms package may be installed as well. The last step is initializing the aimms package by telling it how to communicate with RLink.
The above initialization happens on the very first call to dl::ExecScript where rlink::datalink is chosen as provider. After that the connection has been established and the R session keeps running. This means that a variable created in R in one call still exists in the next call. We can do:
dl::ExecScript(" myVariable <- 3 ",MapName,XA);
dl::ExecScript(" write(myVariable,file=\"helloworld.txt\") ",MapName,XA);
Here the first call assigns the value 3 to myVariable, and the next call writes the content of myVariable to the file helloworld.txt. The file then contains the value 3, because the R session was not closed after the first call and myVariable still has the value 3.
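The persistence of variables between calls can be mimicked in plain R. In the sketch below, session and exec_script are hypothetical stand-ins for the running R session and for dl::ExecScript; they are not part of RLink and only serve to show that each piece of code sees what earlier pieces left behind.

```r
# Stand-in for the long-running R session: one environment that is kept
# between calls.
session <- new.env()

# Stand-in for ExecScript: evaluate a string of R code in that session.
exec_script <- function(code) {
  eval(parse(text = code), envir = session)
}

# The first call creates the variable, the second call still finds it.
exec_script("myVariable <- 3")
exec_script("myVariable <- myVariable + 1")

session$myVariable
```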
Passing Data
In RLink the functions aimms::SetData and aimms::GetData can be used to pass data between R and AIMMS. These functions make use of data frames, so it is important to understand data frames.
The R data frame
A data frame is a built-in R structure for storing data tables. Let’s make a data frame df:
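The listing below is a sketch of such a session, reconstructed from the description that follows. The ages 15, 25 and 35 are the values used later in this section; the L suffix, which makes them integers, is an assumption of this example.

```r
# Create a data frame with a column of names and a column of integer ages.
df <- data.frame(Name = c("Alice", "Bob", "Claire"), Age = c(15L, 25L, 35L))

# Show the data frame; the leftmost column in the output is the row index.
df
#     Name Age
# 1  Alice  15
# 2    Bob  25
# 3 Claire  35

# The row index can be used to select a single row.
second_row <- df[2, ]
```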
We see in the first line that the function data.frame is used to create a data frame. Its first argument Name=c("Alice","Bob","Claire") creates a column Name with three string values. The second argument creates a second column Age with integer values.
If we look at the data frame by calling df, R will show the data frame. Here we see three columns: the two we created and the row index. The row index is not part of the data frame, but it can be used to select one row from the data frame.
In AIMMS jargon we could say that the row index acts like a domain. Since this row index is not part of the data frame itself, it cannot be transferred to AIMMS. If we need it, we should instead extend the data frame with an extra column that holds the row index, created with the R function seq.int.
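A minimal sketch of adding such a column is shown below. The column name Row is an assumption made for this example; any name can be used as long as the data map refers to it.

```r
df <- data.frame(Name = c("Alice", "Bob", "Claire"), Age = c(15L, 25L, 35L))

# Add an explicit column with the values 1, 2, 3, ... so that the row
# number becomes ordinary data that can be mapped like any other column.
df$Row <- seq.int(nrow(df))

names(df)
```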
Data frames have some limitations. All columns in a data frame must have the same length, and all elements in one column are of the same type. If we change the Age of Alice from 15 to the string "fifteen", then the other integer values in Age also change, to the strings "25" and "35". To make sure that an R structure is a data frame, the R function as.data.frame can be used.
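Both points can be checked in a short R session. This is only a sketch; the list called raw is made up for the example.

```r
df <- data.frame(Name = c("Alice", "Bob", "Claire"), Age = c(15L, 25L, 35L))
type_before <- class(df$Age)   # "integer"

# Assigning a string to one element converts the whole column to strings.
df$Age[1] <- "fifteen"
type_after <- class(df$Age)    # "character"
df$Age                         # "fifteen" "25" "35"

# A plain list with equally long elements can be turned into a data frame.
raw <- list(Name = c("Alice", "Bob"), Age = c(15L, 25L))
converted <- as.data.frame(raw)
is.data.frame(converted)
```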
The columns in a data frame have a name. These names are important because they are used as column names by DataLink.
Important
In R, columns can have empty values (called NA in R). However, data frames with empty values are not supported by RLink yet.
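Because column names matter and NA values are not supported, it can help to check a data frame in R before handing it over. The function check_frame below is a hypothetical helper written for this example, not part of the aimms package; dropping incomplete rows with complete.cases is one possible way to deal with NA values, and whether that is acceptable depends on the model.

```r
# Hypothetical helper: stop with a message if the data frame lacks an
# expected column or contains NA values.
check_frame <- function(df, expected_cols) {
  stopifnot(is.data.frame(df))
  missing_cols <- setdiff(expected_cols, names(df))
  if (length(missing_cols) > 0) {
    stop("missing columns: ", paste(missing_cols, collapse = ", "))
  }
  if (any(is.na(df))) {
    stop("data frame contains NA values")
  }
  invisible(TRUE)
}

df <- data.frame(Name = c("Alice", "Bob", "Claire"), Age = c(15L, NA, 35L))

# One option: keep only the rows without NA values before the check.
clean <- df[complete.cases(df), ]
check_frame(clean, c("Name", "Age"))
```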
SetData and GetData
RLink is a DataLink provider, but it operates differently from other providers. It still uses a data map to specify the mapping between identifiers in AIMMS and names of tables and columns in the source. The difference is that it lets the R code decide when data is read or written. For this, the R code can call the functions aimms::SetData and aimms::GetData from the aimms package.
Data can be sent from R to AIMMS using:
aimms::SetData( Name , Dataframe )
Data can be sent from AIMMS to R using:
Dataframe <- aimms::GetData( Name )
Here Name is a string and Dataframe is a data frame.
Both aimms::SetData and aimms::GetData have a name as first argument. This is the table name in DataLink jargon, and it represents a table specified in the data map. When we call SetData or GetData, the table name is used to determine how the names in the data frame are mapped onto the identifiers in AIMMS. Unlike DataRead and DataWrite, which read and write all tables in the data map, SetData and GetData only transfer the one table specified as first argument.
Now we can describe the setup of DataLink. Assume the AIMMS model contains a parameter AIMMS_age that has as domain an index from set AIMMS_name. Then the data frame described above can be read from R using aimms::SetData.
First we have to create a data map:
dl::DataTables+={'MyDataFrame'};
empty DataMap;
DataMap(dl::dt,dl::idn,dl::cn,dl::dn) += data {
!( table_name , identifier , col , dom ) : name_in_dataframe
( MyDataFrame , AIMMS_Name , 1 , 1 ) : "Name",
( MyDataFrame , AIMMS_age , 2 , 0 ) : "Age"
} ;
In the first line we add MyDataFrame to the set of tables in DataLink, and then we reset the 4D string parameter DataMap.
Then we assign the names of the columns in the data frame (the strings on the right-hand side) to the 4D domain of DataMap. The first element of the domain is the table name that we just added, followed by the identifier in AIMMS. Then we specify the column number, followed by the domain number. The domain number is 0 for a parameter, and that parameter has as its domain all identifiers with a nonzero domain number.
To associate the data map with a map name do:
MapName:="MyMapName";
dl::RemoveDataSourceMapping(MapName);
dl::AddDataSourceMapping( MapName, DataMap,
dl::DependEmpty,dl::TableAttributesEmpty,dl::ColAttributeEmpty);
To read the data from data frame df in R into AIMMS_name and AIMMS_age, we can do:
dl::ExecScript("aimms::SetData(\"MyDataFrame\",df)",MapName,XA);
Here the second argument of SetData is the data frame df from which the data is read.
To write to data frame df in R we can do:
dl::ExecScript("df <- aimms::GetData(\"MyDataFrame\")",MapName,XA);
GetData only has the table name as argument and returns a data frame.
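When the R code lives in a file that is loaded with source, GetData and SetData can be combined in one R function. The sketch below assumes the table MyDataFrame from the data map above; add_one_year and update_ages are hypothetical names invented for this example. The function update_ages only works inside an R session started by RLink, where the aimms package has been initialized, so the plain data frame logic is kept in a separate function that can be tried out in any R session.

```r
# Plain R step: works on any data frame that has a numeric Age column.
add_one_year <- function(df) {
  df$Age <- df$Age + 1L
  df
}

# Round trip between AIMMS and R. This is only defined here, not called:
# calling it requires an R session started by RLink.
update_ages <- function(table_name = "MyDataFrame") {
  df <- aimms::GetData(table_name)             # AIMMS -> R
  aimms::SetData(table_name, add_one_year(df)) # R -> AIMMS
}

# The plain R step can be checked without AIMMS.
add_one_year(data.frame(Name = c("Alice", "Bob"), Age = c(15L, 25L)))
```

Once such a file has been loaded with source in an earlier call, the string passed to dl::ExecScript would only need to contain update_ages().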
Important
Other DataLink providers use the functions DataRead and DataWrite for passing data. They are not supported in RLink.